cturan/llama.cpp

Author	SHA1 Message	Date
Johannes Gäßler	658987cfc9 CUDA: noncont MMVQ + batched bs1 MUL_MAT_ID (#13014)	9 months ago
Georgi Gerganov	2f74c354c0 graph : make FA compatible with MLA + add initial Metal kernels (#12953)	9 months ago
Jeff Bolz	015022bb53 vulkan: enable coopmat2 FA gqa and split_k optimizations more often (#12931)	9 months ago
Georgi Gerganov	1d2b613445 tests : fix init order (#0)	9 months ago
Diego Devesa	fe92821ea9 ggml : add bilinear upscale support (ggml/1185)	9 months ago
Jeff Bolz	f01bd02376 vulkan: Implement split_k for coopmat2 flash attention. (#12627)	9 months ago
Georgi Gerganov	b4ae50810e metal : improve FA + improve MoE (#12612)	9 months ago
Jeff Bolz	9b169a4d4e vulkan: fix mul_mat_vec failure in backend tests (#12529)	10 months ago
Georgi Gerganov	ba932dfb50 ggml : fix quantized cpy op (#12310)	10 months ago
Jeff Bolz	eddfb43850 vulkan: Optimize mul_mat_vec p021 and nc shaders (#12505)	10 months ago
Gaurav Garg	517b5ddbf0 CUDA: Improve flash decoding kernel GPU occupancy for BS=1 case (#12183)	10 months ago
Molly Sophia	7dfad387e3 llama: Add support for RWKV v7 architecture (#12412)	10 months ago
Jeff Bolz	bf69cfe62f vulkan: fix bug in coopmat1 mul_mat_id (#12316)	10 months ago
cmdr2	0cbee131ad cuda/vulkan: specify fp32-only support for some operations in supports_op (ggml/1129)	10 months ago
cmdr2	87abb7e903 cuda/cpu: Increase support for fp16 unary operations (ggml/1125)	10 months ago
cmdr2	f54a4ba11e Support pure float16 add/sub/mul/div operations in the CUDA (and CPU) backend (ggml/1121)	10 months ago
Diego Devesa	d5c63cd7f9 test-backend-ops : add option -p to filter by op params (#12155)	10 months ago
William Tambellini	70680c48e5 ggml : upgrade init_tensor API to return a ggml_status (#11854)	10 months ago
Johannes Gäßler	5fa07c2f93 CUDA: optimize FA for GQA + large batches (#12014)	11 months ago
Rémy O	2eea03d86a vulkan: implement several ops relevant for ggml_opt (#11769)	11 months ago
Johannes Gäßler	fd08255d0d CUDA: non-contiguous (RMS) norm support (#11659)	11 months ago
Akarshan Biswas	6e84b0ab8e SYCL : SOFTMAX F16 mask support and other fixes (#11261)	11 months ago
Johannes Gäßler	8137b4bb2b CPU/CUDA: fix (GQA) mul mat back, add CUDA support (#11380)	11 months ago
Jeff Bolz	564804b79b tests: fix some mul_mat test gaps (#11375)	11 months ago
Jeff Bolz	44e18ef939 vulkan: fix coopmat2 flash attention for non-contiguous inputs (#11281)	1 year ago
Jeff Bolz	bd38ddea01 vulkan: support copy from f32 to q4_0/q4_1/q5_0/q5_1/q8_0/iq4_nl (#11166)	1 year ago
Johannes Gäßler	9c8dcefe17 CUDA: backwards pass for misc. ops, add tests (#11257)	1 year ago
Johannes Gäßler	432df2d5f9 RoPE: fix back, CUDA support for back + noncont. (#11240)	1 year ago
Molly Sophia	ee7136c6d1 llama: add support for QRWKV6 model architecture (#11001)	1 year ago
Jeff Bolz	716bd6dec3 vulkan: optimize mul_mat for small values of N (#10991)	1 year ago


Commit History