cturan/llama.cpp

Author	SHA1 Message	Date
Georgi Gerganov	9dcd200d57 metal : remove memory pools (#15966)	4 months ago
Adam	0fa154e350 rocm.Dockerfile: added gfx1200,gfx1201 architectures to support AMD Radeon RX 9000 series (#15994)	4 months ago
Ruben Ortlam	261e6a20ff Vulkan: Clean up mul_mm shader (#15987)	4 months ago
lcy	a0e13dcbe5 build: fix the build failures of Windows HIP release job (#15984)	4 months ago
Georgi Gerganov	a14bd35014 metal : fix kernel requirements (#15983)	4 months ago
Radoslav Gerganov	918b26f197 rpc : fix regression when --device is used (#15981)	4 months ago
Diego Devesa	9ecb884346 releases : update ROCM, add gfx1200, gfx1201, gfx1151 (#15972)	4 months ago
Radoslav Gerganov	d1c6f11f47 doc : update documentation for --tensor-split (#15980)	4 months ago
Aaron Teo	6380d6a3e7 ggml-zdnn: rm user mapped buffers (#15965)	4 months ago
Jeff Bolz	aa0c461efe vulkan: fix failing dequant shaders (#15862)	4 months ago
Jeff Bolz	b9c9c9f789 vulkan: initialize vulkan-hpp to allow using extension function pointers (#15705)	4 months ago
Diego Devesa	50f4281a6f llama : allow using iGPUs with --device (#15951)	4 months ago
Georgi Gerganov	55758b00ca metal : refactor kernel loading (#15964)	4 months ago
Georgi Gerganov	f161463a54 metal : allow ops to run concurrently (#15929)	4 months ago
Georgi Gerganov	84d7b2fca1 metal : fix memory leaks (#15962)	4 months ago
Aaron Teo	40be51152d ggml-zdnn: fix #15414, activate FP16 and BF16 acceleration and incorrect zTensor free (#15839)	4 months ago
Eric Curtin	4bf5549269 Add docker protocol support for llama-server model loading (#15790)	4 months ago
Haiyue Wang	f4e664f838 context : remove redundant explicit casting to the same type (#15948)	4 months ago
Georgi Gerganov	f088b6a84f server : adjust prompt similarity thold + add logs (#15913)	4 months ago
Ruben Ortlam	304ac5693d Vulkan iGPU device selection overhaul and PCI ID API support (#15947)	4 months ago
Mathieu Baudier	6c88ad8fa7 vulkan: Make device memory check more portable (#15939)	4 months ago
Neo Zhang Jianyu	704d90c987 Revert "sycl: add usage of enqueue_functions extension (#14244)" (#15910)	4 months ago
Diego Devesa	360d6533db ggml-backend : add GGML_BACKEND_DEVICE_TYPE_IGPU device type (#15797)	4 months ago
Johannes Gäßler	0e6ff0046f CUDA: larger SRAM reads for tile FA, AMD FP16 dot (#15927)	4 months ago
ddh0	df082f5630 nitpick : correct MB to MiB (#15934)	4 months ago
Daniel Bevenius	24a6734daf ggml-cpu : add check for ARM MATMUL_INT8/i8mm support (#15922)	4 months ago
Charles Xu	2b3efea9a4 kleidiai: fix GGML_ASSERT(*cur_backend_id != -1) failed (#15614)	4 months ago
hipudding	c0389dba43 CANN: Disable acl_graph for prefill stage (#15933)	4 months ago
Oliver Simons	00681dfc16 CUDA: Add `fastdiv` to `k_bin_bcast*`, giving 1-3% E2E performance (#15872)	4 months ago
Jie Fu (傅杰)	4f658855fa llama : support T5 models with unequal number of encoder-decoder layers (#15909)	4 months ago


Commit History