cturan/llama.cpp

Author	SHA1 Message	Date
Georgi Gerganov	47f931c8f9 server : enable cache_prompt by default (#10501)	1 year ago
Georgi Gerganov	106964e3d2 metal : enable mat-vec kernels for bs <= 4 (#10491)	1 year ago
Shane A	80acb7b430 Rename Olmo1124 to Olmo2 (#10500)	1 year ago
Diego Devesa	10bce0450f llama : accept a list of devices to use to offload a model (#10497)	1 year ago
Johannes Gäßler	1f922254f0 Github: update issue templates [no ci] (#10489)	1 year ago
brucepro	a9a678a6b2 Add download chat feature to server chat (#10481)	1 year ago
Georgi Gerganov	9ca2e67762 server : add speculative decoding support (#10455)	1 year ago
Diego Devesa	5931c1f233 ggml : add support for dynamic loading of backends (#10469)	1 year ago
Georgi Gerganov	f6d12e7df8 tests : fix compile warning	1 year ago
Georgi Gerganov	b756441104 metal : minor code formatting	1 year ago
Neo Zhang Jianyu	5a8987793f [SYCL] Fix building Win package for oneAPI 2025.0 update (#10483)	1 year ago
Georgi Gerganov	d9d54e498d speculative : refactor and add a simpler example (#10362)	1 year ago
Georgi Gerganov	cce5a90075 flake.lock: Update (#10470)	1 year ago
Diego Devesa	dc39012cba llama : fix op mul check with command-r-plus (#10476)	1 year ago
Gabe Goodhart	9336db462c convert : XLMRoberta Type Vocab Size (#10458)	1 year ago
momonga	96fa2c5e2d fix gguf-py: Conversion error when multiple licenses are configured (#9807)	1 year ago
Diego Devesa	55ed008b2d ggml : do not use ARM features not included in the build (#10457)	1 year ago
蕭澧邦	6dfcfef078 ci: Update oneAPI runtime dll packaging (#10428)	1 year ago
Johannes Gäßler	599b3e0cd4 GitHub: ask for more info in issue templates (#10426)	1 year ago
leo-pony	c18610b4ee CANN: Support Ascend310P to accelerate F32 and F16 Model (#10216)	1 year ago
Diego Devesa	a5e47592b6 cuda : optimize argmax (#10441)	1 year ago
Georgi Gerganov	1bb30bf28c llama : handle KV shift for recurrent models (#10402)	1 year ago
Georgi Gerganov	87a533be57 sync : ggml	1 year ago
slaren	59b9172822 ggml/sched : do not skip views in pre-assignments	1 year ago
Johannes Gäßler	02e4eaf22f ggml-opt: fix data corruption (ggml/1022)	1 year ago
Jeff Bolz	9abe9eeae9 vulkan: predicate max operation in soft_max shaders/soft_max (#10437)	1 year ago
bandoti	f95caa7954 cmake: add link dependencies to cmake find pkg (#10433)	1 year ago
Diego Devesa	fab5d30ff6 llama : add .clang-format file (#10415)	1 year ago
Jeff Bolz	8fd4b7fa29 vulkan: copy iq4_nl LUT into shared memory (#10409)	1 year ago
Jeff Bolz	1bacb9f625 vulkan: further optimize mul_mat_vec using larger loads (#10387)	1 year ago
