cturan/llama.cpp

Author	SHA1 Message	Date
Xuan Son Nguyen	3573fa8e7b server : (refactor) no more json in server_task input (#10691)	1 year ago
Georgi Gerganov	d9c3ba2b77 ggml : disable iq4_nl interleave size 8 (#10709)	1 year ago
Georgi Gerganov	ce4a7b8493 server : various fixes (#10704)	1 year ago
Djip007	19d8762ab6 ggml : refactor online repacking (#10446)	1 year ago
Georgi Gerganov	c2a16c0bdb server : fix free of spec context and batch (#10651)	1 year ago
0cc4m	3df784b305 Vulkan: VK_KHR_cooperative_matrix support to speed up prompt processing (#10597)	1 year ago
Robert Ormandi	86a1934978 metal : Extend how Llama.cpp locates metal resources (#10676)	1 year ago
Sukriti Sharma	784a14aa49 convert : add support for Roberta embeddings (#10695)	1 year ago
Georgi Gerganov	c5ede3849f convert : add custom attention mapping	1 year ago
Xuan Son Nguyen	f162d45a21 common : bring back --no-warmup to server (#10686)	1 year ago
Xuan Son Nguyen	6c5bc0625f server : (refactoring) do not rely on JSON internally (#10643)	1 year ago
Plamen Minev	7736837d62 fix(server) : not show alert when DONE is received (#10674)	1 year ago
Jeff Bolz	c9c6e01dae vulkan: Add VK_NV_cooperative_matrix2 support for mul_mat and flash attention (#10206)	1 year ago
Riccardo Orlando	6fe6247831 llama : add Minerva 7B model support (#10673)	1 year ago
Georgi Gerganov	0cd182ebcc sync : ggml	1 year ago
PAB	a8cbab201d ggml: add `GGML_SET` Metal kernel + i32 CPU kernel (ggml/1037)	1 year ago
PAB	c2082d93a8 ggml : add `GGML_PAD_REFLECT_1D` operation (ggml/1034)	1 year ago
Daniel Bevenius	d405804be8 py : update outdated copy-paste instructions [no ci] (#10667)	1 year ago
aryantandon01	f112d198cd Update deprecation-warning.cpp (#10619)	1 year ago
Georgi Gerganov	1da7b76569 server : fix speculative decoding with context shift (#10641)	1 year ago
Diego Devesa	59f4db1088 ggml : add predefined list of CPU backend variants to build (#10626)	1 year ago
Diego Devesa	2803540814 ggml-cpu : fix HWCAP2_I8MM value (#10646)	1 year ago
ltoniazzi	253b7fde91 Fix HF repo commit to clone lora test models (#10649)	1 year ago
JFLFY2255	8d0cfd554a llama: Support MiniCPM-1B (with & w/o longrope) (#10559)	1 year ago
Jeff Bolz	2759916d86 vulkan: Implement "fast divide" (mul+shift) for unary ops like copy (#10642)	1 year ago
Nicolò Scipione	40c6d79fb5 SYCL : Move to compile time oneMKL interface backend selection for NVIDIA backend (#10584)	1 year ago
Wang Ran (汪然)	98036d5670 fix typo of README.md (#10605)	1 year ago
Frankie Robertson	cd2f37b304 Avoid using __fp16 on ARM with old nvcc (#10616)	1 year ago
Benson Wong	da6aac91f1 Add docs for creating a static build (#10268) (#10630)	1 year ago
piDack	01e6d9bb71 clip : add sycl support (#10574)	1 year ago
