cturan/llama.cpp

Author	SHA1 Message	Date
Xuan Son Nguyen	6c5bc0625f server : (refactoring) do not rely on JSON internally (#10643)	1 year ago
Plamen Minev	7736837d62 fix(server) : not show alert when DONE is received (#10674)	1 year ago
Jeff Bolz	c9c6e01dae vulkan: Add VK_NV_cooperative_matrix2 support for mul_mat and flash attention (#10206)	1 year ago
Riccardo Orlando	6fe6247831 llama : add Minerva 7B model support (#10673)	1 year ago
Georgi Gerganov	0cd182ebcc sync : ggml	1 year ago
PAB	a8cbab201d ggml: add `GGML_SET` Metal kernel + i32 CPU kernel (ggml/1037)	1 year ago
PAB	c2082d93a8 ggml : add `GGML_PAD_REFLECT_1D` operation (ggml/1034)	1 year ago
Daniel Bevenius	d405804be8 py : update outdated copy-paste instructions [no ci] (#10667)	1 year ago
aryantandon01	f112d198cd Update deprecation-warning.cpp (#10619)	1 year ago
Georgi Gerganov	1da7b76569 server : fix speculative decoding with context shift (#10641)	1 year ago
Diego Devesa	59f4db1088 ggml : add predefined list of CPU backend variants to build (#10626)	1 year ago
Diego Devesa	2803540814 ggml-cpu : fix HWCAP2_I8MM value (#10646)	1 year ago
ltoniazzi	253b7fde91 Fix HF repo commit to clone lora test models (#10649)	1 year ago
JFLFY2255	8d0cfd554a llama: Support MiniCPM-1B (with & w/o longrope) (#10559)	1 year ago
Jeff Bolz	2759916d86 vulkan: Implement "fast divide" (mul+shift) for unary ops like copy (#10642)	1 year ago
Nicolò Scipione	40c6d79fb5 SYCL : Move to compile time oneMKL interface backend selection for NVIDIA backend (#10584)	1 year ago
Wang Ran (汪然)	98036d5670 fix typo of README.md (#10605)	1 year ago
Frankie Robertson	cd2f37b304 Avoid using __fp16 on ARM with old nvcc (#10616)	1 year ago
Benson Wong	da6aac91f1 Add docs for creating a static build (#10268) (#10630)	1 year ago
piDack	01e6d9bb71 clip : add sycl support (#10574)	1 year ago
Jeff Bolz	cc98896db8 vulkan: optimize and reenable split_k (#10637)	1 year ago
Xuan Son Nguyen	91c36c269b server : (web ui) Various improvements, now use vite as bundler (#10599)	1 year ago
Georgi Gerganov	1cd3df46bd scripts : remove amx sync	1 year ago
Georgi Gerganov	c505471857 sync : ggml	1 year ago
mahorozte	e9e661bd59 CUDA: remove unnecessary warp reduce in FA (ggml/1032)	1 year ago
PAB	efb6ae9630 feat: add `GGML_UNARY_OP_ARGMAX` Metal kernel (ggml/1019)	1 year ago
PAB	667d70d170 metal : add `GGML_OP_CONV_TRANSPOSE_1D` kernels (ggml/1026)	1 year ago
Xuan Son Nguyen	3b4f2e33e2 llama : add missing LLAMA_API for llama_chat_builtin_templates (#10636)	1 year ago
Nikolaos Pothitos	82bca2257b readme : add option, update default value, fix formatting (#10271)	1 year ago
Georgi Gerganov	0115df2f65 metal : small-batch mat-mul kernels (#10581)	1 year ago
