cturan/llama.cpp

Author	SHA1 Message	Date
Kawrakow	4a3156de2f CUDA: faster dequantize kernels for Q4_0 and Q4_1 (#4938)	2 years ago
David Pflug	a836c8f534 llama : fix missing quotes (#4937)	2 years ago
Kawrakow	467a882fd2 Add ability to use importance matrix for all k-quants (#4930)	2 years ago
Georgi Gerganov	bb0c139247 llama : check LLAMA_TRACE env for extra logging (#4929)	2 years ago
Georgi Gerganov	9408cfdad6 scripts : sync-ggml-am.sh option to skip commits	2 years ago
Georgi Gerganov	03c5267490 llama : use LLAMA_LOG_ macros for logging	2 years ago
Kawrakow	a128c38de8 Fix ffn_down quantization mix for MoE models (#4927)	2 years ago
Alex Azarov	5f5fe1bd60 metal : correctly set SIMD support flags on iOS (#4923)	2 years ago
Karthik Kumar Viswanathan	ac32902a87 llama : support WinXP build with MinGW 8.1.0 (#3419)	2 years ago
Kawrakow	147b17ac94 2-bit quantizations (#4897)	2 years ago
Kawrakow	807179ec58 Make Q3_K_S be the same as olf Q3_K_L for Mixtral-8x7B (#4906)	2 years ago
Georgi Gerganov	76484fbfd3 sync : ggml	2 years ago
Johannes Gäßler	c71d608ce7 ggml: cache sin/cos for RoPE (#4908)	2 years ago
Georgi Gerganov	4be5ef556d metal : remove old API (#4919)	2 years ago
Georgi Gerganov	0ea069b87b server : fix prompt caching with system prompt (#4914)	2 years ago
Georgi Gerganov	f172de03f1 llama : fix detokenization of non-special added-tokens (#4916)	2 years ago
Georgi Gerganov	2d57de5255 metal : disable log for loaded kernels (#4794)	2 years ago
David Friehs	df845cc982 llama : minimize size used for state save/load (#4820)	2 years ago
Someone	6b48ed0893 workflows: unbreak nix-build-aarch64, and split it out (#4915)	2 years ago
Yann Follet	722d33f34e main : add parameter --no-display-prompt (#4541)	2 years ago
texmex76	c30b1ef39a gguf : fix potential infinite for-loop (#4600)	2 years ago
Georgi Gerganov	b38b5e93ae metal : refactor kernel loading code (#4794)	2 years ago
Johannes Gäßler	7dc78764e2 compare-llama-bench: tweak output format (#4910)	2 years ago
Ziad Ben Hadj-Alouane	356327feb3 server : fix deadlock that occurs in multi-prompt scenarios (#4905)	2 years ago
makomk	ee8243adaa server : fix crash with multimodal models without BOS token (#4904)	2 years ago
Georgi Gerganov	15ebe59210 convert : update phi-2 to latest HF repo (#4903)	2 years ago
Georgi Gerganov	de473f5f8e sync : ggml	2 years ago
Georgi Gerganov	f238461236 ggml : fix 32-bit ARM compat for IQ2_XS (whisper/1758)	2 years ago
slaren	fa5c1fb44a backend_sched : fix assignments	2 years ago
Maximilian Winter	52ee4540c0 examples : add pydantic models to GBNF grammar generator (#4883)	2 years ago
