Author | Commit | Message | Date
Kawrakow | 334a835a1c | ggml : importance matrix support for legacy quants (#4969) | 2 years ago
Maximilian Winter | 4feb4b33ee | examples : add complete parallel function calling example (#4974) | 2 years ago
Georgi Gerganov | 959ef0c0df | perplexity : fix kv cache handling for hellaswag (#4981) | 2 years ago
Georgi Gerganov | c37b3474e6 | flake.lock: update flake-parts, flake-parts/nixpkgs-lib, and nixpkgs (#4920) | 2 years ago
Paul Tsochantaris | 158f8c9e21 | metal : localized logic in `ggml_metal_graph_compute` (#4924) | 2 years ago
Neuman Vong | 862f5e41ab | android : introduce starter project example (#4926) | 2 years ago
Alex Azarov | 3a48d558a6 | metal : replace loop of dispatch_async with dispatch_apply (#4934) | 2 years ago
Alex Azarov | 7c8d3abd1a | metal : log `recommendedMaxWorkingSetSize` on iOS 16+ (#4936) | 2 years ago
Maximilian Winter | 122ed4840c | examples : fix and improv docs for the grammar generator (#4909) | 2 years ago
Justine Tunney | a0b3ac8c48 | ggml : introduce GGML_CALL function annotation (#4850) | 2 years ago
Daniel Bevenius | d75c232e1d | finetune : use LLAMA_FILE_MAGIC_GGLA (#4961) | 2 years ago
stduhpf | e0324285a5 | speculative : threading options (#4959) | 2 years ago
ngc92 | 3e5ca7931c | pass cpu-architecture arguments only to host code (C;C++) (#4943) | 2 years ago
David Friehs | 4483396751 | llama : apply classifier-free guidance to logits directly (#4951) | 2 years ago
Victor Z. Peng | d9aa4ffa6e | awq-py : fix typo in awq-py/README.md (#4947) | 2 years ago
Georgi Gerganov | ddb008d845 | cuda : fix dequantize kernel names (#4938) | 2 years ago
Kawrakow | 2faaef3979 | llama : check for 256 divisibility for IQ2_XS, IQ2_XXS (#4950) | 2 years ago
Kawrakow | 4a3156de2f | CUDA: faster dequantize kernels for Q4_0 and Q4_1 (#4938) | 2 years ago
David Pflug | a836c8f534 | llama : fix missing quotes (#4937) | 2 years ago
Kawrakow | 467a882fd2 | Add ability to use importance matrix for all k-quants (#4930) | 2 years ago
Georgi Gerganov | bb0c139247 | llama : check LLAMA_TRACE env for extra logging (#4929) | 2 years ago
Georgi Gerganov | 9408cfdad6 | scripts : sync-ggml-am.sh option to skip commits | 2 years ago
Georgi Gerganov | 03c5267490 | llama : use LLAMA_LOG_ macros for logging | 2 years ago
Kawrakow | a128c38de8 | Fix ffn_down quantization mix for MoE models (#4927) | 2 years ago
Alex Azarov | 5f5fe1bd60 | metal : correctly set SIMD support flags on iOS (#4923) | 2 years ago
Karthik Kumar Viswanathan | ac32902a87 | llama : support WinXP build with MinGW 8.1.0 (#3419) | 2 years ago
Kawrakow | 147b17ac94 | 2-bit quantizations (#4897) | 2 years ago
Kawrakow | 807179ec58 | Make Q3_K_S be the same as olf Q3_K_L for Mixtral-8x7B (#4906) | 2 years ago
Georgi Gerganov | 76484fbfd3 | sync : ggml | 2 years ago
Johannes Gäßler | c71d608ce7 | ggml: cache sin/cos for RoPE (#4908) | 2 years ago