Commit History

| Author | SHA1 | Message | Date |
|---|---|---|---|
| Georgi Gerganov | f9be42add0 | readme : add quantization info | 2 years ago |
| Georgi Gerganov | 574406dc7e | ggml : add Q5_0 and Q5_1 quantization (#1187) | 2 years ago |
| Ásgeir Bjarni Ingvarsson | 87a6f846d3 | Allow setting the rng seed after initialization. (#1184) | 2 years ago |
| DaniAndTheWeb | ea3ad7eb60 | Updating build instructions to include BLAS support (#1183) | 2 years ago |
| Pavol Rusnak | 859fee6dfb | quantize : use `map` to assign quantization type from `string` (#1191) | 2 years ago |
| Stephan Walter | 4afcc37869 | Update SHA256SUMS after quantization change (#1181) | 2 years ago |
| ostix360 | 667c501334 | py : cast lora_alpha to int in convert-lora-to-ggml (#1170) | 2 years ago |
| Pavol Rusnak | bb98e77be7 | nix: use convert.py instead of legacy wrapper convert-pth-to-ggml.py (#981) | 2 years ago |
| Georgi Gerganov | 7a32fcb3b2 | ggml : add Q8_0 quantization format (rename the old one to Q8_1) (ARM NEON) (#1179) | 2 years ago |
| unbounded | dd0eabc049 | ggml : use full range for Q4_0 and Q4_2 quantization (#729) | 2 years ago |
| xaedes | 54bb60e268 | ggml : fix bug in ggml_compute_forward_sum_f32 (#1162) | 2 years ago |
| Georgi Gerganov | 8a0f8673ba | ggml : export symbols (#1155) | 2 years ago |
| xaedes | 0c5692345d | examples : add save_load_state example (#1150) | 2 years ago |
| Georgi Gerganov | 957c8ae21d | llama : increase scratch buffer size for 65B (ref #1152) | 2 years ago |
| mgroeber9110 | 9b0a4d4214 | examples/main README improvements and some light refactoring (#1131) | 2 years ago |
| Stephan Walter | 2ec83428de | Fix build for gcc 8 and test in CI (#1154) | 2 years ago |
| slaren | e4cf982e0d | Fix cuda compilation (#1128) | 2 years ago |
| Georgi Gerganov | c4fe84fb0d | llama : refactor get / set state + remove redundant kv cache API (#1143) | 2 years ago |
| slaren | 1d78fecdab | Fix LoRA acronym (#1145) | 2 years ago |
| Georgi Gerganov | 284685f169 | scripts : add helper scripts to synch ggml repo | 2 years ago |
| DannyDaemonic | edce63baa9 | Added README.md for main with examples and explanations (#1139) | 2 years ago |
| Georgi Gerganov | ec9cdb6752 | ggml : do not print perf ops that have not been used at all | 2 years ago |
| Georgi Gerganov | e4422e299c | ggml : better PERF prints + support "LLAMA_PERF=1 make" | 2 years ago |
| Stephan Walter | 53c8434398 | Improve AVX2 for vec_dot_q4_3_q8_0 (#1138) | 2 years ago |
| Pavol Rusnak | c6524f46eb | readme : update gpt4all instructions (#980) | 2 years ago |
| Yishuo Wang | c9e2c26f41 | A better `packNibbles` and `mul_sum_i8_pairs_float` implementation using AVX512 (#1119) | 2 years ago |
| Georgi Gerganov | 0e018fe008 | ggml : fix Q4_3 cuBLAS | 2 years ago |
| Stephan Walter | 857308d1e8 | ci : trigger CI for drafts, but not most PR actions (#1125) | 2 years ago |
| Stephan Walter | c50b628810 | Fix CI: ARM NEON, quantization unit tests, editorconfig (#1122) | 2 years ago |
| unbounded | 5f939498d5 | ggml : unit test for quantization functions (#953) | 2 years ago |