Commit History

| Author | SHA1 | Message | Date |
|---|---|---|---|
| Georgi Gerganov | f9be42add0 | readme : add quantization info | 2 years ago |
| Georgi Gerganov | 574406dc7e | ggml : add Q5_0 and Q5_1 quantization (#1187) | 2 years ago |
| Ásgeir Bjarni Ingvarsson | 87a6f846d3 | Allow setting the rng seed after initialization. (#1184) | 2 years ago |
| DaniAndTheWeb | ea3ad7eb60 | Updating build instructions to include BLAS support (#1183) | 2 years ago |
| Pavol Rusnak | 859fee6dfb | quantize : use `map` to assign quantization type from `string` (#1191) | 2 years ago |
| Stephan Walter | 4afcc37869 | Update SHA256SUMS after quantization change (#1181) | 2 years ago |
| ostix360 | 667c501334 | py : cast lora_alpha to int in convert-lora-to-ggml (#1170) | 2 years ago |
| Pavol Rusnak | bb98e77be7 | nix: use convert.py instead of legacy wrapper convert-pth-to-ggml.py (#981) | 2 years ago |
| Georgi Gerganov | 7a32fcb3b2 | ggml : add Q8_0 quantization format (rename the old one to Q8_1) (ARM NEON) (#1179) | 2 years ago |
| unbounded | dd0eabc049 | ggml : use full range for Q4_0 and Q4_2 quantization (#729) | 2 years ago |
| xaedes | 54bb60e268 | ggml : fix bug in ggml_compute_forward_sum_f32 (#1162) | 2 years ago |
| Georgi Gerganov | 8a0f8673ba | ggml : export symbols (#1155) | 2 years ago |
| xaedes | 0c5692345d | examples : add save_load_state example (#1150) | 2 years ago |
| Georgi Gerganov | 957c8ae21d | llama : increase scratch buffer size for 65B (ref #1152) | 2 years ago |
| mgroeber9110 | 9b0a4d4214 | examples/main README improvements and some light refactoring (#1131) | 2 years ago |
| Stephan Walter | 2ec83428de | Fix build for gcc 8 and test in CI (#1154) | 2 years ago |
| slaren | e4cf982e0d | Fix cuda compilation (#1128) | 2 years ago |
| Georgi Gerganov | c4fe84fb0d | llama : refactor get / set state + remove redundant kv cache API (#1143) | 2 years ago |
| slaren | 1d78fecdab | Fix LoRA acronym (#1145) | 2 years ago |
| Georgi Gerganov | 284685f169 | scripts : add helper scripts to synch ggml repo | 2 years ago |
| DannyDaemonic | edce63baa9 | Added README.md for main with examples and explanations (#1139) | 2 years ago |
| Georgi Gerganov | ec9cdb6752 | ggml : do not print perf ops that have not been used at all | 2 years ago |
| Georgi Gerganov | e4422e299c | ggml : better PERF prints + support "LLAMA_PERF=1 make" | 2 years ago |
| Stephan Walter | 53c8434398 | Improve AVX2 for vec_dot_q4_3_q8_0 (#1138) | 2 years ago |
| Pavol Rusnak | c6524f46eb | readme : update gpt4all instructions (#980) | 2 years ago |
| Yishuo Wang | c9e2c26f41 | A better `packNibbles` and `mul_sum_i8_pairs_float` implementation using AVX512 (#1119) | 2 years ago |
| Georgi Gerganov | 0e018fe008 | ggml : fix Q4_3 cuBLAS | 2 years ago |
| Stephan Walter | 857308d1e8 | ci : trigger CI for drafts, but not most PR actions (#1125) | 2 years ago |
| Stephan Walter | c50b628810 | Fix CI: ARM NEON, quantization unit tests, editorconfig (#1122) | 2 years ago |
| unbounded | 5f939498d5 | ggml : unit test for quantization functions (#953) | 2 years ago |