Commit History

Author SHA1 Message Date
  comex 2663d2c678 Windows fixes (#890) 2 years ago
  comex 180b693a47 Print model version. 2 years ago
  comex f963b63afa Rewrite loading code to try to satisfy everyone: 2 years ago
  unbounded 62cfc54f77 Add quantize-stats command for testing quantization (#728) 2 years ago
  Ivan Stepanov 4953e9007f llama : always sort logits before nucleus sampling (#812) 2 years ago
  Georgi Gerganov 986b6ce9f9 ggml, llama : avoid heavy V transpose + improvements (#775) 2 years ago
  Ivan Stepanov 5a8c4f6240 llama : define non-positive top_k; top_k range check (#779) 2 years ago
  Ivan Stepanov cd7fa95690 Define non-positive temperature behavior (#720) 2 years ago
  Christian Falch e986f94829 Added api for getting/setting the kv_cache (#685) 2 years ago
  Marian Cepok c0bb1d3ce2 ggml : change ne to int64_t (#626) 2 years ago
  Stephan Walter 81040f10aa llama : do not allocate KV cache for "vocab_only == true" (#682) 2 years ago
  Justine Tunney ee0c40dd6d Introduce GGML migration tool for new file format 2 years ago
  Justine Tunney 6f23ba5ee2 Ensure --mlock works properly with mmap() support 2 years ago
  Justine Tunney 78ca9838ee Make loading weights 10-100x faster 2 years ago
  Slaren a017390358 Initial windows support (untested) 2 years ago
  Slaren ac184d5147 Always initialize mm_addr and mm_length in llama_model 2 years ago
  Slaren 276e5b7811 Unmap the file in llama_free 2 years ago
  Slaren d68c5dc435 Make mmap_file static 2 years ago
  Slaren 64bde3ffd4 Fix ggml_init_params in quantize 2 years ago
  Slaren c03ae8dca1 Add mmap support for model files 2 years ago
  Georgi Gerganov 0ba76c1e73 llama : fix compile warnings when reading the vocab 2 years ago
  Maël Kerbiriou 41318d708e llama : use the same threshold for OpenBLAS and ggml thread limiting (#577) 2 years ago
  thement d0aaff571c py : add temporary script to convert old ggml files to newer version (#539) 2 years ago
  Stephan Walter 436e561931 all : be more strict about converting float to double (#458) 2 years ago
  Stephan Walter c1f885067c ggml : introduce structs for the q4 data blocks (#356) 2 years ago
  Georgi Gerganov 03f7e33560 Cleanup STL headers + fix embedding examples + minor stuff 2 years ago
  Georgi Gerganov 4640eff23d Don't interfere with BLAS for large prompts by running only 1 thread 2 years ago
  slaren 29b7baab67 Add timings for the prompt evaluation (#478) 2 years ago
  Georgi Gerganov 2a2e63ce05 Fix nasty bug in ggml_compute_forward_mul_mat_f32() and reenable BLAS 2 years ago
  Jed Fox 58e6c9f36f Add support for file load progress reporting callbacks (#434) 2 years ago