Commit History

| Author | SHA1 | Message | Date |
|---|---|---|---|
| Georgi Gerganov | 4f4bf35f46 | py : fix missing added_tokens_dict for SPM and BPE vocabs (#4971) | 2 years ago |
| Kawrakow | 2b3a665d39 | llama : use Q4_K for attn_v for Q2_K_S when n_gqa >= 4 (#4996) | 2 years ago |
| Paul Tsochantaris | 7563293665 | metal : remove unnecessary nil check (#4986) | 2 years ago |
| David Renshaw | f46c0c1b0e | llama : fix copy/paste error in llama_sampling_params comment (#4994) | 2 years ago |
| Georgi Gerganov | 5c99960901 | py : remove unnecessary hasattr (#4903) | 2 years ago |
| Philip Taron | bee938da74 | nix: remove nixConfig from flake.nix (#4984) | 2 years ago |
| Daniel Bevenius | cec8a48470 | finetune : add training data file to log message (#4979) | 2 years ago |
| Kawrakow | 334a835a1c | ggml : importance matrix support for legacy quants (#4969) | 2 years ago |
| Maximilian Winter | 4feb4b33ee | examples : add complete parallel function calling example (#4974) | 2 years ago |
| Georgi Gerganov | 959ef0c0df | perplexity : fix kv cache handling for hellaswag (#4981) | 2 years ago |
| Georgi Gerganov | c37b3474e6 | flake.lock: update flake-parts, flake-parts/nixpkgs-lib, and nixpkgs (#4920) | 2 years ago |
| Paul Tsochantaris | 158f8c9e21 | metal : localized logic in `ggml_metal_graph_compute` (#4924) | 2 years ago |
| Neuman Vong | 862f5e41ab | android : introduce starter project example (#4926) | 2 years ago |
| Alex Azarov | 3a48d558a6 | metal : replace loop of dispatch_async with dispatch_apply (#4934) | 2 years ago |
| Alex Azarov | 7c8d3abd1a | metal : log `recommendedMaxWorkingSetSize` on iOS 16+ (#4936) | 2 years ago |
| Maximilian Winter | 122ed4840c | examples : fix and improv docs for the grammar generator (#4909) | 2 years ago |
| Justine Tunney | a0b3ac8c48 | ggml : introduce GGML_CALL function annotation (#4850) | 2 years ago |
| Daniel Bevenius | d75c232e1d | finetune : use LLAMA_FILE_MAGIC_GGLA (#4961) | 2 years ago |
| stduhpf | e0324285a5 | speculative : threading options (#4959) | 2 years ago |
| ngc92 | 3e5ca7931c | pass cpu-architecture arguments only to host code (C;C++) (#4943) | 2 years ago |
| David Friehs | 4483396751 | llama : apply classifier-free guidance to logits directly (#4951) | 2 years ago |
| Victor Z. Peng | d9aa4ffa6e | awq-py : fix typo in awq-py/README.md (#4947) | 2 years ago |
| Georgi Gerganov | ddb008d845 | cuda : fix dequantize kernel names (#4938) | 2 years ago |
| Kawrakow | 2faaef3979 | llama : check for 256 divisibility for IQ2_XS, IQ2_XXS (#4950) | 2 years ago |
| Kawrakow | 4a3156de2f | CUDA: faster dequantize kernels for Q4_0 and Q4_1 (#4938) | 2 years ago |
| David Pflug | a836c8f534 | llama : fix missing quotes (#4937) | 2 years ago |
| Kawrakow | 467a882fd2 | Add ability to use importance matrix for all k-quants (#4930) | 2 years ago |
| Georgi Gerganov | bb0c139247 | llama : check LLAMA_TRACE env for extra logging (#4929) | 2 years ago |
| Georgi Gerganov | 9408cfdad6 | scripts : sync-ggml-am.sh option to skip commits | 2 years ago |
| Georgi Gerganov | 03c5267490 | llama : use LLAMA_LOG_ macros for logging | 2 years ago |