Commit History

Author            SHA1        Message  Date
DAN™              889bdd7686  command-r : add BPE pre-tokenization (#7063)  1 year ago
Georgi Gerganov   92139b90af  tests : add test-tokenizer-0.sh + fix some tokenizers (#7036)  1 year ago
alwqx             6ecf3189e0  chore: fix typo in llama.cpp (#7032)  1 year ago
Georgi Gerganov   9c67c2773d  ggml : add Flash Attention (#5021)  1 year ago
Georgi Gerganov   f4ab2a4147  llama : fix BPE pre-tokenization (#6920)  1 year ago
Johannes Gäßler   c4f708a93f  llama : fix typo LAMMAFILE -> LLAMAFILE (#6974)  1 year ago
Xuan Son Nguyen   7bb36ccf91  gguf : enforce that tensor names are unique (#6905)  1 year ago
agray3            928e0b7013  Reset schedule earlier to allow overlap with ggml graph computation on device (#6933)  1 year ago
Pierrick Hymbert  0c4d489e29  quantize: add imatrix and dataset metadata in GGUF (#6658)  1 year ago
slaren            017e6999b5  add basic tensor data validation function (#6884)  1 year ago
Georgi Gerganov   dba497e0c1  cmake : restore LLAMA_LLAMAFILE_DEFAULT  1 year ago
slaren            d6e1d44f16  llama : synchronize before get/set session data (#6911)  1 year ago
slaren            0ead1f1072  llama : check that all the tensor data is in the model file (#6885)  1 year ago
Georgi Gerganov   aa750c1ede  tests : minor bash stuff (#6902)  1 year ago
jiez              1966eb2615  quantize : add '--keep-split' to quantize model into shards (#6688)  1 year ago
Douglas Hanley    b4e4b8a935  llama : add llama_get_pooling_type function (#6862)  1 year ago
Johannes Gäßler   28103f4832  Server: fix seed for multiple slots (#6835)  1 year ago
Tristan Druyen    abd3314064  llama : add phi 3 chat template (#6857)  1 year ago
liuwei-git        c8297c6af5  llama : add phi3 support (#6852)  1 year ago
Georgi Gerganov   8960fe86ae  llama : fix typo in <|im_end|> token text (#6745)  1 year ago
Georgi Gerganov   40f74e4d73  llama : add option to render special/control tokens (#6807)  1 year ago
Wouter            7dbdba5690  llama : add llama-3 chat template (#6751)  1 year ago
Pedro Cuenca      b97bc3966e  llama : support Llama 3 HF conversion (#6745)  1 year ago
nopperl           9958c81b79  Implement the OLMo architecture (#6741)  1 year ago
slaren            0d56246f4b  ggml : group all experts in a single ggml_mul_mat_id (#6505)  1 year ago
Ren Xuancheng     e11b2e6e1e  Qwen2 : assume tied weights if lm_head/output weights is missing (#6738)  1 year ago
slaren            c71bfd736e  llama : fix compatibility with old 2 expert models (#6735)  1 year ago
Georgi Gerganov   532c1737a1  llama : make general.name optional (#6709)  1 year ago
Ashish            dbceec87c0  llama : add StableLM2 12B (#6635)  1 year ago
Shijie            f4dea7da18  llama : add qwen2moe (#6074)  1 year ago