Commit History

Author            SHA1        Message  Date
Johannes Gäßler   28103f4832  Server: fix seed for multiple slots (#6835)  1 year ago
Tristan Druyen    abd3314064  llama : add phi 3 chat template (#6857)  1 year ago
liuwei-git        c8297c6af5  llama : add phi3 support (#6852)  1 year ago
Georgi Gerganov   8960fe86ae  llama : fix typo in <|im_end|> token text (#6745)  1 year ago
Georgi Gerganov   40f74e4d73  llama : add option to render special/control tokens (#6807)  1 year ago
Wouter            7dbdba5690  llama : add llama-3 chat template (#6751)  1 year ago
Pedro Cuenca      b97bc3966e  llama : support Llama 3 HF conversion (#6745)  1 year ago
nopperl           9958c81b79  Implement the OLMo architecture (#6741)  1 year ago
slaren            0d56246f4b  ggml : group all experts in a single ggml_mul_mat_id (#6505)  1 year ago
Ren Xuancheng     e11b2e6e1e  Qwen2 : assume tied weights if lm_head/output weights is missing (#6738)  1 year ago
slaren            c71bfd736e  llama : fix compatibility with old 2 expert models (#6735)  1 year ago
Georgi Gerganov   532c1737a1  llama : make general.name optional (#6709)  1 year ago
Ashish            dbceec87c0  llama : add StableLM2 12B (#6635)  1 year ago
Shijie            f4dea7da18  llama : add qwen2moe (#6074)  1 year ago
Daniel Bevenius   4fbd8098e6  gguf : add special tokens metadata for FIM/Infill (#6689)  1 year ago
compilade         132f55795e  llama : fix restoring the number of outputs from state files (#6687)  1 year ago
David Renshaw     1958f7e06c  llama : add missing kv clear in llama_beam_search (#6664)  1 year ago
Chao Jiang        04fbc5f23e  Add Command R chat template (#6650)  1 year ago
Pierrick Hymbert  4bd0f93e4a  model: support arch `DbrxForCausalLM` (#6515)  1 year ago
jiez              91c736015b  llama : add gguf_remove_key + remove split meta during quantize (#6591)  1 year ago
MasterYi1024      dee7f8d692  Correct free memory and total memory. (#6630)  1 year ago
Clint Herron      04a5ac211e  Optimization: eliminate addition of redundant stacks when advancing grammar. (#6616)  1 year ago
Olivier Chafik    cbaadc9294  grammars: 1.5x faster inference w/ complex grammars (vector reserves / reuses) (#6609)  1 year ago
Pierrick Hymbert  b804b1ef77  eval-callback: Example how to use eval callback for debugging (#6576)  1 year ago
slaren            4f407a0a35  llama : add model types for mixtral (#6589)  1 year ago
Jared Van Bortel  1b67731e18  BERT tokenizer fixes (#6498)  1 year ago
Carolinabanana    5dc9dd7152  llama : add Command R Plus support (#6491)  1 year ago
Georgi Gerganov   cc4a95426d  llama : fix attention layer count sanity check (#6550)  1 year ago
Georgi Gerganov   b73e564b16  quantize : fix precedence of cli args (#6541)  1 year ago
Rick G            e3c337d87c  llama : support negative ith in llama_get_ API (#6519)  1 year ago