Commit History

Author SHA1 Message Date
  Jeff Bolz aea8ddd516 vulkan: fix coopmat2 validation failures (#11284) 1 year ago
  Georgi Gerganov 9f7add1cde examples : fix add_special conditions (#11311) 1 year ago
  Christopher Nielsen 90d987b105 mmap: add include for cerrno (#11296) 1 year ago
  Michael Podvitskiy a4251edd6f cmake: fix shell command quoting in build-info script (#11309) 1 year ago
  Xuan Son Nguyen ec7f3ac9ab llama : add support for Deepseek-R1-Qwen distill model (#11310) 1 year ago
  Georgi Gerganov ef6dada60c cont : fix whitespaces (#11305) 1 year ago
  Kyle Bruene ae3c1db2f9 llama : re-add LLM_ARCH_PHIMOE (#11305) 1 year ago
  Georgi Gerganov 92bc493917 tests : increase timeout when sanitizers are enabled (#11300) 1 year ago
  Georgi Gerganov b9daaffe02 simple-chat : fix BOS being added to each message (#11278) 1 year ago
  Nicolò Scipione 99487b57d4 SYCL: Introducing memory host pool (#11251) 1 year ago
  Eric Curtin a1649cc13f Adding linenoise.cpp to llama-run (#11252) 1 year ago
  Georgi Gerganov 4dd34ff831 cmake : add sanitizer flags for llama.cpp (#11279) 1 year ago
  Xuan Son Nguyen f30f099228 server : implement cancellable request (#11285) 1 year ago
  Georgi Gerganov f26c874179 scripts : restore hf.sh (#11288) 1 year ago
  LostRuins Concedo 6390a998bf tts : add guide tokens support (#11186) 1 year ago
  Jeff Bolz 44e18ef939 vulkan: fix coopmat2 flash attention for non-contiguous inputs (#11281) 1 year ago
  codezjx 3edfa7d375 llama.android: add field formatChat to control whether to parse special tokens when send message (#11270) 1 year ago
  Radoslav Gerganov 667d72846c rpc : early register backend devices (#11262) 1 year ago
  Georgi Gerganov a133566d34 vocab : fix double-eos check (#11273) 1 year ago
  David Renshaw 960ec65273 llama : fix deprecation message: vocabable -> vocab (#11269) 1 year ago
  musoles 7a689c415e README : added kalavai to infrastructure list (#11216) 1 year ago
  Jeff Bolz bd38ddea01 vulkan: support copy from f32 to q4_0/q4_1/q5_0/q5_1/q8_0/iq4_nl (#11166) 1 year ago
  Jeff Bolz 466300fe14 vulkan: optimize coopmat2 q4_k/q5_k dequant functions. (#11206) 1 year ago
  Jeff Bolz 206bc53422 vulkan: optimize coopmat2 q2_k dequant function (#11130) 1 year ago
  RunningLeon 4dbc8b9cb7 llama : add internlm3 support (#11233) 1 year ago
  Johannes Gäßler 9c8dcefe17 CUDA: backwards pass for misc. ops, add tests (#11257) 1 year ago
  Xuan Son Nguyen 681149ced2 llama : add `llama_model_load_from_splits` (#11255) 1 year ago
  fj-y-saito c67cc9837d ggml: aarch64: implement SVE kernels for q4_K_q8_K vector dot (#11227) 1 year ago
  Eve adc5dd92e8 vulkan: scale caching for k quants + misc fixes (#11081) 1 year ago
  Georgi Gerganov f11cfdfd7f ci : use -no-cnv in gguf-split tests (#11254) 1 year ago