Commit History

| Author | SHA1 | Message | Date |
|---|---|---|---|
| Diego Devesa | e0e912f49b | llama : add option to override model tensor buffers (#11397) | 9 months ago |
| Georgi Gerganov | e0dbec0bc6 | llama : refactor llama_context, llama_kv_cache, llm_build_context (#12181) | 10 months ago |
| Xuan-Son Nguyen | 7841fc723e | llama : Add Gemma 3 support (+ experimental vision capability) (#12343) | 10 months ago |
| Georgi Gerganov | bdcf8b6a56 | cont : fix mmap flag print (#11699) | 11 months ago |
| Georgi Gerganov | ed926d8833 | llama : fix defrag logic (#11707) | 11 months ago |
| magicse | 333820d749 | llama : fix progress dots (#11730) | 11 months ago |
| tv1wnd | 855cd0734a | llama : fix old glm4 models (#11670) | 11 months ago |
| Johannes Gäßler | fd08255d0d | CUDA: non-contiguous (RMS) norm support (#11659) | 11 months ago |
| piDack | 0cec062a63 | llama : add support for GLM-Edge and GLM-Edge-V series models (#10573) | 11 months ago |
| Molly Sophia | 325afb370a | llama: fix missing k_cache store for rwkv6qwen2 (#11445) | 11 months ago |
| Johannes Gäßler | df984e0147 | llama: refactor llama_decode_impl (#11381) | 11 months ago |
| Frank Mai | 1d8ee06000 | rpc: fix register position (#11424) | 11 months ago |
| Radoslav Gerganov | 667d72846c | rpc : early register backend devices (#11262) | 1 year ago |
| Xuan Son Nguyen | 681149ced2 | llama : add `llama_model_load_from_splits` (#11255) | 1 year ago |
| Johannes Gäßler | 432df2d5f9 | RoPE: fix back, CUDA support for back + noncont. (#11240) | 1 year ago |
| Georgi Gerganov | afa8a9ec9b | llama : add `llama_vocab`, functions -> methods, naming (#11110) | 1 year ago |
| Molly Sophia | ee7136c6d1 | llama: add support for QRWKV6 model architecture (#11001) | 1 year ago |
| Pierrick Hymbert | f8feb4b01a | model: Add support for PhiMoE arch (#11003) | 1 year ago |
| Xuan Son Nguyen | 4d2b3d8804 | lora : improve compat with `mergekit-extract-lora` (#11131) | 1 year ago |
| Georgi Gerganov | ecebbd292d | llama : remove unused headers (#11109) | 1 year ago |
| Xuan Son Nguyen | 09186fabbe | llama : remove check flash_attn with lora (#11104) | 1 year ago |
| Asghar Ghorbani | 96a1dc27c3 | llama : prevent system info string accumulation across calls (#11101) | 1 year ago |
| Daniel Bevenius | 6369f867a4 | llama : rename missed batch params/vars to ubatch (#10059) | 1 year ago |
| Georgi Gerganov | 47182dd03f | llama : update llama_model API names (#11063) | 1 year ago |
| Georgi Gerganov | 5047dd3546 | llama : use _impl suffix instead of _internal (#11060) | 1 year ago |
| fairydreaming | 9394bbd484 | llama : Add support for DeepSeek V3 (#11049) | 1 year ago |
| DAN™ | 46be942214 | llama : add support for the cohere2 model architecture (#10900) | 1 year ago |
| Georgi Gerganov | f66f582927 | llama : refactor `src/llama.cpp` (#10902) | 1 year ago |
| Yun Dou | b92a14a841 | llama : support InfiniAI Megrez 3b (#10893) | 1 year ago |
| ymcki | 6f0c9e034b | llama : support for Llama-3_1-Nemotron-51B (#10669) | 1 year ago |