Commit History

| Author | SHA1 | Message | Date |
|---|---|---|---|
| Georgi Gerganov | 0d52a69e4b | ci : fix cmake option (#11125) | 1 year ago |
| Mathieu Baudier | 02f0430141 | Disable GL_KHR_cooperative_matrix Vulkan extension if not available. (#11117) | 1 year ago |
| ag2s20150909 | bec2183f2c | fix: Vulkan shader gen binary path when Cross-compiling (#11096) | 1 year ago |
| Johannes Gäßler | 53ff6b9b9f | GGUF: C++ refactor, backend support, misc fixes (#11030) | 1 year ago |
| Diego Devesa | 017cc5f446 | ggml-backend : only offload from host buffers (fix) (#11124) | 1 year ago |
| Diego Devesa | a3d50bc022 | ggml-backend : only offload from host buffers (#11120) | 1 year ago |
| Radoslav Gerganov | a4dd490069 | rpc : code cleanup (#11107) | 1 year ago |
| Akarshan Biswas | c0d6f790d0 | SYCL: Use get_multi_ptr instead of deprecated get_pointer in wkv6 (#11087) | 1 year ago |
| Eric Curtin | dc7cef9f37 | llama-run : fix context size (#11094) | 1 year ago |
| Georgi Gerganov | ecebbd292d | llama : remove unused headers (#11109) | 1 year ago |
| Xuan Son Nguyen | 96be8c3264 | github : add cmd line field to bug report (#11090) | 1 year ago |
| Georgi Gerganov | e6e7c75d94 | server : fix extra BOS in infill endpoint (#11106) | 1 year ago |
| Xuan Son Nguyen | 09186fabbe | llama : remove check flash_attn with lora (#11104) | 1 year ago |
| Asghar Ghorbani | 96a1dc27c3 | llama : prevent system info string accumulation across calls (#11101) | 1 year ago |
| Daniel Bevenius | 6369f867a4 | llama : rename missed batch params/vars to ubatch (#10059) | 1 year ago |
| Georgi Gerganov | 47182dd03f | llama : update llama_model API names (#11063) | 1 year ago |
| Georgi Gerganov | 3e6e7a6bc2 | tokenize : escape the prompt (#11058) | 1 year ago |
| Georgi Gerganov | ae2f606bb5 | mmap : fix fileno macro clash (#11076) | 1 year ago |
| Georgi Gerganov | 727368c60f | llama : use LLAMA_TOKEN_NULL (#11062) | 1 year ago |
| Georgi Gerganov | 5047dd3546 | llama : use _impl suffix instead of _internal (#11060) | 1 year ago |
| Johannes Gäßler | 46e3556e01 | CUDA: add BF16 support (#11093) | 1 year ago |
| 0cc4m | b56f079e28 | Vulkan: Add device-specific blacklist for coopmat for the AMD proprietary driver (#11074) | 1 year ago |
| fairydreaming | 9394bbd484 | llama : Add support for DeepSeek V3 (#11049) | 1 year ago |
| matt23654 | f922a9c542 | [GGML][RPC] Support for models with non-512-aligned tensors over RPC. (#11047) | 1 year ago |
| DAN™ | 46be942214 | llama : add support for the cohere2 model architecture (#10900) | 1 year ago |
| Georgi Gerganov | 78c6785175 | sync : ggml | 1 year ago |
| Georgi Gerganov | 5e3b08d606 | ggml : do not install metal source when embed library (ggml/1054) | 1 year ago |
| Daniel Bevenius | db68c93b57 | ggml : improve inputs log sched_print_assignments (ggml/1053) | 1 year ago |
| Gilad S. | c31fc8b966 | fix: Vulkan shader gen binary path (#11037) | 1 year ago |
| Molly Sophia | 4b0c638b9a | common : disable KV cache shifting automatically for unsupported models (#11053) | 1 year ago |