Commit History

| Author | SHA1 | Message | Date |
| --- | --- | --- | --- |
| 0cc4m | a8a1f33567 | Vulkan: Add DP4A MMQ and Q8_1 quantization shader (#12135) | 9 months ago |
| Georgi Gerganov | 1790e73157 | cmake : fix whitespace (#0) | 9 months ago |
| Georgi Gerganov | 0114a32da0 | sync : ggml | 9 months ago |
| Sandro Hanea | a7724480fd | cmake: improve Vulkan cooperative matrix support checks (whisper/2966) | 9 months ago |
| Sigbjørn Skjæret | 1a85949067 | llava : proper description fix (#12668) | 9 months ago |
| Akarshan Biswas | 6c02a032fa | SYCL: Remove misleading ggml_sycl_op_flatten function (#12387) | 9 months ago |
| Sigbjørn Skjæret | f52d59d771 | llava : fix clip loading GGUFs with missing description (#12660) | 9 months ago |
| marcoStocchi | 52de2e5949 | tts : remove printfs (#12640) | 9 months ago |
| Sigbjørn Skjæret | 2c3f8b850a | llama : support BailingMoE (Ling) (#12634) | 9 months ago |
| Georgi Gerganov | 4663bd353c | metal : use constexpr in FA kernels + fix typedef (#12659) | 9 months ago |
| Juyoung Suk | b3de7cac73 | llama : add Trillion 7B model support (#12556) | 9 months ago |
| Sergei Vorobyov | 7242dd9675 | llama-chat : Add Yandex instruct model template support (#12621) | 9 months ago |
| R0CKSTAR | 492d7f1ff7 | musa: fix all warnings, re-enable `-DLLAMA_FATAL_WARNINGS=ON` in ci and update doc (#12611) | 9 months ago |
| Georgi Gerganov | d3f1f0acfb | sync : ggml | 9 months ago |
| Xuan-Son Nguyen | 360dc22c00 | cpu : rm unused variable (ggml/1166) | 9 months ago |
| cmdr2 | a62d7fa7a9 | cpu: de-duplicate some of the operators and refactor (ggml/1144) | 9 months ago |
| Daniel Bevenius | e408d4351a | ggml : add logging for native build options/vars (whisper/2935) | 10 months ago |
| Daniel Bevenius | 3891e183c6 | examples : command.wasm updates (whisper/2904) | 10 months ago |
| Xuan-Son Nguyen | af6ae1efb2 | llama : fix non-causal mask for gemma 3 (#12615) | 9 months ago |
| Djip007 | 0bb2919335 | llama : change cpu_buft_list order: ACCEL -> GPU host -> CPU extra -> CPU (#12632) | 9 months ago |
| Jay | a69f846351 | cmake : fix ccache conflict (#12522) | 9 months ago |
| hipudding | d07a0d7a79 | CANN : remove clang-format in ggml-cann (#12607) | 9 months ago |
| Sigbjørn Skjæret | 3714c3ee1a | llama : fix incorrect Qwen2Moe ffn_moe_out graph callback (#12631) | 9 months ago |
| Georgi Gerganov | b4ae50810e | metal : improve FA + improve MoE (#12612) | 9 months ago |
| Icenowy Zheng | b86f600723 | vulkan: fix coopmat shader generation when cross-compiling (#12272) | 9 months ago |
| Johannes Gäßler | dd373dd3bf | llama: fix error on bad grammar (#12628) | 9 months ago |
| Benson Wong | 5d01670266 | server : include speculative decoding stats when timings_per_token is enabled (#12603) | 9 months ago |
| Radoslav Gerganov | ef03229ff4 | rpc : update README for cache usage (#12620) | 9 months ago |
| amritahs-ibm | 13731766db | llamafile : ppc64le GEMV forwarding for FP32. (#12594) | 9 months ago |
| Radoslav Gerganov | ab6ab8f809 | rpc : send hash when tensor data is above some fixed threshold (#12496) | 9 months ago |