Commit History

| Author | SHA1 | Message | Date |
| --- | --- | --- | --- |
| Aman Gupta | 6eea666912 | llama-graph: avoid expand_forward for fusion (#17633) | 1 month ago |
| Xuan-Son Nguyen | ff90508d68 | contributing: update guidelines for AI-generated code (#17625) | 1 month ago |
| Adrien Gallouët | 0a4aeb927d | cmake : add option to build and link LibreSSL (#17552) | 1 month ago |
| Tarek Dakhran | 2ba719519d | model: LFM2-VL fixes (#17577) | 1 month ago |
| Xuan-Son Nguyen | 7f8ef50cce | clip: fix nb calculation for qwen3-vl (#17594) | 1 month ago |
| Xuan-Son Nguyen | 3c136b21a3 | cli: add migration warning (#17620) | 1 month ago |
| Adrien Gallouët | beb1f0c503 | common : throttle download progress output to reduce IO flush (#17427) | 1 month ago |
| Aaron Teo | def5404f26 | common: add LLAMA_LOG_FILE env var (#17609) | 1 month ago |
| Gilad S. | fa0465954f | ggml: fix: macOS build with `-DGGML_BACKEND_DL=ON` (#17581) | 1 month ago |
| ddh0 | 5a6241feb0 | common: update env var name (#17588) | 1 month ago |
| Aman Gupta | c7af376c29 | CUDA: add stream-based concurrency (#16991) | 1 month ago |
| Mahekk Shaikh | 00425e2ed1 | cuda : add error checking for cudaMemcpyAsync in argsort (#17599) | 1 month ago |
| Acly | 385c3da5e6 | vulkan : fix FA mask load with bounds check (coopmat2) (#17606) | 1 month ago |
| Xuan-Son Nguyen | ab49f094d2 | server: move server-context to its own cpp\|h (#17595) | 1 month ago |
| Haiyue Wang | 8c32d9d96d | server: explicitly set the function name in lambda (#17538) | 1 month ago |
| Igor Smirnov | 0874693b44 | common : fix json schema with '\' in literals (#17307) | 1 month ago |
| Neo Zhang | 7d2add51d8 | sycl : support to malloc memory on device more than 4GB, update the doc and script (#17566) | 1 month ago |
| ixgbe | f698a79c63 | ggml: replace hwcap with riscv_hwprobe for RVV detection (#17567) | 1 month ago |
| Ruben Ortlam | 47a268ea50 | Vulkan: MMVQ Integer Dot K-Quant and MUL_MAT_ID support (#16900) | 1 month ago |
| Jeff Bolz | 59d8d4e963 | vulkan: improve topk perf for large k, fix overflow in unit tests (#17582) | 1 month ago |
| Aleksei Nikiforov | d82b7a7c1d | gguf-py : fix passing non-native endian tensors (editor-gui and new-metadata) (#17553) | 1 month ago |
| DAN™ | 03914c7ef8 | common : move all common_chat_parse_* to chat-parser.cpp. (#17481) | 1 month ago |
| o7si | 3ce7a65c2f | server: fix: /metrics endpoint returning JSON-escaped Prometheus format (#17386) | 1 month ago |
| Diego Devesa | e072b2052e | ggml : add GGML_SCHED_NO_REALLOC option to disable reallocations in ggml_backend_sched (#17276) | 1 month ago |
| R0CKSTAR | c6f7a423c8 | [MUSA] enable fp16/fast_fp16/bf16_mma on PH1 (#17551) | 1 month ago |
| Aman Gupta | 2e7ef98f18 | ggml-cuda: add stricter checking for fusion (#17568) | 1 month ago |
| Fredrik Hultin | ddf9f94389 | server : add Anthropic Messages API support (#17570) | 1 month ago |
| Piotr Wilkin (ilintar) | ff55414c42 | model : Qwen3 Next (#16095) | 1 month ago |
| Johannes Gäßler | 73955f7d2a | CUDA: no FP16 arithmetic for vector FA kernel (#17558) | 1 month ago |
| Jeff Bolz | 35cf8887e1 | vulkan: Implement GGML_OP_TRI (#17503) | 1 month ago |