cturan/llama.cpp

Author	SHA1 Message	Date
Aman Gupta	6eea666912 llama-graph: avoid expand_forward for fusion (#17633)	1 month ago
Xuan-Son Nguyen	ff90508d68 contributing: update guidelines for AI-generated code (#17625)	1 month ago
Adrien Gallouët	0a4aeb927d cmake : add option to build and link LibreSSL (#17552)	1 month ago
Tarek Dakhran	2ba719519d model: LFM2-VL fixes (#17577)	1 month ago
Xuan-Son Nguyen	7f8ef50cce clip: fix nb calculation for qwen3-vl (#17594)	1 month ago
Xuan-Son Nguyen	3c136b21a3 cli: add migration warning (#17620)	1 month ago
Adrien Gallouët	beb1f0c503 common : throttle download progress output to reduce IO flush (#17427)	1 month ago
Aaron Teo	def5404f26 common: add LLAMA_LOG_FILE env var (#17609)	1 month ago
Gilad S.	fa0465954f ggml: fix: macOS build with `-DGGML_BACKEND_DL=ON` (#17581)	1 month ago
ddh0	5a6241feb0 common: update env var name (#17588)	1 month ago
Aman Gupta	c7af376c29 CUDA: add stream-based concurrency (#16991)	1 month ago
Mahekk Shaikh	00425e2ed1 cuda : add error checking for cudaMemcpyAsync in argsort (#17599)	1 month ago
Acly	385c3da5e6 vulkan : fix FA mask load with bounds check (coopmat2) (#17606)	1 month ago
Xuan-Son Nguyen	ab49f094d2 server: move server-context to its own cpp|h (#17595)	1 month ago
Haiyue Wang	8c32d9d96d server: explicitly set the function name in lambda (#17538)	1 month ago
Igor Smirnov	0874693b44 common : fix json schema with '\' in literals (#17307)	1 month ago
Neo Zhang	7d2add51d8 sycl : support to malloc memory on device more than 4GB, update the doc and script (#17566)	1 month ago
ixgbe	f698a79c63 ggml: replace hwcap with riscv_hwprobe for RVV detection (#17567)	1 month ago
Ruben Ortlam	47a268ea50 Vulkan: MMVQ Integer Dot K-Quant and MUL_MAT_ID support (#16900)	1 month ago
Jeff Bolz	59d8d4e963 vulkan: improve topk perf for large k, fix overflow in unit tests (#17582)	1 month ago
Aleksei Nikiforov	d82b7a7c1d gguf-py : fix passing non-native endian tensors (editor-gui and new-metadata) (#17553)	1 month ago
DAN™	03914c7ef8 common : move all common_chat_parse_* to chat-parser.cpp. (#17481)	1 month ago
o7si	3ce7a65c2f server: fix: /metrics endpoint returning JSON-escaped Prometheus format (#17386)	1 month ago
Diego Devesa	e072b2052e ggml : add GGML_SCHED_NO_REALLOC option to disable reallocations in ggml_backend_sched (#17276)	1 month ago
R0CKSTAR	c6f7a423c8 [MUSA] enable fp16/fast_fp16/bf16_mma on PH1 (#17551)	1 month ago
Aman Gupta	2e7ef98f18 ggml-cuda: add stricter checking for fusion (#17568)	1 month ago
Fredrik Hultin	ddf9f94389 server : add Anthropic Messages API support (#17570)	1 month ago
Piotr Wilkin (ilintar)	ff55414c42 model : Qwen3 Next (#16095)	1 month ago
Johannes Gäßler	73955f7d2a CUDA: no FP16 arithmetic for vector FA kernel (#17558)	1 month ago
Jeff Bolz	35cf8887e1 vulkan: Implement GGML_OP_TRI (#17503)	1 month ago
