cturan/llama.cpp

Author	SHA1 Message	Date
Daniel Bevenius	cd3069dfcb kv-cache : log (debug) all streams in find_slot (#15176)	5 months ago
Sigbjørn Skjæret	50e81bdf5d convert : fix merge conflicts (#15229)	5 months ago
Daniel Bevenius	1ebbaddff2 perplexity : update comments/error msg to use decode [no ci] (#15227)	5 months ago
Julien Denize	a3a7874272 convert : improve Mistral models integration (#14737)	5 months ago
Charles Xu	002cb1bb33 kleidiai: fix unsigned overflow bug (#15150)	5 months ago
David Zhao	79c1160b07 cuda: refactored ssm_scan and use CUB (#13291)	5 months ago
Aman Gupta	34c9d765bf CUDA: add attention sinks for tile and wmma (#15178)	5 months ago
compilade	e54d41befc gguf-py : add Numpy MXFP4 de/quantization support (#15111)	5 months ago
Johannes Gäßler	4850b52aed server-bench: external OAI servers, sqlite (#15179)	5 months ago
AN Long	cd6983d56d ggml : fix field name when new ggml_backend (#14944)	5 months ago
Olivier Chafik	6c7e9a5440 vendor: sync minja (#15161)	5 months ago
Johannes Gäßler	1425f587a8 CUDA: attention sinks for mma FlashAttention (#15157)	5 months ago
lhez	aaa3d07ae7 opencl: support sink in `soft_max` (attn sinks) (#15152)	5 months ago
Xuan-Son Nguyen	50aa938901 convert : support non-mxfp4 HF model (#15153)	5 months ago
Jeff Bolz	c4f53563df vulkan: support fattn sinks (#15126)	5 months ago
Jeff Bolz	a0552c8bee vulkan: Add env var to disable host visible vidmem (#15109)	5 months ago
RunningLeon	99acbc9921 llama : Support intern-s1 (#14875)	5 months ago
uvos	7ad67ba9fe HIP: add cmake option to enable compiler output of kernel resource usage metrics (#15103)	5 months ago
Christian Kastner	9a96389544 ggml: Skip backend library linking code when GGML_BACKEND_DL=ON (#15094)	5 months ago
Johannes Gäßler	1d72c84188 CUDA: GEMM for FP32/FP16/BF16 and ne11 <= 16 (#15131)	5 months ago
Johannes Gäßler	20638e4f16 scripts: fix crash when --tool is not set (#15133)	5 months ago
Daniel Bevenius	36d3f00e14 requirements : fix PyTorch uint64 compatibility (#15134)	5 months ago
Reese Levine	5fd160bbd9 ggml: Add basic SET_ROWS support in WebGPU (#15137)	5 months ago
rmatif	756cfea826 fix profiling crash (#15072)	5 months ago
lhez	e725a1a982 opencl: add `swiglu_oai` and `add_id` (#15121)	5 months ago
Sachin Desai	3db4da56a5 chat : support Granite model reasoning and tool call (#14864)	5 months ago
Juk Armstrong	476aa3fd57 Fixed name `-override-tensors` to `-override-tensor` (#15129)	5 months ago
Diego Devesa	0d8831543c ggml : fix fallback to CPU for ununsupported ops (#15118)	5 months ago
Sigbjørn Skjæret	65c797c4fa chat : fix yandex chat template (#15116)	5 months ago
stevenkuang	25726898e8 chat : fix hunyuan auto-detection (#15114)	5 months ago

