cturan/llama.cpp

Author	SHA1 Message	Date
fj-y-saito	df70bedda7 arm64: add i8mm route with SVE ggml_vec_dot_q4_K_q8_K and ggml_vec_dot_q6_K_… (#15277)	2 months ago
Georgi Gerganov	f914544b16 batched-bench : add "separate text gen" mode (#17103)	2 months ago
Xuan-Son Nguyen	4b13a684c5 mtmd: fix patch_size initialized to random value in audio models (#17128)	2 months ago
Georgi Gerganov	9898b57cbe editorconfig : ignore benches/ (#17140)	2 months ago
Acly	1032256ec9 cuda/vulkan : bicubic interpolation (#17022)	2 months ago
Georgi Gerganov	15274c0c50 benches : add eval results (#17139)	2 months ago
Georgi Gerganov	b8595b16e6 mtmd : fix embedding size for image input (#17123)	2 months ago
Ruben Ortlam	392e09a608 vulkan: fix memory allocations (#17122)	2 months ago
compilade	802cef44bf convert : parse safetensors directly (#15667)	2 months ago
compilade	1c07c0c68c convert : handle compressed-tensors quant method (#17069)	2 months ago
Georgi Gerganov	cb1adf8851 server : handle failures to restore host cache (#17078)	2 months ago
Georgi Gerganov	ef1d826997 benches : add folder with benchmarks (#16931)	2 months ago
Eric Curtin	86fde91e62 Switch to using Ubuntu 25.10 vulkan/mesa (#16497)	2 months ago
Ruben Ortlam	7f3e9d339c vulkan: iGPU memory reporting fix (#17110)	2 months ago
Ruben Ortlam	8a3519b708 vulkan: fix mmq out of bounds reads (#17108)	2 months ago
Jeff Bolz	80a6cf6347 vulkan: fuse mul_mat_id + mul (#17095)	2 months ago
Georgi Gerganov	0750a59903 metal : retain src and dst buffers during async ops (#17101)	2 months ago
Xuan-Son Nguyen	aa3b7a90b4 arg: add --cache-list argument to list cached models (#17073)	2 months ago
chansikpark	333f2595a3 webui: fix keyboard shortcuts for new chat & edit chat title (#17007)	2 months ago
Jeff Bolz	53d7d21e61 vulkan: Use spec constants for conv2d s/d/p and kernel W/H (#16978)	2 months ago
Aidan	eeee367de5 server: fix correct time_ms calculation in prompt_progress (#17093)	2 months ago
Aman Gupta	64fe17fbb8 Revert "CUDA: add expert reduce kernel (#16857)" (#17100)	2 months ago
Aman Gupta	c1b187688d CUDA: skip fusion for repeating adds in bias (#17080)	2 months ago
SavicStefan	b8a5cfd11a vulkan: Increase BK to 32; use BK/4 for non-CM mul_mm.comp (#16636)	2 months ago
Aleksei Nikiforov	08416ebe7f ggml: disable vxe for cross-compilation by default (#16966)	2 months ago
Jeff Bolz	b4e335d8dc vulkan: fuse rms_norm + mul + rope (+ view + set_rows) (#16977)	2 months ago
Jeff Bolz	d6fe40fa00 vulkan: Fix test-thread-safety crashes (#17024)	2 months ago
Johannes Gäßler	e14e842e87 CUDA: fix MMQ stream-k fixup ne1 indices (#17089)	2 months ago
Reese Levine	647b960bd8 ggml webgpu: faster matrix multiplication/matrix-vector multiplication (#17031)	2 months ago
bssrdf	299f5d782c CUDA: properly handle nb00=nb02 case for cpy (#17081)	2 months ago
