cturan/llama.cpp

Author	SHA1 Message	Date
Reese Levine	15bff84bf5 ggml webgpu: initial flashattention implementation (#18610)	3 weeks ago
Jeff Bolz	2524c26164 vulkan: fix push constant size for quantize_q8_1 (#18687)	3 weeks ago
Jeff Bolz	cb14b06995 vulkan: optimize ssm_scan (#18630)	3 weeks ago
Adrien Gallouët	55abc39355 vendor : update cpp-httplib to 0.30.0 (#18660)	3 weeks ago
Georgi Gerganov	f2f6c88067 scripts : support chaining commands in pr2wt.sh (#18671)	3 weeks ago
도로로도로또	945bf10627 metal : add MoE kernel specialization for ne20=5 (#18667)	3 weeks ago
Johannes Gäßler	64848deb18 llama-fit-params: free memory target per device (#18679)	3 weeks ago
Doctor Shotgun	9a5724dee2 ggml: add env var GGML_OP_OFFLOAD_MIN_BATCH (#18535)	3 weeks ago
Daniel Bevenius	9c142e3a2a model-conversion : add warn about transformers mismatch (#18691)	3 weeks ago
Daniel Bevenius	df7fb92170 model-conversion : remove -st targets for converted model (#18689)	3 weeks ago
Julius Tischbein	2038101bd9 llama : add `use_direct_io` flag for model loading (#18166)	3 weeks ago
shaofeiqi	568371a726 opencl: add FILL op support (#18682)	3 weeks ago
Sigbjørn Skjæret	5b8844ae53 scripts : fix repos cloned with .git extension (#18669)	3 weeks ago
Sigbjørn Skjæret	7e16fef085 convert : more variants of rope_theta config entries (#18668)	3 weeks ago
Oliver Walsh	f5245b5e4e cuda : fix build on cuda 12.8 (#18672)	3 weeks ago
R	ae9f8df778 fix(docker): add missing libglvnd libraries to Vulkan image (#18664)	3 weeks ago
Adrien Gallouët	56d2fed2b3 tools : remove llama-run (#18661)	3 weeks ago
Georgi Gerganov	56426673cb scripts : add pr2wt.sh (#18644)	3 weeks ago
Daniel Bevenius	bb77764c2d convert : clarify sentence-transformers-dense-modules help [no ci] (#18662)	3 weeks ago
Sigbjørn Skjæret	9dfa8ee950 ci : run cann build unconditionally [no ci] (#18659)	3 weeks ago
Jeff Bolz	ca4a8370bc vulkan: reject ops when a tensor is too large to allocate (#18646)	3 weeks ago
virajwad	03023296cf vulkan: Warptile tuning for Intel Xe2/Xe3 (#18178)	3 weeks ago
Eve	8c77a04cc7 vulkan: more mul mat optimizations (#18533)	3 weeks ago
Daniel Bevenius	ffba4f29e6 examples : add debug utility/example (#18464)	3 weeks ago
hipudding	3333951d86 CANN: Fix rename for get_env (#18652)	3 weeks ago
Raul Torres	193ee38a1b CANN: Rename `get_env` to `get_env_as_lowercase` (#18624)	3 weeks ago
Max Krasnyansky	95ea9e0861 Hexagon add support for f16/f32 flash attention, scale, set-rows and improve f16/32 matmul (#18611)	3 weeks ago
Tarek Dakhran	ccbc84a537 mtmd: mtmd_audio_streaming_istft (#18645)	3 weeks ago
Johannes Gäßler	68b4d516c3 llama-params-fit: fix last devices with low VRAM (#18494)	3 weeks ago
Aadeshveer Singh	24af22fc36 ggml : optimize cuda ssm_scan using warp-level reduction (#18505)	3 weeks ago
