cturan/llama.cpp

Author	SHA1 Message	Date
Chenguang Li	28b5f190ef CANN: implement LRU cache for ACL graphs (#15814)	4 months ago
Daniel Bevenius	86587da03b llama : check returned fn ptrs from ggml_backend_reg_get_proc_address (#15893)	4 months ago
Daniel Bevenius	ff02caf9ee ci : cache ROCm installation in windows-latest-cmake-hip (#15887)	4 months ago
Ruben Ortlam	ae355f6f71 vulkan: throw the oom error instead of no memory type found (#15905)	4 months ago
Jeff Bolz	4f63cd705c vulkan: Fix OOB accesses in soft_max_back (#15861)	4 months ago
Johannes Gäßler	17bc5a815f HIP: use v_dot2_f32_f16 instruction for FA (#15884)	4 months ago
lksj92hs	ed54e32558 Workaround for subgroup arithmetic failing on MoltenVK with AMD GPUs (issue 15846) (#15886)	4 months ago
Aman Gupta	a972faebed CUDA: Add mul_mat_id support for the mmf kernel (#15767)	4 months ago
Johannes Gäßler	550cf726e1 CUDA: fix GET_ROWS for large tensors (#15882)	4 months ago
Georgi Gerganov	c252ce67c4 contrib : add notes about merging PRs (#15881)	4 months ago
Daniel Bevenius	70cd37dbbe requirements : update transformers/torch for Embedding Gemma (#15828)	4 months ago
Piotr Wilkin (ilintar)	acc1b008cf model-conversion : add extra debugging support for model conversion (#15877)	4 months ago
Aldehir Rojas	7057faf64b json : support `enum` values within `allOf` (#15830)	4 months ago
j-k	fe1c92cd7b media : add llama1 icon (#15878)	4 months ago
Jeff Bolz	e68aa10d8f vulkan: sort graph to allow more parallel execution (#15850)	4 months ago
Aman Gupta	0a16bf52e6 CUDA: generate_cu_files.py - add missing mxfp4 (#15880)	4 months ago
Jesse	88021565f0 chat : Deepseek V3.1 reasoning and tool calling support (OpenAI Style) (#15533)	4 months ago
Xuan-Son Nguyen	56920f5665 server : bring back timings_per_token (#15879)	4 months ago
Georgi Gerganov	b0d52998b9 cuda : fix supports_op condition for get_rows when number of blocks is too large (#15868)	4 months ago
Georgi Gerganov	f28d4f4ac9 metal : refactor + optimize (#15857)	4 months ago
Xuan-Son Nguyen	9fcb29f22f ggml: allow casting between f32 and i32 (#15783)	4 months ago
Sigbjørn Skjæret	5ef22d281d CUDA: non-contiguous src0 not supported for PAD (#15869)	4 months ago
Daniel Bevenius	233d773d02 convert : force setting sliding_window from original config (#15867)	4 months ago
Georgi Gerganov	a885dcff11 batched-bench : fix llama_synchronize usage during prompt processing (#15835)	4 months ago
Georgi Gerganov	663027fd54 context : fix n_outputs during reserve (#15858)	4 months ago
Georgi Gerganov	cf0e3ba150 model : avoid ggml_cont_3d for fused QKV weights (#15662)	4 months ago
Jeff Bolz	d413dca003 tests: large sizes for get_rows (#15687)	4 months ago
Chenguang Li	85ca66a746 CANN: Stream sync between devices for acl_graph (#15809)	4 months ago
Jeff Bolz	3976dfbe00 vulkan: support im2col_3d (#15795)	4 months ago
Aaron Teo	d36e61c580 ggml-cpu: clean up s390x SIMD (#15855)	4 months ago

