cturan/llama.cpp

Author	SHA1 Message	Date
lhez	7cba58bbea opencl: add sqr, sqrt, mean and ssm_conv (#17476)	1 month ago
Alberto Cabrera Pérez	5449367b21 Fix chunks being too small with small matrix sizes (#17526)	1 month ago
Han Qingzhe	1d594c295c clip: (minicpmv) fix resampler kq_scale (#17516)	1 month ago
Jeff Bolz	eec1e33a9e vulkan: allow graph_optimize for prompt processing workloads (#17475)	1 month ago
Jeff Bolz	879d673759 vulkan: Implement top-k (#17418)	1 month ago
xctan	6ab4e50d9c ggml-cpu : add RISC-V Zvfh impl for ggml_vec_mad_f16 (#17448)	1 month ago
Adrien Gallouët	2336cc4784 cmake : use EXCLUDE_FROM_ALL to avoid patch-boringssl.cmake (#17520)	1 month ago
Adrien Gallouët	e6923caaec ggml : fix ARM feature verification (#17519)	1 month ago
Jiacheng (Jason) Chen	3e18dba9fd HIP: Patch failed testcase in WMMA-MMQ kernels for RDNA 4 (#17502)	1 month ago
hipudding	eeb5605de2 CANN: Add MROPE and IMROPE support (#17401)	1 month ago
o7si	f3a848a3b1 chore: upgrade cpp-httplib from v0.27.0 to v0.28.0 (#17513)	1 month ago
Jeff Bolz	b3b03a7baf vulkan: Implement GGML_OP_CUMSUM (#17479)	1 month ago
Georgi Gerganov	583cb83416 ggml : add ggml_top_k (#17365)	1 month ago
Aleksei Nikiforov	05872ac885 convert : fix big-endian conversion (#17431)	1 month ago
Diego Devesa	55ab25caf5 codeowners : remove slaren (#17492)	1 month ago
TianHao324	064c90d843 CANN: supports out_prod operator for F32 and F16 (#17406)	1 month ago
Pascal	b1846f1c8e webui: add rehype plugin to restore HTML in Markdown table cells (#17477)	1 month ago
Jeff Bolz	d414db02d3 vulkan: Use fewer rows for scalar FA when HS is not a multiple of 16 (#17455)	1 month ago
Aaron Teo	877566d512 llama: introduce support for model-embedded sampling parameters (#17120)	1 month ago
Jeff Bolz	3d07caa99b vulkan: more FA details in vk_perf_logger (#17443)	1 month ago
Daniel Bevenius	134e6940ca llama : skip output reordering for single token batches (#17466)	1 month ago
Jiacheng (Jason) Chen	0543f928a3 HIP: WMMA-MMQ kernels for RDNA 4 (#17156)	1 month ago
Sigbjørn Skjæret	b61de2b2df convert : allow quantizing lora again (#17453)	1 month ago
Xuan-Son Nguyen	b8372eecd9 server: split server.cpp code into server/common/task/queue (#17362)	1 month ago
Daniel Bevenius	6ab8eacddf examples : add -kvu to batched usage example [no ci] (#17469)	1 month ago
Georgi Gerganov	2d50b9d8cb sync : ggml	1 month ago
Daniel Bevenius	697edfeead ggml : remove dirty flag from version string (ggml/1391)	1 month ago
Alberto Cabrera Pérez	dbb852b549 ggml-cpu: arm64: q4_K repack gemm and gemv implementations (i8mm) (#16739)	1 month ago
ixgbe	5f55c385cb ggml: add RISC-V cpu-feats (#17461)	1 month ago
william pan	4902eebe33 models : Added support for RND1 Diffusion Language Model (#17433)	1 month ago
