cturan/llama.cpp

Author	SHA1 Message	Date
compilade	66625a59a5 graph : reduce splits for recurrent and hybrid models (#14825)	6 months ago
lhez	6e6725459a opencl: add `mul_mat_f32_f32_l4_lm` and `mul_mat_f16_f32_l4_lm` (#14809)	6 months ago
Ed Addario	e9192bec56 quantize : fix using combined imatrix GGUFs (multiple datasets) (#14973)	6 months ago
Daniel Bevenius	41e78c567e server : add support for `embd_normalize` parameter (#14964)	6 months ago
uvos	ad4a700117 HIP: enable mfma mmq on gfx908 and gfx90a for select datatypes and shapes (#14949)	6 months ago
Georgi Gerganov	e32a4ec60e sync : ggml	6 months ago
Kai Pastor	e228de9449 cmake : Fix BLAS link interface (ggml/1316)	6 months ago
Kai Pastor	73a8e5ca03 vulkan : fix 32-bit builds (ggml/1313)	6 months ago
Johannes Gäßler	92b8810ec7 CUDA: skip masked KV slices for all FA kernels (#14924)	6 months ago
Georgi Gerganov	00131d6eaf tests : update for LLAMA_SET_ROWS=1 (#14961)	6 months ago
Georgi Gerganov	1e15bfd42c graph : fix stack-use-after-return (#14960)	6 months ago
Douglas Hanley	a118d80233 embeddings: fix extraction of CLS pooling results (#14927)	6 months ago
Xinpeng Dou	61550f8231 CANN: update ops docs (#14935)	6 months ago
uvos	aa79524c51 HIP: remove the use of __HIP_PLATFORM_AMD__, explicitly support only AMD targets (#14945)	6 months ago
uvos	b77d11179d HIP: add GGML_HIP_MMQ_MFMA option to allow disableing the MFMA path. (#14930)	6 months ago
uvos	c7aa1364fd HIP: Ignore unsupported unroll transformation in fattn-vec (#14931)	6 months ago
kallewoof	1a67fcc306 common : avoid logging partial messages (which can contain broken UTF-8 sequences) (#14937)	6 months ago
hipudding	204f2cf168 CANN: Add ggml_set_rows (#14943)	6 months ago
Sigbjørn Skjæret	138b288b59 cuda : add softcap fusion (#14907)	6 months ago
Johannes Gäßler	bbd0f91779 server-bench: make seed choice configurable (#14929)	6 months ago
Aman Gupta	0a5036bee9 CUDA: add roll (#14919)	6 months ago
lhez	8ad7b3e65b opencl : add ops docs (#14910)	6 months ago
Leonard Mosescu	bda62193b2 test-backend-ops : extend test case filtering (#14865)	6 months ago
Radoslav Gerganov	c556418b60 llama-bench : use local GPUs along with RPC servers (#14917)	6 months ago
xctan	db16e2831c ggml-cpu : deduplicate scalar implementations (#14897)	6 months ago
Akarshan Biswas	cd1fce6d4f SYCL: Add set_rows support for quantized types (#14883)	6 months ago
Xuan-Son Nguyen	00fa15fedc mtmd : add support for Voxtral (#14862)	6 months ago
Johannes Gäßler	946b1f6859 CUDA: fix pointer incrementation in FA (#14916)	6 months ago
Dongliang Wei	6c6e397aff model : add support for SmallThinker series (#14898)	6 months ago
Alberto Cabrera Pérez	afc0e89698 sycl: refactor quantization to q8_1 (#14815)	6 months ago

Commit History