cturan/llama.cpp

Author	SHA1 Message	Date
Diego Devesa	d6818d06a6 llama : allow other bufts when overriding to CPU, add --no-repack option (#14990)	5 months ago
Ruben Ortlam	e08a98826b Vulkan: Fix minor debug mode issues (#14899)	5 months ago
tc-mb	952a47f455 mtmd : support MiniCPM-V 4.0 (#14983)	5 months ago
Csaba Kecskemeti	36e5fe7bcd MODEL_TENSOR.SSM_DT_NORM has defined twice (#14991)	5 months ago
g2mt	94933c8c2e server : implement universal assisted decoding (#12635)	5 months ago
Dongliang Wei	c1dacaa99b llama : merge build_moe_ffn_from_probs function into build_moe_ffn (#14968)	5 months ago
Lukas Straub	a9f77a8be3 server : add openai-style logit_bias support (#14946)	5 months ago
Aman Gupta	8a4a856277 Add LLaDA 8b Diffusion model (#14771)	5 months ago
hipudding	11490b3672 CANN: Improve loading efficiency after converting weights to NZ format. (#14985)	5 months ago
compilade	66625a59a5 graph : reduce splits for recurrent and hybrid models (#14825)	5 months ago
lhez	6e6725459a opencl: add `mul_mat_f32_f32_l4_lm` and `mul_mat_f16_f32_l4_lm` (#14809)	5 months ago
Ed Addario	e9192bec56 quantize : fix using combined imatrix GGUFs (multiple datasets) (#14973)	5 months ago
Daniel Bevenius	41e78c567e server : add support for `embd_normalize` parameter (#14964)	5 months ago
uvos	ad4a700117 HIP: enable mfma mmq on gfx908 and gfx90a for select datatypes and shapes (#14949)	5 months ago
Georgi Gerganov	e32a4ec60e sync : ggml	5 months ago
Kai Pastor	e228de9449 cmake : Fix BLAS link interface (ggml/1316)	5 months ago
Kai Pastor	73a8e5ca03 vulkan : fix 32-bit builds (ggml/1313)	5 months ago
Johannes Gäßler	92b8810ec7 CUDA: skip masked KV slices for all FA kernels (#14924)	5 months ago
Georgi Gerganov	00131d6eaf tests : update for LLAMA_SET_ROWS=1 (#14961)	5 months ago
Georgi Gerganov	1e15bfd42c graph : fix stack-use-after-return (#14960)	5 months ago
Douglas Hanley	a118d80233 embeddings: fix extraction of CLS pooling results (#14927)	5 months ago
Xinpeng Dou	61550f8231 CANN: update ops docs (#14935)	5 months ago
uvos	aa79524c51 HIP: remove the use of __HIP_PLATFORM_AMD__, explicitly support only AMD targets (#14945)	5 months ago
uvos	b77d11179d HIP: add GGML_HIP_MMQ_MFMA option to allow disabling the MFMA path. (#14930)	5 months ago
uvos	c7aa1364fd HIP: Ignore unsupported unroll transformation in fattn-vec (#14931)	5 months ago
kallewoof	1a67fcc306 common : avoid logging partial messages (which can contain broken UTF-8 sequences) (#14937)	5 months ago
hipudding	204f2cf168 CANN: Add ggml_set_rows (#14943)	5 months ago
Sigbjørn Skjæret	138b288b59 cuda : add softcap fusion (#14907)	5 months ago
Johannes Gäßler	bbd0f91779 server-bench: make seed choice configurable (#14929)	5 months ago
Aman Gupta	0a5036bee9 CUDA: add roll (#14919)	5 months ago

Commit History