| Author | Commit | Message | Date |
|---|---|---|---|
| hipudding | 6ba6a3c76f | docs : update ops.md for CANN backend (#18654) | 1 week ago |
| Perry Naseck | 0802d4cfb3 | ggml-blas: hide warnings from included BLAS headers (#18818) | 1 week ago |
| Tarek Dakhran | c945aaaef2 | mtmd : Fix ASR for LFM2.5-Audio-1.5B (#18876) | 1 week ago |
| Xuan-Son Nguyen | c15395f73c | common : implement new jinja template engine (#18462) | 1 week ago |
| Julius Tischbein | aa1dc3770a | Setting mmap and direct_io to false as default in llama-bench.cpp (#18841) | 1 week ago |
| Raul Torres | 4ea2eaac01 | CANN: Remove unused `ggml_cann_get_device` function (#18625) | 1 week ago |
| Chenguang Li | e20fa27a02 | CANN: fix an issue where get_env was not fully renamed (#18796) | 1 week ago |
| hipudding | baa4ba0aec | CANN: support gated linear attn (#18653) | 1 week ago |
| shaofeiqi | 785a710085 | OpenCL: add SOLVE_TRI op support (#18846) | 2 weeks ago |
| Georgi Gerganov | 6e7fc8a146 | cuda : print less debug logs when disabling cuda graphs (#18868) | 2 weeks ago |
| Georgi Gerganov | be8e3d9515 | context : do not reserve scheduler for warmups (#18867) | 2 weeks ago |
| ddh0 | 13f1e4a9ca | llama : add adaptive-p sampler (#17927) | 2 weeks ago |
| Xuan-Son Nguyen | a04c2b06a3 | server: improve slots scheduling for n_cmpl (#18789) | 2 weeks ago |
| Georgi Gerganov | 39173bcacb | context : reserve new scheduler when graph topology changes (#18547) | 2 weeks ago |
| Johannes Gäßler | 5c662d21a3 | CUDA: fix allignment on register spill for FA (#18815) | 2 weeks ago |
| shalinib-ibm | 8cc0ba957b | ggml-cpu: optimize ggml_vec_dot_bf16 for Power9 (#18837) | 2 weeks ago |
| Xuan-Son Nguyen | a7e6ddb8bd | lora: make sure model keep track of associated adapters (#18490) | 2 weeks ago |
| Sigbjørn Skjæret | 2a13180100 | model-loader : support bool array sliding window pattern (#18850) | 2 weeks ago |
| Adrien Gallouët | ec997b4f2b | tests : download models only when running ctest (#18843) | 2 weeks ago |
| Max Krasnyansky | cff777f226 | hexagon: support for OP_CPY, host buffers now optional, hvx-utils refactoring and optimizations (#18822) | 2 weeks ago |
| Oliver Simons | 36f0132464 | CUDA: Factor out and re-use `block_reduce` function (#18785) | 2 weeks ago |
| Piotr Wilkin (ilintar) | d98b548120 | Restore clip's cb() to its rightful glory - extract common debugging elements in llama (#17914) | 2 weeks ago |
| Junwon Hwang | 8fb7175576 | model : clean up and fix EXAONE-MoE configuration (#18840) | 2 weeks ago |
| Adrien Gallouët | 516a4ca9b5 | refactor : remove libcurl, use OpenSSL when available (#18828) | 2 weeks ago |
| Jeff Bolz | 3e4bb29666 | vulkan: Check maxStorageBufferRange in supports_op (#18709) | 2 weeks ago |
| Aman Gupta | 47f9612492 | llama-model: fix unfortunate typo (#18832) | 2 weeks ago |
| Daniel Bevenius | 01cbdfd7eb | CUDA : fix typo in clang pragma comment [no ci] (#18830) | 2 weeks ago |
| Ruben Ortlam | 635ef78ec5 | vulkan: work around Intel fp16 bug in mmq (#18814) | 2 weeks ago |
| Perry Naseck | 7d587e5544 | ggml-metal: do not copy headers for embedded, use current binary dir for embedded (#18705) | 2 weeks ago |
| Daniel Benjaminsson | d34aa07193 | mmap: add Haiku support by skipping RLIMIT_MEMLOCK check (#18819) | 2 weeks ago |