cturan/llama.cpp

Author	SHA1 Message	Date
Georgi Gerganov	e92d53b29e sampling : optimize samplers by reusing bucket sort (#15665)	4 months ago
Georgi Gerganov	0d161f021a server : enable /slots by default and make it secure (#15630)	4 months ago
Georgi Gerganov	4efd5a8316 metal : fix checks for available FA kernels (#15700)	4 months ago
Diego Devesa	274966226f llama : fix fattn reserve call n_seqs parameter (#15699)	4 months ago
Diego Devesa	9777032dcc llama : separate compute buffer reserve from fattn check (#15696)	4 months ago
Sigbjørn Skjæret	7d3c9f2b21 ci : explicitly set fa off or on (#15692)	4 months ago
Jeff Bolz	bbbf5ecccb vulkan: handle large sizes for get_rows (#15686)	4 months ago
Jeff Bolz	c37052ab4d vulkan: mul_mat_id coopmat2 optimizations (#15546)	4 months ago
Daniel Bevenius	5c16b9c87d vulkan : remove unused portability_enumeration_ext variable (#15679)	4 months ago
Jeff Bolz	b97c9edc59 vulkan: Allow fallback to sysmem memory when vidmem is full (#15649)	4 months ago
Jeff Bolz	94e82c7ead vulkan: clamp matmul and FA results to the max finite value (#15652)	4 months ago
Charles Xu	4d74393bcc ggml: update kleidiai to v1.13.0 (#15663)	4 months ago
Diego Devesa	dd892555b0 Update build.md to remove MSVC arm64 notes (#15684)	4 months ago
Johannes Gäßler	e81b8e4b7f llama: use FA + max. GPU layers by default (#15434)	4 months ago
Johannes Gäßler	38ad381f9f CUDA: use FP32 arithmetic for conv2d (#15683)	4 months ago
Jeff Bolz	696fccf354 vulkan: Skip syncing for prealloc_y when it is reused (#15544)	4 months ago
Chenguang Li	ef476916bb CANN: FIx compiler warnings (#15661)	4 months ago
Sergey Alirzaev	d82f6aa34a server : removed obsolete doc (#15670)	4 months ago
Johannes Gäßler	3d16b29c3b scripts: strip "AMD Instinct" from GPU name (#15668)	4 months ago
ExtReMLapin	792b44f2ed server : add documentation for `parallel_tool_calls` param (#15647)	4 months ago
Aman Gupta	81017865ee CUDA: fix bug in rms_norm fusion (#15660)	4 months ago
Piotr Wilkin (ilintar)	60e5eee31f chat : Seed OSS thinking + tool call support (#15552)	4 months ago
Aman Gupta	009b709d6e CUDA: fuse adds, fuse add with rms norm (#15631)	4 months ago
Gabe Goodhart	e8d99dd0b6 nvidia nemotron nano v2 (nemotronh) (#15507)	4 months ago
Gabe Goodhart	a8bca68f72 fix: Compute the full sum in llama-eval-callback, not just the sum of printed values (#15637)	4 months ago
mnehete32	c97dc09391 CUDA: add conv2d (#15635)	4 months ago
Aaron Teo	6c442f42ff ggml-cpu: fix invalid hsum build in debug s390x (#15634)	4 months ago
compilade	73804145ab ggml : fix SSM_SCAN for n_groups > 1 (#15625)	4 months ago
Georgi Gerganov	c8d0d14e77 kv-cache : fix find_slot to not search for continuous slot (#15638)	4 months ago
Sigbjørn Skjæret	84ab83cc0b model : jina-embeddings-v3 support (#13693)	4 months ago

