Sigbjørn Skjæret
|
88fc854b4b
llama : improve sep token handling (#14272)
|
7 months ago |
Diego Devesa
|
e28c1b93fd
cuda : synchronize graph capture and cublas handle destruction (#14288)
|
7 months ago |
Georgi Gerganov
|
d27b3ca175
ggml : fix repack work size for mul_mat_id (#14292)
|
7 months ago |
Charles Xu
|
9230dbe2c7
ggml: Update KleidiAI to v1.9.0 (#14277)
|
7 months ago |
Georgi Gerganov
|
812939a9e9
model : more uniform output id handling (#14275)
|
7 months ago |
Georgi Gerganov
|
4c9fdfbe15
ubatch : new splitting logic (#14217)
|
7 months ago |
Aman Gupta
|
9eaa51e7f0
CUDA: add conv_2d_dw (#14265)
|
7 months ago |
Diego Devesa
|
8f71d0f3e8
ggml-cpu : remove unnecesary arm feature detection (#14281)
|
7 months ago |
Alex Trotta
|
381174bbda
gguf-py : make sentencepiece optional (#14200)
|
7 months ago |
aa956
|
d67341dc18
server : add server parameters for draft model cache type (#13782)
|
7 months ago |
fanyang
|
456af35eb7
build : suppress gcc15 compile warnings (#14261)
|
7 months ago |
Anton Mitkov
|
600e3e9b50
sycl: Cleanup codepaths in Get Rows in sycl backend (#14215)
|
7 months ago |
bashayer hijji
|
fffcce535e
llama-bench : add --no-warmup flag (#14224) (#14270)
|
7 months ago |
pqnet
|
5fc7856815
convert : fix remote option in Windows (#14100)
|
7 months ago |
Aaron Teo
|
faed5a5f5d
llamafile : support s390x SIMD instruction set (#14273)
|
7 months ago |
0cc4m
|
10bb545c5b
Vulkan: Set device max size for host memory to avoid OOM warning and fallback to CPU buffer (#14249)
|
7 months ago |
Gabe Goodhart
|
edc4a29eff
memory : Hybrid recurrent cache (#13979)
|
7 months ago |
Georgi Gerganov
|
ed3290ab34
metal : add mean kernel (#14267)
|
7 months ago |
Aaron Teo
|
8d94713654
docs: add s390x build documentation (#14264)
|
7 months ago |
Aaron Teo
|
50d2227953
ggml-cpu: reduce asm calls for hsum (#14037)
|
7 months ago |
Aaron Teo
|
6231c5cd6d
ggml-cpu: fix uncaught underscore terminators (#14023)
|
7 months ago |
Charles Xu
|
ef035803eb
ggml: Add Apple support for GGML_CPU_ALL_VARIANTS (#14258)
|
7 months ago |
Xuan-Son Nguyen
|
413977de32
mtmd : refactor llava-uhd preprocessing logic (#14247)
|
7 months ago |
Xuan-Son Nguyen
|
95402553a5
llama-chat : fix multiple system message for gemma, orion (#14246)
|
7 months ago |
Sigbjørn Skjæret
|
3865cff4f5
convert : fix null head_dim AutoConfig regression (#14248)
|
7 months ago |
Georgi Gerganov
|
d03172cc79
sync : ggml
|
7 months ago |
Daniel Bevenius
|
dd8e59f443
ggml : disable warnings for tests when using MSVC (ggml/1273)
|
7 months ago |
Daniel Bevenius
|
bbe98d2784
ggml : remove unused ggml_context_container (ggml/1272)
|
7 months ago |
Daniel Bevenius
|
c2056ed6d4
examples : include examples in msvc disable warn (ggml/1270)
|
7 months ago |
bandoti
|
c46503014d
cmake: remove shader-gen step-targets from ggml-vulkan (#14226)
|
7 months ago |