| Author | Commit | Message | Date |
| --- | --- | --- | --- |
| ddpasa | 21ca987fba | docs: Update link to ggml-org in multimodal.md (#13513) | 8 months ago |
| Sigbjørn Skjæret | be1d4a13db | scripts : fix compare-llama-bench.py show parameter (#13514) | 8 months ago |
| Jeff Bolz | ab3971f2a0 | vulkan: workaround FA compile failures on macos (#13517) | 8 months ago |
| Ed Addario | e5c834f718 | quantize : improve tensor-type pattern matching (#13033) | 8 months ago |
| Xuan-Son Nguyen | 71bdbdb587 | clip : clip.h become private API (⚠️ breaking change) (#13510) | 8 months ago |
| Georgi Gerganov | f0995d28ce | metal : use FA-vec kernel up to batch size 20 (#13496) | 8 months ago |
| Georgi Gerganov | c252e0c409 | metal : optimize multi-sequence FA vec kernel (#13493) | 8 months ago |
| Dan Johansson | 4f711afed5 | ggml-cpu: Update KleidiAI to v1.6 and fix include directives (#13509) | 8 months ago |
| Georgi Gerganov | b89d605a91 | batched-bench : fix pp batch contents (#13492) | 8 months ago |
| Xuan-Son Nguyen | b4726345ac | mtmd : remove libllava, remove clip-quantize-cli (⚠️ breaking change) (#13460) | 8 months ago |
| Sigbjørn Skjæret | bf79371120 | scripts : support arbitrary input file formats in compare-llama-bench.py (#13455) | 8 months ago |
| Gabe Goodhart | d590cd4c24 | model : Granite MoE shared (#13269) | 8 months ago |
| Georgi Gerganov | 1e2809bc4b | sync : ggml | 8 months ago |
| Diego Devesa | cf0a43bb64 | llama-bench : add defrag-thold, check for invalid ranges (#13487) | 8 months ago |
| lhez | f0d46ef157 | opencl: remove unnecessary assert for `add` (#13257) | 8 months ago |
| Xuan-Son Nguyen | de4c07f937 | clip : cap max image size 1024 for qwen vl model (#13478) | 8 months ago |
| Johannes Gäßler | 10d2af0eaa | llama/ggml: add LLM training support (#10544) | 8 months ago |
| Georgi Gerganov | 064cc596ac | context : fix state io for memory-less contexts (#13470) | 8 months ago |
| Anudit Nagar | 91159ee9df | server : allow content to be null in oaicompat_completion_params_parse (#13477) | 8 months ago |
| Diego Devesa | 22cdab343b | llama-bench : accept ranges for integer parameters (#13410) | 8 months ago |
| Dan Johansson | a71a4075cd | ggml-cpu: Integrate fp32=bf16xbf16 SME KleidiAI kernel (#13053) | 8 months ago |
| Johannes Gäßler | 95e18884fc | CUDA: fix misaligned synchronization in FA (#13469) | 8 months ago |
| Xuan-Son Nguyen | df8491922f | ggml : add mrope kernel for metal (#13457) | 8 months ago |
| Atharva Dubey | 14492144c2 | enable dpcpp nightly builds with libraries (#13406) | 8 months ago |
| City | c104023994 | mtmd : Use RMS norm for InternVL 3 38B and 78B mmproj (#13459) | 8 months ago |
| Anthony Umfer | 9a390c4829 | tools : fix uninitialized llama_batch in server (#13436) | 8 months ago |
| Sigbjørn Skjæret | 09232370fc | scripts : exit compare-llama-bench.py gracefully when there's nothing to compare (#13451) | 8 months ago |
| Johannes Gäßler | 7474e00b34 | CUDA: fix crash with partial offloading of MoE (#13439) | 8 months ago |
| David Huang | 7f323a589f | Add `--no-op-offload` to improve `-ot` pp perf in MoE models like llama4 400B (#13386) | 8 months ago |
| City | 3eac209319 | mtmd : support InternVL 3 38B and 78B mmproj (#13443) | 8 months ago |