Tianyue-Zhao
|
bacddc049a
model: Add support for CogVLM model (#15002)
|
2 months ago |
Georgi Gerganov
|
85a7d8677b
memory : remove KV cache size padding (#16812)
|
2 months ago |
Johannes Gäßler
|
7a0e900e36
llama: consistent ctx <-> buf order for KV cache (#16746)
|
2 months ago |
Johannes Gäßler
|
945501f5ea
llama: fix leaked buffers for mmap + split files (#16765)
|
2 months ago |
Sigbjørn Skjæret
|
73a48c9790
convert : enable expert group selection for all models with it (#16691)
|
2 months ago |
Sigbjørn Skjæret
|
7cce4f8158
model : set res->t_embd in SmallThinker models (#16782)
|
2 months ago |
Shunta Saito
|
226f295f4d
model : set res->t_embd in PLaMo2 models (#16766)
|
2 months ago |
Max Krasnyansky
|
63d2fc46e1
Add experimental ggml-hexagon backend for the Hexagon NPU (#16547)
|
2 months ago |
Sigbjørn Skjæret
|
84bf3c6778
model : add BailingMoeV2 support (#16063)
|
2 months ago |
Giuseppe Scrivano
|
0398752dd4
model : add Granite Hybrid types (#16635)
|
3 months ago |
Johannes Gäßler
|
66b0dbcb2d
llama-model: fix inconsistent ctxs <-> bufs order (#16581)
|
3 months ago |
Xuan-Son Nguyen
|
3e3cb19f64
llama-quant: add support for mmproj (#16592)
|
3 months ago |
Georgi Gerganov
|
e38b7c6e9e
graph : support cacheless embeddings with FA and iSWA (#16528)
|
3 months ago |
Georgi Gerganov
|
a3cb04744f
metal : fix mul-mm condition + fix mul-mv permuted kernels (#16494)
|
3 months ago |
Saba Fallah
|
e08db42595
model: EmbeddingGemma Adding Support for SentenceTransformers Dense Modules (#16367)
|
3 months ago |
Tarek Dakhran
|
aeaf8a36f0
llama : support LiquidAI LFM2-MoE hybrid model (#16464)
|
3 months ago |
Gadflyii
|
3df2244df4
llama : add --no-host to disable host buffers (#16310)
|
3 months ago |
ddh0
|
f6dcda3900
server : context checkpointing for hybrid and recurrent models (#16382)
|
3 months ago |
Sigbjørn Skjæret
|
946f71ed9a
llama : fix shapes for bert/mpt q/k norm (#16409)
|
3 months ago |
Piotr Wilkin (ilintar)
|
34fcc5a4ac
model : Apertus model implementation (#15852)
|
3 months ago |
Shunta Saito
|
ded67b9444
llama : parameter conversion and loading fixes for PLaMo2 variants (#16075)
|
3 months ago |
Bartowski
|
e74c92e842
model : support GLM 4.6 (make a few NextN/MTP tensors not required) (#16359)
|
3 months ago |
anavp-nvidia
|
a014310374
cuda : Enable CUDA Graph usage for Nemotron Nano v2 (NemotronH) (#16328)
|
3 months ago |
Vinkal
|
72b24d96c6
model : make minicpm embedding_scale, residual_scale and logit_scale optional with legacy defaults (#16273)
|
3 months ago |
Sigbjørn Skjæret
|
835b2b915c
model : add GroveMoE support (#15510)
|
3 months ago |
Douglas Hanley
|
b5bd037832
llama : add support for qwen3 reranker (#15824)
|
3 months ago |
Johannes Gäßler
|
e789095502
llama: print memory breakdown on exit (#15860)
|
3 months ago |
Tarek Dakhran
|
3a59971967
model : add label for LiquidAI LFM2-2.6B model (#16204)
|
3 months ago |
Xuan-Son Nguyen
|
8f8f2274ee
convert : add Llama4ForCausalLM (#16042)
|
4 months ago |
Shane A
|
85286f3548
model : add OLMo3 support (#16015)
|
4 months ago |