Piotr Wilkin
|
178230ee21
Getting to decode stage...
|
4 months ago |
Piotr Wilkin (ilintar)
|
c78f9fce68
Merge branch 'ggml-org:master' into qwen3_next
|
4 months ago |
Piotr Wilkin
|
344331c2b6
First draft
|
4 months ago |
Xuan-Son Nguyen
|
8f8f2274ee
convert : add Llama4ForCausalLM (#16042)
|
4 months ago |
Shane A
|
85286f3548
model : add OLMo3 support (#16015)
|
4 months ago |
Aman Gupta
|
6d758839ff
Add LLaDA-7b-MoE diffusion model (#16003)
|
4 months ago |
Sigbjørn Skjæret
|
b8e09f08b9
model : add grok-2 support (#15539)
|
4 months ago |
Jie Fu (傅杰)
|
4f658855fa
llama : support T5 models with unequal number of encoder-decoder layers (#15909)
|
4 months ago |
Georgi Gerganov
|
cf0e3ba150
model : avoid ggml_cont_3d for fused QKV weights (#15662)
|
4 months ago |
Georgi Gerganov
|
c610b6c11b
kv-cache : fix SWA checks + disable cacheless iSWA (#15811)
|
4 months ago |
Daniel Bevenius
|
fb15d649ed
llama : add support for EmbeddingGemma 300m (#15798)
|
4 months ago |
Daniel Bevenius
|
2c8dac72eb
llama : fix incorrect model type for Gemma 270M (#15764)
|
4 months ago |
Johannes Gäßler
|
e81b8e4b7f
llama: use FA + max. GPU layers by default (#15434)
|
4 months ago |
Gabe Goodhart
|
e8d99dd0b6
nvidia nemotron nano v2 (nemotronh) (#15507)
|
4 months ago |
Sigbjørn Skjæret
|
84ab83cc0b
model : jina-embeddings-v3 support (#13693)
|
4 months ago |
Georgi Gerganov
|
b730706a49
kv-cache : support layer reuse (#15504)
|
4 months ago |
Piotr Wilkin (ilintar)
|
b1afcab804
model : add support for Seed-OSS (#15490)
|
4 months ago |
Tarek Dakhran
|
e288693669
readme : model : mtdm : lfm2 improvements (#15476)
|
4 months ago |
Georgi Gerganov
|
3f196be84b
graph : remove build_attn_with_sinks overload (#15469)
|
4 months ago |
Georgi Gerganov
|
715a6db02c
kv-cache : drop the "unified" prefix (#15467)
|
4 months ago |
Georgi Gerganov
|
9ef6b0b835
model : add gpt-oss type strings (#15424)
|
5 months ago |
Sigbjørn Skjæret
|
baa9255a45
llama : merge conts and reshapes and remove unnecessary cont (#15380)
|
5 months ago |
Daniel Bevenius
|
7a0de96045
llama : add 18-layer model type for Gemma 3-270m (#15319)
|
5 months ago |
Georgi Gerganov
|
fd1234cb46
llama : add gpt-oss (#15091)
|
5 months ago |
Juk Armstrong
|
c81de6e107
Fix `glm4moe` bug (#15088)
|
5 months ago |
Sam
|
ef0144c087
model: support GLM 4.5 family of models (#14939)
|
5 months ago |
compilade
|
11a3811164
memory : handle kv_unified for hybrid models (#15050)
|
5 months ago |
Douglas Hanley
|
339bd0268c
model : support Qwen3-Embedding (#15023)
|
5 months ago |
stevenkuang
|
0f5ccd6fd1
model : add hunyuan dense (#14878)
|
5 months ago |
Diego Devesa
|
d6818d06a6
llama : allow other bufts when overriding to CPU, add --no-repack option (#14990)
|
5 months ago |