Piotr Wilkin | 890fa2c1e3 | WE HAVE OUTPUT! | 3 months ago |
Piotr Wilkin | e590a75905 | Cleanup complete, now for the recurrent memory management... | 3 months ago |
Piotr Wilkin (ilintar) | 72c98b0c7d | Merge pull request #1 from ggml-org/xsn/qwen3next_experiment | 3 months ago |
Xuan Son Nguyen | e83ef74733 | one less magic number | 4 months ago |
Xuan Son Nguyen | f643b957f4 | refactor softplus fn | 4 months ago |
Xuan Son Nguyen | 46110e0630 | split q_proj/gate | 4 months ago |
Piotr Wilkin | 8152df60f3 | Getting closer (graph builds for bs=1 but tensor shaping is still wrong for bigger sizes) | 4 months ago |
Piotr Wilkin | e0c5dff2a7 | Rewrite to tensor ops | 4 months ago |
Piotr Wilkin | 178230ee21 | Getting to decode stage... | 4 months ago |
Piotr Wilkin (ilintar) | c78f9fce68 | Merge branch 'ggml-org:master' into qwen3_next | 4 months ago |
Piotr Wilkin | 344331c2b6 | First draft | 4 months ago |
Xuan-Son Nguyen | 8f8f2274ee | convert : add Llama4ForCausalLM (#16042) | 4 months ago |
Shane A | 85286f3548 | model : add OLMo3 support (#16015) | 4 months ago |
Aman Gupta | 6d758839ff | Add LLaDA-7b-MoE diffusion model (#16003) | 4 months ago |
Sigbjørn Skjæret | b8e09f08b9 | model : add grok-2 support (#15539) | 4 months ago |
Jie Fu (傅杰) | 4f658855fa | llama : support T5 models with unequal number of encoder-decoder layers (#15909) | 4 months ago |
Georgi Gerganov | cf0e3ba150 | model : avoid ggml_cont_3d for fused QKV weights (#15662) | 4 months ago |
Georgi Gerganov | c610b6c11b | kv-cache : fix SWA checks + disable cacheless iSWA (#15811) | 4 months ago |
Daniel Bevenius | fb15d649ed | llama : add support for EmbeddingGemma 300m (#15798) | 4 months ago |
Daniel Bevenius | 2c8dac72eb | llama : fix incorrect model type for Gemma 270M (#15764) | 4 months ago |
Johannes Gäßler | e81b8e4b7f | llama: use FA + max. GPU layers by default (#15434) | 4 months ago |
Gabe Goodhart | e8d99dd0b6 | nvidia nemotron nano v2 (nemotronh) (#15507) | 4 months ago |
Sigbjørn Skjæret | 84ab83cc0b | model : jina-embeddings-v3 support (#13693) | 4 months ago |
Georgi Gerganov | b730706a49 | kv-cache : support layer reuse (#15504) | 4 months ago |
Piotr Wilkin (ilintar) | b1afcab804 | model : add support for Seed-OSS (#15490) | 4 months ago |
Tarek Dakhran | e288693669 | readme : model : mtdm : lfm2 improvements (#15476) | 4 months ago |
Georgi Gerganov | 3f196be84b | graph : remove build_attn_with_sinks overload (#15469) | 4 months ago |
Georgi Gerganov | 715a6db02c | kv-cache : drop the "unified" prefix (#15467) | 4 months ago |
Georgi Gerganov | 9ef6b0b835 | model : add gpt-oss type strings (#15424) | 5 months ago |
Sigbjørn Skjæret | baa9255a45 | llama : merge conts and reshapes and remove unnecessary cont (#15380) | 5 months ago |