Piotr Wilkin
|
c2a82a1773
Move the norm shift to conversion, Gemma 2 style
|
3 months ago |
Piotr Wilkin
|
ce87b7d78e
Yup, it's NeoX
|
3 months ago |
Piotr Wilkin
|
54712b8664
Oh, forgot to commit
|
3 months ago |
Piotr Wilkin
|
0dd6110fdc
v1.0
|
3 months ago |
Piotr Wilkin
|
27fa5f335d
Correct convolution state dimension calculations
|
3 months ago |
Piotr Wilkin
|
e24c9dfa60
Remove OP_DELTA_NET, fix flake8 and editorchecker because why not
|
3 months ago |
Piotr Wilkin
|
6e3abeb6c0
Exclude MTP layers in conversion
|
3 months ago |
Xuan Son Nguyen
|
46110e0630
split q_proj/gate
|
3 months ago |
Piotr Wilkin (ilintar)
|
c78f9fce68
Merge branch 'ggml-org:master' into qwen3_next
|
4 months ago |
Piotr Wilkin
|
344331c2b6
First draft
|
4 months ago |
Xuan-Son Nguyen
|
8f8f2274ee
convert : add Llama4ForCausalLM (#16042)
|
4 months ago |
Shane A
|
85286f3548
model : add OLMo3 support (#16015)
|
4 months ago |
Aman Gupta
|
6d758839ff
Add LLaDA-7b-MoE diffusion model (#16003)
|
4 months ago |
Sigbjørn Skjæret
|
b8e09f08b9
model : add grok-2 support (#15539)
|
4 months ago |
Jie Fu (傅杰)
|
4f658855fa
llama : support T5 models with unequal number of encoder-decoder layers (#15909)
|
4 months ago |
Daniel Bevenius
|
233d773d02
convert : force setting sliding_window from original config (#15867)
|
4 months ago |
Daniel Bevenius
|
fb15d649ed
llama : add support for EmbeddingGemma 300m (#15798)
|
4 months ago |
Jie Fu (傅杰)
|
4b20d8b7e3
convert : remove redundant code (#15708)
|
4 months ago |
Gabe Goodhart
|
e8d99dd0b6
nvidia nemotron nano v2 (nemotronh) (#15507)
|
4 months ago |
Sigbjørn Skjæret
|
84ab83cc0b
model : jina-embeddings-v3 support (#13693)
|
4 months ago |
Xuan-Son Nguyen
|
79a546220c
mtmd : support Kimi VL model (#15458)
|
4 months ago |
Weizhao Ouyang
|
0d5a470223
convert : update Ernie 4.5 dense architecture name (#15555)
|
4 months ago |
RunningLeon
|
7da9fed0d6
convert : support interns1-mini (#15412)
|
4 months ago |
Piotr Wilkin (ilintar)
|
b1afcab804
model : add support for Seed-OSS (#15490)
|
4 months ago |
Julien Denize
|
b2caf67db1
convert : make Mistral community chat templates optional via parameter (#15420)
|
4 months ago |
Sigbjørn Skjæret
|
4d196981d4
convert : force patch_embd weights to F16 or F32 to avoid broken GGUFs (#15367)
|
5 months ago |
Tarek Dakhran
|
65349f26f2
model : support vision LiquidAI LFM2-VL family (#15347)
|
5 months ago |
Sigbjørn Skjæret
|
50e81bdf5d
convert : fix merge conflicts (#15229)
|
5 months ago |
Julien Denize
|
a3a7874272
convert : improve Mistral models integration (#14737)
|
5 months ago |
Xuan-Son Nguyen
|
50aa938901
convert : support non-mxfp4 HF model (#15153)
|
5 months ago |