Author | Commit | Message | Date
HelloKS | 9d52f17ae3 | model : add KORMo model (#18032) | 1 month ago
Johannes Gäßler | b1f3a6e5db | llama: automatically set parameters not set by the user in such a way that maximizes GPU utilization (#16653) | 1 month ago
Xuan-Son Nguyen | 0759b09c90 | graph: add f_attn_temp_offset (#18025) | 1 month ago
Georgi Gerganov | 609a2d0268 | models : fix YaRN regression + consolidate logic (#18006) | 1 month ago
Georgi Gerganov | 7bed317f53 | models : fix the attn_factor for mistral3 graphs + improve consistency (#17945) | 1 month ago
Eric Zhang | b677721819 | model : Qwen3-Next-80B-A3B has 48 layers (#17898) | 1 month ago
Sigbjørn Skjæret | 42b12b5608 | model : nit, DeepSeek V1 MoE is 16B and GigaChat is 20B (#12652) | 1 month ago
philip-essential | 1d2a1ab73d | model : support Rnj-1 (#17811) | 1 month ago
Xuan-Son Nguyen | 4d3726278b | model: add llama 4 scaling for mistral-large (deepseek arch) (#17744) | 1 month ago
Herman Semenoff | 37adc9c6ba | ggml, llama : use defaulted constructors/destructors (#17649) | 1 month ago
Piotr Wilkin (ilintar) | 746f9ee889 | Override SSM_A op for Qwen3 Next to reduce splits (#17587) | 1 month ago
Gilad S. | 00c361fe53 | fix: llama arch implementation (#17665) | 1 month ago
Xuan-Son Nguyen | cd3c118908 | model: support Ministral3 (#17644) | 1 month ago
Piotr Wilkin (ilintar) | ff55414c42 | model : Qwen3 Next (#16095) | 1 month ago
Georgi Gerganov | 6783b11fb0 | models : fix LFM2 tensors (#17548) | 1 month ago
Aaron Teo | 877566d512 | llama: introduce support for model-embedded sampling parameters (#17120) | 1 month ago
william pan | 4902eebe33 | models : Added support for RND1 Diffusion Language Model (#17433) | 1 month ago
ubergarm | 23bc779a6e | model : detect GigaChat3-10-A1.8B as deepseek lite (#17420) | 1 month ago
Bartowski | e1fcf8b09b | model : add AfmoeForCausalLM support (#16477) | 2 months ago
Sigbjørn Skjæret | 9008027aa3 | hparams : add n_embd_inp() to support extended embed (#16928) | 2 months ago
Li Pengzhan | 9f052478c2 | model : add openPangu-Embedded (#16941) | 2 months ago
Georgi Gerganov | cd5e3b5754 | server : support unified cache across slots (#16736) | 2 months ago
Piotr Wilkin (ilintar) | bea04522ff | refactor : llama-model.cpp (#16252) | 2 months ago
Piotr Wilkin (ilintar) | 0de0a01576 | model : Minimax M2 (#16831) | 2 months ago
Giuseppe Scrivano | e58d585604 | model : add Granite Hybrid nano types (#16896) | 2 months ago
JJJYmmm | d261223d24 | model: add support for qwen3vl series (#16780) | 2 months ago
Tianyue-Zhao | bacddc049a | model: Add support for CogVLM model (#15002) | 2 months ago
Georgi Gerganov | 85a7d8677b | memory : remove KV cache size padding (#16812) | 2 months ago
Johannes Gäßler | 7a0e900e36 | llama: consistent ctx <-> buf order for KV cache (#16746) | 2 months ago
Johannes Gäßler | 945501f5ea | llama: fix leaked buffers for mmap + split files (#16765) | 2 months ago