Georgi Gerganov
|
bf9087f59a
metal : fuse add, mul + add tests (#14596)
|
6 months ago |
Georgi Gerganov
|
9fb1042ce6
graph : fix graph reuse reset of params (#14760)
|
6 months ago |
Georgi Gerganov
|
d498af3d5a
graph : avoid huge warm-up graphs for MoE models (#14753)
|
6 months ago |
Georgi Gerganov
|
8f974bc1e9
graph : refactor context to not pass gf explicitly (#14629)
|
6 months ago |
Nexes the Elder
|
09651d09ff
graph : Pass the graph placeholder message in debug mode (#14748)
|
6 months ago |
Georgi Gerganov
|
01612b7409
llama : reuse compute graphs (#14482)
|
6 months ago |
Georgi Gerganov
|
225e7a1438
llama : add high-throughput mode (#14363)
|
6 months ago |
Xuan-Son Nguyen
|
cb9178f885
llama : remove llm_graph_input_one (#14603)
|
6 months ago |
compilade
|
4a5686da22
llama : support Jamba hybrid Transformer-Mamba models (#7531)
|
6 months ago |
Georgi Gerganov
|
7b50f7c025
graph : prepare for 4D mask (#14515)
|
6 months ago |
Georgi Gerganov
|
a70c8a0c4b
kv-cache : use ggml_set_rows (#14285)
|
6 months ago |
compilade
|
5d46babdc2
llama : initial Mamba-2 support (#9126)
|
6 months ago |
Sigbjørn Skjæret
|
a0535ffa0d
ggml : implement REGLU/GEGLU/SWIGLU ops (#14158)
|
6 months ago |
Xuan-Son Nguyen
|
8846aace49
model : gemma3n text-only (#14400)
|
6 months ago |
Georgi Gerganov
|
692e3cdd0a
memory : rename interface to llama_memory_context_i (#14296)
|
7 months ago |
Georgi Gerganov
|
812939a9e9
model : more uniform output id handling (#14275)
|
7 months ago |
Georgi Gerganov
|
4c9fdfbe15
ubatch : new splitting logic (#14217)
|
7 months ago |
Gabe Goodhart
|
edc4a29eff
memory : Hybrid recurrent cache (#13979)
|
7 months ago |
Georgi Gerganov
|
60c666347b
batch : rework llama_batch_allocr (#14153)
|
7 months ago |
Đinh Trọng Huy
|
d714dadb57
pooling : make cls_b and cls_out_b optional (#14165)
|
7 months ago |
compilade
|
dad5c44398
kv-cache : avoid modifying recurrent cells when setting inputs (#13834)
|
7 months ago |
Sigbjørn Skjæret
|
3678b838bb
llama : support GEGLU for jina-bert-v2 (#14090)
|
7 months ago |
Georgi Gerganov
|
201b31dc2e
graph : fix geglu (#14077)
|
7 months ago |
Đinh Trọng Huy
|
91a8ee6a6f
add geglu activation function (#14074)
|
7 months ago |
Xuan-Son Nguyen
|
3ac67535c8
llama-graph : use ggml_repeat_4d (#13998)
|
7 months ago |
Georgi Gerganov
|
0fc16b42e8
kv-cache : split implementation in separate sources (#13920)
|
7 months ago |
Georgi Gerganov
|
12d0188c0d
kv-cache : refactor + add llama_memory_state_i (#13746)
|
7 months ago |
Xuan-Son Nguyen
|
763d06edb7
llama : fix KV shift for qwen2vl (#13870)
|
7 months ago |
Đinh Trọng Huy
|
e0e3aa231d
llama : add support for BertForSequenceClassification reranker (#13858)
|
7 months ago |
0cc4m
|
259469c4b5
Move GLM4 f32 attention fix to the correct function (#13750)
|
7 months ago |