Georgi Gerganov
|
e38b7c6e9e
graph : support cacheless embeddings with FA and iSWA (#16528)
|
3 months ago |
Saba Fallah
|
e08db42595
model: EmbeddingGemma Adding Support for SentenceTransformers Dense Modules (#16367)
|
3 months ago |
Douglas Hanley
|
b5bd037832
llama : add support for qwen3 reranker (#15824)
|
3 months ago |
Daniel Bevenius
|
fb15d649ed
llama : add support for EmbeddingGemma 300m (#15798)
|
4 months ago |
Johannes Gäßler
|
e81b8e4b7f
llama: use FA + max. GPU layers by default (#15434)
|
4 months ago |
Georgi Gerganov
|
3f196be84b
graph : remove build_attn_with_sinks overload (#15469)
|
5 months ago |
Georgi Gerganov
|
715a6db02c
kv-cache : drop the "unified" prefix (#15467)
|
5 months ago |
Georgi Gerganov
|
fd1234cb46
llama : add gpt-oss (#15091)
|
5 months ago |
Georgi Gerganov
|
ba42794c9e
graph : fix equal_seq() check (#14986)
|
5 months ago |
Dongliang Wei
|
c1dacaa99b
llama : merge build_moe_ffn_from_probs function into build_moe_ffn (#14968)
|
5 months ago |
compilade
|
66625a59a5
graph : reduce splits for recurrent and hybrid models (#14825)
|
5 months ago |
Georgi Gerganov
|
1e15bfd42c
graph : fix stack-use-after-return (#14960)
|
5 months ago |
Dongliang Wei
|
6c6e397aff
model : add support for SmallThinker series (#14898)
|
5 months ago |
Georgi Gerganov
|
8f974bc1e9
graph : refactor context to not pass gf explicitly (#14629)
|
6 months ago |
Georgi Gerganov
|
01612b7409
llama : reuse compute graphs (#14482)
|
6 months ago |
Georgi Gerganov
|
225e7a1438
llama : add high-throughput mode (#14363)
|
6 months ago |
Xuan-Son Nguyen
|
cb9178f885
llama : remove llm_graph_input_one (#14603)
|
6 months ago |
compilade
|
4a5686da22
llama : support Jamba hybrid Transformer-Mamba models (#7531)
|
6 months ago |
Georgi Gerganov
|
7b50f7c025
graph : prepare for 4D mask (#14515)
|
6 months ago |
Georgi Gerganov
|
a70c8a0c4b
kv-cache : use ggml_set_rows (#14285)
|
6 months ago |
compilade
|
5d46babdc2
llama : initial Mamba-2 support (#9126)
|
6 months ago |
Sigbjørn Skjæret
|
a0535ffa0d
ggml : implement REGLU/GEGLU/SWIGLU ops (#14158)
|
6 months ago |
Georgi Gerganov
|
72babea5de
graph : make llm_graph_context destructor virtual (#14410)
|
6 months ago |
Xuan-Son Nguyen
|
8846aace49
model : gemma3n text-only (#14400)
|
6 months ago |
Georgi Gerganov
|
692e3cdd0a
memory : rename interface to llama_memory_context_i (#14296)
|
7 months ago |
Georgi Gerganov
|
4c9fdfbe15
ubatch : new splitting logic (#14217)
|
7 months ago |
Gabe Goodhart
|
edc4a29eff
memory : Hybrid recurrent cache (#13979)
|
7 months ago |
Georgi Gerganov
|
60c666347b
batch : rework llama_batch_allocr (#14153)
|
7 months ago |
compilade
|
dad5c44398
kv-cache : avoid modifying recurrent cells when setting inputs (#13834)
|
7 months ago |
Đinh Trọng Huy
|
91a8ee6a6f
add geglu activation function (#14074)
|
7 months ago |