shaofeiqi
|
5516b9c16a
opencl: add TRI op support (#18979)
|
1 week ago |
Aleksei Nikiforov
|
94242a62c0
ggml-zdnn : mark zDNN buffers as non-host (#18967)
|
1 week ago |
Pádraic Slattery
|
6b99a223e3
ci : update GitHub Actions versions [no ci] (#18935)
|
1 week ago |
Mariusz Woloszyn
|
77078e80e5
convert : add Devstral-2 (Ministral3ForCausalLM) arch (#18972)
|
1 week ago |
Piotr Wilkin (ilintar)
|
c301172f66
jinja: support none|string (#18995)
|
1 week ago |
Hendrik Erz
|
3802d3c78f
fix: Use `tabular-nums` for chat message statistics (#18915)
|
1 week ago |
Daniel Bevenius
|
9da3dcd753
llama : clarify nemotron-h.cpp comment about RoPE [no ci] (#18997)
|
1 week ago |
Jeff Bolz
|
bd544c94a3
vulkan: Remove transfer_ctx, do everything in compute_ctx. (#18945)
|
1 week ago |
Adrien Gallouët
|
14be5a39b1
common : improve error message when HTTPS is missing but required (#18987)
|
1 week ago |
손희준
|
fbbf3ad190
server: /v1/responses (partial) (#18486)
|
1 week ago |
Jeff Bolz
|
33f890e579
vulkan: support flash attention GQA/split_k with small batches (#18938)
|
1 week ago |
Masato Nakasaka
|
067b8d7af3
Revert "vulkan: force full subgroups for flash attention to fix intel subgroup crash (#17356)" (#18831)
|
1 week ago |
Jeff Bolz
|
50b7f076a5
vulkan: Use mul_mat_vec_id for small values of n (#18918)
|
1 week ago |
Tarek Dakhran
|
ad8d85bd94
memory : add llama_memory_hybrid_iswa (#18601)
|
1 week ago |
Piotr Wilkin (ilintar)
|
12a4a47e6a
Fix GLM 4.7 Lite MoE gating func (#18980)
|
1 week ago |
Matthieu Coudron
|
37c35f0e1c
gguf: display strerrno when cant load a model (#18884)
|
1 week ago |
Oliver Simons
|
5bd341c9a1
CUDA: Fix builds for older CCCL versions by ifdefing strided_iterator (#18964)
|
1 week ago |
Adrien Gallouët
|
1c7cf94b22
common, server : use the same User-Agent by default (#18957)
|
1 week ago |
Xuan-Son Nguyen
|
2c1f199653
cli : fix reasoning responses in CLI (#18961)
|
1 week ago |
Oliver Simons
|
d1e3556481
CUDA: Replace init_offsets kernel with iterators in cub-based argsort (#18930)
|
1 week ago |
Adrien Gallouët
|
08f3f4a8a3
ggml : cleanup path_str() (#18928)
|
1 week ago |
Georgi Gerganov
|
271191906c
metal : enable FA for MLA heads (#18950)
|
1 week ago |
Daniel Bevenius
|
7dee9ff59a
convert : use n_groups instead of hardcoded values in reshape (#18929)
|
1 week ago |
Xuan-Son Nguyen
|
6df686bee6
server : refactor oai_parser_opt, move it to server_chat_params (#18937)
|
1 week ago |
ddh0
|
1706a6d7c6
convert : support Glm4MoeLite (#18936)
|
1 week ago |
Sigbjørn Skjæret
|
959ecf7f23
jinja : fix undefined keys and attributes and int/float as bool (#18924)
|
1 week ago |
Sigbjørn Skjæret
|
4037093c66
ci : run test-jinja -py on high perf [no ci] (#18916)
|
1 week ago |
Lennart Austenfeld
|
18361c579c
server: fix memory reservations in populate_token_probs (#18787)
|
1 week ago |
Georgi Gerganov
|
365a3e8c31
ggml : add ggml_build_forward_select (#18550)
|
1 week ago |
Daniel Bevenius
|
3d55846a5c
model-conversion : add BUILD_DIR variable to run-converted-model scripts (#18927)
|
1 week ago |