Jeff Bolz
|
33f890e579
vulkan: support flash attention GQA/split_k with small batches (#18938)
|
1 week ago |
Masato Nakasaka
|
067b8d7af3
Revert "vulkan: force full subgroups for flash attention to fix intel subgroup crash (#17356)" (#18831)
|
1 week ago |
Jeff Bolz
|
50b7f076a5
vulkan: Use mul_mat_vec_id for small values of n (#18918)
|
1 week ago |
Tarek Dakhran
|
ad8d85bd94
memory : add llama_memory_hybrid_iswa (#18601)
|
1 week ago |
Piotr Wilkin (ilintar)
|
12a4a47e6a
Fix GLM 4.7 Lite MoE gating func (#18980)
|
1 week ago |
Matthieu Coudron
|
37c35f0e1c
gguf: display strerrno when cant load a model (#18884)
|
1 week ago |
Oliver Simons
|
5bd341c9a1
CUDA: Fix builds for older CCCL versions by ifdefing strided_iterator (#18964)
|
1 week ago |
Adrien Gallouët
|
1c7cf94b22
common, server : use the same User-Agent by default (#18957)
|
1 week ago |
Xuan-Son Nguyen
|
2c1f199653
cli : fix reasoning responses in CLI (#18961)
|
1 week ago |
Oliver Simons
|
d1e3556481
CUDA: Replace init_offsets kernel with iterators in cub-based argsort (#18930)
|
1 week ago |
Adrien Gallouët
|
08f3f4a8a3
ggml : cleanup path_str() (#18928)
|
1 week ago |
Georgi Gerganov
|
271191906c
metal : enable FA for MLA heads (#18950)
|
1 week ago |
Daniel Bevenius
|
7dee9ff59a
convert : use n_groups instead of hardcoded values in reshape (#18929)
|
1 week ago |
Xuan-Son Nguyen
|
6df686bee6
server : refactor oai_parser_opt, move it to server_chat_params (#18937)
|
1 week ago |
ddh0
|
1706a6d7c6
convert : support Glm4MoeLite (#18936)
|
1 week ago |
Sigbjørn Skjæret
|
959ecf7f23
jinja : fix undefined keys and attributes and int/float as bool (#18924)
|
1 week ago |
Sigbjørn Skjæret
|
4037093c66
ci : run test-jinja -py on high perf [no ci] (#18916)
|
1 week ago |
Lennart Austenfeld
|
18361c579c
server: fix memory reservations in populate_token_probs (#18787)
|
1 week ago |
Georgi Gerganov
|
365a3e8c31
ggml : add ggml_build_forward_select (#18550)
|
1 week ago |
Daniel Bevenius
|
3d55846a5c
model-conversion : add BUILD_DIR variable to run-converted-model scripts (#18927)
|
1 week ago |
Julius Tischbein
|
287a33017b
llama : Extend fallback, fix fileno for dio file, exclude case that mmap uses dio file (#18887)
|
1 week ago |
Francisco Herrera
|
293a1565dc
docs: add linux to index (#18907)
|
1 week ago |
Xuan-Son Nguyen
|
fe44d35574
tests : add test-jinja -py option for cross-checking (#18906)
|
1 week ago |
Sigbjørn Skjæret
|
bbcdac0189
jinja : fix object item order (and properly implement dictsort) (#18904)
|
1 week ago |
Sigbjørn Skjæret
|
d03c45c9c5
jinja : attribute support for join, map and sort (#18883)
|
1 week ago |
Sigbjørn Skjæret
|
10c98cbdf6
jinja : add missing tojson filter for bool (#18900)
|
1 week ago |
Sigbjørn Skjæret
|
420960ab92
jinja : fix lexing of float literals with sign (#18901)
|
1 week ago |
Xuan-Son Nguyen
|
f55b033ae6
jinja: correct member access rule (#18905)
|
1 week ago |
lhez
|
d1b4757ded
opencl: fix q6_K mv for m=1 (#18893)
|
1 week ago |
Sigbjørn Skjæret
|
57c0beaed0
ci : add label for jinja changes (#18903)
|
1 week ago |