Chenguang Li
|
65cfe136a0
CANN: Support operator SIN COS ARGMAX (#12709)
|
9 months ago |
Alan Gray
|
3f9da22c2b
Simplify and improve CUDA graphs through use of indirect copy pointers (#9017)
|
9 months ago |
hipudding
|
2a0dc97e56
CANN: Fix failed test cases (#12708)
|
9 months ago |
lhez
|
97a20c012b
opencl: use `max_alloc_size` in backend ctx instead of querying again (#12705)
|
9 months ago |
Jeff Bolz
|
f01bd02376
vulkan: Implement split_k for coopmat2 flash attention. (#12627)
|
9 months ago |
bandoti
|
6f3bd38640
cmake: remove caching from vulkan coopmat checks (#12719)
|
9 months ago |
Jeff Bolz
|
be0a0f8cae
vulkan: Implement grouped query attention in the coopmat2 FA shader (#12559)
|
9 months ago |
0cc4m
|
92e3006bb6
Vulkan: Fix mmq int dot float cache size (#12722)
|
9 months ago |
Georgi Gerganov
|
833e2b7409
model : print tensor size during load (#12711)
|
9 months ago |
Diego Devesa
|
e0e912f49b
llama : add option to override model tensor buffers (#11397)
|
9 months ago |
Georgi Gerganov
|
a10b36c91a
llama : refactor kv cache guard (#12695)
|
9 months ago |
Sigbjørn Skjæret
|
83a88bd6af
vocab : BailingMoE : change possessive quantifiers to greedy (#12677)
|
9 months ago |
Xuan-Son Nguyen
|
42eb248f46
common : remove json.hpp from common.cpp (#12697)
|
9 months ago |
Chenguang Li
|
9bacd6b374
[CANN] get_rows and dup optimization (#12671)
|
9 months ago |
Xuan-Son Nguyen
|
267c1399f1
common : refactor downloading system, handle mmproj with -hf option (#12694)
|
9 months ago |
Junil Kim
|
f423981ac8
opencl : fix memory allocation size (#12649)
|
9 months ago |
jklincn
|
e39e727e9a
llama : use LLM_KV_GENERAL_FILE_TYPE instead of gguf_find_key (#12672)
|
9 months ago |
Sigbjørn Skjæret
|
5936a616e4
convert : BailingMoE : fix qkv split when head_dim is 0 (#12687)
|
9 months ago |
Georgi Gerganov
|
3fd072a540
metal : use F32 prec in FA kernels (#12688)
|
9 months ago |
R0CKSTAR
|
a6f32f0b34
Fix clang warning in gguf_check_reserved_keys (#12686)
|
9 months ago |
Wagner Bruna
|
2bb3597e42
vulkan: fix build when glslc doesn't support coopmat (#12683)
|
9 months ago |
Romain Biessy
|
8293970542
SYCL: Rename oneMKL to oneMath (#12192)
|
9 months ago |
Akarshan Biswas
|
8bbf26083d
SYCL: switch to SYCL namespace (#12674)
|
9 months ago |
Sigbjørn Skjæret
|
35782aeedb
convert : BailingMoE : avoid setting rope_dim to 0 (#12678)
|
9 months ago |
Daniel Bevenius
|
c80a7759da
vocab : add special infill tokens for CodeLlama (#11850)
|
9 months ago |
a3sh
|
250d7953e8
ggml : faster ssm scan (#10558)
|
9 months ago |
Sigbjørn Skjæret
|
403fbacbbc
convert : Qwerky : use lora_rank_tokenshift and lora_rank_decay if present (#12667)
|
9 months ago |
0cc4m
|
a8a1f33567
Vulkan: Add DP4A MMQ and Q8_1 quantization shader (#12135)
|
9 months ago |
Georgi Gerganov
|
1790e73157
cmake : fix whitespace (#0)
|
9 months ago |
Georgi Gerganov
|
0114a32da0
sync : ggml
|
9 months ago |