Xuan-Son Nguyen | 267c1399f1 | common : refactor downloading system, handle mmproj with -hf option (#12694) | 9 months ago
Junil Kim | f423981ac8 | opencl : fix memory allocation size (#12649) | 9 months ago
jklincn | e39e727e9a | llama : use LLM_KV_GENERAL_FILE_TYPE instead of gguf_find_key (#12672) | 9 months ago
Sigbjørn Skjæret | 5936a616e4 | convert : BailingMoE : fix qkv split when head_dim is 0 (#12687) | 9 months ago
Georgi Gerganov | 3fd072a540 | metal : use F32 prec in FA kernels (#12688) | 9 months ago
R0CKSTAR | a6f32f0b34 | Fix clang warning in gguf_check_reserved_keys (#12686) | 9 months ago
Wagner Bruna | 2bb3597e42 | vulkan: fix build when glslc doesn't support coopmat (#12683) | 9 months ago
Romain Biessy | 8293970542 | SYCL: Rename oneMKL to oneMath (#12192) | 9 months ago
Akarshan Biswas | 8bbf26083d | SYCL: switch to SYCL namespace (#12674) | 9 months ago
Sigbjørn Skjæret | 35782aeedb | convert : BailingMoE : avoid setting rope_dim to 0 (#12678) | 9 months ago
Daniel Bevenius | c80a7759da | vocab : add special infill tokens for CodeLlama (#11850) | 9 months ago
a3sh | 250d7953e8 | ggml : faster ssm scan (#10558) | 9 months ago
Sigbjørn Skjæret | 403fbacbbc | convert : Qwerky : use lora_rank_tokenshift and lora_rank_decay if present (#12667) | 9 months ago
0cc4m | a8a1f33567 | Vulkan: Add DP4A MMQ and Q8_1 quantization shader (#12135) | 9 months ago
Georgi Gerganov | 1790e73157 | cmake : fix whitespace (#0) | 9 months ago
Georgi Gerganov | 0114a32da0 | sync : ggml | 9 months ago
Sandro Hanea | a7724480fd | cmake: improve Vulkan cooperative matrix support checks (whisper/2966) | 9 months ago
Sigbjørn Skjæret | 1a85949067 | llava : proper description fix (#12668) | 9 months ago
Akarshan Biswas | 6c02a032fa | SYCL: Remove misleading ggml_sycl_op_flatten function (#12387) | 9 months ago
Sigbjørn Skjæret | f52d59d771 | llava : fix clip loading GGUFs with missing description (#12660) | 9 months ago
marcoStocchi | 52de2e5949 | tts : remove printfs (#12640) | 9 months ago
Sigbjørn Skjæret | 2c3f8b850a | llama : support BailingMoE (Ling) (#12634) | 9 months ago
Georgi Gerganov | 4663bd353c | metal : use constexpr in FA kernels + fix typedef (#12659) | 9 months ago
Juyoung Suk | b3de7cac73 | llama : add Trillion 7B model support (#12556) | 9 months ago
Sergei Vorobyov | 7242dd9675 | llama-chat : Add Yandex instruct model template support (#12621) | 9 months ago
R0CKSTAR | 492d7f1ff7 | musa: fix all warnings, re-enable `-DLLAMA_FATAL_WARNINGS=ON` in ci and update doc (#12611) | 9 months ago
Georgi Gerganov | d3f1f0acfb | sync : ggml | 9 months ago
Xuan-Son Nguyen | 360dc22c00 | cpu : rm unused variable (ggml/1166) | 9 months ago
cmdr2 | a62d7fa7a9 | cpu: de-duplicate some of the operators and refactor (ggml/1144) | 9 months ago
Daniel Bevenius | e408d4351a | ggml : add logging for native build options/vars (whisper/2935) | 10 months ago