Sigbjørn Skjæret
|
403fbacbbc
convert : Qwerky : use lora_rank_tokenshift and lora_rank_decay if present (#12667)
|
9 months ago |
0cc4m
|
a8a1f33567
Vulkan: Add DP4A MMQ and Q8_1 quantization shader (#12135)
|
9 months ago |
Georgi Gerganov
|
1790e73157
cmake : fix whitespace (#0)
|
9 months ago |
Georgi Gerganov
|
0114a32da0
sync : ggml
|
9 months ago |
Sandro Hanea
|
a7724480fd
cmake: improve Vulkan cooperative matrix support checks (whisper/2966)
|
9 months ago |
Sigbjørn Skjæret
|
1a85949067
llava : proper description fix (#12668)
|
9 months ago |
Akarshan Biswas
|
6c02a032fa
SYCL: Remove misleading ggml_sycl_op_flatten function (#12387)
|
9 months ago |
Sigbjørn Skjæret
|
f52d59d771
llava : fix clip loading GGUFs with missing description (#12660)
|
9 months ago |
marcoStocchi
|
52de2e5949
tts : remove printfs (#12640)
|
9 months ago |
Sigbjørn Skjæret
|
2c3f8b850a
llama : support BailingMoE (Ling) (#12634)
|
9 months ago |
Georgi Gerganov
|
4663bd353c
metal : use constexpr in FA kernels + fix typedef (#12659)
|
9 months ago |
Juyoung Suk
|
b3de7cac73
llama : add Trillion 7B model support (#12556)
|
9 months ago |
Sergei Vorobyov
|
7242dd9675
llama-chat : Add Yandex instruct model template support (#12621)
|
9 months ago |
R0CKSTAR
|
492d7f1ff7
musa: fix all warnings, re-enable `-DLLAMA_FATAL_WARNINGS=ON` in ci and update doc (#12611)
|
9 months ago |
Georgi Gerganov
|
d3f1f0acfb
sync : ggml
|
9 months ago |
Xuan-Son Nguyen
|
360dc22c00
cpu : rm unused variable (ggml/1166)
|
9 months ago |
cmdr2
|
a62d7fa7a9
cpu: de-duplicate some of the operators and refactor (ggml/1144)
|
9 months ago |
Daniel Bevenius
|
e408d4351a
ggml : add logging for native build options/vars (whisper/2935)
|
10 months ago |
Daniel Bevenius
|
3891e183c6
examples : command.wasm updates (whisper/2904)
|
10 months ago |
Xuan-Son Nguyen
|
af6ae1efb2
llama : fix non-causal mask for gemma 3 (#12615)
|
9 months ago |
Djip007
|
0bb2919335
llama : change cpu_buft_list order: ACCEL -> GPU host -> CPU extra -> CPU (#12632)
|
9 months ago |
Jay
|
a69f846351
cmake : fix ccache conflict (#12522)
|
9 months ago |
hipudding
|
d07a0d7a79
CANN : remove clang-format in ggml-cann (#12607)
|
9 months ago |
Sigbjørn Skjæret
|
3714c3ee1a
llama : fix incorrect Qwen2Moe ffn_moe_out graph callback (#12631)
|
9 months ago |
Georgi Gerganov
|
b4ae50810e
metal : improve FA + improve MoE (#12612)
|
9 months ago |
Icenowy Zheng
|
b86f600723
vulkan: fix coopmat shader generation when cross-compiling (#12272)
|
9 months ago |
Johannes Gäßler
|
dd373dd3bf
llama: fix error on bad grammar (#12628)
|
9 months ago |
Benson Wong
|
5d01670266
server : include speculative decoding stats when timings_per_token is enabled (#12603)
|
9 months ago |
Radoslav Gerganov
|
ef03229ff4
rpc : update README for cache usage (#12620)
|
9 months ago |
amritahs-ibm
|
13731766db
llamafile : ppc64le GEMV forwarding for FP32. (#12594)
|
9 months ago |