Jeff Bolz
|
7e00e60ef8
vulkan: fix warnings in perf logger querypool code (#13937)
|
7 months ago |
Xuan-Son Nguyen
|
ea1431b0fa
docs : add "Quick start" section for new users (#13862)
|
7 months ago |
lhez
|
71e74a3ac9
opencl: add `backend_synchronize` (#13939)
|
7 months ago |
rmatif
|
bfb1e012a0
OpenCL: Add concat, tsembd, upscale, tanh, pad and repeat (#13840)
|
7 months ago |
Georgi Gerganov
|
3637576288
server : disable speculative decoding for SWA models (#13970)
|
7 months ago |
Georgi Gerganov
|
ea394d7ab1
metal : use F32 accumulators in FA kernels (#13975)
|
7 months ago |
Georgi Gerganov
|
5582c49c39
gemma : more consistent attention scaling for v2 and v3 (#13951)
|
7 months ago |
Olivier Chafik
|
c9bbc77931
`server`: update deepseek reasoning format (pass reasoning_content as diffs) (#13933)
|
7 months ago |
Xuan-Son Nguyen
|
bfd322796c
mtmd : fix memory leak in mtmd_helper_eval_chunk_single (#13961)
|
7 months ago |
shalinib-ibm
|
093e3f1feb
cmake : Handle mixed-case 'Power' strings in POWER CPU detection (#13966)
|
7 months ago |
Atharva Dubey
|
663445b0de
sycl: quantize and reorder the input to q8_1 when reorder is enabled (#13826)
|
7 months ago |
Johannes Gäßler
|
7675c555a1
gguf: fix failure on version == 0 (#13956)
|
7 months ago |
Sigbjørn Skjæret
|
5e1c3aed40
convert : fix nomic-bert-moe mask token (#13757)
|
7 months ago |
Sigbjørn Skjæret
|
c496fe0b1d
convert : fix vocab padding code for bert models (#13954)
|
7 months ago |
Aaron Teo
|
e57bb87ced
ggml: check if non-native endian model is being loaded (#13943)
|
7 months ago |
Georgi Gerganov
|
f3a4b1659c
sync : ggml
|
7 months ago |
Kai Pastor
|
108009f5c7
vulkan : Remove unexpected ; (ggml/1253)
|
7 months ago |
Kai Pastor
|
d337252acf
cmake : Fix broken CMake error messages (ggml/1252)
|
7 months ago |
Radoslav Gerganov
|
af6f91db47
ggml : remove ggml_graph_import and ggml_graph_export declarations (ggml/1247)
|
7 months ago |
Georgi Gerganov
|
a7b8d35f78
sync : whisper.cpp (ggml/1250)
|
7 months ago |
Radoslav Gerganov
|
6eba72b71c
ggml : install dynamic backends (ggml/1240)
|
7 months ago |
Daniel Tang
|
fedf034a98
ggml : Print backtrace on uncaught C++ exceptions (ggml/1232)
|
7 months ago |
ddh0
|
8726392d3d
readme : update bindings (#13950)
|
7 months ago |
Georgi Gerganov
|
c04621711a
parallel : fix n_junk == 0 (#13952)
|
7 months ago |
Georgi Gerganov
|
0fc16b42e8
kv-cache : split implementation in separate sources (#13920)
|
7 months ago |
Max Krasnyansky
|
053b1539c0
threading: support for GGML_SCHED_PRIO_LOW, update thread info on Windows to avoid throttling (#12995)
|
7 months ago |
Jiří Podivín
|
b3a89c3d9e
docs : Note about necessity of having libcurl installed for standard build. (#13945)
|
7 months ago |
Olivier Chafik
|
e15898d1c7
server: allow unclosed thinking tags (#13931)
|
7 months ago |
Georgi Gerganov
|
803f8baf4f
llama : deprecate explicit kv_self defrag/update calls (#13921)
|
7 months ago |
Georgi Gerganov
|
3600cc2886
llama : use n_swa + n_ubatch cells for SWA cache (#13833)
|
7 months ago |