Olivier Chafik
|
c9bbc77931
`server`: update deepseek reasoning format (pass reasoning_content as diffs) (#13933)
|
7 months ago |
Xuan-Son Nguyen
|
bfd322796c
mtmd : fix memory leak in mtmd_helper_eval_chunk_single (#13961)
|
7 months ago |
shalinib-ibm
|
093e3f1feb
cmake : Handle mixed-case 'Power' strings in POWER CPU detection (#13966)
|
7 months ago |
Atharva Dubey
|
663445b0de
sycl: quantize and reorder the input to q8_1 when reorder is enabled (#13826)
|
7 months ago |
Johannes Gäßler
|
7675c555a1
gguf: fix failure on version == 0 (#13956)
|
7 months ago |
Sigbjørn Skjæret
|
5e1c3aed40
convert : fix nomic-bert-moe mask token (#13757)
|
7 months ago |
Sigbjørn Skjæret
|
c496fe0b1d
convert : fix vocab padding code for bert models (#13954)
|
7 months ago |
Aaron Teo
|
e57bb87ced
ggml: check if non-native endian model is being loaded (#13943)
|
7 months ago |
Georgi Gerganov
|
f3a4b1659c
sync : ggml
|
7 months ago |
Kai Pastor
|
108009f5c7
vulkan : Remove unexpected ; (ggml/1253)
|
7 months ago |
Kai Pastor
|
d337252acf
cmake : Fix broken CMake error messages (ggml/1252)
|
7 months ago |
Radoslav Gerganov
|
af6f91db47
ggml : remove ggml_graph_import and ggml_graph_export declarations (ggml/1247)
|
7 months ago |
Georgi Gerganov
|
a7b8d35f78
sync : whisper.cpp (ggml/1250)
|
7 months ago |
Radoslav Gerganov
|
6eba72b71c
ggml : install dynamic backends (ggml/1240)
|
7 months ago |
Daniel Tang
|
fedf034a98
ggml : Print backtrace on uncaught C++ exceptions (ggml/1232)
|
7 months ago |
ddh0
|
8726392d3d
readme : update bindings (#13950)
|
7 months ago |
Georgi Gerganov
|
c04621711a
parallel : fix n_junk == 0 (#13952)
|
7 months ago |
Georgi Gerganov
|
0fc16b42e8
kv-cache : split implementation in separate sources (#13920)
|
7 months ago |
Max Krasnyansky
|
053b1539c0
threading: support for GGML_SCHED_PRIO_LOW, update thread info on Windows to avoid throttling (#12995)
|
7 months ago |
Jiří Podivín
|
b3a89c3d9e
docs : Note about necessity of having libcurl installed for standard build. (#13945)
|
7 months ago |
Olivier Chafik
|
e15898d1c7
server: allow unclosed thinking tags (#13931)
|
7 months ago |
Georgi Gerganov
|
803f8baf4f
llama : deprecate explicit kv_self defrag/update calls (#13921)
|
7 months ago |
Georgi Gerganov
|
3600cc2886
llama : use n_swa + n_ubatch cells for SWA cache (#13833)
|
7 months ago |
igardev
|
c7e0a2054b
webui : Replace alert and confirm with custom modals. (#13711)
|
7 months ago |
Georgi Gerganov
|
3f55f781f1
llama : auto-batch preparation (#13845)
|
7 months ago |
Xuan-Son Nguyen
|
51fa76f172
mtmd : drop `_shared` from `libmtmd` name, merge helpers into libmtmd (⚠️ breaking change) (#13917)
|
7 months ago |
Georgi Gerganov
|
12d0188c0d
kv-cache : refactor + add llama_memory_state_i (#13746)
|
7 months ago |
Shawn yang
|
eb3949938e
CUDA: add a prop in ggml_cuda_device_infor for distinguish iGPU or dGPU in cuda (#13856) (#13895)
|
7 months ago |
Johannes Gäßler
|
e562eece7c
CUDA: fix typo in FlashAttention code (#13926)
|
7 months ago |
Diego Devesa
|
b47ab7b8e9
sched : avoid changing cur_copy when a graph is already allocated (#13922)
|
7 months ago |