Aaron Teo
|
e57bb87ced
ggml: check if non-native endian model is being loaded (#13943)
|
7 months ago |
Georgi Gerganov
|
f3a4b1659c
sync : ggml
|
7 months ago |
Kai Pastor
|
108009f5c7
vulkan : Remove unexpected ; (ggml/1253)
|
7 months ago |
Kai Pastor
|
d337252acf
cmake : Fix broken CMake error messages (ggml/1252)
|
7 months ago |
Radoslav Gerganov
|
af6f91db47
ggml : remove ggml_graph_import and ggml_graph_export declarations (ggml/1247)
|
7 months ago |
Georgi Gerganov
|
a7b8d35f78
sync : whisper.cpp (ggml/1250)
|
7 months ago |
Radoslav Gerganov
|
6eba72b71c
ggml : install dynamic backends (ggml/1240)
|
7 months ago |
Daniel Tang
|
fedf034a98
ggml : Print backtrace on uncaught C++ exceptions (ggml/1232)
|
7 months ago |
ddh0
|
8726392d3d
readme : update bindings (#13950)
|
7 months ago |
Georgi Gerganov
|
c04621711a
parallel : fix n_junk == 0 (#13952)
|
7 months ago |
Georgi Gerganov
|
0fc16b42e8
kv-cache : split implementation in separate sources (#13920)
|
7 months ago |
Max Krasnyansky
|
053b1539c0
threading: support for GGML_SCHED_PRIO_LOW, update thread info on Windows to avoid throttling (#12995)
|
7 months ago |
Jiří Podivín
|
b3a89c3d9e
docs : Note about necessity of having libcurl installed for standard build. (#13945)
|
7 months ago |
Olivier Chafik
|
e15898d1c7
server: allow unclosed thinking tags (#13931)
|
7 months ago |
Georgi Gerganov
|
803f8baf4f
llama : deprecate explicit kv_self defrag/update calls (#13921)
|
7 months ago |
Georgi Gerganov
|
3600cc2886
llama : use n_swa + n_ubatch cells for SWA cache (#13833)
|
7 months ago |
igardev
|
c7e0a2054b
webui : Replace alert and confirm with custom modals. (#13711)
|
7 months ago |
Georgi Gerganov
|
3f55f781f1
llama : auto-batch preparation (#13845)
|
7 months ago |
Xuan-Son Nguyen
|
51fa76f172
mtmd : drop `_shared` from `libmtmd` name, merge helpers into libmtmd (⚠️ breaking change) (#13917)
|
7 months ago |
Georgi Gerganov
|
12d0188c0d
kv-cache : refactor + add llama_memory_state_i (#13746)
|
7 months ago |
Shawn yang
|
eb3949938e
CUDA: add a prop in ggml_cuda_device_infor for distinguish iGPU or dGPU in cuda (#13856) (#13895)
|
7 months ago |
Johannes Gäßler
|
e562eece7c
CUDA: fix typo in FlashAttention code (#13926)
|
7 months ago |
Diego Devesa
|
b47ab7b8e9
sched : avoid changing cur_copy when a graph is already allocated (#13922)
|
7 months ago |
Georgi Gerganov
|
dd665cc9d4
parallel : increase the variability of the prompt lengths (#13927)
|
7 months ago |
Diego Devesa
|
df0c0c7d02
cuda : prevent using split buffers with 3d/4d matrices (#13919)
|
7 months ago |
Akarshan Biswas
|
b49a8ff96b
SYCL: Add mrope kernel (#13755)
|
7 months ago |
Georgi Gerganov
|
53f925074d
sync : vendor (#13901)
|
7 months ago |
Sigbjørn Skjæret
|
db38704f01
convert : fix rwkv bos/eos token (#13844)
|
7 months ago |
Xuan-Son Nguyen
|
07e4351ce6
convert : allow partial update to the chkhsh pre-tokenizer list (#13847)
|
7 months ago |
Đinh Trọng Huy
|
291f2b6913
llama : add support for DistilBert (#13907)
|
7 months ago |