JFLFY2255
|
8d0cfd554a
llama: Support MiniCPM-1B (with & w/o longrope) (#10559)
|
1 year ago |
Jeff Bolz
|
2759916d86
vulkan: Implement "fast divide" (mul+shift) for unary ops like copy (#10642)
|
1 year ago |
Nicolò Scipione
|
40c6d79fb5
SYCL : Move to compile time oneMKL interface backend selection for NVIDIA backend (#10584)
|
1 year ago |
Wang Ran (汪然)
|
98036d5670
fix typo of README.md (#10605)
|
1 year ago |
Frankie Robertson
|
cd2f37b304
Avoid using __fp16 on ARM with old nvcc (#10616)
|
1 year ago |
Benson Wong
|
da6aac91f1
Add docs for creating a static build (#10268) (#10630)
|
1 year ago |
piDack
|
01e6d9bb71
clip : add sycl support (#10574)
|
1 year ago |
Jeff Bolz
|
cc98896db8
vulkan: optimize and reenable split_k (#10637)
|
1 year ago |
Xuan Son Nguyen
|
91c36c269b
server : (web ui) Various improvements, now use vite as bundler (#10599)
|
1 year ago |
Georgi Gerganov
|
1cd3df46bd
scripts : remove amx sync
|
1 year ago |
Georgi Gerganov
|
c505471857
sync : ggml
|
1 year ago |
mahorozte
|
e9e661bd59
CUDA: remove unnecessary warp reduce in FA (ggml/1032)
|
1 year ago |
PAB
|
efb6ae9630
feat: add `GGML_UNARY_OP_ARGMAX` Metal kernel (ggml/1019)
|
1 year ago |
PAB
|
667d70d170
metal : add `GGML_OP_CONV_TRANSPOSE_1D` kernels (ggml/1026)
|
1 year ago |
Xuan Son Nguyen
|
3b4f2e33e2
llama : add missing LLAMA_API for llama_chat_builtin_templates (#10636)
|
1 year ago |
Nikolaos Pothitos
|
82bca2257b
readme : add option, update default value, fix formatting (#10271)
|
1 year ago |
Georgi Gerganov
|
0115df2f65
metal : small-batch mat-mul kernels (#10581)
|
1 year ago |
Georgi Gerganov
|
515d4e5372
github : minify link [no ci] (revert)
|
1 year ago |
Georgi Gerganov
|
844e2e1fee
github : minify link [no ci]
|
1 year ago |
Georgi Gerganov
|
70b98fadbc
server : fix default draft model parameters (#10586)
|
1 year ago |
Xuan Son Nguyen
|
642330ac7c
llama : add enum for built-in chat templates (#10623)
|
1 year ago |
Georgi Gerganov
|
8648c52101
make : deprecate (#10514)
|
1 year ago |
haopeng
|
64ed2091b2
server: Add "tokens per second" information in the backend (#10548)
|
1 year ago |
Akarshan Biswas
|
991f8aabee
SYCL: Fix and switch to GGML_LOG system instead of fprintf (#10579)
|
1 year ago |
Georgi Gerganov
|
4cb003dd8d
contrib : refresh (#10593)
|
1 year ago |
Juk Armstrong
|
917786f43d
Add `mistral-v1`, `mistral-v3`, `mistral-v3-tekken` and `mistral-v7` chat template types (#10572)
|
1 year ago |
Georgi Gerganov
|
5e1ed95583
grammars : add English-only grammar (#10612)
|
1 year ago |
Wang Qin
|
5c7a5aa0c3
ci: add error handling for Python venv creation in run.sh (#10608)
|
1 year ago |
Diego Devesa
|
3420909dff
ggml : automatic selection of best CPU backend (#10606)
|
1 year ago |
alek3y
|
86dc11c5bc
server : bind to any port when specified (#10590)
|
1 year ago |