Piotr Wilkin (ilintar) | 6648989673 | Add pwilkin to CODEOWNERS for chat files (#17789) | 1 month ago
Johannes Gäßler | e95d0bc8fd | CUDA: fix FA VKQ accumulator overflow (#17746) | 1 month ago
Jiacheng (Jason) Chen | 668ed76574 | HIP: enable WMMA-MMQ INT kernels for RDNA 3 (#17576) | 1 month ago
Sigbjørn Skjæret | 03d9a77b85 | ci : transform release binary root dir in tar to llama-bXXXX (#17773) | 1 month ago
Gabe Goodhart | 3143a755c8 | docs : update ops.md (Metal, BLAS) (#17768) | 1 month ago
Piotr Wilkin (ilintar) | 96fe9badfc | Add support for CUMSUM and TRI for CUDA. (#17584) | 1 month ago
Gabe Goodhart | bde188d60f | metal: TRI, FILL, EXPM1, SOFTPLUS (#16623) | 1 month ago
Xuan-Son Nguyen | 9d0229967a | server: strip content-length header on proxy (#17734) | 1 month ago
Xuan-Son Nguyen | c4c10bfb86 | server: move msg diffs tracking to HTTP thread (#17740) | 1 month ago
Daniel Bevenius | 817d743cc1 | examples : add missing code block end marker [no ci] (#17756) | 1 month ago
Daniel Bevenius | bd4ef13476 | common : skip model validation when --help is requested (#17755) | 1 month ago
Alberto Cabrera Pérez | 87a2084c45 | ggml-cpu : remove asserts always evaluating to false (#17728) | 1 month ago
SmartestWashingMachine | 3659aa28e9 | convert: use existing local chat_template if mistral-format model has one. (#17749) | 1 month ago
Adrien Gallouët | 2a73f81f8a | cmake : simplify build info detection using standard variables (#17423) | 1 month ago
Sigbjørn Skjæret | 7dba049b07 | ci : disable ggml-ci-x64-amd-* (#17753) | 1 month ago
Adrien Gallouët | 83c1171529 | common: use native MultiByteToWideChar (#17738) | 1 month ago
Georgi Gerganov | 0d1324856f | metal : use params per pipeline instance (#17739) | 1 month ago
Georgi Gerganov | a67ef0f47f | llama : fix sanity checks during quantization (#17721) | 1 month ago
Adrien Gallouët | ef75a89fdb | build : move _WIN32_WINNT definition to headers (#17736) | 1 month ago
Jeff Bolz | d8b5cdc4fe | build: enable parallel builds in msbuild using MTT (#17708) | 1 month ago
Herman Semenoff | dea9ba27cb | ggml-cpu: remove duplicate conditional check 'iid' (#17650) | 1 month ago
Piotr Wilkin (ilintar) | c6d1a00aa7 | Add a couple of file types to the text section (#17670) | 1 month ago
SmartestWashingMachine | 424c579455 | convert : support latest mistral-common (fix conversion with --mistral-format) (#17712) | 1 month ago
Aleksander Grygier | e9f9483464 | Use OpenAI-compatible `/v1/models` endpoint by default (#17689) | 1 month ago
Andika Wasisto | 41c5e02f42 | webui: Fix zero pasteLongTextToFileLen to disable conversion being overridden (#17445) | 1 month ago
Johannes Gäßler | 2e1c9cd814 | CUDA: generalized (mma) FA, add Volta support (#17505) | 1 month ago
Georgi Gerganov | 190c4838bd | chat : reserve memory in compute_diffs and improve naming (#17729) | 1 month ago
Pascal | e7c2cf1356 | server: add router multi-model tests (#17704) (#17722) | 1 month ago
Adrien Gallouët | 1257491047 | server : fix bad fmt, size() is a size_type (#17735) | 1 month ago
Adrien Gallouët | 083e18b11c | cmake: explicitly link against crypt32 on non-MSVC Windows builds (#17727) | 1 month ago