Johannes Gäßler
|
a2c199e479
common: clarify instructions for bug reports (#18134)
|
1 month ago |
Johannes Gäßler
|
b1f3a6e5db
llama: automatically set parameters not set by the user in such a way that maximizes GPU utilization (#16653)
|
1 month ago |
Georgi Gerganov
|
254098a279
common : refactor common_sampler + grammar logic changes (#17937)
|
1 month ago |
Sigbjørn Skjæret
|
22577583a3
common : change --color to accept on/off/auto, default to auto (#17827)
|
1 month ago |
Adrien Gallouët
|
83c1171529
common: use native MultiByteToWideChar (#17738)
|
1 month ago |
Reese Levine
|
7ca5991d2b
ggml webgpu: add support for emscripten builds (#17184)
|
1 month ago |
Xuan-Son Nguyen
|
13628d8bdb
server: add --media-path for local media files (#17697)
|
1 month ago |
Xuan-Son Nguyen
|
ec18edfcba
server: introduce API for serving / loading / unloading multiple models (#17470)
|
1 month ago |
Aaron Teo
|
877566d512
llama: introduce support for model-embedded sampling parameters (#17120)
|
1 month ago |
Georgi Gerganov
|
196f5083ef
common : more accurate sampling timing (#17382)
|
1 month ago |
Xuan-Son Nguyen
|
9b17d74ab7
mtmd: add mtmd_log_set (#17268)
|
2 months ago |
Xuan-Son Nguyen
|
aa3b7a90b4
arg: add --cache-list argument to list cached models (#17073)
|
2 months ago |
Gadflyii
|
3df2244df4
llama : add --no-host to disable host buffers (#16310)
|
3 months ago |
Aaron Teo
|
624207e676
devops: add s390x & ppc64le CI (#15925)
|
3 months ago |
Douglas Hanley
|
b5bd037832
llama : add support for qwen3 reranker (#15824)
|
3 months ago |
Uilian Ries
|
152729f884
common : add missing chrono header for common.cpp (#16211)
|
3 months ago |
Johannes Gäßler
|
e81b8e4b7f
llama: use FA + max. GPU layers by default (#15434)
|
4 months ago |
Sigbjørn Skjæret
|
84ab83cc0b
model : jina-embeddings-v3 support (#13693)
|
4 months ago |
Georgi Gerganov
|
9ebebef62f
llama : remove KV cache defragmentation logic (#15473)
|
4 months ago |
Jie Fu (傅杰)
|
2f3dbffb17
common : fix incorrect print of non-ascii characters in the logging (#15466)
|
4 months ago |
Jonathan Graehl
|
5cdb27e091
finetune: SGD optimizer, more CLI args (#13873)
|
5 months ago |
Diego Devesa
|
d6818d06a6
llama : allow other bufts when overriding to CPU, add --no-repack option (#14990)
|
5 months ago |
compilade
|
90083283ec
imatrix : use GGUF to store importance matrices (#9400)
|
6 months ago |
Georgi Gerganov
|
225e7a1438
llama : add high-throughput mode (#14363)
|
6 months ago |
Georgi Gerganov
|
6ffd4e9c44
server : pre-calculate EOG logit biases (#14721)
|
6 months ago |
Ruikai Peng
|
dd6e6d0b6a
vocab : prevent tokenizer overflow (#14301)
|
7 months ago |
fanyang
|
456af35eb7
build : suppress gcc15 compile warnings (#14261)
|
7 months ago |
Diego Devesa
|
6adc3c3ebc
llama : add thread safety test (#14035)
|
7 months ago |
Georgi Gerganov
|
d3e64b9f49
llama : rework embeddings logic (#14208)
|
7 months ago |
bandoti
|
2e89f76b7a
common: fix issue with regex_escape routine on windows (#14133)
|
7 months ago |