Gabe Goodhart
|
e1fa9569ba
server : add SSL support (#5926)
|
1 year ago |
Pierrick Hymbert
|
fd72d2d2a5
server: tests: add truncated prompt tests, better kv cache size (#5933)
|
1 year ago |
compilade
|
c2101a2e90
llama : support Mamba Selective State Space Models (#5328)
|
1 year ago |
compilade
|
515f7d0d4f
llama : fix quantization of shared token_embd (#5944)
|
1 year ago |
Pierrick Hymbert
|
76e868821a
server: metrics: add llamacpp:prompt_seconds_total and llamacpp:tokens_predicted_seconds_total, reset bucket only on /metrics. Fix values cast to int. Add Process-Start-Time-Unix header. (#5937)
|
1 year ago |
Don Mahurin
|
e457fb3540
llama : assume tied weights if lm_head/output weights is missing (#5824)
|
1 year ago |
Georgi Gerganov
|
af37fd8b30
server : fix EOS token detection with disabled cache (#5938)
|
1 year ago |
UEXTM.com
|
581ed5c4fe
log : fix MSVC compile errors (#5643)
|
1 year ago |
Georgi Gerganov
|
6cdabe6526
llama-bench : add embeddings option (#5924)
|
1 year ago |
Neo Zhang Jianyu
|
89fb735fcf
Revert "[SYCL] fix error when set main gpu to non-zero (#5901)" (#5918)
|
1 year ago |
Minsoo Cheong
|
55a2a900ff
server : add `/v1/completions` endpoint (#5914)
|
1 year ago |
Georgi Gerganov
|
2002bc96bf
server : refactor (#5882)
|
1 year ago |
Neo Zhang Jianyu
|
ceca1aef07
[SYCL] fix error when set main gpu to non-zero (#5901)
|
1 year ago |
Jared Van Bortel
|
e04e04f8fa
ggml : use SYS_get_cpu if SYS_getcpu is not defined (#5906)
|
1 year ago |
bobqianic
|
e25fb4b18f
ggml : use `uint8x16_t` return type for `ggml_vqtbl1q_u8` (#5894)
|
1 year ago |
Georgi Gerganov
|
1e35d619a6
convert : remove AWQ remnants (#5768)
|
1 year ago |
Neo Zhang Jianyu
|
8ced9f7e32
add wait() to make code stable (#5895)
|
1 year ago |
slaren
|
652ca2bded
compare-llama-bench.py : remove mul_mat_q (#5892)
|
1 year ago |
Jared Van Bortel
|
bd836944f8
quants : use MM256_SET_M128I consistently to fix gcc 7 build (#5889)
|
1 year ago |
ExtReMLapin
|
3de31677d3
grammars : blacklists character control set (#5888)
|
1 year ago |
Georgi Gerganov
|
82cb31eb93
Revert "grammars : don't allow to output unescaped new line in string (#5885)"
|
1 year ago |
ExtReMLapin
|
b1a4e994fd
grammars : don't allow to output unescaped new line in string (#5885)
|
1 year ago |
0cc4m
|
61d1c88e15
Vulkan Improvements (#5835)
|
1 year ago |
Neo Zhang Jianyu
|
21b0867433
[SYCL] fix mul_mat fault in CI/unit-test (#5862)
|
1 year ago |
Minsoo Cheong
|
6a87ac3a52
fix editorconfig check break (#5879)
|
1 year ago |
Jeffrey Quesnelle
|
29eee40474
fix speculative decoding build on windows (#5874)
|
1 year ago |
hutli
|
1d41d6f7c2
nix: static build (#5814)
|
1 year ago |
Georgi Gerganov
|
29ae62d2ae
llama : fix embeddings (#5796)
|
1 year ago |
Georgi Gerganov
|
e0843afe1b
flake : fix
|
1 year ago |
Georgi Gerganov
|
a1c6d96ed8
ggml : fix unknown status (#0)
|
1 year ago |