| Author | Commit | Message | Date |
| --- | --- | --- | --- |
| Georgi Gerganov | 5ef07e25ac | server : handle models with missing EOS token (#8997) | 1 year ago |
| Mathieu Geli | daef3ab233 | server : add one level list nesting for embeddings (#8936) | 1 year ago |
| Xuan Son Nguyen | 1e6f6554aa | server : add lora hotswap endpoint (WIP) (#8857) | 1 year ago |
| Liu Jia | 0a4ce78681 | common : Changed tuple to struct (TODO fix) (#8823) | 1 year ago |
| ardfork | 978ba3d83d | Server: Don't ignore llama.cpp params (#8754) | 1 year ago |
| RunningLeon | 3807c3de04 | server : respect `--special` cli arg (#8553) | 1 year ago |
| Douglas Hanley | c3ebcfa148 | server : ensure batches are either all embed or all completion (#8420) | 1 year ago |
| Clint Herron | 278d0e1846 | Initialize default slot sampling parameters from the global context. (#8418) | 1 year ago |
| Clint Herron | a59f8fdc85 | Server: Enable setting default sampling parameters via command-line (#8402) | 1 year ago |
| Bjarke Viksøe | cb4d86c4d7 | server: Retrieve prompt template in /props (#8337) | 1 year ago |
| Sigbjørn Skjæret | 38373cfbab | Add SPM infill support (#8016) | 1 year ago |
| Xuan Son Nguyen | 48e6b92cc3 | Add chat template support for llama-cli (#8068) | 1 year ago |
| sasha0552 | ba58993152 | server : fix smart slot selection (#8020) | 1 year ago |
| Sigbjørn Skjæret | 91c188d6c2 | Only use FIM middle token if it exists (#7648) | 1 year ago |
| Georgi Gerganov | 704a35b183 | server : restore numeric prompts (#7883) | 1 year ago |
| Georgi Gerganov | d9da0e4986 | server : improve "prompt" handling (#7847) | 1 year ago |
| sasha0552 | 7a16ce7db2 | server : smart slot selection using Longest Common Prefix (#7728) | 1 year ago |
| woodx | a5cabd7649 | server : do not get prompt in infill mode (#7286) | 1 year ago |
| Georgi Gerganov | f83351f9a6 | imatrix : migrate to gpt_params (#7771) | 1 year ago |
| Georgi Gerganov | 1442677f92 | common : refactor cli arg parsing (#7675) | 1 year ago |
| Yazan Agha-Schrader | 2e666832e6 | server : new UI (#7633) | 1 year ago |
| Georgi Gerganov | 6ff13987ad | common : normalize naming style (#7462) | 1 year ago |
| Georgi Gerganov | e932094d58 | server : return error on too large embedding input (#7389) | 1 year ago |
| Johannes Gäßler | 41858392e1 | server: fix seed being reported back (#7382) | 1 year ago |
| Radoslav Gerganov | ee94172d33 | server : add support for the RPC backend (#7305) | 1 year ago |
| Steve Grubb | 4f0263633b | server: free sampling contexts on exit (#7264) | 1 year ago |
| Xuan Son Nguyen | 72c177c1f6 | fix system prompt handling (#7153) | 1 year ago |
| Steve Grubb | 988631335a | server : free llama_batch on exit (#7212) | 1 year ago |
| Johannes Gäßler | 5ae3426b0b | server: fix reported top tokens for temperature 0 (#7203) | 1 year ago |
| Johannes Gäßler | c12452c7ae | JSON: [key] -> .at(key), assert() -> GGML_ASSERT (#7143) | 1 year ago |