cturan/llama.cpp

Author	SHA1 Message	Date
Michelle Tan	89d604f2c8 server: Fix `has_next_line` in JSON response (#10818)	1 year ago
kallewoof	484d2f31ae bug-fix: snprintf prints NULL in place of the last character (#10419)	1 year ago
Xuan Son Nguyen	3573fa8e7b server : (refactor) no more json in server_task input (#10691)	1 year ago
Georgi Gerganov	ce4a7b8493 server : various fixes (#10704)	1 year ago
Xuan Son Nguyen	6c5bc0625f server : (refactoring) do not rely on JSON internally (#10643)	1 year ago
haopeng	64ed2091b2 server: Add "tokens per second" information in the backend (#10548)	1 year ago
Georgi Gerganov	d9d54e498d speculative : refactor and add a simpler example (#10362)	1 year ago
sasha0552	42cadc74bd server : fix slot selection by lru (#10126)	1 year ago
sasha0552	d865d1478c server : fix smart selection of available slot (#10120)	1 year ago
Georgi Gerganov	8d8ff71536 llama : remove Tail-Free sampling (#10071)	1 year ago
Georgi Gerganov	8125e6cbfc server : don't overfill the batch during infill (#10018)	1 year ago
Xuan Son Nguyen	958367bf53 server : refactor slot input data, move tokenizer to HTTP thread (#10023)	1 year ago
VoidIsVoid	a89f75e1b7 server : handle "logprobs" field with false value (#9871)	1 year ago
Georgi Gerganov	c7181bd294 server : reuse cached context chunks (#9866)	1 year ago
Diego Devesa	7eee341bee common : use common_ prefix for common library functions (#9805)	1 year ago
Xuan Son Nguyen	458367a906 server : better security control for public deployments (#9776)	1 year ago
Georgi Gerganov	f4d2b8846a llama : add reranking support (#9510)	1 year ago
Vinesh Janarthanan	8a308354f6 server : match OAI structured output response (#9527)	1 year ago
Georgi Gerganov	6262d13e0b common : reimplement logging (#9418)	1 year ago
Mathijs Henquet	78203641fe server : Add option to return token pieces in /tokenize endpoint (#9108)	1 year ago
Xuan Son Nguyen	6e7d133a5f server : refactor multitask handling (#9274)	1 year ago
ardfork	978ba3d83d Server: Don't ignore llama.cpp params (#8754)	1 year ago
Georgi Gerganov	4e24cffd8c server : handle content array in chat API (#8449)	1 year ago
Xuan Son Nguyen	48e6b92cc3 Add chat template support for llama-cli (#8068)	1 year ago
sasha0552	7a16ce7db2 server : smart slot selection using Longest Common Prefix (#7728)	1 year ago
Georgi Gerganov	1442677f92 common : refactor cli arg parsing (#7675)	1 year ago
Benjamin Findley	e586ee4259 change default temperature of OAI compat API from 0 to 1 (#7226)	1 year ago
Johannes Gäßler	c12452c7ae JSON: [key] -> .at(key), assert() -> GGML_ASSERT (#7143)	1 year ago
Xuan Son Nguyen	1fd9c1741d clean up json_value & server_log (#7142)	1 year ago
Pedro Cuenca	b97bc3966e llama : support Llama 3 HF conversion (#6745)	1 year ago
