Georgi Gerganov 080b161995 completion : fix prompt cache for recurrent models (#19045) 4 days ago
batched-bench 147a521636 tool/ex/tests: consistently free ctx, then model (#18168) 1 month ago
cli 16639ba217 common : use two decimal places for float arg help messages (#19048) 4 days ago
completion 080b161995 completion : fix prompt cache for recurrent models (#19045) 4 days ago
cvector-generator 254098a279 common : refactor common_sampler + grammar logic changes (#17937) 1 month ago
export-lora 07808ebb07 cmake : Do not install tools on iOS targets (#15903) 4 months ago
fit-params e9fd8dcab4 llama-fit-params: keep explicit --ctx-size 0 (#19070) 4 days ago
gguf-split 6c2131773c cli: new CLI experience (#17824) 1 month ago
imatrix 254098a279 common : refactor common_sampler + grammar logic changes (#17937) 1 month ago
llama-bench aa1dc3770a Setting mmap and direct_io to false as default in llama-bench.cpp (#18841) 1 week ago
mtmd 9eb5bfec1a mtmd : update docs to use llama_model_n_embd_inp (#18999) 6 days ago
perplexity 254098a279 common : refactor common_sampler + grammar logic changes (#17937) 1 month ago
quantize 33ded988ba quantize: prevent input/output file collision (#18451) 4 weeks ago
rpc d2d626938a Install rpc-server when GGML_RPC is ON. (#17149) 2 months ago
server 16639ba217 common : use two decimal places for float arg help messages (#19048) 4 days ago
tokenize 07808ebb07 cmake : Do not install tools on iOS targets (#15903) 4 months ago
tts 516a4ca9b5 refactor : remove libcurl, use OpenSSL when available (#18828) 2 weeks ago
CMakeLists.txt a180ba78c7 cmake: only build cli when server is enabled (#18670) 2 weeks ago