cturan/llama.cpp

Author	SHA1	Message	Date
Georgi Gerganov	225e7a1438	llama : add high-throughput mode (#14363)	6 months ago
Georgi Gerganov	67d1ef23c6	batch : add optional for sequential equal split (#14511)	7 months ago
Georgi Gerganov	c79184d2d1	batch : add n_used count (#14512)	7 months ago
Georgi Gerganov	a70c8a0c4b	kv-cache : use ggml_set_rows (#14285)	7 months ago
Georgi Gerganov	745f11fed0	memory : correctly handle failure in apply() (#14438)	7 months ago
Georgi Gerganov	692e3cdd0a	memory : rename interface to llama_memory_context_i (#14296)	7 months ago
Georgi Gerganov	4c9fdfbe15	ubatch : new splitting logic (#14217)	7 months ago
Gabe Goodhart	edc4a29eff	memory : Hybrid recurrent cache (#13979)	7 months ago
Georgi Gerganov	d3e64b9f49	llama : rework embeddings logic (#14208)	7 months ago
Georgi Gerganov	c3ee46fab4	batch : remove logits_all flag (#14141)	7 months ago
Georgi Gerganov	9596506965	kv-cache : fix split_equal handling in unified implementation (#14130)	7 months ago
Georgi Gerganov	745aa5319b	llama : deprecate llama_kv_self_ API (#14030)	8 months ago
Georgi Gerganov	3e63a58ef7	kv-cache : refactor the update/defrag mechanism (#13988)	8 months ago
Georgi Gerganov	0fc16b42e8	kv-cache : split implementation in separate sources (#13920)	8 months ago