cturan/llama.cpp

Author	SHA1 Message	Date
ddh0	f6dcda3900 server : context checkpointing for hybrid and recurrent models (#16382)	3 months ago
Johannes Gäßler	e789095502 llama: print memory breakdown on exit (#15860)	3 months ago
Georgi Gerganov	b730706a49 kv-cache : support layer reuse (#15504)	4 months ago
Georgi Gerganov	d32e03f449 server : add SWA checkpoints (#15293)	5 months ago
l3utterfly	7233358d29 memory : handle saving/loading null layers in recurrent memory (#14675)	6 months ago
Georgi Gerganov	01612b7409 llama : reuse compute graphs (#14482)	6 months ago
compilade	4a5686da22 llama : support Jamba hybrid Transformer-Mamba models (#7531)	6 months ago
compilade	bb4f7a9e4e memory : fix broken batch splits for recurrent cache (#14575)	6 months ago
Georgi Gerganov	67d1ef23c6 batch : add optional for sequential equal split (#14511)	6 months ago
Georgi Gerganov	c79184d2d1 batch : add n_used count (#14512)	6 months ago
Georgi Gerganov	745f11fed0 memory : correctly handle failure in apply() (#14438)	6 months ago
Georgi Gerganov	43678060c1 recurrent : call balloc split_reset() in init_batch() (#14414)	6 months ago
Georgi Gerganov	692e3cdd0a memory : rename interface to llama_memory_context_i (#14296)	7 months ago
Georgi Gerganov	4c9fdfbe15 ubatch : new splitting logic (#14217)	7 months ago
Gabe Goodhart	edc4a29eff memory : Hybrid recurrent cache (#13979)	7 months ago