cturan/llama.cpp

Author	SHA1 Message	Date
Diego Devesa	ea02c753eb cuda : clear error after changing peer access (#10153)	1 year ago
Georgi Gerganov	05697f670b metal : simplify f16 and f32 dequant kernels (#0)	1 year ago
Georgi Gerganov	f8e58135cf metal : move dequantize templates to beginning of MSL source (#0)	1 year ago
leo-pony	329ed914c9 CANN: adjust backend registry refactor. (#10158)	1 year ago
Georgi Gerganov	ce027adfb3 sync : ggml	1 year ago
Yuri Khrustalev	284e5b0275 cmake : make it possible linking ggml as external lib (ggml/1003)	1 year ago
Plamen Minev	e2292aaa17 metal : fix minor string leaks (ggml/1004)	1 year ago
Diego Devesa	9f40989351 ggml : move CPU backend to a separate file (#10144)	1 year ago
Georgi Gerganov	08828a6d7d metal : minor fixup in FA kernel (#10143)	1 year ago
Georgi Gerganov	1839f69130 flake.lock: Update (#10146)	1 year ago
Christian Köhnenkamp	9830b6923b Add apple arm to presets (#10134)	1 year ago
sasha0552	42cadc74bd server : fix slot selection by lru (#10126)	1 year ago
Georgi Gerganov	45950415ed server : fix endpoint checks (#10135)	1 year ago
Georgi Gerganov	1926d6e39d llama : adjust default context size + print warnings (#10136)	1 year ago
Diego Devesa	b634f8a26f simple-chat : only add bos on first prompt (#10129)	1 year ago
Xuan Son Nguyen	7554aa4655 convert-lora : make `--base` optional (#10110)	1 year ago
Diego Devesa	a6744e43e8 llama : add simple-chat example (#10124)	1 year ago
Diego Devesa	e991e3127f llama : use smart pointers for ggml resources (#10117)	1 year ago
Shupei Fan	418f5eef26 vulkan : improve ggml_vk_create_buffer error handling (#9898)	1 year ago
Georgi Gerganov	ba6f62eb79 readme : update hot topics	1 year ago
sasha0552	d865d1478c server : fix smart selection of available slot (#10120)	1 year ago
Georgi Gerganov	1804adb0cf ggml : remove ggml_scratch (#10121)	1 year ago
Georgi Gerganov	815fe72adc sync : ggml	1 year ago
Georgi Gerganov	f221d56220 ggml : alloc ggml_contexts on the heap (whisper/2525)	1 year ago
Zhenwei Jin	e597e50794 build: fix build error in Windows env with OneAPI setup (#10107)	1 year ago
Diego Devesa	85679d37f3 llama : improve output buffer type selection (#10098)	1 year ago
Diego Devesa	1e9f94994e quantize : fix --keep-split (#10114)	1 year ago
Diego Devesa	c02e5ab2a6 llama : fix buffer checks for mamba and rwk (#10111)	1 year ago
Zhenwei Jin	ab3d71f97f loader: refactor tensor weights storage (#9935)	1 year ago
Kevin Gibbons	0a683e8088 server : include scheme when printing URL (#10106)	1 year ago
