cturan/llama.cpp

Mirrored from https://github.com/cturan/llama.cpp

Author	SHA1 Message	Date
Georgi Gerganov	9c67c2773d ggml : add Flash Attention (#5021)	1 year ago
compilade	557410b8f0 llama : greatly reduce output buffer memory usage (#6122)	1 year ago
slaren	2bf8d0f7c4 backend : offload large batches to GPU (#6083)	1 year ago
slaren	f30ea47a87 llama : add pipeline parallelism support (#6017)	1 year ago
Michael Podvitskiy	9fa2627347 ggml : introduce ggml_status (ggml/750)	1 year ago
UEXTM.com	5f70671856 Introduce backend GUIDs (ggml/743)	1 year ago
Jared Van Bortel	fbf1ddec69 Nomic Vulkan backend (#4456)	1 year ago