cturan/llama.cpp

mirror of https://github.com/cturan/llama.cpp

Author	SHA1 Message	Date
Georgi Gerganov	9c67c2773d ggml : add Flash Attention (#5021)	1 year ago
DAN™	e00b4a8f81 Fix more int overflow during quant (PPL/CUDA). (#6563)	1 year ago
slaren	ae1f211ce2 cuda : refactor into multiple files (#6269)	1 year ago