cturan/llama.cpp

mirror of https://github.com/cturan/llama.cpp

Author	SHA1 Message	Date
Johannes Gäßler	1613ef8d8e CUDA: CUDART < 11.7 workaround for __hmax, __hmax2 (#7019)	1 year ago
Georgi Gerganov	9c67c2773d ggml : add Flash Attention (#5021)	1 year ago
Carolinabanana	5dc9dd7152 llama : add Command R Plus support (#6491)	1 year ago
Georgi Gerganov	d48ccf3ad4 sync : ggml (#6351)	1 year ago
slaren	ae1f211ce2 cuda : refactor into multiple files (#6269)	1 year ago