cturan/llama.cpp

mirror of https://github.com/cturan/llama.cpp

Author	SHA1 Message	Date
agray3	bc4bba364f Introduction of CUDA Graphs to LLama.cpp (#6766)	1 year ago
DAN™	e00b4a8f81 Fix more int overflow during quant (PPL/CUDA). (#6563)	1 year ago
slaren	0d56246f4b ggml : group all experts in a single ggml_mul_mat_id (#6505)	1 year ago
Carolinabanana	5dc9dd7152 llama : add Command R Plus support (#6491)	1 year ago
Kawrakow	55c1b2a3bb IQ1_M: 1.75 bpw quantization (#6302)	1 year ago
slaren	ae1f211ce2 cuda : refactor into multiple files (#6269)	1 year ago