cturan/llama.cpp

Author	SHA1 Message	Date
matiaslin	faac0bae26 common : ensure llama_batch size does not exceed max size (#9668)	1 year ago
nopperl	f99d3f8367 py : add model class for Chameleon conversion (#9683)	1 year ago
Georgi Gerganov	589b48d41e contrib : add Resources section (#9675)	1 year ago
Georgi Gerganov	f4d2b8846a llama : add reranking support (#9510)	1 year ago
slaren	1b2f992cd2 test-backend-ops : use flops for some performance tests (#9657)	1 year ago
Georgi Gerganov	739842703e llama : add comment about thread-safety [no ci] (#9449)	1 year ago
Zhenwei Jin	6102037bbb vocab : refactor tokenizer to reduce init overhead (#9449)	1 year ago
nopperl	9a913110cf llama : add support for Chameleon (#8543)	1 year ago
Aarni Koskela	43bcdd9703 readme : add tool (#9655)	1 year ago
Dan Johansson	6a0f779484 ggml : add run-time detection of neon, i8mm and sve (#9331)	1 year ago
Markus Tavenrath	89f9944981 Enable use to the rebar feature to upload buffers to the device. (#9251)	1 year ago
Georgi Gerganov	b5de3b74a5 readme : update hot topics	1 year ago
Borislav Stanimirov	44f59b4301 cmake : add option for common library (#9661)	1 year ago
Neo Zhang Jianyu	95bc82fbc0 [SYCL] add missed dll file in package (#9577)	1 year ago
R0CKSTAR	7691654c68 mtgpu: enable VMM (#9597)	1 year ago
Xuan Son Nguyen	ea9c32be71 ci : fix docker build number and tag name (#9638)	1 year ago
Charles Xu	1e43630218 ggml : remove assert for AArch64 GEMV and GEMM Q4 kernels (#9217)	1 year ago
Xuan Son Nguyen	afbbfaa537 server : add more env vars, improve gen-docs (#9635)	1 year ago
Gabe Goodhart	3d6bf6919f llama : add IBM Granite MoE architecture (#9438)	1 year ago
Dou Xinpeng	904837e0cb cann: fix crash when llama-bench is running on multiple cann devices (#9627)	1 year ago
Eric Zhang	70392f1f81 ggml : add AVX512DQ requirement for AVX512 builds (#9622)	1 year ago
Georgi Gerganov	bb5f819975 sync : ggml	1 year ago
Georgi Gerganov	c038931615 examples : adapt to ggml.h changes (ggml/0)	1 year ago
Georgi Gerganov	31ac5834fe llama : keep track of all EOG tokens in the vocab (#9609)	1 year ago
Georgi Gerganov	cea1486ecf log : add CONT level for continuing previous log entry (#9610)	1 year ago
StrangeBytesDev	0aa15011e3 server : add newline after chat example (#9616)	1 year ago
Georgi Gerganov	b0f27361f3 sampling : avoid expensive softmax during greedy sampling (#9605)	1 year ago
Max Krasnyansky	c087b6f11d threads: fix msvc build without openmp (#9615)	1 year ago
Ivan	116efee0ee cuda: add q8_0->f32 cpy operation (#9571)	1 year ago
Xuan Son Nguyen	0b3bf966f4 server : add --no-context-shift option (#9607)	1 year ago


Commit History