Jared Van Bortel | 32c8486e1f | wpm : portable unicode tolower (#6305) | 1 year ago
compilade | 557410b8f0 | llama : greatly reduce output buffer memory usage (#6122) | 1 year ago
Kawrakow | 55c1b2a3bb | IQ1_M: 1.75 bpw quantization (#6302) | 1 year ago
Kawrakow | d25b1c31b0 | quantize : be able to override metadata by key (#6321) | 1 year ago
slaren | 280345968d | cuda : rename build flag to LLAMA_CUDA (#6299) | 1 year ago
Meng, Hengyu | ddf6568510 | [SYCL] offload op (#6217) | 1 year ago
Jared Van Bortel | 94d1b3b411 | use _wfopen instead of fopen on Windows (#6248) | 1 year ago
Pierrick Hymbert | f482bb2e49 | common: llama_load_model_from_url split support (#6192) | 1 year ago
Julius Arkenberg | 476b0251b2 | llama : add grok-1 support (#6204) | 1 year ago
Kawrakow | 1d0331c12a | quantize: options for output and token embedding tensors qtype (#6239) | 1 year ago
Pierrick Hymbert | dba1af6129 | llama_model_loader: support multiple split/shard GGUFs (#6187) | 1 year ago
Nexesenex | e80f06d2a1 | llama : correction of the attn.v.weight quantization for IQ3_XS (#6209) | 1 year ago
Georgi Gerganov | 95d576b48e | metal : pad n_ctx by 32 (#6177) | 1 year ago
Jared Van Bortel | d199ca79f2 | mpt : implement backwards compatiblity with duped output tensor (#6139) | 1 year ago
slaren | 2bf8d0f7c4 | backend : offload large batches to GPU (#6083) | 1 year ago
slaren | d84c48505f | llama : fix Baichuan2 13B (#6092) | 1 year ago
Theia Vogel | 877b4d0c62 | llama : add support for control vectors (#5970) | 1 year ago
Andrew Canis | 12247f4c69 | llama : add Command-R support (#6033) | 1 year ago
Neo Zhang Jianyu | 46acb36767 | fix set main gpu error (#6073) | 1 year ago
Xuan Son Nguyen | aab606a11f | llama : add Orion chat template (#6066) | 1 year ago
Georgi Gerganov | 4755afd1cb | llama : fix integer overflow during quantization (#6063) | 1 year ago
Michael Podvitskiy | 69ff61397d | llama : support models without vocabulary (#5798) | 1 year ago
Georgi Gerganov | a44bc969e4 | llama : fix typo | 1 year ago
Michael Podvitskiy | 2c4fb69246 | llama : optimize defrag moves + fix fragmentation calculation (#6037) | 1 year ago
slaren | f30ea47a87 | llama : add pipeline parallelism support (#6017) | 1 year ago
gliptic | 5cdb371731 | grammar : fix unnecessarily retained pointer to rules (#6003) | 1 year ago
Georgi Gerganov | 05b06210c9 | llama : more consistent names of count variables (#5994) | 1 year ago
Georgi Gerganov | 83796e62bc | llama : refactor unicode stuff (#5992) | 1 year ago
Michael Podvitskiy | 3202361c5b | ggml, ci : Windows ARM runner and build fixes (#5979) | 1 year ago
Georgi Gerganov | ee35600b90 | llama : fix F16/F32 downcast + improve names (#5980) | 1 year ago