cturan/llama.cpp

Author	SHA1 Message	Date
Georgi Gerganov	ab336a9d5e code : normalize enum names (#5697)	1 year ago
Kawrakow	4c4cb30736 IQ3_S: a much better alternative to Q3_K (#5676)	1 year ago
Jared Van Bortel	15499eb942 mpt : do not duplicate token_embd.weight on disk (#5670)	1 year ago
Georgi Gerganov	96633eeca1 gemma : use more bits for the token_embd.weight tensor (#5650)	1 year ago
Georgi Gerganov	847eedbdb2 py : add Gemma conversion from HF models (#5647)	1 year ago
Xuan Son Nguyen	373ee3fbba Add Gemma chat template (#5665)	1 year ago
Georgi Gerganov	3a03541ced minor : fix trailing whitespace (#5638)	1 year ago
Xuan Son Nguyen	a46f50747b server : fallback to chatml, add AlphaMonarch chat template (#5628)	1 year ago
Dat Quoc Nguyen	4ef245a92a mpt : add optional bias tensors (#5638)	1 year ago
slaren	973053d8b0 llama : fix loading models with shared tok_embd and output (#5651)	1 year ago
slaren	7fe4678b02 llama : fix session save/load with quantized KV (#5649)	1 year ago
slaren	ba2135ccae gemma : allow offloading the output tensor (#5646)	1 year ago
postmasters	580111d42b llama : add `gemma` model (#5631)	1 year ago
Kawrakow	a14679cc30 IQ4_NL: 4-bit non-linear quants with blocks of 32 (#5590)	1 year ago
Xuan Son Nguyen	9c405c9f9a Server: use llama_chat_apply_template (#5593)	1 year ago
Georgi Gerganov	f53119cec4 minor : fix trailing whitespace (#5538)	1 year ago
Xuan Son Nguyen	11b12de39b llama : add llama_chat_apply_template() (#5538)	1 year ago
Kawrakow	bd2d4e393b 1.5 bit quantization (#5453)	1 year ago
Georgi Gerganov	8f1be0d42f ggml : add ALiBi support for ggml_soft_max_ext (#5488)	1 year ago
Herman Semenov	65085c713e llama : minor fixed return int value (#5529)	1 year ago
bmwl	f486f6e1e5 ggml : add numa options (#5377)	1 year ago
Douglas Hanley	4524290e87 Use correct type of pooling for embedding models (#5500)	1 year ago
Jared Van Bortel	ea9c8e1143 llama : add support for Nomic Embed (#5468)	1 year ago
Aarni Koskela	c4e6dd59e4 llama : allow raw byte in SPM vocabs; don't crash on nl 404 (#5478)	1 year ago
Aarni Koskela	037259be68 llama : make load error reporting more granular (#5477)	1 year ago
Georgi Gerganov	cf45252a7c tests : multi-thread the tokenizer tests (#5474)	1 year ago
Douglas Hanley	03bf161eb6 llama : support batched embeddings (#5466)	1 year ago
Georgi Gerganov	49cc1f7d67 bert : add tests + fix quantization (#5475)	1 year ago
Georgi Gerganov	099afc6274 llama : fix quantization when tensors are missing (#5423)	1 year ago
Georgi Gerganov	3b169441df sync : ggml (#5452)	1 year ago
