Author           | Commit     | Date       | Message
Marcus Dunn      | d5ab29757e | 1 year ago | llama : constified `llama_set_state_data`'s `src` (#5774)
Georgi Gerganov  | 08c5ee87e4 | 1 year ago | llama : remove deprecated API (#5770)
compilade        | adcb12a9ba | 1 year ago | llama : fix non-quantization of expert gating tensors (#5754)
Douglas Hanley   | 177628bfd8 | 1 year ago | llama : improve BERT tokenization (#5740)
Kawrakow         | 0becb22ac0 | 1 year ago | IQ4_XS: a 4.25 bpw quantization (#5747)
Georgi Gerganov  | 9d533a77d0 | 1 year ago | llama : fix defrag bugs + add parameter (#5735)
Kawrakow         | a33e6a0d2a | 1 year ago | Adding IQ2_S and IQ2_M to complete coverage of the 2-3 bit quantization range (#5721)
AidanBeltonS     | e849078c6e | 1 year ago | [SYCL] Add support for soft_max ALiBi (#5639)
Georgi Gerganov  | 269de86ba0 | 1 year ago | llama : fix Gemma rope type (#5691)
Georgi Gerganov  | bf08e00643 | 1 year ago | llama : refactor k-shift implementation + KV defragmentation (#5691)
Georgi Gerganov  | ab336a9d5e | 1 year ago | code : normalize enum names (#5697)
Kawrakow         | 4c4cb30736 | 1 year ago | IQ3_S: a much better alternative to Q3_K (#5676)
Jared Van Bortel | 15499eb942 | 1 year ago | mpt : do not duplicate token_embd.weight on disk (#5670)
Georgi Gerganov  | 96633eeca1 | 1 year ago | gemma : use more bits for the token_embd.weight tensor (#5650)
Georgi Gerganov  | 847eedbdb2 | 1 year ago | py : add Gemma conversion from HF models (#5647)
Xuan Son Nguyen  | 373ee3fbba | 1 year ago | Add Gemma chat template (#5665)
Georgi Gerganov  | 3a03541ced | 1 year ago | minor : fix trailing whitespace (#5638)
Xuan Son Nguyen  | a46f50747b | 1 year ago | server : fallback to chatml, add AlphaMonarch chat template (#5628)
Dat Quoc Nguyen  | 4ef245a92a | 1 year ago | mpt : add optional bias tensors (#5638)
slaren           | 973053d8b0 | 1 year ago | llama : fix loading models with shared tok_embd and output (#5651)
slaren           | 7fe4678b02 | 1 year ago | llama : fix session save/load with quantized KV (#5649)
slaren           | ba2135ccae | 1 year ago | gemma : allow offloading the output tensor (#5646)
postmasters      | 580111d42b | 1 year ago | llama : add `gemma` model (#5631)
Kawrakow         | a14679cc30 | 1 year ago | IQ4_NL: 4-bit non-linear quants with blocks of 32 (#5590)
Xuan Son Nguyen  | 9c405c9f9a | 1 year ago | Server: use llama_chat_apply_template (#5593)
Georgi Gerganov  | f53119cec4 | 1 year ago | minor : fix trailing whitespace (#5538)
Xuan Son Nguyen  | 11b12de39b | 1 year ago | llama : add llama_chat_apply_template() (#5538)
Kawrakow         | bd2d4e393b | 1 year ago | 1.5 bit quantization (#5453)
Georgi Gerganov  | 8f1be0d42f | 1 year ago | ggml : add ALiBi support for ggml_soft_max_ext (#5488)
Herman Semenov   | 65085c713e | 1 year ago | llama : minor fixed return int value (#5529)
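Among the commits above, #5538 introduces the llama_chat_apply_template() C API, which the server adopts in #5593 and which #5628 backs with a ChatML fallback. A minimal usage sketch, assuming the llama.h API as of these commits (the loaded llama_model pointer, message contents, and buffer size are illustrative, not taken from the log):

    // Sketch only: format a short chat with the template stored in the
    // model's GGUF metadata (passing tmpl = NULL selects it).
    #include <stdbool.h>
    #include <stdio.h>
    #include "llama.h"

    static void print_formatted_prompt(const struct llama_model * model) {
        const struct llama_chat_message chat[] = {
            { "system", "You are a helpful assistant." },
            { "user",   "Hello!"                       },
        };
        char buf[1024];
        // Returns the number of bytes needed; it can exceed the buffer size,
        // in which case the caller should retry with a larger buffer.
        const int32_t n = llama_chat_apply_template(
                model, NULL, chat, 2, /*add_ass=*/true, buf, sizeof(buf));
        if (n < 0 || n > (int32_t) sizeof(buf)) {
            fprintf(stderr, "llama_chat_apply_template failed or buffer too small\n");
            return;
        }
        printf("%.*s\n", n, buf);
    }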