
Commit history

Author SHA1 Message Date
  Marcus Dunn d5ab29757e llama : constified `llama_set_state_data`'s `src` (#5774) 1 year ago
  Georgi Gerganov 08c5ee87e4 llama : remove deprecated API (#5770) 1 year ago
  compilade adcb12a9ba llama : fix non-quantization of expert gating tensors (#5754) 1 year ago
  Douglas Hanley 177628bfd8 llama : improve BERT tokenization (#5740) 1 year ago
  Kawrakow 0becb22ac0 IQ4_XS: a 4.25 bpw quantization (#5747) 1 year ago
  Georgi Gerganov 9d533a77d0 llama : fix defrag bugs + add parameter (#5735) 1 year ago
  Kawrakow a33e6a0d2a Adding IQ2_S and IQ2_M to complete coverage of the 2-3 bit quantization range (#5721) 1 year ago
  AidanBeltonS e849078c6e [SYCL] Add support for soft_max ALiBi (#5639) 1 year ago
  Georgi Gerganov 269de86ba0 llama : fix Gemma rope type (#5691) 1 year ago
  Georgi Gerganov bf08e00643 llama : refactor k-shift implementation + KV defragmentation (#5691) 1 year ago
  Georgi Gerganov ab336a9d5e code : normalize enum names (#5697) 1 year ago
  Kawrakow 4c4cb30736 IQ3_S: a much better alternative to Q3_K (#5676) 1 year ago
  Jared Van Bortel 15499eb942 mpt : do not duplicate token_embd.weight on disk (#5670) 1 year ago
  Georgi Gerganov 96633eeca1 gemma : use more bits for the token_embd.weight tensor (#5650) 1 year ago
  Georgi Gerganov 847eedbdb2 py : add Gemma conversion from HF models (#5647) 1 year ago
  Xuan Son Nguyen 373ee3fbba Add Gemma chat template (#5665) 1 year ago
  Georgi Gerganov 3a03541ced minor : fix trailing whitespace (#5638) 1 year ago
  Xuan Son Nguyen a46f50747b server : fallback to chatml, add AlphaMonarch chat template (#5628) 1 year ago
  Dat Quoc Nguyen 4ef245a92a mpt : add optional bias tensors (#5638) 1 year ago
  slaren 973053d8b0 llama : fix loading models with shared tok_embd and output (#5651) 1 year ago
  slaren 7fe4678b02 llama : fix session save/load with quantized KV (#5649) 1 year ago
  slaren ba2135ccae gemma : allow offloading the output tensor (#5646) 1 year ago
  postmasters 580111d42b llama : add `gemma` model (#5631) 1 year ago
  Kawrakow a14679cc30 IQ4_NL: 4-bit non-linear quants with blocks of 32 (#5590) 1 year ago
  Xuan Son Nguyen 9c405c9f9a Server: use llama_chat_apply_template (#5593) 1 year ago
  Georgi Gerganov f53119cec4 minor : fix trailing whitespace (#5538) 1 year ago
  Xuan Son Nguyen 11b12de39b llama : add llama_chat_apply_template() (#5538) 1 year ago
  Kawrakow bd2d4e393b 1.5 bit quantization (#5453) 1 year ago
  Georgi Gerganov 8f1be0d42f ggml : add ALiBi support for ggml_soft_max_ext (#5488) 1 year ago
  Herman Semenov 65085c713e llama : minor fixed return int value (#5529) 1 year ago