slaren | 49006c67b4 | llama : move random seed generation to the samplers (#9398) | 1 year ago
slaren | 5fb5e24811 | llama : minor sampling refactor (2) (#9386) | 1 year ago
Georgi Gerganov | df270ef745 | llama : refactor sampling v2 (#9294) | 1 year ago
compilade | 9bc6db28d0 | ggml-quants : ternary packing for TriLMs and BitNet b1.58 (#8151) | 1 year ago
Molly Sophia | 8f1d81a0b6 | llama : support RWKV v6 models (#8980) | 1 year ago
Sutou Kouhei | 0ab30f8d82 | llama : fix llama_split_mode enum values in main_gpu document (#9057) | 1 year ago
Faisal Zaghloul | 42c76d1358 | Threadpool: take 2 (#8672) | 1 year ago
compilade | a1631e53f6 | llama : simplify Mamba with advanced batch splits (#8526) | 1 year ago
Minsoo Cheong | c679e0cb5c | llama : add EXAONE model support (#9025) | 1 year ago
Zhenwei Jin | 4af8420afb | common : remove duplicate function llama_should_add_bos_token (#8778) | 1 year ago
Esko Toivonen | 6bda7ce6c3 | llama : add pre-tokenizer regexes for BLOOM and gpt3-finnish (#8850) | 1 year ago
Daniel Bevenius | 06943a69f6 | ggml : move rope type enum to ggml.h (#8949) | 1 year ago
fairydreaming | 7c3f55c100 | Add support for encoder-only T5 models (#8900) | 1 year ago
Nexes the Old | 31958546c3 | typo correction (#8891) | 1 year ago
compilade | 4c676c85e5 | llama : refactor session file management (#8699) | 1 year ago
Xuan Son Nguyen | b115105f05 | add llama_lora_adapter_clear (#8653) | 1 year ago
Georgi Gerganov | 938943cdbf | llama : move vocab, grammar and sampling into separate files (#8508) | 1 year ago
Keke Han | 081fe431aa | llama : fix codeshell support (#8599) | 1 year ago
Jason Stillerman | d94c6e0ccb | llama : add support for SmolLm pre-tokenizer (#8609) | 1 year ago
Michael Coppola | 940362224d | llama : add support for Tekken pre-tokenizer (#8579) | 1 year ago
Georgi Gerganov | d197545530 | llama : bump max layers from 256 to 512 (#8530) | 1 year ago
Georgi Gerganov | 0efec57787 | llama : valign + remove unused ftype (#8502) | 1 year ago
Xuan Son Nguyen | 97bdd26eee | Refactor lora adapter support (#8332) | 1 year ago
Dibakar Gope | 0f1a39f343 | ggml : add AArch64 optimized GEMV and GEMM Q4 kernels (#5780) | 1 year ago
toyer | 905942abdb | llama : support glm3 and glm4 (#8031) | 1 year ago
jaime-m-p | 213701b51a | Detokenizer fixes (#8039) | 1 year ago
Douglas Hanley | d12f781074 | llama : streamline embeddings from "non-embedding" models (#8087) | 1 year ago
fairydreaming | 807b0c49ff | Inference support for T5 and FLAN-T5 model families (#5763) | 1 year ago
Faisal Zaghloul | 968967376d | Add `JAIS` model(s) (#8118) | 1 year ago
kustaaya | f675b20a3b | Added support for Viking pre-tokenizer (#8135) | 1 year ago