Ren Xuancheng
|
229ffff872
llama : add BPE pre-tokenization for Qwen2 (#7114)
|
1 year ago |
DAN™
|
4cd621c26d
convert : add BPE pre-tokenization for DBRX (#7132)
|
1 year ago |
Justine Tunney
|
3855416027
ggml : introduce bfloat16 support (#6412)
|
1 year ago |
nopperl
|
b6aa670203
Fix OLMo HF to GGUF conversion (#6910)
|
1 year ago |
DAN™
|
889bdd7686
command-r : add BPE pre-tokenization (#7063)
|
1 year ago |
Georgi Gerganov
|
92139b90af
tests : add test-tokenizer-0.sh + fix some tokenizers (#7036)
|
1 year ago |
Daniel Bevenius
|
433def286e
llama : rename ctx to user_data in progress_callback (#7045)
|
1 year ago |
Georgi Gerganov
|
9c67c2773d
ggml : add Flash Attention (#5021)
|
1 year ago |
Georgi Gerganov
|
f4ab2a4147
llama : fix BPE pre-tokenization (#6920)
|
1 year ago |
Pierrick Hymbert
|
0c4d489e29
quantize: add imatrix and dataset metadata in GGUF (#6658)
|
1 year ago |
slaren
|
017e6999b5
add basic tensor data validation function (#6884)
|
1 year ago |
jiez
|
1966eb2615
quantize : add '--keep-split' to quantize model into shards (#6688)
|
1 year ago |
Douglas Hanley
|
b4e4b8a935
llama : add llama_get_pooling_type function (#6862)
|
1 year ago |
Johannes Gäßler
|
28103f4832
Server: fix seed for multiple slots (#6835)
|
1 year ago |
Georgi Gerganov
|
40f74e4d73
llama : add option to render special/control tokens (#6807)
|
1 year ago |
Pedro Cuenca
|
b97bc3966e
llama : support Llama 3 HF conversion (#6745)
|
1 year ago |
Olivier Chafik
|
cbaadc9294
grammars: 1.5x faster inference w/ complex grammars (vector reserves / reuses) (#6609)
|
1 year ago |
Jared Van Bortel
|
1b67731e18
BERT tokenizer fixes (#6498)
|
1 year ago |
Rick G
|
e3c337d87c
llama : support negative ith in llama_get_ API (#6519)
|
1 year ago |
Jan Boon
|
beea6e1b16
llama : save and restore kv cache for single seq id (#6341)
|
1 year ago |
Clint Herron
|
9b84ae1806
examples : add GBNF validator program (#5948)
|
1 year ago |
Jared Van Bortel
|
be55134a53
convert : refactor vocab selection logic (#6355)
|
1 year ago |
compilade
|
557410b8f0
llama : greatly reduce output buffer memory usage (#6122)
|
1 year ago |
Kawrakow
|
55c1b2a3bb
IQ1_M: 1.75 bpw quantization (#6302)
|
1 year ago |
Kawrakow
|
d25b1c31b0
quantize : be able to override metadata by key (#6321)
|
1 year ago |
Kawrakow
|
1d0331c12a
quantize: options for output and token embedding tensors qtype (#6239)
|
1 year ago |
Pierrick Hymbert
|
dba1af6129
llama_model_loader: support multiple split/shard GGUFs (#6187)
|
1 year ago |
Theia Vogel
|
877b4d0c62
llama : add support for control vectors (#5970)
|
1 year ago |
Michael Podvitskiy
|
69ff61397d
llama : support models without vocabulary (#5798)
|
1 year ago |
slaren
|
f30ea47a87
llama : add pipeline parallelism support (#6017)
|
1 year ago |