| Author | Commit | Message | Date |
|---|---|---|---|
| DAN™ | 889bdd7686 | command-r : add BPE pre-tokenization (#7063) | 1 year ago |
| Georgi Gerganov | 92139b90af | tests : add test-tokenizer-0.sh + fix some tokenizers (#7036) | 1 year ago |
| alwqx | 6ecf3189e0 | chore: fix typo in llama.cpp (#7032) | 1 year ago |
| Georgi Gerganov | 9c67c2773d | ggml : add Flash Attention (#5021) | 1 year ago |
| Georgi Gerganov | f4ab2a4147 | llama : fix BPE pre-tokenization (#6920) | 1 year ago |
| Johannes Gäßler | c4f708a93f | llama : fix typo LAMMAFILE -> LLAMAFILE (#6974) | 1 year ago |
| Xuan Son Nguyen | 7bb36ccf91 | gguf : enforce that tensor names are unique (#6905) | 1 year ago |
| agray3 | 928e0b7013 | Reset schedule earlier to allow overlap with ggml graph computation on device (#6933) | 1 year ago |
| Pierrick Hymbert | 0c4d489e29 | quantize: add imatrix and dataset metadata in GGUF (#6658) | 1 year ago |
| slaren | 017e6999b5 | add basic tensor data validation function (#6884) | 1 year ago |
| Georgi Gerganov | dba497e0c1 | cmake : restore LLAMA_LLAMAFILE_DEFAULT | 1 year ago |
| slaren | d6e1d44f16 | llama : synchronize before get/set session data (#6911) | 1 year ago |
| slaren | 0ead1f1072 | llama : check that all the tensor data is in the model file (#6885) | 1 year ago |
| Georgi Gerganov | aa750c1ede | tests : minor bash stuff (#6902) | 1 year ago |
| jiez | 1966eb2615 | quantize : add '--keep-split' to quantize model into shards (#6688) | 1 year ago |
| Douglas Hanley | b4e4b8a935 | llama : add llama_get_pooling_type function (#6862) | 1 year ago |
| Johannes Gäßler | 28103f4832 | Server: fix seed for multiple slots (#6835) | 1 year ago |
| Tristan Druyen | abd3314064 | llama : add phi 3 chat template (#6857) | 1 year ago |
| liuwei-git | c8297c6af5 | llama : add phi3 support (#6852) | 1 year ago |
| Georgi Gerganov | 8960fe86ae | llama : fix typo in <\|im_end\|> token text (#6745) | 1 year ago |
| Georgi Gerganov | 40f74e4d73 | llama : add option to render special/control tokens (#6807) | 1 year ago |
| Wouter | 7dbdba5690 | llama : add llama-3 chat template (#6751) | 1 year ago |
| Pedro Cuenca | b97bc3966e | llama : support Llama 3 HF conversion (#6745) | 1 year ago |
| nopperl | 9958c81b79 | Implement the OLMo architecture (#6741) | 1 year ago |
| slaren | 0d56246f4b | ggml : group all experts in a single ggml_mul_mat_id (#6505) | 1 year ago |
| Ren Xuancheng | e11b2e6e1e | Qwen2 : assume tied weights if lm_head/output weights is missing (#6738) | 1 year ago |
| slaren | c71bfd736e | llama : fix compatibility with old 2 expert models (#6735) | 1 year ago |
| Georgi Gerganov | 532c1737a1 | llama : make general.name optional (#6709) | 1 year ago |
| Ashish | dbceec87c0 | llama : add StableLM2 12B (#6635) | 1 year ago |
| Shijie | f4dea7da18 | llama : add qwen2moe (#6074) | 1 year ago |