sjxx | b231b37b09 readme : update UI list (#6560) | 1 year ago |
Jiří Sejkora | ba5e134e07 readme: fix typo in amdgpu target name (#6573) | 1 year ago |
Jared Van Bortel | 1b67731e18 BERT tokenizer fixes (#6498) | 1 year ago |
Georgi Gerganov | c4a3a4ff47 sync : ggml | 1 year ago |
Ed Lee | 400d5d722d server : detect search query to start webchat (#6554) | 1 year ago |
Carolinabanana | 5dc9dd7152 llama : add Command R Plus support (#6491) | 1 year ago |
Georgi Gerganov | e11a8999b5 license : update copyright notice + add AUTHORS (#6405) | 1 year ago |
Georgi Gerganov | cc4a95426d llama : fix attention layer count sanity check (#6550) | 1 year ago |
kunnis | cecd8d3c98 Comment explaining a decision (#6531) | 1 year ago |
Georgi Gerganov | b73e564b16 quantize : fix precedence of cli args (#6541) | 1 year ago |
Rick G | e3c337d87c llama : support negative ith in llama_get_ API (#6519) | 1 year ago |
Jan Boon | beea6e1b16 llama : save and restore kv cache for single seq id (#6341) | 1 year ago |
Abhilash Majumder | 87fb5b4234 remove row=1 cond (#6532) | 1 year ago |
Firat | d752327c33 Adding KodiBot to UI list (#6535) | 1 year ago |
Mark Fairbairn | 855f54402e Change Windows AMD example to release build to make inference much faster. (#6525) | 1 year ago |
Georgi Gerganov | b909236c0b flake.lock: Update (#6517) | 1 year ago |
DAN™ | e0717e751e Add GritLM as supported models. (#6513) | 1 year ago |
Georgi Gerganov | c37247796b sync : ggml | 1 year ago |
Slava Primenko | f77261a7c5 ggml: bypass code incompatible with CUDA < 11.1 (whisper/2020) | 1 year ago |
Georgi Gerganov | 43e8995e75 scripts : sync ggml-cuda folder | 1 year ago |
limitedAtonement | 9472bce308 Run make to build the project (#6457) | 1 year ago |
Neo Zhang Jianyu | d4f220a5cc support/fix OPs GGML_TYPE_IQ4_NL, GGML_TYPE_IQ4_XS, GGML_TYPE_IQ3_XXS, GGML_TYPE_IQ3_S, GGML_TYPE_IQ2_XXS, GGML_TYPE_IQ2_XS, GGML_TYPE_IQ2_S, GGML_TYPE_IQ1_S, GGML_TYPE_IQ1_M (#6521) | 1 year ago |
Georgi Gerganov | 54ea0698fb sync : ggml | 1 year ago |
Daniel Bevenius | b66aec675c backend : fix typo in scheduler documentation (ggml/781) | 1 year ago |
Clint Herron | 57dd02c44b Tests: Added integration tests for GBNF parser (#6472) | 1 year ago |
Pierrick Hymbert | 75cd4c7729 ci: bench: support sse and fix prompt processing time / server: add tokens usage in stream OAI response (#6495) | 1 year ago |
Brian | a8bd14d557 gguf.py : add licence and version to gguf writer (#6504) | 1 year ago |
Hoang Nguyen | d0f5deebf8 readme : update UI list (#6503) | 1 year ago |
Ting Sun | 87e21bbacd bench : make n_batch and n_ubatch configurable in Batched bench (#6500) | 1 year ago |
Ouadie EL FAROUKI | 1b496a745c [SYCL] Fixed minor bug when enabling FP16 for non intel targets (#6464) | 1 year ago |