Jared Van Bortel
|
d199ca79f2
mpt : implement backwards compatibility with duped output tensor (#6139)
|
1 year ago |
slaren
|
2bf8d0f7c4
backend : offload large batches to GPU (#6083)
|
1 year ago |
slaren
|
d84c48505f
llama : fix Baichuan2 13B (#6092)
|
1 year ago |
Theia Vogel
|
877b4d0c62
llama : add support for control vectors (#5970)
|
1 year ago |
Andrew Canis
|
12247f4c69
llama : add Command-R support (#6033)
|
1 year ago |
Neo Zhang Jianyu
|
46acb36767
fix set main gpu error (#6073)
|
1 year ago |
Xuan Son Nguyen
|
aab606a11f
llama : add Orion chat template (#6066)
|
1 year ago |
Georgi Gerganov
|
4755afd1cb
llama : fix integer overflow during quantization (#6063)
|
1 year ago |
Michael Podvitskiy
|
69ff61397d
llama : support models without vocabulary (#5798)
|
1 year ago |
Georgi Gerganov
|
a44bc969e4
llama : fix typo
|
1 year ago |
Michael Podvitskiy
|
2c4fb69246
llama : optimize defrag moves + fix fragmentation calculation (#6037)
|
1 year ago |
slaren
|
f30ea47a87
llama : add pipeline parallelism support (#6017)
|
1 year ago |
gliptic
|
5cdb371731
grammar : fix unnecessarily retained pointer to rules (#6003)
|
1 year ago |
Georgi Gerganov
|
05b06210c9
llama : more consistent names of count variables (#5994)
|
1 year ago |
Georgi Gerganov
|
83796e62bc
llama : refactor unicode stuff (#5992)
|
1 year ago |
Michael Podvitskiy
|
3202361c5b
ggml, ci : Windows ARM runner and build fixes (#5979)
|
1 year ago |
Georgi Gerganov
|
ee35600b90
llama : fix F16/F32 downcast + improve names (#5980)
|
1 year ago |
DAN™
|
bcebd7dbf6
llama : add support for GritLM (#5959)
|
1 year ago |
slaren
|
d894f352bf
perplexity : support using multiple sequences to allow larger batch sizes (#5946)
|
1 year ago |
Georgi Gerganov
|
5b09797321
ggml : remove old quantization functions (#5942)
|
1 year ago |
compilade
|
c2101a2e90
llama : support Mamba Selective State Space Models (#5328)
|
1 year ago |
compilade
|
515f7d0d4f
llama : fix quantization of shared token_embd (#5944)
|
1 year ago |
Don Mahurin
|
e457fb3540
llama : assume tied weights if lm_head/output weights is missing (#5824)
|
1 year ago |
Neo Zhang Jianyu
|
89fb735fcf
Revert "[SYCL] fix error when set main gpu to non-zero (#5901)" (#5918)
|
1 year ago |
Georgi Gerganov
|
2002bc96bf
server : refactor (#5882)
|
1 year ago |
Neo Zhang Jianyu
|
ceca1aef07
[SYCL] fix error when set main gpu to non-zero (#5901)
|
1 year ago |
0cc4m
|
61d1c88e15
Vulkan Improvements (#5835)
|
1 year ago |
Georgi Gerganov
|
29ae62d2ae
llama : fix embeddings (#5796)
|
1 year ago |
Xuan Son Nguyen
|
4ffcdce2ff
add alias for chat template (#5858)
|
1 year ago |
Douglas Hanley
|
475df1d6cf
llama : allow for user specified embedding pooling type (#5849)
|
1 year ago |