Cebtenzzre
|
20c7e1e804
gguf : fix a few general keys (#3341)
|
2 years ago |
Rickard Hallerbäck
|
dc6897404e
metal : reusing llama.cpp logging (#3152)
|
2 years ago |
Johannes Gäßler
|
8185710a80
CUDA: use only 1 thread if fully offloaded (#2915)
|
2 years ago |
Cebtenzzre
|
a5661d7e71
llama : allow gguf RoPE keys to be overridden with defaults (#3240)
|
2 years ago |
slaren
|
8b428c9bc8
llama.cpp : show model size and BPW on load (#3223)
|
2 years ago |
goerch
|
b08e75baea
Fixing the last deviations from sentencepiece indicated by test-tokenizer-1 (#3170)
|
2 years ago |
Cebtenzzre
|
3aefaab9e5
check C++ code with -Wmissing-declarations (#3184)
|
2 years ago |
Meng Zhang
|
4fe09dfe66
llama : add support for StarCoder model architectures (#3187)
|
2 years ago |
Georgi Gerganov
|
a51b687657
metal : relax conditions on fast matrix multiplication kernel (#3168)
|
2 years ago |
Cebtenzzre
|
98311c4277
llama : make quantize example up to 2.7x faster (#3115)
|
2 years ago |
jameswu2014
|
4c8643dd6e
feature : support Baichuan serial models (#3009)
|
2 years ago |
goerch
|
71ca2fad7d
whisper : tokenizer fix + re-enable tokenizer test for LLaMa (#3096)
|
2 years ago |
Cebtenzzre
|
e64f5b5578
examples : make n_ctx warning work again (#3066)
|
2 years ago |
Przemysław Pawełczyk
|
cb6c44c5e0
build : do not use _GNU_SOURCE gratuitously (#2035)
|
2 years ago |
Kunshang Ji
|
7f412dab9c
enable CPU HBM (#2603)
|
2 years ago |
Cebtenzzre
|
00d62adb79
fix some warnings from gcc and clang-tidy (#3038)
|
2 years ago |
Przemysław Pawełczyk
|
fec2fb19e4
ggml : posixify madvise and pagesize (#3037)
|
2 years ago |
Georgi Gerganov
|
35938ee3b0
llama : update logic for number of threads when using BLAS
|
2 years ago |
Georgi Gerganov
|
921772104b
speculative : add grammar support (#2991)
|
2 years ago |
Georgi Gerganov
|
e36ecdccc8
build : on Mac OS enable Metal by default (#2901)
|
2 years ago |
opparco
|
3730134776
llama : fix bpe tokenize from byte (#2889)
|
2 years ago |
momonga
|
c42f0ec6b3
examples : fix gpt-neox (#2943)
|
2 years ago |
Kerfuffle
|
5d6f19f16b
Allow quantize to only copy tensors, some other improvements (#2931)
|
2 years ago |
m3ndax
|
ee8654bcd0
minor : add const qualifiers (#2853)
|
2 years ago |
Cebtenzzre
|
ef15649972
build : fix most gcc and clang warnings (#2861)
|
2 years ago |
DannyDaemonic
|
e8422de39e
@vxiiduu's fix for PrefetchVirtualMemory (#2930)
|
2 years ago |
Johannes Gäßler
|
8afe228000
CUDA: mul_mat_q=true llama_context_params default (#2912)
|
2 years ago |
Kawrakow
|
e37e69dcc3
10X faster BPE tokenizer (#2876)
|
2 years ago |
xaedes
|
44c117f41e
train : mem usage and other improvements (#2439)
|
2 years ago |
Johannes Gäßler
|
6b73ef1201
YAML result logging + preset script (#2657)
|
2 years ago |