Kunshang Ji | 7f412dab9c | enable CPU HBM (#2603) | 2 years ago
Cebtenzzre | 00d62adb79 | fix some warnings from gcc and clang-tidy (#3038) | 2 years ago
Przemysław Pawełczyk | fec2fb19e4 | ggml : posixify madvise and pagesize (#3037) | 2 years ago
Georgi Gerganov | 35938ee3b0 | llama : update logic for number of threads when using BLAS | 2 years ago
Georgi Gerganov | 921772104b | speculative : add grammar support (#2991) | 2 years ago
Georgi Gerganov | e36ecdccc8 | build : on Mac OS enable Metal by default (#2901) | 2 years ago
opparco | 3730134776 | llama : fix bpe tokenize from byte (#2889) | 2 years ago
momonga | c42f0ec6b3 | examples : fix gpt-neox (#2943) | 2 years ago
Kerfuffle | 5d6f19f16b | Allow quantize to only copy tensors, some other improvements (#2931) | 2 years ago
m3ndax | ee8654bcd0 | minor : add const qualifiers (#2853) | 2 years ago
Cebtenzzre | ef15649972 | build : fix most gcc and clang warnings (#2861) | 2 years ago
DannyDaemonic | e8422de39e | @vxiiduu's fix for PrefetchVirtualMemory (#2930) | 2 years ago
Johannes Gäßler | 8afe228000 | CUDA: mul_mat_q=true llama_context_params default (#2912) | 2 years ago
Kawrakow | e37e69dcc3 | 10X faster BPE tokenizer (#2876) | 2 years ago
xaedes | 44c117f41e | train : mem usage and other improvements (#2439) | 2 years ago
Johannes Gäßler | 6b73ef1201 | YAML result logging + preset script (#2657) | 2 years ago
grahameth | be475f60af | llama.cpp : fix wrong vsnprintf call in MS compiler (#2856) | 2 years ago
Georgi Gerganov | c10704d01e | llama : fix MPI threads (close #2827) | 2 years ago
Kawrakow | 463173a6c0 | llama : speedup tokenization (#2831) | 2 years ago
Georgi Gerganov | eaa13a48ff | falcon : fix CUDA inference by making K and Q contiguous (#2830) | 2 years ago
Kawrakow | a6d1189fdd | k_quants tuning for Falcon-7b (#2816) | 2 years ago
Georgi Gerganov | d0cee0d36d | gguf : add 64-bit support (GGUF v2) (#2821) | 2 years ago
Georgi Gerganov | edd4c14817 | llama : more tokenizer fixes (#2810) | 2 years ago
Przemysław Pawełczyk | 1591e2e590 | ggml : detect SSSE3 (#2825) | 2 years ago
Tim Miller | c7d92e6dfe | llama : use Unicode Escape Sequence to replace encoded characters (#2814) | 2 years ago
Cebtenzzre | 741ca7dd1c | llama : move #includes out of _GNU_SOURCE conditional (#2817) | 2 years ago
Cebtenzzre | 50526f37eb | llama : use std::abs in llama_sample_tail_free (#2800) | 2 years ago
Georgi Gerganov | 04f4b1eb10 | k-quants : remove unnecessary tensor shape restrictions (#2811) | 2 years ago
Kawrakow | 7592375403 | Better perplexity for 2- and 3-bit quantization for LLaMA-v2-70B (#2807) | 2 years ago
klosax | 2ba83c8685 | Fix spm whitespaces (#2806) | 2 years ago