Kawrakow
|
72ff5282bf
metal : add Q2_K implementation (#1762)
|
2 years ago |
Georgi Gerganov
|
0bf7cf1b29
Revert "ggml : load data into int8x16x4_t using vld4q_s8 on arm64 (#1738)"
|
2 years ago |
le.chang
|
8432d4d9f7
ggml : load data into int8x16x4_t using vld4q_s8 on arm64 (#1738)
|
2 years ago |
Kawrakow
|
0f291e1f65
metal : Q6_K implementation (#1752)
|
2 years ago |
qingfengfenga
|
8fc8179919
Add llama.cpp docker support for non-latin languages (#1673)
|
2 years ago |
Steven Roussey
|
b50b570ed9
ggml : fix fprintf warnings (#1720)
|
2 years ago |
Georgi Gerganov
|
53aba3f393
clang-tidy : restore dot file from accidental deletion
|
2 years ago |
Kawrakow
|
4161bdc04d
metal : add Q4_K implementation (#1733)
|
2 years ago |
johnson442
|
0035858273
k-quants : add missing compile definition to CMakeLists (#1748)
|
2 years ago |
Georgi Gerganov
|
5c64a0952e
k-quants : allow to optionally disable at compile time (#1734)
|
2 years ago |
jacobi petrucciani
|
5b57a5b726
flake : update to support metal on m1/m2 (#1724)
|
2 years ago |
Georgi Gerganov
|
4dc62c545d
readme : add June roadmap
|
2 years ago |
Willy Tarreau
|
35a84916fb
main: add the possibility to open the prompt cache read-only (#1640)
|
2 years ago |
Georgi Gerganov
|
2d7bf110ed
llama : fix vram_scratch var
|
2 years ago |
Georgi Gerganov
|
2a4e41a086
llama : fix compile warnings
|
2 years ago |
Johannes Gäßler
|
17366df842
Multi GPU support, CUDA refactor, CUDA scratch buffer (#1703)
|
2 years ago |
Georgi Gerganov
|
44f906e853
metal : add f16 support
|
2 years ago |
LostRuins
|
d5b111f53d
Clblast fixes + enhancements to save VRAM and offload more layers (#1675)
|
2 years ago |
Georgi Gerganov
|
2d43387daf
ggml : fix builds, add ggml-quants-k.o (close #1712, close #1710)
|
2 years ago |
Georgi Gerganov
|
7ad7750c5c
gitignore : add .clang-tidy
|
2 years ago |
Georgi Gerganov
|
7a74dee6b4
llama : temporary disable Q6_K output quantization (#1711)
|
2 years ago |
Spencer Sutton
|
590250f7a9
metal : add checks for buffer size (#1706)
|
2 years ago |
Yuval Peled
|
f4c55d3bd7
docs : add performance troubleshoot + example benchmark documentation (#1674)
|
2 years ago |
Foul-Tarnished
|
f1465624c2
readme : fix typo (#1700)
|
2 years ago |
mgroeber9110
|
c2df36d60d
llama : consistently catch and throw only exceptions deriving from std::exception (#1599)
|
2 years ago |
kiltyj
|
9d0693bce3
metal : use shared buffers between CPU and GPU (#1696)
|
2 years ago |
grahameth
|
efe0507632
ggml : fix internal overflow in ggml_time_us on Windows (#1702)
|
2 years ago |
Georgi Gerganov
|
e7fe66e670
ci : disable auto tidy (#1705)
|
2 years ago |
Kawrakow
|
99009e72f8
ggml : add SOTA 2,3,4,5,6 bit k-quantizations (#1684)
|
2 years ago |
Henri Vasserman
|
5220a991a5
Increase 3B scratch buffers. (#1698)
|
2 years ago |