Georgi Gerganov | 3b44d30d9b | ggml : add ARM_NEON ggml_vec_dot_q4_1() | 2 years ago
anzz1 | 83df5639eb | Fix GCC warning about binary literal (#595) | 2 years ago
anzz1 | 5a5f8b1501 | Enable Fused-Multiply-Add (FMA) and F16C/CVT16 vector extensions on MSVC (#375) | 2 years ago
slaren | 2a98bc18ea | ggml : add AVX2 implementation of quantize_row_q4_1 (#515) | 2 years ago
Stephan Walter | 99c5b27654 | ggml : refactor quantized processing functions (#509) | 2 years ago
Stephan Walter | 436e561931 | all : be more strict about converting float to double (#458) | 2 years ago
Stephan Walter | c1f885067c | ggml : introduce structs for the q4 data blocks (#356) | 2 years ago
slaren | a6bdc47cba | Fix usage of F16C intrinsics in AVX code (#563) | 2 years ago
Stephan Walter | 939ad2d3a5 | Fix undefined variables in debug build, remove unused variables (#531) | 2 years ago
slaren | 459e93cce0 | Add AVX2 implementation of dequantize_row_q4_1 (#505) | 2 years ago
Georgi Gerganov | a316a425d0 | Overhaul the examples structure | 2 years ago
Georgi Gerganov | ecbe466a36 | Retire the ggml_mul_mat() branch for transposed src0 (#500) | 2 years ago
slaren | 09aecbf628 | Add AVX2 implementation of dequantize_row_q4_0 (#467) | 2 years ago
Georgi Gerganov | 6b6dbc8910 | Remove obsolete assert and fix compiler warning | 2 years ago
Georgi Gerganov | 2a2e63ce05 | Fix nasty bug in ggml_compute_forward_mul_mat_f32() and reenable BLAS | 2 years ago
Georgi Gerganov | 8520fc310e | Disable BLAS altogether - the bug is not just for quantized mat mul | 2 years ago
Georgi Gerganov | b3f460e941 | Disable BLAS branch in mul_mat - seems there is a bug | 2 years ago
Georgi Gerganov | 7a9b6c3a8b | Reduce memory usage and allocate enough memory for largest context (#473) | 2 years ago
Cameron Kaiser | 481044d50c | additional optimizations for POWER9 (#454) | 2 years ago
comex | 563cdc391d | Support calling mlock() on loaded model data on Linux and macOS (#453) | 2 years ago
Stephan Walter | 69c92298a9 | Deduplicate q4 quantization functions (#383) | 2 years ago
Valentyn Bezshapkin | 97940520e8 | fix: add POSIX functionality for Linux compilation (#51) | 2 years ago
Georgi Gerganov | f5a77a629b | Introduce C-style API (#370) | 2 years ago
Kevin Lo | 715d292ee0 | Add OpenBSD support (#314) | 2 years ago
Casey Primozic | 2e664f1ff4 | Add initial AVX512 support for dot product on Linux (#320) | 2 years ago
Georgi Gerganov | 22213a17b5 | Change RMSNorm eps to 1e-6 (#173) | 2 years ago
Stephan Walter | 367946c668 | Don't tell users to use a bad number of threads (#243) | 2 years ago
Matvey Soloviev | 904d2a8d6a | Q4_1 quantization (#193) | 2 years ago
Nebula | 9b4a15b17d | Fix RMS norm in GGML (#191) | 2 years ago
hoangmit | 6eac39ba95 | Add RMS norm and use it (#187) | 2 years ago