SebastianApel
|
437e77855a
10+% performance improvement of ggml_vec_dot_q4_0 on AVX2 (#654)
|
2 years ago |
Marian Cepok
|
c0bb1d3ce2
ggml : change ne to int64_t (#626)
|
2 years ago |
Stephan Walter
|
3525899277
Enable -std= for cmake builds, fix warnings (#598)
|
2 years ago |
slaren
|
1d08882afa
Optimize AVX2 ggml_vec_dot_q4_0 (#642)
|
2 years ago |
perserk
|
02c5b27e91
Add AVX acceleration (#617)
|
2 years ago |
Justine Tunney
|
6f23ba5ee2
Ensure --mlock works properly with mmap() support
|
2 years ago |
Slaren
|
c03ae8dca1
Add mmap support for model files
|
2 years ago |
Casey Primozic
|
a4755cf288
Remove unused variable (#607)
|
2 years ago |
Georgi Gerganov
|
77efdf5a50
ggml : fix NEON signs (close #620, #622)
|
2 years ago |
slaren
|
ed3c680bcd
Fix GGML_F32Cx8_STORE in AVX without F16C path (#619)
|
2 years ago |
Georgi Gerganov
|
b51c717d5c
ggml : init time on first ggml_init() call
|
2 years ago |
Georgi Gerganov
|
cea1c85948
ggml : add ARM_NEON dequantize_row_q4_1()
|
2 years ago |
Georgi Gerganov
|
f202ada131
ggml : add ARM_NEON quantize_row_q4_1()
|
2 years ago |
Georgi Gerganov
|
3b44d30d9b
ggml : add ARM_NEON ggml_vec_dot_q4_1()
|
2 years ago |
anzz1
|
83df5639eb
Fix GCC warning about binary literal (#595)
|
2 years ago |
anzz1
|
5a5f8b1501
Enable Fused-Multiply-Add (FMA) and F16C/CVT16 vector extensions on MSVC (#375)
|
2 years ago |
slaren
|
2a98bc18ea
ggml : add AVX2 implementation of quantize_row_q4_1 (#515)
|
2 years ago |
Stephan Walter
|
99c5b27654
ggml : refactor quantized processing functions (#509)
|
2 years ago |
Stephan Walter
|
436e561931
all : be more strict about converting float to double (#458)
|
2 years ago |
Stephan Walter
|
c1f885067c
ggml : introduce structs for the q4 data blocks (#356)
|
2 years ago |
slaren
|
a6bdc47cba
Fix usage of F16C intrinsics in AVX code (#563)
|
2 years ago |
Stephan Walter
|
939ad2d3a5
Fix undefined variables in debug build, remove unused variables (#531)
|
2 years ago |
slaren
|
459e93cce0
Add AVX2 implementation of dequantize_row_q4_1 (#505)
|
2 years ago |
Georgi Gerganov
|
a316a425d0
Overhaul the examples structure
|
2 years ago |
Georgi Gerganov
|
ecbe466a36
Retire the ggml_mul_mat() branch for transposed src0 (#500)
|
2 years ago |
slaren
|
09aecbf628
Add AVX2 implementation of dequantize_row_q4_0 (#467)
|
2 years ago |
Georgi Gerganov
|
6b6dbc8910
Remove obsolete assert and fix compiler warning
|
2 years ago |
Georgi Gerganov
|
2a2e63ce05
Fix nasty bug in ggml_compute_forward_mul_mat_f32() and reenable BLAS
|
2 years ago |
Georgi Gerganov
|
8520fc310e
Disable BLAS altogether - the bug is not just for quantized mat mul
|
2 years ago |
Georgi Gerganov
|
b3f460e941
Disable BLAS branch in mul_mat - seems there is a bug
|
2 years ago |