Georgi Gerganov
|
a316a425d0
Overhaul the examples structure
|
2 years ago |
Georgi Gerganov
|
ecbe466a36
Retire the ggml_mul_mat() branch for transposed src0 (#500)
|
2 years ago |
slaren
|
09aecbf628
Add AVX2 implementation of dequantize_row_q4_0 (#467)
|
2 years ago |
Georgi Gerganov
|
6b6dbc8910
Remove obsolete assert and fix compiler warning
|
2 years ago |
Georgi Gerganov
|
2a2e63ce05
Fix nasty bug in ggml_compute_forward_mul_mat_f32() and reenable BLAS
|
2 years ago |
Georgi Gerganov
|
8520fc310e
Disable BLAS altogether - the bug is not just for quantized mat mul
|
2 years ago |
Georgi Gerganov
|
b3f460e941
Disable BLAS branch in mul_mat - seems there is a bug
|
2 years ago |
Georgi Gerganov
|
7a9b6c3a8b
Reduce memory usage and allocate enough memory for largest context (#473)
|
2 years ago |
Cameron Kaiser
|
481044d50c
additional optimizations for POWER9 (#454)
|
2 years ago |
comex
|
563cdc391d
Support calling mlock() on loaded model data on Linux and macOS (#453)
|
2 years ago |
Stephan Walter
|
69c92298a9
Deduplicate q4 quantization functions (#383)
|
2 years ago |
Valentyn Bezshapkin
|
97940520e8
fix: add POSIX functionality for Linux compilation (#51)
|
2 years ago |
Georgi Gerganov
|
f5a77a629b
Introduce C-style API (#370)
|
2 years ago |
Kevin Lo
|
715d292ee0
Add OpenBSD support (#314)
|
2 years ago |
Casey Primozic
|
2e664f1ff4
Add initial AVX512 support for dot product on Linux (#320)
|
2 years ago |
Georgi Gerganov
|
22213a17b5
Change RMSNorm eps to 1e-6 (#173)
|
2 years ago |
Stephan Walter
|
367946c668
Don't tell users to use a bad number of threads (#243)
|
2 years ago |
Matvey Soloviev
|
904d2a8d6a
Q4_1 quantization (#193)
|
2 years ago |
Nebula
|
9b4a15b17d
Fix RMS norm in GGML (#191)
|
2 years ago |
hoangmit
|
6eac39ba95
Add RMS norm and use it (#187)
|
2 years ago |
hoangmit
|
113e685d18
inline -> static inline for "bytesFromNibbles" (#161)
|
2 years ago |
Ronsor
|
47857e564c
Don't use vdotq_s32 if it's not available (#139)
|
2 years ago |
Thomas Klausner
|
41be0a3b3d
Add NetBSD support. (#90)
|
2 years ago |
Georgi Gerganov
|
84d9015c4a
Use vdotq_s32 to improve performance (#67)
|
2 years ago |
Georgi Gerganov
|
c80e2a8f2a
Revert "10% performance boost on ARM"
|
2 years ago |
Georgi Gerganov
|
54a0e66ea0
Check for vdotq_s32 availability
|
2 years ago |
Georgi Gerganov
|
543c57e991
Amend to previous commit - forgot to update non-QRDMX branch
|
2 years ago |
Georgi Gerganov
|
113a9e83eb
10% performance boost on ARM
|
2 years ago |
Sebastián A
|
eb062bb012
Windows fixes (#31)
|
2 years ago |
Georgi Gerganov
|
f1eaff4721
Add AVX2 support for x86 architectures thanks to @Const-me !
|
2 years ago |