Georgi Gerganov | 0c27e6f62e | ggml : fix loongson compile warnings (#7537) | 1 year ago
junchao-loongson | d5c05821f3 | ggml : fix loongarch build (O2 issue) (#7636) | 1 year ago
Masaya, Kato | faa0e6979a | ggml: aarch64: SVE kernels for q8_0_q8_0, q4_0_q8_0 vector dot (#7433) | 1 year ago
Georgi Gerganov | 1debe72737 | ggml : silence UB sanitizer error during iq2_xxs quantization (#0) | 1 year ago
Georgi Gerganov | e84b71c2c6 | ggml : drop support for QK_K=64 (#7473) | 1 year ago
junchao-loongson | 65c58207ec | ggml : add loongarch lsx and lasx support (#6454) | 1 year ago
slaren | e4e6f67be6 | ggml : fix another case of quants nans (#7387) | 1 year ago
slaren | 05834841dc | ggml : fix quants nans when all the group weights are very close to zero (#7313) | 1 year ago
Herman Semenov | 359cbe3f46 | ggml-quants, llama : removed excess checks (#7274) | 1 year ago
Max Krasnyansky | 13ad16af12 | Add support for properly optimized Windows ARM64 builds with LLVM and MSVC (#7191) | 1 year ago
Georgi Gerganov | c3c88f296a | ggml : try fix ppc64 (whisper/0) | 1 year ago
Hong Bo PENG | 0d26d8ccd8 | ggml : optimize for ppc64le using VSX intrinsics (ggml/784) | 1 year ago
Borislav Stanimirov | ef0d5e3ec9 | build: fix and ignore msvc warnings (ggml/805) | 1 year ago
Justine Tunney | 3855416027 | ggml : introduce bfloat16 support (#6412) | 1 year ago
slaren | 017e6999b5 | add basic tensor data validation function (#6884) | 1 year ago
Georgi Gerganov | 54770413c4 | ggml : fix MIN / MAX macros (#6904) | 1 year ago
Georgi Gerganov | c0d1b3e03e | ggml : move 32-bit arm compat in ggml-impl.h (#6865) | 1 year ago
Justine Tunney | 8cc91dc63c | ggml : add llamafile sgemm (#6414) | 1 year ago
Carolinabanana | 5dc9dd7152 | llama : add Command R Plus support (#6491) | 1 year ago
Kawrakow | cbc8343619 | Make IQ1_M work for QK_K = 64 (#6327) | 1 year ago
Kawrakow | 55c1b2a3bb | IQ1_M: 1.75 bpw quantization (#6302) | 1 year ago
Justine Tunney | 7733f0c760 | ggml : support AVX512VNNI (#6280) | 1 year ago
Kawrakow | cfd3be76e3 | ggml : same IQ4_NL quantization for CPU/CUDA/Metal (#6196) | 1 year ago
Georgi Gerganov | 8030da7afe | ggml : reuse quantum structs across backends (#5943) | 1 year ago
Georgi Gerganov | 184215e783 | ggml : fix UB in IQ2_S and IQ3_S (#6012) | 1 year ago
Kawrakow | 44ca159faf | 1.5 bit: we can do even better (#5999) | 1 year ago
Michael Podvitskiy | 3202361c5b | ggml, ci : Windows ARM runner and build fixes (#5979) | 1 year ago
Kawrakow | be858f6205 | Better 1.5 bit quantization (#5971) | 1 year ago
Georgi Gerganov | df4dc3e7cb | ggml : try fix 32-bit arm compat (whisper/1938) | 1 year ago
Georgi Gerganov | 8380ecfb21 | ggml : fix unnecessary f32 -> f16 -> f32 casts (mmla) (#5951) | 1 year ago