Commit History

| Author | SHA1 | Message | Date |
| --- | --- | --- | --- |
| Kawrakow | bd2d4e393b | 1.5 bit quantization (#5453) | 1 year ago |
| Kawrakow | 895407f31b | ggml-quants : fix compiler warnings (shadow variable) (#5472) | 1 year ago |
| Georgi Gerganov | 0f2411f154 | ggml : fix compile warnings (unused vars) (#4966) | 1 year ago |
| snadampal | a07d0fee1f | ggml : add mmla kernels for quantized GEMM (#4966) | 1 year ago |
| Michael Podvitskiy | b2f87cb64d | ggml : fix `error C2078: too many initializers` for MSVC ARM64 (#5404) | 2 years ago |
| Kawrakow | f57fadc009 | Slight quantization improvement for Q4_K and Q5_K (#5361) | 2 years ago |
| Kawrakow | 6fdfa2ecc6 | iq2_xxs: tune quantization (#5320) | 2 years ago |
| Kawrakow | 8e14e3ddb3 | Faster AVX2 dot product for IQ2_XS (#5187) | 2 years ago |
| Kawrakow | f4d7e54974 | SOTA 3-bit quants (#5196) | 2 years ago |
| Georgi Gerganov | 38566680cd | ggml : add IQ2 to test-backend-ops + refactoring (#4990) | 2 years ago |
| Kawrakow | 334a835a1c | ggml : importance matrix support for legacy quants (#4969) | 2 years ago |
| Kawrakow | 467a882fd2 | Add ability to use importance matrix for all k-quants (#4930) | 2 years ago |
| Kawrakow | 147b17ac94 | 2-bit quantizations (#4897) | 2 years ago |
| Georgi Gerganov | f238461236 | ggml : fix 32-bit ARM compat for IQ2_XS (whisper/1758) | 2 years ago |
| Kawrakow | 49662cbed3 | ggml : SOTA 2-bit quants (add IQ2_XS) (#4856) | 2 years ago |
| Georgi Gerganov | 18c2e1752c | ggml : fix vld1q_s8_x4 32-bit compat (#4828) | 2 years ago |
| Kawrakow | dd5ae06405 | SOTA 2-bit quants (#4773) | 2 years ago |
| Georgi Gerganov | e39106c055 | ggml : add ggml_vdotq_s32 alias (#4715) | 2 years ago |
| Georgi Gerganov | 951010fa53 | ggml : fix dot product for ARM (#4630) | 2 years ago |
| FantasyGmm | a55876955b | cuda : fix jetson compile error (#4560) | 2 years ago |
| Richard Kiss | 9494d7c477 | english : use `typos` to fix comments and logs (#4354) | 2 years ago |
| Roger Meier | 8e9361089d | build : support ppc64le build for make and CMake (#3963) | 2 years ago |
| Michael Potter | 6bb4908a17 | Fix MacOS Sonoma model quantization (#4052) | 2 years ago |
| Georgi Gerganov | 3d68f364f1 | ggml : sync (im2col, GPU conv, 32-bit arm compat) (#4060) | 2 years ago |
| Georgi Gerganov | 9a3b4f6c86 | ggml : fix UNUSED macro (#3762) | 2 years ago |
| Andrew Godfrey | 73bdcb395e | finetune : add -ngl parameter (#3762) | 2 years ago |
| Georgi Gerganov | 207b51900e | ggml : move FP16 <-> FP32 code to ggml-impl.h (#3861) | 2 years ago |
| Georgi Gerganov | d69d777c02 | ggml : quantization refactoring (#3833) | 2 years ago |