Commit History

Author SHA1 Message Date
  Pavol Rusnak 6f79699286 build: add armv{6,7,8} support to cmake (#1251) 2 years ago
  Stephan Walter f0d70f147d Various fixes to mat_mul benchmark (#1253) 2 years ago
  Georgi Gerganov 214b6a3570 ggml : adjust mul_mat_f16 work memory (#1226) 2 years ago
  Georgi Gerganov 305eb5afd5 build : fix reference to old llama_util.h 2 years ago
  slaren 7fc50c051a cuBLAS: use host pinned memory and dequantize while copying (#1207) 2 years ago
  0cc4m 7296c961d9 ggml : add CLBlast support (#1164) 2 years ago
  Johannes Gäßler 92a6e13a31 Add Manjaro CUDA include and lib dirs to Makefile (#1212) 2 years ago
  slaren e4cf982e0d Fix cuda compilation (#1128) 2 years ago
  Georgi Gerganov e4422e299c ggml : better PERF prints + support "LLAMA_PERF=1 make" 2 years ago
  Georgi Gerganov 872c365a91 ggml : fix AVX build + update to new Q8_0 format 2 years ago
  slaren 50cb666b8a Improve cuBLAS performance by using a memory pool (#1094) 2 years ago
  slaren 2005469ea1 Add Q4_3 support to cuBLAS (#1086) 2 years ago
  源文雨 5addcb120c fix: LLAMA_CUBLAS=1 undefined reference 'shm_open' (#1080) 2 years ago
  slaren 02d6988121 Improve cuBLAS performance by dequantizing on the GPU (#1065) 2 years ago
  Stephan Walter f3d4edf504 ggml : Q4 cleanup - remove 4-bit dot product code (#1061) 2 years ago
  slaren 8944a13296 Add NVIDIA cuBLAS support (#1044) 2 years ago
  Kawrakow 5ecff35151 Adding a simple program to measure speed of dot products (#1041) 2 years ago
  Georgi Gerganov e95b6554b4 ggml : add Q8_0 quantization for intermediate results (#951) 2 years ago
  Stephan Walter 93265e988a make : fix dependencies, use auto variables (#983) 2 years ago
  Georgi Gerganov 9190e8eac8 llama : merge llama_internal.h into llama.h 2 years ago
  CRD716 8cda5c981d fix whitespace (#944) 2 years ago
  SebastianApel 95ea26f6e9 benchmark : add tool for timing q4_0 matrix multiplication (#653) 2 years ago
  comex f963b63afa Rewrite loading code to try to satisfy everyone: 2 years ago
  unbounded 62cfc54f77 Add quantize-stats command for testing quantization (#728) 2 years ago
  bhubbb 698f7b5d63 make : add libllama.so target for llama-cpp-python (#797) 2 years ago
  Ivan Stepanov 0c44427df1 make : missing host optimizations in CXXFLAGS (#763) 2 years ago
  Fabian c4f89d8d73 make : use -march=native -mtune=native on x86 (#609) 2 years ago
  david raistrick 1f0414feec make : fix darwin f16c flags check (#615) 2 years ago
  Stephan Walter 436e561931 all : be more strict about converting float to double (#458) 2 years ago
  RJ Adriaansen 4b8efff0e3 Add embedding example to Makefile (#540) 2 years ago