Commit History

| Author | SHA1 | Message | Date |
| --- | --- | --- | --- |
| Roland | 2d770505a8 | llama : remove mtest (#3177) | 2 years ago |
| Cebtenzzre | 98311c4277 | llama : make quantize example up to 2.7x faster (#3115) | 2 years ago |
| jneem | feea179e9f | flake : allow $out/include to already exist (#3175) | 2 years ago |
| Andrei | 769266a543 | cmake : compile ggml-rocm with -fpic when building shared library (#3158) | 2 years ago |
| Asbjørn Olling | cf8238e7f4 | flake : include llama.h in nix output (#3159) | 2 years ago |
| Cebtenzzre | 4b8560e72a | make : fix clang++ detection, move some definitions to CPPFLAGS (#3155) | 2 years ago |
| Alon | 83a53b753a | CI: add FreeBSD & simplify CUDA windows (#3053) | 2 years ago |
| akawrykow | 5c872dbca2 | falcon : use stated vocab size (#2914) | 2 years ago |
| bandoti | 990a5e226a | cmake : add relocatable Llama package (#2960) | 2 years ago |
| dylan | 980ab41afb | docker : add gpu image CI builds (#3103) | 2 years ago |
| Kerfuffle | e394084166 | gguf-py : support identity operation in TensorNameMap (#3095) | 2 years ago |
| jameswu2014 | 4c8643dd6e | feature : support Baichuan serial models (#3009) | 2 years ago |
| Leng Yue | 35f73049af | speculative : add heuristic algorithm (#3006) | 2 years ago |
| goerch | 71ca2fad7d | whisper : tokenizer fix + re-enable tokenizer test for LLaMa (#3096) | 2 years ago |
| Tristan Ross | 1b6c650d16 | cmake : add a compiler flag check for FP16 format (#3086) | 2 years ago |
| Johannes Gäßler | 0a5eebb45d | CUDA: mul_mat_q RDNA2 tunings (#2910) | 2 years ago |
| FK | 84e723653c | speculative: add --n-gpu-layers-draft option (#3063) | 2 years ago |
| Eric Sommerlade | b52b29ab9d | arm64 support for windows (#3007) | 2 years ago |
| Johannes Gäßler | 4f7cd6ba9c | CUDA: fix LoRAs (#3130) | 2 years ago |
| Johannes Gäßler | 89e89599fd | CUDA: fix mul_mat_q not used for output tensor (#3127) | 2 years ago |
| Johannes Gäßler | d54a4027a6 | CUDA: lower GPU latency + fix Windows performance (#3110) | 2 years ago |
| Jhen-Jie Hong | 1b0d09259e | cmake : support build for iOS/tvOS (#3116) | 2 years ago |
| Johannes Gäßler | 8a4ca9af56 | CUDA: add device number to error messages (#3112) | 2 years ago |
| Kawrakow | f31b6f4e2d | metal : PP speedup (#3084) | 2 years ago |
| Erik Scholz | 6eeb4d9083 | convert: remove most of the n_mult usage in convert.py (#3098) | 2 years ago |
| kchro3 | 21ac3a1503 | metal : support for Swift (#3078) | 2 years ago |
| Jhen-Jie Hong | 4fd5477955 | metal : support build for iOS/tvOS (#3089) | 2 years ago |
| takov751 | ec2a24fedf | flake : add train-text-from-scratch to flake.nix (#3042) | 2 years ago |
| Ikko Eltociear Ashimine | 7d99aca759 | readme : fix typo (#3043) | 2 years ago |
| Kawrakow | ba7ffbb251 | metal : Q3_K speedup (#2995) | 2 years ago |