Commit History

| Author | SHA1 | Message | Date |
|---|---|---|---|
| Johannes Gäßler | e11bd856d5 | CPU/CUDA: Gemma 2 FlashAttention support (#8542) | 1 year ago |
| compilade | a1631e53f6 | llama : simplify Mamba with advanced batch splits (#8526) | 1 year ago |
| Daniel Bevenius | 06943a69f6 | ggml : move rope type enum to ggml.h (#8949) | 1 year ago |
| Molly Sophia | 2d5dd7bb3f | ggml : add epsilon as a parameter for group_norm (#8818) | 1 year ago |
| Daniel Bevenius | 655858ace0 | ggml : move c parameter comment to ggml_rope_ext (ggml/901) | 1 year ago |
| Sigbjørn Skjæret | b72c20b85c | Fix conversion of unnormalized BF16->BF16 weights (#7843) | 1 year ago |
| slaren | 2b1f616b20 | ggml : reduce hash table reset cost (#8698) | 1 year ago |
| Georgi Gerganov | eddcb5238b | ggml : add and use ggml_cpu_has_llamafile() (#8664) | 1 year ago |
| hipudding | 1bdd8ae19f | [CANN] Add Ascend NPU backend (#6035) | 1 year ago |
| Georgi Gerganov | 370b1f7e7a | ggml : minor naming changes (#8433) | 1 year ago |
| Dibakar Gope | 0f1a39f343 | ggml : add AArch64 optimized GEMV and GEMM Q4 kernels (#5780) | 1 year ago |
| Georgi Gerganov | f3f65429c4 | llama : reorganize source code + improve CMake (#8006) | 1 year ago |