Commit History

Author           SHA1        Message                                                            Date
Johannes Gäßler  76d66ee0be  CUDA: faster q2_K, q3_K MMQ + int8 tensor cores (#7921)            1 year ago
slaren           08a0c02060  ggml : mul_mat_id use the same tensor for all the experts (#6387)  1 year ago
slaren           ae1f211ce2  cuda : refactor into multiple files (#6269)                        1 year ago