Commit History

Author | SHA1 | Message | Date
Kunshang Ji | 7f412dab9c | enable CPU HBM (#2603) | 2 years ago
Cebtenzzre | 00d62adb79 | fix some warnings from gcc and clang-tidy (#3038) | 2 years ago
Przemysław Pawełczyk | fec2fb19e4 | ggml : posixify madvise and pagesize (#3037) | 2 years ago
Georgi Gerganov | 35938ee3b0 | llama : update logic for number of threads when using BLAS | 2 years ago
Georgi Gerganov | 921772104b | speculative : add grammar support (#2991) | 2 years ago
Georgi Gerganov | e36ecdccc8 | build : on Mac OS enable Metal by default (#2901) | 2 years ago
opparco | 3730134776 | llama : fix bpe tokenize from byte (#2889) | 2 years ago
momonga | c42f0ec6b3 | examples : fix gpt-neox (#2943) | 2 years ago
Kerfuffle | 5d6f19f16b | Allow quantize to only copy tensors, some other improvements (#2931) | 2 years ago
m3ndax | ee8654bcd0 | minor : add const qualifiers (#2853) | 2 years ago
Cebtenzzre | ef15649972 | build : fix most gcc and clang warnings (#2861) | 2 years ago
DannyDaemonic | e8422de39e | @vxiiduu's fix for PrefetchVirtualMemory (#2930) | 2 years ago
Johannes Gäßler | 8afe228000 | CUDA: mul_mat_q=true llama_context_params default (#2912) | 2 years ago
Kawrakow | e37e69dcc3 | 10X faster BPE tokenizer (#2876) | 2 years ago
xaedes | 44c117f41e | train : mem usage and other improvements (#2439) | 2 years ago
Johannes Gäßler | 6b73ef1201 | YAML result logging + preset script (#2657) | 2 years ago
grahameth | be475f60af | llama.cpp : fix wrong vsnprintf call in MS compiler (#2856) | 2 years ago
Georgi Gerganov | c10704d01e | llama : fix MPI threads (close #2827) | 2 years ago
Kawrakow | 463173a6c0 | llama : speedup tokenization (#2831) | 2 years ago
Georgi Gerganov | eaa13a48ff | falcon : fix CUDA inference by making K and Q contiguous (#2830) | 2 years ago
Kawrakow | a6d1189fdd | k_quants tuning for Falcon-7b (#2816) | 2 years ago
Georgi Gerganov | d0cee0d36d | gguf : add 64-bit support (GGUF v2) (#2821) | 2 years ago
Georgi Gerganov | edd4c14817 | llama : more tokenizer fixes (#2810) | 2 years ago
Przemysław Pawełczyk | 1591e2e590 | ggml : detect SSSE3 (#2825) | 2 years ago
Tim Miller | c7d92e6dfe | llama : use Unicode Escape Sequence to replace encoded characters (#2814) | 2 years ago
Cebtenzzre | 741ca7dd1c | llama : move #includes out of _GNU_SOURCE conditional (#2817) | 2 years ago
Cebtenzzre | 50526f37eb | llama : use std::abs in llama_sample_tail_free (#2800) | 2 years ago
Georgi Gerganov | 04f4b1eb10 | k-quants : remove unnecessary tensor shape restrictions (#2811) | 2 years ago
Kawrakow | 7592375403 | Better perplexity for 2- and 3-bit quantization for LLaMA-v2-70B (#2807) | 2 years ago
klosax | 2ba83c8685 | Fix spm whitespaces (#2806) | 2 years ago