Commit History

Author | SHA1 | Message | Date
Georgi Gerganov | 6f0dbf6ab0 | infill : assert prefix/suffix tokens + remove old space logic (#8351) | 1 year ago
Kevin Wang | ffd00797d8 | common : avoid unnecessary logits fetch (#8358) | 1 year ago
toyer | 04ce3a8b19 | readme : add supported glm models (#8360) | 1 year ago
compilade | 3fd62a6b1c | py : type-check all Python scripts with Pyright (#8341) | 1 year ago
Denis Spasyuk | a8db2a9ce6 | Update llama-cli documentation (#8315) | 1 year ago
Alex Tuddenham | 4090ea5501 | ci : add checks for cmake,make and ctest in ci/run.sh (#8200) | 1 year ago
Andy Tai | f1948f1e10 | readme : update bindings list (#8222) | 1 year ago
Brian | f7cab35ef9 | gguf-hash: model wide and per tensor hashing using xxhash and sha1 (#8048) | 1 year ago
toyer | 905942abdb | llama : support glm3 and glm4 (#8031) | 1 year ago
Georgi Gerganov | b5040086d4 | llama : fix n_rot default (#8348) | 1 year ago
compilade | d39130a398 | py : use cpu-only torch in requirements.txt (#8335) | 1 year ago
standby24x7 | b81ba1f96b | finetune: Rename command name in README.md (#8343) | 1 year ago
standby24x7 | 210eb9ed0a | finetune: Rename an old command name in finetune.sh (#8344) | 1 year ago
Bjarke Viksøe | cb4d86c4d7 | server: Retrieve prompt template in /props (#8337) | 1 year ago
Derrick T. Woolworth | 86e7299ef5 | added support for Authorization Bearer tokens when downloading model (#8307) | 1 year ago
Xuan Son Nguyen | 60d83a0149 | update main readme (#8333) | 1 year ago
Daniel Bevenius | 87e25a1d1b | llama : add early return for empty range (#8327) | 1 year ago
jaime-m-p | 213701b51a | Detokenizer fixes (#8039) | 1 year ago
Xuan Son Nguyen | be20e7f49d | Reorganize documentation pages (#8325) | 1 year ago
Georgi Gerganov | 7ed03b8974 | llama : fix compile warning (#8304) | 1 year ago
Natsu | 1d894a790e | cmake : add GGML_BUILD and GGML_SHARED macro definitions (#8281) | 1 year ago
Ouadie EL FAROUKI | 1f3e1b66e2 | Enabled more data types for oneMKL gemm_batch (#8236) | 1 year ago
Georgi Gerganov | 148ec970b6 | convert : remove AWQ remnants (#8320) | 1 year ago
Georgi Gerganov | 2cccbaa008 | llama : minor indentation during tensor loading (#8304) | 1 year ago
Johannes Gäßler | 8e558309dc | CUDA: MMQ support for iq4_nl, iq4_xs (#8278) | 1 year ago
Daniele | 0a423800ff | CUDA: revert part of the RDNA1 optimizations (#8309) | 1 year ago
Douglas Hanley | d12f781074 | llama : streamline embeddings from "non-embedding" models (#8087) | 1 year ago
Johannes Gäßler | bcefa03bc0 | CUDA: fix MMQ stream-k rounding if ne00 % 128 != 0 (#8311) | 1 year ago
Pieter Ouwerkerk | 5a7447c569 | readme : fix minor typos [no ci] (#8314) | 1 year ago
Daniel Bevenius | 61ecafa390 | passkey : add short intro to README.md [no-ci] (#8317) | 1 year ago