Commit history

Author SHA1 Message Date
  Georgi Gerganov 0fd6c1f015 embedding : print cosine similarity (#899) 1 year ago
  slaren f30ea47a87 llama : add pipeline parallelism support (#6017) 1 year ago
  SeungWon Jeong fb215c3832 server : normalize embeddings (#5956) 1 year ago
  Minsoo Cheong 6d341ab6c5 speculative : implement stochastic speculative sampling (#5625) 1 year ago
  DAN™ 82f3e668ad common : use LLAMA_DEFAULT_SEED (#5855) 1 year ago
  Douglas Hanley 475df1d6cf llama : allow for user specified embedding pooling type (#5849) 1 year ago
  Pierrick Hymbert 3ab8b3a92e llama : cleanup unused mmq flags (#5772) 1 year ago
  Georgi Gerganov 9d533a77d0 llama : fix defrag bugs + add parameter (#5735) 1 year ago
  Georgi Gerganov ab336a9d5e code : normalize enum names (#5697) 1 year ago
  Alexey Parfenov 6dcc02d244 server : add "samplers" param to control the samplers order (#5494) 1 year ago
  bmwl f486f6e1e5 ggml : add numa options (#5377) 1 year ago
  Alexey Parfenov a803333a4e common : use enums for sampler types (#5418) 1 year ago
  Jared Van Bortel 1ec3332ade YaRN : store rope scaling type as int32_t in memory (#5285) 2 years ago
  Georgi Gerganov 5cb04dbc16 llama : remove LLAMA_MAX_DEVICES and LLAMA_SUPPORTS_GPU_OFFLOAD (#5240) 2 years ago
  Kawrakow 6f9939d119 KL-divergence (#5076) 2 years ago
  Kawrakow 7dcbe39d36 Add ability to evauate multiple choice tasks (#5047) 2 years ago
  Kawrakow 682986a08e Add Winogrande evaluation (#5015) 2 years ago
  stduhpf e0324285a5 speculative : threading options (#4959) 2 years ago
  Yann Follet 722d33f34e main : add parameter --no-display-prompt (#4541) 2 years ago
  slaren e7e4df031b llama : ggml-backend integration (#4766) 2 years ago
  Georgi Gerganov 7edefbd79c main : better name for variable n_print (#4874) 2 years ago
  Georgi Gerganov 3ca63b4538 main : disable token count by default (#4874) 2 years ago
  pudepiedj 43f76bf1c3 main : print total token count and tokens consumed so far (#4874) 2 years ago
  Georgi Gerganov 52531fdff8 main : add self-extend support (#4815) 2 years ago
  LeonEricsson 7082d24cec lookup : add prompt lookup decoding example (#4484) 2 years ago
  Georgi Gerganov bcc0eb4591 llama : per-layer KV cache + quantum K cache (#4309) 2 years ago
  Kerfuffle 5aa365d88f llama : allow overriding GGUF metadata when loading model (#4092) 2 years ago
  MaggotHATE 52c8bc3cf3 sampling : custom samplers order (#4285) 2 years ago
  Georgi Gerganov 6b0a7420d0 llama : KV cache view API + better KV cache management (#4170) 2 years ago
  Seb C 881800d1f0 main : Add ChatML functionality to main example (#4046) 2 years ago