Commit history

Author SHA1 Message Date
  Georgi Gerganov 6cdabe6526 llama-bench : add embeddings option (#5924) 1 year ago
  Neo Zhang Jianyu 715641391d Support multiple GPUs (split mode) on SYCL backend (#5806) 1 year ago
  Pierrick Hymbert 3ab8b3a92e llama : cleanup unused mmq flags (#5772) 1 year ago
  Georgi Gerganov ab336a9d5e code : normalize enum names (#5697) 1 year ago
  bmwl f486f6e1e5 ggml : add numa options (#5377) 1 year ago
  Michael Klimenko 52bb63c708 refactor : switch to emplace_back to avoid extra object (#5291) 1 year ago
  Neo Zhang Jianyu 128dcbd3c9 add --no-mmap in llama-bench (#5257) 2 years ago
  Georgi Gerganov 5cb04dbc16 llama : remove LLAMA_MAX_DEVICES and LLAMA_SUPPORTS_GPU_OFFLOAD (#5240) 2 years ago
  Jared Van Bortel e8dc55d006 kompute : llama-bench support and ggml_cpu_has_kompute() (#5226) 2 years ago
  0cc4m 2307523d32 ggml : add Vulkan backend (#2059) 2 years ago
  slaren e7e4df031b llama : ggml-backend integration (#4766) 2 years ago
  slaren 226460cc0d llama-bench : add no-kv-offload parameter (#4812) 2 years ago
  Georgi Gerganov bcc0eb4591 llama : per-layer KV cache + quantum K cache (#4309) 2 years ago
  cebtenzzre b12fa0d1c1 build : link against build info instead of compiling against it (#3879) 2 years ago
  Kerfuffle 6e08281e58 Extend llama_kv_cache_seq_rm to allow matching any sequence (#3843) 2 years ago
  Marcus Dunn 5be6c803fa llama : remove token functions with `context` args in favor of `model` (#3720) 2 years ago
  Cebtenzzre bc39553c90 build : enable more non-default compiler warnings (#3200) 2 years ago
  slaren 16bc66d947 llama.cpp : split llama_context_params into model and context params (#3301) 2 years ago
  Georgi Gerganov ec893798b7 llama : custom attention mask + parallel decoding + no context swaps (#3228) 2 years ago
  Rickard Hallerbäck dc6897404e metal : reusing llama.cpp logging (#3152) 2 years ago
  Georgi Gerganov 8c00b7a6ff sync : ggml (Metal F32 support + reduce ggml-alloc size) (#3192) 2 years ago
  slaren 15b67a66c2 llama-bench : use two tokens in the warmup run for prompt evals (#3059) 2 years ago
  Cebtenzzre de2fe892af examples : replace fprintf to stdout with printf (#3017) 2 years ago
  Cebtenzzre 3103568144 llama-bench : make cpp file non-executable (#2999) 2 years ago
  slaren 43033b7bb4 llama-bench : set locale to utf8 (#2832) 2 years ago
  slaren 154725c543 llama-bench : add model sizes (#2771) 2 years ago
  Henri Vasserman 6bbc598a63 ROCm Port (#1087) 2 years ago
  slaren 8e4364f2af llama-bench : minor fixes (#2695) 2 years ago
  Georgi Gerganov 6381d4e110 gguf : new file format with flexible meta data (beta) (#2398) 2 years ago
  slaren 097e121e2f llama : add benchmark example (#2626) 2 years ago