Commit history

Author SHA1 Message Date
  hipudding 1bdd8ae19f [CANN] Add Ascend NPU backend (#6035) 1 year ago
  Radoslav Gerganov e65bbf606c llama-bench : fix RPC indication (#7936) 1 year ago
  slaren f578b86b21 move BLAS to a separate backend (#6210) 1 year ago
  Johannes Gäßler 148995e5e5 llama-bench: more compact markdown tables (#7879) 1 year ago
  Georgi Gerganov 1442677f92 common : refactor cli arg parsing (#7675) 1 year ago
  Georgi Gerganov 554c247caf ggml : remove OpenCL (#7735) 1 year ago
  slaren adc9ff3841 llama-bench : allow using a different printer for stderr with -oe (#7722) 1 year ago
  Radoslav Gerganov 210d99173d llama-bench : add support for the RPC backend (#7435) 1 year ago
  Georgi Gerganov 6ff13987ad common : normalize naming style (#7462) 1 year ago
  slaren b18532a4ef phi3 : duplicate rope factors in each layer (#7447) 1 year ago
  slaren e849648888 llama-bench : add pp+tg test type (#7199) 1 year ago
  kunnis 628b299106 Adding support for the --numa argument for llama-bench. (#7080) 1 year ago
  Georgi Gerganov 9c67c2773d ggml : add Flash Attention (#5021) 1 year ago
  Justine Tunney 8cc91dc63c ggml : add llamafile sgemm (#6414) 1 year ago
  slaren 280345968d cuda : rename build flag to LLAMA_CUDA (#6299) 1 year ago
  Kawrakow 76aa30a263 Add ability to use Q5_0, Q5_1, and IQ4_NL for quantized K cache (#6183) 1 year ago
  slaren 2bf8d0f7c4 backend : offload large batches to GPU (#6083) 1 year ago
  slaren b0bc9f4a9d llama-bench : use random tokens to improve accuracy with mixtral (#6069) 1 year ago
  Steve Grubb 6e0438da3c gguf : fix resource leaks (#6061) 1 year ago
  slaren f30ea47a87 llama : add pipeline parallelism support (#6017) 1 year ago
  Georgi Gerganov 6cdabe6526 llama-bench : add embeddings option (#5924) 1 year ago
  Neo Zhang Jianyu 715641391d Support multiple GPUs (split mode) on SYCL backend (#5806) 1 year ago
  Pierrick Hymbert 3ab8b3a92e llama : cleanup unused mmq flags (#5772) 1 year ago
  Georgi Gerganov ab336a9d5e code : normalize enum names (#5697) 1 year ago
  bmwl f486f6e1e5 ggml : add numa options (#5377) 1 year ago
  Michael Klimenko 52bb63c708 refactor : switch to emplace_back to avoid extra object (#5291) 1 year ago
  Neo Zhang Jianyu 128dcbd3c9 add --no-mmap in llama-bench (#5257) 1 year ago
  Georgi Gerganov 5cb04dbc16 llama : remove LLAMA_MAX_DEVICES and LLAMA_SUPPORTS_GPU_OFFLOAD (#5240) 1 year ago
  Jared Van Bortel e8dc55d006 kompute : llama-bench support and ggml_cpu_has_kompute() (#5226) 1 year ago
  0cc4m 2307523d32 ggml : add Vulkan backend (#2059) 2 years ago