Commit history

Author SHA1 Message Date
  Xuan Son Nguyen cda0e4b648 llama : remove all_pos_0, all_pos_1, all_seq_id from llama_batch (#9745) 1 year ago
  Ouadie EL FAROUKI 87421a23e8 [SYCL] Add SYCL Backend registry, device and Event Interfaces (#9705) 1 year ago
  Diego Devesa 0e9f760eb1 rpc : add backend registry / device interfaces (#9812) 1 year ago
  Michael Podvitskiy 7be099fa81 llama-bench: correct argument parsing error message (#9524) 1 year ago
  Georgi Gerganov 0abc6a2c25 llama : llama_perf + option to disable timings during decode (#9355) 1 year ago
  Georgi Gerganov df270ef745 llama : refactor sampling v2 (#9294) 1 year ago
  Aarni Koskela 134bc38ecf llama-bench : log benchmark progress (#9287) 1 year ago
  slaren bdf314f38a llama-bench : fix NUL terminators in CPU name (#9313) 1 year ago
  Radoslav Gerganov 82e3b03c11 rpc : make RPC servers come first in the device list (#9296) 1 year ago
  Aarni Koskela 8962422b1c llama-bench : add JSONL (NDJSON) output mode (#9288) 1 year ago
  Faisal Zaghloul 42c76d1358 Threadpool: take 2 (#8672) 1 year ago
  Zhenwei Jin 506122d854 llama-bench : add support for getting cpu info on Windows (#8824) 1 year ago
  slaren 2b1f616b20 ggml : reduce hash table reset cost (#8698) 1 year ago
  hipudding 1bdd8ae19f [CANN] Add Ascend NPU backend (#6035) 1 year ago
  Radoslav Gerganov e65bbf606c llama-bench : fix RPC indication (#7936) 1 year ago
  slaren f578b86b21 move BLAS to a separate backend (#6210) 1 year ago
  Johannes Gäßler 148995e5e5 llama-bench: more compact markdown tables (#7879) 1 year ago
  Georgi Gerganov 1442677f92 common : refactor cli arg parsing (#7675) 1 year ago
  Georgi Gerganov 554c247caf ggml : remove OpenCL (#7735) 1 year ago
  slaren adc9ff3841 llama-bench : allow using a different printer for stderr with -oe (#7722) 1 year ago
  Radoslav Gerganov 210d99173d llama-bench : add support for the RPC backend (#7435) 1 year ago
  Georgi Gerganov 6ff13987ad common : normalize naming style (#7462) 1 year ago
  slaren b18532a4ef phi3 : duplicate rope factors in each layer (#7447) 1 year ago
  slaren e849648888 llama-bench : add pp+tg test type (#7199) 1 year ago
  kunnis 628b299106 Adding support for the --numa argument for llama-bench. (#7080) 1 year ago
  Georgi Gerganov 9c67c2773d ggml : add Flash Attention (#5021) 1 year ago
  Justine Tunney 8cc91dc63c ggml : add llamafile sgemm (#6414) 1 year ago
  slaren 280345968d cuda : rename build flag to LLAMA_CUDA (#6299) 1 year ago
  Kawrakow 76aa30a263 Add ability to use Q5_0, Q5_1, and IQ4_NL for quantized K cache (#6183) 1 year ago
  slaren 2bf8d0f7c4 backend : offload large batches to GPU (#6083) 1 year ago