Commit history

Author SHA1 Message Date
  Pierrick Hymbert f482bb2e49 common: llama_load_model_from_url split support (#6192) 1 year ago
  Pierrick Hymbert 1997577d5e server: docs: `--threads` and `--threads`, `--ubatch-size`, `--log-disable` (#6254) 1 year ago
  Julius Arkenberg 476b0251b2 llama : add grok-1 support (#6204) 1 year ago
  Pierrick Hymbert 21cad01b6e split: add gguf-split in the make build target (#6262) 1 year ago
  Pierrick Hymbert 1b26aebe4d server: flush stdout after logging in both text and json layout (#6253) 1 year ago
  Johannes Gäßler 50ccaf5eac lookup: complement data from context with general text statistics (#5479) 1 year ago
  Georgi Gerganov 56a00f0a2f common : default --hf-file to --model (#6234) 1 year ago
  fraxy-v 92397d87a4 convert-llama2c-to-ggml : enable conversion of GQA models (#6237) 1 year ago
  Kawrakow 1d0331c12a quantize: options for output and token embedding tensors qtype (#6239) 1 year ago
  Pierrick Hymbert dba1af6129 llama_model_loader: support multiple split/shard GGUFs (#6187) 1 year ago
  Minsoo Cheong ee804f6223 ci: apply concurrency limit for github workflows (#6243) 1 year ago
  Georgi Gerganov 80bd33bc2c common : add HF arg helpers (#6234) 1 year ago
  Nexesenex e80f06d2a1 llama : correction of the attn.v.weight quantization for IQ3_XS (#6209) 1 year ago
  Olivier Chafik f77a8ffd3b tests : conditional python & node json schema tests (#6207) 1 year ago
  Olivier Chafik 72114edf06 json-schema-to-grammar : fix order of props + non-str const/enum (#6232) 1 year ago
  slaren 2f0e81e053 cuda : add LLAMA_CUDA_NO_PEER_COPY to workaround broken ROCm p2p copy (#6208) 1 year ago
  Xiaoyi Chen 29ab270e65 readme : add RecurseChat to the list of UIs (#6219) 1 year ago
  Jan Boon 6b8bb3a31d server : fix n_keep always showing as 0 in response (#6211) 1 year ago
  Georgi Gerganov 68e210b354 server : enable continuous batching by default (#6231) 1 year ago
  Georgi Gerganov b3e94f26ba metal : proper assert for mat-mat memory alignment (#6225) 1 year ago
  Vaibhav Srivastav b2075fd6a5 ci : add CURL flag for the mac builds (#6214) 1 year ago
  Georgi Gerganov 95d576b48e metal : pad n_ctx by 32 (#6177) 1 year ago
  Neo Zhang Jianyu 59c17f02de add blog link (#6222) 1 year ago
  DAN™ fa046eafbc Fix params underscore convert to dash. (#6203) 1 year ago
  Jan Boon be07a03217 server : update readme doc from `slot_id` to `id_slot` (#6213) 1 year ago
  slaren d0a71233fb cuda : disable host register by default (#6206) 1 year ago
  semidark f372c49ccd Corrected typo to wrong file (#6199) 1 year ago
  Georgi Gerganov 924ce1dce7 tests : disable system() calls (#6198) 1 year ago
  slaren 03a8f8fafe cuda : fix LLAMA_CUDA_F16 build (#6197) 1 year ago
  Kawrakow cfd3be76e3 ggml : same IQ4_NL quantization for CPU/CUDA/Metal (#6196) 1 year ago