Revision history

Author SHA1 Message Date
  Georgi Gerganov 540938f890 llama : llama_model_desc print number of experts 2 years ago
  Marcus Dunn 0040d42eeb llama : replace all API facing `int`'s with `int32_t` (#4577) 2 years ago
  postmasters 83e633c27e llama : differentiate the KV dims in the attention (#4657) 2 years ago
  automaticcat 24a447e20a ggml : add ggml_cpu_has_avx_vnni() (#4589) 2 years ago
  manikbhandari ea5497df5d gpt2 : Add gpt2 architecture integration (#4555) 2 years ago
  Nam D. Tran f6793491b5 llama : add AWQ for llama, llama2, mpt, and mistral models (#4593) 2 years ago
  slaren dc68f0054c cuda : fix vmm pool with multi GPU (#4620) 2 years ago
  Shintarou Okada 753be377b6 llama : add PLaMo model (#3557) 2 years ago
  slaren 5bf3953d7e cuda : improve cuda pool efficiency using virtual memory (#4606) 2 years ago
  slaren 708e179e85 fallback to CPU buffer if host buffer alloc fails (#4610) 2 years ago
  slaren 48b7ff193e llama : fix platforms without mmap (#4578) 2 years ago
  crasm c7e9701f86 llama : add ability to cancel model loading (#4462) 2 years ago
  Georgi Gerganov afefa319f1 ggml : change ggml_scale to take a float instead of tensor (#4573) 2 years ago
  slaren d232aca5a7 llama : initial ggml-backend integration (#4520) 2 years ago
  Marcus Dunn 31f27758fa llama : allow getting n_batch from llama_context in c api (#4540) 2 years ago
  Johannes Gäßler d3223afdad llama : disable per-tensor info prints on model load (#4562) 2 years ago
  Ebey Abraham b9e74f9bca llama : add phi-2 + fix NeoX rope + ggml_mul_mat_set_prec (#4490) 2 years ago
  hankcs 3c04bf6da8 llama : fix try_override for bool_value which always return true (#4519) 2 years ago
  Jared Van Bortel 2994f0c5a2 decode : fix logits_valid for legacy API (#4516) 2 years ago
  Georgi Gerganov 800a489e4a llama.swiftui : add bench functionality (#4483) 2 years ago
  slaren c6c4fc081c lora : add support for non-llama models (#3333) 2 years ago
  Jared Van Bortel 8a5be3bd58 llama : sanity checks for access to logits (#4274) 2 years ago
  slaren cafcd4f895 ggml : remove n_dims from ggml_tensor (#4469) 2 years ago
  LostRuins 20a68a7030 ggml : add ggml_row_size() (fixes llama out of space) (#4461) 2 years ago
  slaren 799a1cb13b llama : add Mixtral support (#4406) 2 years ago
  Richard Kiss 9494d7c477 english : use `typos` to fix comments and logs (#4354) 2 years ago
  Xiang (Kevin) Li e18f7345a3 grammar : revert the replacement of llama_token_to_piece with id_to_token (#4396) 2 years ago
  Georgi Gerganov bcc0eb4591 llama : per-layer KV cache + quantum K cache (#4309) 2 years ago
  Marcus Dunn 5f6e0c0dff grammar : pre-computed pieces + reserve mem + less string copies (#4330) 2 years ago
  Kerfuffle 5aa365d88f llama : allow overriding GGUF metadata when loading model (#4092) 2 years ago