Commit history

Author SHA1 Message Date
  Aidan eeee367de5 server: fix correct time_ms calculation in prompt_progress (#17093) 2 months ago
  Aman Gupta 64fe17fbb8 Revert "CUDA: add expert reduce kernel (#16857)" (#17100) 2 months ago
  Aman Gupta c1b187688d CUDA: skip fusion for repeating adds in bias (#17080) 2 months ago
  SavicStefan b8a5cfd11a vulkan: Increase BK to 32; use BK/4 for non-CM mul_mm.comp (#16636) 2 months ago
  Aleksei Nikiforov 08416ebe7f ggml: disable vxe for cross-compilation by default (#16966) 2 months ago
  Jeff Bolz b4e335d8dc vulkan: fuse rms_norm + mul + rope (+ view + set_rows) (#16977) 2 months ago
  Jeff Bolz d6fe40fa00 vulkan: Fix test-thread-safety crashes (#17024) 2 months ago
  Johannes Gäßler e14e842e87 CUDA: fix MMQ stream-k fixup ne1 indices (#17089) 2 months ago
  Reese Levine 647b960bd8 ggml webgpu: faster matrix multiplication/matrix-vector multiplication (#17031) 2 months ago
  bssrdf 299f5d782c CUDA: properly handle nb00=nb02 case for cpy (#17081) 2 months ago
  Acly ac76d36201 vulkan : refactor buffer handling in vk_op_f32 (#16840) 2 months ago
  Johannes Gäßler 6515610506 CUDA: fix should_use_mmvf for ne11 == 1 (#17085) 2 months ago
  Georgi Gerganov 7956bb4d7f bench : cache the llama_context state at computed depth (#16944) 2 months ago
  Sigbjørn Skjæret 9008027aa3 hparams : add n_embd_inp() to support extended embed (#16928) 2 months ago
  Georgi Gerganov 16bcc1259d kv-cache : pad the cache size to 256 for performance (#17046) 2 months ago
  Adrien Gallouët 9eb9a1331d Revert "ggml-cpu: detect correct cpu flags for arm64 (#16229) (#16239)" (#17084) 2 months ago
  iron 7c23f3f0d4 ggml-cpu: detect correct cpu flags for arm64 (#16229) (#16239) 2 months ago
  Georgi Gerganov 8c0d6bb455 server : print the samplers chain for each request (#17070) 2 months ago
  Xuan-Son Nguyen 5c9a18e674 common: move download functions to download.(cpp|h) (#17059) 2 months ago
  xctan 7f09a680af ggml-cpu : optimize RVV q2_k and q3_k kernels (#16887) 2 months ago
  Johannes Gäßler aa374175c3 CUDA: fix crash on uneven context without FA (#16988) 2 months ago
  Georgi Gerganov 5b180c3d60 metal : initial Metal4 tensor API support (#16634) 2 months ago
  Georgi Gerganov b7f9010d24 server : disable checkpoints with mtmd (#17045) 2 months ago
  Xuan-Son Nguyen 4882f0ff78 clip: implement minicpm-v sinusoidal embd using GGML (#17036) 2 months ago
  YehuditE 9d7c518d64 sycl: add CONCAT operator support (#16047) 2 months ago
  Johannes Gäßler 22c8c3c6ad docs: explain CUDA 11 compilation [no ci] (#16824) 2 months ago
  l3utterfly 6db3d1ffe6 ggml-hexagon: graceful fallback for older socs where rpcmem_alloc2 and FASTRPC_GET_URI is unsupported (#16987) 2 months ago
  bssrdf 230d1169e5 improve CUDA cpy memory bandwidth when copying transposed tensor (#16841) 2 months ago
  Jeff Bolz a44d77126c vulkan: Fix GGML_VULKAN_CHECK_RESULTS to better handle fusion (#16919) 2 months ago
  Gabe Goodhart 5886f4f545 examples(gguf): GGUF example outputs (#17025) 2 months ago