Commit History

Author SHA1 Message Date
  Ruben Ortlam fec7911f8f vulkan: disable large mmv subgroups on older Nvidia GPUs (#15717) 4 months ago
  s-goto-11 078ce23ea7 ggml: SVE support for exponential functions (#15145) 4 months ago
  Prashant Vithule a0c2b207c5 ggml: aarch64: Implement SVE F16 kernels for vector functions (#15115) 4 months ago
  Jie Fu (傅杰) 4b20d8b7e3 convert : remove redundant code (#15708) 4 months ago
  Ruben Ortlam 02c1813517 Vulkan: Add Integer Dot Product mul_mat_vec shader for legacy quants (#14903) 4 months ago
  Daniel Bevenius 77dee9de97 ggml : WebGPU add TRANSPOSE and RESHAPE to supported ops (#15695) 4 months ago
  Jie Fu (傅杰) 4795c91c32 docs : add Hunyuan to models section (#15707) 4 months ago
  Akarshan Biswas b66df9d9c9 CUDA: fix build error from ambiguous __half conversions in conv2d (#15690) 4 months ago
  hipudding b9382c3877 CANN: Optimize MUL_MAT_ID (#15658) 4 months ago
  hipudding 3dc7397a27 CANN: fix RoPE cache issue on multi-device (#15629) 4 months ago
  Georgi Gerganov e92d53b29e sampling : optimize samplers by reusing bucket sort (#15665) 4 months ago
  Georgi Gerganov 0d161f021a server : enable /slots by default and make it secure (#15630) 4 months ago
  Georgi Gerganov 4efd5a8316 metal : fix checks for available FA kernels (#15700) 4 months ago
  Diego Devesa 274966226f llama : fix fattn reserve call n_seqs parameter (#15699) 4 months ago
  Diego Devesa 9777032dcc llama : separate compute buffer reserve from fattn check (#15696) 4 months ago
  Sigbjørn Skjæret 7d3c9f2b21 ci : explicitly set fa off or on (#15692) 4 months ago
  Jeff Bolz bbbf5ecccb vulkan: handle large sizes for get_rows (#15686) 4 months ago
  Jeff Bolz c37052ab4d vulkan: mul_mat_id coopmat2 optimizations (#15546) 4 months ago
  Daniel Bevenius 5c16b9c87d vulkan : remove unused portability_enumeration_ext variable (#15679) 4 months ago
  Jeff Bolz b97c9edc59 vulkan: Allow fallback to sysmem memory when vidmem is full (#15649) 4 months ago
  Jeff Bolz 94e82c7ead vulkan: clamp matmul and FA results to the max finite value (#15652) 4 months ago
  Charles Xu 4d74393bcc ggml: update kleidiai to v1.13.0 (#15663) 4 months ago
  Diego Devesa dd892555b0 Update build.md to remove MSVC arm64 notes (#15684) 4 months ago
  Johannes Gäßler e81b8e4b7f llama: use FA + max. GPU layers by default (#15434) 4 months ago
  Johannes Gäßler 38ad381f9f CUDA: use FP32 arithmetic for conv2d (#15683) 4 months ago
  Jeff Bolz 696fccf354 vulkan: Skip syncing for prealloc_y when it is reused (#15544) 4 months ago
  Chenguang Li ef476916bb CANN: FIx compiler warnings (#15661) 4 months ago
  Sergey Alirzaev d82f6aa34a server : removed obsolete doc (#15670) 4 months ago
  Johannes Gäßler 3d16b29c3b scripts: strip "AMD Instinct" from GPU name (#15668) 4 months ago
  ExtReMLapin 792b44f2ed server : add documentation for `parallel_tool_calls` param (#15647) 4 months ago