Commit History

| Author | SHA1 | Message | Date |
| --- | --- | --- | --- |
| Daniel Bevenius | 8c3fdf44ec | model-conversion : add missing curl script [no ci] (#15761) | 4 months ago |
| hipudding | f6da8cb86a | CANN: Mask unsupported TRANSPOSE_1D operator (#15733) | 4 months ago |
| Chenguang Li | 8a2234ea0c | CANN: Fix type float_t to float (#15736) | 4 months ago |
| SnA1lGo | 3de008208b | fix: resolve unsigned int initialization warning for n_dims/size in gguf.cpp (#15754) | 4 months ago |
| Oliver Simons | 69db8a52e6 | chore: Update `.clang-format` to use `BinPackArguments=true` (#15744) | 4 months ago |
| Johannes Gäßler | c466abe158 | llama: -fa 1/0/-1 aliases for -fa on/off/auto (#15746) | 4 months ago |
| Ruben Ortlam | 0a2a3841e8 | vulkan: fix shaders gen when no integer dot is available (#15740) | 4 months ago |
| hipudding | 9961d244f2 | CANN: Resolve soft_max precision issue (#15730) | 4 months ago |
| Jeff Bolz | 25f1045f07 | vulkan: Fix macro parameter order for f32 matmul shaders (#15716) | 4 months ago |
| rmatif | 97669e4073 | opencl: add attn sinks support for FA kernels (#15706) | 4 months ago |
| Chenguang Li | 2f853687b3 | CANN: Support eager execution mode under ACL graph compilation (#15712) | 4 months ago |
| hipudding | ef2af57ddf | CANN: Support ext_factor in rope (#15710) | 4 months ago |
| Johannes Gäßler | 5d804a4938 | ggml-backend: raise GGML_MAX_SPLIT_INPUTS (#15722) | 4 months ago |
| Gilad S. | d4d8dbe383 | vulkan: use memory budget extension to read memory usage (#15545) | 4 months ago |
| Jeff Bolz | 35a42edac8 | vulkan: add missing clamps in new mul_mat_id paths (#15702) | 4 months ago |
| Ruben Ortlam | fec7911f8f | vulkan: disable large mmv subgroups on older Nvidia GPUs (#15717) | 4 months ago |
| s-goto-11 | 078ce23ea7 | ggml: SVE support for exponential functions (#15145) | 4 months ago |
| Prashant Vithule | a0c2b207c5 | ggml: aarch64: Implement SVE F16 kernels for vector functions (#15115) | 4 months ago |
| Jie Fu (傅杰) | 4b20d8b7e3 | convert : remove redundant code (#15708) | 4 months ago |
| Ruben Ortlam | 02c1813517 | Vulkan: Add Integer Dot Product mul_mat_vec shader for legacy quants (#14903) | 4 months ago |
| Daniel Bevenius | 77dee9de97 | ggml : WebGPU add TRANSPOSE and RESHAPE to supported ops (#15695) | 4 months ago |
| Jie Fu (傅杰) | 4795c91c32 | docs : add Hunyuan to models section (#15707) | 4 months ago |
| Akarshan Biswas | b66df9d9c9 | CUDA: fix build error from ambiguous __half conversions in conv2d (#15690) | 4 months ago |
| hipudding | b9382c3877 | CANN: Optimize MUL_MAT_ID (#15658) | 4 months ago |
| hipudding | 3dc7397a27 | CANN: fix RoPE cache issue on multi-device (#15629) | 4 months ago |
| Georgi Gerganov | e92d53b29e | sampling : optimize samplers by reusing bucket sort (#15665) | 4 months ago |
| Georgi Gerganov | 0d161f021a | server : enable /slots by default and make it secure (#15630) | 4 months ago |
| Georgi Gerganov | 4efd5a8316 | metal : fix checks for available FA kernels (#15700) | 4 months ago |
| Diego Devesa | 274966226f | llama : fix fattn reserve call n_seqs parameter (#15699) | 4 months ago |
| Diego Devesa | 9777032dcc | llama : separate compute buffer reserve from fattn check (#15696) | 4 months ago |