Commit History

Author SHA1 Message Date
  Jie Fu (傅杰) 4795c91c32 docs : add Hunyuan to models section (#15707) 4 months ago
  Akarshan Biswas b66df9d9c9 CUDA: fix build error from ambiguous __half conversions in conv2d (#15690) 4 months ago
  hipudding b9382c3877 CANN: Optimize MUL_MAT_ID (#15658) 4 months ago
  hipudding 3dc7397a27 CANN: fix RoPE cache issue on multi-device (#15629) 4 months ago
  Georgi Gerganov e92d53b29e sampling : optimize samplers by reusing bucket sort (#15665) 4 months ago
  Georgi Gerganov 0d161f021a server : enable /slots by default and make it secure (#15630) 4 months ago
  Georgi Gerganov 4efd5a8316 metal : fix checks for available FA kernels (#15700) 4 months ago
  Diego Devesa 274966226f llama : fix fattn reserve call n_seqs parameter (#15699) 4 months ago
  Diego Devesa 9777032dcc llama : separate compute buffer reserve from fattn check (#15696) 4 months ago
  Sigbjørn Skjæret 7d3c9f2b21 ci : explicitly set fa off or on (#15692) 4 months ago
  Jeff Bolz bbbf5ecccb vulkan: handle large sizes for get_rows (#15686) 4 months ago
  Jeff Bolz c37052ab4d vulkan: mul_mat_id coopmat2 optimizations (#15546) 4 months ago
  Daniel Bevenius 5c16b9c87d vulkan : remove unused portability_enumeration_ext variable (#15679) 4 months ago
  Jeff Bolz b97c9edc59 vulkan: Allow fallback to sysmem memory when vidmem is full (#15649) 4 months ago
  Jeff Bolz 94e82c7ead vulkan: clamp matmul and FA results to the max finite value (#15652) 4 months ago
  Charles Xu 4d74393bcc ggml: update kleidiai to v1.13.0 (#15663) 4 months ago
  Diego Devesa dd892555b0 Update build.md to remove MSVC arm64 notes (#15684) 4 months ago
  Johannes Gäßler e81b8e4b7f llama: use FA + max. GPU layers by default (#15434) 4 months ago
  Johannes Gäßler 38ad381f9f CUDA: use FP32 arithmetic for conv2d (#15683) 4 months ago
  Jeff Bolz 696fccf354 vulkan: Skip syncing for prealloc_y when it is reused (#15544) 4 months ago
  Chenguang Li ef476916bb CANN: FIx compiler warnings (#15661) 4 months ago
  Sergey Alirzaev d82f6aa34a server : removed obsolete doc (#15670) 4 months ago
  Johannes Gäßler 3d16b29c3b scripts: strip "AMD Instinct" from GPU name (#15668) 4 months ago
  ExtReMLapin 792b44f2ed server : add documentation for `parallel_tool_calls` param (#15647) 4 months ago
  Aman Gupta 81017865ee CUDA: fix bug in rms_norm fusion (#15660) 4 months ago
  Piotr Wilkin (ilintar) 60e5eee31f chat : Seed OSS thinking + tool call support (#15552) 4 months ago
  Aman Gupta 009b709d6e CUDA: fuse adds, fuse add with rms norm (#15631) 4 months ago
  Gabe Goodhart e8d99dd0b6 nvidia nemotron nano v2 (nemotronh) (#15507) 4 months ago
  Gabe Goodhart a8bca68f72 fix: Compute the full sum in llama-eval-callback, not just the sum of printed values (#15637) 4 months ago
  mnehete32 c97dc09391 CUDA: add conv2d (#15635) 5 months ago