Commit History

Author SHA1 Message Date
  Georgi Gerganov 85a7d8677b memory : remove KV cache size padding (#16812) 2 months ago
  Georgi Gerganov a8ca18b4b8 llama-bench : clarify benchmarked parts of the computation (#16823) 2 months ago
  l3utterfly 8284efc35c initialise buffer.device in ggml_hexagon_session (#16816) 2 months ago
  Sam Malayek 1c1409e131 embedding: add raw option for --embd-output-format (#16541) 2 months ago
  Johannes Gäßler 7a0e900e36 llama: consistent ctx <-> buf order for KV cache (#16746) 2 months ago
  Aldehir Rojas 280d97be96 grammar : support array references in json schema (#16792) 2 months ago
  Chenguang Li 3479efd112 CANN: Improve device ID handling and aclnnArange checks (#16752) 2 months ago
  Aman Gupta 463bbf20bf CUDA: add unused vars to mmvf and mmvq (#16807) 2 months ago
  tamarPal ad8d36beff sycl: add SSM_CONV operation support (#16800) 2 months ago
  Yuri Khrustalev c053e18a66 chat: Add LFM2 tool handling (#16763) 2 months ago
  Xuan-Son Nguyen e1ab084803 mtmd : fix idefics3 preprocessing (#16806) 2 months ago
  Diego Devesa 5a4ff43e7d llama : disable pipeline parallelism if compute buffer allocation fails (#16748) 2 months ago
  Acly 10640e31aa ggml : fix interpolate with align-corners and ne=1 (#16700) 2 months ago
  Johannes Gäßler 80d28f104c HIP: fix AMDGPU_TARGETS, update documentation (#16803) 2 months ago
  Xuan-Son Nguyen c55d53acec model : add LightOnOCR-1B model (#16764) 2 months ago
  Johannes Gäßler 945501f5ea llama: fix leaked buffers for mmap + split files (#16765) 2 months ago
  Aman Gupta 75cbdd3fce test-backend-ops: print failed tests at the end (#16785) 2 months ago
  tamarPal 2b9bd9bf4e sycl: add ROLL operation support (#16665) 2 months ago
  shani-f 59fc1ec8e8 sycl: add REPEAT_BACK operation support (#16734) 2 months ago
  Aman Gupta 75d33b9302 CUDA: support for weight clamp in top-k norm (#16702) 2 months ago
  Acly 3470a5c891 ggml-alloc : make gallocr prefer chunks that allow memory reuse (#16788) 2 months ago
  Sigbjørn Skjæret bd562fe4f7 cuda : use fast copy when src and dst are of different type and contiguous (#16789) 2 months ago
  leejet bbac6a26b2 ggml: fix cuda kernel launch configuration for k_compute_batched_ptrs to support large batch (#16744) 2 months ago
  Sigbjørn Skjæret 73a48c9790 convert : enable expert group selection for all models with it (#16691) 2 months ago
  Sigbjørn Skjæret f696428ce8 graph : add clamping to ffn_moe_weights_sum to avoid div-by-zero (#16655) 2 months ago
  Sigbjørn Skjæret 7cce4f8158 model : set res->t_embd in SmallThinker models (#16782) 2 months ago
  amirai21 8d8862829c docs : add Jamba to Text-only models list (#16778) 2 months ago
  Aman Gupta f77c13b91f CUDA: General GEMV fusion (#16715) 2 months ago
  Gilad S. 3cfa9c3f12 vulkan: deduplicate Microsoft Direct3D12 devices (#16689) 2 months ago
  Galunid 5d195f17bc convert : handle mmproj filename/path properly (#16760) 2 months ago