Commit History

Author | SHA1 | Message | Date
Reese Levine | d304f459d8 | GGML WebGPU: Support for ADD, MUL, RMS_NORM, GET_ROWS operators (#16018) | 4 months ago
Georgi Gerganov | 0320ac5264 | metal : refactor + optimize v2 (#15995) | 4 months ago
Aleksander Grygier | a7a98e0fff | SvelteKit-based WebUI (#14839) | 4 months ago
Xuan-Son Nguyen | 8f8f2274ee | convert : add Llama4ForCausalLM (#16042) | 4 months ago
Johannes Gäßler | c959b676be | CUDA: fix FA occupancy, optimize tile kernel (#15982) | 4 months ago
David Ribeiro Alves | cd08fc3ecc | common : Fix corrupted memory error on json grammar initialization (#16038) | 4 months ago
Eve | cb5bb6cc05 | vulkan: automatically remove unsupported devices (#15976) | 4 months ago
Daniel Bevenius | a91d035b90 | ci : revert back to macos-13 for macOS-latest-cmake-x64 (#16040) | 4 months ago
Jie Fu (傅杰) | 745cbcf2fe | llama-quant : fix the verification of attention layers for encoder-decoder models (#16023) | 4 months ago
Jie Fu (傅杰) | 1cbd80f8cf | examples : support encoder-decoder models in the simple example (#16002) | 4 months ago
Shane A | 85286f3548 | model : add OLMo3 support (#16015) | 4 months ago
Chenguang Li | d5fabe3682 | CANN: Optimize ggml_cann_set_device (#15935) | 4 months ago
jacekpoplawski | 8ff206097c | llama-bench: add --n-cpu-moe support (#15952) | 4 months ago
Daniel Bevenius | 77475530b8 | ci : use macos-latest for arm64 webgpu build (#16029) | 4 months ago
Daniel Bevenius | 3913f8730e | ggml : fix padding in timestep embedding kernels (#15932) | 4 months ago
Daniel Bevenius | 76888d202e | ci : upload xcframework artifact from ios-xcode-build job (#16010) | 4 months ago
Bowen Han | f1fbffb5c0 | fix: apply clang-format to CUDA macros (#16017) | 4 months ago
Daniel Bevenius | 51abc96bdc | ci : update macos-latest* jobs to use macos-latest (#15938) | 4 months ago
Yuri Khrustalev | 07808ebb07 | cmake : Do not install tools on iOS targets (#15903) | 4 months ago
Aman Gupta | 6d758839ff | Add LLaDA-7b-MoE diffusion model (#16003) | 4 months ago
Jake Karnes | 3d4053f77f | CUDA: fix im2col_3d to respect non-contiguous inputs (views) (#15956) | 4 months ago
Diego Devesa | dc381aa9a6 | docker : enable rocWMMA in ROCm images, add gfx1151 (#15997) | 4 months ago
Diego Devesa | 10d197409b | releases : switch to rocWMMA develop branch, add gfx1151 (#15992) | 4 months ago
yael-works | b907255f4b | SYCL: Add COUNT_EQUAL operator support (#15991) | 4 months ago
Nikolay Popov | 28c39da7c6 | llama-run: Fix model download on Windows (#15988) | 4 months ago
Aman Gupta | 106220562a | CUDA: some micro-optimizations in mmf.cuh for mul_mat_id (#15926) | 4 months ago
ddh0 | a68f31edd7 | fix KLD percentile output (#15999) | 4 months ago
Sigbjørn Skjæret | b8e09f08b9 | model : add grok-2 support (#15539) | 4 months ago
Sigbjørn Skjæret | 6c019cb04e | server : only attempt to enable thinking if using jinja (#15967) | 4 months ago
Georgi Gerganov | 9dcd200d57 | metal : remove memory pools (#15966) | 4 months ago