Commit History

| Author | SHA1 | Message | Date |
| --- | --- | --- | --- |
| Georgi Gerganov | bf9087f59a | metal : fuse add, mul + add tests (#14596) | 6 months ago |
| Jeff Bolz | bd9c981d72 | vulkan: Add fusion support for RMS_NORM+MUL (#14366) | 6 months ago |
| Diego Devesa | b47ab7b8e9 | sched : avoid changing cur_copy when a graph is already allocated (#13922) | 7 months ago |
| Diego Devesa | 952f3953c1 | ggml : allow CUDA graphs when using pipeline parallelism (#13814) | 7 months ago |
| Johannes Gäßler | 10d2af0eaa | llama/ggml: add LLM training support (#10544) | 8 months ago |
| David Huang | 7f323a589f | Add `--no-op-offload` to improve `-ot` pp perf in MoE models like llama4 400B (#13386) | 8 months ago |
| Johannes Gäßler | 9070365020 | CUDA: fix logic for clearing padding with -ngl 0 (#13320) | 8 months ago |
| mgroeber9110 | 5bbe6a9fe9 | ggml : portability fixes for VS 2017 (#12150) | 10 months ago |
| William Tambellini | 70680c48e5 | ggml : upgrade init_tensor API to return a ggml_status (#11854) | 10 months ago |
| Diego Devesa | 017cc5f446 | ggml-backend : only offload from host buffers (fix) (#11124) | 1 year ago |
| Diego Devesa | a3d50bc022 | ggml-backend : only offload from host buffers (#11120) | 1 year ago |
| Daniel Bevenius | db68c93b57 | ggml : improve inputs log sched_print_assignments (ggml/1053) | 1 year ago |
| Diego Devesa | 7cc2d2c889 | ggml : move AMX to the CPU backend (#10570) | 1 year ago |
| slaren | 59b9172822 | ggml/sched : do not skip views in pre-assignments | 1 year ago |
| Johannes Gäßler | 02e4eaf22f | ggml-opt: fix data corruption (ggml/1022) | 1 year ago |
| Diego Devesa | be5caccef9 | llama : only use default buffer types for the KV cache (#10358) | 1 year ago |
| Diego Devesa | eda7e1d4f5 | ggml : fix possible buffer use after free in sched reserve (#9930) | 1 year ago |
| Johannes Gäßler | 8a43e940ab | ggml: new optimization interface (ggml/988) | 1 year ago |
| Diego Devesa | ae8de6d50a | ggml : build backends as libraries (#10256) | 1 year ago |
| Diego Devesa | 9f40989351 | ggml : move CPU backend to a separate file (#10144) | 1 year ago |
| Diego Devesa | c02e5ab2a6 | llama : fix buffer checks for mamba and rwk (#10111) | 1 year ago |
| Sergio López | 61408e7fad | kompute: add backend registry / device interfaces (#10045) | 1 year ago |
| Diego Devesa | c5b0f4b5d9 | llama : refactor model loader with backend registry (#10026) | 1 year ago |
| leo-pony | 6b8447352d | [CANN] Adapt to dynamically loadable backends mechanism (#9970) | 1 year ago |
| Ouadie EL FAROUKI | 87421a23e8 | [SYCL] Add SYCL Backend registry, device and Event Interfaces (#9705) | 1 year ago |
| Ma Mingfei | 60ce97c9d8 | add amx kernel for gemm (#8998) | 1 year ago |
| Diego Devesa | f010b77a37 | vulkan : add backend registry / device interfaces (#9721) | 1 year ago |
| Gilad S. | 2194200278 | fix: allocating CPU buffer with size `0` (#9917) | 1 year ago |
| Gilad S. | 73afe681aa | fix: use `vm_allocate` to allocate CPU backend buffer on macOS (#9875) | 1 year ago |
| Diego Devesa | 96776405a1 | ggml : move more prints to the ggml log system (#9839) | 1 year ago |