Ruben Ortlam 7f459c98e7 vulkan: use fewer FA rows for small cache runs (#18280) 1 month ago
ggml-blas 4d3d455d3c sync : whisper.cpp (ggml/1359) 4 months ago
ggml-cann cf2ffc02bc CANN: Uses yarn_ramp cache in ROPE (#17725) 1 month ago
ggml-cpu d34d5ca1e9 llamafile: add rvv support for sgemm kernels (#18199) 1 month ago
ggml-cuda b365c3ff01 vulkan/cuda: fix topk_moe with exp_probs_b (#18071) 1 month ago
ggml-hexagon ed75977717 ggml-hexagon: create generalized functions for cpu side op (#17500) 1 month ago
ggml-hip 80d28f104c HIP: fix AMDGPU_TARGETS, update documentation (#16803) 3 months ago
ggml-metal 165caaf5fb metal: use shared buffers on eGPU (#17866) 1 month ago
ggml-musa 11f0af5504 CUDA: faster tile FA, add oob checks, more HSs (#16492) 3 months ago
ggml-opencl eb492bf43f opencl: unpack q4_0 for adreno in get_tensor (#18278) 1 month ago
ggml-rpc 12ee1763a6 rpc : add check for rpc buffer type (#18242) 1 month ago
ggml-sycl 4aced7a631 [SYCL] Support gpt-oss by OPs add-id, mul_mat for mxfp4, swiglu_oai (#17826) 1 month ago
ggml-vulkan 7f459c98e7 vulkan: use fewer FA rows for small cache runs (#18280) 1 month ago
ggml-webgpu fd57b24c0f ggml webgpu: unary op suppport, code refactoring, ops support (#17764) 1 month ago
ggml-zdnn 264f1b5187 zdnn: refactor codebase + add docs (#16178) 4 months ago
ggml-zendnn 017761daf5 ggml-zendnn : add ZenDNN backend for AMD CPUs (#17690) 1 month ago
CMakeLists.txt 5c0d18881e llama.android : Rewrite Android binding (w/o cpu_features dep) (#17413) 1 month ago
ggml-alloc.c b1f3a6e5db llama: automatically set parameters not set by the user in such a way that maximizes GPU utilization (#16653) 1 month ago
ggml-backend-impl.h 898acba681 rpc : add support for multiple devices (#16276) 3 months ago
ggml-backend-reg.cpp 017761daf5 ggml-zendnn : add ZenDNN backend for AMD CPUs (#17690) 1 month ago
ggml-backend.cpp b1f3a6e5db llama: automatically set parameters not set by the user in such a way that maximizes GPU utilization (#16653) 1 month ago
ggml-common.h fd1234cb46 llama : add gpt-oss (#15091) 5 months ago
ggml-impl.h 389ac78b26 ggml : add ops SOFTPLUS, EXPM1, TRI, SOLVE_TRI, CUMSUM (#17063) 2 months ago
ggml-opt.cpp 5cdb27e091 finetune: SGD optimizer, more CLI args (#13873) 5 months ago
ggml-quants.c f6b4af3d04 ggml : fix uninitialized is_on_grid in quantize_row_iq3_xxs_impl (#15928) 4 months ago
ggml-quants.h fd1234cb46 llama : add gpt-oss (#15091) 5 months ago
ggml-threading.cpp ae8de6d50a ggml : build backends as libraries (#10256) 1 year ago
ggml-threading.h cb13ef85a4 remove CMAKE_WINDOWS_EXPORT_ALL_SYMBOLS (#10797) 1 year ago
ggml.c b1f3a6e5db llama: automatically set parameters not set by the user in such a way that maximizes GPU utilization (#16653) 1 month ago
ggml.cpp fedf034a98 ggml : Print backtrace on uncaught C++ exceptions (ggml/1232) 8 months ago
gguf.cpp 37adc9c6ba ggml, llama : use defaulted constructors/destructors (#17649) 1 month ago