Charles Xu | 8c583242ad | kleidiai: add optimized per-channel kernels for Q8_0 (#16993) | 2 months ago
Mike Abbott | 4a5b8aff40 | cmake : add version to all shared object files (#17091) | 2 months ago
Nicolas B. Pierron | d2d626938a | Install rpc-server when GGML_RPC is ON. (#17149) | 2 months ago
levkropp | 2fc392ce35 | convert : register UMT5Model architecture for T5 conversion (#17160) | 2 months ago
lhez | ece0f5c177 | opencl: add fastdiv and use it in set_rows, ported from cuda (#17090) | 2 months ago
Sigbjørn Skjæret | 7bef684118 | models : move build_inp_out_ids outside loop (#17151) | 2 months ago
Max Krasnyansky | 395e286bc9 | cpu: skip NOPs to avoid barriers (#17133) | 2 months ago
Georgi Gerganov | 13730c183b | metal : cap threadgroups size of set_rows (#17146) | 2 months ago
Adrien Gallouët | 967eb4b2bf | ggml-cpu : inspect -march and -mcpu to found the CPU (#16333) | 2 months ago
Ruben Ortlam | f117be185e | vulkan: check glslc executable string (#17144) | 2 months ago
Ruben Ortlam | 85234a4b3a | vulkan: fix validation issue introduced by #16868 (#17145) | 2 months ago
Gabe Goodhart | 0c74f32632 | memory: Hybrid context shift (#17009) | 2 months ago
Georgi Gerganov | c27efd2bd1 | metal : enable tensor API for A19 (#17087) | 2 months ago
fj-y-saito | df70bedda7 | arm64: add i8mm route with SVE ggml_vec_dot_q4_K_q8_K and ggml_vec_dot_q6_K_… (#15277) | 2 months ago
Georgi Gerganov | f914544b16 | batched-bench : add "separate text gen" mode (#17103) | 2 months ago
Xuan-Son Nguyen | 4b13a684c5 | mtmd: fix patch_size initialized to random value in audio models (#17128) | 2 months ago
Georgi Gerganov | 9898b57cbe | editorconfig : ignore benches/ (#17140) | 2 months ago
Acly | 1032256ec9 | cuda/vulkan : bicubic interpolation (#17022) | 2 months ago
Georgi Gerganov | 15274c0c50 | benches : add eval results (#17139) | 2 months ago
Georgi Gerganov | b8595b16e6 | mtmd : fix embedding size for image input (#17123) | 2 months ago
Ruben Ortlam | 392e09a608 | vulkan: fix memory allocations (#17122) | 2 months ago
compilade | 802cef44bf | convert : parse safetensors directly (#15667) | 2 months ago
compilade | 1c07c0c68c | convert : handle compressed-tensors quant method (#17069) | 2 months ago
Georgi Gerganov | cb1adf8851 | server : handle failures to restore host cache (#17078) | 2 months ago
Georgi Gerganov | ef1d826997 | benches : add folder with benchmarks (#16931) | 2 months ago
Eric Curtin | 86fde91e62 | Switch to using Ubuntu 25.10 vulkan/mesa (#16497) | 2 months ago
Ruben Ortlam | 7f3e9d339c | vulkan: iGPU memory reporting fix (#17110) | 2 months ago
Ruben Ortlam | 8a3519b708 | vulkan: fix mmq out of bounds reads (#17108) | 2 months ago
Jeff Bolz | 80a6cf6347 | vulkan: fuse mul_mat_id + mul (#17095) | 2 months ago
Georgi Gerganov | 0750a59903 | metal : retain src and dst buffers during async ops (#17101) | 2 months ago