lhez | 7cba58bbea | opencl: add sqr, sqrt, mean and ssm_conv (#17476) | 1 month ago
Alberto Cabrera Pérez | 5449367b21 | Fix chunks being too small with small matrix sizes (#17526) | 1 month ago
Han Qingzhe | 1d594c295c | clip: (minicpmv) fix resampler kq_scale (#17516) | 1 month ago
Jeff Bolz | eec1e33a9e | vulkan: allow graph_optimize for prompt processing workloads (#17475) | 1 month ago
Jeff Bolz | 879d673759 | vulkan: Implement top-k (#17418) | 1 month ago
xctan | 6ab4e50d9c | ggml-cpu : add RISC-V Zvfh impl for ggml_vec_mad_f16 (#17448) | 1 month ago
Adrien Gallouët | 2336cc4784 | cmake : use EXCLUDE_FROM_ALL to avoid patch-boringssl.cmake (#17520) | 1 month ago
Adrien Gallouët | e6923caaec | ggml : fix ARM feature verification (#17519) | 1 month ago
Jiacheng (Jason) Chen | 3e18dba9fd | HIP: Patch failed testcase in WMMA-MMQ kernels for RDNA 4 (#17502) | 1 month ago
hipudding | eeb5605de2 | CANN: Add MROPE and IMROPE support (#17401) | 1 month ago
o7si | f3a848a3b1 | chore: upgrade cpp-httplib from v0.27.0 to v0.28.0 (#17513) | 1 month ago
Jeff Bolz | b3b03a7baf | vulkan: Implement GGML_OP_CUMSUM (#17479) | 1 month ago
Georgi Gerganov | 583cb83416 | ggml : add ggml_top_k (#17365) | 1 month ago
Aleksei Nikiforov | 05872ac885 | convert : fix big-endian conversion (#17431) | 1 month ago
Diego Devesa | 55ab25caf5 | codeowners : remove slaren (#17492) | 1 month ago
TianHao324 | 064c90d843 | CANN: supports out_prod operator for F32 and F16 (#17406) | 1 month ago
Pascal | b1846f1c8e | webui: add rehype plugin to restore HTML in Markdown table cells (#17477) | 1 month ago
Jeff Bolz | d414db02d3 | vulkan: Use fewer rows for scalar FA when HS is not a multiple of 16 (#17455) | 1 month ago
Aaron Teo | 877566d512 | llama: introduce support for model-embedded sampling parameters (#17120) | 1 month ago
Jeff Bolz | 3d07caa99b | vulkan: more FA details in vk_perf_logger (#17443) | 1 month ago
Daniel Bevenius | 134e6940ca | llama : skip output reordering for single token batches (#17466) | 1 month ago
Jiacheng (Jason) Chen | 0543f928a3 | HIP: WMMA-MMQ kernels for RDNA 4 (#17156) | 1 month ago
Sigbjørn Skjæret | b61de2b2df | convert : allow quantizing lora again (#17453) | 1 month ago
Xuan-Son Nguyen | b8372eecd9 | server: split server.cpp code into server/common/task/queue (#17362) | 1 month ago
Daniel Bevenius | 6ab8eacddf | examples : add -kvu to batched usage example [no ci] (#17469) | 1 month ago
Georgi Gerganov | 2d50b9d8cb | sync : ggml | 1 month ago
Daniel Bevenius | 697edfeead | ggml : remove dirty flag from version string (ggml/1391) | 1 month ago
Alberto Cabrera Pérez | dbb852b549 | ggml-cpu: arm64: q4_K repack gemm and gemv implementations (i8mm) (#16739) | 1 month ago
ixgbe | 5f55c385cb | ggml: add RISC-V cpu-feats (#17461) | 1 month ago
william pan | 4902eebe33 | models : Added support for RND1 Diffusion Language Model (#17433) | 1 month ago