Alberto Cabrera Pérez
|
cd8370b408
ggml-cpu: aarm64: q4_K repack gemm and gemv implementations (dotprod only) (#17494)
|
2 months ago |
Eric Curtin
|
d21a76ac38
devops: Add build-essential to Ubuntu 26.04 image (#17531)
|
2 months ago |
Aleksei Nikiforov
|
4fcd87cf7c
gguf-py : skip endian-conversion of MXFP4 data (#17523)
|
2 months ago |
Acly
|
b78db3bd50
vulkan : move contiguous checks to device_supports_op (#17490)
|
2 months ago |
Jeff Bolz
|
142df17c9c
vulkan: use a fixed 1KB buffer for the add_rms_fusion opt (#17514)
|
2 months ago |
Xuan-Son Nguyen
|
e509411cf1
server: enable jinja by default, update docs (#17524)
|
2 months ago |
lhez
|
7cba58bbea
opencl: add sqr, sqrt, mean and ssm_conv (#17476)
|
2 months ago |
Alberto Cabrera Pérez
|
5449367b21
Fix chunks being too small with small matrix sizes (#17526)
|
2 months ago |
Han Qingzhe
|
1d594c295c
clip: (minicpmv) fix resampler kq_scale (#17516)
|
2 months ago |
Jeff Bolz
|
eec1e33a9e
vulkan: allow graph_optimize for prompt processing workloads (#17475)
|
2 months ago |
Jeff Bolz
|
879d673759
vulkan: Implement top-k (#17418)
|
2 months ago |
xctan
|
6ab4e50d9c
ggml-cpu : add RISC-V Zvfh impl for ggml_vec_mad_f16 (#17448)
|
2 months ago |
Adrien Gallouët
|
2336cc4784
cmake : use EXCLUDE_FROM_ALL to avoid patch-boringssl.cmake (#17520)
|
2 months ago |
Adrien Gallouët
|
e6923caaec
ggml : fix ARM feature verification (#17519)
|
2 months ago |
Jiacheng (Jason) Chen
|
3e18dba9fd
HIP: Patch failed testcase in WMMA-MMQ kernels for RDNA 4 (#17502)
|
2 months ago |
hipudding
|
eeb5605de2
CANN: Add MROPE and IMROPE support (#17401)
|
2 months ago |
o7si
|
f3a848a3b1
chore: upgrade cpp-httplib from v0.27.0 to v0.28.0 (#17513)
|
2 months ago |
Jeff Bolz
|
b3b03a7baf
vulkan: Implement GGML_OP_CUMSUM (#17479)
|
2 months ago |
Georgi Gerganov
|
583cb83416
ggml : add ggml_top_k (#17365)
|
2 months ago |
Aleksei Nikiforov
|
05872ac885
convert : fix big-endian conversion (#17431)
|
2 months ago |
Diego Devesa
|
55ab25caf5
codeowners : remove slaren (#17492)
|
2 months ago |
TianHao324
|
064c90d843
CANN: supports out_prod operator for F32 and F16 (#17406)
|
2 months ago |
Pascal
|
b1846f1c8e
webui: add rehype plugin to restore HTML in Markdown table cells (#17477)
|
2 months ago |
Jeff Bolz
|
d414db02d3
vulkan: Use fewer rows for scalar FA when HS is not a multiple of 16 (#17455)
|
2 months ago |
Aaron Teo
|
877566d512
llama: introduce support for model-embedded sampling parameters (#17120)
|
2 months ago |
Jeff Bolz
|
3d07caa99b
vulkan: more FA details in vk_perf_logger (#17443)
|
2 months ago |
Daniel Bevenius
|
134e6940ca
llama : skip output reordering for single token batches (#17466)
|
2 months ago |
Jiacheng (Jason) Chen
|
0543f928a3
HIP: WMMA-MMQ kernels for RDNA 4 (#17156)
|
2 months ago |
Sigbjørn Skjæret
|
b61de2b2df
convert : allow quantizing lora again (#17453)
|
2 months ago |
Xuan-Son Nguyen
|
b8372eecd9
server: split server.cpp code into server/common/task/queue (#17362)
|
2 months ago |