Author | Commit | Message | Age
Jiacheng (Jason) Chen | 3e18dba9fd | HIP: Patch failed testcase in WMMA-MMQ kernels for RDNA 4 (#17502) | 1 month ago
hipudding | eeb5605de2 | CANN: Add MROPE and IMROPE support (#17401) | 1 month ago
o7si | f3a848a3b1 | chore: upgrade cpp-httplib from v0.27.0 to v0.28.0 (#17513) | 1 month ago
Jeff Bolz | b3b03a7baf | vulkan: Implement GGML_OP_CUMSUM (#17479) | 1 month ago
Georgi Gerganov | 583cb83416 | ggml : add ggml_top_k (#17365) | 1 month ago
Aleksei Nikiforov | 05872ac885 | convert : fix big-endian conversion (#17431) | 1 month ago
Diego Devesa | 55ab25caf5 | codeowners : remove slaren (#17492) | 1 month ago
TianHao324 | 064c90d843 | CANN: supports out_prod operator for F32 and F16 (#17406) | 1 month ago
Pascal | b1846f1c8e | webui: add rehype plugin to restore HTML in Markdown table cells (#17477) | 1 month ago
Jeff Bolz | d414db02d3 | vulkan: Use fewer rows for scalar FA when HS is not a multiple of 16 (#17455) | 1 month ago
Aaron Teo | 877566d512 | llama: introduce support for model-embedded sampling parameters (#17120) | 1 month ago
Jeff Bolz | 3d07caa99b | vulkan: more FA details in vk_perf_logger (#17443) | 1 month ago
Daniel Bevenius | 134e6940ca | llama : skip output reordering for single token batches (#17466) | 1 month ago
Jiacheng (Jason) Chen | 0543f928a3 | HIP: WMMA-MMQ kernels for RDNA 4 (#17156) | 1 month ago
Sigbjørn Skjæret | b61de2b2df | convert : allow quantizing lora again (#17453) | 1 month ago
Xuan-Son Nguyen | b8372eecd9 | server: split server.cpp code into server/common/task/queue (#17362) | 1 month ago
Daniel Bevenius | 6ab8eacddf | examples : add -kvu to batched usage example [no ci] (#17469) | 1 month ago
Georgi Gerganov | 2d50b9d8cb | sync : ggml | 1 month ago
Daniel Bevenius | 697edfeead | ggml : remove dirty flag from version string (ggml/1391) | 1 month ago
Alberto Cabrera Pérez | dbb852b549 | ggml-cpu: arm64: q4_K repack gemm and gemv implementations (i8mm) (#16739) | 1 month ago
ixgbe | 5f55c385cb | ggml: add RISC-V cpu-feats (#17461) | 1 month ago
william pan | 4902eebe33 | models : Added support for RND1 Diffusion Language Model (#17433) | 1 month ago
Max Krasnyansky | 923ae3c619 | hexagon: add support for ROPE_NEOX (#17458) | 1 month ago
Raul Torres | 01ad35e6d6 | CANN: Define `cann_graph_update_required` before macro (#17434) | 1 month ago
M. Mediouni | fcb013847c | ggml-hexagon: Initial Hexagon v68/v69 support (#17394) | 1 month ago
nullname | d5bc1ad110 | ggml-hexagon: add `hex_supported_buffer` for better buffer supported check (#17212) | 2 months ago
Pascal | 0c7220db56 | webui: minor settings reorganization and add disable autoscroll option (#17452) | 2 months ago
Sigbjørn Skjæret | 96ac5a2329 | cuda : support non-contiguous i32 to i32 copy (#17326) | 2 months ago
Eric Curtin | bc809e9c53 | vulkan: Update docker image to Ubuntu 26.04 to enable glslc features (#17439) | 2 months ago
Jeff Bolz | 54d83bbe85 | vulkan: remove a couple unnecessary switches (#17419) | 2 months ago