Daniel Bevenius
|
fb15d649ed
llama : add support for EmbeddingGemma 300m (#15798)
|
4 months ago |
Gabe Goodhart
|
856ed0947f
metal : Add template specialization for mul_mm_id w/ ne20 == 10 (#15799)
|
4 months ago |
Daniel Bevenius
|
d1e2adba65
llama : set n_outputs to 1 to avoid 0 outputs mean-pooling (#15791)
|
4 months ago |
Chenguang Li
|
c1c354e44c
CANN: Refactor ND to NZ workspace to be per-device (#15763)
|
4 months ago |
Xuan-Son Nguyen
|
a68d914426
server: add exceed_context_size_error type (#15780)
|
4 months ago |
Eric Curtin
|
badb80cadb
Document the new max GPU layers default in help (#15771)
|
4 months ago |
leejet
|
0a1b3982cd
ggml: add ops for WAN video model (cuda && cpu) (#15669)
|
4 months ago |
hipudding
|
5421f63ab0
CANN: Fix precision issue on 310I DUO multi-devices (#15784)
|
4 months ago |
rmatif
|
820bc98531
opencl: add hs=40 to FA (#15758)
|
4 months ago |
Chenguang Li
|
239b60e898
CANN: fix acl_rstd allocation size in ggml_cann_rms_norm (#15760)
|
4 months ago |
Ruben Ortlam
|
dff7551bfd
vulkan: fix mmv subgroup16 selection (#15775)
|
4 months ago |
Jeff Bolz
|
0fce7a1248
vulkan: don't use std::string in load_shaders, to improve compile time (#15724)
|
4 months ago |
Daniel Bevenius
|
8227695d7a
vulkan : update ggml_vk_instance_validation_ext_available (#15666)
|
4 months ago |
Shin-myoung-serp
|
0014fb4add
ggml vulkan: add hardsigmoid and hardswish operations (#15762)
|
4 months ago |
Oliver Simons
|
661ae31c9c
CUDA: Optimize `rms_norm_f32` kernel and its fused variants, giving 1-6% perf E2E (#15715)
|
4 months ago |
Daniel Bevenius
|
407c23786d
model-conversion : fix pyright errors (#15770)
|
4 months ago |
Georgi Gerganov
|
cdedb70a99
sampling : optimize dist sampler (#15704)
|
4 months ago |
Daniel Bevenius
|
2c8dac72eb
llama : fix incorrect model type for Gemma 270M (#15764)
|
4 months ago |
Daniel Bevenius
|
40a751ea9a
model-conversion : remove hardcoded /bin/bash shebangs [no ci] (#15765)
|
4 months ago |
hipudding
|
5eae934883
CANN: Add RoPE contiguous check for 310I DUP device (#15735)
|
4 months ago |
xctan
|
05c0380f2a
ggml-cpu : optimize RVV kernels (#15720)
|
4 months ago |
Daniel Bevenius
|
8c3fdf44ec
model-conversion : add missing curl script [no ci] (#15761)
|
4 months ago |
hipudding
|
f6da8cb86a
CANN: Mask unsupported TRANSPOSE_1D operator (#15733)
|
4 months ago |
Chenguang Li
|
8a2234ea0c
CANN: Fix type float_t to float (#15736)
|
4 months ago |
SnA1lGo
|
3de008208b
fix: resolve unsigned int initialization warning for n_dims/size in gguf.cpp (#15754)
|
4 months ago |
Oliver Simons
|
69db8a52e6
chore: Update `.clang-format` to use `BinPackArguments=true` (#15744)
|
4 months ago |
Johannes Gäßler
|
c466abe158
llama: -fa 1/0/-1 aliases for -fa on/off/auto (#15746)
|
4 months ago |
Ruben Ortlam
|
0a2a3841e8
vulkan: fix shaders gen when no integer dot is available (#15740)
|
4 months ago |
hipudding
|
9961d244f2
CANN: Resolve soft_max precision issue (#15730)
|
4 months ago |
Jeff Bolz
|
25f1045f07
vulkan: Fix macro parameter order for f32 matmul shaders (#15716)
|
4 months ago |