Xuan-Son Nguyen
|
cb9178f885
llama : remove llm_graph_input_one (#14603)
|
6 months ago |
compilade
|
4a5686da22
llama : support Jamba hybrid Transformer-Mamba models (#7531)
|
6 months ago |
Xuan-Son Nguyen
|
98bab638fb
ggml : add ggml_scale_bias (#14417)
|
6 months ago |
Miaoqian Lin
|
26a48ad699
ggml : prevent integer overflow in gguf tensor size calculation (#14595)
|
6 months ago |
Dowon
|
ffd59e7d18
model : add skt/A.X-4.0 model vocabulary (#14589)
|
6 months ago |
Sigbjørn Skjæret
|
105554595f
llama : remove unintended whitespace (#14592)
|
6 months ago |
ibrahim khadraoui
|
04655063c4
model : add support for Falcon-H1 family (#14534)
|
6 months ago |
Xuan-Son Nguyen
|
20b7bf8a32
convert : fix smollm3 jinja template (#14586)
|
6 months ago |
Jeff Bolz
|
6efcd65945
vulkan: optimize flash attention split_k_reduce (#14554)
|
6 months ago |
stevenkuang
|
699f4392a3
model : fix hunyuan moe chat template (#14584)
|
6 months ago |
Xuan-Son Nguyen
|
08382869a2
model : add SmolLM3 (#14581)
|
6 months ago |
compilade
|
bb4f7a9e4e
memory : fix broken batch splits for recurrent cache (#14575)
|
6 months ago |
Jeff Bolz
|
b8eeb8741d
vulkan : fix rope with partial rotation and non-cont src (#14582)
|
6 months ago |
Alawode Oluwandabira
|
17a1f0d2d4
server: Add ability to mount server at prefix (#14544)
|
6 months ago |
Xuan-Son Nguyen
|
8f22dc0a53
model : add hunyuan moe (#14425)
|
6 months ago |
Jeff Bolz
|
53903ae6fa
vulkan: increase timeout for CI (#14574)
|
6 months ago |
Georgi Gerganov
|
4d0dcd4a06
cuda : fix rope with partial rotation and non-cont src (#14580)
|
6 months ago |
Aman Gupta
|
75c91de6e9
CUDA: add bilinear interpolation for upscale (#14563)
|
6 months ago |
R0CKSTAR
|
68155c66f0
musa: fix build warnings (unused variable) (#14561)
|
6 months ago |
Sigbjørn Skjæret
|
e1a7059053
llama : fix incorrect minicpm3 v_states shape (#14571)
|
6 months ago |
Sigbjørn Skjæret
|
12f55c302b
llama : remove ggml_cont where possible (#14568)
|
6 months ago |
Aman Gupta
|
b9c3eefde1
CUDA: add bf16 and i32 to getrows (#14529)
|
6 months ago |
Eve
|
6491d6e4f1
vulkan: increase LOAD_VEC_A to 8 (IQ1/IQ2) or 4 (IQ3) (#14485)
|
6 months ago |
Jeff Bolz
|
e592be1575
vulkan: fix rms_norm+mul fusion (#14545)
|
6 months ago |
Jeff Bolz
|
a0374a67e2
vulkan: Handle updated FA dim2/3 definition (#14518)
|
6 months ago |
Sigbjørn Skjæret
|
ddef99522d
server : fix assistant prefilling when content is an array (#14360)
|
6 months ago |
Sigbjørn Skjæret
|
6681688146
opencl: add GELU_ERF (#14476)
|
6 months ago |
Georgi Gerganov
|
bac8bed248
eval-callback : check for empty input (#14539)
|
6 months ago |
R0CKSTAR
|
b81510a7b7
test-backend-ops: add support for specifying output format (#14368)
|
6 months ago |
Georgi Gerganov
|
ef797db357
metal : disable fast math in all quantize kernels (#14528)
|
6 months ago |