Sigbjørn Skjæret
|
2bf3fbf0b5
ci : check that pre-tokenizer hashes are up-to-date (#15032)
|
5 months ago |
Douglas Hanley
|
711d5e6fe6
convert : fix Qwen3-Embedding pre-tokenizer hash (#15030)
|
5 months ago |
Jhen-Jie Hong
|
f738989dcb
chat : fix multiple tool_calls on hermes-2-pro (#14962)
|
5 months ago |
Jeff Bolz
|
4cb208c93c
vulkan: coopmat2 mul_mat optimizations (#14934)
|
5 months ago |
R0CKSTAR
|
3025b621d1
llama-bench: rename DB table name from test to llama_bench (#15003)
|
5 months ago |
Jeff Bolz
|
ec0b18802c
vulkan: Support ne[3]>1 in noncontig matrix-vector multiply (#15015)
|
5 months ago |
Douglas Hanley
|
339bd0268c
model : support Qwen3-Embedding (#15023)
|
5 months ago |
Johannes Gäßler
|
f906275537
server: enable token array inputs for OAI API (#15001)
|
5 months ago |
Jeff Bolz
|
a9f7541ec2
vulkan: optimizations for direct convolution (#14933)
|
5 months ago |
Johannes Gäßler
|
9c35706b98
CUDA: fix MMQ nwarps for AMD with warp_size==32 (#15014)
|
5 months ago |
l-austenfeld
|
c76b420e4c
vendor : update vendored copy of google/minja (#15011)
|
5 months ago |
stevenkuang
|
0f5ccd6fd1
model : add hunyuan dense (#14878)
|
5 months ago |
lhez
|
1c872f71fb
opencl: add f16 for `add`, `sub`, `mul`, `div` (#14984)
|
5 months ago |
Srihari-mcw
|
baad94885d
ggml : Q2k interleaving implementation - x86/x64 SIMD (#14373)
|
5 months ago |
Georgi Gerganov
|
ba42794c9e
graph : fix equal_seq() check (#14986)
|
5 months ago |
diannao
|
2860d479b4
docker : add cann build pipline (#14591)
|
5 months ago |
R0CKSTAR
|
484b2091ce
compare-commits.sh: support both llama-bench and test-backend-ops (#14392)
|
5 months ago |
Ed Addario
|
daf2dd7880
quantize : skip tensor override when in fallback mode (#14995)
|
5 months ago |
Diego Devesa
|
a06ed5feae
llama : add simple option to enable CPU for MoE weights (--cpu-moe) (#14992)
|
5 months ago |
Aman Gupta
|
784524053d
Fix params bug in diffusion example (#14993)
|
5 months ago |
Diego Devesa
|
d6818d06a6
llama : allow other bufts when overriding to CPU, add --no-repack option (#14990)
|
5 months ago |
Ruben Ortlam
|
e08a98826b
Vulkan: Fix minor debug mode issues (#14899)
|
5 months ago |
tc-mb
|
952a47f455
mtmd : support MiniCPM-V 4.0 (#14983)
|
5 months ago |
Csaba Kecskemeti
|
36e5fe7bcd
MODEL_TENSOR.SSM_DT_NORM has defined twice (#14991)
|
5 months ago |
g2mt
|
94933c8c2e
server : implement universal assisted decoding (#12635)
|
5 months ago |
Dongliang Wei
|
c1dacaa99b
llama : merge build_moe_ffn_from_probs function into build_moe_ffn (#14968)
|
5 months ago |
Lukas Straub
|
a9f77a8be3
server : add openai-style logit_bias support (#14946)
|
5 months ago |
Aman Gupta
|
8a4a856277
Add LLaDA 8b Diffusion model (#14771)
|
5 months ago |
hipudding
|
11490b3672
CANN: Improve loading efficiency after converting weights to NZ format. (#14985)
|
5 months ago |
compilade
|
66625a59a5
graph : reduce splits for recurrent and hybrid models (#14825)
|
5 months ago |