lhez
|
1c872f71fb
opencl: add f16 for `add`, `sub`, `mul`, `div` (#14984)
|
5 months ago |
Srihari-mcw
|
baad94885d
ggml : Q2k interleaving implementation - x86/x64 SIMD (#14373)
|
5 months ago |
Georgi Gerganov
|
ba42794c9e
graph : fix equal_seq() check (#14986)
|
5 months ago |
diannao
|
2860d479b4
docker : add cann build pipline (#14591)
|
5 months ago |
R0CKSTAR
|
484b2091ce
compare-commits.sh: support both llama-bench and test-backend-ops (#14392)
|
5 months ago |
Ed Addario
|
daf2dd7880
quantize : skip tensor override when in fallback mode (#14995)
|
5 months ago |
Diego Devesa
|
a06ed5feae
llama : add simple option to enable CPU for MoE weights (--cpu-moe) (#14992)
|
5 months ago |
Aman Gupta
|
784524053d
Fix params bug in diffusion example (#14993)
|
5 months ago |
Diego Devesa
|
d6818d06a6
llama : allow other bufts when overriding to CPU, add --no-repack option (#14990)
|
5 months ago |
Ruben Ortlam
|
e08a98826b
Vulkan: Fix minor debug mode issues (#14899)
|
5 months ago |
tc-mb
|
952a47f455
mtmd : support MiniCPM-V 4.0 (#14983)
|
5 months ago |
Csaba Kecskemeti
|
36e5fe7bcd
MODEL_TENSOR.SSM_DT_NORM has defined twice (#14991)
|
5 months ago |
g2mt
|
94933c8c2e
server : implement universal assisted decoding (#12635)
|
5 months ago |
Dongliang Wei
|
c1dacaa99b
llama : merge build_moe_ffn_from_probs function into build_moe_ffn (#14968)
|
5 months ago |
Lukas Straub
|
a9f77a8be3
server : add openai-style logit_bias support (#14946)
|
5 months ago |
Aman Gupta
|
8a4a856277
Add LLaDA 8b Diffusion model (#14771)
|
5 months ago |
hipudding
|
11490b3672
CANN: Improve loading efficiency after converting weights to NZ format. (#14985)
|
5 months ago |
compilade
|
66625a59a5
graph : reduce splits for recurrent and hybrid models (#14825)
|
5 months ago |
lhez
|
6e6725459a
opencl: add `mul_mat_f32_f32_l4_lm` and `mul_mat_f16_f32_l4_lm` (#14809)
|
5 months ago |
Ed Addario
|
e9192bec56
quantize : fix using combined imatrix GGUFs (multiple datasets) (#14973)
|
5 months ago |
Daniel Bevenius
|
41e78c567e
server : add support for `embd_normalize` parameter (#14964)
|
5 months ago |
uvos
|
ad4a700117
HIP: enable mfma mmq on gfx908 and gfx90a for select datatypes and shapes (#14949)
|
5 months ago |
Georgi Gerganov
|
e32a4ec60e
sync : ggml
|
5 months ago |
Kai Pastor
|
e228de9449
cmake : Fix BLAS link interface (ggml/1316)
|
5 months ago |
Kai Pastor
|
73a8e5ca03
vulkan : fix 32-bit builds (ggml/1313)
|
5 months ago |
Johannes Gäßler
|
92b8810ec7
CUDA: skip masked KV slices for all FA kernels (#14924)
|
5 months ago |
Georgi Gerganov
|
00131d6eaf
tests : update for LLAMA_SET_ROWS=1 (#14961)
|
5 months ago |
Georgi Gerganov
|
1e15bfd42c
graph : fix stack-use-after-return (#14960)
|
5 months ago |
Douglas Hanley
|
a118d80233
embeddings: fix extraction of CLS pooling results (#14927)
|
5 months ago |
Xinpeng Dou
|
61550f8231
CANN: update ops docs (#14935)
|
5 months ago |