R0CKSTAR
|
3cd3a39532
ci: [MUSA] add CI and update doc (#12562)
|
9 months ago |
Georgi Gerganov
|
2d77d88e70
context : fix worst-case reserve outputs (#12545)
|
9 months ago |
Akarshan Biswas
|
c95fa362b3
ci: [SYCL] ggml-ci Use main GPU and enable sysman (#12547)
|
9 months ago |
lhez
|
2b65ae3029
opencl: simplify kernel embedding logic in cmakefile (#12503)
|
9 months ago |
Akarshan Biswas
|
48d7021c61
CI: fix SYCL build (#12546)
|
9 months ago |
Tei Home
|
3361e2deba
docs: update: improve the Fedoa CUDA guide (#12536)
|
9 months ago |
compilade
|
00d53800e0
llama-vocab : add SuperBPE pre-tokenizer (#12532)
|
9 months ago |
R0CKSTAR
|
7ea75035b6
CUDA: Fix clang warnings (#12540)
|
9 months ago |
Prajwal B Mehendarkar
|
c54f6b7988
mmap : skip resource limit checks on AIX (#12541)
|
9 months ago |
Jeff Bolz
|
9b169a4d4e
vulkan: fix mul_mat_vec failure in backend tests (#12529)
|
9 months ago |
Marius Gerdes
|
77f9c6bbe5
server : Add verbose output to OAI compatible chat endpoint. (#12246)
|
10 months ago |
Lars Sonchocky-Helldorf
|
18b663d8e4
install : add macports (#12518)
|
10 months ago |
Xuan-Son Nguyen
|
fbdfefe74e
llama : gemma3 : use output tensor if it exists in model weight (#12506)
|
10 months ago |
Georgi Gerganov
|
ba932dfb50
ggml : fix quantized cpy op (#12310)
|
10 months ago |
R0CKSTAR
|
fac63a3d78
musa: refine compute capability (#12493)
|
10 months ago |
Jeff Bolz
|
eddfb43850
vulkan: Optimize mul_mat_vec p021 and nc shaders (#12505)
|
10 months ago |
stduhpf
|
4375415b4a
Vulkan: RTE rounding for cpy to quant (#12480)
|
10 months ago |
Eve
|
30c42ef5cb
vulkan: workaround for AMD Windows driver 16 bit unpack8 bug (#12472)
|
10 months ago |
Georgi Gerganov
|
af04481e6b
model : do not repack if a GPU device is present (#12498)
|
10 months ago |
Sigbjørn Skjæret
|
960e726077
chore : cleanup llama_model_loader::TENSOR_ usage (#12492)
|
10 months ago |
marcoStocchi
|
ea1518e839
llama-tts : avoid crashes related to bad model file paths (#12482)
|
10 months ago |
蕭澧邦
|
1aa87ee53d
[SYCL] Fix build on Windows when ccache enabled (#9954) (#9976)
|
10 months ago |
Svetlozar Georgiev
|
9ffcc9e374
sycl: cleanup oneDNN related code (#12097)
|
10 months ago |
Woof Dog
|
e04643063b
webui : Prevent rerendering on textarea input (#12299)
|
10 months ago |
Sigbjørn Skjæret
|
dbb3a4739e
llama : make Qwen2MoE QKV bias optional (#12477)
|
10 months ago |
Srihari-mcw
|
3d82dbcbce
ggml : block interleaving support for Q4_K quantization for x86 AVX2 architecture (#12332)
|
10 months ago |
Bartowski
|
732b5fbf5e
convert : avoid calls to tokenizer.added_tokens_decoder (#12473)
|
10 months ago |
fairydreaming
|
568013d0cd
context : clear sets containing encoder output sequence ids before storing new values (#12470)
|
10 months ago |
Gaurav Garg
|
517b5ddbf0
CUDA: Improve flash decoding kernel GPU occupancy for BS=1 case (#12183)
|
10 months ago |
Jeff Bolz
|
a9b59288e2
vulkan: optimize iq1 coopmat2 dequant functions (#12427)
|
10 months ago |