Ervin Áron Tasnádi | a979ca22db | ggml: adds CONV_2D op and direct GEMM Vulkan implementation (#14316) | 6 months ago
compilade | 90083283ec | imatrix : use GGUF to store importance matrices (#9400) | 6 months ago
Peter0x44 | d4b91ea7b2 | vulkan: Add logging for bf16 features to ggml_vk_print_gpu_info (#13274) (#14707) | 6 months ago
0cc4m | 83f5872404 | Vulkan: Fix fprintf format-security warning (#14770) | 6 months ago
rspOverflow | f0d4d176df | Documentation: Update build.md's Vulkan section (#14736) | 6 months ago
Georgi Gerganov | b17230917c | sync : ggml | 6 months ago
Georgi Gerganov | bf9087f59a | metal : fuse add, mul + add tests (#14596) | 6 months ago
Georgi Gerganov | 9fb1042ce6 | graph : fix graph reuse reset of params (#14760) | 6 months ago
Georgi Gerganov | 2adf8d83ac | parallel : add option for different RNG seeds (#14757) | 6 months ago
Oliver Simons | 021cc28bef | cuda : Fix Gemma3n not executed as CUDA_GRAPH on NVGPUs (#14741) | 6 months ago
Georgi Gerganov | d498af3d5a | graph : avoid huge warm-up graphs for MoE models (#14753) | 6 months ago
Georgi Gerganov | eacdeb5bfc | model : fix build after merge conflict (#14754) | 6 months ago
lgai-exaone | e0cb5c5cb8 | model : add EXAONE 4.0 support (#14630) | 6 months ago
Aman Gupta | f9a31eea06 | CUDA: set_rows + cpy.cu refactor (#14712) | 6 months ago
Georgi Gerganov | 8f974bc1e9 | graph : refactor context to not pass gf explicitly (#14629) | 6 months ago
Nexes the Elder | 09651d09ff | graph : Pass the graph placeholder message in debug mode (#14748) | 6 months ago
Neo Zhang Jianyu | 349ea79fce | use max work group size for device to replace the magic number (#14732) | 6 months ago
Piotr Wilkin (ilintar) | 670e1360cd | convert : fix Ernie4.5 MoE without shared experts (#14746) | 6 months ago
Wroclaw | 760b4484e3 | nix : use optionalAttrs for env mkDerivation attrset argument (#14726) | 6 months ago
Piotr Wilkin (ilintar) | cb887f1bc1 | model: add Ernie 4.5 MoE support (#14658) | 6 months ago
Georgi Gerganov | d6fb3f6b49 | kv-cache : fix k-shift for multiple streams (#14742) | 6 months ago
Georgi Gerganov | 01612b7409 | llama : reuse compute graphs (#14482) | 6 months ago
Tarek Dakhran | 086cf81e88 | llama : fix parallel processing for lfm2 (#14705) | 6 months ago
Georgi Gerganov | d9b691081c | kv-cache : opt mask set input (#14600) | 6 months ago
Georgi Gerganov | ad57d3edd2 | batch : fix uninitialized has_cpl flag (#14733) | 6 months ago
Sigbjørn Skjæret | 1ba45d4982 | ci : disable failing vulkan crossbuilds (#14723) | 6 months ago
Sigbjørn Skjæret | 19e5943d9e | convert : make hf token optional (#14717) | 6 months ago
Diner Burger | 496957e1cb | llama : fix parameter order for hybrid memory initialization (#14725) | 6 months ago
Reese Levine | 21c021745d | ggml: Add initial WebGPU backend (#14521) | 6 months ago
tempstudio | b0f0ecc3dc | model : support output bias for qwen2 (#14711) | 6 months ago