Alessandro98-git
|
96e33a814e
model : fix div-by-zero for Nemotron V2 (#18309)
|
1 month ago |
Ryan Mangeno
|
dfc959b886
model : Granite Embedding support (#15641)
|
1 month ago |
compilade
|
8f48807380
gguf-py : do not align the data start offset (#18291)
|
1 month ago |
Shouyu
|
bf6bc3c155
ggml-hexagon: gelu optimization (#18151)
|
1 month ago |
Xuan-Son Nguyen
|
179fd82a72
gen-docs: automatically update markdown file (#18294)
|
1 month ago |
Taimur Ahmad
|
d34d5ca1e9
llamafile: add rvv support for sgemm kernels (#18199)
|
1 month ago |
lhez
|
eb492bf43f
opencl: unpack q4_0 for adreno in get_tensor (#18278)
|
1 month ago |
Jeff Bolz
|
e3b35ddf1c
vulkan: Extend rope fusions to allow mrope (#18264)
|
1 month ago |
Xuan-Son Nguyen
|
6ce863c803
server: prevent data race from HTTP threads (#18263)
|
1 month ago |
Xuan-Son Nguyen
|
3997c78e33
server: fix data race in to_json_anthropic (#18283)
|
1 month ago |
Mattt
|
ee74642982
release: update release workflow to store XCFramework as Zip file (#18284)
|
1 month ago |
Aaron Teo
|
a28310488c
convert: rework ftype heuristics (#18214)
|
1 month ago |
Xuan-Son Nguyen
|
86af848153
server: (docs) remove mention about extra_args (#18262)
|
1 month ago |
Johannes Gäßler
|
147a521636
tool/ex/tests: consistently free ctx, then model (#18168)
|
1 month ago |
Jeff Bolz
|
e1f15b454f
vulkan: Implement set_tensor_async and the event interfaces (#18047)
|
1 month ago |
Johannes Gäßler
|
0e1ccf15c7
llama: fix RPC for -fit on (#18233)
|
1 month ago |
Xuan-Son Nguyen
|
5e25ddebff
move copilot instructions to AGENTS.md (#18259)
|
1 month ago |
Jeff Bolz
|
fd05c51cec
vulkan: fix im2col overflowing maxworkgroupcount (#18180)
|
1 month ago |
Jeff Bolz
|
b365c3ff01
vulkan/cuda: fix topk_moe with exp_probs_b (#18071)
|
1 month ago |
Jeff Bolz
|
cb64222b0c
vulkan: support GGML_UNARY_OP_XIELU (#18062)
|
1 month ago |
Jeff Bolz
|
6eb7081860
vulkan: in graph_optimize, try to group ADD operations (#18060)
|
1 month ago |
lovedheart
|
4117ae5557
Vulkan: some improvement on mul_mat_iq2_xs (#18031)
|
1 month ago |
Daniel Bevenius
|
65e96a2464
docs : fix links in parsing.md (#18245)
|
1 month ago |
Aldehir Rojas
|
9496bbb808
common : reorganize includes to prioritize vendored deps (#18222)
|
1 month ago |
Xuan-Son Nguyen
|
ddcb75dd8a
server: add auto-sleep after N seconds of idle (#18228)
|
1 month ago |
Jeff Bolz
|
52ab19df63
tests: Avoid floating point precision false positives in SUM (#17471)
|
1 month ago |
Jeff Bolz
|
5182dd64cd
test-backend-ops: improve msvc build time (#18209)
|
1 month ago |
Aadeshveer Singh
|
10b4f82d44
Added comments explaining thread block size selection logic based on row count and column size, derived from historical commit context (#18212)
|
1 month ago |
Oleksandr Kuvshynov
|
408616adbd
server : [easy] fix per round speculative decode logging (#18211)
|
1 month ago |
Xuan-Son Nguyen
|
9e39a1e6a9
server: support load model on startup, support preset-only options (#18206)
|
1 month ago |