Law Po Ying | d9e03db1e7 | sycl: add missing BF16 conversion support for Intel oneAPI (#17780) | 1 month ago
Jeff Bolz | db97837385 | vulkan: perf_logger improvements (#17672) | 1 month ago
Vishal Singh | 017761daf5 | ggml-zendnn : add ZenDNN backend for AMD CPUs (#17690) | 1 month ago
Xuan-Son Nguyen | c42712b056 | server: support multiple generations from one prompt (OAI "n" option) (#17775) | 1 month ago
Phylliida Dev | 09c7c50e64 | ggml : add circular tiling support to pad, for Vulkan, CUDA, and CPU (used for making seamless textures) (#16985) | 1 month ago
Johannes Gäßler | f334b79494 | HIP: fix RDNA3 FP16/BF16 matrix multiplication (#17817) | 1 month ago
Aleksander Grygier | a28e3c7567 | webui: Stop generation from chat sidebar (#17806) | 1 month ago
Aleksander Grygier | e31b5c55c3 | webui: Fix context available value in Multi-model Router mode (#17804) | 1 month ago
Aleksander Grygier | 21f24f27a9 | webui: Per-conversation system message with UI displaying, edition & branching (#17275) | 1 month ago
Sky | 7b43f55753 | ggml : improve error handling for search path existence checks (#17653) | 1 month ago
Daniel Bevenius | 444f00b0ec | llama : remove quantization sanity check (#17788) | 1 month ago
Jeff Bolz | 2960eb2975 | vulkan: Use one row per workgroup for f32 mmv (#17711) | 1 month ago
Xuan-Son Nguyen | dbc15a7967 | convert: support Mistral 3 Large MoE (#17730) | 1 month ago
Jeff Bolz | c6c5e85979 | vulkan: support solve_tri with larger N/K values (#17781) | 1 month ago
Georgi Gerganov | 8e5f4987b1 | contrib : stale PRs (#17803) | 1 month ago
Georgi Gerganov | 8ce774a102 | metal : fix build(#17799) | 1 month ago
Masato Nakasaka | 67788f6846 | vulkan: Replace deprecated VK_EXT_validation_features (#17637) | 1 month ago
Masato Nakasaka | d8c0a7b085 | vulkan: Fix mismatch in TOPK_MOE unit test (#17541) | 1 month ago
Jeff Bolz | 933414c0b6 | vulkan: add more num_blocks instantiations in rms_norm (#17701) | 1 month ago
Jeff Bolz | a0f3897d53 | vulkan: fix top_k bug when there are ties in the input (#17659) | 1 month ago
Acly | e15cd06a94 | vulkan : support conv-2d with large output size (#17685) | 1 month ago
Reese Levine | fd57b24c0f | ggml webgpu: unary op suppport, code refactoring, ops support (#17764) | 1 month ago
Jeff Bolz | 6ab0d64960 | vulkan: enable mmvq for q2_k on NVIDIA (#17675) | 1 month ago
Jeff Bolz | 93bb92664e | vulkan: set all memory allocations to high priority (#17624) | 1 month ago
Georgi Gerganov | 8160b38a5f | rpc : fix alloc size logic (#17116) | 1 month ago
Georgi Gerganov | c41bde6fbd | metal : add residency sets keep-alive heartbeat (#17766) | 1 month ago
Johannes Gäßler | 6016d0bd41 | HIP : fix RDNA4 build (#17792) | 1 month ago
Pascal | 1be97831e4 | fix: prevent segfault in tokenizer on highly repetitive input (#17786) | 1 month ago
Adrien Gallouët | a6cfc212ed | ci : fix winget workflow (#17790) | 1 month ago
shalinib-ibm | 3a0d10533a | Q4/Q8 Tiled Gemm Optimization. (#16999) | 1 month ago