Jeff Bolz
|
da95bf2a85
vulkan: support noncontig i32 copy (#17328)
|
2 months ago |
Xuan-Son Nguyen
|
0de8878c96
server: split HTTP into its own interface (#17216)
|
2 months ago |
Ruben Ortlam
|
38e2c1b412
vulkan: add log RTE support to fix Nvidia CI (#17320)
|
2 months ago |
Adrien Gallouët
|
cb44fc84e8
cmake : fix ARM feature verification (#17170)
|
2 months ago |
Adrien Gallouët
|
cb623de3fc
ggml : add missing AVX512 feature checks (#17270)
|
2 months ago |
Georgi Gerganov
|
7aaeedc098
metal : support I32 -> I32 copy (#17317)
|
2 months ago |
Georgi Gerganov
|
3347e6d904
metal : faster argsort (#17315)
|
2 months ago |
Georgi Gerganov
|
1a139644a8
metal : add cumsum (#17305)
|
2 months ago |
hipudding
|
2376b7758c
CANN: Use smart pointers to manage ACL objects (#17238)
|
2 months ago |
Pavels Zaicenkovs
|
dbed61294a
vulkan: add LOG operation support for F32 and F16 (#17183)
|
2 months ago |
Ruben Ortlam
|
80deff3648
vulkan: fix MMQ quantize_y condition (#17301)
|
2 months ago |
Eve
|
8b1c339bd2
ci : revert #16249 (#17303)
|
2 months ago |
Georgi Gerganov
|
416e7c7f47
metal : remove obsolete asserts (#17295)
|
2 months ago |
Georgi Gerganov
|
5b2093becc
server : handle context overflow during decode (#17267)
|
2 months ago |
lhez
|
52e5d421f1
opencl: fix rms_norm_mul (#17250)
|
2 months ago |
shaofeiqi
|
4db5641210
opencl: add kernel to handle mat mul in attention to improve encoding speed (#17181)
|
2 months ago |
shani-f
|
72bd7321a7
sycl : unify unary kernels with a generic implementation and enable wide operator support (#17213)
|
2 months ago |
Aleksander Grygier
|
22e1ce2f81
webui: Fix clickability around chat processing statistics UI (#17278)
|
2 months ago |
Pascal
|
1411d9275a
webui: add OAI-Compat Harmony tool-call streaming visualization and persistence in chat UI (#16618)
|
2 months ago |
Sigbjørn Skjæret
|
662192e1dc
convert : remove unnecessary chat template patching (#17289)
|
2 months ago |
Jeff Bolz
|
24dc769f1b
vulkan: Fuse mul_mat_id+add_id+mul and mul_mat+add+add. (#17287)
|
2 months ago |
Ruben Ortlam
|
4dca015b7e
vulkan: Replace 16-bit unpack8 calls to work around legacy Windows AMD driver bug (#17285)
|
2 months ago |
Sigbjørn Skjæret
|
9a8860cf5d
convert : use all parts in safetensors index (#17286)
|
2 months ago |
Sigbjørn Skjæret
|
9d3ef4809f
convert : set expert gating func in base class (#17279)
|
2 months ago |
Ankur Verma
|
c7b7db0445
mtmd-cli: Avoid logging to stdout for model loading messages in mtmd-cli (#17277)
|
2 months ago |
Giuseppe Scrivano
|
1568d13c2c
vulkan: implement ABS and NEG (#17245)
|
2 months ago |
Jeff Bolz
|
439342ea0b
vulkan: Use ggml_vk_tensor_subbuffer in mul_mat_vec(id) paths (#17244)
|
2 months ago |
Jeff Bolz
|
234ae7d7bd
vulkan: skip all-negative-inf blocks in FA (#17186)
|
2 months ago |
Jeff Bolz
|
38eaf32af1
vulkan: change graph_compute to be async and enable get_tensor_async (#17158)
|
2 months ago |
Xuan-Son Nguyen
|
9b17d74ab7
mtmd: add mtmd_log_set (#17268)
|
2 months ago |