Jeff Bolz
|
aa0c461efe
vulkan: fix failing dequant shaders (#15862)
|
4 months ago |
Jeff Bolz
|
b9c9c9f789
vulkan: initialize vulkan-hpp to allow using extension function pointers (#15705)
|
4 months ago |
Diego Devesa
|
50f4281a6f
llama : allow using iGPUs with --device (#15951)
|
4 months ago |
Georgi Gerganov
|
55758b00ca
metal : refactor kernel loading (#15964)
|
4 months ago |
Georgi Gerganov
|
f161463a54
metal : allow ops to run concurrently (#15929)
|
4 months ago |
Georgi Gerganov
|
84d7b2fca1
metal : fix memory leaks (#15962)
|
4 months ago |
Aaron Teo
|
40be51152d
ggml-zdnn: fix #15414, activate FP16 and BF16 acceleration and incorrect zTensor free (#15839)
|
4 months ago |
Eric Curtin
|
4bf5549269
Add docker protocol support for llama-server model loading (#15790)
|
4 months ago |
Haiyue Wang
|
f4e664f838
context : remove redundant explicit casting to the same type (#15948)
|
4 months ago |
Georgi Gerganov
|
f088b6a84f
server : adjust prompt similarity thold + add logs (#15913)
|
4 months ago |
Ruben Ortlam
|
304ac5693d
Vulkan iGPU device selection overhaul and PCI ID API support (#15947)
|
4 months ago |
Mathieu Baudier
|
6c88ad8fa7
vulkan: Make device memory check more portable (#15939)
|
4 months ago |
Neo Zhang Jianyu
|
704d90c987
Revert "sycl: add usage of enqueue_functions extension (#14244)" (#15910)
|
4 months ago |
Diego Devesa
|
360d6533db
ggml-backend : add GGML_BACKEND_DEVICE_TYPE_IGPU device type (#15797)
|
4 months ago |
Johannes Gäßler
|
0e6ff0046f
CUDA: larger SRAM reads for tile FA, AMD FP16 dot (#15927)
|
4 months ago |
ddh0
|
df082f5630
nitpick : correct MB to MiB (#15934)
|
4 months ago |
Daniel Bevenius
|
24a6734daf
ggml-cpu : add check for ARM MATMUL_INT8/i8mm support (#15922)
|
4 months ago |
Charles Xu
|
2b3efea9a4
kleidiai: fix GGML_ASSERT(*cur_backend_id != -1) failed (#15614)
|
4 months ago |
hipudding
|
c0389dba43
CANN: Disable acl_graph for prefill stage (#15933)
|
4 months ago |
Oliver Simons
|
00681dfc16
CUDA: Add `fastdiv` to `k_bin_bcast*`, giving 1-3% E2E performance (#15872)
|
4 months ago |
Jie Fu (傅杰)
|
4f658855fa
llama : support T5 models with unequal number of encoder-decoder layers (#15909)
|
4 months ago |
Sigbjørn Skjæret
|
6ab397e12b
graph : support non-contiguous Q in build_attn_mha (#15908)
|
4 months ago |
Daniel Bevenius
|
9de447d94e
ggml-cpu : fix padding in ggml_timestep_embedding (#15917)
|
4 months ago |
Georgi Gerganov
|
0f0a3c2851
metal : make the backend async (#15906)
|
4 months ago |
Daniel Bevenius
|
33daece86b
ci : add caching for ROCm installation in release workflow (#15924)
|
4 months ago |
Daniel Bevenius
|
e7b6d83b52
tests : filter out no-ops from coverage report (#15900)
|
4 months ago |
j-k
|
2cfef4d117
media : add transparent icon svg and png [no ci] (#15891)
|
4 months ago |
Jesse
|
09e72a037c
gitignore : Ignore vim swap files in tests (#15901)
|
4 months ago |
Chenguang Li
|
10d8b2b6b0
CANN: Add ROPE sin/cos cache for reuse (#15912)
|
4 months ago |
Chenguang Li
|
28b5f190ef
CANN: implement LRU cache for ACL graphs (#15814)
|
4 months ago |