Johannes Gäßler
|
202084d31d
tests: add gradient tests for all backends (ggml/932)
|
1 year ago |
slaren
|
4db04784f9
cuda : fix defrag with quantized KV (#9319)
|
1 year ago |
Faisal Zaghloul
|
42c76d1358
Threadpool: take 2 (#8672)
|
1 year ago |
Nico Bosshard
|
e3f6fd56b1
ggml : dynamic ggml_sched_max_splits based on graph_size (#9047)
|
1 year ago |
slaren
|
be55695eff
ggml-backend : fix async copy from CPU (#8897)
|
1 year ago |
slaren
|
2b1f616b20
ggml : reduce hash table reset cost (#8698)
|
1 year ago |
Johannes Gäßler
|
a15ef8f8a0
CUDA: fix partial offloading for ne0 % 256 != 0 (#8572)
|
1 year ago |
hipudding
|
1bdd8ae19f
[CANN] Add Ascend NPU backend (#6035)
|
1 year ago |
Chen Xi
|
b549a1bbef
[SYCL] fix the mul_mat_id ut issues (#8427)
|
1 year ago |
Georgi Gerganov
|
f3f65429c4
llama : reorganize source code + improve CMake (#8006)
|
1 year ago |