Georgi Gerganov
|
b9cc76d87e
ggml : fix ggml_backend_cpu_supports_op() for CPY (#0)
|
1 year ago |
slaren
|
280345968d
cuda : rename build flag to LLAMA_CUDA (#6299)
|
1 year ago |
slaren
|
5e1b7f94a0
backend : set max split inputs to GGML_MAX_SRC (#6137)
|
1 year ago |
slaren
|
2bf8d0f7c4
backend : offload large batches to GPU (#6083)
|
1 year ago |
slaren
|
f30ea47a87
llama : add pipeline parallelism support (#6017)
|
1 year ago |
Michael Podvitskiy
|
9fa2627347
ggml : introduce ggml_status (ggml/750)
|
1 year ago |
UEXTM.com
|
5f70671856
Introduce backend GUIDs (ggml/743)
|
1 year ago |
Kawrakow
|
bd2d4e393b
1.5 bit quantization (#5453)
|
1 year ago |
Georgi Gerganov
|
8f1be0d42f
ggml : add ALiBi support for ggml_soft_max_ext (#5488)
|
1 year ago |
Ananta Bastola
|
6e4e973b26
ci : add an option to fail on compile warning (#3952)
|
1 year ago |
AT
|
f5ca054855
Early return for zero size calls to get_tensor. (#5482)
|
1 year ago |
Georgi Gerganov
|
3b169441df
sync : ggml (#5452)
|
1 year ago |
Michael Podvitskiy
|
4633d93af0
ggml : add abort_callback for cpu backend (ggml/725)
|
1 year ago |
Jared Van Bortel
|
fbf1ddec69
Nomic Vulkan backend (#4456)
|
2 years ago |
0cc4m
|
2307523d32
ggml : add Vulkan backend (#2059)
|
2 years ago |
Abhilash Majumder
|
0f648573dd
ggml : add unified SYCL backend for Intel GPUs (#2690)
|
2 years ago |
slaren
|
62fead3ea0
cuda : fix tensor size calculation for non-split buffer (#5145)
|
2 years ago |
slaren
|
6df465a91d
llama : run all KQV ops on the CPU with no KV offload (#5049)
|
2 years ago |
Georgi Gerganov
|
38566680cd
ggml : add IQ2 to test-backend-ops + refactoring (#4990)
|
2 years ago |
Georgi Gerganov
|
44a1a4a41a
backend : add eval callback (#4935)
|
2 years ago |
Justine Tunney
|
a0b3ac8c48
ggml : introduce GGML_CALL function annotation (#4850)
|
2 years ago |
slaren
|
fa5c1fb44a
backend_sched : fix assignments
|
2 years ago |
slaren
|
e7e4df031b
llama : ggml-backend integration (#4766)
|
2 years ago |
Finn Voorhees
|
1bf681f90e
ggml : add error handling to graph_compute (whisper/1714)
|
2 years ago |
bssrdf
|
afc8c19291
ggml : fix some mul mat cases + add tests for src1 F16 (ggml/669)
|
2 years ago |
slaren
|
5bf3953d7e
cuda : improve cuda pool efficiency using virtual memory (#4606)
|
2 years ago |
slaren
|
d232aca5a7
llama : initial ggml-backend integration (#4520)
|
2 years ago |
Georgi Gerganov
|
fe680e3d10
sync : ggml (new ops, tests, backend, etc.) (#4359)
|
2 years ago |
Georgi Gerganov
|
4760e7cc0b
sync : ggml (backend v2) (#3912)
|
2 years ago |
Georgi Gerganov
|
db3abcc114
sync : ggml (ggml-backend) (#3548)
|
2 years ago |