Howard Su | 32c5411631 | Revert "Support using mmap when applying LoRA (#2095)" (#2206) | 2 years ago
Howard Su | 2347463201 | Support using mmap when applying LoRA (#2095) | 2 years ago
zrm | b853d45601 | ggml : add NUMA support (#1556) | 2 years ago
kiltyj | 9d0693bce3 | metal : use shared buffers between CPU and GPU (#1696) | 2 years ago
Johannes Gäßler | affc76edfd | cuda : loading models directly into VRAM, norm calculation on GPU, broadcasting for ggml_mul (#1483) | 2 years ago
Maxime | 503db28849 | llama : fix name shadowing and C4146 (#1526) | 2 years ago
Ivan Stepanov | 34d9f22f44 | Wrap exceptions in std::exception to verbose output on exception. (#1316) | 2 years ago
xloem | ea3a0ad6b6 | llama : update stubs for systems without mmap and mlock (#1266) | 2 years ago
slaren | b925f1f1b0 | cuBLAS: fall back to pageable memory if pinned alloc fails (#1233) | 2 years ago
Georgi Gerganov | 84ca9c2ecf | examples : fix save-load-state + rename llama-util.h | 2 years ago