Author | Commit | Message | Date
--- | --- | --- | ---
vxiiduu | f31b539714 | Enhance Windows 7 and below compatibility. (#2592) | 2 years ago
Johannes Gäßler | 3d9a551816 | Fixed mmap prefetch for GPU offloading (#2529) | 2 years ago
l3utterfly | 415e99fec2 | Stream save llama context data to file instead of allocating entire buffer upfront (#2488) | 2 years ago
Howard Su | 32c5411631 | Revert "Support using mmap when applying LoRA (#2095)" (#2206) | 2 years ago
Howard Su | 2347463201 | Support using mmap when applying LoRA (#2095) | 2 years ago
zrm | b853d45601 | ggml : add NUMA support (#1556) | 2 years ago
kiltyj | 9d0693bce3 | metal : use shared buffers between CPU and GPU (#1696) | 2 years ago
Johannes Gäßler | affc76edfd | cuda : loading models directly into VRAM, norm calculation on GPU, broadcasting for ggml_mul (#1483) | 2 years ago
Maxime | 503db28849 | llama : fix name shadowing and C4146 (#1526) | 2 years ago
Ivan Stepanov | 34d9f22f44 | Wrap exceptions in std::exception to verbose output on exception. (#1316) | 2 years ago
xloem | ea3a0ad6b6 | llama : update stubs for systems without mmap and mlock (#1266) | 2 years ago
slaren | b925f1f1b0 | cuBLAS: fall back to pageable memory if pinned alloc fails (#1233) | 2 years ago
Georgi Gerganov | 84ca9c2ecf | examples : fix save-load-state + rename llama-util.h | 2 years ago
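Several of these commits concern host-memory management. As an illustration of the pattern named in commit b925f1f1b0 (#1233), here is a minimal sketch, not the actual llama.cpp code, of attempting a pinned (page-locked) host allocation and falling back to ordinary pageable memory when it fails; the helper names `host_alloc_with_fallback` and `host_free` are hypothetical.

```c
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <cuda_runtime.h>

/* Hypothetical helper sketching the fallback from #1233:
 * pinned memory speeds up host<->device transfers, but the
 * allocation can fail (e.g. under memory pressure), in which
 * case plain pageable memory from malloc still works. */
static void *host_alloc_with_fallback(size_t size, bool *pinned) {
    void *ptr = NULL;
    if (cudaMallocHost(&ptr, size) == cudaSuccess) {
        *pinned = true;
        return ptr;
    }
    cudaGetLastError(); /* clear the sticky CUDA error state */
    fprintf(stderr,
            "warning: pinned alloc of %zu bytes failed, using pageable memory\n",
            size);
    *pinned = false;
    return malloc(size); /* may still be NULL if the host is out of memory */
}

/* Free with the matching deallocator for how the buffer was obtained. */
static void host_free(void *ptr, bool pinned) {
    if (pinned) {
        cudaFreeHost(ptr);
    } else {
        free(ptr);
    }
}
```

The caller must track the `pinned` flag so the buffer is released with the matching deallocator; mixing `free` and `cudaFreeHost` is undefined behavior.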