Sigbjørn Skjæret | 86a3f0fad8 | ggml : allow fill node alloc inplace (#17870) | 1 month ago
Diego Devesa | e072b2052e | ggml : add GGML_SCHED_NO_REALLOC option to disable reallocations in ggml_backend_sched (#17276) | 2 months ago
Acly | 3470a5c891 | ggml-alloc : make gallocr prefer chunks that allow memory reuse (#16788) | 3 months ago
Diego Devesa | b617cfd289 | ggml-alloc : fix leak when reusing a tensor with a larger size (#16679) | 3 months ago
Acly | 638d330246 | ggml : fix graph reallocation with multiple chunks (#16396) | 3 months ago
Acly | f2a789e334 | ggml : split graph allocations according to backend max buffer size (#15815) | 4 months ago
Georgi Gerganov | fd1234cb46 | llama : add gpt-oss (#15091) | 5 months ago
Georgi Gerganov | bf9087f59a | metal : fuse add, mul + add tests (#14596) | 6 months ago
Jesse Gross | f057808ffa | ggml: Don't assert fail when tensor data changes (#13222) | 9 months ago
William Tambellini | 70680c48e5 | ggml : upgrade init_tensor API to return a ggml_status (#11854) | 11 months ago
Jeff Bolz | 1b598b3058 | vulkan: use smaller combined allocations to avoid fragmentation (#11551) | 11 months ago
Johannes Gäßler | 9c8dcefe17 | CUDA: backwards pass for misc. ops, add tests (#11257) | 1 year ago
Daniel Bevenius | 130d0c90bd | ggml : remove return from ggml_gallocr_allocate_node (ggml/1048) | 1 year ago
Johannes Gäßler | 8a43e940ab | ggml: new optimization interface (ggml/988) | 1 year ago
Daniel Bevenius | cd60b88bf7 | ggml-alloc : remove buffer_id from leaf_alloc (ggml/987) | 1 year ago
Diego Devesa | 96776405a1 | ggml : move more prints to the ggml log system (#9839) | 1 year ago
slaren | d09770cae7 | ggml-alloc : fix list of allocated tensors with GGML_ALLOCATOR_DEBUG (#9573) | 1 year ago
slaren | 2b1f616b20 | ggml : reduce hash table reset cost (#8698) | 1 year ago
Johannes Gäßler | a15ef8f8a0 | CUDA: fix partial offloading for ne0 % 256 != 0 (#8572) | 1 year ago
Georgi Gerganov | f3f65429c4 | llama : reorganize source code + improve CMake (#8006) | 1 year ago