Jack Mousseau | 5f14ee0b0c | metal : add debug capture backend function (ggml/694) | 2 years ago
Paul Tsochantaris | 158f8c9e21 | metal : localized logic in `ggml_metal_graph_compute` (#4924) | 2 years ago
Justine Tunney | a0b3ac8c48 | ggml : introduce GGML_CALL function annotation (#4850) | 2 years ago
Georgi Gerganov | 4be5ef556d | metal : remove old API (#4919) | 2 years ago
Finn Voorhees | 1bf681f90e | ggml : add error handling to graph_compute (whisper/1714) | 2 years ago
slaren | d232aca5a7 | llama : initial ggml-backend integration (#4520) | 2 years ago
Georgi Gerganov | fe680e3d10 | sync : ggml (new ops, tests, backend, etc.) (#4359) | 2 years ago
Georgi Gerganov | 3d68f364f1 | ggml : sync (im2col, GPU conv, 32-bit arm compat) (#4060) | 2 years ago
Georgi Gerganov | db3abcc114 | sync : ggml (ggml-backend) (#3548) | 2 years ago
Rickard Hallerbäck | dc6897404e | metal : reusing llama.cpp logging (#3152) | 2 years ago
Georgi Gerganov | f55538c3cc | metal : fix memory leak (#2762) | 2 years ago
Georgi Gerganov | 6381d4e110 | gguf : new file format with flexible meta data (beta) (#2398) | 2 years ago
Shouzheng Liu | fc8ef549e5 | metal : enable ggml-alloc (#2627) | 2 years ago
Shouzheng Liu | 1aa18ef994 | metal : concurrently dispatch commands (#2358) | 2 years ago
Qingyou Meng | 1d656d6360 | ggml : change ggml_graph_compute() API to not require context (#1999) | 2 years ago
Georgi Gerganov | ce2c7d72e2 | metal : handle buffers larger than device's maxBufferLength (#1826) | 2 years ago
Georgi Gerganov | 4bfcc855ab | metal : parallel command buffer encoding (#1860) | 2 years ago
Georgi Gerganov | ecb217db4f | llama : Metal inference (#1642) | 2 years ago