slaren
|
fbe7dfa53c
ggml : add max buffer sizes to opencl and metal backends (#5181)
|
1 year ago |
Paul Tsochantaris
|
d2f650cb5b
metal : free metal objects (#5161)
|
1 year ago |
0cc4m
|
2307523d32
ggml : add Vulkan backend (#2059)
|
1 year ago |
Paul Tsochantaris
|
6dd3c28c9c
metal : remove unused `n_buffers` and `buffers` (#5129)
|
2 years ago |
Georgi Gerganov
|
ddc5a5033f
metal : show compile log messages
|
2 years ago |
Georgi Gerganov
|
26d607608d
metal : disable support for MUL_MAT F32 x F16
|
2 years ago |
Paul Tsochantaris
|
1e605f4102
metal : fix memory leak, dangling pointer and unused autorel (#5007)
|
2 years ago |
Georgi Gerganov
|
c918fe8dca
metal : create autorelease pool during library build (#4970)
|
2 years ago |
Paul Tsochantaris
|
7563293665
metal : remove unnecessary nil check (#4986)
|
2 years ago |
Paul Tsochantaris
|
158f8c9e21
metal : localized logic in `ggml_metal_graph_compute` (#4924)
|
2 years ago |
Alex Azarov
|
3a48d558a6
metal : replace loop of dispatch_async with dispatch_apply (#4934)
|
2 years ago |
Alex Azarov
|
7c8d3abd1a
metal : log `recommendedMaxWorkingSetSize` on iOS 16+ (#4936)
|
2 years ago |
Justine Tunney
|
a0b3ac8c48
ggml : introduce GGML_CALL function annotation (#4850)
|
2 years ago |
Alex Azarov
|
5f5fe1bd60
metal : correctly set SIMD support flags on iOS (#4923)
|
2 years ago |
Georgi Gerganov
|
4be5ef556d
metal : remove old API (#4919)
|
2 years ago |
Georgi Gerganov
|
2d57de5255
metal : disable log for loaded kernels (#4794)
|
2 years ago |
Georgi Gerganov
|
b38b5e93ae
metal : refactor kernel loading code (#4794)
|
2 years ago |
slaren
|
e7e4df031b
llama : ggml-backend integration (#4766)
|
2 years ago |
Kawrakow
|
49662cbed3
ggml : SOTA 2-bit quants (add IQ2_XS) (#4856)
|
2 years ago |
Paul Tsochantaris
|
2a7c94db5f
metal : put encoder debug group behind a define (#4873)
|
2 years ago |
Georgi Gerganov
|
3267c2abc7
metal : fix deprecation warning (ggml/690)
|
2 years ago |
Jack Mousseau
|
5362e43962
metal : wrap each operation in debug group (ggml/690)
|
2 years ago |
Kawrakow
|
dd5ae06405
SOTA 2-bit quants (#4773)
|
2 years ago |
Georgi Gerganov
|
91d38876df
metal : switch back to default.metallib (ggml/681)
|
2 years ago |
Finn Voorhees
|
1bf681f90e
ggml : add error handling to graph_compute (whisper/1714)
|
2 years ago |
Georgi Gerganov
|
289313716f
metal : add kernel_get_rows_i32
|
2 years ago |
Georgi Gerganov
|
f3f62f0d83
metal : optimize ggml_mul_mat_id (faster Mixtral PP) (#4725)
|
2 years ago |
Georgi Gerganov
|
58ba655af0
metal : enable shader debugging (cmake option) (#4705)
|
2 years ago |
Georgi Gerganov
|
afefa319f1
ggml : change ggml_scale to take a float instead of tensor (#4573)
|
2 years ago |
slaren
|
d232aca5a7
llama : initial ggml-backend integration (#4520)
|
2 years ago |