Kawrakow
|
49662cbed3
ggml : SOTA 2-bit quants (add IQ2_XS) (#4856)
|
2 years ago |
Paul Tsochantaris
|
2a7c94db5f
metal : put encoder debug group behind a define (#4873)
|
2 years ago |
Georgi Gerganov
|
3267c2abc7
metal : fix deprecation warning (ggml/690)
|
2 years ago |
Jack Mousseau
|
5362e43962
metal : wrap each operation in debug group (ggml/690)
|
2 years ago |
Kawrakow
|
dd5ae06405
SOTA 2-bit quants (#4773)
|
2 years ago |
Georgi Gerganov
|
91d38876df
metal : switch back to default.metallib (ggml/681)
|
2 years ago |
Finn Voorhees
|
1bf681f90e
ggml : add error handling to graph_compute (whisper/1714)
|
2 years ago |
Georgi Gerganov
|
289313716f
metal : add kernel_get_rows_i32
|
2 years ago |
Georgi Gerganov
|
f3f62f0d83
metal : optimize ggml_mul_mat_id (faster Mixtral PP) (#4725)
|
2 years ago |
Georgi Gerganov
|
58ba655af0
metal : enable shader debugging (cmake option) (#4705)
|
2 years ago |
Georgi Gerganov
|
afefa319f1
ggml : change ggml_scale to take a float instead of tensor (#4573)
|
2 years ago |
slaren
|
d232aca5a7
llama : initial ggml-backend integration (#4520)
|
2 years ago |
Georgi Gerganov
|
4d98d9a656
sync : ggml (SD ops, tests, kernels) (#4444)
|
2 years ago |
slaren
|
799a1cb13b
llama : add Mixtral support (#4406)
|
2 years ago |
Georgi Gerganov
|
fe680e3d10
sync : ggml (new ops, tests, backend, etc.) (#4359)
|
2 years ago |
Georgi Gerganov
|
bcc0eb4591
llama : per-layer KV cache + quantum K cache (#4309)
|
2 years ago |
Georgi Gerganov
|
d7b800b8bc
llama : pad KV cache size (#4280)
|
2 years ago |
Georgi Gerganov
|
ef47ec18da
ggml : add ggml_soft_max_ext (#4256)
|
2 years ago |
Xiao-Yong Jin
|
22da05536f
metal : fix yarn (#4220)
|
2 years ago |
Georgi Gerganov
|
4f447a4833
llama : fix data units (#4101)
|
2 years ago |
Georgi Gerganov
|
3d68f364f1
ggml : sync (im2col, GPU conv, 32-bit arm compat) (#4060)
|
2 years ago |
Georgi Gerganov
|
4760e7cc0b
sync : ggml (backend v2) (#3912)
|
2 years ago |
Peter Sugihara
|
d9b33fe95b
metal : round up to 16 to fix MTLDebugComputeCommandEncoder assertion (#3938)
|
2 years ago |
Xiao-Yong Jin
|
5ba3746171
ggml-metal: fix yarn rope (#3937)
|
2 years ago |
Georgi Gerganov
|
183b3fac6c
metal : fix build errors and kernel sig after #2268 (#3898)
|
2 years ago |
cebtenzzre
|
898aeca90a
llama : implement YaRN RoPE scaling (#2268)
|
2 years ago |
Georgi Gerganov
|
e16b9fa4ba
metal : multi-simd softmax (#3710)
|
2 years ago |
Georgi Gerganov
|
71e3718abd
llama : refactor graph build code (#3837)
|
2 years ago |
Aarni Koskela
|
82a6646e02
metal : try cwd for ggml-metal.metal if bundle lookup fails (#3793)
|
2 years ago |
Georgi Gerganov
|
469c9addef
metal : handle ggml_scale for n%4 != 0 (close #3754)
|
2 years ago |