Daniel Illescas Romero | c75ca5d96f | llama.swiftui : use correct pointer for llama_token_eos (#4797) | 2 years ago
Georgi Gerganov | 96e80dabc6 | examples : improve base-translate.sh script (#4783) | 2 years ago
a-n-n-a-l-e-e | eec22a1c63 | cmake : check for openblas64 (#4134) | 2 years ago
Ikko Eltociear Ashimine | be36bb946a | flake.nix : fix typo (#4700) | 2 years ago
Georgi Gerganov | 91d38876df | metal : switch back to default.metallib (ggml/681) | 2 years ago
Georgi Gerganov | d061bf9405 | ggml : fix q2_k bpw in comments (ggml/680) | 2 years ago
Finn Voorhees | 1bf681f90e | ggml : add error handling to graph_compute (whisper/1714) | 2 years ago
Georgi Gerganov | c1d7cb28d3 | ggml : do not sched_yield when calling BLAS (#4761) | 2 years ago
Georgi Gerganov | 3681f22443 | examples : add few-shot translation example (#4783) | 2 years ago
Daniel Bevenius | b3a7c20b5c | finetune : remove unused includes (#4756) | 2 years ago
Georgi Gerganov | 012cf349ae | server : send token probs for "stream == false" (#4714) | 2 years ago
Johannes Gäßler | a91928014f | Print backend name on test-backend-ops failure (#4751) | 2 years ago
singularity | 3c0b585561 | llama.swiftui : support loading custom model from file picker (#4767) | 2 years ago
Michael Coppola | e5804313a1 | server : fix options in README.md (#4765) | 2 years ago
Georgi Gerganov | dc891b7f7a | ggml : include stdlib.h before intrin.h (#4736) | 2 years ago
singularity | 46cea79e1f | llama.swiftui : fix build of ggml.metallib (#4754) | 2 years ago
Daniel Bevenius | cb1e2818e0 | train : fix typo in overlapping-samples help msg (#4758) | 2 years ago
Ashraful Islam | ece9a45e8f | swift : update Package.swift to use ggml as dependency (#4691) | 2 years ago
Georgi Gerganov | 7bed7eba35 | cuda : simplify expression | 2 years ago
Georgi Gerganov | d55356d3ba | cuda : mark I16 and I32 ops as unsupported | 2 years ago
Georgi Gerganov | 75e3fd8581 | sync : ggml | 2 years ago
Georgi Gerganov | 289313716f | metal : add kernel_get_rows_i32 | 2 years ago
Georgi Gerganov | ab62fc3e55 | scripts : fix sync order + metal sed | 2 years ago
Guillaume Wenzek | 5f66ebca9c | ggml : extend ggml_get_rows, ggml_repeat, ggml_concat (ggml/639) | 2 years ago
Justin Parker | f2eb19bd8b | server : throw an error when `slot unavailable` (#4741) | 2 years ago
Georgi Gerganov | f3f62f0d83 | metal : optimize ggml_mul_mat_id (faster Mixtral PP) (#4725) | 2 years ago
Phil H | 0ef3ca2ac6 | server : add token counts to html footer (#4738) | 2 years ago
Georgi Gerganov | 540938f890 | llama : llama_model_desc print number of experts | 2 years ago
Marcus Dunn | 0040d42eeb | llama : replace all API facing `int`'s with `int32_t` (#4577) | 2 years ago
postmasters | 83e633c27e | llama : differentiate the KV dims in the attention (#4657) | 2 years ago