Damian Stewart
|
381efbf480
llava : expose as a shared library for downstream projects (#3613)
|
2 years ago |
cebtenzzre
|
b12fa0d1c1
build : link against build info instead of compiling against it (#3879)
|
2 years ago |
cebtenzzre
|
2046eb4345
make : remove unnecessary dependency on build-info.h (#3842)
|
2 years ago |
Georgi Gerganov
|
d69d777c02
ggml : quantization refactoring (#3833)
|
2 years ago |
Georgi Gerganov
|
2f9ec7e271
cuda : improve text-generation and batched decoding performance (#3776)
|
2 years ago |
Georgi Gerganov
|
e3932593d4
Revert "make : add optional CUDA_NATIVE_ARCH (#2482)"
|
2 years ago |
Alex
|
96981f37b1
make : add optional CUDA_NATIVE_ARCH (#2482)
|
2 years ago |
Georgi Gerganov
|
438c2ca830
server : parallel decoding and multimodal (#3677)
|
2 years ago |
Georgi Gerganov
|
d1031cf49c
sampling : refactor init to use llama_sampling_params (#3696)
|
2 years ago |
Georgi Gerganov
|
0e89203b51
speculative : add tree-based sampling example (#3624)
|
2 years ago |
M. Yusuf Sarıgöz
|
370359e5ba
examples: support LLaVA v1.5 (multimodal model) (#3436)
|
2 years ago |
Kerfuffle
|
70c29da118
common : fix mirostat state when using multiple sequences (#3543)
|
2 years ago |
Georgi Gerganov
|
8c70a5ff25
batched : add bench tool (#3545)
|
2 years ago |
Zane Shannon
|
24ba3d829e
examples : add batched.swift + improve CI for swift (#3562)
|
2 years ago |
Georgi Gerganov
|
db3abcc114
sync : ggml (ggml-backend) (#3548)
|
2 years ago |
goerch
|
ff5a3f0c09
Work on the BPE tokenizer (#3252)
|
2 years ago |
vvhg1
|
c97f01c362
infill : add new example + extend server API (#3296)
|
2 years ago |
Cebtenzzre
|
bc39553c90
build : enable more non-default compiler warnings (#3200)
|
2 years ago |
xaedes
|
0e76a8992c
train : finetune LORA (#2632)
|
2 years ago |
Georgi Gerganov
|
ec893798b7
llama : custom attention mask + parallel decoding + no context swaps (#3228)
|
2 years ago |
Jag Chadha
|
527e57cfd8
build : add ACCELERATE_NEW_LAPACK to fix warning on macOS Sonoma (#3342)
|
2 years ago |
Cebtenzzre
|
8781013ef6
make : restore build-info.h dependency for several targets (#3205)
|
2 years ago |
Johannes Gäßler
|
111163e246
CUDA: enable peer access between devices (#2470)
|
2 years ago |
Vlad
|
5dbc2b3213
Enable build with CUDA 11.0 (make) (#3132)
|
2 years ago |
Cebtenzzre
|
e6616cf0db
examples : add compiler version and target to build info (#2998)
|
2 years ago |
Cebtenzzre
|
3aefaab9e5
check C++ code with -Wmissing-declarations (#3184)
|
2 years ago |
Cebtenzzre
|
4b8560e72a
make : fix clang++ detection, move some definitions to CPPFLAGS (#3155)
|
2 years ago |
goerch
|
71ca2fad7d
whisper : tokenizer fix + re-enable tokenizer test for LLaMa (#3096)
|
2 years ago |
Johannes Gäßler
|
0a5eebb45d
CUDA: mul_mat_q RDNA2 tunings (#2910)
|
2 years ago |
Przemysław Pawełczyk
|
cb6c44c5e0
build : do not use _GNU_SOURCE gratuitously (#2035)
|
2 years ago |