Georgi Gerganov
|
940efa95fe
llava : fix tokenization to not add bos between image embeddings and user prompt (#3645)
|
2 years ago |
cebtenzzre
|
11bff29045
MPT : support GQA for replit-code-v1.5 (#3627)
|
2 years ago |
M. Yusuf Sarıgöz
|
11dc1091f6
Honor -ngl option for Cuda offloading in llava (#3621)
|
2 years ago |
Daniel Bevenius
|
2a4bcbacea
llama : remove n_threads from llama_decode_internal (#3614)
|
2 years ago |
slaren
|
424b6381c4
ggml : add context enumeration functions (#3605)
|
2 years ago |
shibe2
|
1e0e873c37
CLBlast: Fix matrix-vector multiplication (#3544)
|
2 years ago |
M. Yusuf Sarıgöz
|
370359e5ba
examples: support LLaVA v1.5 (multimodal model) (#3436)
|
2 years ago |
uint256_t
|
9e24cc6e2e
docs : fix typo GOMP_CPU_AFFINITY (#3597)
|
2 years ago |
Georgi Gerganov
|
d28e572c02
cmake : fix add_compile_options on macOS
|
2 years ago |
Ian Scrivener
|
f3040beaab
typo : it is `--n-gpu-layers` not `--gpu-layers` (#3592)
|
2 years ago |
Georgi Gerganov
|
1a8c8795d6
ci : check if there is enough VRAM (#3596)
|
2 years ago |
Aarni Koskela
|
b016596d90
server : add completion mode (no chat) (#3582)
|
2 years ago |
Georgi Gerganov
|
6b3ae4da92
prompts : add mnemonics.txt
|
2 years ago |
Georgi Gerganov
|
57dd55e2c7
server : fix kv cache management (#3588)
|
2 years ago |
Georgi Gerganov
|
b8fe4b5cc9
main : fix session loading bug (#3400)
|
2 years ago |
Michael Coppola
|
a8bdd65525
server : add parameter -tb N, --threads-batch N (#3584)
|
2 years ago |
Kerfuffle
|
70c29da118
common : fix mirostat state when using multiple sequences (#3543)
|
2 years ago |
Georgi Gerganov
|
8c70a5ff25
batched : add bench tool (#3545)
|
2 years ago |
Zane Shannon
|
24ba3d829e
examples : add batched.swift + improve CI for swift (#3562)
|
2 years ago |
Galunid
|
9f6ede19f3
Add MPT model to supported models in README.md (#3574)
|
2 years ago |
goerch
|
233fc1c69f
Minor improvements in GPT2 tokenizer (#3567)
|
2 years ago |
Xingchen Song(宋星辰)
|
c5b49360d0
readme : add bloom (#3570)
|
2 years ago |
Xingchen Song(宋星辰)
|
02d2875def
llm : add bloom models (#3553)
|
2 years ago |
Jhen-Jie Hong
|
0aa6595ae0
swift : improvements and fixes (#3564)
|
2 years ago |
Jan Ploski
|
f5f9121de1
llm : add MPT support (#3417)
|
2 years ago |
vvhg1
|
11ea5c7d96
infill. : fix tokenization (#3508)
|
2 years ago |
slaren
|
95bd60a0a6
ggml-alloc : fix assert in debug builds (#3555)
|
2 years ago |
Georgi Gerganov
|
fcca0a7004
refact : fix convert script + zero out KV cache to avoid nans (#3523)
|
2 years ago |
Georgi Gerganov
|
dcc09d2596
metal : do not use mul_mm kernels when ne00 < 64 (#3542)
|
2 years ago |
Georgi Gerganov
|
db3abcc114
sync : ggml (ggml-backend) (#3548)
|
2 years ago |