Georgi Gerganov
|
cc44877486
log : disable pid in log filenames
|
2 years ago |
cebtenzzre
|
ad93962657
server : add parameter -tb N, --threads-batch N (#3584) (#3768)
|
2 years ago |
Georgi Gerganov
|
1717521cdb
server : do not block system prompt update (#3767)
|
2 years ago |
Georgi Gerganov
|
b2f7e04bd3
sync : ggml (conv ops + cuda MSVC fixes) (#3765)
|
2 years ago |
John Smith
|
abd21fc99f
cmake : add missed dependencies (#3763)
|
2 years ago |
Georgi Gerganov
|
2b4ea35e56
cuda : add batched cuBLAS GEMM for faster attention (#3749)
|
2 years ago |
Galunid
|
daab3d7f45
Add more tokenizer tests (#3742)
|
2 years ago |
Georgi Gerganov
|
469c9addef
metal : handle ggml_scale for n%4 != 0 (close #3754)
|
2 years ago |
Georgi Gerganov
|
e3932593d4
Revert "make : add optional CUDA_NATIVE_ARCH (#2482)"
|
2 years ago |
M. Yusuf Sarıgöz
|
9d02956443
issues : separate bug and enhancement template + no default title (#3748)
|
2 years ago |
Galunid
|
69a6735087
Update special token handling in conversion scripts for gpt2 derived tokenizers (#3746)
|
2 years ago |
Marcus Dunn
|
5be6c803fa
llama : remove token functions with `context` args in favor of `model` (#3720)
|
2 years ago |
Galunid
|
6336701c93
Fix baichuan convert script not detecting model (#3739)
|
2 years ago |
Alex
|
96981f37b1
make : add optional CUDA_NATIVE_ARCH (#2482)
|
2 years ago |
Georgi Gerganov
|
438c2ca830
server : parallel decoding and multimodal (#3677)
|
2 years ago |
goerch
|
9e70cc0322
Add test for MPT tokenization (#3728)
|
2 years ago |
Ian Scrivener
|
5a42a5f8e8
readme : remove unsupported node.js library (#3703)
|
2 years ago |
Kerfuffle
|
a5e7dbd614
llama : validate special token ids are in range when loading GGUF model (#3635)
|
2 years ago |
vvhg1
|
d3956aea53
main : escape prompt for cfg_negative_prompt and consecutive inputs in main with interactive (#3623)
|
2 years ago |
Georgi Gerganov
|
22c69a2794
batched : add len CLI argument
|
2 years ago |
shibe2
|
465219b914
CLBlast: Add outer loops over src0 for broadcasting in mulmat
|
2 years ago |
Georgi Gerganov
|
d1031cf49c
sampling : refactor init to use llama_sampling_params (#3696)
|
2 years ago |
Qin Yue Chen
|
8cf19d60dc
gguf : support big endian platform (#3552)
|
2 years ago |
Georgi Gerganov
|
a0edf73bda
server : fix uninitialized sampling context (close #3685)
|
2 years ago |
Herman Semenov
|
f439e506e8
ggml : fix rope + llama minor optimizations (#3560)
|
2 years ago |
cebtenzzre
|
e78f3ef24a
convert : restore compat with old Falcon models (#3680)
|
2 years ago |
M. Yusuf Sarıgöz
|
f3b25e4043
multimodal : add BakLLaVA conversion support (#3682)
|
2 years ago |
M. Yusuf Sarıgöz
|
60abea9798
llava : avoid segfault in case of non-existent mmproj file (#3674)
|
2 years ago |
Georgi Gerganov
|
004797f6ac
readme : update hot topics
|
2 years ago |
Georgi Gerganov
|
4e82b2ea3f
speculative : bug fixes
|
2 years ago |