Galunid
|
69a6735087
Update special token handling in conversion scripts for gpt2 derived tokenizers (#3746)
|
2 years ago |
Marcus Dunn
|
5be6c803fa
llama : remove token functions with `context` args in favor of `model` (#3720)
|
2 years ago |
Galunid
|
6336701c93
Fix baichuan convert script not detecting model (#3739)
|
2 years ago |
Alex
|
96981f37b1
make : add optional CUDA_NATIVE_ARCH (#2482)
|
2 years ago |
Georgi Gerganov
|
438c2ca830
server : parallel decoding and multimodal (#3677)
|
2 years ago |
goerch
|
9e70cc0322
Add test for MPT tokenization (#3728)
|
2 years ago |
Ian Scrivener
|
5a42a5f8e8
readme : remove unsupported node.js library (#3703)
|
2 years ago |
Kerfuffle
|
a5e7dbd614
llama : validate special token ids are in range when loading GGUF model (#3635)
|
2 years ago |
vvhg1
|
d3956aea53
main : escape prompt for cfg_negative_prompt and consecutive inputs in main with interactive (#3623)
|
2 years ago |
Georgi Gerganov
|
22c69a2794
batched : add len CLI argument
|
2 years ago |
shibe2
|
465219b914
CLBlast: Add outer loops over src0 for broadcasting in mulmat
|
2 years ago |
Georgi Gerganov
|
d1031cf49c
sampling : refactor init to use llama_sampling_params (#3696)
|
2 years ago |
Qin Yue Chen
|
8cf19d60dc
gguf : support big endian platform (#3552)
|
2 years ago |
Georgi Gerganov
|
a0edf73bda
server : fix uninitialized sampling context (close #3685)
|
2 years ago |
Herman Semenov
|
f439e506e8
ggml : fix rope + llama minor optimizations (#3560)
|
2 years ago |
cebtenzzre
|
e78f3ef24a
convert : restore compat with old Falcon models (#3680)
|
2 years ago |
M. Yusuf Sarıgöz
|
f3b25e4043
multimodal : add BakLLaVA conversion support (#3682)
|
2 years ago |
M. Yusuf Sarıgöz
|
60abea9798
llava : avoid segfault in case of non-existent mmproj file (#3674)
|
2 years ago |
Georgi Gerganov
|
004797f6ac
readme : update hot topics
|
2 years ago |
Georgi Gerganov
|
4e82b2ea3f
speculative : bug fixes
|
2 years ago |
Georgi Gerganov
|
0e89203b51
speculative : add tree-based sampling example (#3624)
|
2 years ago |
Jhen-Jie Hong
|
c67fe68e41
metal : implement q5_0 and q5_1 kernels (#3648)
|
2 years ago |
shibe2
|
1117d06607
opencl : fix element-wise multiplication (#3656)
|
2 years ago |
slaren
|
cb33f43a2a
fix embeddings when using CUDA (#3657)
|
2 years ago |
Georgi Gerganov
|
e1675d133c
llama : avoid fprintf in favor of LLAMA_LOG (#3538)
|
2 years ago |
BarfingLemurs
|
8402566a7c
readme : update hot-topics & models, detail windows release in usage (#3615)
|
2 years ago |
shibe2
|
40e5ce054f
CLBlast: Fix temporary buffer size for f16 conversion (wsize)
|
2 years ago |
slaren
|
a5e8c1d8c7
train-text-from-scratch : fix assert failure in ggml-alloc (#3618)
|
2 years ago |
Georgi Gerganov
|
e74c705e15
editorconfig : remove trailing spaces
|
2 years ago |
coezbek
|
3ad1e3f1a1
server : documentation of JSON return value of /completion endpoint (#3632)
|
2 years ago |