Andrew Godfrey
|
b83e149ec6
cuda : get_row_rounding F32 (#4095)
|
2 years ago |
Georgi Gerganov
|
4f447a4833
llama : fix data units (#4101)
|
2 years ago |
Kerfuffle
|
91f6499393
Respect tokenizer.ggml.add_bos_token value when tokenizing (#4040)
|
2 years ago |
texmex76
|
8da46278e1
gguf : fix potential infinite loops while parsing (#4100)
|
2 years ago |
Jared Van Bortel
|
a6fc554e26
llama : restore prefix space in llama tokenizer (#4081)
|
2 years ago |
slaren
|
1cf2850d52
ggml-cuda : increase max graph size (#4084)
|
2 years ago |
Michael Potter
|
6bb4908a17
Fix MacOS Sonoma model quantization (#4052)
|
2 years ago |
Galunid
|
36eed0c42c
stablelm : StableLM support (#3586)
|
2 years ago |
afrideva
|
b46d12f86d
convert.py: also look for plain model.safetensors (#4043)
|
2 years ago |
M. Yusuf Sarıgöz
|
bd90eca237
llava : fix regression for square images in #3613 (#4056)
|
2 years ago |
Georgi Gerganov
|
3d68f364f1
ggml : sync (im2col, GPU conv, 32-bit arm compat) (#4060)
|
2 years ago |
Georgi Gerganov
|
c049b37d7b
readme : update hot topics
|
2 years ago |
Georgi Gerganov
|
4760e7cc0b
sync : ggml (backend v2) (#3912)
|
2 years ago |
Kerfuffle
|
bb50a792ec
Add ReLU and SQR CUDA ops to (partially) fix Persimmon offloading (#4041)
|
2 years ago |
Kerfuffle
|
21fd874c8d
gguf-py: gguf_writer: Use bytearray to build metadata (#4051)
|
2 years ago |
Richard Kiss
|
532dd74e38
Fix some documentation typos/grammar mistakes (#4032)
|
2 years ago |
M. Yusuf Sarıgöz
|
e86fc56f75
Fix gguf-convert-endian script (#4037)
|
2 years ago |
Alexey Parfenov
|
d96ca7ded7
server : fix crash when prompt exceeds context size (#3996)
|
2 years ago |
Kerfuffle
|
34b0a08207
gguf-py: Refactor and allow reading/modifying existing GGUF files (#3981)
|
2 years ago |
Jhen-Jie Hong
|
4a4fd3eefa
server : allow continue edit on completion mode (#3950)
|
2 years ago |
Galunid
|
df9d1293de
Unbreak persimmon after #3837 (#4010)
|
2 years ago |
Galunid
|
a75fa576ab
scripts: Generalize convert scripts (#3838)
|
2 years ago |
Mihai
|
57ad015dc3
server : add min_p param (#3877)
|
2 years ago |
slaren
|
875fb42871
ggml-alloc : fix backend assignments of views (#3982)
|
2 years ago |
Jared Van Bortel
|
0a7c980b6f
gguf : track writer state, free unneeded tensors, cleanup (#3871)
|
2 years ago |
Georgi Gerganov
|
413503d4b9
make : do not add linker flags when compiling static llava lib (#3977)
|
2 years ago |
xaedes
|
e9c1cecb9d
ggml : fix backward rope after YaRN (#3974)
|
2 years ago |
Matthew Tejo
|
54b4df8886
Use params when loading models in llava-cli (#3976)
|
2 years ago |
Meng Zhang
|
46876d2a2c
cuda : supports running on CPU for GGML_USE_CUBLAS=ON build (#3946)
|
2 years ago |
Damian Stewart
|
381efbf480
llava : expose as a shared library for downstream projects (#3613)
|
2 years ago |