Georgi Gerganov
|
bc0baab2ea
server : allow to override -ngl in tests (#6170)
|
1 year ago |
Georgi Gerganov
|
d795988d9e
Revert "llava : add a MobileVLM_V2-1.7B backup (#6152)"
|
1 year ago |
Ziang Wu
|
f8c4e745e1
llava : add a MobileVLM_V2-1.7B backup (#6152)
|
1 year ago |
Karthick
|
47cc7a7bf9
Server: Handle n_keep parameter in the request (#6174)
|
1 year ago |
Jared Van Bortel
|
bd60d82d0c
server tests : more pythonic process management; fix bare `except:` (#6146)
|
1 year ago |
Neo Zhang Jianyu
|
6c0b287748
update readme sycl for new update (#6151)
|
1 year ago |
Abhilash Majumder
|
d26e8b669d
increase igpu cluster limit (#6159)
|
1 year ago |
DAN™
|
d8b009a945
Remove undeed header file. (#6158)
|
1 year ago |
Pierrick Hymbert
|
d0d5de42e5
gguf-split: split and merge gguf per batch of tensors (#6135)
|
1 year ago |
Georgi Gerganov
|
b80cf3b2d1
common : disable repeat penalties by default (#6127)
|
1 year ago |
slaren
|
970a48060a
ci : exempt some labels from being tagged as stale (#6140)
|
1 year ago |
DAN™
|
4c28b82529
common : print usage on '-h' and '--help' (#6145)
|
1 year ago |
github-actions[bot]
|
2d15886bb0
flake.lock: Update
|
1 year ago |
Jared Van Bortel
|
d199ca79f2
mpt : implement backwards compatiblity with duped output tensor (#6139)
|
1 year ago |
Felix
|
104f5e0fc1
clip : fix memory leak (#6138)
|
1 year ago |
slaren
|
5e1b7f94a0
backend : set max split inputs to GGML_MAX_SRC (#6137)
|
1 year ago |
Georgi Gerganov
|
ac9ee6a4ad
ci : disable stale issue messages (#6126)
|
1 year ago |
Georgi Gerganov
|
4f6d1337ca
ci : temporary disable sanitizer builds (#6128)
|
1 year ago |
slaren
|
2bf8d0f7c4
backend : offload large batches to GPU (#6083)
|
1 year ago |
DAN™
|
496bc79bc2
common : tidy-up argument parsing (#6105)
|
1 year ago |
Thérence
|
9b03719ad7
convert : add support for CamembertModel architecture (#6119)
|
1 year ago |
Romain D
|
3a6efdd03c
convert : use f32 outtype for bf16 tensors (#6106)
|
1 year ago |
Pierrick Hymbert
|
d01b3c4c32
common: llama_load_model_from_url using --model-url (#6098)
|
1 year ago |
Georgi Gerganov
|
cd776c37c9
ci : close all stale issues at once (#6115)
|
1 year ago |
GainLee
|
dc0f612548
ggml:fix finding transfer queue family index error (#6094)
|
1 year ago |
AmirAli Mirian
|
c47cf414ef
ggml : add AVX512F SIMD (#6088)
|
1 year ago |
Daniel Bevenius
|
b5f4ae09c3
gritlm : add initial README.md (#6086)
|
1 year ago |
Xuan Son Nguyen
|
dfbfdd60f9
readme : add wllama as a wasm binding (#6100)
|
1 year ago |
DAN™
|
15961ec04d
common : refactor nested if causing error C1061 on MSVC (#6101)
|
1 year ago |
Pierrick Hymbert
|
a56d09a440
ci : close inactive issue with workflow (#6053)
|
1 year ago |