cturan/llama.cpp

Author	SHA1 Message	Date
slaren	1c51f98adc cuda : print the returned error when CUDA initialization fails (#6185)	1 year ago
Ziang Wu	f9c7ba3447 llava : update MobileVLM-README.md (#6180)	1 year ago
Ziang Wu	272935b281 llava : add MobileVLM_V2 backup (#6175)	1 year ago
slaren	ccf58aa3ec cuda : refactor to remove global resources (#6170)	1 year ago
Xuan Son Nguyen	91f8ad167d Server: version bump for httplib and json (#6169)	1 year ago
Georgi Gerganov	6b7e76d28c gitignore : ignore curl-related files	1 year ago
Georgi Gerganov	bc0baab2ea server : allow to override -ngl in tests (#6170)	1 year ago
Georgi Gerganov	d795988d9e Revert "llava : add a MobileVLM_V2-1.7B backup (#6152)"	1 year ago
Ziang Wu	f8c4e745e1 llava : add a MobileVLM_V2-1.7B backup (#6152)	1 year ago
Karthick	47cc7a7bf9 Server: Handle n_keep parameter in the request (#6174)	1 year ago
Jared Van Bortel	bd60d82d0c server tests : more pythonic process management; fix bare `except:` (#6146)	1 year ago
Neo Zhang Jianyu	6c0b287748 update readme sycl for new update (#6151)	1 year ago
Abhilash Majumder	d26e8b669d increase igpu cluster limit (#6159)	1 year ago
DAN™	d8b009a945 Remove undeed header file. (#6158)	1 year ago
Pierrick Hymbert	d0d5de42e5 gguf-split: split and merge gguf per batch of tensors (#6135)	1 year ago
Georgi Gerganov	b80cf3b2d1 common : disable repeat penalties by default (#6127)	1 year ago
slaren	970a48060a ci : exempt some labels from being tagged as stale (#6140)	1 year ago
DAN™	4c28b82529 common : print usage on '-h' and '--help' (#6145)	1 year ago
github-actions[bot]	2d15886bb0 flake.lock: Update	1 year ago
Jared Van Bortel	d199ca79f2 mpt : implement backwards compatiblity with duped output tensor (#6139)	1 year ago
Felix	104f5e0fc1 clip : fix memory leak (#6138)	1 year ago
slaren	5e1b7f94a0 backend : set max split inputs to GGML_MAX_SRC (#6137)	1 year ago
Georgi Gerganov	ac9ee6a4ad ci : disable stale issue messages (#6126)	1 year ago
Georgi Gerganov	4f6d1337ca ci : temporary disable sanitizer builds (#6128)	1 year ago
slaren	2bf8d0f7c4 backend : offload large batches to GPU (#6083)	1 year ago
DAN™	496bc79bc2 common : tidy-up argument parsing (#6105)	1 year ago
Thérence	9b03719ad7 convert : add support for CamembertModel architecture (#6119)	1 year ago
Romain D	3a6efdd03c convert : use f32 outtype for bf16 tensors (#6106)	1 year ago
Pierrick Hymbert	d01b3c4c32 common: llama_load_model_from_url using --model-url (#6098)	1 year ago
Georgi Gerganov	cd776c37c9 ci : close all stale issues at once (#6115)	1 year ago

Commit History