Commit History

Author SHA1 Message Date
  slaren 1c51f98adc cuda : print the returned error when CUDA initialization fails (#6185) 1 year ago
  Ziang Wu f9c7ba3447 llava : update MobileVLM-README.md (#6180) 1 year ago
  Ziang Wu 272935b281 llava : add MobileVLM_V2 backup (#6175) 1 year ago
  slaren ccf58aa3ec cuda : refactor to remove global resources (#6170) 1 year ago
  Xuan Son Nguyen 91f8ad167d Server: version bump for httplib and json (#6169) 1 year ago
  Georgi Gerganov 6b7e76d28c gitignore : ignore curl-related files 1 year ago
  Georgi Gerganov bc0baab2ea server : allow to override -ngl in tests (#6170) 1 year ago
  Georgi Gerganov d795988d9e Revert "llava : add a MobileVLM_V2-1.7B backup (#6152)" 1 year ago
  Ziang Wu f8c4e745e1 llava : add a MobileVLM_V2-1.7B backup (#6152) 1 year ago
  Karthick 47cc7a7bf9 Server: Handle n_keep parameter in the request (#6174) 1 year ago
  Jared Van Bortel bd60d82d0c server tests : more pythonic process management; fix bare `except:` (#6146) 1 year ago
  Neo Zhang Jianyu 6c0b287748 update readme sycl for new update (#6151) 1 year ago
  Abhilash Majumder d26e8b669d increase igpu cluster limit (#6159) 1 year ago
  DAN™ d8b009a945 Remove undeed header file. (#6158) 1 year ago
  Pierrick Hymbert d0d5de42e5 gguf-split: split and merge gguf per batch of tensors (#6135) 1 year ago
  Georgi Gerganov b80cf3b2d1 common : disable repeat penalties by default (#6127) 1 year ago
  slaren 970a48060a ci : exempt some labels from being tagged as stale (#6140) 1 year ago
  DAN™ 4c28b82529 common : print usage on '-h' and '--help' (#6145) 1 year ago
  github-actions[bot] 2d15886bb0 flake.lock: Update 1 year ago
  Jared Van Bortel d199ca79f2 mpt : implement backwards compatiblity with duped output tensor (#6139) 1 year ago
  Felix 104f5e0fc1 clip : fix memory leak (#6138) 1 year ago
  slaren 5e1b7f94a0 backend : set max split inputs to GGML_MAX_SRC (#6137) 1 year ago
  Georgi Gerganov ac9ee6a4ad ci : disable stale issue messages (#6126) 1 year ago
  Georgi Gerganov 4f6d1337ca ci : temporary disable sanitizer builds (#6128) 1 year ago
  slaren 2bf8d0f7c4 backend : offload large batches to GPU (#6083) 1 year ago
  DAN™ 496bc79bc2 common : tidy-up argument parsing (#6105) 1 year ago
  Thérence 9b03719ad7 convert : add support for CamembertModel architecture (#6119) 1 year ago
  Romain D 3a6efdd03c convert : use f32 outtype for bf16 tensors (#6106) 1 year ago
  Pierrick Hymbert d01b3c4c32 common: llama_load_model_from_url using --model-url (#6098) 1 year ago
  Georgi Gerganov cd776c37c9 ci : close all stale issues at once (#6115) 1 year ago