Commit History

| Author | SHA1 | Message | Date |
| --- | --- | --- | --- |
| Xuan-Son Nguyen | 4d1316c440 | arg: fix ASAN error on sampler_type_names empty (#18167) | 1 month ago |
| Sigbjørn Skjæret | ec7b9329ae | gguf-py : use copy-on-write mode for localtensor (#18162) | 1 month ago |
| yulo | 54189c0d39 | remove i_major_dual (#18157) | 1 month ago |
| Aleksander Grygier | 9ce64aed7d | webui: Fix selecting generated output issues during active streaming (#18091) | 1 month ago |
| Kim S. | 900316da4e | webui: fix chat screen shadow width (#18010) | 1 month ago |
| Johannes Gäßler | 57c1e05643 | llama: offload output layer to GPU first (#18148) | 1 month ago |
| Sigbjørn Skjæret | 9cff4cc554 | convert : sort and use file parts from model index if present (#18043) | 1 month ago |
| Julius Tischbein | 4d4f4cacd1 | llama : Async DirectIO model loading on Linux (#18012) | 1 month ago |
| Shouyu | 0a0bba05e8 | ggml-hexagon: swiglu_oai operation (#18114) | 1 month ago |
| Sigbjørn Skjæret | 5166aaf868 | convert : force patch_merger tensors to f16/f32 (#18124) | 1 month ago |
| Pascal | 6ce3d85796 | server: (webui) add --webui-config (#18028) | 1 month ago |
| Xuan-Son Nguyen | e85e9d7637 | server: (router) disable SSL on child process (#18141) | 1 month ago |
| Johannes Gäßler | 8dcc3662a2 | llama-fit-params: fix memory print (#18136) | 1 month ago |
| Kim S. | d37fc93505 | webui: fix chat header width when sidebar is closed (#17981) | 1 month ago |
| Shouyu | 4470a0764a | ggml-hexagon: gelu operation (#17921) | 1 month ago |
| Georgi Gerganov | 4301e27319 | common : restore grammar-based rejection sampling (#18137) | 1 month ago |
| Johannes Gäßler | a2c199e479 | common: clarify instructions for bug reports (#18134) | 1 month ago |
| HonestQiao | 15dd67d869 | model: fix GLM-ASR-Nano-2512 load error (#18130) (#18142) | 1 month ago |
| Xuan-Son Nguyen | bde461de8c | server: (router) allow child process to report status via stdout (#18110) | 1 month ago |
| Piotr Wilkin (ilintar) | 8faa87db02 | Extend run-org-model.py, add (a) batching (b) loading prompt from file (c) multimodal capacity (#18034) | 1 month ago |
| Johannes Gäßler | 6f1f6a961a | Github: ask for -v logs for params_fit [no ci] (#18128) | 1 month ago |
| Alberto Cabrera Pérez | 669696e00d | ggml-cpu: ARM64: repack version of q8_0 (dotprod and i8mm) (#18096) | 1 month ago |
| Tarek Dakhran | 982060fadc | model: fix LFM2_MOE missing tensors (#18132) | 1 month ago |
| Sigbjørn Skjæret | 6853bee680 | ci : clean up webui jobs (#18116) | 1 month ago |
| Pascal | 487674fbb3 | common: fix --override-kv to support comma-separated values (#18056) | 1 month ago |
| yulo | acec774ef6 | HIP: Refactor mma for RDNA and CDNA (#17990) | 1 month ago |
| Naco Siren | 5c0d18881e | llama.android : Rewrite Android binding (w/o cpu_features dep) (#17413) | 1 month ago |
| TrevorS | 4b2a4778f8 | arg: allow -kvu flag for llama-perplexity (#18117) | 1 month ago |
| Aadeshveer Singh | 58062860af | ggml : use WARP_SIZE/2 for argmax reduction offset (#18092) | 1 month ago |
| Yuri Khrustalev | 2973a65ecb | gguf-py : allow converting multi-tensor models from read-only locations (#18100) | 1 month ago |