Commit History

| Author | SHA1 | Message | Date |
|---|---|---|---|
| igardev | c7e0a2054b | webui : Replace alert and confirm with custom modals. (#13711) | 8 months ago |
| Georgi Gerganov | 3f55f781f1 | llama : auto-batch preparation (#13845) | 8 months ago |
| Xuan-Son Nguyen | 51fa76f172 | mtmd : drop `_shared` from `libmtmd` name, merge helpers into libmtmd (⚠️ breaking change) (#13917) | 8 months ago |
| Georgi Gerganov | 12d0188c0d | kv-cache : refactor + add llama_memory_state_i (#13746) | 8 months ago |
| Shawn yang | eb3949938e | CUDA: add a prop in ggml_cuda_device_infor for distinguish iGPU or dGPU in cuda (#13856) (#13895) | 8 months ago |
| Johannes Gäßler | e562eece7c | CUDA: fix typo in FlashAttention code (#13926) | 8 months ago |
| Diego Devesa | b47ab7b8e9 | sched : avoid changing cur_copy when a graph is already allocated (#13922) | 8 months ago |
| Georgi Gerganov | dd665cc9d4 | parallel : increase the variability of the prompt lengths (#13927) | 8 months ago |
| Diego Devesa | df0c0c7d02 | cuda : prevent using split buffers with 3d/4d matrices (#13919) | 8 months ago |
| Akarshan Biswas | b49a8ff96b | SYCL: Add mrope kernel (#13755) | 8 months ago |
| Georgi Gerganov | 53f925074d | sync : vendor (#13901) | 8 months ago |
| Sigbjørn Skjæret | db38704f01 | convert : fix rwkv bos/eos token (#13844) | 8 months ago |
| Xuan-Son Nguyen | 07e4351ce6 | convert : allow partial update to the chkhsh pre-tokenizer list (#13847) | 8 months ago |
| Đinh Trọng Huy | 291f2b6913 | llama : add support for DistilBert (#13907) | 8 months ago |
| zhangkaihuo | 2c90da4c7e | llama : use llm_build_granite for minicpm (#13911) | 8 months ago |
| Christian Kastner | ec9e0301fe | cmake: Guard GGML_CPU_ALL_VARIANTS by architecture (#13890) | 8 months ago |
| Sigbjørn Skjæret | e83ba3e460 | llama : add support for jina-reranker-v2 (#13900) | 8 months ago |
| Sigbjørn Skjæret | 2b131621e6 | gguf-py : add support for sub_type (in arrays) in GGUFWriter add_key_value method (#13561) | 8 months ago |
| Yibo Cai | 54a2c7a8cd | arm64: optimize q4_k_q8_k kernel with i8mm (#13886) | 8 months ago |
| Christian Kastner | 21fcc21ad5 | cmake: Factor out CPU architecture detection (#13883) | 8 months ago |
| Vineel Abhinav | dd8ba93416 | ggml: aarch64: Implement SVE F32 kernels for Mamba Sequential Scan Algorithm (#13882) | 8 months ago |
| Georgi Gerganov | 66c92061f5 | tests : remove json.hpp from a test (#13880) | 8 months ago |
| Sigbjørn Skjæret | 5ca82fc1d7 | convert : workaround for AutoConfig dummy labels (#13881) | 8 months ago |
| Sigbjørn Skjæret | 6385b843a8 | llama : add RobertaForSequenceClassification reranker support (#13875) | 8 months ago |
| Vineel Abhinav | 1b8fb8152d | ggml: aarch64: Implement SVE F32 kernels for vector functions (#13843) | 8 months ago |
| Beinsezii | 53ae30640e | gguf-py : fix SafetensorRemote return on undefined size (< 0) (#13841) | 8 months ago |
| Xuan-Son Nguyen | 763d06edb7 | llama : fix KV shift for qwen2vl (#13870) | 8 months ago |
| Xuan-Son Nguyen | 10961339b2 | mtmd : move helpers to dedicated library (⚠️ breaking change) (#13866) | 8 months ago |
| bandoti | d98f2a35fc | ci: disable LLAMA_CURL for Linux cross-builds (#13871) | 8 months ago |
| Đinh Trọng Huy | e0e3aa231d | llama : add support for BertForSequenceClassification reranker (#13858) | 8 months ago |