Commit history

Author SHA1 Message Date
  Kawrakow 89503dcb5f iq3_xxs: quards for the no-imatrix situation (#5334) 1 year ago
  Jared Van Bortel 1ec3332ade YaRN : store rope scaling type as int32_t in memory (#5285) 1 year ago
  Ian Bull e1e721094d llama : fix memory leak in llama_batch_free (#5252) 1 year ago
  Guoteng ce32060198 llama : support InternLM2 (#5184) 1 year ago
  Georgi Gerganov d3bac7d584 llama : reorder build_orion() at correct place (#5118) 1 year ago
  Georgi Gerganov 5cb04dbc16 llama : remove LLAMA_MAX_DEVICES and LLAMA_SUPPORTS_GPU_OFFLOAD (#5240) 1 year ago
  Yiming Cui d62520eb2c Fix typos of IQ2_XXS and IQ3_XXS in llama.cpp (#5231) 1 year ago
  Jared Van Bortel e8dc55d006 kompute : llama-bench support and ggml_cpu_has_kompute() (#5226) 1 year ago
  Kawrakow f4d7e54974 SOTA 3-bit quants (#5196) 1 year ago
  Jared Van Bortel 6daa69ee81 kompute : fix fallback to CPU (#5201) 2 years ago
  Jared Van Bortel fbf1ddec69 Nomic Vulkan backend (#4456) 2 years ago
  divinity76 2aed77eb06 fix typo "RLIMIT_MLOCK" (#5175) 2 years ago
  0cc4m 2307523d32 ggml : add Vulkan backend (#2059) 2 years ago
  Abhilash Majumder 0f648573dd ggml : add unified SYCL backend for Intel GPUs (#2690) 2 years ago
  Johannes Gäßler 9241c3a2ac Apply min_p to unsorted tokens (#5115) 2 years ago
  Johannes Gäßler b2b2bf988c Tests for min_p, sampling queue (#5147) 2 years ago
  sharpHL f2e69d28c0 llama : add support for Orion-14B (#5118) 2 years ago
  Kawrakow 1182cf4d4f Another bucket sort (#5109) 2 years ago
  l3utterfly 5eaf9964fc llama : dynamic temperature sampling (#4972) 2 years ago
  Kawrakow faa3526a1e Fix Q3_K_XS for MoE models (#5113) 2 years ago
  slaren 1387ea2117 llama : pre-allocate input tensors in a separate buffer (#5100) 2 years ago
  Georgi Gerganov 89758723c7 minor : clean-up some warnings and style (#5094) 2 years ago
  slaren 011e8ec577 llama : fix not enough space in buffer with Qwen (#5086) 2 years ago
  compilade d6bd4d46dd llama : support StableLM 2 1.6B (#5052) 2 years ago
  Kawrakow 66d575c45c llama : add Q3_K_XS (#5060) 2 years ago
  Shijie 3466c6ebcf llama : add more qwen2 models (#5071) 2 years ago
  slaren 6df465a91d llama : run all KQV ops on the CPU with no KV offload (#5049) 2 years ago
  Shijie 9b75cb2b3c llama : support upcoming Qwen2 (#5037) 2 years ago
  chiranko 2b3b999cac llama : add CodeShell support (#5016) 2 years ago
  John 57e2a7a52a llama : fix falcon arch for tied output embeddings (#4978) 2 years ago