Revision history

Author SHA1 Message Date
  Srihari-mcw 3d82dbcbce ggml : block interleaving support for Q4_K quantization for x86 AVX2 architecture (#12332) 10 months ago
  Bartowski 732b5fbf5e convert : avoid calls to tokenizer.added_tokens_decoder (#12473) 10 months ago
  fairydreaming 568013d0cd context : clear sets containing encoder output sequence ids before storing new values (#12470) 10 months ago
  Gaurav Garg 517b5ddbf0 CUDA: Improve flash decoding kernel GPU occupancy for BS=1 case (#12183) 10 months ago
  Jeff Bolz a9b59288e2 vulkan: optimize iq1 coopmat2 dequant functions (#12427) 10 months ago
  Guus Waals 0fd8487b14 Fix visionOS build and add CI (#12415) 10 months ago
  Sigbjørn Skjæret 108e53c2f1 llama : add support for GPT2, Bloom and CodeShell tied word embeddings (#12456) 10 months ago
  Sigbjørn Skjæret a686171ea7 convert : Support chat_template.json (#12460) 10 months ago
  Jeff Bolz c446b2edd2 vulkan: Submit once enough matmul work has been recorded (#12406) 10 months ago
  lhez d84635b1b0 opencl: improve profiling (#12442) 10 months ago
  Georgi Gerganov 75422e8bc4 graph : normalize Q, K, V shapes + sync cross attention (#12449) 10 months ago
  R0CKSTAR bb115d2bf7 musa: override warp_size of musa device to 32 (#12445) 10 months ago
  Xuan-Son Nguyen 29fff308c7 llama : support converting Mistral Small text-only (#12450) 10 months ago
  Georgi Gerganov c6af2161b2 speculative : fix seg fault in certain cases (#12454) 10 months ago
  Xuan-Son Nguyen 99aa304fb9 llama : add support for EXAONE tied word embeddings (#12451) 10 months ago
  Georgi Gerganov 8551c44d84 context : always use non-causal attention for encoder graphs (#12447) 10 months ago
  Łukasz Ślusarczyk 35cae5ba05 SYCL: using graphs is configurable by environment variable and compile option (#12371) 10 months ago
  Georgi Gerganov 810e0af3f5 server : fix warmup draft cache type (#12446) 10 months ago
  Prajwal B Mehendarkar eba92d64c3 cmake : fix PowerPC build (#12241) 10 months ago
  fj-y-saito d9a14523bb ggml : add SVE support for q6_K_q8_K (#12361) 10 months ago
  0cc4m fd123cfead Vulkan: Default to 1GB allocations instead of 4GB to avoid fragmentation and driver issues (#12434) 10 months ago
  Łukasz Ślusarczyk a53f7f7b88 fixed compilation warnings in ggml-sycl (#12424) 10 months ago
  Molly Sophia 7dfad387e3 llama: Add support for RWKV v7 architecture (#12412) 10 months ago
  Sigbjørn Skjæret 60c902926c docs : bring llama-cli conversation/template docs up-to-date (#12426) 10 months ago
  Gaurav Garg b1b132efcb cuda : enable CUDA Graph on CUDA Toolkit < 12.x (#12394) 10 months ago
  Guus Waals 01e8f2138b ggml-vulkan: remove unused find_program(glslc) (#12416) 10 months ago
  Jeff Bolz 484a8ab513 vulkan: Add N/2 and N/4 optimized paths in coopmat2 shader (#12312) 10 months ago
  Daniele cf2270e4d3 vulkan: subgroup size tuning (#12087) 10 months ago
  Jeff Bolz f07690c930 vulkan: use fp32 in coopmat2 q4_k dequant function (#12309) 10 months ago
  Jeff Bolz 891c63956d vulkan: Pad N dimension of B matrix for coopmat2 perf, to avoid bounds checking (#12273) 10 months ago