Commit History

Author SHA1 Message Date
  Michael Coppola 31e7903221 server : add `dynatemp_range` and `dynatemp_exponent` (#5352) 1 year ago
  Niall Coates 4ffc7a17d4 server : various fixes for the prompt field in /completion (#5300) 1 year ago
  Georgi Gerganov 906cff55c2 py : handle byte tokens in `get_token_type` (#5341) 1 year ago
  Johannes Gäßler 098f6d737b make: Use ccache for faster compilation (#5318) 1 year ago
  Johannes Gäßler 78b00dda6c README: updated introduction (#5343) 1 year ago
  Kawrakow c6b395535a ggml : make use of ggml-quants.h possible in C++ code (#5338) 1 year ago
  Dr. Tom Murphy VII Ph.D abb61944a5 ggml : avoid duplicating function calls using MIN/MAX macros (#5325) 1 year ago
  Kawrakow 89503dcb5f iq3_xxs: quards for the no-imatrix situation (#5334) 1 year ago
  Guoteng 7e1ae372f3 py : fix internlm2-hf convert to gguf (#5305) 1 year ago
  Kawrakow 6fdfa2ecc6 iq2_xxs: tune quantization (#5320) 1 year ago
  Alexey Parfenov a2d60c9158 server : allow to get default generation settings for completion (#5307) 1 year ago
  l3utterfly e6f8177532 common : add dynamic temperature parameters to main example cli (#5295) 1 year ago
  Georgi Gerganov 30679d438d scripts : fix typos, cleanup (#5303) 1 year ago
  Нияз Гарифзянов 4be04c8965 scripts : add non-interactive server-llm.sh (#5303) 1 year ago
  chiranko 5d55b0cd82 readme : add CodeShell models to the supported models list (#5330) 1 year ago
  AidanBeltonS 4833ac209d [SYCL] Fix cpy with dims of 3 (#5289) 1 year ago
  github-actions[bot] 9392ebd49e flake.lock: Update 1 year ago
  Kawrakow 5ed26e1fc9 Adding some imatrix tools (#5302) 1 year ago
  Welby Seely 277fad30c6 cmake : use set() for LLAMA_WIN_VER (#5298) 1 year ago
  Johannes Gäßler 3c0d25c475 make: add nvcc info print (#5310) 1 year ago
  Johannes Gäßler 3cc5ed353c make: fix nvcc optimization flags for host code (#5309) 1 year ago
  Martin Schwaighofer 60ecf099ed add Vulkan support to Nix flake 2 years ago
  0cc4m e920ed393d Vulkan Intel Fixes, Optimizations and Debugging Flags (#5301) 1 year ago
  Michael Klimenko 52bb63c708 refactor : switch to emplace_back to avoid extra object (#5291) 1 year ago
  Jared Van Bortel 1ec3332ade YaRN : store rope scaling type as int32_t in memory (#5285) 1 year ago
  BADR 6a66c5071a readme : add tenere in the ui tools list (#5284) 1 year ago
  AidanBeltonS a305dba8ff Fix im2col with 32fp (#5286) 1 year ago
  kalomaze 191221178f perplexity : fix KL divergence calculations on Windows (#5273) 1 year ago
  Georgi Gerganov e437b37fd0 scripts : parse wtype in server-llm.sh (#5167) 1 year ago
  Mirror Azure 2d40085c26 py : add check for '.attn.masked_bias' layers to GPT2model (#5281) 1 year ago