Commit History

Author             SHA1        Message  Date
Clint Herron       ad675e1c67  Added support for . (any character) token in grammar engine. (#6467)  1 year ago
Mattheus Chediak   a143c04375  README minor fixes (#7798) [no ci]  1 year ago
Olivier Chafik     55b2d0849d  grammars: x{min,max} repetition operator (#6640)  1 year ago
Joan Fontanals     f5d7b268ec  llama : add jina v2 base code (#7596)  1 year ago
slaren             2d08b7fbb4  docker : build only main and server in their images (#7782)  1 year ago
slaren             d67caea0d6  docker : add openmp lib (#7780)  1 year ago
Galunid            7672adeec7  Fix encoding in python scripts (#7733)  1 year ago
Johannes Gäßler    7d1a378b8f  CUDA: refactor mmq, dmmv, mmvq (#7716)  1 year ago
Georgi Gerganov    2b3389677a  ggml : refactor rope norm/neox (#7634)  1 year ago
arch-btw           9973e81c5c  readme : remove -ins (#7759)  1 year ago
jaime-m-p          c90dbe026b  Fix per token atrributes bits (#7749)  1 year ago
agray3             b90dc566c1  Allow number of nodes in CUDA graph to change (#7738)  1 year ago
Georgi Gerganov    1442677f92  common : refactor cli arg parsing (#7675)  1 year ago
Georgi Gerganov    554c247caf  ggml : remove OpenCL (#7735)  1 year ago
Georgi Gerganov    0cd6bd3483  llama : remove beam search (#7736)  1 year ago
Georgi Gerganov    5ca0944a15  readme : remove obsolete Zig instructions (#7471)  1 year ago
slaren             adc9ff3841  llama-bench : allow using a different printer for stderr with -oe (#7722)  1 year ago
Daniele            987d743d6b  Improve hipBLAS support in CMake (#7696)  1 year ago
zhouwg             b226c1227b  refine .gitignore (#7688)  1 year ago
jaime-m-p          3b38d48609  Per token attributes (#7685)  1 year ago
Georgi Gerganov    6d1616944d  ggml : prevent builds with -ffinite-math-only (#7726)  1 year ago
Radoslav Gerganov  bde7cd3cd9  llama : offload to RPC in addition to other backends (#7640)  1 year ago
Masaya, Kato       a5735e4426  ggml : use OpenMP as a thread pool (#7606)  1 year ago
Johannes Gäßler    0b832d53ba  make: fix debug options not being applied to NVCC (#7714)  1 year ago
0cc4m              3d7ebf6312  Vulkan Mixture of Experts (MoE) support (#7628)  1 year ago
Andy Tai           a10cda58d3  cmake : add pkg-config spec file for llama.cpp (#7702)  1 year ago
zhangkaihuo        6f28a333c1  llama : MiniCPM support tied embeddings (#7664)  1 year ago
Georgi Gerganov    549279d804  llama : avoid double token-to-piece cache (#7654)  1 year ago
woachk             9e405b6e2e  kompute : implement op_getrows_f32 (#6403)  1 year ago
Dave Airlie        3413ae2193  fix bug introduced in using calloc (#7701)  1 year ago