Commit History

Author SHA1 Message Date
  Kawrakow 62959e740e Strided perplexity (#2714) 2 years ago
  IgnacioFDM 7f7ddd5002 Fix ggml to gguf conversion on Windows (#2733) 2 years ago
  Xiao-Yong Jin b8ad1b66b2 server : allow json array in prompt or content for direct token input (#2306) 2 years ago
  Evan Jones f5fe98d11b docs : add grammar docs (#2701) 2 years ago
  Kerfuffle 777f42ba18 Improve handling of special tokens in GGML to GGUF converter (#2725) 2 years ago
  goerch 46ef5b5fcf llama : fix whitespace escaping in tokenizer (#2724) 2 years ago
  Johannes Gäßler c63bb1d16a CUDA: use mul_mat_q kernels by default (#2683) 2 years ago
  Alex Petenchea 3b6cfe7c92 convert.py : clarifying error message (#2718) 2 years ago
  Jiahao Li 800c9635b4 Fix CUDA softmax by subtracting max value before exp (#2665) 2 years ago
  Georgi Gerganov deb7dfca4b gguf : add ftype meta info to the model (#2710) 2 years ago
  Kawrakow bac66994cf Quantization imrovements for k_quants (#2707) 2 years ago
  slaren 519c981f8b embedding : evaluate prompt in batches (#2713) 2 years ago
  slaren 1123f7fbdf ggml-cuda : use graph allocator (#2684) 2 years ago
  Georgi Gerganov ef3f333d37 ggml : sync latest (SAM + SD operators, CUDA alibi) (#2709) 2 years ago
  slaren 8e4364f2af llama-bench : minor fixes (#2695) 2 years ago
  Kylin 1e3bc523d8 ggml : support CUDA's half type for aarch64(#1455) (#2670) 2 years ago
  Shouzheng Liu 14b1d7e6f7 metal : add missing barriers for mul-mat (#2699) 2 years ago
  Jhen-Jie Hong 226255b44e server : fallback to default if client param is null (#2688) 2 years ago
  Kerfuffle 930523c8e1 Fix convert-llama-ggmlv3-to-gguf.py vocab conversion (#2698) 2 years ago
  Georgi Gerganov c8dba409e6 py : remove obsolete script 2 years ago
  Georgi Gerganov 6381d4e110 gguf : new file format with flexible meta data (beta) (#2398) 2 years ago
  Shouzheng Liu dadbed99e6 metal : fix synchronization in new matrix multiplication kernel (#2686) 2 years ago
  Kawrakow cb1c0727bd HellaSwag: split token evaluation into batches if needed (#2681) 2 years ago
  slaren 9e232f0234 ggml : move all type info to ggml_type_traits (#2663) 2 years ago
  Kawrakow 5e9ff54a67 More efficient Hellaswag implementation (#2677) 2 years ago
  Georgi Gerganov 1f0bccb279 server : better default prompt (#2646) 2 years ago
  Jhen-Jie Hong f63564adfa server : update xxd usage for older versions compatibility (#2649) 2 years ago
  Adrian 2d8b76a110 Add link to clojure bindings to Readme. (#2659) 2 years ago
  Georgi Gerganov 7af633aec3 readme : incoming BREAKING CHANGE 2 years ago
  slaren 097e121e2f llama : add benchmark example (#2626) 2 years ago