Commit History

Author           SHA1        Message                                                        Date
Joan Fontanals   f5d7b268ec  llama : add jina v2 base code (#7596)                          1 year ago
Galunid          7672adeec7  Fix encoding in python scripts (#7733)                         1 year ago
Bartowski        c429b33beb  llama : add Smaug 70B support (#7402)                          1 year ago
Georgi Gerganov  c3f8d58356  tests : test-tokenizer-0.sh print more info (#7402)            1 year ago
Anas Ahouzi      6aade19ee7  Add StableLM2 pre-tokenizer (#7349)                            1 year ago
Aarni Koskela    d273c1402b  py : convert-hf-to-gguf-update improvements (#7340)            1 year ago
Joan Fontanals   9aa672490c  llama : rename jina tokenizers to v2 (#7249)                   1 year ago
CrispStrobe      3292733f95  convert : skip unaccessible HF repos (#7210)                   1 year ago
Joan Fontanals   b83cc3f5b3  llama : add Jina Embeddings architecture (#6826)               1 year ago
Georgi Gerganov  8c660242d7  convert : print "ignore_merges" field                          1 year ago
jaime-m-p        43248e5594  llama3 custom regex split (#6965)                              1 year ago
Galunid          f31ec120bc  Add warning if token is invalid (#7173)                        1 year ago
Ren Xuancheng    229ffff872  llama : add BPE pre-tokenization for Qwen2 (#7114)             1 year ago
DAN™             4cd621c26d  convert : add BPE pre-tokenization for DBRX (#7132)            1 year ago
Georgi Gerganov  7e0b6a7b3b  py : also print the normalizers                                1 year ago
nopperl          b6aa670203  Fix OLMo HF to GGUF conversion (#6910)                         1 year ago
DAN™             889bdd7686  command-r : add BPE pre-tokenization (#7063)                   1 year ago
Brian            6fbd432211  py : logging and flake8 suppression refactoring (#7081)        1 year ago
Georgi Gerganov  92139b90af  tests : add test-tokenizer-0.sh + fix some tokenizers (#7036)  1 year ago
Brian            a2ac89d6ef  convert.py : add python logging instead of print() (#6511)     1 year ago
Georgi Gerganov  952d03dbea  convert : use utf8 encoding (#7000)                            1 year ago
Georgi Gerganov  f4ab2a4147  llama : fix BPE pre-tokenization (#6920)                       1 year ago