Commit History

Author SHA1 Message Date
  Herman Semenov 4cb0727698 llava : removed excess free(NULL) operation (#5531) 1 year ago
  Herman Semenov 65085c713e llama : minor fixed return int value (#5529) 1 year ago
  Alexey Parfenov 6dcc02d244 server : add "samplers" param to control the samplers order (#5494) 1 year ago
  Rőczey Barnabás 5f5808ca7b server : fix system prompt cli (#5516) 1 year ago
  bmwl f486f6e1e5 ggml : add numa options (#5377) 1 year ago
  Daniel Bevenius 60ed04cf82 llava : fix clip-model-is-vision flag in README.md (#5509) 1 year ago
  Georgi Gerganov 594845aab1 ci : fix BERT model download and convert 1 year ago
  Douglas Hanley 4524290e87 Use correct type of pooling for embedding models (#5500) 1 year ago
  Georgi Gerganov c06e45d729 clip : fix wrong loop condition 1 year ago
  slaren 9060a1e9df cuda : print message when initialization fails (#5512) 1 year ago
  Georgi Gerganov 9350a1cf21 scripts : add hf.sh helper script (#5501) 1 year ago
  Michaël de Vries 73122473ff fix(gguf-py): special tokens are no longer skipped when add_<token>_token is set to false (#5487) 1 year ago
  Elbios 0d4177126b llava : fix memory management bug (#5491) 1 year ago
  John 7930a8a6e8 llava : hotfix for llava-1.6 image number (#5495) 1 year ago
  Neuman Vong 704359e299 vulkan: Find optimal memory type but with fallback (#5381) 1 year ago
  Rune 594fca3fef readme : fix typo (#5490) 1 year ago
  John ccbb277f46 llava : update README.md (#5489) 1 year ago
  Michael Podvitskiy 8084d55440 cmake : ARM intrinsics detection for MSVC (#5401) 1 year ago
  John aa23412989 llava : support v1.6 (#5267) 1 year ago
  AT f5ca054855 Early return for zero size calls to get_tensor. (#5482) 1 year ago
  John 6c00a06692 gguf : add python reader example (#5216) 1 year ago
  Jared Van Bortel ea9c8e1143 llama : add support for Nomic Embed (#5468) 1 year ago
  Aarni Koskela c4e6dd59e4 llama : allow raw byte in SPM vocabs; don't crash on nl 404 (#5478) 1 year ago
  Aarni Koskela 037259be68 llama : make load error reporting more granular (#5477) 1 year ago
  Daniel Bevenius 263978904c finetune : rename feed-forward tensors (w1/w2/w3) (#4839) 1 year ago
  Georgi Gerganov cf45252a7c tests : multi-thread the tokenizer tests (#5474) 1 year ago
  Douglas Hanley 03bf161eb6 llama : support batched embeddings (#5466) 1 year ago
  Johannes Gäßler ad014bba97 make: add error message for bad CUDA version (#5444) 1 year ago
  Georgi Gerganov 49cc1f7d67 bert : add tests + fix quantization (#5475) 1 year ago
  Georgi Gerganov 99b8b43d7b tests : disable moe test (#5473) 1 year ago