Jiří Podivín
|
566daa5a5b
*.py: Stylistic adjustments for python (#8233)
|
1 year ago |
Georgi Gerganov
|
6f11a83e4e
llama : allow overrides for tokenizer flags (#8614)
|
1 year ago |
Georgi Gerganov
|
e093dd2382
tests : re-enable tokenizer tests (#8611)
|
1 year ago |
Douglas Hanley
|
50e05353e8
llama : add Mistral Nemo inference support (#8604)
|
1 year ago |
Jan Boon
|
628154492a
server : update doc to clarify n_keep when there is bos token (#8619)
|
1 year ago |
Mark Zhuang
|
04bab6b7da
ggml: fix compile error for RISC-V (#8623)
|
1 year ago |
devojony
|
b7c11d36e6
examples: fix android example cannot be generated continuously (#8621)
|
1 year ago |
Georgi Gerganov
|
45f2c19cc5
flake.lock: Update (#8610)
|
1 year ago |
M-A
|
22f281aa16
examples : Rewrite pydantic_models_to_grammar_examples.py (#8493)
|
1 year ago |
compilade
|
328884f421
gguf-py : fix some metadata name extraction edge cases (#8591)
|
1 year ago |
compilade
|
c69c63039c
convert_hf : fix Gemma v1 conversion (#8597)
|
1 year ago |
Johannes Gäßler
|
69c487f4ed
CUDA: MMQ code deduplication + iquant support (#8495)
|
1 year ago |
Georgi Gerganov
|
07283b1a90
gguf : handle null name during init (#8587)
|
1 year ago |
Michael Coppola
|
940362224d
llama : add support for Tekken pre-tokenizer (#8579)
|
1 year ago |
Huifeng Ou
|
69b9945b44
llama.swiftui: fix end of generation bug (#8268)
|
1 year ago |
Brian
|
c3776cacab
gguf_dump.py: fix markddown kv array print (#8588)
|
1 year ago |
slaren
|
87e397d00b
ggml : fix quant dot product with odd number of blocks (#8549)
|
1 year ago |
Brian
|
57b1d4f9eb
convert-*.py: remove add_name from ChatGLMModel class (#8590)
|
1 year ago |
Georgi Gerganov
|
d197545530
llama : bump max layers from 256 to 512 (#8530)
|
1 year ago |
Georgi Gerganov
|
be0cfb4175
readme : fix server badge
|
1 year ago |
Clint Herron
|
b57eb9ca4f
ggml : add friendlier error message to fopen errors (#8575)
|
1 year ago |
Frank Mai
|
f299aa98ec
fix: typo of chatglm4 chat tmpl (#8586)
|
1 year ago |
Brian
|
3d0e4367d9
convert-*.py: add general.name kv override (#8571)
|
1 year ago |
Johannes Gäßler
|
a15ef8f8a0
CUDA: fix partial offloading for ne0 % 256 != 0 (#8572)
|
1 year ago |
65a
|
705b7ecf60
cmake : install all ggml public headers (#8480)
|
1 year ago |
Eric Zhang
|
0d2c7321e9
server: use relative routes for static files in new UI (#8552)
|
1 year ago |
Brian
|
672a6f1018
convert-*.py: GGUF Naming Convention Refactor and Metadata Override Refactor (#7499)
|
1 year ago |
RunningLeon
|
3807c3de04
server : respect `--special` cli arg (#8553)
|
1 year ago |
Johannes Gäßler
|
e02b597be3
lookup: fibonacci hashing, fix crashes (#8548)
|
1 year ago |
Al Mochkin
|
b3283448ce
build : Fix docker build warnings (#8535) (#8537)
|
1 year ago |