Georgi Gerganov | 938943cdbf | llama : move vocab, grammar and sampling into separate files (#8508) | 1 year ago
0cc4m | 751fcfc6c3 | Vulkan IQ4_NL Support (#8613) | 1 year ago
Jeroen Mostert | 46e47417aa | Allow all RDNA2 archs to use sdot4 intrinsic (#8629) | 1 year ago
Georgi Gerganov | e7e6487ba0 | contrib : clarify PR squashing + module names (#8630) | 1 year ago
luoyu-intel | 063d99ad11 | [SYCL] fix scratch size of softmax (#8642) | 1 year ago
Keke Han | 081fe431aa | llama : fix codeshell support (#8599) | 1 year ago
Jason Stillerman | d94c6e0ccb | llama : add support for SmolLm pre-tokenizer (#8609) | 1 year ago
Jiří Podivín | 566daa5a5b | *.py: Stylistic adjustments for python (#8233) | 1 year ago
Georgi Gerganov | 6f11a83e4e | llama : allow overrides for tokenizer flags (#8614) | 1 year ago
Georgi Gerganov | e093dd2382 | tests : re-enable tokenizer tests (#8611) | 1 year ago
Douglas Hanley | 50e05353e8 | llama : add Mistral Nemo inference support (#8604) | 1 year ago
Jan Boon | 628154492a | server : update doc to clarify n_keep when there is bos token (#8619) | 1 year ago
Mark Zhuang | 04bab6b7da | ggml: fix compile error for RISC-V (#8623) | 1 year ago
devojony | b7c11d36e6 | examples: fix android example cannot be generated continuously (#8621) | 1 year ago
Georgi Gerganov | 45f2c19cc5 | flake.lock: Update (#8610) | 1 year ago
M-A | 22f281aa16 | examples : Rewrite pydantic_models_to_grammar_examples.py (#8493) | 1 year ago
compilade | 328884f421 | gguf-py : fix some metadata name extraction edge cases (#8591) | 1 year ago
compilade | c69c63039c | convert_hf : fix Gemma v1 conversion (#8597) | 1 year ago
Johannes Gäßler | 69c487f4ed | CUDA: MMQ code deduplication + iquant support (#8495) | 1 year ago
Georgi Gerganov | 07283b1a90 | gguf : handle null name during init (#8587) | 1 year ago
Michael Coppola | 940362224d | llama : add support for Tekken pre-tokenizer (#8579) | 1 year ago
Huifeng Ou | 69b9945b44 | llama.swiftui: fix end of generation bug (#8268) | 1 year ago
Brian | c3776cacab | gguf_dump.py: fix markddown kv array print (#8588) | 1 year ago
slaren | 87e397d00b | ggml : fix quant dot product with odd number of blocks (#8549) | 1 year ago
Brian | 57b1d4f9eb | convert-*.py: remove add_name from ChatGLMModel class (#8590) | 1 year ago
Georgi Gerganov | d197545530 | llama : bump max layers from 256 to 512 (#8530) | 1 year ago
Georgi Gerganov | be0cfb4175 | readme : fix server badge | 1 year ago
Clint Herron | b57eb9ca4f | ggml : add friendlier error message to fopen errors (#8575) | 1 year ago
Frank Mai | f299aa98ec | fix: typo of chatglm4 chat tmpl (#8586) | 1 year ago
Brian | 3d0e4367d9 | convert-*.py: add general.name kv override (#8571) | 1 year ago