Haoxiang Fei | f99e1e456e | llama : lookup word in vocab before doing BPE merges (#7193) | 1 year ago
Johannes Gäßler | 5ae3426b0b | server: fix reported top tokens for temperature 0 (#7203) | 1 year ago
Joan Fontanals | b83cc3f5b3 | llama : add Jina Embeddings architecture (#6826) | 1 year ago
Georgi Gerganov | 9cb317f77e | ggml : full ALiBi support (#7192) | 1 year ago
slaren | e849648888 | llama-bench : add pp+tg test type (#7199) | 1 year ago
Georgi Gerganov | 18e437665c | metal : fix flash attention kernel requirements (#7169) | 1 year ago
Georgi Gerganov | 8c660242d7 | convert : print "ignore_merges" field | 1 year ago
slaren | 25c6e82e7a | llama : use n_vocab to differentiate between mistral 7B and llama3 8B (#7200) | 1 year ago
Justine Tunney | 4e3880978f | Fix memory bug in grammar parser (#7194) | 1 year ago
HanishKVC | f89fe2732c | Main+: optionally allow special tokens from user in interactive mode (#7097) | 1 year ago
Andrei | d11afd6652 | llava : fix moondream support (#7163) | 1 year ago
Ouadie EL FAROUKI | 8c570c9496 | Minor arithmetic improvement to mmvq wrapper kernel (#7172) | 1 year ago
slaren | eaf4bd8b39 | eval-callback : fix conversion to float (#7184) | 1 year ago
0cc4m | befddd0f15 | Vulkan Bugfixes and Improvements (#7084) | 1 year ago
Georgi Gerganov | d46dbc76f8 | readme : add scheduled server workflow status badge | 1 year ago
l3utterfly | 0961d86604 | readme : add app (#6371) | 1 year ago
jaime-m-p | 43248e5594 | llama3 custom regex split (#6965) | 1 year ago
Johannes Gäßler | a743d76a01 | CUDA: generalize FP16 fattn vec kernel (#7061) | 1 year ago
Galunid | f31ec120bc | Add warning if token is invalid (#7173) | 1 year ago
Daniel Bevenius | fd9f92b154 | llama : update llama_timings.n_p_eval setting (#7160) | 1 year ago
Sigbjørn Skjæret | 22842164bc | gguf-py : add special token modification capability (#7166) | 1 year ago
Albert Jin | 4734524882 | opencl : alignment size converted from bits to bytes (#7090) | 1 year ago
Ahmet Zeer | 07cd41d096 | TypoFix (#7162) | 1 year ago
Jared Van Bortel | 4426e2987b | cmake : fix typo (#7151) | 1 year ago
compilade | f98eb31c51 | convert-hf : save memory with lazy evaluation (#7075) | 1 year ago
agray3 | bc4bba364f | Introduction of CUDA Graphs to LLama.cpp (#6766) | 1 year ago
Johannes Gäßler | c12452c7ae | JSON: [key] -> .at(key), assert() -> GGML_ASSERT (#7143) | 1 year ago
Georgi Gerganov | 9da243b36a | Revert "llava : add support for moondream vision language model (#6899)" | 1 year ago
JohnnyB | bd1871fa2b | server : add themes + favicon (#6848) | 1 year ago
Gilad S | 26458af1d6 | metal : use `vm_allocate` instead of `posix_memalign` on macOS (#7078) | 1 year ago