slaren | 541600201e | llama : disable pipeline parallelism with nkvo (#7265) | 1 year ago
Elton Kola | efc8f767c8 | move ndk code to a new library (#6951) | 1 year ago
Haggai Nuchi | e0f556186b | Add left recursion check: quit early instead of going into an infinite loop (#7083) | 1 year ago
Ryuei | 27f65d6267 | docs: Fix typo and update description for --embeddings flag (#7026) | 1 year ago
compilade | ee52225067 | convert-hf : support direct Q8_0 conversion (#7234) | 1 year ago
Georgi Gerganov | 614d3b914e | llama : less KV padding when FA is off (#7257) | 1 year ago
k.h.lai | 30e70334f7 | llava-cli: fix base64 prompt (#7248) | 1 year ago
Johannes Gäßler | 1c570d8bee | perplexity: add BF16 vs. FP16 results (#7150) | 1 year ago
Neo Zhang | 948f4ec7c5 | [SYCL] rm wait() (#7233) | 1 year ago
Joan Fontanals | 9aa672490c | llama : rename jina tokenizers to v2 (#7249) | 1 year ago
Brian | b1f8af1886 | convert.py: Outfile default name change and additional metadata support (#4858) | 1 year ago
Benjamin Findley | e586ee4259 | change default temperature of OAI compat API from 0 to 1 (#7226) | 1 year ago
Neo Zhang | cbf75894d2 | [SYCL] Add oneapi runtime dll files to win release package (#7241) | 1 year ago
Neo Zhang | 0d5cef78ae | [SYCL] update CI with oneapi 2024.1 (#7235) | 1 year ago
Johannes Gäßler | dc685be466 | CUDA: add FP32 FlashAttention vector kernel (#7188) | 1 year ago
Georgi Gerganov | 6f1b63606f | cmake : fix version cmp (#7227) | 1 year ago
slaren | b228aba91a | remove convert-lora-to-ggml.py (#7204) | 1 year ago
Georgi Gerganov | 7bd4ffb780 | metal : fix warnings (skipme) (#0) | 1 year ago
Georgi Gerganov | 1622ac023f | sync : ggml | 1 year ago
Georgi Gerganov | 6aeff24f8b | metal : fix indent (ggml/0) | 1 year ago
Georgi Gerganov | 325756d28d | ggml : resolve merge (ggml/0) | 1 year ago
Josh Ramer | fed0108491 | Scripting & documenting debugging one test without anything else in the loop. (#7096) | 1 year ago
Xuan Son Nguyen | 72c177c1f6 | fix system prompt handling (#7153) | 1 year ago
compilade | 5a419926b0 | convert-hf : support bfloat16 conversion (#7158) | 1 year ago
Georgi Gerganov | fae9d234b6 | sync : ggml | 1 year ago
Justina Cho | f5ef34e428 | feat: implemented sigmoid function (ggml/806) | 1 year ago
Borislav Stanimirov | ef0d5e3ec9 | build: fix and ignore msvc warnings (ggml/805) | 1 year ago
CrispStrobe | 3292733f95 | convert : skip unaccessible HF repos (#7210) | 1 year ago
Steve Grubb | 988631335a | server : free llama_batch on exit (#7212) | 1 year ago
Haoxiang Fei | f99e1e456e | llama : lookup word in vocab before doing BPE merges (#7193) | 1 year ago