| Commit | Message | Author | Date |
|---|---|---|---|
| 730d9c681e | convert.py : advanced option (#2753) | Kerfuffle | 2 years ago |
| c7d92e6dfe | llama : use Unicode Escape Sequence to replace encoded characters (#2814) | Tim Miller | 2 years ago |
| 61d1a2895e | flake.nix : add rocm support and cleanup (#2808) | Tungsten842 | 2 years ago |
| 741ca7dd1c | llama : move #includes out of _GNU_SOURCE conditional (#2817) | Cebtenzzre | 2 years ago |
| 72f895c923 | main : fix bug (penalize_nl=false doesn't work) + suppress warning on mingw (#1528) | Dr. Tom Murphy VII Ph.D | 2 years ago |
| 50526f37eb | llama : use std::abs in llama_sample_tail_free (#2800) | Cebtenzzre | 2 years ago |
| 04f4b1eb10 | k-quants : remove unnecessary tensor shape restrictions (#2811) | Georgi Gerganov | 2 years ago |
| 7592375403 | Better perplexity for 2- and 3-bit quantization for LLaMA-v2-70B (#2807) | Kawrakow | 2 years ago |
| 771551a793 | Fix HellaSwag (#2805) | Kawrakow | 2 years ago |
| f305bad11e | flake : build llama.cpp on Intel with nix (#2795) | Volodymyr Vitvitskyi | 2 years ago |
| a2ca4e9de9 | Handle null rope scaling value (#2793) | Nigel Bosch | 2 years ago |
| 2ba83c8685 | Fix spm whitespaces (#2806) | klosax | 2 years ago |
| bae5c5f679 | examples : skip unnecessary external lib in server README.md how-to (#2804) | lon | 2 years ago |
| 232caf3c15 | llama : fix struct decl (#2790) | Marcus Dunn | 2 years ago |
| d046dcee08 | Faster perplexity computation (#2786) | Kawrakow | 2 years ago |
| c82742ac9c | llama : add llama_beam_search() (#2267) | Matt Pulver | 2 years ago |
| 28b2c996ca | convert.py : Get rope scale from HuggingFace models (#2772) | Nigel Bosch | 2 years ago |
| 154725c543 | llama-bench : add model sizes (#2771) | slaren | 2 years ago |
| 12e2e33a97 | convert.py : export rope freq_base when converting CodeLlama from an HF model (#2773) | slaren | 2 years ago |
| 29674ab4e8 | server : display token probabilities in the UI (#2489) | Jhen-Jie Hong | 2 years ago |
| 5439a0ab57 | ci : pip install gguf in editable mode (#2782) | Georgi Gerganov | 2 years ago |
| 8194cd8772 | gguf : export objects to user code (#2780) | M. Yusuf Sarıgöz | 2 years ago |
| 6bbc598a63 | ROCm Port (#1087) | Henri Vasserman | 2 years ago |
| 3f460a2b72 | cuda : add RoPE kernel for mode == 2 (NeoX) (#2760) | Georgi Gerganov | 2 years ago |
| 87e3733f24 | gguf : make gguf pip-installable | M. Yusuf Sarıgöz | 2 years ago |
| b91ad7f461 | ggml-alloc : enlarge size of parse_seq (#2776) | Shouzheng Liu | 2 years ago |
| 2e5f70a25f | Added `enum` to `llama_token_get_type` return type (#2774) | Marcus Dunn | 2 years ago |
| d0f77b1353 | convert.py : try to determine n_ctx automatically for CodeLlama (#2770) | slaren | 2 years ago |
| 0d3094f0c7 | gguf : add rope_freq_base parameter for CodeLlama (#2769) | slaren | 2 years ago |
| 01f2224682 | falcon : write file type | Georgi Gerganov | 2 years ago |