compilade
|
557410b8f0
llama : greatly reduce output buffer memory usage (#6122)
|
1 year ago |
slaren
|
f30ea47a87
llama : add pipeline parallelism support (#6017)
|
1 year ago |
Georgi Gerganov
|
05b06210c9
llama : more consistent names of count variables (#5994)
|
1 year ago |
slaren
|
d894f352bf
perplexity : support using multiple sequences to allow larger batch sizes (#5946)
|
1 year ago |
compilade
|
c2101a2e90
llama : support Mamba Selective State Space Models (#5328)
|
1 year ago |
Georgi Gerganov
|
b1de96824b
ci : fix wikitext url + compile warnings (#5569)
|
1 year ago |
Herman Semenov
|
5d3de51f97
ggml, common, examples, tests : fixed type arguments in printf (#5528)
|
1 year ago |
bmwl
|
f486f6e1e5
ggml : add numa options (#5377)
|
1 year ago |
Michael Klimenko
|
52bb63c708
refactor : switch to emplace_back to avoid extra object (#5291)
|
1 year ago |
kalomaze
|
191221178f
perplexity : fix KL divergence calculations on Windows (#5273)
|
1 year ago |
Kawrakow
|
44879ee885
Additional KL-divergence statistics (#5081)
|
2 years ago |
Georgi Gerganov
|
89758723c7
minor : clean-up some warnings and style (#5094)
|
2 years ago |
Kawrakow
|
6f9939d119
KL-divergence (#5076)
|
2 years ago |
Kawrakow
|
7dcbe39d36
Add ability to evaluate multiple choice tasks (#5047)
|
2 years ago |
Jared Van Bortel
|
97c1549808
perplexity : fix MSVC build after #5020 (#5043)
|
2 years ago |
Kawrakow
|
7051aacfac
winogrande: evaluate log-probs in parallel (#5036)
|
2 years ago |
Kawrakow
|
993fba8180
perplexity: avoid unnecessary allocations and logit copies (#5035)
|
2 years ago |
Georgi Gerganov
|
8b20858e5e
perplexity : faster Winogrande via batching (#5024)
|
2 years ago |
Georgi Gerganov
|
d391ae9b49
perplexity : fix winogrande N tasks option
|
2 years ago |
Kawrakow
|
3e945cc1e9
HellaSwag: speed up by parallelizing log-prob evaluation (#5020)
|
2 years ago |
Georgi Gerganov
|
ad19812cda
perplexity : faster HellaSwag via batching (#5017)
|
2 years ago |
Kawrakow
|
682986a08e
Add Winogrande evaluation (#5015)
|
2 years ago |
Georgi Gerganov
|
959ef0c0df
perplexity : fix kv cache handling for hellaswag (#4981)
|
2 years ago |
Kerfuffle
|
91f6499393
Respect tokenizer.ggml.add_bos_token value when tokenizing (#4040)
|
2 years ago |
cebtenzzre
|
b12fa0d1c1
build : link against build info instead of compiling against it (#3879)
|
2 years ago |
Kerfuffle
|
6e08281e58
Extend llama_kv_cache_seq_rm to allow matching any sequence (#3843)
|
2 years ago |
Marcus Dunn
|
5be6c803fa
llama : remove token functions with `context` args in favor of `model` (#3720)
|
2 years ago |
slaren
|
16bc66d947
llama.cpp : split llama_context_params into model and context params (#3301)
|
2 years ago |
Georgi Gerganov
|
ec893798b7
llama : custom attention mask + parallel decoding + no context swaps (#3228)
|
2 years ago |
Cebtenzzre
|
8781013ef6
make : restore build-info.h dependency for several targets (#3205)
|
2 years ago |