Jared Van Bortel
|
b43ebde3b0
convert : partially revert PR #4818 (#5041)
|
2 years ago |
Jared Van Bortel
|
97c1549808
perplexity : fix MSVC build after #5020 (#5043)
|
2 years ago |
slaren
|
6df465a91d
llama : run all KQV ops on the CPU with no KV offload (#5049)
|
2 years ago |
Herman Semenov
|
77bc1bbd05
cmake : add support for ccache (#5002)
|
2 years ago |
adel boussaken
|
48e2b13372
Add a dart/flutter binding to README.md (#4882)
|
2 years ago |
Kylin
|
cca894f16a
cuda : fix compile error in jetson platform (#4975)
|
2 years ago |
Uzo Nweke
|
381ee19572
finetune : fix ggml_allocr lifetimes (tmp workaround) (#5033)
|
2 years ago |
Georgi Gerganov
|
a5cacb22b2
imatrix : add README.md
|
2 years ago |
Shijie
|
9b75cb2b3c
llama : support upcoming Qwen2 (#5037)
|
2 years ago |
Georgi Gerganov
|
de9a147df1
py : fix flake8 lint
|
2 years ago |
Kawrakow
|
7051aacfac
winogrande: evaluate log-probs in parallel (#5036)
|
2 years ago |
chiranko
|
2b3b999cac
llama : add CodeShell support (#5016)
|
2 years ago |
Kawrakow
|
993fba8180
perplexity: avoid unnecessary allocations and logit copies (#5035)
|
2 years ago |
Georgi Gerganov
|
8b20858e5e
perplexity : faster Winogrande via batching (#5024)
|
2 years ago |
John
|
57e2a7a52a
llama : fix falcon arch for tied output embeddings (#4978)
|
2 years ago |
Georgi Gerganov
|
9b6ea4263a
cmake : add ggml public headers (#5011)
|
2 years ago |
Xuan Son Nguyen
|
821f0a271e
server : defer tasks when "slot unavailable" (#5018)
|
2 years ago |
slaren
|
96d7f56d29
llama : fix mlock with no-mmap with Metal (#5025)
|
2 years ago |
Georgi Gerganov
|
2d5419d08a
imatrix : fix assert for src0 non-cont check
|
2 years ago |
Georgi Gerganov
|
d391ae9b49
perplexity : fix winogrande N tasks option
|
2 years ago |
Georgi Gerganov
|
e9240cdfa0
scripts : add get-winogrande.sh
|
2 years ago |
David Sommers
|
b46757735d
convert.py : fix llama/llama2 conversion due to vocab_size=-1 (#5019)
|
2 years ago |
Kawrakow
|
3e945cc1e9
HellaSwag: speed up by parallelizing log-prob evaluation (#5020)
|
2 years ago |
Georgi Gerganov
|
ad19812cda
perplexity : faster HellaSwag via batching (#5017)
|
2 years ago |
Kawrakow
|
682986a08e
Add Winogrande evaluation (#5015)
|
2 years ago |
Georgi Gerganov
|
dcad445d0c
scripts : add helper script to get hellaswag data in txt format
|
2 years ago |
Paul Tsochantaris
|
1e605f4102
metal : fix memory leak, dangling pointer and unused autorel (#5007)
|
2 years ago |
Georgi Gerganov
|
6b6916b215
sync : ggml
|
2 years ago |
Georgi Gerganov
|
38566680cd
ggml : add IQ2 to test-backend-ops + refactoring (#4990)
|
2 years ago |
Georgi Gerganov
|
ba69bbc84c
imatrix : offload to GPU support (#4957)
|
2 years ago |