Georgi Gerganov
|
e9240cdfa0
scripts : add get-winogrande.sh
|
2 years ago |
David Sommers
|
b46757735d
convert.py : fix llama/llama2 conversion due to vocab_size=-1 (#5019)
|
2 years ago |
Kawrakow
|
3e945cc1e9
HellaSwag: speed up by parallelizing log-prob evaluation (#5020)
|
2 years ago |
Georgi Gerganov
|
ad19812cda
perplexity : faster HellaSwag via batching (#5017)
|
2 years ago |
Kawrakow
|
682986a08e
Add Winogrande evaluation (#5015)
|
2 years ago |
Georgi Gerganov
|
dcad445d0c
scritps : add helper script to get hellaswag data in txt format
|
2 years ago |
Paul Tsochantaris
|
1e605f4102
metal : fix memory leak, dangling pointer and unused autorel (#5007)
|
2 years ago |
Georgi Gerganov
|
6b6916b215
sync : ggml
|
2 years ago |
Georgi Gerganov
|
38566680cd
ggml : add IQ2 to test-backend-ops + refactoring (#4990)
|
2 years ago |
Georgi Gerganov
|
ba69bbc84c
imatrix : offload to GPU support (#4957)
|
2 years ago |
Georgi Gerganov
|
44a1a4a41a
backend : add eval callback (#4935)
|
2 years ago |
Georgi Gerganov
|
c918fe8dca
metal : create autorelease pool during library build (#4970)
|
2 years ago |
Georgi Gerganov
|
0f83e727af
py : fix whitespace
|
2 years ago |
Georgi Gerganov
|
4f4bf35f46
py : fix missing added_tokens_dict for SPM and BPE vocabs (#4971)
|
2 years ago |
Kawrakow
|
2b3a665d39
llama : use Q4_K for attn_v for Q2_K_S when n_gqa >= 4 (#4996)
|
2 years ago |
Paul Tsochantaris
|
7563293665
metal : remove unnecessary nil check (#4986)
|
2 years ago |
David Renshaw
|
f46c0c1b0e
llama : fix copy/paste error in llama_sampling_params comment (#4994)
|
2 years ago |
Georgi Gerganov
|
5c99960901
py : remove unnecessary hasattr (#4903)
|
2 years ago |
Philip Taron
|
bee938da74
nix: remove nixConfig from flake.nix (#4984)
|
2 years ago |
Daniel Bevenius
|
cec8a48470
finetune : add training data file to log message (#4979)
|
2 years ago |
Kawrakow
|
334a835a1c
ggml : importance matrix support for legacy quants (#4969)
|
2 years ago |
Maximilian Winter
|
4feb4b33ee
examples : add complete parallel function calling example (#4974)
|
2 years ago |
Georgi Gerganov
|
959ef0c0df
perplexity : fix kv cache handling for hellaswag (#4981)
|
2 years ago |
Georgi Gerganov
|
c37b3474e6
flake.lock: update flake-parts, flake-parts/nixpkgs-lib, and nixpkgs (#4920)
|
2 years ago |
Paul Tsochantaris
|
158f8c9e21
metal : localized logic in `ggml_metal_graph_compute` (#4924)
|
2 years ago |
Neuman Vong
|
862f5e41ab
android : introduce starter project example (#4926)
|
2 years ago |
Alex Azarov
|
3a48d558a6
metal : replace loop of dispatch_async with dispatch_apply (#4934)
|
2 years ago |
Alex Azarov
|
7c8d3abd1a
metal : log `recommendedMaxWorkingSetSize` on iOS 16+ (#4936)
|
2 years ago |
Maximilian Winter
|
122ed4840c
examples : fix and improv docs for the grammar generator (#4909)
|
2 years ago |
Justine Tunney
|
a0b3ac8c48
ggml : introduce GGML_CALL function annotation (#4850)
|
2 years ago |