cturan/llama.cpp

Author	SHA1 Message	Date
Georgi Gerganov	38566680cd ggml : add IQ2 to test-backend-ops + refactoring (#4990)	2 years ago
Georgi Gerganov	ba69bbc84c imatrix : offload to GPU support (#4957)	2 years ago
Georgi Gerganov	44a1a4a41a backend : add eval callback (#4935)	2 years ago
Georgi Gerganov	c918fe8dca metal : create autorelease pool during library build (#4970)	2 years ago
Georgi Gerganov	0f83e727af py : fix whitespace	2 years ago
Georgi Gerganov	4f4bf35f46 py : fix missing added_tokens_dict for SPM and BPE vocabs (#4971)	2 years ago
Kawrakow	2b3a665d39 llama : use Q4_K for attn_v for Q2_K_S when n_gqa >= 4 (#4996)	2 years ago
Paul Tsochantaris	7563293665 metal : remove unnecessary nil check (#4986)	2 years ago
David Renshaw	f46c0c1b0e llama : fix copy/paste error in llama_sampling_params comment (#4994)	2 years ago
Georgi Gerganov	5c99960901 py : remove unnecessary hasattr (#4903)	2 years ago
Philip Taron	bee938da74 nix: remove nixConfig from flake.nix (#4984)	2 years ago
Daniel Bevenius	cec8a48470 finetune : add training data file to log message (#4979)	2 years ago
Kawrakow	334a835a1c ggml : importance matrix support for legacy quants (#4969)	2 years ago
Maximilian Winter	4feb4b33ee examples : add complete parallel function calling example (#4974)	2 years ago
Georgi Gerganov	959ef0c0df perplexity : fix kv cache handling for hellaswag (#4981)	2 years ago
Georgi Gerganov	c37b3474e6 flake.lock: update flake-parts, flake-parts/nixpkgs-lib, and nixpkgs (#4920)	2 years ago
Paul Tsochantaris	158f8c9e21 metal : localized logic in `ggml_metal_graph_compute` (#4924)	2 years ago
Neuman Vong	862f5e41ab android : introduce starter project example (#4926)	2 years ago
Alex Azarov	3a48d558a6 metal : replace loop of dispatch_async with dispatch_apply (#4934)	2 years ago
Alex Azarov	7c8d3abd1a metal : log `recommendedMaxWorkingSetSize` on iOS 16+ (#4936)	2 years ago
Maximilian Winter	122ed4840c examples : fix and improv docs for the grammar generator (#4909)	2 years ago
Justine Tunney	a0b3ac8c48 ggml : introduce GGML_CALL function annotation (#4850)	2 years ago
Daniel Bevenius	d75c232e1d finetune : use LLAMA_FILE_MAGIC_GGLA (#4961)	2 years ago
stduhpf	e0324285a5 speculative : threading options (#4959)	2 years ago
ngc92	3e5ca7931c pass cpu-architecture arguments only to host code (C;C++) (#4943)	2 years ago
David Friehs	4483396751 llama : apply classifier-free guidance to logits directly (#4951)	2 years ago
Victor Z. Peng	d9aa4ffa6e awq-py : fix typo in awq-py/README.md (#4947)	2 years ago
Georgi Gerganov	ddb008d845 cuda : fix dequantize kernel names (#4938)	2 years ago
Kawrakow	2faaef3979 llama : check for 256 divisibility for IQ2_XS, IQ2_XXS (#4950)	2 years ago
Kawrakow	4a3156de2f CUDA: faster dequantize kernels for Q4_0 and Q4_1 (#4938)	2 years ago


Commit History
