cturan/llama.cpp

Author	SHA1 Message	Date
Georgi Gerganov	4f447a4833 llama : fix data units (#4101)	2 years ago
Kerfuffle	91f6499393 Respect tokenizer.ggml.add_bos_token value when tokenizing (#4040)	2 years ago
Jared Van Bortel	a6fc554e26 llama : restore prefix space in llama tokenizer (#4081)	2 years ago
Galunid	36eed0c42c stablelm : StableLM support (#3586)	2 years ago
Georgi Gerganov	4760e7cc0b sync : ggml (backend v2) (#3912)	2 years ago
Kerfuffle	bb50a792ec Add ReLU and SQR CUDA ops to (partially) fix Persimmon offloading (#4041)	2 years ago
Galunid	df9d1293de Unbreak persimmon after #3837 (#4010)	2 years ago
Meng Zhang	46876d2a2c cuda : supports running on CPU for GGML_USE_CUBLAS=ON build (#3946)	2 years ago
Meng Zhang	3d48f42efc llama : mark LLM_ARCH_STARCODER as full offload supported (#3945)	2 years ago
cebtenzzre	3fdbe6b66b llama : change yarn_ext_factor placeholder to -1 (#3922)	2 years ago
Georgi Gerganov	1efae9b7dc llm : prevent from 1-D tensors being GPU split (#3697)	2 years ago
cebtenzzre	0eb332a10f llama : fix llama_context_default_params after #2268 (#3893)	2 years ago
cebtenzzre	898aeca90a llama : implement YaRN RoPE scaling (#2268)	2 years ago
Georgi Gerganov	c43c2da8af llm : fix llm_build_kqv taking unused tensor (benign, #3837)	2 years ago
Georgi Gerganov	523e49b111 llm : fix falcon norm after refactoring (#3837)	2 years ago
Georgi Gerganov	50337961a6 llm : add llm_build_context (#3881)	2 years ago
Andrew Godfrey	73bdcb395e finetune : add -ngl parameter (#3762)	2 years ago
Georgi Gerganov	71e3718abd llama : refactor graph build code (#3837)	2 years ago
kalomaze	238657db23 samplers : Min-P sampler implementation [alternative to Top P/Top K] (#3841)	2 years ago
Georgi Gerganov	207b51900e ggml : move FP16 <-> FP32 code to ggml-impl.h (#3861)	2 years ago
Kerfuffle	6e08281e58 Extend llama_kv_cache_seq_rm to allow matching any sequence (#3843)	2 years ago
Georgi Gerganov	71a09da301 llama : fix kv shift bug (#3835)	2 years ago
Georgi Gerganov	d69d777c02 ggml : quantization refactoring (#3833)	2 years ago
Kerfuffle	bd6d9e2059 llama : allow quantizing k-quants to fall back when tensor size incompatible (#3747)	2 years ago
Georgi Gerganov	fdee152e4e starcoder : add GPU offloading (#3827)	2 years ago
cebtenzzre	6d459cbfbe llama : correctly report GGUFv3 format (#3818)	2 years ago
Georgi Gerganov	2f9ec7e271 cuda : improve text-generation and batched decoding performance (#3776)	2 years ago
Marcus Dunn	5be6c803fa llama : remove token functions with `context` args in favor of `model` (#3720)	2 years ago
goerch	9e70cc0322 Add test for MPT tokenization (#3728)	2 years ago
Kerfuffle	a5e7dbd614 llama : validate special token ids are in range when loading GGUF model (#3635)	2 years ago


Revision history