cturan/llama.cpp

Author	SHA1 Message	Date
Georgi Gerganov	d5a1cbde60 llama : support optional tensors (#4283)	2 years ago
CausalLM	03562f3a86 llama : support attention bias on LLaMA architecture (#4283)	2 years ago
Shijie	37c746d687 llama : add Qwen support (#4281)	2 years ago
Georgi Gerganov	880f57973b llama : fix integer overflow during quantization (#4284)	2 years ago
Georgi Gerganov	ef47ec18da ggml : add ggml_soft_max_ext (#4256)	2 years ago
Jared Van Bortel	15f5d96037 build : fix build info generation and cleanup Makefile (#3920)	2 years ago
Daniel Bevenius	b18c66ca6e llama : fix alignment of general.name in print meta (#4254)	2 years ago
tarcey	954e22858c llama : fix typical sampling (#4261)	2 years ago
Georgi Gerganov	8406b0924b ggml : re-enable BLAS for CPU when src0 != F32 + remove redundant full offload checks in llama.cpp (#4240)	2 years ago
Marcus Dunn	f837c3a992 llama : grammar `reserve` space in `decode_utf8` (#4210)	2 years ago
slaren	e9c13ff781 llama : set metal log callback correctly (#4204)	2 years ago
slaren	8a052c131e ggml-cuda : support stablelm rope (#4156)	2 years ago
Georgi Gerganov	6b0a7420d0 llama : KV cache view API + better KV cache management (#4170)	2 years ago
Galunid	8e672efe63 stablelm : simplify + speedup generation (#4153)	2 years ago
slaren	e937066420 gguf-py : export chat templates (#4125)	2 years ago
slaren	bbecf3f415 llama : increase max nodes (#4115)	2 years ago
slaren	e85bb1a8e7 llama : add functions to get the model's metadata (#4013)	2 years ago
Georgi Gerganov	4f447a4833 llama : fix data units (#4101)	2 years ago
Kerfuffle	91f6499393 Respect tokenizer.ggml.add_bos_token value when tokenizing (#4040)	2 years ago
Jared Van Bortel	a6fc554e26 llama : restore prefix space in llama tokenizer (#4081)	2 years ago
Galunid	36eed0c42c stablelm : StableLM support (#3586)	2 years ago
Georgi Gerganov	4760e7cc0b sync : ggml (backend v2) (#3912)	2 years ago
Kerfuffle	bb50a792ec Add ReLU and SQR CUDA ops to (partially) fix Persimmon offloading (#4041)	2 years ago
Galunid	df9d1293de Unbreak persimmon after #3837 (#4010)	2 years ago
Meng Zhang	46876d2a2c cuda : supports running on CPU for GGML_USE_CUBLAS=ON build (#3946)	2 years ago
Meng Zhang	3d48f42efc llama : mark LLM_ARCH_STARCODER as full offload supported (#3945)	2 years ago
cebtenzzre	3fdbe6b66b llama : change yarn_ext_factor placeholder to -1 (#3922)	2 years ago
Georgi Gerganov	1efae9b7dc llm : prevent from 1-D tensors being GPU split (#3697)	2 years ago
cebtenzzre	0eb332a10f llama : fix llama_context_default_params after #2268 (#3893)	2 years ago
cebtenzzre	898aeca90a llama : implement YaRN RoPE scaling (#2268)	2 years ago
