cturan/llama.cpp

Author	SHA1 Message	Date
Justine Tunney	8cc91dc63c ggml : add llamafile sgemm (#6414)	1 year ago
slaren	280345968d cuda : rename build flag to LLAMA_CUDA (#6299)	1 year ago
Kawrakow	76aa30a263 Add ability to use Q5_0, Q5_1, and IQ4_NL for quantized K cache (#6183)	1 year ago
slaren	2bf8d0f7c4 backend : offload large batches to GPU (#6083)	1 year ago
slaren	b0bc9f4a9d llama-bench : use random tokens to improve accuracy with mixtral (#6069)	1 year ago
Steve Grubb	6e0438da3c gguf : fix resource leaks (#6061)	1 year ago
slaren	f30ea47a87 llama : add pipeline parallelism support (#6017)	1 year ago
Georgi Gerganov	6cdabe6526 llama-bench : add embeddings option (#5924)	1 year ago
Neo Zhang Jianyu	715641391d Support multiple GPUs (split mode) on SYCL backend (#5806)	1 year ago
Pierrick Hymbert	3ab8b3a92e llama : cleanup unused mmq flags (#5772)	1 year ago
Georgi Gerganov	ab336a9d5e code : normalize enum names (#5697)	1 year ago
bmwl	f486f6e1e5 ggml : add numa options (#5377)	1 year ago
Michael Klimenko	52bb63c708 refactor : switch to emplace_back to avoid extra object (#5291)	2 years ago
Neo Zhang Jianyu	128dcbd3c9 add --no-mmap in llama-bench (#5257)	2 years ago
Georgi Gerganov	5cb04dbc16 llama : remove LLAMA_MAX_DEVICES and LLAMA_SUPPORTS_GPU_OFFLOAD (#5240)	2 years ago
Jared Van Bortel	e8dc55d006 kompute : llama-bench support and ggml_cpu_has_kompute() (#5226)	2 years ago
0cc4m	2307523d32 ggml : add Vulkan backend (#2059)	2 years ago
slaren	e7e4df031b llama : ggml-backend integration (#4766)	2 years ago
slaren	226460cc0d llama-bench : add no-kv-offload parameter (#4812)	2 years ago
Georgi Gerganov	bcc0eb4591 llama : per-layer KV cache + quantum K cache (#4309)	2 years ago
cebtenzzre	b12fa0d1c1 build : link against build info instead of compiling against it (#3879)	2 years ago
Kerfuffle	6e08281e58 Extend llama_kv_cache_seq_rm to allow matching any sequence (#3843)	2 years ago
Marcus Dunn	5be6c803fa llama : remove token functions with `context` args in favor of `model` (#3720)	2 years ago
Cebtenzzre	bc39553c90 build : enable more non-default compiler warnings (#3200)	2 years ago
slaren	16bc66d947 llama.cpp : split llama_context_params into model and context params (#3301)	2 years ago
Georgi Gerganov	ec893798b7 llama : custom attention mask + parallel decoding + no context swaps (#3228)	2 years ago
Rickard Hallerbäck	dc6897404e metal : reusing llama.cpp logging (#3152)	2 years ago
Georgi Gerganov	8c00b7a6ff sync : ggml (Metal F32 support + reduce ggml-alloc size) (#3192)	2 years ago
slaren	15b67a66c2 llama-bench : use two tokens in the warmup run for prompt evals (#3059)	2 years ago
Cebtenzzre	de2fe892af examples : replace fprintf to stdout with printf (#3017)	2 years ago

