cturan/llama.cpp

Author	SHA1 Message	Date
slaren	5bf3953d7e cuda : improve cuda pool efficiency using virtual memory (#4606)	2 years ago
LeonEricsson	7082d24cec lookup : add prompt lookup decoding example (#4484)	2 years ago
FantasyGmm	a55876955b cuda : fix jetson compile error (#4560)	2 years ago
Michael Kesper	28cb35a0ec make : add LLAMA_HIP_UMA option (#4587)	2 years ago
Georgi Gerganov	32259b2dad gguf : simplify example dependencies	2 years ago
slaren	d232aca5a7 llama : initial ggml-backend integration (#4520)	2 years ago
Matheus Gabriel Alves Silva	919c40660f build : Check the ROCm installation location (#4485)	2 years ago
Jared Van Bortel	70f806b821 build : detect host compiler and cuda compiler separately (#4414)	2 years ago
slaren	799a1cb13b llama : add Mixtral support (#4406)	2 years ago
Jared Van Bortel	6138963fb2 build : target Windows 8 for standard mingw-w64 (#4405)	2 years ago
Georgi Gerganov	fe680e3d10 sync : ggml (new ops, tests, backend, etc.) (#4359)	2 years ago
Jared Van Bortel	511f52c334 build : enable libstdc++ assertions for debug builds (#4275)	2 years ago
WillCorticesAI	d2809a3ba2 make : fix Apple clang determination bug (#4272)	2 years ago
Jared Van Bortel	15f5d96037 build : fix build info generation and cleanup Makefile (#3920)	2 years ago
Georgi Gerganov	922754a8d6 lookahead : add example for lookahead decoding (#4207)	2 years ago
Kerfuffle	28a2e6e7d4 tokenize example: Respect normal add BOS token behavior (#4126)	2 years ago
Roger Meier	8e9361089d build : support ppc64le build for make and CMake (#3963)	2 years ago
Michael Potter	6bb4908a17 Fix MacOS Sonoma model quantization (#4052)	2 years ago
Georgi Gerganov	413503d4b9 make : do not add linker flags when compiling static llava lib (#3977)	2 years ago
Damian Stewart	381efbf480 llava : expose as a shared library for downstream projects (#3613)	2 years ago
cebtenzzre	b12fa0d1c1 build : link against build info instead of compiling against it (#3879)	2 years ago
cebtenzzre	2046eb4345 make : remove unnecessary dependency on build-info.h (#3842)	2 years ago
Georgi Gerganov	d69d777c02 ggml : quantization refactoring (#3833)	2 years ago
Georgi Gerganov	2f9ec7e271 cuda : improve text-generation and batched decoding performance (#3776)	2 years ago
Georgi Gerganov	e3932593d4 Revert "make : add optional CUDA_NATIVE_ARCH (#2482)"	2 years ago
Alex	96981f37b1 make : add optional CUDA_NATIVE_ARCH (#2482)	2 years ago
Georgi Gerganov	438c2ca830 server : parallel decoding and multimodal (#3677)	2 years ago
Georgi Gerganov	d1031cf49c sampling : refactor init to use llama_sampling_params (#3696)	2 years ago
Georgi Gerganov	0e89203b51 speculative : add tree-based sampling example (#3624)	2 years ago
M. Yusuf Sarıgöz	370359e5ba examples: support LLaVA v1.5 (multimodal model) (#3436)	2 years ago

