cturan/llama.cpp

Author	SHA1 Message	Date
0cc4m	f8e9140cb4 Vulkan Fixes (#5223)	2 years ago
Yiming Cui	d62520eb2c Fix typos of IQ2_XXS and IQ3_XXS in llama.cpp (#5231)	2 years ago
Neo Zhang Jianyu	01684139c3 support SYCL backend windows build (#5208)	2 years ago
Jared Van Bortel	e8dc55d006 kompute : llama-bench support and ggml_cpu_has_kompute() (#5226)	2 years ago
Georgi Gerganov	e0085fdf7c Revert "server : change deps.sh xxd files to string literals (#5221)"	2 years ago
Georgi Gerganov	e6f291d158 server : fix context shift (#5195)	2 years ago
JohnnyB	4003be0e5f server : change deps.sh xxd files to string literals (#5221)	2 years ago
Kawrakow	fea4fd4ba7 ggml : fix IQ3_XXS on Metal (#5219)	2 years ago
Georgi Gerganov	8f8ddfcfad sync : ggml (#0)	2 years ago
Georgi Gerganov	6fb50ebbf0 gguf : fix comparison (ggml/715)	2 years ago
John Balis	625a699b54 `ggml_cuda_cpy` support for 4d tensors and float16->float32 upcasting (ggml/686)	2 years ago
Georgi Gerganov	a4b07c057a gguf : add input validation, prevent integer overflows (ggml/709)	2 years ago
Georgi Gerganov	549a1e6cd5 ci : fix yolo URLs + fix metal capture (ggml/712)	2 years ago
Jack Mousseau	5f14ee0b0c metal : add debug capture backend function (ggml/694)	2 years ago
Kawrakow	8e14e3ddb3 Faster AVX2 dot product for IQ2_XS (#5187)	2 years ago
Kawrakow	f4d7e54974 SOTA 3-bit quants (#5196)	2 years ago
0cc4m	2256f36b79 Vulkan Windows APU Memory Handling (#5199)	2 years ago
Vladimir Malyutin	7359016c7c quantize : fix typo (#5211)	2 years ago
divinity76	813416991a main : allow empty --prompt-cache file (#5176)	2 years ago
Romain Neutron	5589921ef8 readme : minor (#5204)	2 years ago
Georgi Gerganov	49f44b5c55 readme : update hot topics	2 years ago
Wu Jian Ping	6685cc41c2 server : improve README (#5209)	2 years ago
Paul Tsochantaris	ceebbb5b21 ggml alloc: Fix for null dereference on alloc failure (#5200)	2 years ago
Jared Van Bortel	6daa69ee81 kompute : fix fallback to CPU (#5201)	2 years ago
Jared Van Bortel	fbf1ddec69 Nomic Vulkan backend (#4456)	2 years ago
divinity76	2aed77eb06 fix typo "RLIMIT_MLOCK" (#5175)	2 years ago
Wu Jian Ping	c82d18e863 server : embeddings compatibility for OpenAI (#5190)	2 years ago
Georgi Gerganov	14fef85e2d py : fix except (#5194)	2 years ago
Sang-Kil Park	e76627bcce py : improve BPE tokenizer support (#5189)	2 years ago
slaren	fbe7dfa53c ggml : add max buffer sizes to opencl and metal backends (#5181)	2 years ago

Commit History