cturan/llama.cpp

Author	SHA1 Message	Date
Kawrakow	89503dcb5f iq3_xxs: quards for the no-imatrix situation (#5334)	1 year ago
Jared Van Bortel	1ec3332ade YaRN : store rope scaling type as int32_t in memory (#5285)	1 year ago
Ian Bull	e1e721094d llama : fix memory leak in llama_batch_free (#5252)	1 year ago
Guoteng	ce32060198 llama : support InternLM2 (#5184)	1 year ago
Georgi Gerganov	d3bac7d584 llama : reorder build_orion() at correct place (#5118)	1 year ago
Georgi Gerganov	5cb04dbc16 llama : remove LLAMA_MAX_DEVICES and LLAMA_SUPPORTS_GPU_OFFLOAD (#5240)	1 year ago
Yiming Cui	d62520eb2c Fix typos of IQ2_XXS and IQ3_XXS in llama.cpp (#5231)	1 year ago
Jared Van Bortel	e8dc55d006 kompute : llama-bench support and ggml_cpu_has_kompute() (#5226)	1 year ago
Kawrakow	f4d7e54974 SOTA 3-bit quants (#5196)	1 year ago
Jared Van Bortel	6daa69ee81 kompute : fix fallback to CPU (#5201)	2 years ago
Jared Van Bortel	fbf1ddec69 Nomic Vulkan backend (#4456)	2 years ago
divinity76	2aed77eb06 fix typo "RLIMIT_MLOCK" (#5175)	2 years ago
0cc4m	2307523d32 ggml : add Vulkan backend (#2059)	2 years ago
Abhilash Majumder	0f648573dd ggml : add unified SYCL backend for Intel GPUs (#2690)	2 years ago
Johannes Gäßler	9241c3a2ac Apply min_p to unsorted tokens (#5115)	2 years ago
Johannes Gäßler	b2b2bf988c Tests for min_p, sampling queue (#5147)	2 years ago
sharpHL	f2e69d28c0 llama : add support for Orion-14B (#5118)	2 years ago
Kawrakow	1182cf4d4f Another bucket sort (#5109)	2 years ago
l3utterfly	5eaf9964fc llama : dynamic temperature sampling (#4972)	2 years ago
Kawrakow	faa3526a1e Fix Q3_K_XS for MoE models (#5113)	2 years ago
slaren	1387ea2117 llama : pre-allocate input tensors in a separate buffer (#5100)	2 years ago
Georgi Gerganov	89758723c7 minor : clean-up some warnings and style (#5094)	2 years ago
slaren	011e8ec577 llama : fix not enough space in buffer with Qwen (#5086)	2 years ago
compilade	d6bd4d46dd llama : support StableLM 2 1.6B (#5052)	2 years ago
Kawrakow	66d575c45c llama : add Q3_K_XS (#5060)	2 years ago
Shijie	3466c6ebcf llama : add more qwen2 models (#5071)	2 years ago
slaren	6df465a91d llama : run all KQV ops on the CPU with no KV offload (#5049)	2 years ago
Shijie	9b75cb2b3c llama : support upcoming Qwen2 (#5037)	2 years ago
chiranko	2b3b999cac llama : add CodeShell support (#5016)	2 years ago
John	57e2a7a52a llama : fix falcon arch for tied output embeddings (#4978)	2 years ago
