cturan/llama.cpp

Author	SHA1 Message	Date
Diego Devesa	c6807b3f28 ci : add ubuntu cuda build, build with one arch on windows (#10456)	1 year ago
Charles Xu	25669aa92c ggml-cpu: cmake add arm64 cpu feature check for macos (#10487)	1 year ago
Georgi Gerganov	84e1c33cde server : fix parallel speculative decoding (#10513)	1 year ago
Georgi Gerganov	811872a59d speculative : simplify the implementation (#10504)	1 year ago
Shanshan Shen	9a4b79bcfa CANN: Improve the Inferencing Performance for Ascend NPU Device (#10454)	1 year ago
Chenguang Li	7066b4cce2 CANN: RoPE and CANCAT operator optimization (#10488)	1 year ago
Junil Kim	0eb4e12bee vulkan: Fix a vulkan-shaders-gen arugment parsing error (#10484)	1 year ago
Eric Curtin	0cc63754b8 Introduce llama-run (#10291)	1 year ago
Diego Devesa	50d5cecbda ci : build docker images only once daily (#10503)	1 year ago
Georgi Gerganov	9fd8c2687f server : add more information about error (#10455)	1 year ago
Georgi Gerganov	47f931c8f9 server : enable cache_prompt by default (#10501)	1 year ago
Georgi Gerganov	106964e3d2 metal : enable mat-vec kernels for bs <= 4 (#10491)	1 year ago
Shane A	80acb7b430 Rename Olmo1124 to Olmo2 (#10500)	1 year ago
Diego Devesa	10bce0450f llama : accept a list of devices to use to offload a model (#10497)	1 year ago
Johannes Gäßler	1f922254f0 Github: update issue templates [no ci] (#10489)	1 year ago
brucepro	a9a678a6b2 Add download chat feature to server chat (#10481)	1 year ago
Georgi Gerganov	9ca2e67762 server : add speculative decoding support (#10455)	1 year ago
Diego Devesa	5931c1f233 ggml : add support for dynamic loading of backends (#10469)	1 year ago
Georgi Gerganov	f6d12e7df8 tests : fix compile warning	1 year ago
Georgi Gerganov	b756441104 metal : minor code formatting	1 year ago
Neo Zhang Jianyu	5a8987793f [SYCL] Fix building Win package for oneAPI 2025.0 update (#10483)	1 year ago
Georgi Gerganov	d9d54e498d speculative : refactor and add a simpler example (#10362)	1 year ago
Georgi Gerganov	cce5a90075 flake.lock: Update (#10470)	1 year ago
Diego Devesa	dc39012cba llama : fix op mul check with command-r-plus (#10476)	1 year ago
Gabe Goodhart	9336db462c convert : XLMRoberta Type Vocab Size (#10458)	1 year ago
momonga	96fa2c5e2d fix gguf-py: Conversion error when multiple licenses are configured (#9807)	1 year ago
Diego Devesa	55ed008b2d ggml : do not use ARM features not included in the build (#10457)	1 year ago
蕭澧邦	6dfcfef078 ci: Update oneAPI runtime dll packaging (#10428)	1 year ago
Johannes Gäßler	599b3e0cd4 GitHub: ask for more info in issue templates (#10426)	1 year ago
leo-pony	c18610b4ee CANN: Support Ascend310P to accelerate F32 and F16 Model (#10216)	1 year ago
