cturan/llama.cpp

Author	SHA1	Message	Date
Jeff Bolz	5b3466bedf	vulkan: Handle GPUs with less shared memory (#10468)	1 year ago
Jeff Bolz	249a7902ec	vulkan: further optimize q5_k mul_mat_vec (#10479)	1 year ago
Jeff Bolz	71a64989a5	vulkan: skip integer div/mod in get_offsets for batch_idx==0 (#10506)	1 year ago
Jeff Bolz	4a57d362e1	vulkan: optimize Q2_K and Q3_K mul_mat_vec (#10459)	1 year ago
Diego Devesa	c9b00a70b0	ci : fix cuda releases (#10532)	1 year ago
Shane A	de5097351c	Add OLMo 2 model in docs (#10530)	1 year ago
Diego Devesa	5a349f2809	ci : remove nix workflows (#10526)	1 year ago
Diego Devesa	30ec398321	llama : disable warnings for 3rd party sha1 dependency (#10527)	1 year ago
Tristan Druyen	be0e350c8b	Fix HIP flag inconsistency & build docs (#10524)	1 year ago
R0CKSTAR	249cd93da3	mtgpu: Add MUSA_DOCKER_ARCH in Dockerfiles && update cmake and make (#10516)	1 year ago
Jeff Bolz	904109ed0d	vulkan: fix group_norm (#10496)	1 year ago
Xuan Son Nguyen	45abe0f74e	server : replace behave with pytest (#10416)	1 year ago
Neo Zhang Jianyu	0bbd2262a3	restore the condistion to build & update pacakge when merge (#10507)	1 year ago
Georgi Gerganov	ab96610b1e	cmake : enable warnings in llama (#10474)	1 year ago
Diego Devesa	7db3846a94	ci : publish the docker images created during scheduled runs (#10515)	1 year ago
Diego Devesa	c6807b3f28	ci : add ubuntu cuda build, build with one arch on windows (#10456)	1 year ago
Charles Xu	25669aa92c	ggml-cpu: cmake add arm64 cpu feature check for macos (#10487)	1 year ago
Georgi Gerganov	84e1c33cde	server : fix parallel speculative decoding (#10513)	1 year ago
Georgi Gerganov	811872a59d	speculative : simplify the implementation (#10504)	1 year ago
Shanshan Shen	9a4b79bcfa	CANN: Improve the Inferencing Performance for Ascend NPU Device (#10454)	1 year ago
Chenguang Li	7066b4cce2	CANN: RoPE and CANCAT operator optimization (#10488)	1 year ago
Junil Kim	0eb4e12bee	vulkan: Fix a vulkan-shaders-gen arugment parsing error (#10484)	1 year ago
Eric Curtin	0cc63754b8	Introduce llama-run (#10291)	1 year ago
Diego Devesa	50d5cecbda	ci : build docker images only once daily (#10503)	1 year ago
Georgi Gerganov	9fd8c2687f	server : add more information about error (#10455)	1 year ago
Georgi Gerganov	47f931c8f9	server : enable cache_prompt by default (#10501)	1 year ago
Georgi Gerganov	106964e3d2	metal : enable mat-vec kernels for bs <= 4 (#10491)	1 year ago
Shane A	80acb7b430	Rename Olmo1124 to Olmo2 (#10500)	1 year ago
Diego Devesa	10bce0450f	llama : accept a list of devices to use to offload a model (#10497)	1 year ago
Johannes Gäßler	1f922254f0	Github: update issue templates [no ci] (#10489)	1 year ago

Commit history
