cturan/llama.cpp

Author	SHA1 Message	Date
Aman Gupta	6d758839ff Add LLaDA-7b-MoE diffusion model (#16003)	4 months ago
Jake Karnes	3d4053f77f CUDA: fix im2col_3d to respect non-contiguous inputs (views) (#15956)	4 months ago
Diego Devesa	dc381aa9a6 docker : enable rocWMMA in ROCm images, add gfx1151 (#15997)	4 months ago
Diego Devesa	10d197409b releases : switch to rocWMMA develop branch, add gfx1151 (#15992)	4 months ago
yael-works	b907255f4b SYCL: Add COUNT_EQUAL operator support (#15991)	4 months ago
Nikolay Popov	28c39da7c6 llama-run: Fix model download on Windows (#15988)	4 months ago
Aman Gupta	106220562a CUDA: some micro-optimizations in mmf.cuh for mul_mat_id (#15926)	4 months ago
ddh0	a68f31edd7 fix KLD percentile output (#15999)	4 months ago
Sigbjørn Skjæret	b8e09f08b9 model : add grok-2 support (#15539)	4 months ago
Sigbjørn Skjæret	6c019cb04e server : only attempt to enable thinking if using jinja (#15967)	4 months ago
Georgi Gerganov	9dcd200d57 metal : remove memory pools (#15966)	4 months ago
Adam	0fa154e350 rocm.Dockerfile: added gfx1200,gfx1201 architectures to support AMD Radeon RX 9000 series (#15994)	4 months ago
Ruben Ortlam	261e6a20ff Vulkan: Clean up mul_mm shader (#15987)	4 months ago
lcy	a0e13dcbe5 build: fix the build failures of Windows HIP release job (#15984)	4 months ago
Georgi Gerganov	a14bd35014 metal : fix kernel requirements (#15983)	4 months ago
Radoslav Gerganov	918b26f197 rpc : fix regression when --device is used (#15981)	4 months ago
Diego Devesa	9ecb884346 releases : update ROCM, add gfx1200, gfx1201, gfx1151 (#15972)	4 months ago
Radoslav Gerganov	d1c6f11f47 doc : update documentation for --tensor-split (#15980)	4 months ago
Aaron Teo	6380d6a3e7 ggml-zdnn: rm user mapped buffers (#15965)	4 months ago
Jeff Bolz	aa0c461efe vulkan: fix failing dequant shaders (#15862)	4 months ago
Jeff Bolz	b9c9c9f789 vulkan: initialize vulkan-hpp to allow using extension function pointers (#15705)	4 months ago
Diego Devesa	50f4281a6f llama : allow using iGPUs with --device (#15951)	4 months ago
Georgi Gerganov	55758b00ca metal : refactor kernel loading (#15964)	4 months ago
Georgi Gerganov	f161463a54 metal : allow ops to run concurrently (#15929)	4 months ago
Georgi Gerganov	84d7b2fca1 metal : fix memory leaks (#15962)	4 months ago
Aaron Teo	40be51152d ggml-zdnn: fix #15414, activate FP16 and BF16 acceleration and incorrect zTensor free (#15839)	4 months ago
Eric Curtin	4bf5549269 Add docker protocol support for llama-server model loading (#15790)	4 months ago
Haiyue Wang	f4e664f838 context : remove redundant explicit casting to the same type (#15948)	4 months ago
Georgi Gerganov	f088b6a84f server : adjust prompt similarity thold + add logs (#15913)	4 months ago
Ruben Ortlam	304ac5693d Vulkan iGPU device selection overhaul and PCI ID API support (#15947)	4 months ago

