cturan/llama.cpp

Author	SHA1	Message	Date
Lars Sonchocky-Helldorf	18b663d8e4	install : add macports (#12518)	10 months ago
Xuan-Son Nguyen	fbdfefe74e	llama : gemma3 : use output tensor if it exists in model weight (#12506)	10 months ago
Georgi Gerganov	ba932dfb50	ggml : fix quantized cpy op (#12310)	10 months ago
R0CKSTAR	fac63a3d78	musa: refine compute capability (#12493)	10 months ago
Jeff Bolz	eddfb43850	vulkan: Optimize mul_mat_vec p021 and nc shaders (#12505)	10 months ago
stduhpf	4375415b4a	Vulkan: RTE rounding for cpy to quant (#12480)	10 months ago
Eve	30c42ef5cb	vulkan: workaround for AMD Windows driver 16 bit unpack8 bug (#12472)	10 months ago
Georgi Gerganov	af04481e6b	model : do not repack if a GPU device is present (#12498)	10 months ago
Sigbjørn Skjæret	960e726077	chore : cleanup llama_model_loader::TENSOR_ usage (#12492)	10 months ago
marcoStocchi	ea1518e839	llama-tts : avoid crashes related to bad model file paths (#12482)	10 months ago
蕭澧邦	1aa87ee53d	[SYCL] Fix build on Windows when ccache enabled (#9954) (#9976)	10 months ago
Svetlozar Georgiev	9ffcc9e374	sycl: cleanup oneDNN related code (#12097)	10 months ago
Woof Dog	e04643063b	webui : Prevent rerendering on textarea input (#12299)	10 months ago
Sigbjørn Skjæret	dbb3a4739e	llama : make Qwen2MoE QKV bias optional (#12477)	10 months ago
Srihari-mcw	3d82dbcbce	ggml : block interleaving support for Q4_K quantization for x86 AVX2 architecture (#12332)	10 months ago
Bartowski	732b5fbf5e	convert : avoid calls to tokenizer.added_tokens_decoder (#12473)	10 months ago
fairydreaming	568013d0cd	context : clear sets containing encoder output sequence ids before storing new values (#12470)	10 months ago
Gaurav Garg	517b5ddbf0	CUDA: Improve flash decoding kernel GPU occupancy for BS=1 case (#12183)	10 months ago
Jeff Bolz	a9b59288e2	vulkan: optimize iq1 coopmat2 dequant functions (#12427)	10 months ago
Guus Waals	0fd8487b14	Fix visionOS build and add CI (#12415)	10 months ago
Sigbjørn Skjæret	108e53c2f1	llama : add support for GPT2, Bloom and CodeShell tied word embeddings (#12456)	10 months ago
Sigbjørn Skjæret	a686171ea7	convert : Support chat_template.json (#12460)	10 months ago
Jeff Bolz	c446b2edd2	vulkan: Submit once enough matmul work has been recorded (#12406)	10 months ago
lhez	d84635b1b0	opencl: improve profiling (#12442)	10 months ago
Georgi Gerganov	75422e8bc4	graph : normalize Q, K, V shapes + sync cross attention (#12449)	10 months ago
R0CKSTAR	bb115d2bf7	musa: override warp_size of musa device to 32 (#12445)	10 months ago
Xuan-Son Nguyen	29fff308c7	llama : support converting Mistral Small text-only (#12450)	10 months ago
Georgi Gerganov	c6af2161b2	speculative : fix seg fault in certain cases (#12454)	10 months ago
Xuan-Son Nguyen	99aa304fb9	llama : add support for EXAONE tied word embeddings (#12451)	10 months ago
Georgi Gerganov	8551c44d84	context : always use non-causal attention for encoder graphs (#12447)	10 months ago

