cturan/llama.cpp

Author	SHA1 Message	Date
Ervin Áron Tasnádi	a979ca22db ggml: adds CONV_2D op and direct GEMM Vulkan implementation (#14316)	6 months ago
compilade	90083283ec imatrix : use GGUF to store importance matrices (#9400)	6 months ago
Peter0x44	d4b91ea7b2 vulkan: Add logging for bf16 features to ggml_vk_print_gpu_info (#13274) (#14707)	6 months ago
0cc4m	83f5872404 Vulkan: Fix fprintf format-security warning (#14770)	6 months ago
rspOverflow	f0d4d176df Documentation: Update build.md's Vulkan section (#14736)	6 months ago
Georgi Gerganov	b17230917c sync : ggml	6 months ago
Georgi Gerganov	bf9087f59a metal : fuse add, mul + add tests (#14596)	6 months ago
Georgi Gerganov	9fb1042ce6 graph : fix graph reuse reset of params (#14760)	6 months ago
Georgi Gerganov	2adf8d83ac parallel : add option for different RNG seeds (#14757)	6 months ago
Oliver Simons	021cc28bef cuda : Fix Gemma3n not executed as CUDA_GRAPH on NVGPUs (#14741)	6 months ago
Georgi Gerganov	d498af3d5a graph : avoid huge warm-up graphs for MoE models (#14753)	6 months ago
Georgi Gerganov	eacdeb5bfc model : fix build after merge conflict (#14754)	6 months ago
lgai-exaone	e0cb5c5cb8 model : add EXAONE 4.0 support (#14630)	6 months ago
Aman Gupta	f9a31eea06 CUDA: set_rows + cpy.cu refactor (#14712)	6 months ago
Georgi Gerganov	8f974bc1e9 graph : refactor context to not pass gf explicitly (#14629)	6 months ago
Nexes the Elder	09651d09ff graph : Pass the graph placeholder message in debug mode (#14748)	6 months ago
Neo Zhang Jianyu	349ea79fce use max work group size for device to replace the magic number (#14732)	6 months ago
Piotr Wilkin (ilintar)	670e1360cd convert : fix Ernie4.5 MoE without shared experts (#14746)	6 months ago
Wroclaw	760b4484e3 nix : use optionalAttrs for env mkDerivation attrset argument (#14726)	6 months ago
Piotr Wilkin (ilintar)	cb887f1bc1 model: add Ernie 4.5 MoE support (#14658)	6 months ago
Georgi Gerganov	d6fb3f6b49 kv-cache : fix k-shift for multiple streams (#14742)	6 months ago
Georgi Gerganov	01612b7409 llama : reuse compute graphs (#14482)	6 months ago
Tarek Dakhran	086cf81e88 llama : fix parallel processing for lfm2 (#14705)	6 months ago
Georgi Gerganov	d9b691081c kv-cache : opt mask set input (#14600)	6 months ago
Georgi Gerganov	ad57d3edd2 batch : fix uninitialized has_cpl flag (#14733)	6 months ago
Sigbjørn Skjæret	1ba45d4982 ci : disable failing vulkan crossbuilds (#14723)	6 months ago
Sigbjørn Skjæret	19e5943d9e convert : make hf token optional (#14717)	6 months ago
Diner Burger	496957e1cb llama : fix parameter order for hybrid memory initialization (#14725)	6 months ago
Reese Levine	21c021745d ggml: Add initial WebGPU backend (#14521)	6 months ago
tempstudio	b0f0ecc3dc model : support output bias for qwen2 (#14711)	6 months ago

