cturan/llama.cpp

Author	SHA1 Message	Date
Ruben Ortlam	8a3519b708 vulkan: fix mmq out of bounds reads (#17108)	2 months ago
Jeff Bolz	80a6cf6347 vulkan: fuse mul_mat_id + mul (#17095)	2 months ago
Georgi Gerganov	0750a59903 metal : retain src and dst buffers during async ops (#17101)	2 months ago
Xuan-Son Nguyen	aa3b7a90b4 arg: add --cache-list argument to list cached models (#17073)	2 months ago
chansikpark	333f2595a3 webui: fix keyboard shortcuts for new chat & edit chat title (#17007)	2 months ago
Jeff Bolz	53d7d21e61 vulkan: Use spec constants for conv2d s/d/p and kernel W/H (#16978)	2 months ago
Aidan	eeee367de5 server: fix correct time_ms calculation in prompt_progress (#17093)	2 months ago
Aman Gupta	64fe17fbb8 Revert "CUDA: add expert reduce kernel (#16857)" (#17100)	2 months ago
Aman Gupta	c1b187688d CUDA: skip fusion for repeating adds in bias (#17080)	2 months ago
SavicStefan	b8a5cfd11a vulkan: Increase BK to 32; use BK/4 for non-CM mul_mm.comp (#16636)	2 months ago
Aleksei Nikiforov	08416ebe7f ggml: disable vxe for cross-compilation by default (#16966)	2 months ago
Jeff Bolz	b4e335d8dc vulkan: fuse rms_norm + mul + rope (+ view + set_rows) (#16977)	2 months ago
Jeff Bolz	d6fe40fa00 vulkan: Fix test-thread-safety crashes (#17024)	2 months ago
Johannes Gäßler	e14e842e87 CUDA: fix MMQ stream-k fixup ne1 indices (#17089)	2 months ago
Reese Levine	647b960bd8 ggml webgpu: faster matrix multiplication/matrix-vector multiplication (#17031)	2 months ago
bssrdf	299f5d782c CUDA: properly handle nb00=nb02 case for cpy (#17081)	2 months ago
Acly	ac76d36201 vulkan : refactor buffer handling in vk_op_f32 (#16840)	2 months ago
Johannes Gäßler	6515610506 CUDA: fix should_use_mmvf for ne11 == 1 (#17085)	2 months ago
Georgi Gerganov	7956bb4d7f bench : cache the llama_context state at computed depth (#16944)	2 months ago
Sigbjørn Skjæret	9008027aa3 hparams : add n_embd_inp() to support extended embed (#16928)	2 months ago
Georgi Gerganov	16bcc1259d kv-cache : pad the cache size to 256 for performance (#17046)	2 months ago
Adrien Gallouët	9eb9a1331d Revert "ggml-cpu: detect correct cpu flags for arm64 (#16229) (#16239)" (#17084)	2 months ago
iron	7c23f3f0d4 ggml-cpu: detect correct cpu flags for arm64 (#16229) (#16239)	2 months ago
Georgi Gerganov	8c0d6bb455 server : print the samplers chain for each request (#17070)	2 months ago
Xuan-Son Nguyen	5c9a18e674 common: move download functions to download.(cpp|h) (#17059)	2 months ago
xctan	7f09a680af ggml-cpu : optimize RVV q2_k and q3_k kernels (#16887)	2 months ago
Johannes Gäßler	aa374175c3 CUDA: fix crash on uneven context without FA (#16988)	2 months ago
Georgi Gerganov	5b180c3d60 metal : initial Metal4 tensor API support (#16634)	2 months ago
Georgi Gerganov	b7f9010d24 server : disable checkpoints with mtmd (#17045)	2 months ago
Xuan-Son Nguyen	4882f0ff78 clip: implement minicpm-v sinusoidal embd using GGML (#17036)	2 months ago

Commit History