cturan/llama.cpp

Author	SHA1 Message	Date
Georgi Gerganov	727368c60f llama : use LLAMA_TOKEN_NULL (#11062)	1 year ago
Georgi Gerganov	5047dd3546 llama : use _impl suffix instead of _internal (#11060)	1 year ago
Johannes Gäßler	46e3556e01 CUDA: add BF16 support (#11093)	1 year ago
0cc4m	b56f079e28 Vulkan: Add device-specific blacklist for coopmat for the AMD proprietary driver (#11074)	1 year ago
fairydreaming	9394bbd484 llama : Add support for DeepSeek V3 (#11049)	1 year ago
matt23654	f922a9c542 [GGML][RPC] Support for models with non-512-aligned tensors over RPC. (#11047)	1 year ago
DAN™	46be942214 llama : add support for the cohere2 model architecture (#10900)	1 year ago
Georgi Gerganov	78c6785175 sync : ggml	1 year ago
Georgi Gerganov	5e3b08d606 ggml : do not install metal source when embed library (ggml/1054)	1 year ago
Daniel Bevenius	db68c93b57 ggml : improve inputs log sched_print_assignments (ggml/1053)	1 year ago
Gilad S.	c31fc8b966 fix: Vulkan shader gen binary path (#11037)	1 year ago
Molly Sophia	4b0c638b9a common : disable KV cache shifting automatically for unsupported models (#11053)	1 year ago
Georgi Gerganov	e7da954ecc metal : avoid uint (#11019)	1 year ago
Georgi Gerganov	f66f582927 llama : refactor `src/llama.cpp` (#10902)	1 year ago
Pierrick Hymbert	2f0ee84b9b server: bench: minor fixes (#10765)	1 year ago
Xuan Son Nguyen	0da5d86026 server : allow using LoRA adapters per-request (#10994)	1 year ago
Benson Wong	a45433ba20 readme : add llama-swap to infrastructure section (#11032)	1 year ago
Srihari-mcw	0827b2c1da ggml : fixes for AVXVNNI instruction set with MSVC and Clang (#11027)	1 year ago
Xuan Son Nguyen	45095a61bf server : clean up built-in template detection (#11026)	1 year ago
Xuan Son Nguyen	5896c65232 server : add OAI compat for /v1/completions (#10974)	1 year ago
ymcki	bc7b1f8632 convert : fix Llama-3_1-Nemotron-51B rope settings (#11008)	1 year ago
Peter	6e1531aca5 common, examples, ggml : fix MSYS2 GCC compiler errors and warnings when building with LLAMA_CURL=ON and GGML_OPENCL=ON (#11013)	1 year ago
Jeff Bolz	716bd6dec3 vulkan: optimize mul_mat for small values of N (#10991)	1 year ago
ag2s20150909	c250ecb315 android : fix llama_batch free (#11014)	1 year ago
Jeff Bolz	a813badbbd vulkan: im2col and matmul optimizations for stable diffusion (#10942)	1 year ago
Jeff Bolz	fdd2188912 vulkan: Use push constant offset to handle misaligned descriptors (#10987)	1 year ago
Isaac McFadyen	f865ea149d server: added more docs for response_fields field (#10995)	1 year ago
Alexey Parfenov	16cdce7b68 server : fix token duplication when streaming with stop strings (#10997)	1 year ago
Eve	d79d8f39b4 vulkan: multi-row k quants (#10846)	1 year ago
Peter	d283d02bf2 examples, ggml : fix GCC compiler warnings (#10983)	1 year ago

