cturan/llama.cpp

Author	SHA1 Message	Date
Neuman Vong	4b7b38bef5 vulkan: Set limit for task concurrency (#5427)	1 year ago
Daniel Bevenius	e00d2a62dd llava : add requirements.txt and update README.md (#5428)	1 year ago
Riley Stewart	7c777fcd5d server : fix prompt caching for repeated prompts (#5420)	1 year ago
Paul Tsochantaris	e5ca3937c6 llama : do not cap thread count when MoE on CPU (#5419)	1 year ago
Marko Tasic	e4124c2477 readme : add JavaScript/Wasm repo (#5415)	1 year ago
Michael Podvitskiy	b2f87cb64d ggml : fix `error C2078: too many initializers` for MSVC ARM64 (#5404)	1 year ago
0cc4m	44fbe34360 Fix Vulkan crash on APUs with very little device memory (#5424)	1 year ago
Johannes Gäßler	8e6a9d2de0 CUDA: more warps for mmvq on NVIDIA (#5394)	1 year ago
slaren	41f308f58e llama : do not print "offloading layers" message in CPU-only builds (#5416)	1 year ago
Abhilash Majumder	6e99f2a04f Fix f16_sycl cpy call from Arc (#5411)	1 year ago
Daniel Bevenius	ff4ff05c5f llava : add missing .py, and fix paths in README.md (#5414)	1 year ago
Johannes Gäßler	b7b74cef36 fix trailing whitespace (#5407)	1 year ago
runfuture	4aa43fab56 llama : fix MiniCPM (#5392)	1 year ago
Daniel Bevenius	a6e514a85f llava: fix typo/formatting in README.md (#5405)	1 year ago
Johannes Gäßler	26d4efd11e sampling: fix top_k <= 0 (#5388)	1 year ago
Georgi Gerganov	8504d2d0da tests : .gitignore obj files	1 year ago
Michael Podvitskiy	c4fbb6717c CMAKE_OSX_ARCHITECTURES for MacOS cross compilation (#5393)	1 year ago
Ebey Abraham	8c933b70c2 fix typo in readme (#5399)	1 year ago
Kamil Tomšík	b906596bb7 Add Ava in the list of llama.cpp UIs (#4362)	1 year ago
Johannes Gäßler	aa7ab99be2 CUDA: fixed mmvq kernel for bs 2,3,4 and -sm row (#5386)	1 year ago
Neo Zhang Jianyu	10afa6f1d1 [SYCL] update install make by w64devkit (#5297)	1 year ago
Xiao-Yong Jin	0ef46da632 llava-cli : always tokenize special tokens (#5382)	1 year ago
0cc4m	ee1628bdfe Basic Vulkan Multi-GPU implementation (#5321)	1 year ago
Eve	ed0bf32290 readme : modernize (#5379)	1 year ago
Ben Williams	9a697d842b readme : update ui list (#5354)	1 year ago
runfuture	316c7faf77 llama : add MiniCPM support (#5346)	1 year ago
Justin Parker	f3e2b4fa3f server : update `/props` with "total_slots" value (#5373)	1 year ago
Sang-Kil Park	f68664ac24 convert : fix TypeError on GPT-2 vocab.json (#5288)	1 year ago
Alexey Parfenov	213d1439fa server : remove model.json endpoint (#5371)	1 year ago
Johannes Gäßler	17c97fb062 CUDA: mul_mat_vec_q max. batch size 8 -> 4 (#5370)	1 year ago
