cturan/llama.cpp

Author	SHA1 Message	Date
Pascal	683fa6ba4e fix: added a normalization step for MathJax-style \[\] and \(\) delimiters (#16599)	3 months ago
GittyBurstein	b22572e97d sycl : add ARANGE operator (#16362)	3 months ago
Chenguang Li	7a50cf388a CANN: format code using .clang-format (#15863)	3 months ago
takasurazeem	6f5d924637 common : Update the docs on -t --threads (#16236)	3 months ago
takuya kodama	adc9b60f19 ggml-cpu: replace putenv with setenv for const-correctness (#16573)	3 months ago
yael-works	ee50ee1ead SYCL: Add GGML_OP_MEAN operator support (#16009)	3 months ago
Aleksei Nikiforov	7adc79c032 gguf-py : add support for endian conversion of BF16 data (#16594)	3 months ago
safranowith	466c1911ab cpu : add FLOOR, CEIL, ROUND and TRUNC unary operators (#16083)	3 months ago
lhez	0cb7a0683b opencl: add q8_0 mm support (#16469)	3 months ago
lhez	d93f8439b0 opencl: fix FA for f32 (#16584)	3 months ago
Aleksander Grygier	f9fb33f263 Add server-driven parameter defaults and syncing (#16515)	3 months ago
Sam/Samuel	f4ce81c45e metal: optimise `GGML_OP_SUM` (#16559)	3 months ago
Georgi Gerganov	17304cbcc1 server : fix img token logs (#16595)	3 months ago
Xuan-Son Nguyen	3e3cb19f64 llama-quant: add support for mmproj (#16592)	3 months ago
Julius Tischbein	5acd455460 CUDA: Changing the CUDA scheduling strategy to spin (#16585)	3 months ago
Georgi Gerganov	554fd578a5 server : fix mtmd checkpoints (#16591)	3 months ago
Georgi Gerganov	fa882fd2b1 metal : avoid using Metal's gpuAddress property (#16576)	3 months ago
SavicStefan	ffa059034c vulkan: Add ACC_TYPE_VEC2 implementation (#16203)	3 months ago
Aman Gupta	120bf7046d CUDA + openCL: fix bug in accessing rms_norm->src while doing fusion (#16577)	3 months ago
Jeff Bolz	4258e0cfe7 vulkan: Support FA with K/V in F32 (#16543)	3 months ago
Jeff Bolz	7ea15bb64c vulkan: Improve build time for MSVC (#16545)	3 months ago
Johannes Gäßler	9c7185dd28 CUDA: enable FA for FP32 KV cache (#16546)	3 months ago
Aman Gupta	1ee9d0b415 CUDA: use fastdiv + ggml_cuda_mad for mmvf (#16557)	3 months ago
Aman Gupta	48e2fa9fb7 CUDA: add fp kernel for larger batch size MoE (#16512)	3 months ago
Anav Prasad	5b6913c47b cuda : remove legacy copy-op pointer indirection code (#16485)	3 months ago
Georgi Gerganov	bc07349a7f server : dynamic token limit for prompt cache (#16560)	3 months ago
Georgi Gerganov	e60f241eac metal : FA support F32 K and V and head size = 32 (#16531)	3 months ago
Georgi Gerganov	e38b7c6e9e graph : support cacheless embeddings with FA and iSWA (#16528)	3 months ago
lhez	5016b72862 opencl: fix build targeting CL 2 (#16554)	3 months ago
Johannes Gäßler	7049736b2d CUDA: fix numerical issues in tile FA kernel (#16540)	3 months ago


Commit History
