cturan/llama.cpp

Author	SHA1 Message	Date
matteo	8cf6b42d46 server : send partial stop string when <EOG> is reached (#15007)	2 months ago
Matthew Michel	9de9672adb sycl: use async memory allocation to fix crashes during graph recording (#16644)	2 months ago
Max Krasnyansky	63d2fc46e1 Add experimental ggml-hexagon backend for the Hexagon NPU (#16547)	2 months ago
Diego Devesa	a2e0088d92 Revert "ggml : Leverage the existing GGML_F32_VEC helpers to vectorize ggml_v…" (#16723)	2 months ago
Pascal	9b9201f65a webui: introduce OpenAI-compatible model selector in JSON payload (#16562)	2 months ago
sirus20x6	19a5a3edfd ggml : Leverage the existing GGML_F32_VEC helpers to vectorize ggml_vec_set_f32 for faster fills (#16522)	2 months ago
Acly	d8eaa26e4d tests : fix test-thread-safety when compiling with multiple backends (#16699)	2 months ago
Aman Gupta	9285325ce0 CUDA: fix bug in topk-moe softmax (#16711)	2 months ago
Aman Gupta	03792ad936 CUDA: topk-moe: add optional parameter for gpt-oss (#16649)	2 months ago
Johannes Gäßler	51d1a8c997 CUDA: better error for FA kernel with 0 occupancy (#16643)	2 months ago
Aman Gupta	4926419c4d ggml: add ggml_can_fuse_subgraph (#16662)	2 months ago
lhez	6ea37f5739 opencl: fix warnings and clean up profiling (#16688)	2 months ago
Jeff Bolz	fb349848f3 vulkan: Handle FA with all -inf mask values (#16447)	2 months ago
YehuditE	6de8ed7519 sycl : add PAD_REFLECT_D1 operator support (#16145)	2 months ago
Sigbjørn Skjæret	84bf3c6778 model : add BailingMoeV2 support (#16063)	2 months ago
Aleksander Grygier	c9c1972e2c Handle legacy 'context' attachments (#16687)	2 months ago
Diego Devesa	b617cfd289 ggml-alloc : fix leak when reusing a tensor with a larger size (#16679)	2 months ago
Aleksander Grygier	79068501fa Prevent premature submission on IME input (#16673)	2 months ago
Aleksander Grygier	0e4a0cf2fa Import/Export UX improvements (#16619)	2 months ago
Aleksander Grygier	13f2cfad41 Enable per-conversation loading states to allow having parallel conversations (#16327)	2 months ago
takuya kodama	06332e2867 llama-batch: fix build fails with `-Werror=missing-braces` (#16614)	2 months ago
Ron Evans	72d53e6c4d readme: update bindings (#16651)	2 months ago
safranowith	2330de7b84 SYCL: Add support for FLOOR,CEIL,ROUND and TRUNC unary operators (#16613)	2 months ago
takuya kodama	7062dd8460 llama-context: only warn on pooling_type when user specified (#16674)	2 months ago
Giuseppe Scrivano	0398752dd4 model : add Granite Hybrid types (#16635)	2 months ago
Aaron Teo	4f73d0a951 ci : fix binaries release failure for s390x (binaries may not work yet) (#16664)	2 months ago
Sigbjørn Skjæret	cec5edbcae ci : avoid manual updates of docs/ops.md (#16663)	3 months ago
Aaron Teo	fcb235b466 ci: include s390x release binaries (#16648)	3 months ago
Aman Gupta	55754bebd5 CODEOWNERS: update for ggml-cuda/mmf (#16660)	3 months ago
Johannes Gäßler	ee09828cb0 HIP: fix GPU_TARGETS (#16642)	3 months ago

