Percy Piper | c508256db2 | rpc : Fix build on OpenBSD (#13541) | 7 months ago
Xuan-Son Nguyen | 40aaa8a403 | mtmd : add support for Qwen2-Audio and SeaLLM-Audio (#13760) | 7 months ago
ddpasa | a08c1d2845 | docs : add Moondream2 pre-quantized link (#13745) | 7 months ago
Olivier Chafik | d785f9c1fd | server: fix/test add_generation_prompt (#13770) | 7 months ago
Piotr Jasiukajtis | 4032ca4066 | llama : add support for Qwen3 MoE tied word embeddings (#13768) | 7 months ago
Akarshan Biswas | 515fdbf7ed | SYCL: revert "sycl: simplify bin_bcast_kernel (#13383)" (#13752) | 7 months ago
Olivier Chafik | f5cd27b71d | `server`: streaming of tool calls and thoughts when `--jinja` is on (#12379) | 7 months ago
Diego Devesa | a2d02d5793 | releases : bundle llvm omp library in windows release (#13763) | 7 months ago
Diego Devesa | 17fc817b58 | releases : enable openmp in windows cpu backend build (#13756) | 7 months ago
Diego Devesa | 2bd1b30f69 | ggml-cpu : set openmp wait time if not set (#13758) | 7 months ago
0cc4m | 259469c4b5 | Move GLM4 f32 attention fix to the correct function (#13750) | 7 months ago
Xuan-Son Nguyen | 4c32832c59 | ggml : add ggml_gelu_erf() CUDA kernel (#13719) | 7 months ago
Sigbjørn Skjæret | c3a2624339 | vocab : fix ugm tokenizer precision (#13743) | 7 months ago
Johannes Gäßler | ffd0eae60b | CUDA: fix race condition in FA vector kernels (#13742) | 7 months ago
Diego Devesa | b775345d78 | ci : enable winget package updates (#13734) | 7 months ago
Diego Devesa | a70a8a69c2 | ci : add winget package updater (#13732) | 7 months ago
Georgi Gerganov | d13d0f6135 | hparams : initialize arrays (#13728) | 7 months ago
Xuan-Son Nguyen | 8a2afb7520 | llama : allow custom list of swa_layers (#13726) | 7 months ago
Xuan-Son Nguyen | 9ecf3e66a3 | server : support audio input (#13714) | 7 months ago
Chenguang Li | faaaff5f94 | CANN: Support MUL_MAT_ID for q8_0 and q4_0 (#13705) | 7 months ago
Xuan-Son Nguyen | e16c4731c7 | ggml : fix the order of ggml_unary_op (#13718) | 7 months ago
Jeff Bolz | 1dcd01960c | vulkan: support CPY from any type to itself (#13695) | 7 months ago
Jeff Bolz | c10ed6cbcc | vulkan: Disable coopmat/coopmat2/bfloat extensions if glslc doesn't support it (#13696) | 7 months ago
Judd | a127ff1780 | use LOG_WARN to replace `std::cerr` (#13657) | 7 months ago
Diego Devesa | 3079e9ac8e | release : fix windows hip release (#13707) | 7 months ago
Georgi Gerganov | 8a1d206f1d | tts : fix n_ubatch + make WavTokenizer cache-less (#13713) | 8 months ago
Xuan-Son Nguyen | 797990c4bc | mtmd : add ultravox audio input (#13623) | 8 months ago
Aaron Teo | ab86335760 | common: Include torch package for s390x (#13699) | 8 months ago
Georgi Gerganov | cc74d5be99 | server : pad small embedding batches (#13692) | 8 months ago
Sigbjørn Skjæret | 5be24af73d | gguf-py : correct charsmap parameter typing (#13701) | 8 months ago