Gabe Goodhart
|
5886f4f545
examples(gguf): GGUF example outputs (#17025)
|
2 months ago |
Xuan-Son Nguyen
|
92bb84f775
mtmd: allow QwenVL to process larger image by default (#17020)
|
2 months ago |
Georgi Gerganov
|
13b339bcd9
server : do not default to multiple slots with speculative decoding (#17017)
|
2 months ago |
Xuan-Son Nguyen
|
2f0c2db43e
mtmd: improve struct initialization (#16981)
|
2 months ago |
손희준
|
fd2f84f468
docs: Clarify the endpoint that webui uses (#17001)
|
2 months ago |
Li Pengzhan
|
9f052478c2
model : add openPangu-Embedded (#16941)
|
2 months ago |
Reese Levine
|
03ea04175d
ggml webgpu: minor set rows optimization (#16810)
|
2 months ago |
Georgi Gerganov
|
cdabeb2c27
sync : ggml
|
2 months ago |
Georgi Gerganov
|
852ce5180a
ggml : fix conv2d_dw SVE path (ggml/1380)
|
2 months ago |
mnehete32
|
9aa63374f2
CUDA: update ops.md (#17005)
|
2 months ago |
lhez
|
5e90233bdb
opencl: update doc (#17011)
|
2 months ago |
nullname
|
a5c07dcd7b
refactor: replace sprintf with snprintf for safer string handling in dump functions (#16913)
|
2 months ago |
Jeff Bolz
|
ad51c0a720
vulkan: remove the need for the dryrun (#16826)
|
2 months ago |
Georgi Gerganov
|
66d8eccd42
server : do context shift only while generating (#17000)
|
2 months ago |
Georgi Gerganov
|
afd353246d
readme : update hot topics (#17002)
|
2 months ago |
Acly
|
cc98f8d349
ggml-cpu : bicubic interpolation (#16891)
|
2 months ago |
Sigbjørn Skjæret
|
d945834366
ci : apply model label to models (#16994)
|
2 months ago |
Sigbjørn Skjæret
|
b164259bba
chore : fix models indent after refactor (#16992)
|
2 months ago |
Noah
|
1f5accb8d0
Fix garbled output with REPACK at high thread counts (#16956)
|
2 months ago |
Aman Gupta
|
2759ccdb4a
CUDA: avoid mul + bias fusion when doing fusion (#16935)
|
2 months ago |
lhez
|
c5023daf60
opencl: support imrope (#16914)
|
2 months ago |
Aleksander Grygier
|
e7da30b584
fix: Viewing multiple PDF attachments (#16974)
|
2 months ago |
Daniel Bevenius
|
ed8aa63320
model-conversion : pass config to from_pretrained (#16963)
|
2 months ago |
Georgi Gerganov
|
48bd26501b
server : add props.model_alias (#16943)
|
2 months ago |
theo77186
|
622cd010ff
ggml: CUDA: add head size 72 for flash-attn (#16962)
|
2 months ago |
Xuan-Son Nguyen
|
070ff4d535
mtmd: add --image-min/max-tokens (#16921)
|
2 months ago |
Xuan-Son Nguyen
|
bf7b0c9725
mtmd: pad mask for qwen2.5vl (#16954)
|
2 months ago |
Jinyang He
|
fcfce040e8
ggml : LoongArch fixes (#16958)
|
2 months ago |
Olivier Chafik
|
ee3a5a10ad
sync: minja (glm 4.6 & minmax m2 templates) (#16949)
|
2 months ago |
shani-f
|
7e994168b1
SYCL: optimized repeat_back kernel (3× fewer asm instructions, 2× faster)Feature/sycl repeat back opt (#16869)
|
2 months ago |