| Author | Commit | Message | Date |
|---|---|---|---|
| Ruben Ortlam | 9e6649ecf2 | vulkan: fix mul_mat_vec_iq1_s formatting (#18026) | 1 month ago |
| Xuan-Son Nguyen | 0759b09c90 | graph: add f_attn_temp_offset (#18025) | 1 month ago |
| Georgi Gerganov | 254098a279 | common : refactor common_sampler + grammar logic changes (#17937) | 1 month ago |
| Jeff Bolz | 3238b1400c | vulkan: Fix data race/hang in scalar/cm1 flash attention (#17887) | 1 month ago |
| lovedheart | 4722671641 | vulkan: improve mul_mat_vec_iq1_s speed (#17874) | 1 month ago |
| Eve | d15d177f43 | vulkan: faster q6_k matmul (#17813) | 1 month ago |
| Georgi Gerganov | 77ad8542bd | model-conversion : cast logits to float32 (#18009) | 1 month ago |
| Georgi Gerganov | 609a2d0268 | models : fix YaRN regression + consolidate logic (#18006) | 1 month ago |
| Georgi Gerganov | a63cbafbbc | ggml : arm repack fix build | 1 month ago |
| Georgi Gerganov | 0e59224990 | sync : ggml | 1 month ago |
| Georgi Gerganov | 71fdcf0616 | ggml : arm repack fix build (whisper/0) | 1 month ago |
| Congcong Cai | 615655aafe | cmake : set `CMAKE_RUNTIME_OUTPUT_DIRECTORY` for non standalone build (ggml/1394) | 1 month ago |
| Xuan-Son Nguyen | c00ff929dc | scripts: add script to compare logprobs of llama.cpp against other frameworks (#17947) | 1 month ago |
| Sergey Fedorov | 4ed2bae50d | server-models.cpp: add missing <filesystem> (#18000) | 1 month ago |
| Jeff Bolz | 5266379bca | llama_context: synchronize before reallocating output buffer (#17974) | 1 month ago |
| Xuan-Son Nguyen | 4d5ae24c0a | arg: fix common_params_parse not accepting negated arg (#17991) | 1 month ago |
| Gustavo Rocha Dias | 66ba51252e | cmake: correct scope - link ws2_32 for MinGW/w64devkit builds in cpp-httplib (#17972) | 1 month ago |
| Jeff Bolz | 36255a2268 | vulkan: support get_rows for i32 (#17941) | 1 month ago |
| Jeff Bolz | 3229a23fa6 | vulkan: support GGML_OP_DIAG (#17893) | 1 month ago |
| Jeff Bolz | 303f8615e9 | vulkan: Multi-pass softmax for large number of cols (#17892) | 1 month ago |
| Georgi Gerganov | 3c6391e748 | speculative-simple : free batch on exit (#17985) | 1 month ago |
| Sigbjørn Skjæret | 8e4d678528 | common : skip model validation when --completion-bash is requested (#17975) | 1 month ago |
| Jeff Bolz | 07a10c1090 | vulkan: Allow non-pow2 n_experts in topk_moe (#17872) | 1 month ago |
| Sigbjørn Skjæret | 2bc94e7928 | add llama-completion to completion-bash executables (#17976) | 1 month ago |
| Daniel Bevenius | fd1085ffb7 | model-conversion : use CONVERTED_MODEL value for converted model [no ci] (#17984) | 1 month ago |
| Xuan-Son Nguyen | 380b4c984e | common: support negated args (#17919) | 1 month ago |
| Xuan-Son Nguyen | e39a2ce66d | clip: move model cgraphs into their own files (#17965) | 1 month ago |
| jiahao su | a8c7f33d79 | ci : change the cann version and the container pull method (#17953) | 1 month ago |
| Sigbjørn Skjæret | b7f5f46e03 | docker : include legacy llama-completion binary (#17964) | 1 month ago |
| Johannes Gäßler | 482211438d | CUDA: fix overflow in MMA kernel without stream-k (#17939) | 1 month ago |