Georgi Gerganov
|
8a4280ce43
kv-cache : remove LLAMA_SET_ROWS checks (#15505)
|
4 months ago |
Aleksei Nikiforov
|
64387f6e95
gguf-py: byteswapping improvements (#12851)
|
4 months ago |
Joshua Cogliati
|
d35a1e8c41
cli : change log to warning to explain reason for stopping (#15604)
|
4 months ago |
Daniel Bevenius
|
46d9caa27a
model-conversion : add mmproj conversion target (#15628)
|
4 months ago |
matiaslin
|
5a0e3ef6f0
cuda: Add cublasLt_static linking when GGML_STATIC is enabled (#15622)
|
4 months ago |
Johannes Gäßler
|
fbef0fad7a
server: higher timeout for tests (#15621)
|
4 months ago |
Georgi Gerganov
|
da54f9f1a2
presets : add qwen3-30B-a3b FIM (#15616)
|
4 months ago |
uvos
|
47373271f9
HIP: Enable support for ggml_backend_cuda_register_host_buffer (#15615)
|
4 months ago |
Georgi Gerganov
|
1bded5a3b3
kv-cache : better estimate of n_kv for multi-sequence batches (#15610)
|
4 months ago |
Chenguang Li
|
1e7489745a
CANN: refactor mask handling and improve performance in FA (#15561)
|
4 months ago |
xctan
|
1cf123a343
ggml-cpu : add basic RVV support for vector f32 ops (#15057)
|
4 months ago |
Daniel Bevenius
|
fcca2182a1
common : add -m to bash completion for --model [no ci] (#15591)
|
4 months ago |
rmatif
|
86076f92de
OpenCL: add fused group_norm/norm, mul, add (#15314)
|
4 months ago |
Diego Devesa
|
bcbddcd54f
tests : fix test-opt with GGML_BACKEND_DL (#15599)
|
4 months ago |
Akarshan Biswas
|
8b69686136
SYCL: fix rms_norm_mul_add for tensor dim not a multiple of sg_size (#15592)
|
4 months ago |
fidoriel
|
8ce3ff1d91
mtmd : fix mtmd ios build (#15579)
|
4 months ago |
Eve
|
44b1efa41a
tests: add performance test for mul mat id (#15543)
|
4 months ago |
shalinib-ibm
|
a6a58d6478
llamafile: PowerPC Sgemm Optimization (#15558)
|
4 months ago |
Georgi Gerganov
|
0373486dbc
graph : fix assert in memory-less build_attn (#15590)
|
4 months ago |
Daniel Bevenius
|
62cef26ac5
model-conversion : add qat-q4 quantization targets (#15588)
|
4 months ago |
Johannes Gäßler
|
8f5afa94c4
CUDA: return -1 for nonexistent compiled arch (#15587)
|
4 months ago |
Georgi Gerganov
|
b3964c1e89
metal : optimize FA vec for large sequences and BS <= 8 (#15566)
|
4 months ago |
Xuan-Son Nguyen
|
79a546220c
mtmd : support Kimi VL model (#15458)
|
4 months ago |
Georgi Gerganov
|
85cc1ae998
context : print graph stats for memory-less contexts (#15586)
|
4 months ago |
Georgi Gerganov
|
1d8d83deaa
metal : improve `MUL_MAT_ID` (#15541)
|
4 months ago |
tc-mb
|
c4e9239064
model : support MiniCPM-V 4.5 (#15575)
|
4 months ago |
Sigbjørn Skjæret
|
39842a7f73
gguf-py : remove erroneous FFN_GATE entry (#15583)
|
4 months ago |
Sigbjørn Skjæret
|
0fd90db585
metal : remove contiguous assertion for src0 in IM2COL (#15577)
|
4 months ago |
Yoshi_likes_e4
|
4c37636b3e
Add a warning for special devices (#15563)
|
4 months ago |
Jeff Bolz
|
34bdbbd7c2
vulkan: Remove splitting for mul_mat_id (#15568)
|
4 months ago |