Georgi Gerganov
|
39173bcacb
context : reserve new scheduler when graph topology changes (#18547)
|
2 weeks ago |
Johannes Gäßler
|
5c662d21a3
CUDA: fix alignment on register spill for FA (#18815)
|
2 weeks ago |
shalinib-ibm
|
8cc0ba957b
ggml-cpu: optimize ggml_vec_dot_bf16 for Power9 (#18837)
|
2 weeks ago |
Xuan-Son Nguyen
|
a7e6ddb8bd
lora: make sure model keep track of associated adapters (#18490)
|
2 weeks ago |
Sigbjørn Skjæret
|
2a13180100
model-loader : support bool array sliding window pattern (#18850)
|
2 weeks ago |
Adrien Gallouët
|
ec997b4f2b
tests : download models only when running ctest (#18843)
|
2 weeks ago |
Max Krasnyansky
|
cff777f226
hexagon: support for OP_CPY, host buffers now optional, hvx-utils refactoring and optimizations (#18822)
|
2 weeks ago |
Oliver Simons
|
36f0132464
CUDA: Factor out and re-use `block_reduce` function (#18785)
|
2 weeks ago |
Piotr Wilkin (ilintar)
|
d98b548120
Restore clip's cb() to its rightful glory - extract common debugging elements in llama (#17914)
|
2 weeks ago |
Junwon Hwang
|
8fb7175576
model : clean up and fix EXAONE-MoE configuration (#18840)
|
2 weeks ago |
Adrien Gallouët
|
516a4ca9b5
refactor : remove libcurl, use OpenSSL when available (#18828)
|
2 weeks ago |
Jeff Bolz
|
3e4bb29666
vulkan: Check maxStorageBufferRange in supports_op (#18709)
|
2 weeks ago |
Aman Gupta
|
47f9612492
llama-model: fix unfortunate typo (#18832)
|
2 weeks ago |
Daniel Bevenius
|
01cbdfd7eb
CUDA : fix typo in clang pragma comment [no ci] (#18830)
|
2 weeks ago |
Ruben Ortlam
|
635ef78ec5
vulkan: work around Intel fp16 bug in mmq (#18814)
|
2 weeks ago |
Perry Naseck
|
7d587e5544
ggml-metal: do not copy headers for embedded, use current binary dir for embedded (#18705)
|
2 weeks ago |
Daniel Benjaminsson
|
d34aa07193
mmap: add Haiku support by skipping RLIMIT_MEMLOCK check (#18819)
|
2 weeks ago |
Adrien Gallouët
|
f709c7a33f
ci, tests : use cmake to download models and remove libcurl dependency (#18791)
|
2 weeks ago |
ddh0
|
6e36299b47
llama : print_info alignment fix (#18708)
|
2 weeks ago |
Junwon Hwang
|
60591f01d4
model : add EXAONE MoE (#18543)
|
2 weeks ago |
Georgi Gerganov
|
e4832e3ae4
vocab : fix attribute overrides for harmony (#18806)
|
2 weeks ago |
Ruben Ortlam
|
960e5e3b46
llama-mmap: fix direct-io loading fallback EOF exception (#18801)
|
2 weeks ago |
Daniel Bevenius
|
20ca2e12c4
model-conversion : remove -c 0 from model card template [no ci] (#18807)
|
2 weeks ago |
yulo
|
ea4a321f2a
HIP: add fattn-mma-f16 for RDNA4 (#18481)
|
2 weeks ago |
Johannes Gäßler
|
c1e79e610f
doc: ban AI-generated PR descriptions [no ci] (#18765)
|
2 weeks ago |
Xuan-Son Nguyen
|
e047f9ee9d
mtmd: fix use_non_causal being reported incorrectly (#18793)
|
2 weeks ago |
Georgi Gerganov
|
0a57271ab6
CUDA : fix unused argument when USE_CUDA_GRAPH=OFF (#18800)
|
2 weeks ago |
Gabe Goodhart
|
076b0faf7d
graph : clean up t5 input builders (#18795)
|
2 weeks ago |
Ruben Ortlam
|
db79dc06b1
llama-bench: add direct_io parameter (#18778)
|
2 weeks ago |
Adrien Gallouët
|
537d4240d4
ci : remove libcurl in releases (#18775)
|
2 weeks ago |