Christian Kastner
|
374101fd74
cmake : enable building llama.cpp using system libggml (#12321)
|
10 months ago |
Akarshan Biswas
|
b3c9a65673
SYCL: set extras only on GGML_TYPE_Q4_0 (#12366)
|
10 months ago |
Sigbjørn Skjæret
|
8ba95dca20
llama : fix OLMo-2-0325-32B-Instruct K-norm size (#12400)
|
10 months ago |
Georgi Gerganov
|
dc079cfdff
context : fix init of n_outputs (#12397)
|
10 months ago |
Daniel Bevenius
|
7b61bcc87c
ci : add --symlinks to xcframework zip command (#12409)
|
10 months ago |
marcoStocchi
|
f4c3dd5daa
llama-tts : add '-o' option (#12398)
|
10 months ago |
aubreyli
|
3d35d87b41
SYCL: Delete redundant plus sign and space (#12391)
|
10 months ago |
fairydreaming
|
b19bd064c0
SYCL : support non-contiguous tensors in binary ops (add, sub, etc) (#12399)
|
10 months ago |
Chenguang Li
|
92a391327e
[CANN]MUL_MAT optimization (#12382)
|
10 months ago |
Eric Curtin
|
9f2250ba72
Add CLI arg to llama-run to adjust the number of threads used (#12370)
|
10 months ago |
Sigbjørn Skjæret
|
774973b8f3
main : add -sysf / --system-prompt-file (#12249) (#12250)
|
10 months ago |
fairydreaming
|
8fcb563613
Load all MoE experts during warmup (#11571)
|
10 months ago |
Victor
|
add2a3aa5a
server: fix "--grammar-file" parameter (#12285)
|
10 months ago |
Georgi Gerganov
|
c522ce4143
graph : simplify attn input build for unified KV cache (#12381)
|
10 months ago |
Georgi Gerganov
|
081bee8c64
hparams : add SWA rope parameters (#12374)
|
10 months ago |
Georgi Gerganov
|
84d5475541
llama : fix Gemma3 SWA KV cache shift (#12373)
|
10 months ago |
Xuan-Son Nguyen
|
be7c303410
arg : no n_predict = -2 for examples except for main and infill (#12364)
|
10 months ago |
Georgi Gerganov
|
e0dbec0bc6
llama : refactor llama_context, llama_kv_cache, llm_build_context (#12181)
|
10 months ago |
Ishaan Gandhi
|
2048b5913d
server : fix crash when using verbose output with input tokens that are not in printable range (#12178) (#12338)
|
10 months ago |
Oscar Barenys
|
f08f4b3187
Update build.yml for Windows Vulkan builder to use Vulkan 1.4.304 SDK for VK_NV_cooperative_matrix2 support (#12301)
|
10 months ago |
Daniel Bevenius
|
80a02aa858
llama.swiftui : fix xcframework dir in README [no ci] (#12353)
|
10 months ago |
Alberto Cabrera Pérez
|
363f8c5d67
sycl : variable sg_size support for mmvq kernels (#12336)
|
10 months ago |
uvos
|
34c961b181
CUDA/HIP: Fix fattn-vec-* when device warp size is not 32 (#12315)
|
10 months ago |
Xuan-Son Nguyen
|
7841fc723e
llama : Add Gemma 3 support (+ experimental vision capability) (#12343)
|
10 months ago |
Jeff Bolz
|
bf69cfe62f
vulkan: fix bug in coopmat1 mul_mat_id (#12316)
|
10 months ago |
uvos
|
10f2e81809
CUDA/HIP: refractor mmqv to unify the calculation of nwarps and rows per block between host and device code. (#12177)
|
10 months ago |
jklincn
|
ba7654380a
ggml-backend : fix backend search path (#12330)
|
10 months ago |
BB-fat
|
6ab2e4765a
metal : Cache the Metal library at the device context level (#12265)
|
10 months ago |
Xuan-Son Nguyen
|
96e1280839
clip : bring back GPU support (#12322)
|
10 months ago |
Eve
|
2c9f833d17
mat vec double buffer (#12188)
|
10 months ago |