Henry Linjamäki | f79243992c | opencl : fix `ulong` kernel args were set from `int` variables (#12174) | 10 months ago
simon886212 | ed4ce0dda2 | opencl : fix profile-related errors (#12095) | 10 months ago
Rémy O | 07d1572347 | ggml-cpu: Faster IQ1 mul_mat_vec on AVX2 using BMI2 instructions (#12154) | 10 months ago
Akarshan Biswas | 5e43f104cc | SYCL: Disable f16 Unary OPs as not supported by the kernels (#12201) | 10 months ago
Plamen Minev | 16e4b22c5e | ggml : fix GGMLMetalClass ODR (#12200) | 10 months ago
Daniel Bevenius | 074c4fd39d | ci : add fetch-depth to xcframework upload (#12195) | 10 months ago
Olivier Chafik | 669912d9a5 | `tool-call`: fix Qwen 2.5 Coder support, add micro benchmarks, support trigger patterns for lazy grammars (#12034) | 10 months ago
Daniel Bevenius | fa31c438e0 | ci : fix xcframework artifact tag (#12191) | 10 months ago
Daniel Bevenius | 3ccbfe5a71 | ci : remove xframework upload (#12190) | 10 months ago
Clauszy | 06a92a193a | server : fix cache reuse logic (#12161) | 10 months ago
Daniel Bevenius | a057897ad4 | llama : add xcframework build script (#11996) | 10 months ago
mgroeber9110 | 5bbe6a9fe9 | ggml : portability fixes for VS 2017 (#12150) | 10 months ago
Georgi Gerganov | 20a9b8f5e1 | readme : fix roadmap link (#12185) | 10 months ago
Sigbjørn Skjæret | 56d7a9f812 | main: allow preloading conversation with -p and add -st / --single-turn (#12145) | 10 months ago
Olivier Chafik | 1a24c4621f | `server`: fix deadly typo in response_format.json_schema.schema handling (#12168) | 10 months ago
David Huang | becade5de7 | HIP: implement FlashAttention via rocWMMA for CDNA and RDNA3+ (#12032) | 10 months ago
Georgi Gerganov | dfd6b2c0be | sync : ggml | 10 months ago
cmdr2 | b64d7cc272 | cuda: unary ops as float + de-duplicate (ggml/1130) | 10 months ago
Georgi Gerganov | 3d1cf3cf33 | sync : ggml | 10 months ago
cmdr2 | 0cbee131ad | cuda/vulkan: specify fp32-only support for some operations in supports_op (ggml/1129) | 10 months ago
Georgi Gerganov | 8371d44595 | sync : ggml | 10 months ago
cmdr2 | 87abb7e903 | cuda/cpu: Increase support for fp16 unary operations (ggml/1125) | 10 months ago
Diego Devesa | 6d4c23b81b | whisper : support GGML_BACKEND_DL (whisper/2843) | 10 months ago
midnight | 6512a90037 | cmake : fix compile assumptions for power9/etc (whisper/2777) | 11 months ago
petterreinholdtsen | 4512055792 | Told cmake to install ggml-cpp.h as a public header file. (ggml/1126) | 10 months ago
cmdr2 | f54a4ba11e | Support pure float16 add/sub/mul/div operations in the CUDA (and CPU) backend (ggml/1121) | 10 months ago
Georgi Gerganov | aede2074f6 | scripts : sync-ggml-am.sh fix | 10 months ago
Daniel Bevenius | 2679c3b55d | ci : set GITHUB_ACTION env var for server tests (#12162) | 10 months ago
dm4 | c43af9276b | tts: add speaker file support (#12048) | 10 months ago
Diego Devesa | d5c63cd7f9 | test-backend-ops : add option -p to filter by op params (#12155) | 10 months ago