David Huang
|
3ffbbd5ce1
HIP: rocWMMA documentation and enabling in workflow builds (#12179)
|
10 months ago |
Olivier Chafik
|
42994048a3
update function-calling.md w/ template override for functionary-small-v3.2 (#12214)
|
10 months ago |
Aaron Teo
|
e9b2f84f14
llava: add big-endian conversion for image encoder (#12218)
|
10 months ago |
uvos
|
e721c05c93
HIP/CUDA: set the paramerter value in maintain_cuda_graph instead of replaceing it. (#12209)
|
10 months ago |
Han Yin
|
57b6abf85a
android : fix KV cache log message condition (#12212)
|
10 months ago |
Henry Linjamäki
|
94bb63e4f0
opencl : fix buffer alignment (#12197)
|
10 months ago |
Henry Linjamäki
|
f79243992c
opencl : fix `ulong` kernel args were set from `int` variables (#12174)
|
10 months ago |
simon886212
|
ed4ce0dda2
opencl : fix profile-related errors (#12095)
|
10 months ago |
Rémy O
|
07d1572347
ggml-cpu: Faster IQ1 mul_mat_vec on AVX2 using BMI2 instructions (#12154)
|
10 months ago |
Akarshan Biswas
|
5e43f104cc
SYCL: Disable f16 Unary OPs as not supported by the kernels (#12201)
|
10 months ago |
Plamen Minev
|
16e4b22c5e
ggml : fix GGMLMetalClass ODR (#12200)
|
10 months ago |
Daniel Bevenius
|
074c4fd39d
ci : add fetch-depth to xcframework upload (#12195)
|
10 months ago |
Olivier Chafik
|
669912d9a5
`tool-call`: fix Qwen 2.5 Coder support, add micro benchmarks, support trigger patterns for lazy grammars (#12034)
|
10 months ago |
Daniel Bevenius
|
fa31c438e0
ci : fix xcframework artifact tag (#12191)
|
10 months ago |
Daniel Bevenius
|
3ccbfe5a71
ci : remove xframework upload (#12190)
|
10 months ago |
Clauszy
|
06a92a193a
server : fix cache reuse logic (#12161)
|
10 months ago |
Daniel Bevenius
|
a057897ad4
llama : add xcframework build script (#11996)
|
10 months ago |
mgroeber9110
|
5bbe6a9fe9
ggml : portability fixes for VS 2017 (#12150)
|
10 months ago |
Georgi Gerganov
|
20a9b8f5e1
readme : fix roadmap link (#12185)
|
10 months ago |
Sigbjørn Skjæret
|
56d7a9f812
main: allow preloading conversation with -p and add -st / --single-turn (#12145)
|
10 months ago |
Olivier Chafik
|
1a24c4621f
`server`: fix deadly typo in response_format.json_schema.schema handling (#12168)
|
10 months ago |
David Huang
|
becade5de7
HIP: implement FlashAttention via rocWMMA for CDNA and RDNA3+ (#12032)
|
10 months ago |
Georgi Gerganov
|
dfd6b2c0be
sync : ggml
|
10 months ago |
cmdr2
|
b64d7cc272
cuda: unary ops as float + de-duplicate (ggml/1130)
|
10 months ago |
Georgi Gerganov
|
3d1cf3cf33
sync : ggml
|
10 months ago |
cmdr2
|
0cbee131ad
cuda/vulkan: specify fp32-only support for some operations in supports_op (ggml/1129)
|
10 months ago |
Georgi Gerganov
|
8371d44595
sync : ggml
|
10 months ago |
cmdr2
|
87abb7e903
cuda/cpu: Increase support for fp16 unary operations (ggml/1125)
|
10 months ago |
Diego Devesa
|
6d4c23b81b
whisper : support GGML_BACKEND_DL (whisper/2843)
|
10 months ago |
midnight
|
6512a90037
cmake : fix compile assumptions for power9/etc (whisper/2777)
|
11 months ago |