Sam/Samuel
|
f4ce81c45e
metal: optimise `GGML_OP_SUM` (#16559)
|
3 months ago |
Georgi Gerganov
|
17304cbcc1
server : fix img token logs (#16595)
|
3 months ago |
Xuan-Son Nguyen
|
3e3cb19f64
llama-quant: add support for mmproj (#16592)
|
3 months ago |
Julius Tischbein
|
5acd455460
CUDA: Changing the CUDA scheduling strategy to spin (#16585)
|
3 months ago |
Georgi Gerganov
|
554fd578a5
server : fix mtmd checkpoints (#16591)
|
3 months ago |
Georgi Gerganov
|
fa882fd2b1
metal : avoid using Metal's gpuAddress property (#16576)
|
3 months ago |
SavicStefan
|
ffa059034c
vulkan: Add ACC_TYPE_VEC2 implementation (#16203)
|
3 months ago |
Aman Gupta
|
120bf7046d
CUDA + openCL: fix bug in accessing rms_norm->src while doing fusion (#16577)
|
3 months ago |
Jeff Bolz
|
4258e0cfe7
vulkan: Support FA with K/V in F32 (#16543)
|
3 months ago |
Jeff Bolz
|
7ea15bb64c
vulkan: Improve build time for MSVC (#16545)
|
3 months ago |
Johannes Gäßler
|
9c7185dd28
CUDA: enable FA for FP32 KV cache (#16546)
|
3 months ago |
Aman Gupta
|
1ee9d0b415
CUDA: use fastdiv + ggml_cuda_mad for mmvf (#16557)
|
3 months ago |
Aman Gupta
|
48e2fa9fb7
CUDA: add fp kernel for larger batch size MoE (#16512)
|
3 months ago |
Anav Prasad
|
5b6913c47b
cuda : remove legacy copy-op pointer indirection code (#16485)
|
3 months ago |
Georgi Gerganov
|
bc07349a7f
server : dynamic token limit for prompt cache (#16560)
|
3 months ago |
Georgi Gerganov
|
e60f241eac
metal : FA support F32 K and V and head size = 32 (#16531)
|
3 months ago |
Georgi Gerganov
|
e38b7c6e9e
graph : support cacheless embeddings with FA and iSWA (#16528)
|
3 months ago |
lhez
|
5016b72862
opencl: fix build targeting CL 2 (#16554)
|
3 months ago |
Johannes Gäßler
|
7049736b2d
CUDA: fix numerical issues in tile FA kernel (#16540)
|
3 months ago |
Jie Fu (傅杰)
|
01d2bdc2bc
ggml : fix build broken with -march=armv9-a on MacOS (#16520)
|
3 months ago |
Chenguang Li
|
56fc38b965
CANN: fix CPU memory leak in CANN backend (#16549)
|
3 months ago |
Pascal
|
1fb9504eb7
fix: add remark plugin to render raw HTML as literal text (#16505)
|
3 months ago |
Sam/Samuel
|
3f750f8d76
metal: add support for opt_step_sgd (#16539)
|
3 months ago |
Georgi Gerganov
|
c515fc5771
ggml : fix scalar path for computing norm (#16558)
|
3 months ago |
hipudding
|
f9bc66c3eb
CANN: Update several operators to support FP16 data format (#16251)
|
3 months ago |
Sam/Samuel
|
a31cf36ad9
metal : add opt_step_adamw and op_sum (#16529)
|
3 months ago |
Pascal
|
81d54bbfd5
webui: remove client-side context pre-check and rely on backend for limits (#16506)
|
3 months ago |
Neo Zhang Jianyu
|
c7be9febcb
[SYCL] fix UT fault cases: count-equal, argsort, pad OPs (#16521)
|
3 months ago |
Mathieu Baudier
|
8415f61e23
ci : add Vulkan on Ubuntu with default packages build (#16532)
|
3 months ago |
Aldehir Rojas
|
2c301e91ab
common : handle unicode during partial json parsing (#16526)
|
3 months ago |