Xuan-Son Nguyen
|
b9154ecff9
mtmd : add methods to access `mtmd_image_tokens` (#12906)
|
9 months ago |
Radoslav Gerganov
|
2db9ba1464
rpc : add RPC_CMD_HELLO (#12955)
|
9 months ago |
Georgi Gerganov
|
2f74c354c0
graph : make FA compatible with MLA + add initial Metal kernels (#12953)
|
9 months ago |
Alan Gray
|
207c22ec2d
ggml: Re-enable CUDA graphs in presence of CONT and DUP nodes (#12970)
|
9 months ago |
hipudding
|
7a395f67a7
CANN: Add support for async operator submission (#12864)
|
9 months ago |
Mikko Juola
|
971f245b3b
llama : recognize IBM Granite 3.3 FIM tokens (#12988)
|
9 months ago |
kimminsu
|
12b17501e6
opencl: fix incorrect local_size index in profiling log (#12868)
|
9 months ago |
Jeff Bolz
|
015022bb53
vulkan: enable coopmat2 FA gqa and split_k optimizations more often (#12931)
|
9 months ago |
Chenguang Li
|
b43d89e311
CANN: Add 310P operator support check (#12962)
|
9 months ago |
lhez
|
80f19b4186
opencl: split `ggml-opencl.cl` into multiple files and cleanup (#12886)
|
9 months ago |
Georgi Gerganov
|
f8f820cc4d
metal : add FA-vec kernels for head size 96 (#12952)
|
9 months ago |
hipudding
|
54a7272043
CANN: Add x86 build ci (#12950)
|
9 months ago |
David Huang
|
84778e9770
CUDA/HIP: Share the same unified memory allocation logic. (#12934)
|
9 months ago |
Akarshan Biswas
|
510676475f
SYCL: Add ROPE vision kernel (#12887)
|
9 months ago |
Juk Armstrong
|
daa422881a
llama : DeepSeek V2/V3 MLA implementation (#12801)
|
9 months ago |
Srihari-mcw
|
eccc7a1602
ggml : Add AVX512 implementation of GEMM - Q4_Kx8 (#12829)
|
9 months ago |
Chenguang Li
|
0019279bb5
CANN: Opt ROPE optimization (#12865)
|
9 months ago |
Xinpeng Dou
|
b0c75ac9f9
CANN: Optimize CANN buffer pool memory management (#12875)
|
9 months ago |
Russyyds
|
d6d2c2ab8c
Add performance print for gemma3 in example (#12929)
|
9 months ago |
Akarshan Biswas
|
75afa0ae31
SYCL: Fix im2col (#12910)
|
9 months ago |
Radoslav Gerganov
|
c772d54926
rpc : use ggml_context_ptr (#12938)
|
9 months ago |
Neo Zhang Jianyu
|
81c7e64fc2
dsiable curl lib check, this action is missed by commit bd3f59f81289b920bcc597a208c14f55e39ed37e (#12761) (#12937)
|
9 months ago |
Georgi Gerganov
|
526739b879
sync : ggml
|
9 months ago |
cmdr2
|
a25355e264
cpu: fix cpu backend's supports-op for GET_ROWS_BACK. fixes a fatal when running test-backend-ops with only the CPU backend (ggml/1190)
|
9 months ago |
SXX
|
e959d32b1c
ggml: use _mm[512/256]_dpbusd[_avx]_epi32 to directly accumulate into the result register (#12773)
|
9 months ago |
Alan Gray
|
307bfa253d
ggml: disable CUDA graphs for unsupported DUP and CONT node types (#12891)
|
9 months ago |
Ed Addario
|
71e90e8813
quantize: Handle user-defined quantization levels for additional tensors (#12511)
|
9 months ago |
Prajwal B Mehendarkar
|
bc091a4dc5
common : Define cache directory on AIX (#12915)
|
9 months ago |
Jeff Bolz
|
a4837577aa
vulkan: use aligned loads for flash attention mask (#12853)
|
9 months ago |
Matt Clayton
|
e59ea539b8
llava: Fix cpu-only clip image encoding sefault (#12907)
|
9 months ago |