Georgi Gerganov
|
ab47dec3d3
security : add note about RPC and server functionality (#13061)
|
9 months ago |
Georgi Gerganov
|
7b53389c24
metal : add memory pool for temp allocs (#12850)
|
9 months ago |
Xuan-Son Nguyen
|
243453533e
llava : update documentations (#13055)
|
9 months ago |
Diego Devesa
|
1d735c0b4f
ggml : add SSE 4.2 and x64 base variant for CPUs without AVX (#12871)
|
9 months ago |
Akarshan Biswas
|
5368ddda7a
SYCL: Add non-contiguous support in ROPE (#12993)
|
9 months ago |
Xuan-Son Nguyen
|
84a9bf2fc2
mtmd : merge llava, gemma3 and minicpmv CLI into single `llama-mtmd-cli` (#13012)
|
9 months ago |
Xuan-Son Nguyen
|
2016f07bd1
convert : experimental support for `--mmproj` flag (#13023)
|
9 months ago |
Jeffrey Morgan
|
6602304814
llava: fix errors in clip.h on certain compilers (#13030)
|
9 months ago |
Jeff Bolz
|
66168204be
vulkan: support noncontiguous rms_norm (#13031)
|
9 months ago |
Jeffrey Morgan
|
4ba9d711ba
metal: add neg operator (#13029)
|
9 months ago |
bandoti
|
00137157fc
Disable CI cross-compile builds (#13022)
|
9 months ago |
Sigbjørn Skjæret
|
fb28f4f80e
gguf-py : fix upload python package workflow (#13020)
|
9 months ago |
Xuan-Son Nguyen
|
37b9f0d29d
clip : refactor, add `image_manipulation` and `llava_uhd` classes (#13011)
|
9 months ago |
Daniel Tang
|
6408210082
main : Fix Ctrl+D/newline handling (#12951)
|
9 months ago |
Chris Thompson
|
aff9d107b0
gguf-py : GGUF Editor GUI - Python + Qt6 (#12930)
|
9 months ago |
Xuan-Son Nguyen
|
35370ba945
server : use std::move whenever possible (#12936)
|
9 months ago |
Akarshan Biswas
|
8d66005763
SYCL: Refactor and enable FP16 in binary broadcast OPs (#12975)
|
9 months ago |
Xuan-Son Nguyen
|
b9154ecff9
mtmd : add methods to access `mtmd_image_tokens` (#12906)
|
9 months ago |
Radoslav Gerganov
|
2db9ba1464
rpc : add RPC_CMD_HELLO (#12955)
|
9 months ago |
Georgi Gerganov
|
2f74c354c0
graph : make FA compatible with MLA + add initial Metal kernels (#12953)
|
9 months ago |
Alan Gray
|
207c22ec2d
ggml: Re-enable CUDA graphs in presence of CONT and DUP nodes (#12970)
|
9 months ago |
hipudding
|
7a395f67a7
CANN: Add support for async operator submission (#12864)
|
9 months ago |
Mikko Juola
|
971f245b3b
llama : recognize IBM Granite 3.3 FIM tokens (#12988)
|
9 months ago |
kimminsu
|
12b17501e6
opencl: fix incorrect local_size index in profiling log (#12868)
|
9 months ago |
Jeff Bolz
|
015022bb53
vulkan: enable coopmat2 FA gqa and split_k optimizations more often (#12931)
|
9 months ago |
Chenguang Li
|
b43d89e311
CANN: Add 310P operator support check (#12962)
|
9 months ago |
lhez
|
80f19b4186
opencl: split `ggml-opencl.cl` into multiple files and cleanup (#12886)
|
9 months ago |
Georgi Gerganov
|
f8f820cc4d
metal : add FA-vec kernels for head size 96 (#12952)
|
9 months ago |
hipudding
|
54a7272043
CANN: Add x86 build ci (#12950)
|
9 months ago |
David Huang
|
84778e9770
CUDA/HIP: Share the same unified memory allocation logic. (#12934)
|
9 months ago |