Xuan-Son Nguyen
|
ecda2ec4b3
mtmd : Support Pixtral 12B (#13065)
|
9 months ago |
piDack
|
eb1776b15a
convert : Append mult-eos,half-rope,bos to GLM4-0414 and Z (#13021)
|
9 months ago |
Radoslav Gerganov
|
2cca6c01e4
rpc : add command line option for number of threads for the CPU backend (#13060)
|
9 months ago |
Johannes Gäßler
|
658987cfc9
CUDA: noncont MMVQ + batched bs1 MUL_MAT_ID (#13014)
|
9 months ago |
Xuan-Son Nguyen
|
dc39a5e7a8
mtmd : support SmolVLM (version 1 and 2) (#13050)
|
9 months ago |
Georgi Gerganov
|
ab47dec3d3
security : add note about RPC and server functionality (#13061)
|
9 months ago |
Georgi Gerganov
|
7b53389c24
metal : add memory pool for temp allocs (#12850)
|
9 months ago |
Xuan-Son Nguyen
|
243453533e
llava : update documentations (#13055)
|
9 months ago |
Diego Devesa
|
1d735c0b4f
ggml : add SSE 4.2 and x64 base variant for CPUs without AVX (#12871)
|
9 months ago |
Akarshan Biswas
|
5368ddda7a
SYCL: Add non-contiguous support in ROPE (#12993)
|
9 months ago |
Xuan-Son Nguyen
|
84a9bf2fc2
mtmd : merge llava, gemma3 and minicpmv CLI into single `llama-mtmd-cli` (#13012)
|
9 months ago |
Xuan-Son Nguyen
|
2016f07bd1
convert : experimental support for `--mmproj` flag (#13023)
|
9 months ago |
Jeffrey Morgan
|
6602304814
llava: fix errors in clip.h on certain compilers (#13030)
|
9 months ago |
Jeff Bolz
|
66168204be
vulkan: support noncontiguous rms_norm (#13031)
|
9 months ago |
Jeffrey Morgan
|
4ba9d711ba
metal: add neg operator (#13029)
|
9 months ago |
bandoti
|
00137157fc
Disable CI cross-compile builds (#13022)
|
9 months ago |
Sigbjørn Skjæret
|
fb28f4f80e
gguf-py : fix upload python package workflow (#13020)
|
9 months ago |
Xuan-Son Nguyen
|
37b9f0d29d
clip : refactor, add `image_manipulation` and `llava_uhd` classes (#13011)
|
9 months ago |
Daniel Tang
|
6408210082
main : Fix Ctrl+D/newline handling (#12951)
|
9 months ago |
Chris Thompson
|
aff9d107b0
gguf-py : GGUF Editor GUI - Python + Qt6 (#12930)
|
9 months ago |
Xuan-Son Nguyen
|
35370ba945
server : use std::move whenever possible (#12936)
|
9 months ago |
Akarshan Biswas
|
8d66005763
SYCL: Refactor and enable FP16 in binary broadcast OPs (#12975)
|
9 months ago |
Xuan-Son Nguyen
|
b9154ecff9
mtmd : add methods to access `mtmd_image_tokens` (#12906)
|
9 months ago |
Radoslav Gerganov
|
2db9ba1464
rpc : add RPC_CMD_HELLO (#12955)
|
9 months ago |
Georgi Gerganov
|
2f74c354c0
graph : make FA compatible with MLA + add initial Metal kernels (#12953)
|
9 months ago |
Alan Gray
|
207c22ec2d
ggml: Re-enable CUDA graphs in presence of CONT and DUP nodes (#12970)
|
9 months ago |
hipudding
|
7a395f67a7
CANN: Add support for async operator submission (#12864)
|
9 months ago |
Mikko Juola
|
971f245b3b
llama : recognize IBM Granite 3.3 FIM tokens (#12988)
|
9 months ago |
kimminsu
|
12b17501e6
opencl: fix incorrect local_size index in profiling log (#12868)
|
9 months ago |
Jeff Bolz
|
015022bb53
vulkan: enable coopmat2 FA gqa and split_k optimizations more often (#12931)
|
9 months ago |