Xuan-Son Nguyen | 06c2b1561d | convert : fix Norway problem when parsing YAML (#12114) | 10 months ago
William Tambellini | 70680c48e5 | ggml : upgrade init_tensor API to return a ggml_status (#11854) | 10 months ago
Xuan-Son Nguyen | c43a3e7996 | llama : add Phi-4-mini support (supersede #12099) (#12108) | 10 months ago
Alex Brooks | 84d5f4bc19 | Update granite vision docs for 3.2 model (#12105) | 10 months ago
Rémy O | 438a83926a | vulkan: add specific MMV kernels for IQ2 and IQ3 quants + optimizations (#11595) | 10 months ago
Johannes Gäßler | 9c42b1718c | CUDA: fix logic for V100 + GGML_CUDA_FORCE_MMQ (#12098) | 10 months ago
Prashant Vithule | 05e6f5aad0 | ggml: aarch64: implement SVE kernels for q2_k_q8_k vector dot (#12064) | 10 months ago
hipudding | 673cfef9aa | CANN: Fix build error with GCC 13 (#11990) | 10 months ago
Eve | fbeda9002d | vulkan: matmul dequantization improvements (#12015) | 10 months ago
Daniele | 581650b7ca | vulkan: improve im2col (#11826) | 10 months ago
Vladimir Vuksanovic | b95c8af37c | cmake: Fix ggml backend dependencies and installation (#11818) | 10 months ago
Ting Lou | a800ae46da | llava : add struct for FFI bindgen (#12079) | 10 months ago
Sigbjørn Skjæret | 69050a11be | Refactor gguf scripts to improve metadata handling (#11909) | 10 months ago
Aleksei Nikiforov | 3567ee3a94 | gguf-py: enable reading non-native endian files (#12081) | 10 months ago
Kante Yin | 53e4db1012 | readme : update infra list (#9096) | 10 months ago
Olivier Chafik | d7cfe1ffe0 | docs: add docs/function-calling.md to lighten server/README.md's plight (#12069) | 10 months ago
Jeff Bolz | a82c9e7c23 | vulkan: fix assertion when qy_needs_dequant (#12068) | 10 months ago
rhjdvsgsgks | 401af80b54 | server: handle echo=false on /v1/completions (#12060) | 10 months ago
Judd | c132239bfb | add OP sigmoid (#12056) | 10 months ago
Molly Sophia | 393fca629e | ggml-cpu: Fix build with sve (#12059) | 10 months ago
Rémy O | 61d4f39dfe | vulkan: implement more backpropagation operators (#11914) | 10 months ago
Olivier Chafik | 0b52745649 | server: support add_generation_prompt query param (#12062) | 10 months ago
Alex Brooks | 4d1051a40f | Add Doc for Converting Granite Vision -> GGUF (#12006) | 10 months ago
Vitali Lovich | 3e9a2860e9 | llama : expose llama_model_n_head_kv in the API (#11997) | 10 months ago
Gian-Carlo Pascutto | 58d07a8043 | metal : copy kernels for quant to F32/F16 conversions (#12017) | 10 months ago
lhez | 34a846b584 | opencl: fix for small models (#11950) | 10 months ago
Alex Brooks | 7a2c913e66 | llava : Add Granite Vision Support (#11794) | 11 months ago
Neo Zhang Jianyu | 08d5986290 | [SYCL] Optimize mul_mat for Q4_0 on Intel GPU (#12035) | 11 months ago
Aleksei Nikiforov | 651adf4b66 | gguf_convert_endian.py: implement byteswapping for q4_k and q6_k (#11349) | 11 months ago
Akarshan Biswas | 8303e8b0fb | SYCL: Fix GGML_SYCL_DEBUG macro (#11995) | 11 months ago