Georgi Gerganov
|
2db78c75e4
ggml : bump version to 0.9.1
|
3 months ago |
Rafal Lewczuk
|
02463ab27b
ggml-backend : add root cause in error message if loading backend library fails (#16172)
|
3 months ago |
Sigbjørn Skjæret
|
adc76347d7
ggml : check cuda and metal argsort limits and add test (#16323)
|
3 months ago |
Aleksander Grygier
|
3a2bdcda0b
Improve Mobile UI for dialogs and action dropdowns (#16222)
|
3 months ago |
Pascal
|
66bb7985c3
fix: preserved zero values in chat settings inputs and textareas by switching to nullish coalescing for field values and default placeholders (#16312)
|
3 months ago |
Vinkal
|
2f61c0f5bf
llama-cli: prevent spurious assistant token (#16202)
|
3 months ago |
ddh0
|
3ffd0fae47
perplexity : show more kl-divergence data (#16321)
|
3 months ago |
Georgi Gerganov
|
a4a0aa5ea2
ggml : fix dependencies for ggml_set_rows (#16318)
|
3 months ago |
Jeff Bolz
|
92cd103f62
vulkan: Fix validation failure in quantized flash attention (#16292)
|
3 months ago |
Sigbjørn Skjæret
|
b887d2f341
ggml : fix GGML_F32_VEC_FMA argument order in ggml_vec_mad1_f32 (#16307)
|
3 months ago |
crat0z
|
bd0af02fc9
common : fix reasoning before forced tool call via tool_choice = required (#16264)
|
3 months ago |
R0CKSTAR
|
d9e0e7c819
ci : fix musa docker build (#16306)
|
3 months ago |
Aaron Teo
|
0124ac989f
devops: switch to using ubuntu-22.04-s390x image (#16302)
|
3 months ago |
Imad Saddik
|
2811c65286
Fixed a few typos in the README of the LLaMA.cpp HTTP Server [no ci] (#16297)
|
3 months ago |
Jeff Bolz
|
d8359f5fde
vulkan: 64-bit im2col (#16135)
|
3 months ago |
Georgi Gerganov
|
6a2c6145a0
metal : extend mat-mat multiplication support (#16225)
|
3 months ago |
Georgi Gerganov
|
3b53634fe3
metal : fuse non-sequential nodes (#16102)
|
3 months ago |
Jeff Bolz
|
1384abf8b8
vulkan: handle mat_mul with A matrix > 4GB (#16176)
|
3 months ago |
Jeff Bolz
|
e6d65fb02d
vulkan: support arbitrary KV dimension in flash attention (#16160)
|
3 months ago |
Acly
|
8656f5de68
vulkan : make the vulkan.hpp dynamic dispatcher instance private (#16224)
|
3 months ago |
Aleksander Grygier
|
4807e8f96a
Show message actions by default (#16289)
|
3 months ago |
Aman Gupta
|
c0bfc57af4
CUDA: mul_mat_id for mmf for bs <= 64 for f16 and bs <= 32 for f32 (#16277)
|
3 months ago |
Johannes Gäßler
|
75a3a6c2cd
CUDA: refactor and deduplicate vector FA kernels (#16208)
|
3 months ago |
Dmytro Minochkin
|
0499b29c6f
vulkan: throw system error instead of SIGABRT during init on older devices (#16156)
|
3 months ago |
Adrien Gallouët
|
234e2ff8ed
server : remove old LLAMA_SERVER_SSL (#16290)
|
3 months ago |
Jeff Bolz
|
3f81b4e91c
vulkan: support GET_ROWS for k-quants (#16235)
|
3 months ago |
Adrien Gallouët
|
ace6a54565
build : add LLAMA_OPENSSL option (#16287)
|
3 months ago |
Vinkal
|
72b24d96c6
model : make minicpm embedding_scale, residual_scale and logit_scale optional with legacy defaults (#16273)
|
3 months ago |
Aaron Teo
|
624207e676
devops: add s390x & ppc64le CI (#15925)
|
3 months ago |
Aleksander Grygier
|
807e8c6d31
Enhance text file detection logic for file attachments (#16199)
|
3 months ago |