Johannes Gäßler
|
4a8ccb37ad
CUDA: no -sm row for very small matrices (#10185)
|
1 year ago |
Georgi Gerganov
|
2a82891a85
speculative : fix out-of-bounds access (#10289)
|
1 year ago |
Jeff Bolz
|
af148c9386
vulkan: Optimize binary ops (#10270)
|
1 year ago |
Jeff Bolz
|
66798e42fb
vulkan: Use macros to make the mat mul pipeline creation more concise (#10259)
|
1 year ago |
Michael Podvitskiy
|
fb4a0ec083
llama : propagate the results of `graph_compute` (#9525)
|
1 year ago |
Georgi Gerganov
|
5ea926dad7
sync : ggml
|
1 year ago |
Small Grass Forest
|
1ee9eea094
docs : update bindings list (#10261)
|
1 year ago |
Alexey Parfenov
|
ff7fb670d0
server : add missing docs (#10269)
|
1 year ago |
Jhen-Jie Hong
|
0e712a5acb
server : fix incorrect res in validate_model_chat_template (#10272)
|
1 year ago |
Brian
|
a0ec17b32e
metadata: Detailed Dataset Authorship Metadata (#8875)
|
1 year ago |
Alberto Cabrera Pérez
|
2e82ffa4af
sycl : Fixes to broken builds and test-backend-ops (#10257)
|
1 year ago |
Jeff Bolz
|
80dd7ff22f
vulkan: Optimize contiguous copies (#10254)
|
1 year ago |
Jeff Bolz
|
54ef9cfc72
vulkan: Throttle the number of shader compiles during the build step. (#10222)
|
1 year ago |
Georgi Gerganov
|
b0cefea58a
metal : more precise Q*K in FA vec kernel (#10247)
|
1 year ago |
Georgi Gerganov
|
b141e5f6ef
server : enable KV cache defrag by default (#10233)
|
1 year ago |
Georgi Gerganov
|
4b3a9212b6
flake.lock: Update (#10243)
|
1 year ago |
MaggotHATE
|
505f33274d
server : (web UI) Add back sampler settings (#10239)
|
1 year ago |
Jeff Bolz
|
160687b3ed
vulkan: Fix newly added tests for permuted mul_mat and 1D im2col (#10226)
|
1 year ago |
Georgi Gerganov
|
6423c65aa8
metal : reorder write loop in mul mat kernel + style (#10231)
|
1 year ago |
Georgi Gerganov
|
39a334a9aa
metal : fix build and some more comments (#10229)
|
1 year ago |
Georgi Gerganov
|
bb38cdd8ba
metal : fix F32 accumulation in FA vec kernel (#10232)
|
1 year ago |
Georgi Gerganov
|
f018acba22
llama : fix Qwen model type strings
|
1 year ago |
Georgi Gerganov
|
46323fa9ef
metal : hide debug messages from normal log
|
1 year ago |
SXX
|
5b359bb1e3
ggml: fix zero division in ‘dne’ calculation in CUDA COUNT_EQUAL operator when ‘ne’ is small (#10213)
|
1 year ago |
amritahs-ibm
|
e89213492d
ggml : optimize llamafile cpu matrix multiplication for ppc64le (#10156)
|
1 year ago |
haopeng
|
8fc393f246
scripts : fix pattern and get n_tokens in one go (#10221)
|
1 year ago |
Georgi Gerganov
|
ec450d3bbf
metal : opt-in compile flag for BF16 (#10218)
|
1 year ago |
Georgi Gerganov
|
695ad752b2
metal : improve clarity (minor) (#10171)
|
1 year ago |
Georgi Gerganov
|
841f27abdb
metal : optimize FA kernels (#10171)
|
1 year ago |
Jhen-Jie Hong
|
d05b3127bd
swift : exclude ggml-metal-embed.metal (#10211)
|
1 year ago |