Jeff Bolz
|
80dd7ff22f
vulkan: Optimize contiguous copies (#10254)
|
1 year ago |
Jeff Bolz
|
54ef9cfc72
vulkan: Throttle the number of shader compiles during the build step. (#10222)
|
1 year ago |
Georgi Gerganov
|
b0cefea58a
metal : more precise Q*K in FA vec kernel (#10247)
|
1 year ago |
Georgi Gerganov
|
b141e5f6ef
server : enable KV cache defrag by default (#10233)
|
1 year ago |
Georgi Gerganov
|
4b3a9212b6
flake.lock: Update (#10243)
|
1 year ago |
MaggotHATE
|
505f33274d
server : (web UI) Add back sampler settings (#10239)
|
1 year ago |
Jeff Bolz
|
160687b3ed
vulkan: Fix newly added tests for permuted mul_mat and 1D im2col (#10226)
|
1 year ago |
Georgi Gerganov
|
6423c65aa8
metal : reorder write loop in mul mat kernel + style (#10231)
|
1 year ago |
Georgi Gerganov
|
39a334a9aa
metal : fix build and some more comments (#10229)
|
1 year ago |
Georgi Gerganov
|
bb38cdd8ba
metal : fix F32 accumulation in FA vec kernel (#10232)
|
1 year ago |
Georgi Gerganov
|
f018acba22
llama : fix Qwen model type strings
|
1 year ago |
Georgi Gerganov
|
46323fa9ef
metal : hide debug messages from normal log
|
1 year ago |
SXX
|
5b359bb1e3
ggml: fix zero division in ‘dne’ calculation in CUDA COUNT_EQUAL operator when ‘ne’ is small (#10213)
|
1 year ago |
amritahs-ibm
|
e89213492d
ggml : optimize llamafile cpu matrix multiplication for ppc64le (#10156)
|
1 year ago |
haopeng
|
8fc393f246
scripts : fix pattern and get n_tokens in one go (#10221)
|
1 year ago |
Georgi Gerganov
|
ec450d3bbf
metal : opt-in compile flag for BF16 (#10218)
|
1 year ago |
Georgi Gerganov
|
695ad752b2
metal : improve clarity (minor) (#10171)
|
1 year ago |
Georgi Gerganov
|
841f27abdb
metal : optimize FA kernels (#10171)
|
1 year ago |
Jhen-Jie Hong
|
d05b3127bd
swift : exclude ggml-metal-embed.metal (#10211)
|
1 year ago |
Xuan Son Nguyen
|
76c6e7f105
server : minor UI fix (#10207)
|
1 year ago |
Xuan Son Nguyen
|
a71d81cf8c
server : revamp chat UI with vuejs and daisyui (#10175)
|
1 year ago |
Georgi Gerganov
|
eec4d71737
scripts : add amx to sync-ggml.sh [no ci]
|
1 year ago |
Georgi Gerganov
|
3b08828674
sync : ggml
|
1 year ago |
Georgi Gerganov
|
a2c6fd747c
scripts : sync update
|
1 year ago |
Diego Devesa
|
97404c4a03
ggml : add ggml-cpu.h to the public headers (#10204)
|
1 year ago |
Faisal Zaghloul
|
60e17ce23c
Remove identical wte/etw logic for jais (#10203)
|
1 year ago |
wwoodsTM
|
5107e8cea3
DRY: Fixes clone functionality (#10192)
|
1 year ago |
snadampal
|
2319126a70
fix q4_0_8_8 format for corrupted tokens issue (#10198)
|
1 year ago |
Zhiyuan Li
|
3bcd40b3c5
Optimize RWKV6 Operator Naming and Implement Multi-core CPU/ SYCL Acceleration (#10133)
|
1 year ago |
Georgi Gerganov
|
5c333e0140
metal : add BF16 support (#8439)
|
1 year ago |