Srihari-mcw | 74d73dc85c | Make updates to fix issues with clang-cl builds while using AVX512 flags (#10314) | 1 year ago
Johannes Gäßler | 4047be74da | scripts: update compare-llama-bench.py (#10319) | 1 year ago
slaren | 883d206fbd | ggml : fix some build issues | 1 year ago
Georgi Gerganov | 09ecbcb596 | cmake : fix ppc64 check (whisper/0) | 1 year ago
thewh1teagle | 3225008973 | ggml : vulkan logs (whisper/2547) | 1 year ago
Georgi Gerganov | cbf5541a82 | sync : ggml | 1 year ago
Eve | 18429220bd | AVX BF16 and single scale quant optimizations (#10212) | 1 year ago
R0CKSTAR | f0204a0ec7 | ci: build test musa with cmake (#10298) | 1 year ago
Romain Biessy | 57f8355b29 | sycl: Update Intel docker images to use DPC++ 2025.0 (#10305) | 1 year ago
Xuan Son Nguyen | 9901068ac7 | server : (web UI) add copy button for code block, fix api key (#10242) | 1 year ago
Chenguang Li | 231f9360d9 | cann: dockerfile and doc adjustment (#10302) | 1 year ago
Georgi Gerganov | 4802ad350b | scripts : fix regex in sync [no ci] | 1 year ago
Romain Biessy | 5a54af4d4f | sycl: Use syclcompat::dp4a (#10267) | 1 year ago
Charles Xu | 1607a5e5b0 | backend cpu: add online flow for aarch64 Q4_0 GEMV/GEMM kernels (#9921) | 1 year ago
Diego Devesa | ae8de6d50a | ggml : build backends as libraries (#10256) | 1 year ago
Johannes Gäßler | 4a8ccb37ad | CUDA: no -sm row for very small matrices (#10185) | 1 year ago
Georgi Gerganov | 2a82891a85 | speculative : fix out-of-bounds access (#10289) | 1 year ago
Jeff Bolz | af148c9386 | vulkan: Optimize binary ops (#10270) | 1 year ago
Jeff Bolz | 66798e42fb | vulkan: Use macros to make the mat mul pipeline creation more concise (#10259) | 1 year ago
Michael Podvitskiy | fb4a0ec083 | llama : propagate the results of `graph_compute` (#9525) | 1 year ago
Georgi Gerganov | 5ea926dad7 | sync : ggml | 1 year ago
Small Grass Forest | 1ee9eea094 | docs : update bindings list (#10261) | 1 year ago
Alexey Parfenov | ff7fb670d0 | server : add missing docs (#10269) | 1 year ago
Jhen-Jie Hong | 0e712a5acb | server : fix incorrect res in validate_model_chat_template (#10272) | 1 year ago
Brian | a0ec17b32e | metadata: Detailed Dataset Authorship Metadata (#8875) | 1 year ago
Alberto Cabrera Pérez | 2e82ffa4af | sycl : Fixes to broken builds and test-backend-ops (#10257) | 1 year ago
Jeff Bolz | 80dd7ff22f | vulkan: Optimize contiguous copies (#10254) | 1 year ago
Jeff Bolz | 54ef9cfc72 | vulkan: Throttle the number of shader compiles during the build step. (#10222) | 1 year ago
Georgi Gerganov | b0cefea58a | metal : more precise Q*K in FA vec kernel (#10247) | 1 year ago
Georgi Gerganov | b141e5f6ef | server : enable KV cache defrag by default (#10233) | 1 year ago