Kawrakow
|
99009e72f8
ggml : add SOTA 2,3,4,5,6 bit k-quantizations (#1684)
|
2 years ago |
Georgi Gerganov
|
ecb217db4f
llama : Metal inference (#1642)
|
2 years ago |
Johannes Gäßler
|
3b126f654f
LLAMA_DEBUG adds debug symbols (#1617)
|
2 years ago |
Kerfuffle
|
0df7d63e5b
Include server in releases + other build system cleanups (#1610)
|
2 years ago |
Johannes Gäßler
|
1fcdcc28b1
cuda : performance optimizations (#1530)
|
2 years ago |
0cc4m
|
2e6cd4b025
OpenCL Token Generation Acceleration (#1459)
|
2 years ago |
Stefan Sydow
|
7780e4f479
make : .PHONY clean (#1553)
|
2 years ago |
Zenix
|
b8ee340abe
feature : support blis and other blas implementation (#1536)
|
2 years ago |
Georgi Gerganov
|
ea600071cb
Revert "feature : add blis and other BLAS implementation support (#1502)"
|
2 years ago |
Zenix
|
07e9ace0f9
feature : add blis and other BLAS implementation support (#1502)
|
2 years ago |
sandyiscool
|
2a5ee023ad
Add alternate include path for openblas (#1476)
|
2 years ago |
Georgi Gerganov
|
bda4d7c215
make : fix PERF build with cuBLAS
|
2 years ago |
DaniAndTheWeb
|
173d0e6419
makefile: automatic Arch Linux detection (#1332)
|
2 years ago |
Ionoclast Laboratories
|
2d13786e91
Fix for OpenCL / clbast builds on macOS. (#1329)
|
2 years ago |
DannyDaemonic
|
55bc5f0900
Call sh on build-info.sh (#1294)
|
2 years ago |
DannyDaemonic
|
f4cef87edf
Add git-based build information for better issue tracking (#1232)
|
2 years ago |
Pavol Rusnak
|
6f79699286
build: add armv{6,7,8} support to cmake (#1251)
|
2 years ago |
Stephan Walter
|
f0d70f147d
Various fixes to mat_mul benchmark (#1253)
|
2 years ago |
Georgi Gerganov
|
214b6a3570
ggml : adjust mul_mat_f16 work memory (#1226)
|
2 years ago |
Georgi Gerganov
|
305eb5afd5
build : fix reference to old llama_util.h
|
2 years ago |
slaren
|
7fc50c051a
cuBLAS: use host pinned memory and dequantize while copying (#1207)
|
2 years ago |
0cc4m
|
7296c961d9
ggml : add CLBlast support (#1164)
|
2 years ago |
Johannes Gäßler
|
92a6e13a31
Add Manjaro CUDA include and lib dirs to Makefile (#1212)
|
2 years ago |
slaren
|
e4cf982e0d
Fix cuda compilation (#1128)
|
2 years ago |
Georgi Gerganov
|
e4422e299c
ggml : better PERF prints + support "LLAMA_PERF=1 make"
|
2 years ago |
Georgi Gerganov
|
872c365a91
ggml : fix AVX build + update to new Q8_0 format
|
2 years ago |
slaren
|
50cb666b8a
Improve cuBLAS performance by using a memory pool (#1094)
|
2 years ago |
slaren
|
2005469ea1
Add Q4_3 support to cuBLAS (#1086)
|
2 years ago |
源文雨
|
5addcb120c
fix: LLAMA_CUBLAS=1 undefined reference 'shm_open' (#1080)
|
2 years ago |
slaren
|
02d6988121
Improve cuBLAS performance by dequantizing on the GPU (#1065)
|
2 years ago |