alonfaraj
|
75fafcbccc
make : fix tests build (#2855)
|
2 years ago |
Henri Vasserman
|
6bbc598a63
ROCm Port (#1087)
|
2 years ago |
Georgi Gerganov
|
6381d4e110
gguf : new file format with flexible meta data (beta) (#2398)
|
2 years ago |
slaren
|
097e121e2f
llama : add benchmark example (#2626)
|
2 years ago |
drbh
|
7cf54e1f74
tests : adds simple llama grammar tests (#2618)
|
2 years ago |
Shouzheng Liu
|
bf83bff674
metal : matrix-matrix multiplication kernel (#2615)
|
2 years ago |
drbh
|
ee77efea2a
test : add simple grammar parsing tests (#2594)
|
2 years ago |
byte-6174
|
b19edd54d5
Adding support for llama2.c models (#2559)
|
2 years ago |
Johannes Gäßler
|
25d43e0eb5
CUDA: tuned mul_mat_q kernels (#2546)
|
2 years ago |
Martin Krasser
|
f5bfea0580
Allow passing grammar to completion endpoint (#2532)
|
2 years ago |
GiviMAD
|
34a14b28ff
[Makefile] Move ARM CFLAGS before compilation (#2536)
|
2 years ago |
DannyDaemonic
|
3498588e0f
Add --simple-io option for subprocesses and break out console.h and cpp (#1558)
|
2 years ago |
Eve
|
81844fbcfd
tests : Fix compilation warnings (Linux/GCC) (#2451)
|
2 years ago |
Johannes Gäßler
|
49e7cb5bb1
CUDA: fixed LLAMA_FAST compilation option (#2473)
|
2 years ago |
Johannes Gäßler
|
0728c5a8b9
CUDA: mmq CLI option, fixed mmq build issues (#2453)
|
2 years ago |
slaren
|
a113689571
ggml : add graph tensor allocator (#2411)
|
2 years ago |
Johannes Gäßler
|
11f3ca06b8
CUDA: Quantized matrix matrix multiplication (#2160)
|
2 years ago |
Cebtenzzre
|
6df1f5940f
make : build with -Wmissing-prototypes (#2394)
|
2 years ago |
Aarni Koskela
|
b3f138d058
Chat UI extras (#2366)
|
2 years ago |
Evan Jones
|
84e09a7d8b
llama : add grammar-based sampling (#1773)
|
2 years ago |
Jose Maldonado
|
91171b8072
make : fix CLBLAST compile support in FreeBSD (#2331)
|
2 years ago |
Jose Maldonado
|
73643f5fb1
gitignore : changes for Poetry users + chat examples (#2284)
|
2 years ago |
Georgi Gerganov
|
a814d04f81
make : fix indentation
|
2 years ago |
Sky Yan
|
42c7c2e2e9
make : support customized LLAMA_CUDA_NVCC and LLAMA_CUDA_CCBIN (#2275)
|
2 years ago |
Jiří Podivín
|
54e3bc76fe
make : add new target for test binaries (#2244)
|
2 years ago |
Przemysław Pawełczyk
|
9cf022a188
make : fix embdinput library and server examples building on MSYS2 (#2235)
|
2 years ago |
wzy
|
7dabc66f3c
make : use pkg-config for OpenBLAS (#2222)
|
2 years ago |
James Reynolds
|
229aab351c
make : fix combination of LLAMA_METAL and LLAMA_MPI (#2208)
|
2 years ago |
Evan Miller
|
5656d10599
mpi : add support for distributed inference via MPI (#2099)
|
2 years ago |
dylan
|
84525e7962
docker : add support for CUDA in docker (#1461)
|
2 years ago |