Author | Commit | Message | Date
Georgi Gerganov | f3f65429c4 | llama : reorganize source code + improve CMake (#8006) | 1 year ago
Johannes Gäßler | a818f3028d | CUDA: use MMQ instead of cuBLAS by default (#8075) | 1 year ago
slaren | 95f57bb5d5 | ggml : remove ggml_task_type and GGML_PERF (#8017) | 1 year ago
Clint Herron | c5a8d4b749 | JSON Schema to GBNF integration tests (#7790) | 1 year ago
Ulrich Drepper | 61665277af | Allow compiling with CUDA without CUDA runtime installed (#7989) | 1 year ago
0cc4m | 7c7836d9d4 | Vulkan Shader Refactor, Memory Debugging Option (#7947) | 1 year ago
Xuan Son Nguyen | 0c7b3595b9 | Add `cvector-generator` example (#7514) | 1 year ago
slaren | f578b86b21 | move BLAS to a separate backend (#6210) | 1 year ago
Olivier Chafik | 1c641e6aac | `build`: rename main → llama-cli, server → llama-server, llava-cli → llama-llava-cli, etc... (#7809) | 1 year ago
Johannes Gäßler | 7d1a378b8f | CUDA: refactor mmq, dmmv, mmvq (#7716) | 1 year ago
Georgi Gerganov | 554c247caf | ggml : remove OpenCL (#7735) | 1 year ago
Georgi Gerganov | 0cd6bd3483 | llama : remove beam search (#7736) | 1 year ago
Radoslav Gerganov | bde7cd3cd9 | llama : offload to RPC in addition to other backends (#7640) | 1 year ago
Masaya, Kato | a5735e4426 | ggml : use OpenMP as a thread pool (#7606) | 1 year ago
Johannes Gäßler | 0b832d53ba | make: fix debug options not being applied to NVCC (#7714) | 1 year ago
Yazan Agha-Schrader | 2e666832e6 | server : new UI (#7633) | 1 year ago
Johannes Gäßler | 9b596417af | CUDA: quantized KV support for FA vec (#7527) | 1 year ago
Daniele | 30e238b246 | Improve HIP compatibility (#7672) | 1 year ago
Johannes Gäßler | 10b1e45876 | make: add --device-debug to NVCC debug flags (#7542) | 1 year ago
Georgi Gerganov | e84b71c2c6 | ggml : drop support for QK_K=64 (#7473) | 1 year ago
junchao-loongson | 65c58207ec | ggml : add loongarch lsx and lasx support (#6454) | 1 year ago
slaren | d359f30921 | llama : remove MPI backend (#7395) | 1 year ago
Gavin Zhao | 82ca83db3c | ROCm: use native CMake HIP support (#5966) | 1 year ago
agray3 | bc4bba364f | Introduction of CUDA Graphs to LLama.cpp (#6766) | 1 year ago
Georgi Gerganov | 92139b90af | tests : add test-tokenizer-0.sh + fix some tokenizers (#7036) | 1 year ago
Georgi Gerganov | f4ab2a4147 | llama : fix BPE pre-tokenization (#6920) | 1 year ago
Przemysław Pawełczyk | 577277ffd2 | make : change GNU make default CXX from g++ to c++ (#6966) | 1 year ago
Pierrick Hymbert | 0c4d489e29 | quantize: add imatrix and dataset metadata in GGUF (#6658) | 1 year ago
Justine Tunney | 192090bae4 | llamafile : improve sgemm.cpp (#6796) | 1 year ago
Olivier Chafik | 5cf5e7d490 | `build`: generate hex dump of server assets during build (#6661) | 1 year ago