Markus Tavenrath
|
8ebe8ddebd
Improve Vulkan shader build system (#9239)
|
1 year ago |
compilade
|
9bc6db28d0
ggml-quants : ternary packing for TriLMs and BitNet b1.58 (#8151)
|
1 year ago |
awatuna
|
32b2ec88bc
Update build.yml (#9184)
|
1 year ago |
Michael Podvitskiy
|
1031771faa
CMake fix: host for msvc compiler can only be x86 or x64 (#8624)
|
1 year ago |
slaren
|
4db04784f9
cuda : fix defrag with quantized KV (#9319)
|
1 year ago |
slaren
|
bdf314f38a
llama-bench : fix NUL terminators in CPU name (#9313)
|
1 year ago |
Srihari-mcw
|
581c305186
ggml : AVX2 support for Q4_0_8_8 (#8713)
|
1 year ago |
Ouadie EL FAROUKI
|
5910ea9427
[SYCL] Fix DMMV dequantization (#9279)
|
1 year ago |
杨朱 · Kiki
|
c8671ae282
Fix broken links in docker.md (#9306)
|
1 year ago |
Radoslav Gerganov
|
82e3b03c11
rpc : make RPC servers come first in the device list (#9296)
|
1 year ago |
Pascal Patry
|
9379d3cc17
readme : rename result_format to response_format (#9300)
|
1 year ago |
Georgi Gerganov
|
7605ae7daf
flake.lock: Update (#9261)
|
1 year ago |
Aarni Koskela
|
8962422b1c
llama-bench : add JSONL (NDJSON) output mode (#9288)
|
1 year ago |
Georgi Gerganov
|
b69a480af4
readme : refactor API section + remove old hot topics
|
1 year ago |
Xuan Son Nguyen
|
48baa61ecc
server : test script : add timeout for all requests (#9282)
|
1 year ago |
Zhenwei Jin
|
f1485161e5
src: make tail invalid when kv cell is intersection for mamba (#9249)
|
1 year ago |
slaren
|
048de848ee
docker : fix missing binaries in full-cuda image (#9278)
|
1 year ago |
yuri@FreeBSD
|
f771d064a9
ggml : add pthread includes on FreeBSD (#9258)
|
1 year ago |
Xuan Son Nguyen
|
6e7d133a5f
server : refactor multitask handling (#9274)
|
1 year ago |
Guoliang Hua
|
b60074f1c2
llama-cli : remove duplicated log message (#9275)
|
1 year ago |
Tushar
|
9c1ba55733
build(nix): Package gguf-py (#5664)
|
1 year ago |
Georgi Gerganov
|
c6d4cb4655
llama : minor style
|
1 year ago |
Molly Sophia
|
8f1d81a0b6
llama : support RWKV v6 models (#8980)
|
1 year ago |
Echo Nolan
|
a47667cff4
nix: fix CUDA build - replace deprecated autoAddOpenGLRunpathHook
|
1 year ago |
Srihari-mcw
|
ea5d7478b1
sgemm : improved Q4_0 and Q8_0 performance via 4xN and Mx4 gemm (#8908)
|
1 year ago |
Daniel Bevenius
|
49271efbaf
llama : fix typo in xcda_array_view comment [no ci] (#9132)
|
1 year ago |
Sutou Kouhei
|
0ab30f8d82
llama : fix llama_split_mode enum values in main_gpu document (#9057)
|
1 year ago |
蕭澧邦
|
cddae4884c
Correct typo run_llama2.sh > run-llama2.sh (#9149)
|
1 year ago |
tc-mb
|
7ea8d80d53
llava : the function "clip" should be int (#9237)
|
1 year ago |
Faisal Zaghloul
|
42c76d1358
Threadpool: take 2 (#8672)
|
1 year ago |