Georgi Gerganov | ba1cf846ed | cann : fix doxy (ggml/0) | 1 year ago
Mengqing Cao | d2d3200b38 | cann : add Ascend NPU support (whisper/2336) | 1 year ago
Georgi Gerganov | 51d964a4ef | cuda : mark BF16 CONT as unsupported | 1 year ago
Salvatore Mesoraca | efe6a83e30 | ggml : fix cont with transposed tensors when one dimension is 1 (ggml/934) | 1 year ago
Kevin Gibbons | fbb7fcffbc | llama : set attrs of mislabelled EOT/EOM tokens (#9348) | 1 year ago
Georgi Gerganov | a5b5d9a101 | llama.android : fix build (#9350) | 1 year ago
Georgi Gerganov | f12295b8a9 | llama : fix empty ring buffer push (#9358) | 1 year ago
Georgi Gerganov | faf69d4237 | llama : sanitize invalid tokens (#9357) | 1 year ago
Eve | e536426ded | llamafile : disable sgemm for batch-size 1 (#9330) | 1 year ago
Xuan Son Nguyen | 1b9ae5189c | common : refactor arg parser (#9308) | 1 year ago
slaren | e32d0816ed | ggml : always check bounds on get_rows operations (#9354) | 1 year ago
Georgi Gerganov | df270ef745 | llama : refactor sampling v2 (#9294) | 1 year ago
Xuan Son Nguyen | 947538acb8 | ggml : fix missing `cpu_set_t` on emscripten (#9336) | 1 year ago
slaren | 6c89eb0b47 | ci : disable rocm image creation (#9340) | 1 year ago
Xuan Son Nguyen | 9b2c24c099 | server : simplify state machine for slot (#9283) | 1 year ago
Aarni Koskela | 134bc38ecf | llama-bench : log benchmark progress (#9287) | 1 year ago
Aarni Koskela | 815b1fb20a | batched-bench : add `--output-format jsonl` option (#9293) | 1 year ago
Changyeon Kim | 409dc4f8bb | ggml : fix build break for the vulkan-debug (#9265) | 1 year ago
Xuan Son Nguyen | 4a1411b4f1 | server : fix missing lock (#9334) | 1 year ago
Markus Tavenrath | 8ebe8ddebd | Improve Vulkan shader build system (#9239) | 1 year ago
compilade | 9bc6db28d0 | ggml-quants : ternary packing for TriLMs and BitNet b1.58 (#8151) | 1 year ago
awatuna | 32b2ec88bc | Update build.yml (#9184) | 1 year ago
Michael Podvitskiy | 1031771faa | CMake fix: host for msvc compiler can only be x86 or x64 (#8624) | 1 year ago
slaren | 4db04784f9 | cuda : fix defrag with quantized KV (#9319) | 1 year ago
slaren | bdf314f38a | llama-bench : fix NUL terminators in CPU name (#9313) | 1 year ago
Srihari-mcw | 581c305186 | ggml : AVX2 support for Q4_0_8_8 (#8713) | 1 year ago
Ouadie EL FAROUKI | 5910ea9427 | [SYCL] Fix DMMV dequantization (#9279) | 1 year ago
杨朱 · Kiki | c8671ae282 | Fix broken links in docker.md (#9306) | 1 year ago
Radoslav Gerganov | 82e3b03c11 | rpc : make RPC servers come first in the device list (#9296) | 1 year ago
Pascal Patry | 9379d3cc17 | readme : rename result_format to response_format (#9300) | 1 year ago