Clint Herron | b5a5f34efa | Removing extra blank lines that were breaking Lint. (#8067) | 1 year ago
Xuan Son Nguyen | 3e58b0ee35 | cvector: fix CI + correct help message (#8064) | 1 year ago
HatsuneMikuUwU33 | adf480c3ab | cvector-generator: Moe Moe Fixie-Fixie for Lots of Formats~! ♡(ᐢ ᴥ ᐢ)♡ (#8052) | 1 year ago
0xspringtime | 3aa184a8c7 | convert-hf : change assert to exception (#8015) | 1 year ago
ddh0 | 5b48cd53a8 | Update llama-quantize ppl/file size output from LLaMA-v1 to Llama-3 values (#8058) | 1 year ago
Clint Herron | c5a8d4b749 | JSON Schema to GBNF integration tests (#7790) | 1 year ago
k.h.lai | 557b653dc9 | vulkan: detect multiple devices by deviceUUID instead of deviceID (#8022) | 1 year ago
Eve | 7d5e8777ae | ggml : AVX IQ quants (#7845) | 1 year ago
Georgi Gerganov | a927b0f3dd | llama : optimize long word tokenization with WPM (#8034) | 1 year ago
Douglas Hanley | 80ea089d77 | llama : allow pooled embeddings on any model (#7477) | 1 year ago
Shuichi Tsutsumi | 0e64591e82 | swiftui : enable stream updating (#7754) | 1 year ago
Hamdoud Hakem | b1ef562bc1 | requirements : Bump torch and numpy for python3.12 (#8041) | 1 year ago
Hamdoud Hakem | 17b291a6a5 | convert-hf : Fix the encoding in the convert-hf-to-gguf-update.py (#8040) | 1 year ago
Johannes Gäßler | abd894ad96 | common: fix warning (#8036) | 1 year ago
luoyu-intel | de391e4c80 | [SYCL] Fix windows build and inference (#8003) | 1 year ago
Johannes Gäßler | d50f8897a7 | CUDA: stream-k decomposition for MMQ (#8018) | 1 year ago
Michael de Gans | 2075a66a96 | metal : fix `ggml_metal_supports_op` for BF16 (#8021) | 1 year ago
sasha0552 | ba58993152 | server : fix smart slot selection (#8020) | 1 year ago
Michael de Gans | a7854743c5 | un-ignore `build-info.cmake` and `build-info.sh` (#7996) | 1 year ago
slaren | 9c77ec1d74 | ggml : synchronize threads using barriers (#7993) | 1 year ago
Georgi Gerganov | a04a953cab | codecov : remove (#8004) | 1 year ago
Meng, Hengyu | 623494a478 | [SYCL] refactor (#6408) | 1 year ago
jaime-m-p | 37bef89433 | tokenizer : BPE fixes (#7530) | 1 year ago
Sigbjørn Skjæret | 91c188d6c2 | Only use FIM middle token if it exists (#7648) | 1 year ago
jojorne | 84f6de17f6 | Fix no gcc pragma on Windows (#7751) | 1 year ago
Ulrich Drepper | 61665277af | Allow compiling with CUDA without CUDA runtime installed (#7989) | 1 year ago
Frank Mai | b96f9afb0d | chore: clean useless beam search param (#7985) | 1 year ago
Abheek Gulati | 1193778105 | readme : update UI list (#7943) | 1 year ago
Georgi Gerganov | 5326bcceeb | ggml : sync | 1 year ago
Georgi Gerganov | e6ecc2be47 | whisper : use ggml_backend_sched (whisper/2239) | 1 year ago