Denis Spasyuk
|
a8db2a9ce6
Update llama-cli documentation (#8315)
|
1 year ago |
Alex Tuddenham
|
4090ea5501
ci : add checks for cmake,make and ctest in ci/run.sh (#8200)
|
1 year ago |
Andy Tai
|
f1948f1e10
readme : update bindings list (#8222)
|
1 year ago |
Brian
|
f7cab35ef9
gguf-hash: model wide and per tensor hashing using xxhash and sha1 (#8048)
|
1 year ago |
toyer
|
905942abdb
llama : support glm3 and glm4 (#8031)
|
1 year ago |
Georgi Gerganov
|
b5040086d4
llama : fix n_rot default (#8348)
|
1 year ago |
compilade
|
d39130a398
py : use cpu-only torch in requirements.txt (#8335)
|
1 year ago |
standby24x7
|
b81ba1f96b
finetune: Rename command name in README.md (#8343)
|
1 year ago |
standby24x7
|
210eb9ed0a
finetune: Rename an old command name in finetune.sh (#8344)
|
1 year ago |
Bjarke Viksøe
|
cb4d86c4d7
server: Retrieve prompt template in /props (#8337)
|
1 year ago |
Derrick T. Woolworth
|
86e7299ef5
added support for Authorization Bearer tokens when downloading model (#8307)
|
1 year ago |
Xuan Son Nguyen
|
60d83a0149
update main readme (#8333)
|
1 year ago |
Daniel Bevenius
|
87e25a1d1b
llama : add early return for empty range (#8327)
|
1 year ago |
jaime-m-p
|
213701b51a
Detokenizer fixes (#8039)
|
1 year ago |
Xuan Son Nguyen
|
be20e7f49d
Reorganize documentation pages (#8325)
|
1 year ago |
Georgi Gerganov
|
7ed03b8974
llama : fix compile warning (#8304)
|
1 year ago |
Natsu
|
1d894a790e
cmake : add GGML_BUILD and GGML_SHARED macro definitions (#8281)
|
1 year ago |
Ouadie EL FAROUKI
|
1f3e1b66e2
Enabled more data types for oneMKL gemm_batch (#8236)
|
1 year ago |
Georgi Gerganov
|
148ec970b6
convert : remove AWQ remnants (#8320)
|
1 year ago |
Georgi Gerganov
|
2cccbaa008
llama : minor indentation during tensor loading (#8304)
|
1 year ago |
Johannes Gäßler
|
8e558309dc
CUDA: MMQ support for iq4_nl, iq4_xs (#8278)
|
1 year ago |
Daniele
|
0a423800ff
CUDA: revert part of the RDNA1 optimizations (#8309)
|
1 year ago |
Douglas Hanley
|
d12f781074
llama : streamline embeddings from "non-embedding" models (#8087)
|
1 year ago |
Johannes Gäßler
|
bcefa03bc0
CUDA: fix MMQ stream-k rounding if ne00 % 128 != 0 (#8311)
|
1 year ago |
Pieter Ouwerkerk
|
5a7447c569
readme : fix minor typos [no ci] (#8314)
|
1 year ago |
Daniel Bevenius
|
61ecafa390
passkey : add short intro to README.md [no-ci] (#8317)
|
1 year ago |
Georgi Gerganov
|
aa5898dc53
llama : prefer n_ over num_ prefix (#8308)
|
1 year ago |
Georgi Gerganov
|
6c05752c50
contributing : update guidelines (#8316)
|
1 year ago |
luoyu-intel
|
a9554e20b6
[SYCL] Fix WARP_SIZE=16 bug of Intel GPU (#8266)
|
1 year ago |
Georgi Gerganov
|
e235b267a2
py : switch to snake_case (#8305)
|
1 year ago |