cturan/llama.cpp

Author	SHA1 Message	Date
Georgi Gerganov	6f0dbf6ab0 infill : assert prefix/suffix tokens + remove old space logic (#8351)	1 year ago
Kevin Wang	ffd00797d8 common : avoid unnecessary logits fetch (#8358)	1 year ago
toyer	04ce3a8b19 readme : add supported glm models (#8360)	1 year ago
compilade	3fd62a6b1c py : type-check all Python scripts with Pyright (#8341)	1 year ago
Denis Spasyuk	a8db2a9ce6 Update llama-cli documentation (#8315)	1 year ago
Alex Tuddenham	4090ea5501 ci : add checks for cmake,make and ctest in ci/run.sh (#8200)	1 year ago
Andy Tai	f1948f1e10 readme : update bindings list (#8222)	1 year ago
Brian	f7cab35ef9 gguf-hash: model wide and per tensor hashing using xxhash and sha1 (#8048)	1 year ago
toyer	905942abdb llama : support glm3 and glm4 (#8031)	1 year ago
Georgi Gerganov	b5040086d4 llama : fix n_rot default (#8348)	1 year ago
compilade	d39130a398 py : use cpu-only torch in requirements.txt (#8335)	1 year ago
standby24x7	b81ba1f96b finetune: Rename command name in README.md (#8343)	1 year ago
standby24x7	210eb9ed0a finetune: Rename an old command name in finetune.sh (#8344)	1 year ago
Bjarke Viksøe	cb4d86c4d7 server: Retrieve prompt template in /props (#8337)	1 year ago
Derrick T. Woolworth	86e7299ef5 added support for Authorization Bearer tokens when downloading model (#8307)	1 year ago
Xuan Son Nguyen	60d83a0149 update main readme (#8333)	1 year ago
Daniel Bevenius	87e25a1d1b llama : add early return for empty range (#8327)	1 year ago
jaime-m-p	213701b51a Detokenizer fixes (#8039)	1 year ago
Xuan Son Nguyen	be20e7f49d Reorganize documentation pages (#8325)	1 year ago
Georgi Gerganov	7ed03b8974 llama : fix compile warning (#8304)	1 year ago
Natsu	1d894a790e cmake : add GGML_BUILD and GGML_SHARED macro definitions (#8281)	1 year ago
Ouadie EL FAROUKI	1f3e1b66e2 Enabled more data types for oneMKL gemm_batch (#8236)	1 year ago
Georgi Gerganov	148ec970b6 convert : remove AWQ remnants (#8320)	1 year ago
Georgi Gerganov	2cccbaa008 llama : minor indentation during tensor loading (#8304)	1 year ago
Johannes Gäßler	8e558309dc CUDA: MMQ support for iq4_nl, iq4_xs (#8278)	1 year ago
Daniele	0a423800ff CUDA: revert part of the RDNA1 optimizations (#8309)	1 year ago
Douglas Hanley	d12f781074 llama : streamline embeddings from "non-embedding" models (#8087)	1 year ago
Johannes Gäßler	bcefa03bc0 CUDA: fix MMQ stream-k rounding if ne00 % 128 != 0 (#8311)	1 year ago
Pieter Ouwerkerk	5a7447c569 readme : fix minor typos [no ci] (#8314)	1 year ago
Daniel Bevenius	61ecafa390 passkey : add short intro to README.md [no-ci] (#8317)	1 year ago

