cturan/llama.cpp

Author	SHA1 Message	Date
Daniel Illescas Romero	c75ca5d96f llama.swiftui : use correct pointer for llama_token_eos (#4797)	2 years ago
Georgi Gerganov	96e80dabc6 examples : improve base-translate.sh script (#4783)	2 years ago
a-n-n-a-l-e-e	eec22a1c63 cmake : check for openblas64 (#4134)	2 years ago
Ikko Eltociear Ashimine	be36bb946a flake.nix : fix typo (#4700)	2 years ago
Georgi Gerganov	91d38876df metal : switch back to default.metallib (ggml/681)	2 years ago
Georgi Gerganov	d061bf9405 ggml : fix q2_k bpw in comments (ggml/680)	2 years ago
Finn Voorhees	1bf681f90e ggml : add error handling to graph_compute (whisper/1714)	2 years ago
Georgi Gerganov	c1d7cb28d3 ggml : do not sched_yield when calling BLAS (#4761)	2 years ago
Georgi Gerganov	3681f22443 examples : add few-shot translation example (#4783)	2 years ago
Daniel Bevenius	b3a7c20b5c finetune : remove unused includes (#4756)	2 years ago
Georgi Gerganov	012cf349ae server : send token probs for "stream == false" (#4714)	2 years ago
Johannes Gäßler	a91928014f Print backend name on test-backend-ops failure (#4751)	2 years ago
singularity	3c0b585561 llama.swiftui : support loading custom model from file picker (#4767)	2 years ago
Michael Coppola	e5804313a1 server : fix options in README.md (#4765)	2 years ago
Georgi Gerganov	dc891b7f7a ggml : include stdlib.h before intrin.h (#4736)	2 years ago
singularity	46cea79e1f llama.swiftui : fix build of ggml.metallib (#4754)	2 years ago
Daniel Bevenius	cb1e2818e0 train : fix typo in overlapping-samples help msg (#4758)	2 years ago
Ashraful Islam	ece9a45e8f swift : update Package.swift to use ggml as dependency (#4691)	2 years ago
Georgi Gerganov	7bed7eba35 cuda : simplify expression	2 years ago
Georgi Gerganov	d55356d3ba cuda : mark I16 and I32 ops as unsupported	2 years ago
Georgi Gerganov	75e3fd8581 sync : ggml	2 years ago
Georgi Gerganov	289313716f metal : add kernel_get_rows_i32	2 years ago
Georgi Gerganov	ab62fc3e55 scripts : fix sync order + metal sed	2 years ago
Guillaume Wenzek	5f66ebca9c ggml : extend ggml_get_rows, ggml_repeat, ggml_concat (ggml/639)	2 years ago
Justin Parker	f2eb19bd8b server : throw an error when `slot unavailable` (#4741)	2 years ago
Georgi Gerganov	f3f62f0d83 metal : optimize ggml_mul_mat_id (faster Mixtral PP) (#4725)	2 years ago
Phil H	0ef3ca2ac6 server : add token counts to html footer (#4738)	2 years ago
Georgi Gerganov	540938f890 llama : llama_model_desc print number of experts	2 years ago
Marcus Dunn	0040d42eeb llama : replace all API facing `int`'s with `int32_t` (#4577)	2 years ago
postmasters	83e633c27e llama : differentiate the KV dims in the attention (#4657)	2 years ago

Commit History
