commit      author              date        subject
f5d7b268ec  Joan Fontanals      1 year ago  llama : add jina v2 base code (#7596)
2b3389677a  Georgi Gerganov     1 year ago  ggml : refactor rope norm/neox (#7634)
1442677f92  Georgi Gerganov     1 year ago  common : refactor cli arg parsing (#7675)
554c247caf  Georgi Gerganov     1 year ago  ggml : remove OpenCL (#7735)
0cd6bd3483  Georgi Gerganov     1 year ago  llama : remove beam search (#7736)
3b38d48609  jaime-m-p           1 year ago  Per token attributes (#7685)
bde7cd3cd9  Radoslav Gerganov   1 year ago  llama : offload to RPC in addition to other backends (#7640)
3d7ebf6312  0cc4m               1 year ago  Vulkan Mixture of Experts (MoE) support (#7628)
6f28a333c1  zhangkaihuo         1 year ago  llama : MiniCPM support tied embeddings (#7664)
549279d804  Georgi Gerganov     1 year ago  llama : avoid double token-to-piece cache (#7654)
9b596417af  Johannes Gäßler     1 year ago  CUDA: quantized KV support for FA vec (#7527)
5921b8f089  Georgi Gerganov     1 year ago  llama : cache llama_token_to_piece (#7587)
fb76ec31a9  Georgi Gerganov     1 year ago  ggml : fix YARN + add tests + add asserts (#7617)
02c1ecad07  jaime-m-p           1 year ago  Tokenizer WPM fixes (#7500)
5442939fcc  Giuseppe Scrivano   1 year ago  llama : support small Granite models (#7481)
ee3dff6b8e  fairydreaming       1 year ago  Add support for DeepseekV2ForCausalLM (#7519)
8b99e2aa66  Georgi Gerganov     1 year ago  llama : handle unknown utf8 bytes (#7588)
c429b33beb  Bartowski           1 year ago  llama : add Smaug 70B support (#7402)
00c6390793  Justine Tunney      1 year ago  main : don't print special tokens with --grammar (#6923)
faa0e6979a  Masaya, Kato        1 year ago  ggml: aarch64: SVE kernels for q8_0_q8_0, q4_0_q8_0 vector dot (#7433)
fbca2f27fc  fairydreaming       1 year ago  Add support for ArcticForCausalLM (#7020)
007489e895  Tristan Druyen      1 year ago  Fix phi3 chat template confusion with zephyr (#7449)
3015851c5a  Daniel Bevenius     1 year ago  llama : add getters for n_threads/n_threads_batch (#7464)
55ac3b7aea  Georgi Gerganov     1 year ago  ci : use Pythia models instead of OpenLlama (#7470)
9b82476ee9  fairydreaming       1 year ago  Add missing inference support for GPTNeoXForCausalLM (Pythia and GPT-NeoX base models) (#7461)
a61a94e543  Georgi Gerganov     1 year ago  llama : rename n_ctx -> cache.size, less confusing (#0)
e84b71c2c6  Georgi Gerganov     1 year ago  ggml : drop support for QK_K=64 (#7473)
b18532a4ef  slaren              1 year ago  phi3 : duplicate rope factors in each layer (#7447)
03d8900ebe  Justine Tunney      1 year ago  llama : add missing model type names (#7445)
201cc11afa  liuwei-git          1 year ago  llama : add phi3 128K model support (#7225)