Commit History

| Author | SHA1 | Message | Date |
|---|---|---|---|
| Georgi Gerganov | a73ccf1aa3 | llama : replace (permute + reshape + view_1d) with (view_3d) (#2538) | 2 years ago |
| drbh | 7cf54e1f74 | tests : adds simple llama grammar tests (#2618) | 2 years ago |
| Shouzheng Liu | a872a2b28e | ggml-alloc : fix discrepency between measure&eval (#2639) | 2 years ago |
| Kolen Cheung | 0919a0f73d | cmake : install ggml-meta.metal if LLAMA_METAL (#2449) | 2 years ago |
| Jhen-Jie Hong | ed53db86c3 | metal : print error of load pipeline state (#2564) | 2 years ago |
| Shouzheng Liu | fc8ef549e5 | metal : enable ggml-alloc (#2627) | 2 years ago |
| Shouzheng Liu | bf83bff674 | metal : matrix-matrix multiplication kernel (#2615) | 2 years ago |
| Georgi Gerganov | b5ffb2849d | scripts : add helper script to get wikitext | 2 years ago |
| Jhen-Jie Hong | 3ebb00935f | server : add missing /json-schema-to-grammar.mjs (#2616) | 2 years ago |
| Jhen-Jie Hong | d783f7982e | metal : return null instead of exit(1) (#2573) | 2 years ago |
| Cheng Shao | d75561df20 | server : add --numa support (#2524) | 2 years ago |
| Kamil Tomšík | 348acf188c | llama : add missing enum keyword in function signatures (#2610) | 2 years ago |
| Johannes Gäßler | 1cd06fa25e | CUDA: launch_bounds, small q4_K, q5_K mmq refactor (#2596) | 2 years ago |
| Jhen-Jie Hong | 2feb8934eb | server : fix default grammar by use empty string in the UI (#2604) | 2 years ago |
| Jhen-Jie Hong | 5517d6e692 | server : implement json-schema-to-grammar.mjs & add grammar param in the UI (#2588) | 2 years ago |
| vxiiduu | f31b539714 | Enhance Windows 7 and below compatibility. (#2592) | 2 years ago |
| drbh | ee77efea2a | test : add simple grammar parsing tests (#2594) | 2 years ago |
| Johannes Gäßler | f64d44a9b9 | CUDA: Fixed OpenLLaMA 3b mmq, reduced compile time (#2590) | 2 years ago |
| byte-6174 | b19edd54d5 | Adding support for llama2.c models (#2559) | 2 years ago |
| Equim | 53dc399472 | server: fixed wrong variable name in timing json (#2579) | 2 years ago |
| DannyDaemonic | 9ca4abed89 | Handle `ENABLE_VIRTUAL_TERMINAL_PROCESSING` more gracefully on earlier versions of Windows. | 2 years ago |
| Christian Demsar | e59fcb2bc1 | Add --n-predict -2 for stopping generation on full context (#2565) | 2 years ago |
| Martin Krasser | 1638757767 | Fix grammar-based sampling issue in server (#2566) | 2 years ago |
| Sam Spilsbury | 916a9acdd0 | ggml-alloc: Don't try to re-use buffers of external tensors (#2562) | 2 years ago |
| grahameth | ea04a4ca19 | add log_callback to llama_context_params for custom logging. (#2234) | 2 years ago |
| Johannes Gäßler | 25d43e0eb5 | CUDA: tuned mul_mat_q kernels (#2546) | 2 years ago |
| Martin Krasser | f5bfea0580 | Allow passing grammar to completion endpoint (#2532) | 2 years ago |
| Johannes Gäßler | acfc5478ff | CUDA: tighter VRAM scratch size for 65b/70b (#2551) | 2 years ago |
| chaihahaha | 7ed8d1fe7f | llm.vim : multiline autocompletion, get rid of "^@" (#2543) | 2 years ago |
| Georgi Gerganov | e7f94d6fdc | vim : bring back simple llm.vim example | 2 years ago |