cturan/llama.cpp

Author	SHA1 Message	Date
Evan Jones	943e6081cc examples : add persistent chat (#1495)	2 years ago
Jason McCartney	7694b52b9a main : make reverse prompt option act as a stop token in non-interactive mode (#1032)	2 years ago
David Kennedy	79e3efb0e9 readme : adds WizardLM to the list of supported models (#1485)	2 years ago
Georgi Gerganov	4b7e245adf minor : fix compile warnings	2 years ago
Erik Scholz	5ea4339273 make kv_f16 the default for api users (#1517)	2 years ago
DannyDaemonic	ee9654138a Fixes #1511 lambda issue for w64devkit (mingw) (#1513)	2 years ago
Stephan Walter	dc271c52ed Remove unused n_parts parameter (#1509)	2 years ago
rankaiyx	c238b5873a benchmark-matmul: Print the average of the test results (#1490)	2 years ago
Tom Jobbins	2b2646931b convert.py: Support models which are stored in a single pytorch_model.bin (#1469)	2 years ago
Ilya Kurdyukov	42627421ec ~7% faster Q5_1 AVX2 code (#1477)	2 years ago
András Salamon	9560655409 define default model path once, sync path with readme (#1366)	2 years ago
sandyiscool	2a5ee023ad Add alternate include path for openblas (#1476)	2 years ago
zrm	63d20469b8 fix get_num_physical_cores() (#1436)	2 years ago
slaren	b5c9295eef benchmark-matmul: fix clang-tidy issues, report results in GFLOPS (#1458)	2 years ago
Johannes Gäßler	eb363627fd cuda : deduplicated dequantization code (#1453)	2 years ago
xaedes	79b2d5b69d ggml : alternative fix for race condition bug in non-inplace ggml_compute_forward_diag_mask_f32 (#1454)	2 years ago
Georgi Gerganov	13c351ad72 ggml : various fixes (#1450)	2 years ago
katsu560	60f8c361ca ggml : add AVX support based on AVX2 code (#1430)	2 years ago
Georgi Gerganov	601a033475 ggml : add GGML_QNT_VERSION to track quantization format changes	2 years ago
Georgi Gerganov	08737ef720 cuda : fix convert function (#1412)	2 years ago
Georgi Gerganov	bda4d7c215 make : fix PERF build with cuBLAS	2 years ago
Georgi Gerganov	5a5aeb1e91 llama : fix unused warning	2 years ago
Georgi Gerganov	66841fdb0e ggml : multi-thread mul and diag_mask ops (#1428)	2 years ago
Johannes Gäßler	905d87b70a ggml : GPU-accelerated token generation (#1412)	2 years ago
xaedes	f954edda93 ggml : implement backward pass for llama + small training-llama-from-scratch example (#1360)	2 years ago
Georgi Gerganov	f048af0230 ggml : sync alibi fix from ggml repo	2 years ago
3ooabkhxtn	ac0cd259d5 Adding SSE instructions to ggml_vec_dot_q4_0_q8_0 (#1413)	2 years ago
Georgi Gerganov	0cd22e190a llama : fix various warnings	2 years ago
Rinne	6456a4eb9f embedding : remove unused code (#1426)	2 years ago
Georgi Gerganov	cdd5350892 readme : update Q4_0 perplexities	2 years ago
