Commit History

| Author | SHA1 | Message | Date |
|---|---|---|---|
| Georgi Gerganov | 18b35625c3 | ggml : fix bug in LBFGS optimizer (found by ggml tests) | 2 years ago |
| l3utterfly | ba4e85a833 | llama : use aligned memory during ggml_init call from loading saved sessions (#1934) | 2 years ago |
| Georgi Gerganov | 23fc5c219a | cmake : fix trailing whitespaces | 2 years ago |
| Kawrakow | cb40dfca69 | llama : only use Q6_K for output weights if tensor size is multiple of 256 (#1932) | 2 years ago |
| Kawrakow | ca7c3f4da5 | cuda : faster k-quants on older GPUs (#1930) | 2 years ago |
| Georgi Gerganov | b97ca431db | ggml : sync latest ggml repo (#1924) | 2 years ago |
| Howard Su | 1e3abfcef0 | cmake : fix build shared ggml when CUDA is enabled (#1929) | 2 years ago |
| Johannes Gäßler | 16b9cd1939 | Convert vector to f16 for dequantize mul mat vec (#1913) | 2 years ago |
| Johannes Gäßler | b24c3049d9 | Added tokens per second to info prints (#1928) | 2 years ago |
| Johannes Gäßler | 0ede372a51 | Fixed incorrectly applying RMS norm twice (#1925) | 2 years ago |
| l3utterfly | 8596af4277 | ggml : fix bug in ggml_compute_forward_add_q_f32 (#1918) | 2 years ago |
| Mike | e1886cf4fe | readme : update Android build instructions (#1922) | 2 years ago |
| Kawrakow | 8ab8ba62eb | llama : prevent usage of k-quants when tensor size is not a multiple of 256 (#1921) | 2 years ago |
| Kawrakow | 90cc59d6ab | examples : fix examples/metal (#1920) | 2 years ago |
| Georgi Gerganov | ce2c7d72e2 | metal : handle buffers larger than device's maxBufferLength (#1826) | 2 years ago |
| Howard Su | 57cd69460f | cmake : add CUDA_ARCHITECTURES to new target ggml_static (#1917) | 2 years ago |
| Georgi Gerganov | b2416493ab | make : do not print help for simple example | 2 years ago |
| Georgi Gerganov | 4f9c43e3bd | minor : warning fixes | 2 years ago |
| Johannes Gäßler | 2c9380dd2f | Only one CUDA stream per device for async compute (#1898) | 2 years ago |
| Georgi Gerganov | 051e1b0e6a | llama : fix kv_cache `n` init (close #1903) | 2 years ago |
| DaniAndTheWeb | 86c7571864 | make : update for latest Arch (#1701) | 2 years ago |
| Howard Su | 3d59ec5935 | ggml : fix warnings under MSVC (#1908) | 2 years ago |
| Aaron Miller | 0711a5f6dc | metal : add norm, cpy f16->f16, alibi kernels (#1823) | 2 years ago |
| Faez Shakil | fc45a81bc6 | exposed modules so that they can be invoked by nix run github:ggerganov/llama.cpp#server etc (#1863) | 2 years ago |
| Randall Fitzgerald | 794db3e7b9 | Server Example Refactor and Improvements (#1570) | 2 years ago |
| Jiří Podivín | 5ddf7ea1fb | hooks : setting up flake8 and pre-commit hooks (#1681) | 2 years ago |
| Gustavo Rocha Dias | bac19927c3 | readme : alternative way to build for Android with CLBlast. (#1828) | 2 years ago |
| Kerfuffle | b4c6f46f17 | Allow cmake to build ggml as a library (#1896) | 2 years ago |
| David Yang | 92f20d9942 | train : get raw text instead of page with html (#1905) | 2 years ago |
| 0cc4m | d411968e99 | opencl : support k-quants (#1836) | 2 years ago |