Latest commit: 2347e45e7b by Georgi Gerganov, "llama : do a warm-up eval at start for better timings (#1824)", 2 years ago
| Name | Commit | Last commit message | Age |
|------|--------|---------------------|-----|
| baby-llama | f954edda93 | ggml : implement backward pass for llama + small training-llama-from-scratch example (#1360) | 2 years ago |
| benchmark | ec2e10c444 | llama : add llama_init_backend() API (close #1527) | 2 years ago |
| embedding | ec2e10c444 | llama : add llama_init_backend() API (close #1527) | 2 years ago |
| jeopardy | 5fba3c016b | examples : add Jeopardy example (#1168) | 2 years ago |
| main | 2347e45e7b | llama : do a warm-up eval at start for better timings (#1824) | 2 years ago |
| metal | ecb217db4f | llama : Metal inference (#1642) | 2 years ago |
| perplexity | ec2e10c444 | llama : add llama_init_backend() API (close #1527) | 2 years ago |
| quantize | 74d4cfa343 | Allow "quantizing" to f16 and f32 (#1787) | 2 years ago |
| quantize-stats | 99009e72f8 | ggml : add SOTA 2,3,4,5,6 bit k-quantizations (#1684) | 2 years ago |
| save-load-state | dc271c52ed | Remove unused n_parts parameter (#1509) | 2 years ago |
| server | 17366df842 | Multi GPU support, CUDA refactor, CUDA scratch buffer (#1703) | 2 years ago |
| CMakeLists.txt | ecb217db4f | llama : Metal inference (#1642) | 2 years ago |
| Miku.sh | a8a2efdc81 | examples : various prompt and example fixes (#1298) | 2 years ago |
| alpaca.sh | e9a9cb0c54 | examples : Improve Alpaca Default Repeat Penalty: Better Match Alpaca.cpp Experience (#1107) | 2 years ago |
| chat-13B.bat | d9ad104440 | Create chat-13B.bat (#592) | 2 years ago |
| chat-13B.sh | 6daa09d879 | examples : read chat prompts from a template file (#1196) | 2 years ago |
| chat-persistent.sh | 1359b6aba5 | chat-persistent.sh : use bracket expressions in grep (#1564) | 2 years ago |
| chat.sh | 79b2b266db | If n_predict == -1, generate forever | 2 years ago |
| common.cpp | fa84c4b3e8 | Fix issue where interactive mode crashes when input exceeds ctx size (#1789) | 2 years ago |
| common.h | fa84c4b3e8 | Fix issue where interactive mode crashes when input exceeds ctx size (#1789) | 2 years ago |
| gpt4all.sh | 107980d970 | examples : add -n to alpaca and gpt4all scripts (#706) | 2 years ago |
| reason-act.sh | a6956b25a1 | add example of re-act pattern (#583) | 2 years ago |