# wrong_usages.feature

# run with ./test.sh --tags wrong_usage
@wrong_usage
Feature: Wrong usage of llama.cpp server

  #3969 The user must always set the --n-predict option
  # to cap the number of tokens any completion request can generate,
  # or pass n_predict/max_tokens in the request.
  Scenario: Infinite loop
    Given a server listening on localhost:8080
    And a model file stories260K.gguf
    # Uncomment below to fix the issue
    #And 64 server max tokens to predict
    Then the server is starting
    Given a prompt:
      """
      Go to: infinite loop
      """
    # Uncomment below to fix the issue
    #And 128 max tokens to predict
    Given concurrent completion requests
    Then all prompts are predicted
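The per-request fix the scenario hints at (`128 max tokens to predict`) corresponds to setting `n_predict` in the body of a `/completion` request to the server. A minimal sketch of such a payload, assuming the scenario's `localhost:8080` server and prompt (the payload is only constructed here, not sent):

```python
import json

# Sketch of a /completion request body for the server in the scenario above.
# Without n_predict (or max_tokens on the OpenAI-compatible endpoint), the
# server may keep generating tokens indefinitely for some prompts.
payload = {
    "prompt": "Go to: infinite loop",
    "n_predict": 128,  # cap generation, mirroring the scenario's commented-out fix
}
body = json.dumps(payload)
print(body)
```

The server-side equivalent is launching with `--n-predict 64` (the other commented-out step), which caps every request regardless of what clients send.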