issues.feature 855 B

# List of ongoing issues
@bug
Feature: Issues
  # Issue #5655
  Scenario: Multi users embeddings
    Given a server listening on localhost:8080
    And   a model file stories260K.gguf
    And   a model alias tinyllama-2
    And   42 as server seed
    And   64 KV cache size
    And   2 slots
    And   continuous batching
    And   embeddings extraction
    Then  the server is starting
    Then  the server is healthy
    Given a prompt:
      """
      Write a very long story about AI.
      """
    And a prompt:
      """
      Write another very long music lyrics.
      """
    And a prompt:
      """
      Write a very long poem.
      """
    And a prompt:
      """
      Write a very long joke.
      """
    Given concurrent embedding requests
    Then  the server is busy
    Then  the server is idle
    Then  all embeddings are generated