# embeddings.feature
@llama.cpp
@embeddings
Feature: llama.cpp server

  Background: Server startup
    Given a server listening on localhost:8080
    And a model file bert-bge-small/ggml-model-f16.gguf from HF repo ggml-org/models
    And a model alias bert-bge-small
    And 42 as server seed
    And 2 slots
    And 1024 as batch size
    And 2048 KV cache size
    And embeddings extraction
    Then the server is starting
    Then the server is healthy

  Scenario: Embedding
    When embeddings are computed for:
      """
      What is the capital of Bulgaria ?
      """
    Then embeddings are generated

  Scenario: OAI Embeddings compatibility
    Given a model bert-bge-small
    When an OAI compatible embeddings computation request for:
      """
      What is the capital of Spain ?
      """
    Then embeddings are generated

  Scenario: OAI Embeddings compatibility with multiple inputs
    Given a model bert-bge-small
    Given a prompt:
      """
      In which country Paris is located ?
      """
    And a prompt:
      """
      Is Madrid the capital of Spain ?
      """
    When an OAI compatible embeddings computation request for multiple inputs
    Then embeddings are generated

  Scenario: Multi users embeddings
    Given a prompt:
      """
      Write a very long story about AI.
      """
    And a prompt:
      """
      Write another very long music lyrics.
      """
    And a prompt:
      """
      Write a very long poem.
      """
    And a prompt:
      """
      Write a very long joke.
      """
    Given concurrent embedding requests
    Then the server is busy
    Then the server is idle
    Then all embeddings are generated

  Scenario: Multi users OAI compatibility embeddings
    Given a prompt:
      """
      In which country Paris is located ?
      """
    And a prompt:
      """
      Is Madrid the capital of Spain ?
      """
    And a prompt:
      """
      What is the biggest US city ?
      """
    And a prompt:
      """
      What is the capital of Bulgaria ?
      """
    And a model bert-bge-small
    Given concurrent OAI embedding requests
    Then the server is busy
    Then the server is idle
    Then all embeddings are generated

  Scenario: All embeddings should be the same
    Given 10 fixed prompts
    And a model bert-bge-small
    Given concurrent OAI embedding requests
    Then all embeddings are the same