소스 검색

server : add parse_special option to /tokenize endpoint (#14783)

IsaacDynamo 6 달 전
부모
커밋
b4efd77f8a
2개의 변경된 파일4개의 추가작업 그리고 1개의 파일을 삭제
  1. 2 0
      tools/server/README.md
  2. 2 1
      tools/server/server.cpp

+ 2 - 0
tools/server/README.md

@@ -575,6 +575,8 @@ These words will not be included in the completion, so make sure to add them to
 
 `add_special`: (Optional) Boolean indicating if special tokens, i.e. `BOS`, should be inserted.  Default: `false`
 
+`parse_special`: (Optional) Boolean indicating if special tokens should be tokenized. When `false` special tokens are treated as plaintext.  Default: `true`
+
 `with_pieces`: (Optional) Boolean indicating whether to return token pieces along with IDs.  Default: `false`
 
 **Response:**

+ 2 - 1
tools/server/server.cpp

@@ -4516,9 +4516,10 @@ int main(int argc, char ** argv) {
         json tokens_response = json::array();
         if (body.count("content") != 0) {
             const bool add_special = json_value(body, "add_special", false);
+            const bool parse_special = json_value(body, "parse_special", true);
             const bool with_pieces = json_value(body, "with_pieces", false);
 
-            llama_tokens tokens = tokenize_mixed(ctx_server.vocab, body.at("content"), add_special, true);
+            llama_tokens tokens = tokenize_mixed(ctx_server.vocab, body.at("content"), add_special, parse_special);
 
             if (with_pieces) {
                 for (const auto& token : tokens) {