Georgi Gerganov a19b5cef16 llama : fix FA when KV cache is not used (i.e. embeddings) (#12825) 9 months ago
..
test_basic.py a86ad841f1 server : add flag to disable the web-ui (#10762) (#10751) 1 year ago
test_chat_completion.py 1a24c4621f `server`: fix deadly typo in response_format.json_schema.schema handling (#12168) 10 months ago
test_completion.py cf8cc856d7 server : Fixed wrong function name in llamacpp server unit test (#11473) 11 months ago
test_ctx_shift.py 45abe0f74e server : replace behave with pytest (#10416) 1 year ago
test_embedding.py a19b5cef16 llama : fix FA when KV cache is not used (i.e. embeddings) (#12825) 9 months ago
test_infill.py e6e7c75d94 server : fix extra BOS in infill endpoint (#11106) 1 year ago
test_lora.py 0da5d86026 server : allow using LoRA adapters per-request (#10994) 1 year ago
test_rerank.py 63ac128563 server : add TEI API format for /rerank endpoint (#11942) 11 months ago
test_security.py 45abe0f74e server : replace behave with pytest (#10416) 1 year ago
test_slot_save.py 45abe0f74e server : replace behave with pytest (#10416) 1 year ago
test_speculative.py 0da5d86026 server : allow using LoRA adapters per-request (#10994) 1 year ago
test_tokenize.py 45abe0f74e server : replace behave with pytest (#10416) 1 year ago
test_tool_call.py be421fc429 `tool-call`: ensure there's always a non-empty tool call id (#12292) 10 months ago