自定义使用#

自定义结果分析#

该工具在测试期间会将所有数据保存到 sqlite3 数据库中,包括请求和响应。您可以在测试后分析测试数据。

import sqlite3
import base64
import pickle
import json
result_db_path = 'db_name.db'
con = sqlite3.connect(result_db_path)
query_sql = "SELECT request, response_messages, prompt_tokens, completion_tokens \
                FROM result WHERE success='1'"
# how to save base64.b64encode(pickle.dumps(benchmark_data["request"])).decode("ascii"), 
with con:
    rows = con.execute(query_sql).fetchall()
    if len(rows) > 0:
        for row in rows:
            request = row[0]
            responses = row[1]
            request = base64.b64decode(request)
            request = pickle.loads(request)
            responses = base64.b64decode(responses)
            responses = pickle.loads(responses)
            response_content = ''
            for response in responses:
                response = json.loads(response)
                if not response['choices']:
                   continue
                response_content += response['choices'][0]['delta']['content']
            print('prompt: %s, tokens: %s, completion: %s, tokens: %s' %
                  (request['messages'][0]['content'], row[2], response_content,
                   row[3]))

自定义请求 API#

目前支持的 API 请求格式有 openai、dashscope。要扩展 API,您可以继承 ApiPluginBase 类,并使用 @register_api("api名称") 进行注解,需实现如下两个方法:

  • build_request()方法通过 messagesparam中的modelquery_template 来构建请求,改请求后续发送到目标API。

  • parse_responses() 方法将返回 prompt_tokenscompletion_tokens 的数量,用于计算推理速度。

参考如下代码:

from typing import Any, Dict, List, Tuple
from evalscope.perf.arguments import Arguments

@register_api('custom')
class CustomPlugin(ApiPluginBase):
    def __init__(self, model_path: str) -> None:
        self.model_path = model_path
        
    @abstractmethod
    def build_request(self, messages: List[Dict], param: Arguments) -> Dict:
        """Build a api request body.

        Args:
            messages (List[Dict]): The messages generated by dataset.
            param (QueryParameters): The query parameters.

        Raises:
            NotImplementedError: Not implemented.

        Returns:
            Dict: The api request body.
        """
        raise NotImplementedError
    
    @abstractmethod
    def parse_responses(self, 
                        responses: List, 
                        request: Any=None,
                        **kwargs:Any) -> Tuple[int, int]:
        """Parser responses and return number of request and response tokens.

        Args:
            responses (List[bytes]): List of http response body, for stream output,
                there are multiple responses, each is bytes, for general only one. 
            request (Any): The request body.

        Returns:
            Tuple: (Number of prompt_tokens and number of completion_tokens).
        """
        raise NotImplementedError  

自定义数据集#

要自定义数据集,您可以继承 DatasetPluginBase 类,并使用 @register_dataset('数据集名称') 进行注解,然后实现 build_messages 方法以返回一个message,格式参考OpenAI API

from typing import Dict, Iterator, List

from evalscope.perf.arguments import Arguments
from evalscope.perf.plugin.datasets.base import DatasetPluginBase
from evalscope.perf.plugin.registry import register_dataset


@register_dataset('custom')
class CustomDatasetPlugin(DatasetPluginBase):
    """Read dataset and return prompt.
    """

    def __init__(self, query_parameters: Arguments):
        super().__init__(query_parameters)

    def build_messages(self) -> Iterator[List[Dict]]:
        for item in self.dataset_line_by_line(self.query_parameters.dataset_path):
            prompt = item.strip()
            if len(prompt) > self.query_parameters.min_prompt_length and len(
                    prompt) < self.query_parameters.max_prompt_length:
                yield [{'role': 'user', 'content': prompt}]