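Custom Usage
Custom Result Analysis
During a test run, the tool saves all data, including requests and responses, to a sqlite3 database. You can analyze this data after the test finishes.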
import base64
import json
import pickle
import sqlite3

result_db_path = 'db_name.db'
con = sqlite3.connect(result_db_path)
query_sql = ("SELECT request, response_messages, prompt_tokens, completion_tokens "
             "FROM result WHERE success='1'")

# The request column was written as
# base64.b64encode(pickle.dumps(benchmark_data["request"])).decode("ascii"),
# so reading it back reverses those two steps; response_messages is decoded the same way.
# pickle can execute arbitrary code, so only load databases you trust.
with con:
    rows = con.execute(query_sql).fetchall()
    for row in rows:
        request = pickle.loads(base64.b64decode(row[0]))
        responses = pickle.loads(base64.b64decode(row[1]))
        response_content = ''
        for response in responses:
            response = json.loads(response)
            if not response['choices']:
                continue
            # Some streamed chunks carry no text; treat missing or None content as ''.
            response_content += response['choices'][0]['delta'].get('content') or ''
        print('prompt: %s, tokens: %s, completion: %s, tokens: %s' %
              (request['messages'][0]['content'], row[2], response_content, row[3]))
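Beyond printing individual rows, the same table can be aggregated. The following is a minimal sketch, assuming only that a table named `result` has the `success`, `prompt_tokens`, and `completion_tokens` columns used in the query above. The `summarize_tokens` helper and the in-memory demo database are hypothetical and are not part of the tool.

```python
import sqlite3


def summarize_tokens(con: sqlite3.Connection) -> dict:
    """Return the request count and token totals for successful requests."""
    row = con.execute(
        "SELECT COUNT(*), COALESCE(SUM(prompt_tokens), 0), "
        "COALESCE(SUM(completion_tokens), 0) "
        "FROM result WHERE success='1'"
    ).fetchone()
    count, prompt_total, completion_total = row
    return {
        'requests': count,
        'prompt_tokens': prompt_total,
        'completion_tokens': completion_total,
        'avg_completion_tokens': completion_total / count if count else 0.0,
    }


# Demo on an in-memory database that imitates the assumed schema.
demo = sqlite3.connect(':memory:')
demo.execute(
    'CREATE TABLE result (success TEXT, prompt_tokens INTEGER, completion_tokens INTEGER)')
demo.executemany(
    'INSERT INTO result VALUES (?, ?, ?)',
    [('1', 10, 30), ('1', 20, 50), ('0', 5, 0)])
summary = summarize_tokens(demo)
print(summary)
```

To run it against a real result file, pass `sqlite3.connect(result_db_path)` instead of the demo connection.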
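Custom Request API
The supported API request formats are currently openai and dashscope. To add another API, subclass ApiPluginBase, annotate the class with @register_api("api_name"), and implement the following two methods:
build_request() builds the request from messages and from the model and query_template fields of param; this request is then sent to the target API.
parse_responses() returns the prompt_tokens and completion_tokens counts, which are used to calculate inference speed.
Refer to the following code: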
from abc import abstractmethod
from typing import Any, Dict, List, Tuple

from evalscope.perf.arguments import Arguments
# These two import paths are assumed to mirror the dataset plugin layout;
# check them against your installed version.
from evalscope.perf.plugin.api.base import ApiPluginBase
from evalscope.perf.plugin.registry import register_api


@register_api('custom')
class CustomPlugin(ApiPluginBase):

    def __init__(self, model_path: str) -> None:
        self.model_path = model_path

    # Interface template: in a real plugin, replace the body with your own logic.
    @abstractmethod
    def build_request(self, messages: List[Dict], param: Arguments) -> Dict:
        """Build an API request body.

        Args:
            messages (List[Dict]): The messages generated by the dataset.
            param (Arguments): The query parameters.

        Raises:
            NotImplementedError: Not implemented.

        Returns:
            Dict: The API request body.
        """
        raise NotImplementedError

    @abstractmethod
    def parse_responses(self,
                        responses: List,
                        request: Any = None,
                        **kwargs: Any) -> Tuple[int, int]:
        """Parse responses and return the number of prompt and completion tokens.

        Args:
            responses (List[bytes]): List of HTTP response bodies. Stream output
                produces multiple responses, each as bytes; otherwise there is only one.
            request (Any): The request body.

        Returns:
            Tuple: (number of prompt_tokens, number of completion_tokens).
        """
        raise NotImplementedError
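To make the two methods more concrete, here is a minimal standalone sketch of the logic they might contain. It deliberately does not import evalscope. The helper names `build_request_body` and `count_tokens_from_chunks` are hypothetical, and the sketch assumes that `query_template` can be treated as a dict of default request fields and that the target API reports an OpenAI-style `usage` object in one of its streamed chunks. Neither assumption is confirmed by the text above.

```python
import json
from typing import Dict, List, Optional, Tuple


def build_request_body(messages: List[Dict],
                       model: str,
                       query_template: Optional[Dict] = None) -> Dict:
    """Merge template defaults with the model name and messages (hypothetical helper)."""
    body = dict(query_template or {})  # copy so the template is not mutated
    body['model'] = model
    body['messages'] = messages
    return body


def count_tokens_from_chunks(responses: List[str]) -> Tuple[int, int]:
    """Return (prompt_tokens, completion_tokens) from the last chunk that reports usage."""
    for raw in reversed(responses):
        chunk = json.loads(raw)
        usage = chunk.get('usage')
        if usage:
            return usage['prompt_tokens'], usage['completion_tokens']
    # Assumption: if no chunk reports usage, fall back to zero counts.
    return 0, 0


body = build_request_body(
    [{'role': 'user', 'content': 'Hello'}],
    'my-model',
    {'temperature': 0.0, 'stream': True})
chunks = [
    '{"choices": [{"delta": {"content": "Hi"}}]}',
    '{"choices": [], "usage": {"prompt_tokens": 7, "completion_tokens": 2}}',
]
tokens = count_tokens_from_chunks(chunks)
print(body, tokens)
```

In a real plugin, the first function would become the body of build_request() and the second the body of parse_responses(). If the API does not return usage information, the counts would have to come from a tokenizer instead.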
Custom Dataset
To customize a dataset, subclass DatasetPluginBase, annotate the class with @register_dataset('dataset_name'), and implement the build_messages method so that it yields messages in the OpenAI API format.
from typing import Dict, Iterator, List

from evalscope.perf.arguments import Arguments
from evalscope.perf.plugin.datasets.base import DatasetPluginBase
from evalscope.perf.plugin.registry import register_dataset


@register_dataset('custom')
class CustomDatasetPlugin(DatasetPluginBase):
    """Read dataset and return prompt.
    """

    def __init__(self, query_parameters: Arguments):
        super().__init__(query_parameters)

    def build_messages(self) -> Iterator[List[Dict]]:
        for item in self.dataset_line_by_line(self.query_parameters.dataset_path):
            prompt = item.strip()
            if len(prompt) > self.query_parameters.min_prompt_length and len(
                    prompt) < self.query_parameters.max_prompt_length:
                yield [{'role': 'user', 'content': prompt}]
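The same filtering idea carries over to other file layouts. The sketch below is a standalone variation for a JSON Lines file, written without evalscope so it can be run on its own. The `iter_messages` function, the `question` field name, and the sample lines are all hypothetical; only the length check and the message shape follow the example above.

```python
import json
from typing import Dict, Iterable, Iterator, List


def iter_messages(lines: Iterable[str],
                  min_len: int,
                  max_len: int,
                  field: str = 'question') -> Iterator[List[Dict]]:
    """Yield one OpenAI-style message list per JSON line whose prompt length fits."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        prompt = json.loads(line)[field].strip()
        # Same strict bounds as the plugin example above.
        if min_len < len(prompt) < max_len:
            yield [{'role': 'user', 'content': prompt}]


sample_lines = [
    '{"question": "What is 2 + 2?"}',
    '',
    '{"question": "Hi"}',
]
messages = list(iter_messages(sample_lines, min_len=3, max_len=100))
print(messages)
```

Inside a plugin, the loop body would move into build_messages(), with the lines supplied by dataset_line_by_line() and the bounds taken from query_parameters.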