CMMMU#
Overview#
CMMMU (Chinese Massive Multi-discipline Multimodal Understanding) consists of manually collected multimodal questions, written in Chinese, drawn from college exams, quizzes, and textbooks across six core disciplines. It is the Chinese counterpart to MMMU.
Task Description#
Task Type: Chinese Multimodal Question Answering
Input: Image(s) + question in Chinese with answer choices
Output: Correct answer choice
Language: Chinese
Key Features#
30 subjects across 6 core disciplines: Art & Design, Business, Science, Health & Medicine, Humanities & Social Science, Tech & Engineering
39 heterogeneous image types (charts, diagrams, maps, tables, etc.)
College-level difficulty
Multiple question types (multiple-choice, true/false, short answer)
Evaluation Notes#
Default configuration uses 0-shot evaluation
Evaluates on validation split
Simple accuracy metric
Chinese language prompts used
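As a rough illustration of the simple accuracy metric noted above, the sketch below scores responses by exact match between an extracted option letter and the target. The helpers `extract_choice` and `accuracy` are hypothetical names used only for this example and are not part of EvalScope; the answer extraction the benchmark actually applies may differ, and this sketch handles single-letter answers only.

```python
import re
from typing import List, Optional


def extract_choice(response: str) -> Optional[str]:
    """Return the first standalone option letter (A-D) in a response, if any."""
    # Assumption: the answer appears as an isolated capital letter, e.g. "(B)".
    match = re.search(r"(?<![A-Za-z])([A-D])(?![A-Za-z])", response)
    return match.group(1) if match else None


def accuracy(responses: List[str], targets: List[str]) -> float:
    """Fraction of responses whose extracted letter equals the target letter."""
    if not targets:
        return 0.0
    correct = sum(extract_choice(r) == t for r, t in zip(responses, targets))
    return correct / len(targets)


# Illustrative responses (made up for this sketch, not real model output).
responses = ["B", "正确答案:(B)", "我认为是 C", "无法确定"]
targets = ["B", "B", "A", "D"]
print(accuracy(responses, targets))  # 2 of 4 match -> 0.5
```

A response with no recognizable letter is counted as incorrect rather than skipped, which keeps the denominator equal to the number of samples.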
Properties#
| Property | Value |
|---|---|
| Benchmark Name | |
| Dataset ID | |
| Paper | N/A |
| Tags | |
| Metrics | |
| Default Shots | 0-shot |
| Evaluation Split | |
Data Statistics#
| Metric | Value |
|---|---|
| Total Samples | 900 |
| Prompt Length (Mean) | 185.59 chars |
| Prompt Length (Min/Max) | 91 / 1045 chars |
Per-Subset Statistics:
| Subset | Samples | Prompt Mean (chars) | Prompt Min (chars) | Prompt Max (chars) |
|---|---|---|---|---|
| | 18 | 216.39 | 136 | 347 |
| | 21 | 149.57 | 110 | 185 |
| | 16 | 166.44 | 143 | 240 |
| | 33 | 171.21 | 118 | 268 |
| | 20 | 227.3 | 105 | 412 |
| | 39 | 197.05 | 103 | 528 |
| | 29 | 166 | 106 | 333 |
| | 22 | 236.73 | 135 | 423 |
| | 16 | 191.62 | 131 | 370 |
| | 35 | 266.11 | 146 | 497 |
| | 49 | 160.02 | 93 | 326 |
| | 42 | 187.4 | 103 | 322 |
| | 35 | 162.49 | 91 | 293 |
| | 43 | 188.23 | 101 | 370 |
| | 28 | 183.39 | 94 | 310 |
| | 46 | 183.04 | 110 | 271 |
| | 32 | 146.56 | 92 | 219 |
| | 12 | 150.08 | 116 | 189 |
| | 35 | 181.77 | 97 | 413 |
| | 25 | 185.84 | 111 | 264 |
| | 29 | 144.83 | 96 | 207 |
| | 7 | 163.86 | 109 | 227 |
| | 24 | 165 | 100 | 249 |
| | 35 | 218.89 | 101 | 532 |
| | 29 | 179.48 | 110 | 318 |
| | 40 | 196.1 | 98 | 817 |
| | 32 | 186.44 | 100 | 330 |
| | 38 | 194.5 | 102 | 476 |
| | 49 | 175.39 | 97 | 415 |
| | 21 | 215.86 | 96 | 1045 |
Image Statistics:
| Metric | Value |
|---|---|
| Total Images | 1,023 |
| Images per Sample | min: 1, max: 5, mean: 1.14 |
| Resolution Range | 112x38 - 1500x3000 |
| Formats | jpeg, png |
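The figures in this section can be cross-checked against one another. The sketch below is a minimal consistency check, assuming only the numbers listed above: it recomputes the total sample count, the sample-weighted mean prompt length, the global min/max, and the mean number of images per sample. The variable names are illustrative and the rows are left unlabeled, since only the per-subset values are used.

```python
# One (samples, prompt_mean, prompt_min, prompt_max) tuple per subset,
# in the same order as the per-subset statistics listed above.
rows = [
    (18, 216.39, 136, 347), (21, 149.57, 110, 185), (16, 166.44, 143, 240),
    (33, 171.21, 118, 268), (20, 227.3, 105, 412), (39, 197.05, 103, 528),
    (29, 166, 106, 333), (22, 236.73, 135, 423), (16, 191.62, 131, 370),
    (35, 266.11, 146, 497), (49, 160.02, 93, 326), (42, 187.4, 103, 322),
    (35, 162.49, 91, 293), (43, 188.23, 101, 370), (28, 183.39, 94, 310),
    (46, 183.04, 110, 271), (32, 146.56, 92, 219), (12, 150.08, 116, 189),
    (35, 181.77, 97, 413), (25, 185.84, 111, 264), (29, 144.83, 96, 207),
    (7, 163.86, 109, 227), (24, 165, 100, 249), (35, 218.89, 101, 532),
    (29, 179.48, 110, 318), (40, 196.1, 98, 817), (32, 186.44, 100, 330),
    (38, 194.5, 102, 476), (49, 175.39, 97, 415), (21, 215.86, 96, 1045),
]

total_samples = sum(n for n, _, _, _ in rows)
# Weight each subset mean by its sample count to recover the overall mean.
weighted_mean = sum(n * mean for n, mean, _, _ in rows) / total_samples
global_min = min(lo for _, _, lo, _ in rows)
global_max = max(hi for _, _, _, hi in rows)

total_images = 1023
images_per_sample = total_images / total_samples

print(len(rows), total_samples)        # 30 900
print(round(weighted_mean, 2))         # 185.59
print(global_min, global_max)          # 91 1045
print(round(images_per_sample, 2))     # 1.14
```

The recomputed values agree with the reported totals, so the per-subset rows account for all 900 samples across 30 subsets.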
Sample Example#
Subset: 设计 (Design)
```json
{
  "input": [
    {
      "id": "47c0e169",
      "content": [
        {
          "text": "请回答以下多项选择题,并选出正确选项。这些题目可能包括单选和多选题型。如果所提供的信息不足以确定一个明确的答案,那么请根据可用的数据和你的判断来选择最可能正确的选项。\n\n问题:"
        },
        {
          "image": "[BASE64_IMAGE: png, ~17.4KB]"
        },
        {
          "text": "为一幅灰度图,要为它局部添加颜色以得到右图所示的效果,正确的操作步骤是( )。\n选项:\n(A) 先将色彩模式转为RGB,然后用工具箱中的 【画笔工具】上色\n(B) 先将色彩模式转为RGB,制作局部选区,然后打开【色相/饱和度】对话框,在其中点中【着色】项,调节色彩属性参数\n(C) 先将色彩模式转为RGB,制作局部选区,然后打开【可选颜色】对话框,调节参数\n(D) 打开【色相/饱和度】对话框,直接调节色彩属性参数\n\n正确答案:\n"
        }
      ]
    }
  ],
  "target": "B",
  "id": 0,
  "group_id": 0,
  "subset_key": "设计",
  "metadata": {
    "id": "1900",
    "type": "选择",
    "source_type": "website",
    "analysis": null,
    "distribution": "本科",
    "difficulty_level": "easy",
    "subcategory": "设计",
    "category": "艺术与设计",
    "subfield": "['图像编辑', '色彩调整']",
    "img_type": "['屏幕截图']",
    "answer": "B",
    "option1": "先将色彩模式转为RGB,然后用工具箱中的 【画笔工具】上色",
    "option2": "先将色彩模式转为RGB,制作局部选区,然后打开【色相/饱和度】对话框,在其中点中【着色】项,调节色彩属性参数",
    "option3": "先将色彩模式转为RGB,制作局部选区,然后打开【可选颜色】对话框,调节参数",
    "option4": "打开【色相/饱和度】对话框,直接调节色彩属性参数"
  }
}
```
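In the sample above, the text that follows the image contains the question, a lettered option list, and an answer cue, while the raw options also appear as `option1` through `option4` in `metadata`. The sketch below shows one way such a text segment could be assembled from those fields. The function names `format_options` and `build_question_text` are hypothetical, and the layout is inferred from this single sample rather than taken from the EvalScope implementation.

```python
from string import ascii_uppercase
from typing import Dict, Optional


def format_options(metadata: Dict[str, Optional[str]]) -> str:
    """Render option1, option2, ... as '(A) ...', '(B) ...' lines."""
    lines = []
    index = 1
    while True:
        value = metadata.get(f"option{index}")
        if value is None:  # stop at the first missing or null option
            break
        lines.append(f"({ascii_uppercase[index - 1]}) {value}")
        index += 1
    return "\n".join(lines)


def build_question_text(question: str, metadata: Dict[str, Optional[str]]) -> str:
    """Join question, options, and the answer cue in the sample's layout."""
    return f"{question}\n选项:\n{format_options(metadata)}\n\n正确答案:\n"


# Question and options copied from the sample record above.
question = "为一幅灰度图,要为它局部添加颜色以得到右图所示的效果,正确的操作步骤是( )。"
metadata = {
    "option1": "先将色彩模式转为RGB,然后用工具箱中的 【画笔工具】上色",
    "option2": "先将色彩模式转为RGB,制作局部选区,然后打开【色相/饱和度】对话框,在其中点中【着色】项,调节色彩属性参数",
    "option3": "先将色彩模式转为RGB,制作局部选区,然后打开【可选颜色】对话框,调节参数",
    "option4": "打开【色相/饱和度】对话框,直接调节色彩属性参数",
}
print(build_question_text(question, metadata))
```

Mapping `option1` to `(A)`, `option2` to `(B)`, and so on is consistent with the sample, where the target `B` corresponds to `option2`.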
Prompt Template#
No prompt template defined.
Usage#
Using CLI#
```bash
evalscope eval \
    --model YOUR_MODEL \
    --api-url OPENAI_API_COMPAT_URL \
    --api-key EMPTY_TOKEN \
    --datasets cmmmu \
    --limit 10  # Remove this line for formal evaluation
```
Using Python#
```python
from evalscope import run_task
from evalscope.config import TaskConfig

task_cfg = TaskConfig(
    model='YOUR_MODEL',
    api_url='OPENAI_API_COMPAT_URL',
    api_key='EMPTY_TOKEN',
    datasets=['cmmmu'],
    dataset_args={
        'cmmmu': {
            # Optional: evaluate specific subsets only (Design, Music, Art)
            # 'subset_list': ['设计', '音乐', '艺术'],
        }
    },
    limit=10,  # Remove this line for formal evaluation
)

run_task(task_cfg=task_cfg)
```