CMMMU#

Overview#

CMMMU (Chinese Massive Multi-discipline Multimodal Understanding) consists of manually collected multimodal questions in Chinese, drawn from college exams, quizzes, and textbooks and covering six core disciplines. It is the Chinese counterpart to MMMU.

Task Description#

Task Type: Chinese Multimodal Question Answering
Input: Image(s) + question in Chinese with answer choices
Output: Correct answer choice
Language: Chinese
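
Because the expected output is an answer choice, a free-form model response usually has to be reduced to a single option letter before it can be compared with the target. The following is a minimal sketch of one way to do that; the function name `extract_choice` and the regular expression are illustrative assumptions, not the parsing logic used by the evaluation framework, and the sketch handles only single-letter answers.

```python
import re


def extract_choice(response, valid="ABCD"):
    """Return the first standalone option letter in `response`, or None.

    Hypothetical helper: a letter counts as standalone when it is not
    directly adjacent to another ASCII letter, so the "B" in "RGB" is skipped.
    """
    pattern = rf"(?<![A-Za-z])([{valid}])(?![A-Za-z])"
    match = re.search(pattern, response)
    return match.group(1) if match else None


print(extract_choice("答案是B"))
print(extract_choice("(C) 先将色彩模式转为RGB"))
print(extract_choice("无法确定"))
```

The match is case-sensitive on purpose, which keeps a lowercase English article such as "a" from being read as option A. An uppercase "A" at the start of an English sentence would still be picked up, so this remains a heuristic.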

Key Features#

30 subjects across 6 core disciplines: Art & Design, Business, Science, Health & Medicine, Humanities & Social Science, Tech & Engineering
39 heterogeneous image types (charts, diagrams, maps, tables, etc.)
College-level difficulty
Multiple question types (multiple-choice, true/false, short answer)

Evaluation Notes#

Default configuration uses 0-shot evaluation
Evaluates on validation split
Simple accuracy metric
Chinese language prompts used
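
The simple accuracy metric can be read as the fraction of samples whose predicted answer matches the target. The sketch below illustrates that reading; `simple_accuracy` is a hypothetical helper, and the whitespace and case normalization is an assumption rather than documented framework behavior.

```python
def simple_accuracy(predictions, targets):
    """Fraction of predictions that match their target after normalization."""
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    if not targets:
        return 0.0
    correct = sum(
        pred.strip().upper() == tgt.strip().upper()
        for pred, tgt in zip(predictions, targets)
    )
    return correct / len(targets)


# Hypothetical model outputs compared against gold labels such as "B".
score = simple_accuracy(["B", "a ", "C"], ["B", "A", "D"])
print(f"acc = {score:.4f}")
```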

Properties#

Property	Value
Benchmark Name	`cmmmu`
Dataset ID	lmms-lab/CMMMU
Paper	N/A
Tags	`Chinese`, `Knowledge`, `MultiModal`, `QA`
Metrics	`acc`
Default Shots	0-shot
Evaluation Split	`val`

Data Statistics#

Metric	Value
Total Samples	900
Prompt Length (Mean)	185.59 chars
Prompt Length (Min/Max)	91 / 1045 chars

Per-Subset Statistics:

Subset	Samples	Prompt Mean	Prompt Min	Prompt Max
`设计`	18	216.39	136	347
`音乐`	21	149.57	110	185
`艺术`	16	166.44	143	240
`艺术理论`	33	171.21	118	268
`经济`	20	227.3	105	412
`会计`	39	197.05	103	528
`金融`	29	166	106	333
`管理`	22	236.73	135	423
`营销`	16	191.62	131	370
`物理`	35	266.11	146	497
`地理`	49	160.02	93	326
`化学`	42	187.4	103	322
`生物`	35	162.49	91	293
`数学`	43	188.23	101	370
`临床医学`	28	183.39	94	310
`公共卫生`	46	183.04	110	271
`基础医学`	32	146.56	92	219
`诊断学与实验室医学`	12	150.08	116	189
`制药`	35	181.77	97	413
`历史`	25	185.84	111	264
`心理学`	29	144.83	96	207
`文献学`	7	163.86	109	227
`社会学`	24	165	100	249
`计算机科学`	35	218.89	101	532
`电子学`	29	179.48	110	318
`机械工程`	40	196.1	98	817
`能源和电力`	32	186.44	100	330
`材料`	38	194.5	102	476
`建筑学`	49	175.39	97	415
`农业`	21	215.86	96	1045

Image Statistics:

Metric	Value
Total Images	1,023
Images per Sample	min: 1, max: 5, mean: 1.14
Resolution Range	112x38 - 1500x3000
Formats	jpeg, png
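
The aggregate figures can be cross-checked against the per-subset rows. The sketch below copies the sample counts and mean prompt lengths from the per-subset table, then recomputes the total sample count, the sample-weighted mean prompt length, and the mean number of images per sample. Variable names are illustrative.

```python
# (subset, samples, mean prompt length in chars), copied from the per-subset table.
SUBSETS = [
    ("设计", 18, 216.39), ("音乐", 21, 149.57), ("艺术", 16, 166.44),
    ("艺术理论", 33, 171.21), ("经济", 20, 227.3), ("会计", 39, 197.05),
    ("金融", 29, 166), ("管理", 22, 236.73), ("营销", 16, 191.62),
    ("物理", 35, 266.11), ("地理", 49, 160.02), ("化学", 42, 187.4),
    ("生物", 35, 162.49), ("数学", 43, 188.23), ("临床医学", 28, 183.39),
    ("公共卫生", 46, 183.04), ("基础医学", 32, 146.56),
    ("诊断学与实验室医学", 12, 150.08), ("制药", 35, 181.77),
    ("历史", 25, 185.84), ("心理学", 29, 144.83), ("文献学", 7, 163.86),
    ("社会学", 24, 165), ("计算机科学", 35, 218.89), ("电子学", 29, 179.48),
    ("机械工程", 40, 196.1), ("能源和电力", 32, 186.44), ("材料", 38, 194.5),
    ("建筑学", 49, 175.39), ("农业", 21, 215.86),
]
TOTAL_IMAGES = 1023  # from the image statistics table

total_samples = sum(n for _, n, _ in SUBSETS)
# Weight each subset mean by its sample count to recover the overall mean.
weighted_mean = sum(n * mean for _, n, mean in SUBSETS) / total_samples
images_per_sample = TOTAL_IMAGES / total_samples

print(len(SUBSETS), total_samples)   # 30 900
print(round(weighted_mean, 2))       # 185.59
print(round(images_per_sample, 2))   # 1.14
```

The recomputed values agree with the reported totals: 30 subsets, 900 samples, a mean prompt length of 185.59 characters, and 1.14 images per sample.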

Sample Example#

Subset: 设计

{
  "input": [
    {
      "id": "47c0e169",
      "content": [
        {
          "text": "请回答以下多项选择题，并选出正确选项。这些题目可能包括单选和多选题型。如果所提供的信息不足以确定一个明确的答案，那么请根据可用的数据和你的判断来选择最可能正确的选项。\n\n问题："
        },
        {
          "image": "[BASE64_IMAGE: png, ~17.4KB]"
        },
        {
          "text": "为一幅灰度图，要为它局部添加颜色以得到右图所示的效果，正确的操作步骤是（ ）。\n选项：\n(A) 先将色彩模式转为RGB，然后用工具箱中的 【画笔工具】上色\n(B) 先将色彩模式转为RGB，制作局部选区，然后打开【色相/饱和度】对话框，在其中点中【着色】项,调节色彩属性参数\n(C) 先将色彩模式转为RGB，制作局部选区，然后打开【可选颜色】对话框,调节参数\n(D) 打开【色相/饱和度】对话框，直接调节色彩属性参数\n\n正确答案：\n"
        }
      ]
    }
  ],
  "target": "B",
  "id": 0,
  "group_id": 0,
  "subset_key": "设计",
  "metadata": {
    "id": "1900",
    "type": "选择",
    "source_type": "website",
    "analysis": null,
    "distribution": "本科",
    "difficulty_level": "easy",
    "subcategory": "设计",
    "category": "艺术与设计",
    "subfield": "['图像编辑', '色彩调整']",
    "img_type": "['屏幕截图']",
    "answer": "B",
    "option1": "先将色彩模式转为RGB，然后用工具箱中的 【画笔工具】上色",
    "option2": "先将色彩模式转为RGB，制作局部选区，然后打开【色相/饱和度】对话框，在其中点中【着色】项,调节色彩属性参数",
    "option3": "先将色彩模式转为RGB，制作局部选区，然后打开【可选颜色】对话框,调节参数",
    "option4": "打开【色相/饱和度】对话框，直接调节色彩属性参数"
  }
}
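
In this sample, the `content` list interleaves text parts and image parts, so the image sits between the instruction and the question body. A minimal sketch of walking that structure is shown below; `flatten_content` and the `<image N>` placeholder are illustrative assumptions, and the sample literal is an abbreviated copy of the record above.

```python
def flatten_content(sample):
    """Join text parts in order, replacing each image with a placeholder.

    Returns the flattened prompt string and the number of images seen.
    """
    parts = []
    n_images = 0
    for message in sample["input"]:
        for item in message["content"]:
            if "text" in item:
                parts.append(item["text"])
            elif "image" in item:
                n_images += 1
                parts.append(f"<image {n_images}>")
    return "".join(parts), n_images


# Abbreviated version of the sample record; the text fields are shortened.
sample = {
    "input": [
        {
            "id": "47c0e169",
            "content": [
                {"text": "问题："},
                {"image": "[BASE64_IMAGE: png, ~17.4KB]"},
                {"text": "为一幅灰度图\n\n正确答案：\n"},
            ],
        }
    ],
    "target": "B",
}

prompt, n_images = flatten_content(sample)
print(n_images)
print(prompt)
```

A flattened view like this is mainly useful for inspection or for measuring prompt length. An actual multimodal request would keep the image parts as separate entries.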

Prompt Template#

No prompt template defined.

Usage#

Using CLI#

evalscope eval \
    --model YOUR_MODEL \
    --api-url OPENAI_API_COMPAT_URL \
    --api-key EMPTY_TOKEN \
    --datasets cmmmu \
    --limit 10  # Remove this line for formal evaluation

Using Python#

from evalscope import run_task
from evalscope.config import TaskConfig

task_cfg = TaskConfig(
    model='YOUR_MODEL',
    api_url='OPENAI_API_COMPAT_URL',
    api_key='EMPTY_TOKEN',
    datasets=['cmmmu'],
    dataset_args={
        'cmmmu': {
            # 'subset_list': ['设计', '音乐', '艺术'],  # optional: evaluate only specific subsets
        }
    },
    limit=10,  # Remove this line for formal evaluation
)

run_task(task_cfg=task_cfg)
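
To restrict a run to a few subsets, the `subset_list` key from the commented line in the example above can be filled in. The sketch below builds the keyword arguments as a plain dictionary so that it runs without evalscope installed; `cmmmu_task_kwargs` is a hypothetical helper, and it assumes that `TaskConfig` accepts exactly the keyword arguments shown in the example above.

```python
def cmmmu_task_kwargs(model, api_url, api_key, subsets=None, limit=None):
    """Build keyword arguments for a CMMMU run (hypothetical helper)."""
    dataset_args = {'cmmmu': {}}
    if subsets:
        # Subset keys are the Chinese names listed in the per-subset table.
        dataset_args['cmmmu']['subset_list'] = list(subsets)
    kwargs = {
        'model': model,
        'api_url': api_url,
        'api_key': api_key,
        'datasets': ['cmmmu'],
        'dataset_args': dataset_args,
    }
    if limit is not None:
        kwargs['limit'] = limit  # omit for a formal evaluation
    return kwargs


kwargs = cmmmu_task_kwargs(
    'YOUR_MODEL',
    'OPENAI_API_COMPAT_URL',
    'EMPTY_TOKEN',
    subsets=['设计', '音乐'],
    limit=10,
)
print(kwargs['dataset_args'])

# With evalscope installed, the dictionary could then be passed on as:
#   from evalscope import run_task
#   from evalscope.config import TaskConfig
#   run_task(task_cfg=TaskConfig(**kwargs))
```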