CCBench#

Overview#

CCBench (Chinese Culture Bench) is an extension of MMBench that evaluates how well multimodal models understand traditional Chinese culture. Each item pairs an image with a multiple-choice question about an aspect of Chinese cultural heritage.

Task Description#

  • Task Type: Visual Multiple-Choice Q&A (Chinese Culture)

  • Input: Image with question about Chinese culture

  • Output: Single correct answer letter (A, B, C, or D)

  • Language: Primarily Chinese content

Key Features#

  • Questions about Chinese traditional culture

  • Categories: Calligraphy, Painting, Cultural Relics, Food & Clothes, Historical Figures, Scenery & Building, Sketch Reasoning, Traditional Shows

  • Tests cultural knowledge combined with visual understanding

  • Extension of the MMBench evaluation framework

Evaluation Notes#

  • Default configuration uses 0-shot evaluation

  • Uses Chain-of-Thought (CoT) prompting

  • Evaluates on the test split

  • Scored with a simple accuracy metric (acc)

  • Requires both visual perception and cultural knowledge
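The scoring described above can be illustrated with a minimal sketch. It assumes responses end with the `ANSWER: [LETTER]` line requested by the prompt; `extract_answer` and `accuracy` are hypothetical helpers written for this example, not the parser or metric code that EvalScope itself uses.

```python
import re

# Hypothetical pattern for the 'ANSWER: [LETTER]' line requested by the prompt.
ANSWER_RE = re.compile(r"ANSWER:\s*([A-D])")


def extract_answer(response):
    """Return the letter of the last 'ANSWER: X' occurrence, or None."""
    matches = ANSWER_RE.findall(response)
    return matches[-1] if matches else None


def accuracy(responses, targets):
    """Fraction of responses whose extracted letter equals the target."""
    if not targets:
        return 0.0
    correct = sum(extract_answer(r) == t for r, t in zip(responses, targets))
    return correct / len(targets)


# One correct and one incorrect response give an accuracy of 0.5.
score = accuracy(["The roof is round.\nANSWER: A", "ANSWER: B"], ["A", "C"])
```

Taking the last match rather than the first means letters mentioned during the step-by-step reasoning are not mistaken for the final answer.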

Properties#

| Property | Value |
|---|---|
| Benchmark Name | cc_bench |
| Dataset ID | lmms-lab/MMBench |
| Paper | N/A |
| Tags | Knowledge, MCQ, MultiModal |
| Metrics | acc |
| Default Shots | 0-shot |
| Evaluation Split | test |

Data Statistics#

| Metric | Value |
|---|---|
| Total Samples | 2,040 |
| Prompt Length (Mean) | 270.1 chars |
| Prompt Length (Min/Max) | 254 / 394 chars |

Image Statistics:

| Metric | Value |
|---|---|
| Total Images | 2,040 |
| Images per Sample | min: 1, max: 1, mean: 1 |
| Resolution Range | 119x118 - 512x512 |
| Formats | jpeg |

Sample Example#

Subset: cc

```json
{
  "input": [
    {
      "id": "2797f551",
      "content": [
        {
          "text": "Answer the following multiple choice question. The last line of your response should be of the following format: 'ANSWER: [LETTER]' (without quotes) where [LETTER] is one of A,B,C,D. Think step by step before answering.\n\n图中所示建筑名称为?\n\nA) 天坛\nB) 故宫\nC) 黄鹤楼\nD) 少林寺"
        },
        {
          "image": "[BASE64_IMAGE: jpeg, ~22.7KB]"
        }
      ]
    }
  ],
  "choices": [
    "天坛",
    "故宫",
    "黄鹤楼",
    "少林寺"
  ],
  "target": "A",
  "id": 0,
  "group_id": 0,
  "metadata": {
    "index": 0,
    "category": "scenery_building",
    "source": "https://zh.wikipedia.org/wiki/%E5%A4%A9%E5%9D%9B"
  }
}
```
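Because each record carries a `metadata.category` field, results can also be broken down by category. The sketch below is an illustration only: `per_category_accuracy` is a hypothetical helper, the records are trimmed to the two fields it reads, and only the `scenery_building` label comes from the sample above (the other label is a placeholder).

```python
from collections import defaultdict


def per_category_accuracy(records, predictions):
    """Group records by metadata.category and compute accuracy per group."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for record, predicted in zip(records, predictions):
        category = record["metadata"]["category"]
        totals[category] += 1
        hits[category] += int(predicted == record["target"])
    return {category: hits[category] / totals[category] for category in totals}


# Trimmed records; 'hypothetical_category' is a placeholder label.
records = [
    {"target": "A", "metadata": {"category": "scenery_building"}},
    {"target": "B", "metadata": {"category": "scenery_building"}},
    {"target": "C", "metadata": {"category": "hypothetical_category"}},
]
predictions = ["A", "D", "C"]  # made-up model outputs for illustration

breakdown = per_category_accuracy(records, predictions)
```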

Prompt Template#


Answer the following multiple choice question. The last line of your response should be of the following format: 'ANSWER: [LETTER]' (without quotes) where [LETTER] is one of {letters}. Think step by step before answering.

{question}

{choices}
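As a rough illustration of how the placeholders are filled, the sketch below reproduces the prompt text shown in the sample record. The `build_prompt` function is a hypothetical helper written for this example, and the `A,B,C,D` and `A) ...` formatting is inferred from that sample rather than taken from the EvalScope source.

```python
TEMPLATE = (
    "Answer the following multiple choice question. The last line of your "
    "response should be of the following format: 'ANSWER: [LETTER]' (without "
    "quotes) where [LETTER] is one of {letters}. Think step by step before "
    "answering.\n\n{question}\n\n{choices}"
)


def build_prompt(question, choices):
    """Fill the template, labelling the choices A, B, C, ... in order."""
    letters = [chr(ord("A") + i) for i in range(len(choices))]
    return TEMPLATE.format(
        letters=",".join(letters),
        question=question,
        choices="\n".join(
            f"{letter}) {choice}" for letter, choice in zip(letters, choices)
        ),
    )


prompt = build_prompt("图中所示建筑名称为?", ["天坛", "故宫", "黄鹤楼", "少林寺"])
```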

Usage#

Using CLI#

```shell
evalscope eval \
    --model YOUR_MODEL \
    --api-url OPENAI_API_COMPAT_URL \
    --api-key EMPTY_TOKEN \
    --datasets cc_bench \
    --limit 10  # Remove this line for formal evaluation
```

Using Python#

```python
from evalscope import run_task
from evalscope.config import TaskConfig

task_cfg = TaskConfig(
    model='YOUR_MODEL',
    api_url='OPENAI_API_COMPAT_URL',
    api_key='EMPTY_TOKEN',
    datasets=['cc_bench'],
    limit=10,  # Remove this line for formal evaluation
)

run_task(task_cfg=task_cfg)
```