DrivelologyNarrativeWriting#

Overview#

Drivelology Narrative Writing evaluates models’ ability to generate detailed descriptions illustrating the implicit narrative of “drivelology” text - linguistic utterances that are syntactically coherent yet pragmatically paradoxical, emotionally loaded, or rhetorically subversive.

Task Description#

  • Task Type: Narrative Generation and Evaluation

  • Input: Drivelology text sample

  • Output: Generated narrative description explaining implicit meaning

  • Domain: Linguistic analysis, narrative generation

Key Features#

  • Tests narrative explanation generation ability

  • Requires understanding of layered linguistic meanings

  • LLM-as-judge evaluation against reference narratives

  • Likert scale scoring (1-5) for match quality

  • Tests depth of linguistic and cultural understanding

Evaluation Notes#

  • Default configuration uses 0-shot evaluation

  • Uses LLM-as-judge for evaluation

  • Metrics: Average Likert score (1-5 scale)

  • Evaluates relevance, accuracy, depth, and detail of generated narratives

Properties#

Property

Value

Benchmark Name

drivel_writing

Dataset ID

extraordinarylab/drivel-hub

Paper

N/A

Tags

Knowledge, Reasoning

Metrics

bert_score, gpt_score

Default Shots

0-shot

Evaluation Split

test

Data Statistics#

Metric

Value

Total Samples

600

Prompt Length (Mean)

313.18 chars

Prompt Length (Min/Max)

256 / 717 chars

Sample Example#

Subset: narrative-writing-english

{
  "input": [
    {
      "id": "f47953a9",
      "content": [
        {
          "text": "You need to first read and understand the text given. Generate a detailed description to illustrate the implicit narrative of the text.\n\nPlease provide your response in English, with a clear and comprehensive explanation of the narrative.\n\nText: 後天的努力比什麼都重要,所以今天和明天休息。"
        }
      ]
    }
  ],
  "target": "This creates a paradoxical tone, as it acknowledges the value of diligence but simultaneously advocates for procrastination. The underlying message could reflect a lighthearted take on balancing work and rest or even poking fun at the tendency to delay responsibilities.",
  "id": 0,
  "group_id": 0,
  "metadata": {
    "text": "後天的努力比什麼都重要,所以今天和明天休息。",
    "reference_narrative": "This creates a paradoxical tone, as it acknowledges the value of diligence but simultaneously advocates for procrastination. The underlying message could reflect a lighthearted take on balancing work and rest or even poking fun at the tendency to delay responsibilities."
  }
}

Prompt Template#

Prompt Template:

You need to first read and understand the text given. Generate a detailed description to illustrate the implicit narrative of the text.

Please provide your response in English, with a clear and comprehensive explanation of the narrative.

Text: {text}

Usage#

Using CLI#

evalscope eval \
    --model YOUR_MODEL \
    --api-url OPENAI_API_COMPAT_URL \
    --api-key EMPTY_TOKEN \
    --datasets drivel_writing \
    --limit 10  # Remove this line for formal evaluation

Using Python#

from evalscope import run_task
from evalscope.config import TaskConfig

task_cfg = TaskConfig(
    model='YOUR_MODEL',
    api_url='OPENAI_API_COMPAT_URL',
    api_key='EMPTY_TOKEN',
    datasets=['drivel_writing'],
    limit=10,  # Remove this line for formal evaluation
)

run_task(task_cfg=task_cfg)