TORGO#

Overview#

TORGO is a database of dysarthric speech used to evaluate automatic speech recognition (ASR) systems on speakers with motor speech disorders. It contains aligned acoustic and articulatory data from speakers with cerebral palsy (CP) or amyotrophic lateral sclerosis (ALS).

Task Description#

  • Task Type: Dysarthric Speech Recognition

  • Input: Audio recordings from speakers with speech disorders

  • Output: Transcribed text

  • Focus: Accessibility and inclusive ASR evaluation

Key Features#

  • Specialized dataset for dysarthric speech

  • Speakers with cerebral palsy (CP) or ALS

  • Intelligibility-based subsets (mild, moderate, severe)

  • 3D articulatory feature alignment

  • Important for accessibility research

Evaluation Notes#

  • Default configuration uses test split

  • Subsets by intelligibility: mild, moderate, severe

  • Metrics: CER (Character Error Rate), WER (Word Error Rate), SemScore

  • Requires jiwer package for CER/WER metrics

  • Requires jellyfish package for SemScore metric

  • Supports batch scoring for efficiency
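To make the CER and WER metrics concrete, here is a minimal pure-Python sketch of how both are conventionally defined: the edit distance between reference and hypothesis, divided by the reference length (in words for WER, in characters for CER). It is an illustration only, not the benchmark's scoring code, which relies on jiwer. The `normalize` helper is hypothetical; whether and how the benchmark normalizes case or whitespace before scoring is not stated here.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or word lists)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def normalize(text):
    """Hypothetical normalization (an assumption): upper-case, collapse whitespace."""
    return " ".join(text.upper().split())


def wer(reference, hypothesis):
    """Word Error Rate: word-level edit distance over reference word count."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character Error Rate: character-level edit distance over reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)
```

Because many TORGO targets are single short words (such as "FEE"), one wrong word gives a WER of 1.0 for that sample, while CER still distinguishes a near miss from a complete miss.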

Properties#

| Property | Value |
| --- | --- |
| Benchmark Name | torgo |
| Dataset ID | extraordinarylab/torgo |
| Paper | N/A |
| Tags | Audio, SpeechRecognition |
| Metrics | cer, wer, sem_score |
| Default Shots | 0-shot |
| Evaluation Split | test |

Data Statistics#

| Metric | Value |
| --- | --- |
| Total Samples | 5,553 |
| Prompt Length (Mean) | 67 chars |
| Prompt Length (Min/Max) | 67 / 67 chars |

Per-Subset Statistics:

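| Subset | Samples | Prompt Mean (chars) | Prompt Min (chars) | Prompt Max (chars) |
| --- | --- | --- | --- | --- |
| mild | 1,479 | 67 | 67 | 67 |
| moderate | 1,666 | 67 | 67 | 67 |
| severe | 2,408 | 67 | 67 | 67 |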
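The subset sizes are uneven, which matters when reading a single overall score. The short sketch below uses only the sample counts listed above to check that they add up to the total and to show each subset's share of the test split.

```python
# Sample counts copied from the per-subset statistics above.
subset_counts = {"mild": 1479, "moderate": 1666, "severe": 2408}

total = sum(subset_counts.values())
shares = {name: count / total for name, count in subset_counts.items()}

print(f"total: {total}")
for name, share in shares.items():
    print(f"{name}: {share:.1%}")
```

Since the severe subset is the largest, an overall score averaged over all samples leans toward the hardest speakers, so reporting per-subset results alongside it is usually more informative.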

Audio Statistics:

| Metric | Value |
| --- | --- |
| Total Audio Files | 5,553 |
| Audio per Sample | min: 1, max: 1, mean: 1 |
| Formats | wav |

Sample Example#

Subset: mild

{
  "input": [
    {
      "id": "1220f252",
      "content": [
        {
          "text": "Please recognize the speech and only output the recognized content:"
        },
        {
          "audio": "[BASE64_AUDIO: wav, ~89.1KB]",
          "format": "wav"
        }
      ]
    }
  ],
  "target": "FEE",
  "id": 0,
  "group_id": 0,
  "subset_key": "mild",
  "metadata": {
    "transcript": "FEE",
    "intelligibility": "mild",
    "duration": 2.8499999046325684
  }
}
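A record shaped like the example above can be unpacked with a few lines of standard-library Python. The sketch below is illustrative rather than part of the benchmark: `split_content` and `wav_duration_seconds` are hypothetical helper names, and it assumes the `audio` field holds plain base64-encoded WAV bytes (the example only shows a placeholder, so the real field may be encoded differently, for instance as a data URI).

```python
import base64
import io
import wave


def split_content(record):
    """Return (prompt_text, audio_payload, audio_format) from the first input message."""
    prompt, audio, fmt = None, None, None
    for part in record["input"][0]["content"]:
        if "text" in part:
            prompt = part["text"]
        elif "audio" in part:
            audio, fmt = part["audio"], part.get("format")
    return prompt, audio, fmt


def wav_duration_seconds(audio_b64):
    """Decode base64 WAV bytes (an assumption about the field) and return seconds."""
    raw = base64.b64decode(audio_b64)
    with wave.open(io.BytesIO(raw), "rb") as wav:
        return wav.getnframes() / wav.getframerate()
```

The computed duration can be compared against `metadata["duration"]` as a quick sanity check that the audio payload decoded correctly.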

Prompt Template#

Prompt Template:

Please recognize the speech and only output the recognized content:

Usage#

Using CLI#

evalscope eval \
    --model YOUR_MODEL \
    --api-url OPENAI_API_COMPAT_URL \
    --api-key EMPTY_TOKEN \
    --datasets torgo \
    --limit 10  # Remove this line for formal evaluation

Using Python#

from evalscope import run_task
from evalscope.config import TaskConfig

task_cfg = TaskConfig(
    model='YOUR_MODEL',
    api_url='OPENAI_API_COMPAT_URL',
    api_key='EMPTY_TOKEN',
    datasets=['torgo'],
    dataset_args={
        'torgo': {
            # 'subset_list': ['mild', 'moderate', 'severe'],  # optional: evaluate specific subsets
        }
    },
    limit=10,  # Remove this line for formal evaluation
)

run_task(task_cfg=task_cfg)