Custom Datasets

This section describes how to build custom evaluation datasets with EvalScope. It covers datasets for large language models, multimodal large models, embedding models, and CLIP models.

  • Large Language Model
    • Multiple-Choice Question Format (MCQ)
    • Question-Answering Format (QA)
    • (Optional) Custom Evaluation Using the ms-swift Framework
  • Multimodal Large Model
    • Multiple-Choice Question Format (MCQ)
    • Custom QA Question Format (VQA)
  • Embedding Model
    • Custom Text Retrieval Evaluation
  • CLIP Model
    • Custom Image-Text Retrieval Dataset
    • Convert Image-Text Retrieval Data to Text Retrieval Data

© 2022-2024, Alibaba ModelScope Built with Sphinx 8.1.3