AGENT Benchmarks

Below is the list of supported AGENT benchmarks. Click on a benchmark name for details.

| Benchmark Name | Pretty Name | Task Categories |
| --- | --- | --- |
| bfcl_v3 | BFCL-v3 | Agent, FunctionCalling |
| bfcl_v4 | BFCL-v4 | Agent, FunctionCalling |
| general_fc | General-FunctionCalling | Agent, Custom, FunctionCalling |
| tau2_bench | τ²-bench | Agent, FunctionCalling, Reasoning |
| tau_bench | τ-bench | Agent, FunctionCalling, Reasoning |
| tool_bench | ToolBench-Static | FunctionCalling, Reasoning |
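
To pick benchmarks by task category programmatically, the listing above can be transcribed into a small lookup structure. The sketch below is only an illustration: the `AGENT_BENCHMARKS` dictionary and the `benchmarks_with_category` helper are hypothetical names introduced here, not part of any evaluation framework's API, and the data is copied by hand from the listing.

```python
# Hypothetical lookup table, transcribed by hand from the benchmark listing above.
AGENT_BENCHMARKS = {
    "bfcl_v3": {"pretty_name": "BFCL-v3", "categories": ["Agent", "FunctionCalling"]},
    "bfcl_v4": {"pretty_name": "BFCL-v4", "categories": ["Agent", "FunctionCalling"]},
    "general_fc": {
        "pretty_name": "General-FunctionCalling",
        "categories": ["Agent", "Custom", "FunctionCalling"],
    },
    "tau2_bench": {
        "pretty_name": "τ²-bench",
        "categories": ["Agent", "FunctionCalling", "Reasoning"],
    },
    "tau_bench": {
        "pretty_name": "τ-bench",
        "categories": ["Agent", "FunctionCalling", "Reasoning"],
    },
    "tool_bench": {
        "pretty_name": "ToolBench-Static",
        "categories": ["FunctionCalling", "Reasoning"],
    },
}


def benchmarks_with_category(category):
    """Return benchmark names tagged with `category`, in listing order."""
    return [
        name
        for name, info in AGENT_BENCHMARKS.items()
        if category in info["categories"]
    ]


if __name__ == "__main__":
    # List every benchmark that carries the "Reasoning" category.
    for name in benchmarks_with_category("Reasoning"):
        print(name, "->", AGENT_BENCHMARKS[name]["pretty_name"])
```

Every benchmark in the listing carries the FunctionCalling category, so in practice the Agent, Custom, and Reasoning categories are the ones that distinguish entries.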