mcp-tool-selection-benchmark-v2

MCP Tool

Tsubaki414/mcp-tool-selection-benchmark-v2

A benchmark that measures agent tool-selection failure rates across 1,817 real MCP tools. It tests Claude Sonnet 4 and GPT-4o, and provides a diagnosable failure taxonomy and targeted fixes.

Install

$ npx loaditout add Tsubaki414/mcp-tool-selection-benchmark-v2

Platform-specific configuration:

.claude/settings.json

{
  "mcpServers": {
    "mcp-tool-selection-benchmark-v2": {
      "command": "npx",
      "args": [
        "-y",
        "mcp-tool-selection-benchmark-v2"
      ]
    }
  }
}

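Add the config above to .claude/settings.json. If the file already has an mcpServers key, add the "mcp-tool-selection-benchmark-v2" entry inside it rather than creating a second mcpServers key.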
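If .claude/settings.json already lists other servers, the new entry has to be merged in without overwriting them. The following is a minimal sketch of that merge. It assumes the file is plain JSON, and the helper names `add_mcp_server` and `update_settings_file` are hypothetical, not part of loaditout or of this package. The server name, command, and args are copied from the config above.

```python
import json
from pathlib import Path

# Server entry copied from the listing's config block.
SERVER_NAME = "mcp-tool-selection-benchmark-v2"
SERVER_ENTRY = {"command": "npx", "args": ["-y", "mcp-tool-selection-benchmark-v2"]}


def add_mcp_server(settings: dict, name: str, entry: dict) -> dict:
    """Return a copy of `settings` with `entry` registered under mcpServers[name].

    Hypothetical helper for illustration; existing servers and unrelated
    top-level keys are preserved, and the input dict is not mutated.
    """
    merged = dict(settings)
    servers = dict(merged.get("mcpServers", {}))
    servers[name] = entry
    merged["mcpServers"] = servers
    return merged


def update_settings_file(path: Path) -> None:
    """Read the settings file if it exists, merge the entry, and write it back."""
    current = json.loads(path.read_text()) if path.exists() else {}
    updated = add_mcp_server(current, SERVER_NAME, SERVER_ENTRY)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(updated, indent=2) + "\n")
```

Calling `update_settings_file(Path(".claude/settings.json"))` from the project root would apply the merge to the file named above. The merge logic is kept in a pure function so it can be checked without touching the filesystem.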

Quality Signals

Last updated: 30 days ago

Security: B

Safety

Risk Level: medium

Data Access: read

Network Access: none

Details

Source: github-crawl

Last commit: 3/16/2026

Embed Badge

[![Loaditout](https://loaditout.ai/api/badge/Tsubaki414/mcp-tool-selection-benchmark-v2)](https://loaditout.ai/skills/Tsubaki414/mcp-tool-selection-benchmark-v2)