loaditout.ai

AgentTrust

MCP Tool

chenglin1112/AgentTrust

Real-time trustworthiness evaluation and safety interception for AI agents. Semantic analysis, safe alternative suggestions, multi-step attack chain detection, and LLM-as-Judge.

Install

$ npx loaditout add chenglin1112/AgentTrust

Platform-specific configuration:

.claude/settings.json
{
  "mcpServers": {
    "AgentTrust": {
      "command": "npx",
      "args": [
        "-y",
        "AgentTrust"
      ]
    }
  }
}

Add the config above to .claude/settings.json under the mcpServers key.
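If `.claude/settings.json` already holds other settings, the entry can be merged in rather than pasted by hand. The following is a minimal sketch using only the Python standard library; the helper name `add_mcp_server` and the constant `AGENTTRUST_ENTRY` are hypothetical and are not part of loaditout or AgentTrust.

```python
import json
from pathlib import Path

# The server entry exactly as shown in the listing's configuration snippet.
AGENTTRUST_ENTRY = {"command": "npx", "args": ["-y", "AgentTrust"]}


def add_mcp_server(settings_path, name, entry):
    """Merge one MCP server entry into a settings file, keeping all other keys.

    Hypothetical helper for illustration only. It assumes the file, if it
    exists, contains a valid JSON object.
    """
    path = Path(settings_path)
    if path.exists():
        settings = json.loads(path.read_text(encoding="utf-8"))
    else:
        settings = {}
    # Create the mcpServers key if it is missing, then add or overwrite the entry.
    settings.setdefault("mcpServers", {})[name] = entry
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(settings, indent=2) + "\n", encoding="utf-8")
    return settings
```

Calling `add_mcp_server(".claude/settings.json", "AgentTrust", AGENTTRUST_ENTRY)` from the project root would leave any existing servers and unrelated settings untouched.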

About

<div align="center">

AgentTrust

Real-time trustworthiness evaluation and safety interception for AI agents.

The first framework that understands, judges, suggests, and tracks agent actions — before they execute.


42 risk patterns | 21 policy rules | 37 SafeFix rules | 7 chain detectors | 300 benchmark scenarios | 95 tests | < 1ms latency


</div>

---

Why AgentTrust

AI agents execute real-world actions: file operations, shell commands, API calls, database queries. A single misjudged action — an accidental rm -rf /, an exposed API key, or silent data exfiltration through a benign-looking HTTP call — can cause irreversible damage.
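To make the idea of checking an action before it runs concrete, here is a minimal sketch of a pre-execution check. It is not AgentTrust's API or rule set: the function `check_action` and the two patterns are hypothetical, and plain pattern matching like this cannot see semantic context.

```python
import re

# Hypothetical patterns for illustration only; not AgentTrust's actual rules.
RISKY_PATTERNS = [
    (re.compile(r"\brm\s+-rf\s+/(\s|$)"), "recursive delete of filesystem root"),
    (re.compile(r"api[_-]?key\s*=\s*\S+", re.IGNORECASE), "possible exposed API key"),
]


def check_action(command):
    """Return the reasons a shell command looks risky (empty list if none)."""
    return [reason for pattern, reason in RISKY_PATTERNS if pattern.search(command)]


def run_if_safe(command, execute):
    """Call execute(command) only when no pattern matched; otherwise block it."""
    reasons = check_action(command)
    if reasons:
        return {"allowed": False, "reasons": reasons}
    return {"allowed": True, "result": execute(command)}
```

The point of the sketch is the ordering: the verdict is produced before `execute` is ever called, so a blocked action never reaches the shell.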

Existing solutions fall short:

graph LR
    A["Post-hoc Benchmarks<br/>(AgentHarm, TrustBench)"] -.->|"Too late<br/>Damage already done"| X["GAP"]
    B["Rule-based Guardrails<br/>(Invariant, NeMo)"] -.->|"Too shallow<br/>Miss semantic context"| X
    C["Infrastructure Sandboxes<br/>(OpenShell)"] -.->|"Too low-level<br/>Don't understand intent"| X
    X ==>|"AgentTrust fills this"| D["Real-time<br/>Semantic<br/>Explainable"]
    style X fill:#ff6b6b,stroke:#c0392b,color:#fff
    style D fill:#2ecc71,stroke:#27ae60,color:#fff

AgentTrust provides **real-time, semantic-level safety

Tags

agent, ai-safety, benchmark, guardrails, llm, mcp, python, security, trustworthiness

Quality Signals

Installs: 0
Last updated: 19 days ago
Security: A
README

Safety

Risk Level: medium
Data Access: read
Network Access: none

Details

Source: github-crawl
Last commit: 3/26/2026
View on GitHub: https://github.com/chenglin1112/AgentTrust

Embed Badge

[![Loaditout](https://loaditout.ai/api/badge/chenglin1112/AgentTrust)](https://loaditout.ai/skills/chenglin1112/AgentTrust)