Open-source runtime security for AI agents

Defend AI agents against tool-use traps.

TrapDefense helps teams control risky tool calls, block unsafe actions, redact sensitive data, and keep audit trails for MCP, internal assistants, and tool-using agent workflows.

Python-first MCP-ready Shadow -> Warn -> Enforce Audit-friendly
runtime policy checkpoint shadow mode active
from asr import Guard
from asr.mcp import mcp_guard

guard = Guard(
    mode="shadow",
    domain_allowlist=["api.internal.com"],
    block_egress=True,
    pii_action="block",
    capability_policy={"shell_exec": "block"},
)

# protect MCP tool handlers before damage happens
@server.tool()
@mcp_guard(guard, capabilities=["network_send"])
async def send_email(to, subject, body):
    ...
guard
Control risky actions before execution

Block external posts, restrict file paths, and gate sensitive capabilities.

audit
Keep evidence your security team can use

Record structured JSONL events for scans, tool decisions, and redaction.

data protection
Redact PII in args and outputs

Protect sensitive data even when you are still rolling out policies in shadow mode.

deployment
Start simple, grow safely

Use the open-source SDK now, then add enterprise workflows when your team needs them.

focus
Tool-use runtime control
integration
MCP + Python agent workflows
rollout
Shadow, warn, enforce
evidence
Structured audit logs
Why TrapDefense

The model is not the only risk.

Prompt filters can catch suspicious text, but they do not stop the final action. TrapDefense focuses on the last line of defense: runtime control at tool execution time, plus audit evidence for what happened and why.

the real problem

Damage happens when the agent actually uses a tool.

Hidden instructions in content are dangerous, but the real business risk appears when an agent sends email, calls an external API, reads the wrong file, or leaks sensitive data through a connected workflow.

Prompt filters are not enough Scanning content helps, but it does not prevent the final outbound action.
Security teams need evidence Enforcement without logs creates blind spots. Logs without enforcement arrive too late.
Production rollouts need safety rails Start in shadow mode, learn from real traffic, and enforce when policies are ready.
research direction

Built for practical agent-trap defense.

TrapDefense is inspired by the emerging security problem of AI agent traps and intentionally focuses on practical runtime defenses for content-injection and behavioral-control risks.

Control outbound requests and risky destinations.
Restrict file access and sensitive capabilities.
Redact sensitive data in tool args and outputs.
Keep structured evidence for security review.
Core capabilities

Practical controls for real agent workflows.

TrapDefense stays focused on the controls teams actually need first: policy enforcement, sensitive data protection, gradual rollout, and auditability across AI agent execution paths.

01

Guard risky tool use

Control outbound requests, file access, sensitive capabilities, and unknown tools before they execute.

02

Redact sensitive data

Detect and redact PII in tool arguments and outputs so agents do not leak secrets by accident.

03

Protect MCP workflows

Wrap MCP tool handlers with policy checks and audit logging using a lightweight adapter.

04

Ship policies safely

Roll out new rules with shadow, warn, and enforce modes rather than turning on hard blocks from day one.

05

Scan for basic injection signals

Catch hidden text, metadata payloads, HTML comment attacks, base64 instructions, and prompt-injection keywords.

06

Keep audit trails

Write structured JSONL events for findings, tool decisions, result redaction, and error events.

How it works

A runtime defense layer built for rollout.

TrapDefense is designed to fit the way real teams ship: observe first, tighten gradually, and keep the evidence needed to tune policies without breaking production.

Scan inbound content

Detect basic content-injection patterns before suspicious material gets normalized inside agent workflows.

Evaluate tool calls

Apply policy checks in sequence: blocklists, egress, file paths, PII, capability fallback, and default actions.

Protect outputs and record evidence

Redact sensitive data, log decisions, and leave a trace your team can review later.

Shadow -> Warn -> Enforce lets teams deploy runtime controls without pretending they already know every safe rule on day one.
Use cases

Built for teams operating agents in the real world.

TrapDefense is especially strong where actions matter more than text: MCP servers, internal assistants, and tool-calling systems connected to sensitive company workflows.

MCP

Protect MCP servers

Add policy enforcement and audit logging to MCP tool handlers without standing up a separate proxy first.

API

Control internal agents

Restrict where assistants can send data, which files they can touch, and what capabilities they can invoke.

PII

Reduce exfiltration risk

Block suspicious destinations, catch risky outputs, and redact sensitive data before it leaves the system.

OPS

Roll out security safely

Start with observation, move into warning, then enforce with confidence after policy tuning.

Open source + enterprise

Open source first. Enterprise when your team needs more.

TrapDefense starts as an open-source SDK for runtime security in AI agent systems. Enterprise expands the operational layer: shared policies, audit workflows, and support for serious teams.

Open source

Ship the runtime layer now.

Use the SDK for policy enforcement, MCP integration, basic content scanning, PII redaction, and audit logging.

Guard engine for tool-use policy enforcement
Scanner for basic content-injection detection
Audit logger for structured JSONL events
MCP integration for protected tool handlers
Policy files and shadow/warn/enforce rollout
Enterprise

Operate policies across a team.

TrapDefense Enterprise is the next layer for teams that need centralized governance, onboarding help, and security operations around agent runtime controls.

Centralized policy management
Shared audit workflows and review processes
Operational rollout guidance
Enterprise onboarding and support
Custom integration help for production teams
Talk to us

Start an enterprise conversation.

Tell us a bit about your team and what you want to protect. We'll route the inquiry to hellocosmos@gmail.com.

Submissions are delivered to email, and users may see a captcha depending on spam protection.
FAQ

What teams usually ask first.

The goal is not to pretend every security problem is solved. The goal is to be crisp about what TrapDefense already does well.

What does TrapDefense protect against?

TrapDefense focuses on practical runtime defenses for tool-use risks, including outbound requests, file access, sensitive capabilities, PII exposure, and basic content-injection signals.

Is this a full agent security platform?

Not yet. Today, TrapDefense is an open-source runtime security SDK with an enterprise path. It is intentionally focused on execution-time controls and auditability.

Why not just use prompt filtering?

Prompt filtering can catch some suspicious input, but it does not stop the final action. TrapDefense adds policy enforcement at the moment a tool is about to run.

Do I need MCP to use TrapDefense?

No. MCP is a strong starting point, but TrapDefense can also be used directly in Python agent workflows with decorators and policy evaluation hooks.

Start where the risk actually is

Give your agent stack a runtime defense layer.

Open-source for builders. Enterprise-ready for teams that need auditability, rollout safety, and stronger operational controls.