Built on 67 Paperclip GitHub issues analyzed

Paperclip runs your AI company.
Orchesis keeps it from running away.

Orchesis is an open-source HTTP proxy that sits between Paperclip agents and LLM APIs to detect loops at API call #3, scan for injection on every request, and meter real costs independently of what the agent reports.

12 attack surfaces · $0.05 loop catch · 0 Paperclip security docs · 96% injection detected

Install in 30 seconds →★ Star on GitHub
🔍 12 attack surfaces found
$0.05 loop catch at call #3
🔒 18 injection patterns
MIT · Free · Self-hosted
What Paperclip's dashboard doesn't show you

Three things that actually happen in production Paperclip fleets.

Each one has an issue number.

Budget tracking · Heartbeat gap
$55–150
unmonitored spend per agent run

Budget checks only fire between heartbeats

Paperclip checks budgets when an agent wakes up and when it finishes. Inside the run: unlimited spend. If the agent times out, the cost vanishes from the dashboard entirely. A 30-minute heartbeat with a 10-minute run means 20 minutes of complete cost blindness per cycle.

Orchesis detects the loop at API call #3$0.05 spent. That's 450× to 1,800,000× faster than waiting for the next heartbeat.

Issue #212 · Codex adapter
$0.00
reported cost for every Codex run

Your dashboard says your Codex agents cost nothing

The Codex adapter reports $0.00 for all runs. Millions of tokens processed, zero recorded cost. The dashboard shows a number. The number is wrong. There is no independent verification.

Orchesis meters every API call at the network layerReal tokens, real cost, independent of what the agent reports.

Cascading Injection · Formal proof
10 min
to infect your entire fleet from one task

One poisoned issue description compromises 13 agents

Issue descriptions have zero input sanitization. An attacker writes a malicious task. The CEO agent reads it, delegates to CTOs, who delegate to engineers. At fan-out 3 and depth 2: 13 agents compromised in 10 minutes. We proved this formally: N(t) = O(f^{t/Δ_h}).

Orchesis at depth 0 saves 92% of the fleetAt depth 1: 69%. Pattern-matched injection is fully contained.

One environment variable. That's it.

No code changes in Paperclip. No code changes in your agents.

Works with Claude Code, Codex, Cursor, and OpenClaw adapters.

# 1. Install Orchesis
pip install orchesis
orchesis init --with-crystal-alert --with-injection-shield

# 2. Add one line to your Paperclip .env
ANTHROPIC_BASE_URL="http://orchesis:8080/v1"
# 3. Verify
orchesis verify

✓ Proxy running on :8080
✓ Crystal Alert: active
✓ Injection Shield: active (33+ patterns + 14 Paperclip-specific)
✓ Cost metering: real-time
⚠ Paperclip heartbeat detected: 30 min · Orchesis covers the gaps
30s
Phase 0: Minimum Viable Integration
One env var. Loop detection + cost metering + injection scanning. Covers 60–70% of total value immediately.
14h
Phase 1: Core Adapter
Paperclip-specific patterns. Agent ID correlation. Detection jumps from 30–40% to 55–65%.
84h
Full Integration
Budget bridge, plugin hooks, dual proxy, workspace monitoring. All 6 phases.
12 attack surfaces. What catches what.

We identified 12 ways to attack a Paperclip fleet.

Orchesis covers 9. Five surfaces have zero protection from any tool.

Attack surfaceSeverityPaperclipOrchesis (proxy)Orchesis (plugin)
Issue description injectionHIGHNoneScan + block
Goal ancestry cascadeHIGHNoneChain tracking
Shared workspace filesHIGHNonePartialIntegrity check
Session state leakageMEDIUMNoneSessionTracer
Plugin system (full Node.js)CRITICALNoneBlind spotBlind spot
Clipmart templatesCRITICALNo reviewBlind spotBlind spot
Approval flow bypassCRITICALBroken (#696)Detect + alert
Cost reporting (self-reported)HIGHTrusts agentIndependent metering
Auth system (local_trusted)CRITICALPR #1315Blind spotBlind spot
Cross-agent API (ctx.*)HIGHNo RLSBlind spotAudit hooks
Heartbeat gap (filesystem)MEDIUMNoneBlind spotSnapshot
Stdout credential leaksMEDIUMDisplayed in UIAPI-level only

5 surfaces marked "Blind spot" require fixes in Paperclip itself. We report these through responsible disclosure, not marketing.

What Orchesis catches for Paperclip fleets

Every number below is calculated, not estimated.

Methods and assumptions are published.

$0.05
Loop detected at API call #3
vs $55–150 at next heartbeat
96%
Explicit injection patterns detected
0% false positives · calibrated
2ms
Cost anomaly detection latency
vs 15–60 minutes
92%
Fleet saved at cascade depth 0
Cascading Injection Theorem
$52–143K
Annual value for fleet of 5 agents
Calibrated range
57%
Detection rate with dual proxy
+26pp over single proxy
OpenClaw vs Paperclip: different problems, same solution

Same proxy. Different scale.

OpenClaw + Orchesis
Individual developer, one agent
Fleet size1
Session modelContinuous
Integration0 hours
Lead problemInvisible attacks
Collusion riskNone (N=1)
Injection surfaces3–4
Annual value$5–20K
Paperclip + Orchesis
Team, 5–20 agents in production
Fleet size5–20
Session modelHeartbeat
Integration30 min → 84h
Lead problemCost bleeding
Collusion riskYes (Folk theorem)
Injection surfaces12
Annual value$52–143K

Also available: See our OpenClaw integration →

What Orchesis can't do for Paperclip

We publish our limits.

Nobody else in this space does this. You need them to make real security decisions.

Semantic injection: 0% detection.Natural language attacks disguised as legitimate tasks are fundamentally undetectable by any finite set of regex patterns. This is proven, not an engineering gap.
Plugin layer: complete blind spot.Paperclip plugins run as in-process Node.js. HTTP proxy cannot see ctx.* calls.
5 of 12 surfaces: zero protection.Plugin system, auth, cross-agent API, heartbeat filesystem gap, stdout logs. Neither Orchesis nor Paperclip covers these today.
Realistic detection: 38%.Against a mix of 40% explicit and 60% semantic attacks. Not 96%. The 96% is for explicit patterns only.
Cascade containment: bifurcated.Explicit payloads: fully contained. Semantic payloads: not containable at proxy layer.
Frequently asked about Orchesis + Paperclip

Common questions

How does Orchesis integrate with Paperclip?

One environment variable: ANTHROPIC_BASE_URL=http://orchesis:8080/v1. All agent API calls route through Orchesis automatically. Works with Claude Code, Codex, Cursor, and OpenClaw adapters. No code changes in Paperclip or your agents.

What does Orchesis detect that Paperclip doesn't?

Runaway loops at API call #3 ($0.05 spent) instead of next heartbeat ($55–150). Prompt injection in issue descriptions, which have zero sanitization. Real API costs independent of self-reported agent data. Cross-agent behavioral correlation across your fleet.

Is Paperclip secure enough for production?

Paperclip has 30,000+ stars and zero security documentation. We found 12 attack surfaces including unsandboxed plugins with full server access, broken approval flows, and zero input sanitization on issue descriptions. External monitoring is strongly recommended for any production deployment.

What are the honest limitations?

Semantic injection (natural language attacks): 0% detection. Plugin layer: complete blind spot. 5 of 12 surfaces: zero protection. Realistic mixed-attack detection: 38%. We publish these numbers because transparency builds better security decisions.

How much does Orchesis cost?

$0. MIT license. Self-hosted. No telemetry. No account required. The annual value for a Paperclip fleet of 5 agents is $52K–143K in prevented cost overruns and security incidents.

The fleet that's bleeding money right now

won't tell you.

Get started →See OpenClaw integration →

Works whether AI wins or loses.