PermForge
The permission control and audit evidence layer for AI agents.
Post-hoc audits miss 41.7% of what your agent actually does.
Source · AgentLeak benchmark · 4,979 traces · 2026
Agents now run 220+ sub-calls per task. Human-speed oversight breaks structurally.
a16z calls it the "thundering herd." One prompt fans out across LLM calls, vector lookups, tool calls, and sub-agents — at agent speed, not human speed.
- 01 220+ sub-calls per task
recursive fan-out across LLM, vector DB, tool calls, sub-agents
- 02 5,000 sub-tasks in milliseconds
new median for production vertical AI agents
- 03 0 human-speed approvals work
OAuth · RBAC · step-up · manual review · all break at this scale
Three time phases of agent governance. The middle one is empty.
Regulators have already named the empty box: "proportionate real-time oversight" (EU AI Act Article 14). Today the market sells policy authoring before, and tracing after. The during is unowned.
- Before · T-1
Policy drift.
Static linting and pre-prod rules try to anticipate every edge case — but agents adapt faster than policies update.
Product · static lint · prompt firewalls
- During · T-0
No enforcement.
Mid-execution, while the agent is calling tools, expanding scopes, and spawning sub-agents — nothing inspects the decision graph.
PermForge fills this →
- After · T+1
Audit gaps.
Post-hoc trace tools find what already shipped. By then the gap, leak, or hallucinated commit is in production.
Product · Braintrust · LangSmith · Langfuse
15+ named buyers. Already funded. Already regulated. Already shipping.
Combined funding ≥ $13B across the three verticals where regulated AI agents are already in production. Every one of them has the same gap.
-
Legal
Funded · Regulated · Shipping
Rule 1.6 client-matter wall · ABA 5.3 attorney supervision
- Harvey
- Legora
- EvenUp
- Crogl
- Eve
- Wordsmith
-
Healthcare
Funded · HIPAA · Shipping
HIPAA Minimum Necessary · per-call PHI necessity (CMS 2026-03)
- Hippocratic
- Abridge
- Notable
- OpenEvidence
-
Financial Compliance
Funded · MNPI · Shipping
MNPI propagation · KYC reasoning chain · EU AI Act high-risk
- Norm Ai
- Themis
- Greenlite
- Hummingbird
Source · public funding filings · Crunchbase · LinkedIn job-post permissions language · sample, not exhaustive
Four enforcement deadlines stack up in 2026.
The permission-control gap becomes a legal and cash-flow risk this quarter. Network insurers and customer SOC 2 reviewers have already begun treating an "agent permission audit trail" as a renewal condition.
- 2026-06-30 T-35 days
Colorado AI Act
Cure period ends. First enforcement day for "high-risk AI" duty to disclose & manage.
- 2026-08-02 T-68 days
EU AI Act
High-risk systems enforcement begins. Article 14 demands proportionate real-time oversight. Fines of up to €35M or 7% of global revenue.
- 2026-02 · ongoing
ABA Model Rule 5.3
Extended to AI agents acting under attorney supervision. Law firm AI procurement now requires audit trail conformity.
- 2026-03 · ongoing
CMS HIPAA Minimum Necessary
Clarified: agent-driven PHI access must demonstrate per-call necessity. Hits the fan-out failure mode directly.
Five capabilities. All inline, at sub-call latency.
PermForge sits between the agent runtime and every privileged action it tries to take. We don't sit beside it as a tracing tool — we gate it as a runtime control plane.
- 01 INTERCEPT
Inline interception
Hook the 4 entry points where agent permission creep happens: tool call · scope upgrade · sub-agent spawn · token forwarding. Miss one, you still leak.
- 02 ROUTE
Risk-graded routing
Low risk auto-passes inline. Medium batches to async approval. High blocks and escalates with full evidence chain. Policy templates ship per regulation.
- 03 ELICIT
Async approval
Batch elicitation collapses 220 sub-call asks into 5 human decisions. Slack · mobile push · SMS · Magic Link. Timeout defaults are policy-driven.
- 04 EVIDENCE
Audit-grade evidence
Every request → decision → approver → timestamp → outcome is signed and immutable. Maps directly to EU AI Act Annex III & ABA 5.3 evidence requirements.
- 05 KILL
Circuit breaker
A wrong approval can be revoked. Anomalous agent behavior triggers a kill. This is the contractual "right to interrupt" your large customers will ask for in 2026 H2.
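The routing, batching, and evidence steps above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not PermForge's actual API: every name here (`Request`, `route`, `batch_pending`, `EvidenceLog`) is hypothetical, the risk grade is assumed to be assigned by an earlier policy step, and a SHA-256 hash chain stands in for real cryptographic signatures (it makes tampering detectable but does not prove who wrote a record).

```python
import hashlib
import json
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical names throughout; none of this is PermForge's real interface.
ENTRY_POINTS = {"tool_call", "scope_upgrade", "subagent_spawn", "token_forwarding"}

@dataclass(frozen=True)
class Request:
    agent: str
    entry_point: str  # one of ENTRY_POINTS
    resource: str
    risk: str         # "low" | "medium" | "high", assumed to come from a policy step

def route(req: Request) -> str:
    """Risk-graded routing: low passes, medium waits for approval, high blocks."""
    if req.entry_point not in ENTRY_POINTS:
        raise ValueError(f"unknown entry point: {req.entry_point}")
    return {"low": "allow", "medium": "pending", "high": "block"}[req.risk]

def batch_pending(reqs):
    """Collapse pending requests into one ask per (agent, entry point, resource)."""
    batches = defaultdict(list)
    for r in reqs:
        if route(r) == "pending":
            batches[(r.agent, r.entry_point, r.resource)].append(r)
    return dict(batches)

class EvidenceLog:
    """Append-only log; each record's hash covers the previous record's hash."""
    GENESIS = "0" * 64

    def __init__(self):
        self.records = []

    @staticmethod
    def _digest(body):
        return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()

    def append(self, req, decision, approver, timestamp):
        prev = self.records[-1]["hash"] if self.records else self.GENESIS
        body = {"agent": req.agent, "entry_point": req.entry_point,
                "resource": req.resource, "decision": decision,
                "approver": approver, "timestamp": timestamp, "prev": prev}
        self.records.append({**body, "hash": self._digest(body)})

    def verify(self):
        """Recompute every hash in order; any edited record breaks the chain."""
        prev = self.GENESIS
        for rec in self.records:
            body = {k: v for k, v in rec.items() if k != "hash"}
            if rec["prev"] != prev or rec["hash"] != self._digest(body):
                return False
            prev = rec["hash"]
        return True
```

In this sketch, three medium-risk reads of the same resource by the same agent become a single approval ask, and editing any stored decision after the fact makes `verify()` return `False`. A production system would replace the hash chain with signatures from a managed key and add the revocation and kill paths, which are omitted here.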
Four benchmarks pin the gap. Not opinion. Public data.
Buyers we sell to have already cited at least one of these in their internal AI risk reviews. We don't argue the gap — we measure it on your traces in a 1-week shadow audit.
- 41.7% arxiv.org · 2502.16793 ↗
AgentLeak · visibility gap
Multi-agent privacy violations missed by output-only audits. 4,979 production traces. Inter-agent channel = 68.9% of leakage, invisible to Braintrust / LangSmith.
- 10.8% arxiv.org · 2406.12045 ↗
τ-bench · policy compliance gap
Even SOTA agents violate organizational policy in about 1 of 10 multi-turn workflows. The gap is structural, not something a model upgrade fixes.
-
AgentHarm · model self-defense gap
Refusal-trained LLMs are easily jailbroken when operating as browser agents. Built-in safety training fails at agent time, so an external control plane is mandatory.
- own words Braintrust buyer guide · 2026 ↗
Braintrust 2026 buyer guide
"Shows trace after users complain · can't block before it ships." Their own buyer guide confirms post-hoc is structurally late. The "during" is empty by design.
Three control surfaces × three time phases. One cell is open.
Real-time × Permission is the cell every regulated vertical AI agent needs by 2026 H2. Static authorization (OAuth, OPA) and behavior sandboxing don't cover it. Tracing doesn't either.
Matrix legend · existing player · partial / token coverage · PermForge
Our open-source benchmark
PermBench
How regulated vertical AI agents score on sub-call permission control. Run it on your own agent. Compare against Harvey, Hippocratic, Norm Ai. Cite us in your AI risk review.
- 120+ failure cases
- 8 vertical regulations mapped
- Apache 2.0 license · no rug-pull
Free shadow audit · 1 week.
You send 1 week of agent traces — de-identified is fine, or wire the SDK for a live capture. You get back a signed report listing every silent permission boundary your agent crossed, with regulatory exposure estimates.
first 12 audits free · June 2026 · 1-week turnaround · no commitment
- 01 You send
A 1–7 day window of agent traces (de-identified is fine). Or wire our SDK for a live capture.
- 02 We run
PermBench scoring + behavioral graph extraction. We map every cross-tenant access, scope upgrade, and silent ethical-wall hit.
- 03 You get
A signed report: visibility gap %, regulatory exposure, recommended controls. Pilot pricing only if it warrants it.
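To make the mapping step concrete, here is a rough sketch of how a trace scan might flag scope upgrades and cross-tenant access. It is an illustration only: the event keys (`agent`, `tenant`, `scopes`, `resource_tenant`) and the function name are assumptions made for this example, not the PermBench trace format or the SDK's interface.

```python
def find_boundary_crossings(trace):
    """Scan events in execution order and report permission boundary crossings.

    Each event is a dict with hypothetical keys:
      'agent'            agent identifier
      'tenant'           tenant the agent is acting for
      'scopes'           scopes the agent holds at this step
      'resource_tenant'  tenant that owns the resource being touched
    """
    findings = []
    seen_scopes = {}  # agent -> every scope observed so far
    for i, ev in enumerate(trace):
        agent = ev["agent"]
        scopes = set(ev["scopes"])
        # A scope not seen on any earlier step for this agent is an upgrade.
        if agent in seen_scopes:
            added = scopes - seen_scopes[agent]
            if added:
                findings.append((i, "scope_upgrade", sorted(added)))
        seen_scopes[agent] = seen_scopes.get(agent, set()) | scopes
        # Touching a resource owned by a different tenant is a crossing.
        if ev["resource_tenant"] != ev["tenant"]:
            findings.append((i, "cross_tenant", ev["resource_tenant"]))
    return findings
```

Each finding is a tuple of step index, finding type, and detail, so a report can point back to the exact step in the trace. Ethical-wall hits would need matter or client metadata on each event, which this sketch does not model.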