Boxed.ai

Product

A neutral execution-time control plane for your AI agents.

Boxed.ai wraps the tools your agents call. Every action passes through a policy gate. Every decision lands in an evidence-grade log. Works across providers — you don't rip out the agent stack you already have.

How it works

Four steps from agent intent to auditable outcome.

  1. Agent makes a request

     Your agent — built on OpenAI Responses, Azure OpenAI Assistants, LangChain or AutoGen — calls a wrapped tool through the Boxed.ai gateway instead of the raw provider.

  2. Policy gate evaluates

     The gateway checks the call against your policy: scoped permission, sanitised inputs, rate limits, blast-radius checks. Sensitive actions trigger a step-up approval.

  3. Action executes or pauses

     Approved calls pass through to the underlying tool. Calls awaiting approval queue for a named human; denied calls return a structured refusal the agent can handle gracefully.

  4. Evidence is recorded

     Every prompt, retrieved context item, tool call, parameter, approver and outcome is appended to a hash-chained audit log. Exception MI (management information) surfaces what your second line needs to see.

The gateway model

Sit between the agent and the tools, not inside the model.

Model-layer alignment is necessary but not sufficient. NCSC and OWASP both flag indirect prompt injection as a class of attack that cannot be fully patched at the model. Boxed.ai works at the execution boundary, where actions are bounded by code rather than by hope.

That means you get the same guarantees regardless of which provider, framework or model your team picks next quarter.

AGENT RUNTIME              BOXED.AI         TOOLS
OpenAI Responses           Policy gate      Email, CRM
Azure Assistants           Approvals        Files, Git
LangChain / AutoGen        Audit log        Payments, APIs

    request →        decision →       action
    ← result         ← outcome        ← response

Tool wrappers

Three demo tools at launch. Connectors expand from there.

The MVP ships with three high-value, high-risk tool wrappers. They are deliberately conservative defaults — easy to relax, expensive to forget.

Filesystem

Read-only by default

Agents can browse and read project files. Writes, deletes and renames require an explicit policy grant — by directory, file pattern and named approver.

Git

Pull-request only

Agents can branch, commit and open PRs. Direct pushes to protected branches are blocked. Merges require human review on the PR itself; approval always comes from a person, never the agent.

Email

Draft-only by default

Agents can compose, address and attach files. Sending requires a named approver per recipient class. Drafts and decisions are recorded in the audit log.
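The three wrappers' conservative defaults can be pictured as a declarative policy. The structure and key names below are illustrative assumptions for this sketch, not the Boxed.ai policy schema:

```python
# Illustrative defaults for the three launch wrappers.
# Key names and verdict strings are assumptions, not the real schema.
DEFAULT_POLICY = {
    "filesystem": {
        "read": "allow",            # read-only by default
        "write": "require_grant",   # by directory, file pattern, named approver
        "delete": "require_grant",
        "rename": "require_grant",
    },
    "git": {
        "branch": "allow",
        "commit": "allow",
        "open_pr": "allow",         # pull-request only
        "push_protected": "deny",   # direct pushes to protected branches blocked
        "merge": "human_review",    # approved on the PR, never by the agent
    },
    "email": {
        "draft": "allow",           # draft-only by default
        "send": "require_approver_per_recipient_class",
    },
}
```

Relaxing a default is then a visible, reviewable edit to this policy rather than an implicit change in agent behaviour — which is the sense in which the defaults are "easy to relax, expensive to forget".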

Payments and CRM

Roadmap

Connectors for Stripe, GoCardless, HubSpot, Salesforce and common UK practice-management systems are on the build plan for the first paid tier.

The audit log

Every entry, hash-chained. Every chain, exportable.

Boxed.ai writes an append-only record for every tool call evaluated by the policy gate. Each entry is hash-chained to the previous one — so any tampering with historical events is detectable on review.

Logs export as JSONL for ingest into your SIEM, GRC tooling or auditor's evidence locker. A signed manifest accompanies every export.

Captured per call

  • Agent identity and version
  • Underlying model and provider
  • Full prompt (with system, user, tool messages)
  • Retrieved context (with source URI and content hash)
  • Tool name, parameters and arguments
  • Policy decision and matched rule
  • Approver identity and timestamp (if applicable)
  • Tool response and outcome status
  • Token, latency and cost telemetry
  • Hash chain link to the previous log entry
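The hash-chaining scheme can be sketched in a few lines: each entry is hashed together with the previous entry's hash, so editing any historical entry breaks every link after it. This is a minimal illustration of the general technique, not Boxed.ai's actual log format:

```python
import hashlib
import json


def append_entry(log: list[dict], entry: dict) -> None:
    """Chain a new entry to the log by hashing it together with the
    previous entry's hash (a genesis value of all zeros starts the chain)."""
    prev_hash = log[-1]["hash"] if log else "0" * 64
    body = dict(entry, prev_hash=prev_hash)
    digest = hashlib.sha256(
        json.dumps(body, sort_keys=True).encode()
    ).hexdigest()
    log.append(dict(body, hash=digest))


def verify_chain(log: list[dict]) -> bool:
    """Recompute every hash from the genesis value onward; any tampering
    with a historical entry is detectable because its digest no longer
    matches and every later prev_hash link inherits the break."""
    prev_hash = "0" * 64
    for record in log:
        if record["prev_hash"] != prev_hash:
            return False
        body = {k: v for k, v in record.items() if k != "hash"}
        digest = hashlib.sha256(
            json.dumps(body, sort_keys=True).encode()
        ).hexdigest()
        if digest != record["hash"]:
            return False
        prev_hash = record["hash"]
    return True


log: list[dict] = []
append_entry(log, {"tool": "files.read", "decision": "allow"})
append_entry(log, {"tool": "email.send", "decision": "pending"})
assert verify_chain(log)

log[0]["decision"] = "deny"   # tamper with a historical entry
assert not verify_chain(log)  # the chain no longer verifies
```

Because each record is plain JSON, a log in this shape exports naturally as JSONL, one entry per line, and the same verification pass can run on the exported file.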

Where we are

Honest about the build.

Boxed.ai is in active development. We are running design partners through a working policy gate, three tool wrappers and the audit log described on this page. RBAC, GRC connectors and automated incident response are on the build plan for later phases. If you'd like to be in the design-partner cohort, we'd like to hear from you.

Talk to a founder