Wrong refund
- Tool: stripe.refund
- Target: in_original
- State: changed
Agent side-effect regression
Side-effect regression testing for AI agents.
Trace timeline
11 sec replay
The problem
Freeze the exact run, preserve the state boundary, and turn the incident into a gate.
Workflow
Capture the production run, compare the candidate, and block regressions before release.
Capture the run
agentreplay freeze
Compare side effects
agentreplay diff
Block regressions
agentreplay gate
agentreplay diff bad_trace.json fixed_trace.json
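To make the idea of comparing side effects concrete, here is a minimal sketch of a positional diff between two runs. It is not AgentReplay's implementation or trace format: the `SideEffect` shape, the `FieldChange` type, and the `diffSideEffects` function are hypothetical names chosen for illustration, loosely modeled on the Tool / Target / State fields shown in the card above.

```typescript
// Hypothetical shape of one recorded side effect (an assumption, not the real trace schema).
interface SideEffect {
  tool: string;   // tool that produced the effect, e.g. 'stripe.refund'
  target: string; // resource the tool acted on
  state: string;  // resulting state after the call
}

// One difference found between the original run and the candidate run.
interface FieldChange {
  index: number;
  field: 'tool' | 'target' | 'state' | 'missing' | 'extra';
  original: string | null;
  candidate: string | null;
}

// Compare two runs effect by effect, in call order.
function diffSideEffects(original: SideEffect[], candidate: SideEffect[]): FieldChange[] {
  const changes: FieldChange[] = [];
  const length = Math.max(original.length, candidate.length);

  for (let i = 0; i < length; i++) {
    const a: SideEffect | undefined = original[i];
    const b: SideEffect | undefined = candidate[i];

    // One run has a side effect at this position that the other lacks.
    if (a === undefined || b === undefined) {
      changes.push({
        index: i,
        field: a === undefined ? 'extra' : 'missing',
        original: a?.tool ?? null,
        candidate: b?.tool ?? null,
      });
      continue;
    }

    // Both runs have an effect here: report each field that differs.
    for (const field of ['tool', 'target', 'state'] as const) {
      if (a[field] !== b[field]) {
        changes.push({ index: i, field, original: a[field], candidate: b[field] });
      }
    }
  }
  return changes;
}
```

An empty result means the candidate produced the same side effects in the same order. A positional comparison is the simplest choice; a real tool might instead match effects by an identifier so that reordered calls are not reported as changes.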
Replay the bad run, compare the candidate, and block the regression before release.
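As a rough illustration of what blocking a regression could look like, the sketch below runs a list of named checks against a candidate run's side effects and reports how many passed. Everything here is an assumption made for the example: `SideEffect`, `Check`, and `runGate` are hypothetical names, and the source does not describe how `agentreplay gate` evaluates a trace.

```typescript
// Hypothetical shape of one recorded side effect (an assumption, not the real trace schema).
interface SideEffect {
  tool: string;
  target: string;
  state: string;
}

// A named rule that the candidate run's side effects must satisfy.
interface Check {
  name: string;
  passes: (effects: SideEffect[]) => boolean;
}

interface GateResult {
  ok: boolean;      // true only when every check passed
  passed: number;
  total: number;
  failed: string[]; // names of the checks that failed
}

// Evaluate every check; the gate is open only if none failed.
function runGate(effects: SideEffect[], checks: Check[]): GateResult {
  const failed = checks.filter((check) => !check.passes(effects)).map((check) => check.name);
  return {
    ok: failed.length === 0,
    passed: checks.length - failed.length,
    total: checks.length,
    failed,
  };
}
```

In a CI job, a script built on this idea would exit with a non-zero status whenever `ok` is false, which is what stops the release.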
Record inputs, responses, approvals, and side effects without replacing your agent framework.
harness.wrapTool('stripe.refund', refundCustomer)
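For readers wondering how a wrapper can record calls without replacing the agent framework, here is a minimal sketch of the general pattern. It is not AgentReplay's `wrapTool`: `createRecorder`, `ToolCallRecord`, and the in-memory `records` array are hypothetical, and the sketch assumes tools are async functions.

```typescript
// Hypothetical record of a single tool call (an assumption, not AgentReplay's schema).
interface ToolCallRecord {
  tool: string;
  input: unknown[];
  output?: unknown;
  error?: string;
}

function createRecorder() {
  const records: ToolCallRecord[] = [];

  // Return a function with the same signature as `fn` that logs each call.
  function wrapTool<A extends unknown[], R>(
    name: string,
    fn: (...args: A) => Promise<R>,
  ): (...args: A) => Promise<R> {
    return async (...args: A): Promise<R> => {
      // Record the input before the call so a failed call still leaves a trace entry.
      const record: ToolCallRecord = { tool: name, input: args };
      records.push(record);
      try {
        const output = await fn(...args);
        record.output = output;
        return output;
      } catch (err) {
        record.error = err instanceof Error ? err.message : String(err);
        throw err; // the caller sees the same failure as with the unwrapped tool
      }
    };
  }

  return { wrapTool, records };
}
```

Because the wrapped function keeps the original signature, the agent loop can call it exactly as it called the unwrapped tool, which is the sense in which this pattern sits beside a framework instead of replacing it.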
Billing, support, RevOps, and platform teams need proof before the next release.
refund gate
draft approval
CI release gate
client proof bundle
import { createHarness } from 'agentreplay'

const harness = createHarness({
  projectKey: process.env.AGENTREPLAY_PROJECT_KEY,
  redact: ['pii', 'raw_keys']
})

const refund = harness.wrapTool(
  'stripe.refund',
  refundCustomer
)
agentreplay gate traces/billing-bad-run.json
PASSED 6/6
agentreplay diff bad.json fixed.json
DIFF 3
Developer experience
AgentReplay works beside your framework: OpenAI Agents, LangGraph, custom MCP servers, or your own loop.
Every side effect frozen. Every fix compared. Every release gated.