Retrace records a failed Python test or CI run as a deterministic replay. Open it in VS Code, step backwards from the failure, and inspect the runtime state that actually happened.
pip install retracesoftware
Open source · CPython 3.11+ · Pytest in one command · VS Code replay
A failed pytest run replayed in VS Code. Step backwards from the exception and inspect the value that caused it.
Replay in VS Code
Open the failed run locally and debug the execution that actually happened.
Step backwards
Move backwards from the exception to the runtime state that caused it.
Runtime facts for AI agents
Give your AI coding agent real stack frames and values — not just logs and tracebacks.
No test rewrite
Wrap your existing pytest command. No special test harness required.
Production path
Start in CI. Use the same recording model for production failures when ready.
source .venv/bin/activate
python -m pip install retracesoftware
No app rewrite required.
RETRACE_RECORDING=recordings/failed-run.retrace \
python -m pytest
If the test passes, discard the recording.
If it fails, keep it.
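The keep-on-failure rule can also be scripted. The following is a minimal sketch, not part of Retrace itself: `run_recorded` is a hypothetical helper name, and it assumes the recording is a single file written to the path given in `RETRACE_RECORDING`. Only that environment variable comes from the commands above; the rest is the Python standard library.

```python
import os
import subprocess
import sys
from pathlib import Path


def run_recorded(command, recording):
    """Run `command` with RETRACE_RECORDING set; keep the recording only on failure.

    Hypothetical helper: assumes the recording is one file at `recording`.
    """
    recording = Path(recording)
    recording.parent.mkdir(parents=True, exist_ok=True)

    # Same variable as in the shell command above; the child inherits the rest.
    env = dict(os.environ, RETRACE_RECORDING=str(recording))
    result = subprocess.run(command, env=env)

    # A passing run leaves nothing to debug, so drop its recording.
    if result.returncode == 0 and recording.exists():
        recording.unlink()
    return result.returncode
```

A wrapper script might call `run_recorded([sys.executable, "-m", "pytest"], "recordings/failed-run.retrace")` and exit with the returned code.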
# Open recordings/failed-run.retrace
# Start replay from the Retrace sidebar
# Step backwards from the failure
No live test process required. You are debugging the recorded execution.
Run pytest under Retrace in CI. If the job passes, ignore the recording. If it fails, upload the `.retrace` file as a build artifact.
Now the failed run does not disappear when the CI process exits. A developer can replay it locally in VS Code, step backwards from the failure, and inspect the runtime state that caused it.
An AI coding agent can use the same artifact as runtime context instead of guessing from logs and a traceback.
  run: |
    mkdir -p recordings
    RETRACE_RECORDING=recordings/failed-run.retrace python -m pytest
- name: Upload Retrace recording
  if: failure()
  uses: actions/upload-artifact@v4
  with:
    name: retrace-failed-run
    path: recordings/failed-run.retrace
Retrace lets you replay the run that crashed.
| | Today | With Retrace |
| --- | --- | --- |
| CI artifacts | CI artifacts are logs and tracebacks | CI artifact is replayable |
| AI agents | AI agents infer from partial context | AI agents get runtime evidence |
| Failure | Stack trace shows where it crashed | Replay shows what happened before |
| What gets preserved | Logs show what you predicted would matter | Retrace preserves the failed execution |
Provenance shows you why.
A recording lets you step through the execution. But when you're staring at a wrong value, the real question is: where did it come from?
Retrace's provenance engine traces any value back through the execution — from the point you noticed it, through every transformation, to the original input that caused it.
- Select any value. Jump to its origin.
  Click a variable in the debugger and instantly see the exact line and inputs that produced it.
- Chain backwards through transformations.
  Each origin has its own provenance. Keep drilling back until you reach the root cause.
- Works on every value, not just outputs.
  Intermediate variables, function returns, container mutations: provenance covers everything in the execution.
Now in early access with select design partners.
Three clicks from ZeroDivisionError to root cause: the API caller sent qty: "0" in the request body. No manual searching. No log correlation.
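To make the scenario concrete, here is a hypothetical reconstruction of the kind of code that fails this way. The function names and the `total` field are invented for illustration; only the `qty: "0"` input and the `ZeroDivisionError` come from the description above. The exception is raised two calls away from where the bad value entered, which is the gap a provenance chain is meant to close.

```python
def parse_order(body):
    # Hypothetical parsing step: request fields arrive as strings.
    return {"total": float(body["total"]), "qty": int(body["qty"])}


def unit_price(order):
    # The failure surfaces here, after the bad input has already been converted.
    return order["total"] / order["qty"]


def handle_request(body):
    # Entry point: the origin of `qty` is the request body itself.
    return unit_price(parse_order(body))
```

Calling `handle_request({"total": "19.98", "qty": "0"})` raises `ZeroDivisionError` inside `unit_price`, while the value that caused it originated in the request body.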
Python 3.12 support is in progress, with broader coverage planned before GA.
It records at the Python runtime layer (not
A re-run often takes a different path.
Retrace lets you debug the exact execution that happened, after the fact.
Retrace records the real execution and lets you replay it deterministically, so you can inspect the actual code path and state.
Retrace records external interactions (DB, API calls, file I/O, time) during a real run, then replays them deterministically in your local debugger — no prod access needed.
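The general idea of recording at the boundary and replaying from a log can be shown with a toy example. This is a conceptual sketch only and makes no claim about how Retrace is implemented: the `Boundary` class and `job` function are hypothetical, and a real recorder has to cover far more than two calls.

```python
import random
import time


class Boundary:
    """Toy record/replay wrapper for nondeterministic calls (illustration only)."""

    def __init__(self, log=None):
        # With no log we record; with a log we replay from it.
        self.replaying = log is not None
        self.log = list(log) if self.replaying else []
        self._cursor = 0

    def call(self, name, fn, *args):
        if self.replaying:
            # Return the recorded result instead of touching the outside world.
            recorded_name, value = self.log[self._cursor]
            self._cursor += 1
            if recorded_name != name:
                raise RuntimeError(f"replay diverged: expected {recorded_name}, got {name}")
            return value
        value = fn(*args)
        self.log.append((name, value))
        return value


def job(boundary):
    # Both inputs differ on every live run, so a plain re-run takes a new path.
    started = boundary.call("time", time.time)
    roll = boundary.call("randint", random.randint, 1, 6)
    return f"{started:.6f}:{roll}"
```

Running `job(Boundary())` records a log; running `job(Boundary(recorded.log))` afterwards returns the same result without calling the clock or the random number generator again.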
Perfect for:
- Debug production-only bugs you can’t reproduce
  Replay the exact execution that already happened. No repro steps required.
- Reproduce race conditions and timing-sensitive failures
  Capture and deterministically replay concurrency, async behavior, and thread interactions.
- Stabilise flaky CI tests
  Replay the exact failing run to understand and fix non-deterministic test failures.
- Debug systems with external dependencies
  Reproduce failures involving databases, APIs, file I/O, and other external services.
- Investigate failures after the fact
  Inspect real code paths and state from incidents that are already over.
- Let AI agents debug your production failures
  Retrace's DAP integration lets tools like Claude Code and Cursor step through recorded executions programmatically, including backwards.
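For readers unfamiliar with DAP, stepping backwards is an ordinary protocol request. The sketch below frames a `stepBack` request the way the Debug Adapter Protocol specifies: a `Content-Length` header, a blank line, then a JSON body. The `encode_dap_request` helper is a hypothetical name, the thread id is a placeholder, and it is an assumption that Retrace's adapter accepts this exact request; a client would normally check the adapter's `supportsStepBack` capability first.

```python
import json


def encode_dap_request(seq, command, arguments):
    """Frame one DAP request: Content-Length header, blank line, JSON body."""
    body = json.dumps(
        {"seq": seq, "type": "request", "command": command, "arguments": arguments},
        separators=(",", ":"),
    ).encode("utf-8")
    # The header counts bytes of the encoded body, not characters.
    header = f"Content-Length: {len(body)}\r\n\r\n".encode("ascii")
    return header + body


# `stepBack` is the protocol's reverse-step request; threadId 1 is a placeholder.
message = encode_dap_request(seq=1, command="stepBack", arguments={"threadId": 1})
```

The resulting bytes would be written to the adapter's input stream; the response arrives in the same framing.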