Workflows fail in the real world: APIs time out, databases lock, humans go offline. Good error handling turns chaos into retries, compensations, and clear operator signals instead of silent data drift.
Failures
Classify each failure as transient or permanent. Retry transient failures with jittered backoff; on permanent failures, stop fast and surface a crisp error code.
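The classification above can be sketched as two exception types and a retry wrapper. This is a minimal illustration, not a prescribed implementation: `TransientError`, `PermanentError`, `run_with_retry`, and the default attempt and delay values are all hypothetical names and numbers chosen for the example. It uses exponential backoff with full jitter, which is one common jitter strategy.

```python
import random
import time


class TransientError(Exception):
    """Failure that may succeed on retry (timeout, lock contention)."""


class PermanentError(Exception):
    """Failure that will not succeed on retry; carries an error code."""

    def __init__(self, code, message):
        super().__init__(f"{code}: {message}")
        self.code = code


def run_with_retry(step, max_attempts=5, base_delay=0.5, max_delay=30.0,
                   sleep=time.sleep, rng=random.random):
    """Run step(); retry transient failures with full-jitter backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return step()
        except TransientError:
            if attempt == max_attempts:
                raise  # retry budget exhausted, surface the failure
            # Full jitter: random delay below min(max_delay, base * 2^(attempt-1)).
            cap = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(rng() * cap)
        # PermanentError is deliberately not caught, so it propagates at once.
```

Passing `sleep` and `rng` as parameters keeps the wrapper testable without real waiting.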
Observability
Attach correlation IDs across steps so support can trace a single user action through every hop. Log HTTP status codes and response bodies at reduced verbosity to avoid leaking secrets.
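One way to carry a correlation ID through every log line is a context variable plus a logging filter, sketched below with Python's standard `logging` and `contextvars` modules. The names `correlation_id`, `CorrelationFilter`, `build_logger`, and `summarize_response` are assumptions made for this example, and logging only the status code and body size is one possible reading of "reduced verbosity".

```python
import contextvars
import logging

# Hypothetical context variable holding the ID of the current user action.
correlation_id = contextvars.ContextVar("correlation_id", default="-")


class CorrelationFilter(logging.Filter):
    """Copy the current correlation ID onto every log record."""

    def filter(self, record):
        record.correlation_id = correlation_id.get()
        return True


def build_logger(name, stream=None):
    """Return a logger whose lines start with the correlation ID."""
    handler = logging.StreamHandler(stream)  # stream=None means stderr
    handler.setFormatter(
        logging.Formatter("%(correlation_id)s %(levelname)s %(message)s"))
    handler.addFilter(CorrelationFilter())
    logger = logging.getLogger(name)
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger


def summarize_response(status, body):
    """Reduced-verbosity summary: status code and body size, never the body."""
    return f"status={status} body_bytes={len(body)}"
```

A step would set the ID once, for example `correlation_id.set(request_id)`, and every later log call in that context picks it up automatically.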
Poison messages
If the same payload fails repeatedly, stop retrying and quarantine it—otherwise you starve healthy traffic.
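The quarantine rule can be sketched as a failure counter keyed by a payload fingerprint. `PoisonGuard`, its `max_failures` default of 3, and the in-memory `quarantine` list are hypothetical; a real workflow would more likely persist the counts and move the message to a dead-letter queue.

```python
import hashlib
import json
from collections import Counter


class PoisonGuard:
    """Count failures per payload and quarantine repeat offenders."""

    def __init__(self, max_failures=3):
        self.max_failures = max_failures
        self.failures = Counter()
        self.quarantine = []  # stand-in for a dead-letter queue

    @staticmethod
    def fingerprint(payload):
        # Stable hash of the payload so identical messages share a counter.
        data = json.dumps(payload, sort_keys=True).encode("utf-8")
        return hashlib.sha256(data).hexdigest()

    def handle(self, payload, process):
        """Return 'ok', 'retry', or 'quarantined'."""
        key = self.fingerprint(payload)
        try:
            process(payload)
        except Exception:
            self.failures[key] += 1
            if self.failures[key] >= self.max_failures:
                self.quarantine.append(payload)
                del self.failures[key]
                return "quarantined"
            return "retry"
        self.failures.pop(key, None)  # success clears the counter
        return "ok"
```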
Compensation
Undo partial effects when a downstream step fails—especially for payments or external posts. Compensation may mean voiding an invoice, deleting a draft record, or sending a corrective webhook.
Ordering
Design compensations in reverse dependency order: undo the last successful side effect first.
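Reverse-order compensation maps naturally onto a stack: push an undo action after each successful step, then pop them on failure. The sketch below assumes hypothetical `Saga` and `execute` helpers and treats each step as an `(action, compensate)` pair of callables.

```python
class Saga:
    """Record an undo action for each successful step."""

    def __init__(self):
        self._compensations = []

    def run(self, action, compensate):
        result = action()
        # Register the undo only after the side effect actually happened.
        self._compensations.append(compensate)
        return result

    def rollback(self):
        """Undo in reverse order; collect errors instead of stopping early."""
        errors = []
        while self._compensations:
            undo = self._compensations.pop()  # last successful step first
            try:
                undo()
            except Exception as exc:
                errors.append(exc)
        return errors


def execute(steps):
    """Run (action, compensate) pairs; roll back everything on failure."""
    saga = Saga()
    try:
        for action, compensate in steps:
            saga.run(action, compensate)
    except Exception:
        saga.rollback()
        raise
```

Collecting compensation errors rather than aborting on the first one means a failed undo does not block the remaining undos; those errors still need to reach an operator.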
Partial success
When only one of two external systems succeeded, document the manual reconciliation path in the alert body.
Alerts
Route failures to on-call channels with runbook links. Include workflow name, step name, and last payload hash—not full PII—in the first line of the alert.
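As an illustration of that first line, the sketch below builds an alert from the workflow name, step name, and a truncated SHA-256 of the payload. The function names, the line layout, and the 12-character truncation are assumptions for this example, not a required format.

```python
import hashlib
import json


def payload_hash(payload):
    """Short, stable hash that identifies a payload without exposing it."""
    data = json.dumps(payload, sort_keys=True, default=str).encode("utf-8")
    return hashlib.sha256(data).hexdigest()[:12]


def format_alert(workflow, step, payload, runbook_url):
    """Build alert text; the first line carries workflow, step, and hash."""
    first_line = (
        f"[{workflow}] step={step} payload_sha256={payload_hash(payload)}"
    )
    return "\n".join([first_line, f"Runbook: {runbook_url}"])
```

Sorting keys before hashing means the same payload always yields the same hash, so responders can match an alert to stored records without the alert carrying the data itself.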
Fatigue
Throttle duplicate alerts for the same root cause. Pair alerts with dashboards so responders can see whether failures are trending or isolated.
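Duplicate throttling can be sketched as a per-root-cause time window. `AlertThrottle`, the 300-second default window, and the idea of keying on a caller-supplied root-cause string are assumptions for this example; the suppressed counts are kept so a dashboard can still show the trend.

```python
import time


class AlertThrottle:
    """Send at most one alert per root cause within a time window."""

    def __init__(self, window_seconds=300.0, clock=time.monotonic):
        self.window = window_seconds
        self.clock = clock
        self._last_sent = {}
        self.suppressed = {}  # root cause -> number of alerts held back

    def should_send(self, root_cause_key):
        now = self.clock()
        last = self._last_sent.get(root_cause_key)
        if last is not None and now - last < self.window:
            count = self.suppressed.get(root_cause_key, 0)
            self.suppressed[root_cause_key] = count + 1
            return False
        self._last_sent[root_cause_key] = now
        return True
```

A key such as `"billing-sync:charge-card:E_TIMEOUT"` (hypothetical) groups alerts that share a cause while letting unrelated failures through.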