Night Mode LabsBlue Book
Checklists

Incident Review Checklist

Use this checklist after incidents to turn response experience into system improvement. The review should focus on learning and prevention, not blame.

Facts

  • Incident start and end times are recorded.
  • Impacted users, services, and regions are identified.
  • Detection source is documented.
  • Timeline includes key decisions and mitigations.
  • Recent changes are reviewed.

Response

  • Incident roles were assigned or gaps are noted.
  • Escalation path worked or gaps are captured.
  • Communications were timely or gaps are captured.
  • Runbooks helped or missing steps are identified.
  • Access issues are recorded.

Technical learning

  • Root causes and contributing factors are documented.
  • Missing alerts or noisy alerts are identified.
  • Dashboard, log, or trace gaps are captured.
  • Rollback, restore, or mitigation gaps are captured.
  • Dependency failure behavior is understood.

Follow-up

  • Action items have owners and due dates.
  • High-risk actions are prioritized.
  • Accepted risks are explicit.
  • Service catalog, runbooks, and docs are updated.
  • Follow-up review date is scheduled.

On this page