back
loading skill details...
Run an incident response workflow — triage, communicate, and write postmortem. Trigger with "we have an incident", "production is down", an alert that needs…
/incident-response If you see unfamiliar placeholders or need to check which tools are connected, see CONNECTORS.md. Manage an incident from detection through postmortem. Usage /incident-response $ARGUMENTS Modes /incident-response new [description] # Start a new incident /incident-response update [status] # Post a status update /incident-response postmortem # Generate postmortem from incident data If no mode is specified, ask what phase the incident is in. How It Works ┌─────────────────────────────────────────────────────────────────┐ │ INCIDENT RESPONSE │ ├─────────────────────────────────────────────────────────────────┤ │ Phase 1: TRIAGE │ │ ✓ Assess severity (SEV1-4) │ │ ✓ Identify affected systems and users │ │ ✓ Assign roles (IC, comms, responders) │ │ │ │ Phase 2: COMMUNICATE │ │ ✓ Draft internal status update │ │ ✓ Draft customer communication (if needed) │ │ ✓ Set up war room and cadence │ │ │ │ Phase 3: MITIGATE │ │ ✓ Document mitigation steps taken │ │ ✓ Track timeline of events │ │ ✓ Confirm resolution │ │ │ │ Phase 4: POSTMORTEM │ │ ✓ Blameless postmortem document │ │ ✓ Timeline reconstruction │ │ ✓ Root cause analysis (5 whys) │ │ ✓ Action items with owners │ └─────────────────────────────────────────────────────────────────┘ Severity Classification Level Criteria Response Time SEV1 Service down, all users affected Immediate, all-hands SEV2 Major feature degraded, many users affected Within 15 min SEV3 Minor feature issue, some users affected Within 1 hour SEV4 Cosmetic or low-impact issue Next business day Communication Guidance Provide clear, factual updates at regular cadence. Include: what's happening, who's affected, what we're doing, when the next update is. Output — Status Update ## Incident Update: [Title] **Severity:** SEV[1-4] | **Status:** Investigating | Identified | Monitoring | Resolved **Impact:** [Who/what is affected] **Last Updated:** [Timestamp] ### Current Status [What we know now] ### Actions Taken - [Action 1] - [Action 2] ### Next Steps - [What's happening next and ETA] ### Timeline | Time | Event | |------|-------| | [HH:MM] | [Event] | Output — Postmortem ## Postmortem: [Incident Title] **Date:** [Date] | **Duration:** [X hours] | **Severity:** SEV[X] **Authors:** [Names] | **Status:** Draft ### Summary [2-3 sentence plain-language summary] ### Impact - [Users affected] - [Duration of impact] - [Business impact if quantifiable] ### Timeline | Time (UTC) | Event | |------------|-------| | [HH:MM] | [Event] | ### Root Cause [Detailed explanation of what caused the incident] ### 5 Whys 1. Why did [symptom]? → [Because...] 2. Why did [cause 1]? → [Because...] 3. Why did [cause 2]? → [Because...] 4. Why did [cause 3]? → [Because...] 5. Why did [cause 4]? → [Root cause] ### What Went Well - [Things that worked] ### What Went Poorly - [Things that didn't work] ### Action Items | Action | Owner | Priority | Due Date | |--------|-------|----------|----------| | [Action] | [Person] | P0/P1/P2 | [Date] | ### Lessons Learned [Key takeaways for the team] If Connectors Available If ~~monitoring is connected: Pull alert details and metrics Show graphs of affected metrics If ~~incident management is connected: Create or update incident in PagerDuty/Opsgenie Page on-call responders If ~~chat is connected: Post status updates to incident channel Create war room channel Tips Start writing immediately — Don't wait for complete information. Update as you learn more. Keep updates factual — What we know, what we've done, what's next. No speculation. Postmortems are blameless — Focus on systems and processes, not individuals.
don't have the plugin yet? install it then click "run inline in claude" again.