Support & On-Call
Source: content/manual/04-platform-engineering/chapters/06-support-and-oncall.md
Purpose and scope
Define how teams get help, who is paged, and how incidents are resolved.
Outcomes
- Clear intake and triage.
- SLAs for capabilities and incidents.
- Faster MTTR via runbooks.
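To tell whether MTTR is actually improving, it helps to measure it the same way every time. The sketch below is one minimal way to do that, assuming MTTR means the average time from an incident being opened to being resolved. The function name `mean_time_to_resolve` and the sample timestamps are hypothetical, not taken from any tooling this manual prescribes.

```python
from datetime import datetime, timedelta


def mean_time_to_resolve(incidents):
    """Average (resolved - opened) over a list of (opened, resolved) pairs."""
    durations = [resolved - opened for opened, resolved in incidents]
    if not durations:
        raise ValueError("no resolved incidents to average")
    # Start the sum from an empty timedelta so the result stays a timedelta.
    return sum(durations, timedelta()) / len(durations)


# Hypothetical sample data: two incidents lasting 60 and 30 minutes.
incidents = [
    (datetime(2024, 1, 1, 9, 0), datetime(2024, 1, 1, 10, 0)),
    (datetime(2024, 1, 2, 14, 0), datetime(2024, 1, 2, 14, 30)),
]

mttr = mean_time_to_resolve(incidents)
print(mttr)  # 0:45:00
```

Comparing this figure before and after a runbook is introduced gives a rough indication of whether the runbook is helping.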
Signals of trouble
- Ping-pong escalations across teams.
- Unclear ownership during incidents.
- Runbooks missing or outdated.
Remediation steps
- Publish support tiers and on-call rotations.
- Maintain runbooks; practice game days.
- Integrate post-incident actions into the roadmap.
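Publishing a rotation is easier when the schedule is generated rather than maintained by hand. The following is a minimal sketch of a weekly round-robin rotation, assuming one primary engineer per week and a fixed team list. The names, the start date, and the `weekly_rotation` helper are illustrative assumptions; a real rotation would normally live in the paging tool.

```python
from datetime import date, timedelta


def weekly_rotation(engineers, start, weeks):
    """Return (week_start, engineer) pairs, cycling through engineers in order."""
    if not engineers:
        raise ValueError("at least one engineer is required")
    schedule = []
    for i in range(weeks):
        week_start = start + timedelta(weeks=i)
        # The modulo wraps back to the first engineer after the last one.
        schedule.append((week_start, engineers[i % len(engineers)]))
    return schedule


# Hypothetical team and start date (a Monday).
schedule = weekly_rotation(["alice", "bob", "carol"], date(2024, 1, 1), 4)
for week_start, engineer in schedule:
    print(week_start.isoformat(), engineer)
```

With three engineers and four weeks, the fourth week wraps back to the first engineer, which keeps the load even without manual bookkeeping.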
Checklists and assets
- playbooks/accelerating-mttr/checklist.md: incident steps.
References
- Incident policy; paging procedures.
