Good article
https://www.infoq.com/articles/staff-engineers-impact-incidents/?utm_source=notification_email&utm_campaign=notifications&utm_medium=link&utm_content=&utm_term=weekly
- Staff engineers can provide examples of – and coach teammates in – productive behaviors like transparency, admitting knowledge gaps, and questioning assumptions to help prevent incidents.
- Bolstering a supportive, inclusive engineering culture provides another layer of defense against incidents. As culture stewards, staff engineers should continually invest in psychological safety.
- Staff engineers have the skills to excel as incident commanders during outages, including coordination across workstreams, communicating with stakeholders, and preventing responder burnout.
- Staff engineers should get involved in post-mortems to raise the quality of root cause analysis and push for pragmatic action items tied to culture gaps.
- Improving the underlying cultural issues prevents more incidents than procedural gates.

Procedural gates the change slipped past:
- Testing: the change wasn’t tested in a pre-production environment to verify it worked as intended.
- Code Review: the change was approved without any questions or discussion.
- Deployment Verification: the change wasn’t verified after deployment to production to make sure it was working as expected.