Over the years there have been a bunch of great talks on site reliability and incident response. Below are a few that we thought stood out (in no specific order) and are definitely worth a peek.
What is SRE? What is the difference between SRE vs DevOps?
SREcon19 Europe/Middle East/Africa - All of Our ML Ideas Are Bad (and We Should Feel Bad) by Todd Underwood (Google)
SREcon18 Americas - The History of Fire Escapes by Tanya Reilly (Squarespace)
The Lead Developer Austin 2018 - Who Destroyed Three Mile Island? by Nickolas Means (Muve Health)
Incidents as we Imagine Them Versus How They Actually Are with John Allspaw
LISA19 - What Connections Can Teach Us about Postmortems by Chastity Blackwell (Truss)
LISA19 - Earthquakes, Forest Fires, and Your Next Production Incident by Alex Hidalgo (Squarespace)
SREcon18 Europe - SRE for Good: Engineering Intersections between Operations and Social Activism by Liz Fong-Jones (Honeycomb)
SREcon18 Europe - Ethics in Computing by Theo Schlossnagle (Circonus)
SREcon19 EU - The SRE I Aspire to Be by Yaniv Aknin (Google)
02 Jul 2020