Don’t Panic! How to Cope Now You’re Responsible for Production

Euan Finlay, The FT

More and more developers are expected to be on-call, provide out-of-hours support, and respond to production outages. If you don’t have much experience handling incidents, that can be scary and intimidating, and can feel like being dropped in at the deep end. But it doesn’t have to be that way!

Over two years on the FT’s Content team, we’ve transformed our incident response – from a number of mildly terrifying multi-hour outages, to a stable platform where team members feel comfortable on-call.

This talk will provide practical tips and advice on:

  • setting up an incident response framework
  • what to do when Everything Is On Fire™
  • improving things afterwards
  • and some horror stories of our own…

Required audience experience

Low – aimed at engineers and teams new to supporting production services

Objective of the talk

Attendees will leave with practical ideas for setting up a standard incident framework.

I’ll cover:

  • how we used to handle incidents at the FT
  • what we did to improve
  • standard processes to follow during an incident
  • what to do afterwards to ensure problems don’t happen again

Track 1
Date: May 16, 2018
Time: 3:45 pm - 4:30 pm
Speaker: Euan Finlay, Financial Times