Hey!
This is a great write-up, thank you for sharing!
We had some similar problems, though at a different scale, at Intercom and used a bunch of similar techniques to improve out-of-hours oncall. We also emphasized ownership, though we basically made the call to have teams oncall for their own stuff during office hours, with a shared oncall team out of hours. We also have a Rails monolith, which makes it a bit easier to share the oncall work :)
I wrote about it here if you are interested: intercom.com/blog/rapid-response-h...
Looking forward to giving your Oncall Nightmares podcast a listen :)
Thanks! Def will give your post a read over break!
Podcast was just released this morning. Hope you enjoy! podomatic.com/podcasts/oncallnight...