DEV Community

Jess Lee

Posted on

How does your team handle 'on call' evening/weekend hours?

As Ben pointed out in his post, "I've been de facto on call 24 hours a day since starting dev.to but this weekend I'm going camping", we clearly don't have a fair schedule for handling late-night or weekend emergencies yet.

I know some teams rotate weekend duty and get a weekday off here or there, but I'd really like some real-world recommendations on what to do or not to do.

Top comments (16)

donbavand profile image
Daniel Donbavand • Edited

Our team runs a Tuesday-to-Tuesday weekly schedule with a primary and a secondary on-call person. We have around 8 people on call, so we get a 2-week break in between.

If we have something important on while on-call, other team members have been great at picking those days up.

It's also important to add, we are paid for being on call, whether we get a call or not.

If someone is on vacation, they are on vacation, no on-call duty for them :)

rapidnerd profile image
George

Well, I work remotely. It really depends on what the actual issue is, as our teams are split into different subcategories based on what we do. Generally, if a machine fails and stops sending data, we all get a phone notification saying it's gone down. If no one has responded after 10 minutes and more machines start to go down, an alarm goes off on my phone that screams an alert in my face. If we can't be available, we're able to set it so it won't alert us at all, but only if we're not near our computer.

ivanfabrynugraha profile image
Ivan Febriansyah Hadi Nugraha • Edited

What kind of team do you have? In my office, critical moments like that might happen when the server is down or when bugs keep a site from running properly. Usually, all members of the team are on hand when things like that happen. And when someone is on vacation, there are always other team members willing to cover for them.

We believe that a vacation is a necessity and makes someone more productive, so when members of our team are on vacation, we avoid disturbing them as much as possible. So take your 'me time' :)

sheyd profile image
Sena Heydari • Edited

Here are a few recommendations that have worked well in environments I've worked in:

  • Big +1 on what @val_baca said below. Determine alternate weeks and definitely have an on-call calendar that everyone's dates and rotations are on.
  • Documentation of critical systems and of the troubleshooting steps to get things back to a good working state is key. This has saved hours of troubleshooting and downtime across multiple teams I've been on. Equally beneficial, the person who's out can relax a lot more knowing things are being handled in a somewhat organized fashion in case of emergency, and they won't come back to a worse fire than the original one.
  • List out the major holidays/travel times of the year that people usually want to be with friends and family, e.g. Christmas, New Year's, etc. Figure out who's going to cover each one. It's often the worst feeling when everyone on a team has unchangeable plans over a long weekend and someone gets stuck with being on-call at the last minute.
  • Track on-call coverage data and adjust accordingly. Equitable scheduling should ideally balance out all incidents, but if a team deploys heavily at the beginning of every month, and the same person is always on-call during that time, chances are they'll get a lot more calls during off-hours than everyone else in the rotation. Assess who's getting "on-call" burn-out, and swap around the responsibilities for a few weeks to help them recover.
 
monknomo profile image
Gunnar Gissel

My team had a frank discussion with management about overtime, and management decided that it could afford 10 hours of overtime/dev/year. This amounts to "we fix problems during regular business hours".

That's ok for us, because we're the government and our public-facing software generally serves a group of people who have 10 days to get us the information. Assuming they don't wait until day 9 on a Saturday, slow weekend response is ok.

We do pay for a 24 hour help desk with a binder full of answers to common questions that people can call, which reduces the need for devs.

We talked about what we would do if 24/7, 365 responsiveness ever became necessary. Our feeling was that having enough devs to run 3 shifts, à la real 24-hour factory operations, was probably the optimal solution, but failing the budget for that, rotating 2 weeks on, 2 weeks off on-call duty seemed next best.

val_baca profile image
Valentin Baca

Every dev goes on call 24/7 for one week.

Every dev is on call one week every N weeks, where N = number of devs on the team.

Exceptions are made as necessary, usually just one-day swaps but occasionally full-week swaps are necessary.
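
A rotation like this can be sketched in a few lines of Python. This is only an illustration of the "one week every N weeks, with occasional one-day swaps" idea described above, not anyone's actual tooling. The names, the start date, and the `on_call_for` helper are all hypothetical.

```python
from datetime import date

def on_call_for(day, devs, rotation_start, swaps=None):
    """Return who is on call on `day` under a one-week-per-dev rotation.

    `swaps` is an optional {date: name} mapping for one-day exceptions.
    """
    swaps = swaps or {}
    if day in swaps:
        # A one-day swap overrides the regular rotation.
        return swaps[day]
    weeks_elapsed = (day - rotation_start).days // 7
    # Each dev comes up once every len(devs) weeks.
    return devs[weeks_elapsed % len(devs)]

devs = ["ana", "ben", "chi"]          # hypothetical team, so N = 3
start = date(2024, 1, 1)              # hypothetical first day of the rotation
swaps = {date(2024, 1, 10): "chi"}    # chi covers one day of ben's week

print(on_call_for(date(2024, 1, 3), devs, start, swaps))   # ana
print(on_call_for(date(2024, 1, 10), devs, start, swaps))  # chi (swap)
print(on_call_for(date(2024, 1, 22), devs, start, swaps))  # ana (week 4)
```

A full-week swap can be expressed as seven entries in `swaps`, or by reordering `devs` if the change is permanent.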

vgrovestine profile image
Vincent Grovestine

Managers at my workplace are on a rotating weekly schedule, with just one individual on call after hours. The person holding the on-call cell phone is responsible for triaging issues. If something can wait until the next business day, a normal support ticket is created on behalf of the caller with the incident particulars. On the other hand, if the matter is a true emergency, our internal knowledge base contains a series of "who to call if X happens" lists to get the correct non-management people in play.

monknomo profile image
Gunnar Gissel

The call tree with "who to call if X happens" is very helpful for incident response!

anthonydelgado profile image
Anthony Delgado

At one of my last companies we used to get an extra day's pay in our paychecks every month, and we would rotate being "on call": every weekend it would be someone else's turn. At my last company we used to trade weekend work days for PTO (paid time off), so if you worked weekends for a few weeks straight you could save up days and get an extra vacation. One of our team members took like a month off with saved-up PTO.

agrothe profile image
Andrew Grothe

I'm currently on a team that supports a critical government app responsible for people's pay cheques. We have a phone that the on-call person carries; it only gets calls from the help desk, and only when the help desk itself receives a call. We do two weeks on and have anywhere from 2 to 5 people on rotation depending on the current team size. There is extra pay for "standby" and also hourly payment for doing work outside of business hours. It works well and all the devs are on board.

samcj profile image
Sam Johnson

Same as many here. Our on-call roster is a one-week shift as secondary, followed by a one-week shift as primary, followed by a week off. It is really easy to manage via PagerDuty. They have calendar integrations and an easy way to manage overrides for particular days. It is important to have levels of on-call so that people know that in an emergency there are other people to rely on as well.

We all switch around weekends a lot when people have different events to go to. I think it's best to allow flexibility but keep a solid schedule.

Also, pay people for the days they are on-call. These are extra hours that they have to work and they deserve compensation for it. It also makes it easier to switch out days since there is an incentive.

k4ml profile image
Kamal Mustafa

We started our on-call shifts just about 2 weeks ago. Previously, I didn't agree with having on-call shifts, preferring a "mutual responsibility" approach. But now I've started to feel that approach is bad for everyone, as it means no one can ever "disconnect" from work. So we discussed what the on-call shift should look like: 2 days per shift, a one-week shift, etc. We're a team of 10, but only 5 are being put on call for now. In the end we decided on 2-day shifts, as 1 week seemed too long for one person to take on.

We have a Python script that prints the on-call roster for the following week, but we plan to generate the schedule one month ahead, as it seems easier to plan your days when you know which days you'll be on call that month.
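
For anyone curious what such a script might look like, here is a minimal sketch. It is not the actual script mentioned above: the `build_roster` function, the placeholder names, the start date, and the 28-day horizon are all assumptions, with 2-day shifts cycling through 5 people.

```python
from datetime import date, timedelta

def build_roster(people, first_day, days_ahead=28, shift_days=2):
    """Return a list of (shift_start, shift_end, person) tuples.

    Shifts are `shift_days` long and assigned round-robin.
    """
    roster = []
    for i, offset in enumerate(range(0, days_ahead, shift_days)):
        shift_start = first_day + timedelta(days=offset)
        shift_end = shift_start + timedelta(days=shift_days - 1)
        roster.append((shift_start, shift_end, people[i % len(people)]))
    return roster

people = ["p1", "p2", "p3", "p4", "p5"]   # placeholder names
for shift_start, shift_end, person in build_roster(people, date(2024, 1, 1)):
    print(f"{shift_start:%a %d %b} - {shift_end:%a %d %b}: {person}")
```

With 5 people and 2-day shifts, each person comes up every 10 days, so the on-call days drift through the week rather than always landing on the same weekdays.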

P.S. I'm currently on call, that's why I came here 😁

tcratius profile image
Me, myself, and Irenne

I mainly sleep during my shift, mainly because I don't have work lol

aghost7 profile image
Jonathan Boudreau

We have no real system for this at my work. It's whoever is available, willing, and most capable that usually gets called.

laoman profile image
dude*

There's no magic recipe, I guess; the trick is to have a big team working on rotation. There have been times when I supported two technologies, handling issues for both at the same time 😏

rhymes profile image
rhymes

I don't have anything to add to the conversation, just wanted to point you in the direction of this publication which has an entire issue on "on call": increment.com/on-call/