DEV Community

Alan Barr

Down for me?

I was trying to post and dev.to was down for me.

When you make a change, how do you know when an update is a bad break?

Have you ever had a bad deployment?

Top comments (5)

Ben Halpern

DEV crashed because of a change introduced a week ago, and we don't yet have a clear idea of why it surfaced only now. A change in the state of the data triggered a sensitive load of site-wide config, and we're still looking into what went wrong. We'll get to the bottom of it, but it's unclear at the moment.

🤪

Have you ever had a bad deployment?

To answer this question: the worst deployment I ever made shipped a change that caused an infinite loop on boot, which was problematic enough to trigger a service-wide status update on Heroku. I personally wrote code so bad that Heroku users all over the world were affected in some small way.

Alan Barr

Makes sense. I hope you all find the problem!

Ben Halpern

Thanks! We fixed the code, but still don't fully understand the problem. Once we figure it out, I'm curious whether it will seem like something we should have seen coming or something totally random.

Alan Barr

In my experience, it's not one problem but a series of changes. Second- and third-order effects cause those issues, and you must step back and apply systems thinking. Pre- and postmortems are a great way to catch those problems earlier, using testing, checklists, or a mind map to review what not to skip.

Joe Mainwaring

Have you ever had a bad deployment?

Indeed, I have. If you deploy frequently enough, you'll eventually encounter an issue that you didn't catch in your non-prod environments.

It happens. What matters is getting back up and running quickly, then identifying the problem and figuring out how to avoid repeating it. Some situations will be easy to identify and mitigate, others less so.

ProTip: Have whiskey on hand for after your prod instance has recovered.
