 
 


 

We're using Sentry to catch live errors. It has a handy integration with Jira, so the person who discovers the error can open a ticket directly from Sentry.

Once reported, the team plans, implements, and deploys the fix within the same day.

 

We use Sentry to make sure we get notified directly if something goes wrong.

What to Do?

If there is NO migration made (no db changes) in the latest deployed release:

  1. We run a container from our previous docker image.
  2. Let Nginx's load balancer do its magic to shift the traffic from the bad container to the good one.
  3. Fix the issue and make sure things work fine before deploying 😁
  4. Repeat steps 1 and 2, this time with the new (hopefully bug-free) container.
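
To make step 2 a bit more concrete, here is a rough Python sketch of the idea: requests are spread round-robin over the backends, and any backend marked unhealthy is skipped. This is only an illustration of the concept, not how Nginx implements it, and the container names and the `route_requests` helper are made up for the example.

```python
from itertools import cycle
from typing import Dict, List


def route_requests(backends: List[str], healthy: Dict[str, bool], n_requests: int) -> List[str]:
    """Assign n_requests to backends round-robin, skipping unhealthy ones."""
    # Keep only the backends currently marked healthy, preserving their order.
    live = [b for b in backends if healthy.get(b, False)]
    if not live:
        raise RuntimeError("no healthy backend available")
    picker = cycle(live)
    return [next(picker) for _ in range(n_requests)]


# Hypothetical container names: the new release is bad, the previous image is good.
backends = ["app-new-bad", "app-previous-good"]
healthy = {"app-new-bad": False, "app-previous-good": True}

# All traffic ends up on the container started from the previous image.
assignments = route_requests(backends, healthy, 4)
```

Once the fixed container is up and marked healthy, the same call would start sending traffic to it again.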

If a migration was made (db changes to existing data, which doesn't happen often):

  1. Realize that you fu**ed up this time (cuz you can't use an old container)!
  2. Fix it fast.
  3. Deploy again.
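
The two cases boil down to one decision: did the bad release include a migration? A minimal Python sketch of that rule, assuming a made-up `Release` record and `recovery_plan` helper (neither is part of any real tooling here):

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Release:
    tag: str             # hypothetical image tag, e.g. "v42"
    has_migration: bool  # True if the release changed the db schema or existing data


def recovery_plan(bad: Release, previous: Release) -> List[str]:
    """Return the recovery steps for a bad release, following the two cases above."""
    if bad.has_migration:
        # The old container may not work against the migrated db, so fix forward.
        return [
            f"fix the bug on top of {bad.tag}",
            "deploy the fixed release",
        ]
    # No db changes: the previous image is still safe to run.
    return [
        f"run a container from {previous.tag}",
        f"shift traffic from {bad.tag} to {previous.tag}",
        "fix the bug and verify it",
        "repeat the first two steps with the fixed container",
    ]


plan = recovery_plan(Release("v42", has_migration=False), Release("v41", has_migration=False))
```

The only input that matters is `has_migration` on the bad release; everything else is bookkeeping.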
 

We get notified of issues in production in several ways: Sentry, Pingdom, and our Customer Success Team.

When an issue occurs in production, we go through a predefined process:

  • Assign a Production Incident Marshal to drive the effort; this is the customer success team lead
  • Work with a product team member to investigate the issue
  • Recruit help from others when needed
  • Work towards a resolution
  • Create a Production Incident Report
  • Review the report in a Production Incident Retrospective
  • Schedule actions that came up from the retrospective

Works incredibly well.
