The DEV Team

Call for Contributions: Move All DelayedJobs to Sidekiq

Molly Struve (she/her) ・1 min read

Hey everyone!!! 👋

Now that we have a smokin fast Redis instance going we are moving all of our background jobs from DelayedJob to Sidekiq! Click on the Issue linked below to learn more!

There are lots of details in there about why we are moving to Sidekiq, how we plan to do it, a guideline for helping out, and even a few example PRs to get you going. I think this is a GREAT opportunity for anyone looking to dip their feet in the open-source waters. Let me know if you have any questions!

Happy coding 😃

Epic: Moving Jobs From DelayedJob to Sidekiq #5305

Why We Are Moving To Sidekiq

Because DelayedJob relies on your database to process jobs, it can be very slow when you get a large volume of jobs. For this reason we are switching to Sidekiq, to help ensure that we can keep up as our job volume increases. Also, because DelayedJob didn't handle lots of jobs well, we have been ignoring any job failures in production, which is not great when you are trying to ensure reliability. On Sidekiq we won't have to worry about allowing jobs to fail, because Sidekiq is built to handle a lot of jobs very quickly thanks to the help of Redis.

Sidekiq also has A LOT of very nice features and plugins that we can use as our job/worker flow becomes more complex. If you want to know more about why Sidekiq is a great background job tool, check out this blog post I wrote about moving to it from Resque.

Why not hook Sidekiq directly into ActiveJob? Sidekiq can hook into ActiveJob, but there are some downsides. Notably, it is 2-20x slower. Another big downside is that ActiveJob is not compatible with Sidekiq's commercial features, which we may want to make use of in the future. Sending emails through ActiveJob is about the only thing I plan to tie Sidekiq into, because those jobs are simple. For any other job, it would be advantageous to have full control over it by moving it completely to Sidekiq and keeping the interface as simple as possible.

Strategy For Moving Jobs

Same as we did for Redis keys, we are going to be moving jobs over one by one. When a job moves from DJ (DelayedJob) to Sidekiq, we are going to rename it so that its name ends with Worker. For example: BustMultipleCachesJob => BustMultipleCachesWorker. These new workers will be stored in a separate folder, away from the jobs, which will help us quickly evaluate what still needs to be done.

How You Can Help!

There are a lot of jobs that need to be moved which is why I would LOVE help from the community doing this! Grab a job, move it to Sidekiq and then open a PR with this issue tagged.

When Moving a Job

When you are moving a job there are 2 ways you can do it:

  1. Create a new Worker class and delete the job all in the same PR: This is 100% fine for those jobs that do not run a lot. For example, if we run a job from the scheduler once a day, that is one we could roll over in a single PR.
  2. Create a new Worker, open a PR. Once that PR is merged, open a second PR to remove the old Job: This is what we need to do for jobs that are executed a lot. Think indexing or cache busting jobs. By merging the PRs separately we can ensure we don't strand any jobs that might be inflight when the new worker is created.

What queue should I put it in? Look at how often the job is executed and how important that job is. For example, jobs kicked off from the UI that are updating data are ones we want to run very quickly so the user gets the data they want. A scheduled job handing out badges is not as important and can be delayed a little bit, so it can go on the low_priority queue. The Sidekiq queues we have are:

:queues:
  - ["default", 1]
  - ["low_priority", 10]
  - ["medium_priority", 100]
  - ["high_priority", 1000]
  - ["scheduler", 1000]
  - ["mailers", 1000]

The numbers on the right indicate the queue importance. The higher the number, the more important the queue is and the faster those jobs will be executed. The numbers are actually a ratio: in the event that every queue is full, we will execute 1000 high_priority jobs for every 100 medium_priority jobs, every 10 low_priority jobs, etc.
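To make the ratio concrete, here is a minimal plain-Ruby sketch (no Sidekiq required) that converts the weights above into each queue's approximate share of the work when every queue has jobs waiting. The `QUEUE_WEIGHTS` constant and the `queue_shares` helper are hypothetical names made up for illustration; Sidekiq does its own weighted queue selection internally, so treat this as arithmetic on the config rather than a picture of how Sidekiq is implemented.

```ruby
# Weights copied from the :queues: config above.
QUEUE_WEIGHTS = {
  "default" => 1,
  "low_priority" => 10,
  "medium_priority" => 100,
  "high_priority" => 1000,
  "scheduler" => 1000,
  "mailers" => 1000
}.freeze

# Hypothetical helper: returns each queue's fraction of the total weight.
def queue_shares(weights)
  total = weights.values.sum.to_f
  weights.transform_values { |weight| weight / total }
end

shares = queue_shares(QUEUE_WEIGHTS)

# Print one line per queue with its share as a percentage.
shares.each do |queue, share|
  puts format("%-16s %6.2f%%", queue, share * 100)
end
```

With these weights, high_priority, scheduler, and mailers each end up with a bit under a third of the total, while default gets only a tiny sliver.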

When you are moving jobs PLEASE feel free to take a look at the job itself and improve on either performance or how it is being executed. For example:

  • Add TESTS for the new worker even if the old job didn't have them. Also add this new shared_context example spec for ensuring the correct queue is set for each job. Other test helpers have also been added that you can use thanks to @rhymes
  • Add early return statements when data is missing to avoid unnecessary work
  • Is the job silently failing? There may be times we don't want jobs to silently fail anymore
  • Are the arguments objects? We DO NOT want to be passing any objects as arguments. Simple values are our goal, so take this time to change those to strings, ids, whatever.
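As a rough illustration of the last few points, here is a plain-Ruby sketch of a worker that takes an id instead of an object and returns early when the record is missing. Everything in it is an assumption for the sake of the example: `ARTICLE_STORE` is an in-memory stand-in for a database lookup, the scoring formula is made up, and `Articles::ScoreCalcWorker` is just the renamed form of score_calc_job, not the code from the actual PR. In the app, the class would also `include Sidekiq::Worker` and choose its queue with `sidekiq_options`.

```ruby
# Hypothetical in-memory stand-in for the articles table, so the sketch runs
# without Rails. In the app this would be a model lookup by id.
ARTICLE_STORE = {
  1 => { title: "Moving to Sidekiq", reactions: 12, score: 0 }
}

module Articles
  # Hypothetical worker. In the app it would also `include Sidekiq::Worker`
  # and set a queue, e.g. `sidekiq_options queue: :medium_priority`.
  class ScoreCalcWorker
    # Takes a plain id rather than an Article object.
    def perform(article_id)
      article = ARTICLE_STORE[article_id]
      return unless article # early return: record is missing, nothing to do

      # Made-up scoring formula, for illustration only.
      article[:score] = article[:reactions] * 5
    end
  end
end

Articles::ScoreCalcWorker.new.perform(1)   # updates the stored score
Articles::ScoreCalcWorker.new.perform(999) # no record, so this is a no-op
```

With Sidekiq included, the caller would enqueue the job with something like `Articles::ScoreCalcWorker.perform_async(article.id)`, so only the id is serialized into Redis.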

Example PRs

Single PR Job Move: https://github.com/thepracticaldev/dev.to/pull/5269
Multiple PR Job Move: PR 1 - https://github.com/thepracticaldev/dev.to/pull/5302

Jobs TODO

Let me know what you want to work on!!!

Articles:

  • [x] bust_multiple_caches_job
  • [x] detect_human_language_job.rb
  • [x] score_calc_job.rb
  • [x] update_analytics_job.rb
  • [x] update_main_image_background_hex_job.rb

Audit

  • [x] save_to_persistent_storage_job.rb

Badge Achievements

  • [x] send_email_notification_job.rb

ChatChannels

  • [x] index_job.rb

Classified

  • [x] bust_cache_job.rb

Comments

  • [x] bust_cache_job.rb
  • [x] calculate_score_job.rb
  • [x] create_first_reaction_job.rb
  • [x] create_id_code_job.rb
  • [x] send_email_notification_job.rb
  • [x] touch_user_job.rb

Events

  • [x] bust_cache_job.rb

Follows

  • [x] create_chat_channel_job.rb
  • [x] send_email_notification_job.rb
  • [x] touch_follower_job.rb

Mentions

  • [x] create_all_job.rb
  • [x] send_email_notifications_job.rb

Notifications

  • [x] mention_job.rb
  • [x] new_badge_achievement_job.rb
  • [x] new_reaction_job.rb
  • [x] remove_all_job.rb
  • [x] welcome_notification_job.rb
  • [x] milestone_job.rb
  • [x] new_comment_job.rb
  • [x] notifiable_action_job.rb
  • [x] tag_adjustment_notification_job.rb
  • [x] moderation_notification_job.rb
  • [x] new_follower_job.rb
  • [x] remove_all_by_action_job.rb
  • [x] update_job.rb

Orgs

  • [x] bust_cache_job

Pages

  • [x] bust_cache_job

Podcast Eps

  • [x] bust_cache_job
  • [x] create_job
  • [x] update_reactable_job

Podcasts

  • [x] bust_cache_job
  • [x] get_episodes_job

Pro Memberships

  • [x] popular_history_job

Reactions

  • [x] bust_homepage_cache_job.rb
  • [x] bust_reactable_cache_job.rb
  • [x] create_job.rb
  • [x] update_reactable_job.rb

Search

  • [x] index_job
  • [x] remove_from_index_job

Streams

  • [x] twitch_webhook..._job

Tags

  • [x] bust_cache_job

Users

  • [x] bust_cache_job.rb
  • [x] follow_job.rb
  • [x] self_delete_job.rb
  • [x] touch_job.rb
  • [x] estimate_default_language_job.rb
  • [x] resave_articles_job.rb
  • [x] subscribe_to_mailchimp_newsletter_job.rb

Webhook

  • [x] destroy_job
  • [x] dispatch_event_job

Other

  • [x] html_variant_trial_create_job.rb
  • [x] html_variant_success_create_job.rb
  • [x] export_content_job.rb
  • [x] slack_bot_ping_job.rb
  • [x] rss_reader_fetch_user_job.rb

Discussion

Peter Denham

At my $dayjob we hit similar scaling issues, but due to the fact that we had built and depend on several extensions around delayed job, we decided to bring a redis cache layer to delayed job instead of moving to Sidekiq. While it's not something we could open source right away, you can read a bit about our implementation here: salsify.com/blog/engineering/job-pump

Molly Struve (she/her) Author

Implementing Redis with DelayedJob, impressive! Thanks for sharing!

Edward Tam

Side-tracked, but congrats running into volume issues. The site is doing well and this is the "good" type of issues we all like to have :)