Dmitrii Krasnov

Posted on Apr 25 • Edited on Apr 29

Refactoring for Readability in Ruby: a Showcase

#ruby #rails #refactoring #programming

The problem

Within my 10 years of developer experience, over and over again I see an unnecessarily complex code: methods span tens of lines, contain a ton of low-level details, are hard to understand and very expensive to test. This causes programmers who come to change them due to some new requirements to prefer just adding their stuff to the mix cause, well, "I have work to do, no time for the design". This makes perfect sense: refactoring such code may take days, if not weeks. But it is also true that this makes things even worse with time.

My personal solution to this: write the code as readable as possible from the very beginning. The key skill to do this is an ability to recognize levels of abstraction in your code. Once properly recognized, they can be isolated from each other, which drastically reduces the mental effort required to understand what's going on in the code. There is a so-called "Single Level of Abstraction" (SLA) principle, popularized by Uncle Bob in his book Clean Code, that is all about that.

But, unfortunately, I neither hear much about it in the Ruby world, nor observe its application in practice. Today I had a chance to apply it again while refactoring another piece of code, and decided to write a post about it.

The Showcase

The code below is a business operation that answers the question: "What are the available appointment slots for a given practitioner within a week, starting from a given date?".

Initially, the logic was implemented as a single method within a Practitioner class. That code was working and was fully covered with tests, but is was not very readable, so I decided to refactor it.

I wanted to achieve the following:

1) The logic should be extracted from it's initial context and encapsulated in a separate class.
2) The class should follow the [slightly simplified version of the] coding style I describe in my rails code organization showcase.
3) The logic inside that class should be split into several methods that maintain a single level of abstraction.

With this showcase, I try to demonstrate the importance and the benefits of maintaining a single level of abstraction in your code. Compare the two implementations and ask yourself which you would like to work with

The Setup

The application in general solves a problem of connecting individual doctors with patients.

The part of it we're interested in here has 2 models: Practitioner & Event. Practitioner holds the doctors, Event -- the events related to the practitioner's available time. There are two types of events: "opening" and "appointment":

Events with kind = "opening" indicate that a given practitioner has marked some time window as available for appointments. The time window is always a multiple of 30 minutes, e.g. 10:00 - 11:30 or 10:00 - 10:30. 30 minutes is the size of a standard appointment slot, so a single opening event might contain multiple slots.
Events with kind = "appointment" indicate that a patient has booked a particular time slot, which is always 30 minutes.

Event model has the following fields:

practitioner_id:integer
kind:string
starts_at:datetime
ends_at:datetime

# app/models/practitioner.rb
#
class Practitioner < ApplicationRecord
  has_many :events
  has_many :openings, -> { opening }, class_name: 'Event'
  has_many :appointments, -> { appointment }, class_name: 'Event'
end

# app/models/event.rb
#
class Event < ApplicationRecord
  enum kind: { opening: 'opening', appointment: 'appointment' }
  belongs_to :practitioner
end

The business operation is being called from the controller action, so it's one of the entry points to the application logic.

The Initial Implementation

# app/controllers/api/v1/practitioners/appointment_slots_controller.rb
#
module Api
  module V1
    module Practitioners
      class AppointmentSlotsController < ApplicationController
        def index
          practitioner = Practitioner.find(params[:practitioner_id])
          slots = practitioner.available_slots(params[:date])

          render json: slots, serializer: Api::V1::AppointmentSlotsSerializer
        end
      end
    end
  end
end

# app/models/practitioner.rb
#
class Practitioner < ApplicationRecord
  # ...

  def available_slots(date)
    date = Date.parse(date)
    availabilities = 7.times.to_h { |i| [date + i.days, []] }

    week_openings = openings.where(starts_at: (date..(date + 1.week)))
    week_appointments = appointments.where(starts_at: (date..(date + 1.week)))

    week_openings.each do |opening|
      slots = []

      slot = opening.starts_at
      loop do
        slots << slot

        slot = slot + 30.minutes
        break if slot >= opening.ends_at
      end

      slots = slots.reject { |slot| week_appointments.any? { _1.starts_at == slot } }

      availabilities[opening.starts_at.to_date] += slots
    end

    availabilities
      .transform_keys { |date| date.strftime("%Y-%m-%d") }
      .transform_values { |slots| slots.sort.map { _1.strftime("%H:%M") } }
  end
end

The Refactoring process

After meditating for some time on the initial implementation, I've realized that the problem of getting available time slots might be split into 3 sub-problems:

How to get the slots for a given practitioner?
How to reject the taken slots?
How to format the result?

If I want to maintain a single level of abstraction, the main method of the refactored code should have those three steps, easily readable, and as few other stuff as possible. Let's sketch it:

def call(...)
  slots = get_slots(...)
  available_slots = reject_taken(slots)
  format_result(available_slots)
end

The other important thing to notice is that the initial implementation has the logic of the operation split between the controller action and the Practitioner#available_slots method: controller fetches the practitioner, method does the rest. People are used to writing this kind of code because all rails guides encourage it, so we often do things like this without thinking, but there actually are things to reconsider here. Knowing how to find the resource(s) the operation operates on is conceptually a part of the business operation itself, this logic might become complex with time, so in all but the trivial cases I try to put it into my business operations. Updating the sketch accordingly:

def call(practitioner_id:, ...)
  practitioner = Practitioner.find(practitioner_id)

  slots = get_slots(practitioner, ...)
  available_slots = reject_taken(slots, ...)
  format_result(available_slots, ...)
end

The last thing that substitutes the responsibilities of the operation is the input parsing. User provides the date as a string, controller extracts that from the params, but in my opinion knowing how to parse it should be the business operation's responsibility. In that particular case, it might seem arguable, but if you think of other business operations your system might have, some of them may receive more complex data as an input (e.g. a data from a Stripe callback), knowing how to parse that data is clearly a part of a domain knowledge and therefore it should not stay in the controller. Returning to our example, I choose to parse the date within a business operation cause I believe that the more unification you have in your code base -- the better. But that's a topic for a separate blog, if not the whole conference talk.

With all that said, I update the sketch for the last time:

def call(practitioner_id:, start_date:)
  practitioner = Practitioner.find(practitioner_id)
  start_date = Date.parse(start_date)

  slots = get_slots(practitioner, start_date, ...)
  available_slots = reject_taken(slots, ...)
  format_result(available_slots, ...)
end

Now that I've figured out the high-level structure of the operation and the complete set of it's responsibilities, I can start implementing it. The [personal] conventions I follow here are:

Each business operation gets it's own class named with a verb
Operations are organized into the [subdomain/resource/operation] pattern. I don't mention other subdomains in this blog so I'll skip the subdomain part and just put the code to /app/services, which is a good place for the code you don't know which subdomain to fit into yet.
Operations have the only public method, which is always #call
Operations always get their inputs as keyword arguments
#call is documented with YARD

I also try to identify the parameters hard-coded into the code and extract them into constants for visibility. In this example, I've extracted the length of the interval we fetch and the size of the slot.

The code becomes more readable when the related things are close to each other, so I've moved the fetching of openings/appointments into the low-level methods, closer to where they are being used. Sometimes this can't be done for optimization reasons, but our case is not like that.

And, finally, I try to simplify the code wherever I can, traces of which you may find in the resulting code.

While reading the resulting implementation, I encourage you to ask yourself, how do you feel about it, compared to the initial one. Do you like it? Does this code require less effort to understand? How simple it'll be to change it? What would you make differently? Let me know in the comments below

The Result

# app/controllers/api/v1/practitioners/appointment_slots_controller.rb
#
module Api
  module V1
    module Practitioners
      class AppointmentSlotsController < ApplicationController
        def index
          slots = ::Practitioners::GetAvailableSlots.new.call(
            practitioner_id: params[:practitioner_id],
            start_date: params[:date]
          )

          render json: slots, serializer: Api::V1::AppointmentSlotsSerializer
        end
      end
    end
  end
end

# app/services/practitioners/get_available_slots.rb
#
module Practitioners
  class GetAvailableSlots
    DAYS_TO_FETCH = 7
    SLOT_SIZE = 30.minutes

    # @param practitioner_id [Integer]
    # @param start_date [String]
    #
    # @return [Hash{String=>Array<String>}]
    # @raise [ActiveRecord::RecordNotFound]
    # @raise [Date::Error]
    #
    def call(practitioner_id:, start_date:)
      start_date = Date.parse(start_date)
      practitioner = Practitioner.find(practitioner_id)

      get_slots(practitioner, start_date)
        .then { reject_taken(practitioner, _1) }
        .then { format_result(_1, start_date) }
    end

    private

    def get_slots(practitioner, start_date)
      openings = practitioner.openings.where(starts_at: fetch_interval(start_date))

      openings.flat_map { split_into_slots(_1) }
    end

    def reject_taken(practitioner, slots)
      interval = ((slots.min)..(slots.max))
      appointments = practitioner.appointments.where(starts_at: interval)

      slots.reject { |slot| appointments.any? { _1.starts_at == slot } }
    end

    def format_result(slots, start_date)
      DAYS_TO_FETCH.times.to_h do |i|
        date = start_date + i.days
        day_slots = slots.select { (date...(date + 1.day)).cover?(_1) }

        [date.strftime('%Y-%m-%d'), day_slots.sort.map { _1.strftime('%H:%M') }]
      end
    end

    # Opening might stretch across multiple time slots, in this case we sptit 
    # it into chunks of 30 minutes: opening<starts_at=10:00, ends_at=11:30>
    # becomes [10:00, 10:30, 11:00]
    #
    def split_into_slots(opening)
      num_slots = (opening.ends_at - opening.starts_at).round / SLOT_SIZE
      num_slots.times.map { |i| opening.starts_at + i * SLOT_SIZE }
    end

    def fetch_interval(start_date)
      start_date..(start_date + DAYS_TO_FETCH.days)
    end
  end
end

Top comments (11)

Augusts Bautra • Apr 27

Thanks for sharing this, Dimitrii, popped up in my google feed. I'll read through on Monday, but talking about how to factor Rails code beyond MVC directories is sorely needed.

Augusts Bautra • Apr 29

OK, here are my thoughts after having read the post. Once again, thanks for sharing, Dimitrii.

May be out of scope, but the modeling of openings and appointments as pseudo-STI via :kind seems sub-optimal. Why not have standalone models - Opening and Appointment? Would be great if we could agree that openings and their one appointment start and end strictly at 30-min steps, for example 10:00, never 10:05 etc.
Good job identifying that there's a split of logic between controller and model. I try to follow "Skinny controller" paradigm (well, skinny-everything, really), so ideally a controller action merely handles HTTP and wraps some service call:

def index
  slot_lookup_outcome = 
    AppointmentSlotSecretary.call(practitioner_id: params[:practitioner_id])
    # OR
    AppointmentSlotSecretary.call(**index_params.permit!.to_h)

  if slot_lookup_outcome.success?
    render json: slot_lookup_outcome.slots, serializer: Api::V1::AppointmentSlotsSerializer
  else
    render json: slot_lookup_outcome.errors, ...
  end    
end

Where to put authorization checks VS process validity checks is a hard topic. Consider some "PublishBook" operation. Not only does it need to check whether the request-making actor has permission to publish books (or this particular book at least), but also whether the book can be published (has passed spell-check etc.). On the one hand Separation of Concerns (SoC) makes sense here in that it's easier to develop and test the operation assuming authorization was passed, the "if we got to operation logic, it was authorized" philosophy. But on the other hand leaving it in the controller oftentimes makes it hard to understand, especially if authorization is complex (it can get complex for creation operations where several records are used as "parents"). In those cases maybe a special subset of operations - Authorizators - need to be introduced.
On "Knowing how to parse [date param] should be the business operation's responsibility". I disagree. I believe uncle Bob would use the "if it were the 80s and business consisted of a dude with a phone and a spreadshseet, would he need this?" test. The answer is No. Param parsing is a side-effect of using HTTP, not business domain, so it's fine to leave it in the controller. Alternatively, you can approach this problem from Operation class API design perspective. What is better initialize(.., start_date: Date or initialize(.., start_date: String. This may vary from situation to situation. There are no solutions, only trade-offs.
I very much agree with your Operation conventions - only kwargs, only one public #call method, YARD docs. But I go a step further - move the params to the initializer. This way the instance gets to hold all the relevant data and less needs to be passed around with arguments:

def initialize(practitioner_id:, start_date:)
  @practitioner_id = practitioner_id, 
  ...
end

def call
  slots
end

private

attr_reader :practitioner_id, :start_date

def practitioner
  @practitioner ||= Practitioner.find(practitioner_id)
end

Finally, consider not using verbs in methods, often redundant. dev.to/epigene/the-three-types-of-...
Typo "you may аштв in the"

Dmitrii Krasnov • Apr 29 • Edited

Thank you very much for such a detailed feedback! Let me try to address all your points one by one

On the data model issues

Agree, the data model doesn't seem good to me as well, but I'm not the one who modeled it and I don't know all the details. Maybe they had their reasons, maybe that's only a part of the picture, who knows?

On a presentation/application boundary

I'm glad we're on the same here! I usually also split the UI responsibilities a bit further,into user-facing part (the controller action) and application-facing part (Action class). You can find an example of such a class here

On where to place authorization

Agree, authorization is a corner case. I'm still not 100% sure here, but my current thinking is that the permissions/roles are a part of the application logic and thus should stay within it, not outside. Rails ecosystem provides gems with simple mechanisms to authorize right within controllers, but having it there forces controllers to know about your application logic and system state, which creates a coupling between your presentation and application layers, which I'm not a fan of. This might be good at the start, but bad later when it comes to organizing the code into subdomains, cause controllers will know too much.

But once again, I'm not 100% sure. I've never developed a really complex authorization system, and as for the case you've mentioned, it'll be pretty easy to implement it right within the operation:

module Books
  class Publish
    # ...
    # @raise [ActiveRecord::RecordNotFound]
    # @raise [Books::NotPublishableError]
    #
    def call(author_id:, book_id:)
      author = Author.find(author_id)

      # This way we check if the book belongs to the author
      book = author.books.find(book_id) 
      raise NotPublishableError if !book.publishable?

      publish(book, author)
    end

    # ...
  end
end

Sorry, I need to board the plane, will continue later

Dmitrii Krasnov • Apr 29 • Edited

The date param.

I think you're right, it is a side effect and it should be parsed in the controller. God, such a minor thing for this example and such a lot of considerations! I've already thought about controller's responsibility to set request context (timezone), parsing webhook data being a responsibility of an infrastructure and not the application layer, complex inputs that doesn't map to ruby data types and should be parsed within the application layer, benefits of unification, benefits of simplification and delaying the decisions...

On passing params through an initializer

I understand why it seems cleaner to move the arguments to an initializer, but my decision to pass arguments through the #call method is deliberate. The main reason for this is that I reserve the initializer for dependency injection and configs. If eventually the logic in get_slots will become so complex that I'll decide to extract it into a separate class, or if the slot size will become a part of the application configuration, they'll become external dependencies of this class and I'll inject them as following:

class GetAvailableSlots
  def initialize(get_slots: Practitioners::GetAppointmentSlots.new)
    @_get_slots = get_slots
  end

  private

  attr_reader :_get_slots

  def get_slots(practitioner, start_date)
    _get_slots.call(practitioner:, start_date:, days_to_fetch:)
  end
end

class GetAvailableSlots
  def initialize(slot_length: Rails.application.config.appointment_slot_length)
    @_slot_length = slot_size.minutes
  end

  private

  attr_reader :_slot_length

  def split_into_slots(opening)
    num_slots = (opening.ends_at - opening.starts_at).round / _slot_length
    num_slots.times.map { |i| opening.starts_at + i * _slot_length }
  end
end

(I never use instance variables directly cause they are read-only and there's no mechanism in Ruby to protect the instance variable from being changed when addressed with @)

The two benefits of this approach are:

In your tests, you can easily stub all the external dependencies by passing procs. In Ruby, an object with an instance method #call acts exactly as a proc:

Rspec.describe "Practitioners::GetAvailableSlots" do
  subject(:get_available_slots) { described_class.new(get_slots: -> { [] }) }
end

This one popped up for me after using such an approach consistently for a while. Listing all the external dependencies in one place makes them more visible, which made me think more about what my code depends on and, with time, become a better programmer. I suggest you to check that talk by Tim Riley for the full picture

Augusts Bautra • Apr 30 • Edited

On passing params through an initializer, cont.d
To be frank, I think you're complexing things by drawing a distinction between "dep-injections and config" VS "real params". All that underscore-prefixing.. Embrace KISS, Power of One etc. etc. Kwargs give you all the flexibility you need.

class GetAvailableSlots
  DAYS_TO_FETCH = 7
  SLOT_SIZE = 30.minutes

  # @param practitioner_id [Integer]
  # @param start_date [String]
  # @param slot_querier [Class] one of slot querying service classes
  # @param slot_length [Integer] intended slot length, in seconds 
  def initialize(practitioner_id:, start_date:, slot_querier: GetAppointmentSlots, slot_length: SLOT_SIZE) 
    @practitioner_id = practitioner_id
    @start_date = start_date
    @slot_querier = slot_querier
    @slot_length = slot_length
  end

  # @return [Hash{String=>Array<String>}]
  # @raise [ActiveRecord::RecordNotFound]
  # @raise [Date::Error]
  def call
    slots ... 
  end

  private

  attr_reader :practitioner_id, :start_date, :starting_slots, :slot_length

  def slots
    @slots ||= slot_querier.new(practitioner: practitioner, ...).call
  end
end

And I don't follow your point about tests, init params are just as easy to specify or omit. In my example, you could even define an ad-hoc querier class to inject:

Rspec.describe "Practitioners::GetAvailableSlots", instance_name: :service do
  let(:service) { described_class.new(**options) }

  describe "#call" do
    subject(:call) { described_class.new(get_slots: -> { [] }) }

    context "when initialized with a custom slot querier class" do
      let(:options) { {practitioner_id: practitioner.id, start_date: 1.week.since.to_date, slot_querier: custom_slot_querier} }

      let(:custom_slot_querier) do
        c = Class.new(GetAppointmentSlots) do
          # slot-getting logic
        end

        stub_const("MyCustomQuerier", c)
      end

      it "returns expected slots" do
        is_expected.to eq(some_slots)
      end
    end  
  end  
end

Update: Having watched Tim's presentation, I see why the desire to have init-call param split. I cooked up a response why it's a bad idea.

Paweł Świątkowski • Apr 29

This is a great refactoring story, thanks for sharing!

I do, however, similar to @epigene, disagree with parsing the date from string inside this business class. It's a detail coming from infrastructure. The object should receive already parsed Date object.

To illustrate the point: let's assume this get called from a webhook from some service that passes the date as a UNIX timestamp. Now you need to change your business logic modelling class only because of that, but the actual logic does not change.

Max Cervantes • Apr 27

I like it when the object could operate on it's data, rather than being acted upon. From that point of view the end result is an anemic biz object

Dmitrii Krasnov • Apr 27

Thank you for your feedback. The approach I'm using is definitely not common, that's one of the reasons I'm sharing it. Same topic was popped up by a person under my announcement of that post on reddit, I'll leave the link here, feel free to join the discussion.

reddit.com/r/ruby/comments/1cd3i1l...

Paweł Świątkowski • Apr 29

I think it's more common than you think ;) I have seen similar approaches in pretty much every more mature Rails codebase I worked on.

Talking about anemic business object is a big misunderstanding what's going on in this code. It's not a business object, rather a business transaction wrapped in OOP code. Nothing wrong with that and DDD books describe this approach too.

Dmitrii Krasnov • Apr 29

I'm glad to hear that ^_^ In 80% of my past jobs the legacy code I've inherited was anything but readable

Riccardo Lucatuorto • May 3

Thanks, I usually do also this

    def call(practitioner_id:, start_date:, days_to_fetch: DAYS_TO_FETCH, slot_size: SLOT_SIZE)

I find this usefull

View full discussion (11 comments)

DEV Community