DEV Community

It's Ruby, There Must Be a Better Way

Ryan Palo on October 28, 2018

I was recently doing a challenge on Exercism, on the Ruby track, and I struggled a lot, but when I ended up with a final solution, I was amazed at ...


Eugene Gilburg • Oct 28 '18

Instead of:

  @@names = POSSIBLE_NAMES.to_a.shuffle.each

  def self.forget
    @@names = POSSIBLE_NAMES.to_a.shuffle.each
  end

You can just write:

  def self.forget
    @@names = POSSIBLE_NAMES.to_a.shuffle.each
  end

  forget

Ryan Palo • Oct 28 '18

I wondered about that. That totally makes sense. Thanks!

Tom Lord • Oct 29 '18 • Edited

As yet another alternative approach... Taking either of your basic ideas (storing all names in an Array or an Enumerator, or keeping track of taken_names), you could use the regexp-examples Ruby gem that I wrote a few years back to abstract the problem of "getting a valid-formatted robot name" -- for example:

/[A-Z]{2}\d{3}/.random_example

Then, if you have different robots with other naming rules, the rest of the code would work out of the box. (And as a bonus, this regex will likely already exist in your code, to validate that the robot name is ok!)

Ryan Palo • Oct 29 '18

Hi Tom! Thanks for sharing! How does this Gem handle collisions? One of the test cases generates all 676,000 robot names, so if calling random_example provides duplicates, we run into the same "theoretical infinite time complexity" issue, calling random_example repeatedly until it provides the one robot name we haven't encountered yet.

Tom Lord • Oct 29 '18 • Edited

You could use Regexp#random_example in conjunction with the taken_names, as per your first solution in this blog post.

Or -- with the potential performance issue of storing a large array in ruby memory (as with all your other examples that use to_a!) -- you could also use Regexp#examples to end up with very similar solutions. (See the documentation.) For example:

@names = /[A-Z]{2}\d{3}/.examples(max_group_results: 26, max_results_limit: 676000).shuffle.each

...Note that this would be a bad idea if the pattern got longer, so the number of possible results was much larger. All of your examples that use to_a, just like my Regexp#examples code above, would soon freeze the system as the array length grows exponentially.

Using Regexp#random_example - similar to your original implementation - would scale fine though.

Phil Nash • Oct 28 '18

I really enjoyed reading this journey and seeing the trade offs you had to face. Thanks for sharing!

Elizabeth Jenerson • Oct 29 '18

I agree w/ Phil.

Further, I've had the same cry sessions for similar reasons: I worked hard and long to do it one way and here comes a Ruby method or something that not only does it more easily but with fewer lines of code.
🤦🏾‍♀️

Ryan Palo • Oct 29 '18

Glad you liked it!

Curtis Fenner • Oct 29 '18 • Edited

Because, each time we pop an item out, all of the items behind that item have to figure out what to do with the gap that it left.

You can fix this by swapping the element with the last element, and then popping off the last element, which is always cheap:

@@possible_names[next_index] = @@possible_names[-1]

# Drop off the last value. That's OK, because the last value just replaced
# the value-to-be-used-up.
@@possible_names.pop()

This only works because you don't actually care about the order of values in @@possible_names!
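
Put together, the swap-and-pop idea might look like the following minimal sketch. The take_random! helper and the tiny pool array are made up for illustration and are not code from the original post.

```ruby
# Remove and return a random element without shifting the rest of the array:
# overwrite the chosen slot with the last element, then pop the last slot.
# `take_random!` is a hypothetical helper name, not code from the post.
def take_random!(pool, rng = Random.new)
  return nil if pool.empty?

  index = rng.rand(pool.length)
  chosen = pool[index]
  pool[index] = pool[-1] # move the last element into the gap
  pool.pop               # dropping the last slot never shifts anything
  chosen
end

pool = %w[AA000 AA001 AA002 AA003]
drawn = Array.new(4) { take_random!(pool) }
```

When the chosen index happens to be the last one, the assignment is a no-op and the pop simply removes the chosen element, so no special case is needed.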

Another much more involved approach would be some kind of sparse Fenwick tree. That way, startup time is instantaneous, and you only use lots of memory when you've roughly used up every-other ID.

Ryan Palo • Oct 29 '18

Woah cool! I’ll look into that. Thanks for the ideas! 😃

Matt Jones • Oct 29 '18

There's another, more direct solution to the "running include? on an Array takes longer and longer as the array gets bigger" problem: use a Set.

Add require 'set' to the top of your file; the library ships with Ruby but isn't loaded by default.

Then replace @@taken_names = [] with @@taken_names = Set.new.

Under the hood, Set uses a Hash to get constant-time lookups - source. See your favorite Ruby internals book for details on this - I'm a fan of Ruby Under A Microscope.
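
As a rough sketch of how the Set swap could look in a retry loop: the random_name and fresh_name helpers below are hypothetical stand-ins, not the post's actual methods, and the sketch only addresses lookup cost, not the collision problem when almost every name is taken.

```ruby
require 'set'

# Hypothetical generator for illustration; the post has its own.
def random_name(rng)
  letters = ('A'..'Z').to_a
  format('%s%s%03d', letters.sample(random: rng), letters.sample(random: rng), rng.rand(1000))
end

# Keep drawing until we hit a name that has not been handed out yet.
def fresh_name(taken, rng = Random.new)
  loop do
    name = random_name(rng)
    # Set#add? returns nil when the element was already present, so the
    # membership check and the insert are a single hash operation.
    return name if taken.add?(name)
  end
end

taken_names = Set.new
first = fresh_name(taken_names)
second = fresh_name(taken_names)
```

With an Array, the include? check inside the loop would scan every stored name; with a Set it stays a hash lookup however many names have been taken.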

BTW, I was curious about the pitfalls of representing names as integers and then converting them - so I put together some code. Note: I've been writing a lot of Elixir lately, and the style reflects some of that.

module NameConverter
  DIGITS = ('0'..'9').to_a
  LETTERS = ('A'..'Z').to_a

  AVAILABLE = [LETTERS, LETTERS, DIGITS, DIGITS, DIGITS]
  BACKWARDS = AVAILABLE.reverse

  module_function

  def to_name(x)
    BACKWARDS.reduce(['', x]) do |(acc, n), digits|
      base = digits.length
      idx = n%base
      [acc+digits[idx], n/base]
    end.first.reverse
  end

  def from_name(name)
    AVAILABLE.reduce([name, 0]) do |(name, acc), digits|
      base = digits.length
      char = name[0]
      [name[1..-1], base*acc+digits.index(char)]
    end.last
  end
end

irb(main):117:0> NameConverter.from_name('AZ700')
=> 25700
irb(main):090:0> NameConverter.to_name(25700)
=> "AZ700"

This would make using the Set approach even more efficient, since Integer objects are 8 bytes instead of 40 bytes for Strings.

Federico Ramirez • Oct 28 '18

Pretty cool! I didn't know that about ranges :)

tleish • Nov 1 '18 • Edited

Be careful... since self.names is a class method, this implementation of @names is also class-level state (strictly speaking, an instance variable on the Robot class object rather than a @@ class variable).

So both of these store mutable state at the class level, where every caller shares it:

class Robot
    def self.names
      @@names = POSSIBLE_NAMES.to_a.shuffle.each
    end
end
Robot.names
Robot.class_variable_get(:@@names)

class Robot
    def self.names
      @names ||= POSSIBLE_NAMES.to_a.shuffle.each
    end
end
Robot.names
Robot.instance_variable_get(:@names)

In this last case, @names belongs to the Robot class object, which is itself a single shared instance (of Class).

Class variables are not 'evil' as long as you know what you are doing. A good rule of thumb is to treat a class variable as immutable. In this example, if we had 2 separate processes sharing the same robot-naming service, one could reset the names while the other is running. This would cause a bad side effect. However, simply using the class variable to store the initial full set of names and then randomizing it per instance would be safer. The class variable is then used as a constant, immutable value, while the randomization is per instance.

Also, since the built array already exists inside the enumerator, reusing the array is slightly faster than rebuilding from a range each time:

  def self.forget
    @@names = @@names.to_a.shuffle.each
  end
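
The earlier suggestion in this comment (keep the class-level data immutable and randomize per instance) might look roughly like this. NameSource, next_name, and reset! are hypothetical names chosen for the sketch, and the range is shrunk to ten names to keep it small.

```ruby
# Hypothetical sketch: the shared data is frozen and never reassigned, while
# each NameSource instance owns its own shuffled enumerator.
class NameSource
  ALL_NAMES = ('AA000'..'AA009').to_a.freeze # tiny range for the example

  def initialize(rng = Random.new)
    @names = ALL_NAMES.shuffle(random: rng).each
  end

  # Raises StopIteration once this instance has handed out every name.
  def next_name
    @names.next
  end

  # Reshuffles only this instance; other instances keep their own order.
  def reset!(rng = Random.new)
    @names = ALL_NAMES.shuffle(random: rng).each
  end
end

first = NameSource.new
second = NameSource.new
first.next_name
second.reset! # does not disturb `first`
```

Array#shuffle returns a new array, so the frozen constant is never modified; resetting one source cannot affect another.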

Nick Cinger • Oct 28 '18 • Edited

Not even a Ruby dev, but this was a good read. Interesting challenge and nifty solution :D

Ryan Palo • Oct 28 '18

Thanks! Yeah, when I first read through the challenge, I didn’t think it was going to be that hard. I was wrong.

Ben Halpern • Oct 28 '18

I think Ruby is fundamentally interesting, simple and fun. If it weren't for Rails it would still be known as more of a delightful hobbyist language.

Ryan Palo • Oct 28 '18

Yeah, and a pretty powerful scripting/automation language too, for server-side stuff. A lot more readable and maintainable than Bash, a lot of times.

Jordan Running • Nov 2 '18 • Edited

This is a fun challenge. I wanted to see if I could figure out a way to do this without shuffle. The solution I settled on after much googling is to use a linear congruential generator algorithm, which is a pseudorandom number generator with the property that, given the right inputs, its output is a sequence of all integers in a range (0 .. m-1) in pseudorandom order with no repeats (until it exhausts the range, whereupon it begins the sequence again). In other words, it can be used as a "lazy shuffle" algorithm. Here's the implementation I came up with, as a Ruby Enumerator:

The "increment" value C could have been any number coprime with M; I chose 77777 purely for aesthetics.

Note that if we changed M.times do to loop do, the sequence would cycle infinitely.
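
The embedded code is not reproduced in this thread, so here is an independent minimal sketch of a full-period linear congruential generator as an Enumerator, not the commenter's implementation. The modulus (676,000 names) and the increment 77777 come from the discussion; the multiplier 4681, the seed handling, and the lazy_shuffle name are assumptions made for this example.

```ruby
# Lazy shuffle via a linear congruential generator: x -> (a * x + c) % m.
# m and c mirror the M and C mentioned in the comment; the multiplier `a` is
# an assumed value. The sequence visits every integer in 0...m exactly once
# when c is coprime with m and (a - 1) is divisible by every prime factor of
# m (2, 5 and 13 here) and by 4, since m is a multiple of 4.
def lazy_shuffle(seed, m: 26 * 26 * 1000, a: 4_681, c: 77_777)
  Enumerator.new do |yielder|
    x = seed % m
    m.times do
      yielder << x
      x = (a * x + c) % m
    end
  end
end

ids = lazy_shuffle(12_345)
first_three = ids.first(3) # only three values are ever computed
```

Nothing is stored up front: each value is derived from the previous one on demand, so startup cost and memory stay constant regardless of how many names exist.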

Once you have this, all that's left is to format each number as a robot name, and then some bookkeeping for the forget! and reset! methods. Here's the final result:

Jordan Running • Nov 2 '18

Just for kicks, here's the same thing in JavaScript: beta.observablehq.com/@jrunning/ro... (Too bad dev.to doesn't embed Observable notebooks.)