
Max Chernyak

Originally published at max.engineer

Ruby Enumerator.new(size)

Every Ruby enumerator supports count, a method that iterates over every item and returns the total.

irb> enum = Enumerator.new { |yielder|
  (1..100).each do |i|
    puts "counting item: #{i}"
    yielder << i
  end
}

irb> enum.count
counting item: 1
counting item: 2
…
counting item: 100
=> 100

However, Enumerator also has size. Except that, for an enumerator built this way, it’s just nil by default.

irb> enum.size
=> nil
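
For contrast, many enumerators returned by core methods can already report a size without iterating; it is the hand-built one that has nothing to go on. A quick sketch using only core methods (the variable names are just for illustration):

```ruby
# Enumerators returned by core methods often compute size without iterating.
array_enum = [10, 20, 30].each        # size comes from the array length
times_enum = 5.times                  # size comes from the integer
slice_enum = (1..10).each_slice(3)    # size is computed arithmetically
loop_enum  = loop                     # endless, so size is infinite

# An enumerator built with a bare Enumerator.new has no size information.
custom_enum = Enumerator.new { |yielder| yielder << :only_item }

sizes = {
  array:  array_enum.size,   # 3
  times:  times_enum.size,   # 5
  slice:  slice_enum.size,   # 4
  loop:   loop_enum.size,    # Float::INFINITY
  custom: custom_enum.size   # nil
}
p sizes
```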

A little-known feature in Ruby is that you can pass an argument to Enumerator.new to give it a shortcut “answer” to the size question.

irb> enum = Enumerator.new(100) { |yielder|
  (1..100).each do |i|
    puts "counting item: #{i}"
    yielder << i
  end
}

irb> enum.size
=> 100
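
One caveat worth keeping in mind: Ruby takes the size argument on trust. It is returned as-is and never checked against what the block actually yields, and count still iterates. A minimal sketch, where the deliberately wrong size of 5 is only there to make the point:

```ruby
iterations = 0

# Claim 5 items, but only yield 3.
enum = Enumerator.new(5) do |yielder|
  3.times do |i|
    iterations += 1
    yielder << i
  end
end

reported     = enum.size    # answered from the argument; the block never runs
ran_for_size = iterations   # still 0 at this point
actual       = enum.count   # iterates the block and counts what was yielded

puts "size=#{reported} (iterations so far: #{ran_for_size})"
puts "count=#{actual} (iterations so far: #{iterations})"
```

So it is up to you to keep the size you pass consistent with what the block yields.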

No more iterating to get the size. There’s an even lesser-known feature, though: you can pass a lambda that determines the size lazily, which is still faster than iterating. Let’s say that you’re enumerating over products in some kind of ecommerce API.

irb> api = EcommerceApi.new('connection config')
irb> enum = Enumerator.new { |yielder|
  api.products.each.with_index do |product, index|
    puts "fetching product: #{index}"
    yielder << product
  end
}
irb> enum.count
fetching product: 0
fetching product: 1
…
fetching product: 235
=> 236

Let’s say our API has a more efficient way of obtaining the count: a total_count endpoint.

irb> api = EcommerceApi.new('connection config')
irb> enum = Enumerator.new(api.products.total_count) { |yielder|
  api.products.each.with_index do |product, index|
    puts "fetching product: #{index}"
    yielder << product
  end
}
irb> enum.size
=> 236

We no longer have to iterate over products to get the total count, but notice a new problem: total_count now always runs when the enumerator is created, even if the user of our enum never calls size. Seems like a waste. Moreover, if products are later added to the API, our size will not change. Passing a lambda instead lets us make the API call only when size is requested, and always get a fresh count.

irb> api = EcommerceApi.new('connection config')
irb> enum = Enumerator.new(-> { api.products.total_count }) { |yielder|
  api.products.each.with_index do |product, index|
    puts "fetching product: #{index}"
    yielder << product
  end
}
irb> enum.size # Calls -> { api.products.total_count } lambda.
=> 236
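
Since EcommerceApi isn't something you can run locally, here is a self-contained sketch of the same idea. FakeProducts is a hypothetical stand-in invented for this example (not a real library); it counts how often total_count is called so the laziness and freshness are visible:

```ruby
# Hypothetical stand-in for the API's product collection; not a real library.
class FakeProducts
  attr_reader :count_calls

  def initialize(items)
    @items = items
    @count_calls = 0
  end

  def add(item)
    @items << item
  end

  # Pretend this hits a cheap "total_count" endpoint.
  def total_count
    @count_calls += 1
    @items.size
  end

  def each(&block)
    @items.each(&block)
  end
end

products = FakeProducts.new(%w[shirt mug])

enum = Enumerator.new(-> { products.total_count }) do |yielder|
  products.each { |product| yielder << product }
end

calls_before = products.count_calls   # 0: creating the enumerator made no call
first_size   = enum.size              # 2: the lambda runs now
products.add("poster")
second_size  = enum.size              # 3: the lambda runs again and sees the new item

puts "total_count calls before size: #{calls_before}"
puts "sizes: #{first_size}, #{second_size} (total_count calls: #{products.count_calls})"
```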

This feature also exists when using enum_for/to_enum to create the enumerator. Here you return the size from the block passed into enum_for. The block receives any additional arguments passed to enum_for.

irb> def each_number(max = 100)
  return enum_for(__method__, max) { |max| max } unless block_given?
  (1..max).each { |n| yield n }
end
irb> each_number(200).size
=> 200
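
The size block is an ordinary closure, so besides the forwarded arguments it can also read the object's own state. A small sketch, assuming a hypothetical Countdown class made up for this example (it expects a non-negative start and a positive step):

```ruby
# Hypothetical class for illustration only.
class Countdown
  def initialize(from)
    @from = from
  end

  def each_step(by = 1)
    unless block_given?
      # The size block receives `by` (forwarded by enum_for) and reads @from
      # through the closure, so the size is computed without iterating.
      return enum_for(__method__, by) { |step| (@from / step) + 1 }
    end

    @from.step(0, -by) { |n| yield n }
  end
end

enum  = Countdown.new(10).each_step(3)
size  = enum.size   # 4, computed arithmetically
items = enum.to_a   # [10, 7, 4, 1]

puts "size=#{size}, items=#{items.inspect}"
```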

P.S. I often forget how this Ruby feature works, and searching never brings up quick examples, so hopefully this article will help when in need of a quick reminder.
