"Progress isn't made by early risers. It's made by lazy men trying to find easier ways to do something." – Robert Heinlein
One of the benefits that Scala (and other functional programming languages) have over their imperative brethren is the concept of lazy evaluation. Lazy evaluation doesn't evaluate a computation in its entirety, but computes only what is absolutely required, when and only when it's required.
The LazyList, introduced in Scala 2.13 (see Stream for 2.12 and earlier), is an immutable linked-list implementation that is evaluated lazily: elements are only computed as needed.
Consider a list from 0 to infinity. With a regular List, we couldn't declare this, as it would need to be fully evaluated, which would never terminate. But with the LazyList we can just say
val list = LazyList.from(0)
val list: scala.collection.immutable.LazyList[Int] = LazyList(<not computed>)
Note that none of this list has been evaluated. Intuitively we know that this list is non-empty, but if we try to find out we get
val empty = list.isEmpty
val empty: Boolean = false
list
val res1: scala.collection.immutable.LazyList[Int] = LazyList(0, <not computed>)
So, in attempting to determine if the list is empty, the head had to be calculated. And once calculated it now has a concrete (immutable) value. Any further attempts to get the head will return that now-computed value.
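This behaviour is easy to observe directly. Here's a small sketch (not from the original article) in which a hypothetical counter, `computed`, records how many element values have actually been evaluated:

```scala
// `computed` counts how many element values have actually been evaluated.
var computed = 0

// Mapping over an infinite LazyList evaluates nothing up front.
val squares: LazyList[Int] = LazyList.from(0).map { n =>
  computed += 1
  n * n
}

// At this point computed is still 0: declaring the list runs no code.
val firstThree = squares.take(3).toList // forces three elements; computed == 3
val again      = squares.take(3).toList // memoized: computed is still 3
```

Traversing the same prefix a second time returns the stored values; the mapping function never runs again for those positions.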
This property of the LazyList, known as memoization, is extremely useful. To see why, let's take a look at a classic example of recursion: calculating Fibonacci numbers.
Fibonacci
Consider the classic Fibonacci function, here written recursively in Scala:
def fib(n: Int): BigInt = {
if (n == 1) 0
else if (n == 2) 1
else fib(n - 1) + fib(n - 2)
}
This very simple implementation has two base cases, then recursive calls that express fib in terms of descending values until we reach a base case. But there's a big problem with this naive implementation. Calculating fib(n) will cause calls to fib(n - 1) and fib(n - 2). In turn these will generate calls to fib(n - 2), fib(n - 3), fib(n - 3), fib(n - 4) and so on. By the time we get a few levels deep, we're calculating the same values over and over again. Running the above function to calculate fib(40) on my MacBook Air takes about 4 seconds, and fib(50) almost a minute. Add to this that the calls are not tail recursive, so there is the possibility of a stack overflow, should you have the patience to await termination.
Clearly all of that is an issue for a relatively straightforward problem.
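To make the blow-up concrete, here is a sketch that instruments the naive implementation with a hypothetical counter, `calls` (not part of the original function), to tally invocations:

```scala
// Hypothetical bookkeeping: count every invocation of fib.
var calls = 0L

def fib(n: Int): BigInt = {
  calls += 1
  if (n == 1) 0
  else if (n == 2) 1
  else fib(n - 1) + fib(n - 2)
}

val result = fib(20) // 4181
// calls is now 13529: the number of invocations itself grows like the
// Fibonacci sequence, i.e. exponentially in n.
```

Even a modest input like 20 triggers over thirteen thousand calls, almost all of them recomputing values already computed elsewhere in the call tree.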
Enter memoization and LazyList
The Scala docs for LazyList give a nice one-line implementation of fib:
val fibs: LazyList[BigInt] = BigInt(0) #:: BigInt(1) #:: fibs.zip(fibs.tail).map{ n => n._1 + n._2 }
We can see the base values at positions 0 and 1 being prepended with the LazyList cons operator #::. Subsequent values are then calculated by zipping fibs with fibs.tail; the next Fibonacci number is the sum of each pair. Because the list is lazily evaluated, we can zip an indefinitely long sequence with its tail. And memoization ensures that the concrete value at any given position is only calculated once.
Using this I was able to compute in a few seconds that Fibonacci number 100,000 has 20,899 digits. Nice.
fibs.take(100000).lastOption.foreach(n => println(n.toString.length))
> 20899
None of this comes for free, however, and the trade-off, as usual, is in memory space. In the above example, a linked list of size 100,000 must exist in memory, and will continue to do so while there is a reference to the head in scope (the val fibs).
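One way to avoid pinning the whole list is to define it inside a method rather than as a top-level val, so the reference goes out of scope when the method returns. This is a hedged sketch; `fibDigits` is a hypothetical helper, not something from the Scala docs:

```scala
// Defining fibs inside the method means no long-lived reference keeps
// the list's cells alive after the call completes; they become eligible
// for garbage collection.
def fibDigits(n: Int): Int = {
  lazy val fibs: LazyList[BigInt] =
    BigInt(0) #:: BigInt(1) #:: fibs.zip(fibs.tail).map { case (a, b) => a + b }
  fibs(n - 1).toString.length // number of digits in the n-th Fibonacci number
}

val digits = fibDigits(100) // 21
```

Note that `lazy val` is needed here: a plain val inside a method can't refer to itself in its own definition.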
If you are given to trying out competitive programming, or assessment sites like TopCoder or HackerRank, then you are sure to come across dynamic programming (DP). Memoization is a technique of DP that will get you through many auto-graded assessments, and for that reason alone is worth learning. Plus it gives us a new way to look at old problems, and that's always worthwhile.
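In those settings you won't always reach for LazyList; the same idea is often written as a hand-rolled memo table over the naive recursion. A sketch, with illustrative names (`memo`, `fibMemo`) of my own choosing:

```scala
import scala.collection.mutable

// Seed the table with the two base cases, matching the indexing used
// earlier in this article (fib(1) == 0, fib(2) == 1).
val memo = mutable.Map(1 -> BigInt(0), 2 -> BigInt(1))

// Each value is computed at most once; subsequent calls are table lookups.
def fibMemo(n: Int): BigInt =
  memo.getOrElseUpdate(n, fibMemo(n - 1) + fibMemo(n - 2))

val f50 = fibMemo(50) // returns instantly, versus almost a minute naively
```

The recursion depth still grows with n, so for very large inputs you'd switch to a bottom-up loop, but for typical assessment constraints this form is usually enough.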