This is a deep dive into an interview question that explores several solutions, how you might get from one solution to another, and how you might weigh up which solution is right in which circumstance.
To provide some background about how this article came about, every week cassidoo puts out a newsletter that has some great content in it. A regular feature that I enjoy is the interview question (btw, a place that hits you with a weird question like this in your interview is maybe not interviewing in the best way, but its still pretty common and the idea of an 'interview question' makes sense to folks familiar with it so we're probably stuck with the concept for some time to come).
I use these little problems to practice a language I'm trying to learn, which at the moment is Scala3 (previously known as dotty, a language that is still a work in progress to some extent and whose tooling is not fully developed). In my code snippets in this article, I've tried to avoid using confusing idioms and syntax, so even if you're not familiar with Scala, you can probably still get the gist.
This week the question was
Given a positive integer n, write a function that finds the number of zeros at the end of n! in base 10.
I'm going to talk about 4 different solutions to this (well, 6 really), 3 that make sense in different contexts, and a fourth that is basically silly, but which was the reason I decided to write this all out.
Version 0
This version is correct and straightforward.
def factorial(n: Int): BigInt =
(1 to n).foldLeft(BigInt(1))((a, b) => a * b)
def numberTrailingZeros(str: String): Int =
str.split("[1-9]").last.length
def numberTrailingZerosOfFactorial0(n: Int): Int =
numberTrailingZeros(factorial(n).toString)
We create a function to calculate the factorial, another to split off and count the zeros at the end of a string of digits, and apply the functions in that order, using a conversion to string so the types are right.
Notice that factorial
maps from an Int to a BigInt. Scala has 3 integer types, Int, Long and BigInt. factorial(20)
is the largest factorial you can represent using Long, so we should use the theoretically limitless BigInt. This brings up the next point.
Factorial values get really huge really fast. Calculating them gets very computationally intense. And all we want is the number of zeros at the end. Is there a way to skip the calculation of the factorial, and still work out how many zeros there are?
Version 1
You can do a little analysis of the question before you dive in to try and answer it. Firstly, the number of zeros at the end of a base 10 number is the number of times 10 can divide the number without remainder, so you could skip the string conversion, and test how many times you can divide the factorial by 10, which is a slight improvement, but there's more to discover.
Even if you're not great at maths, you can look at the values you get applying this function you've made, and notice they go up or stay the same as the number you're inputting increases. You might find yourself wondering when do they stay the same, and when do they go up. Some exploration shows you they increase whenever 5 divides the number, that is
numberTrailingZerosOfFactorial0(5*n - 1) < numberTrailingZerosOfFactorial0(5*n)
is always true.
But sometimes it doesn't go up by just 1.
numberTrailingZerosOfFactorial0(24) == 4
numberTrailingZerosOfFactorial0(25) == 6
What's going on? If 10 divides a number (evenly, without remainder) so do 2 and 5, and if 2 and 5 divide a number, so does 10. If you think about how we construct the factorial, you can see that there are always more 2s dividing it than 5s. So we need to count how many times 5 divides the factorial, because that will be the same as the number of trailing zeros. When we go from factorial(24) to factorial(25), we are multiplying by 5 twice, so it will have two more zeros at the end of its decimal representation.
We might make a function
def timesFiveDivides(n: BigInt): BigInt =
if (n % 5 == 0 )
1 + timesFiveDivides(n/5)
else
0
def numberTrailingZerosOfFactorial01(n: Int): BigInt =
timesFiveDivides(factorial(n))
And we could check that it's returning the same values as our first version. But we're still calculating the factorial. That's not great, so let's call this v0.1.
Because 5 is prime, we can note that the sum of the number of times 5 divides the multiplicands is the number of times 5 divides the product. That is,
def timesFiveDividesFactorial(n: Int): BigInt =
(1 to n).map(i => timesFiveDivides(i)).sum
val numberTrailingZerosOfFactorial10: Function[Int, BigInt] = timesFiveDividesFactorial
And finally, we're not calculating the factorial. Worthy of a v1.0. But, is this the best we can do? Not really.
Notice that the types are kinda weird. The arguments in (a to b)
are necessarily Int
. (a to b)
is a lazy collection, by which I mean Scala represents it as an object that will tell you what is next in the collection, or if nothing is left in there, but it never creates the whole collection in memory, so its memory footprint is small. But it does kind of create a todo list, and Scala knows it can't deal with one that's absurdly large, so it restricts the types of the arguments to its smallest integer class, Int. Here, the todo list is as big as the initial input, so we're taking that many steps in our calculation.
We've stopped ourselves from calculating the factorial. Can we avoid taking so many steps?
Have a look at this intermediate value in our calculation of timesFiveDividesFactorial
.
(1 to n).map(timesFiveDivides(_))
If you plug in, say, 32 for n, you get
Vector(0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 2, 0, 0, 0, 0, 1, 0, 0)
There are some pretty easy to see patterns, the 4 zeros then the non zero value, and those non zero values are 4 ones then a two. You could experiment and find out that for inputs above 125 its four twos and then a three, and in between those twos and threes you get identical patterns of zeros and ones, all 24 long and identical to the first 24 terms in our sequence.
We'll be talking about these sequences a lot in what follows. Summing the values of the sequence of length n accomplishes the same thing as counting the zeros at the end of factorial(n), so if we can understand these sequences and their sums, we might find better algorithms for our problem.
In fact, these sequences have a self similarity to them. By that I mean, if you take one of these sequences, of length n say, filter out the 0s and subtract one from everything left over, you get the sequence for n/5. You might want to try it out with your favourite language, this is key to understanding our next solution.
Using that self similarity we can write the following.
def betterTimesFiveDividesFactorial(n: BigInt): BigInt =
val next = n / 5
if (next == 0)
0
else
next + betterTimesFiveDividesFactorial(next)
val numberTrailingZerosOfFactorial11: Function[BigInt, BigInt] = betterTimesFiveDividesFactorial
This is very simple as code, we're not creating large collections and we're certainly not performing large integer products. It will return after ceiling(log5(n)) steps, and each step is a comparison, an integer division and an addition. In a lot of ways this is what I would call the reference solution to this problem. Its the best solution in the majority of cases.
Version 2
What if you were going to be counting the zeros at the end of factorials for millions of people, hundreds of times a day each, over and over again? What if it was worth your time to find a faster algorithm?
Have a look at the sequence for 573. The highest number in the sequence is 3. It occurs 4 times, at places 125, 250, 375 and 500. The first 125 values in the sequence are repeated in the next 125, the 125 after that, and the 125 after that. Whats left has two 2s, at places 525 and 550, and the entries from 501 to 525 are identical to those from 526 to 550, and to those from 1 to 25. Then 4 copies of 0, 0, 0, 0, 1
, which is how our sequence starts, and finally 3 zeros.
What's more, those first 125, 25 or 5 entries are the same in every sequence that's long enough to contain them, so if we know the sum of their values, we could use those sums to get to the number of divisors.
Using these insights we can do the following.
case class Power5(index: Int, power: BigInt, sum: BigInt) {
def next: Power5 = Power5(index + 1, power * 5, power + sum)
}
object PowersOfFive {
var cache = collection.mutable.Map[Int, Power5](0 -> Power5(0, BigInt(1), BigInt(0)))
var highestPower: Int = 0
def get(n: Int): Power5 =
if (n <= highestPower)
cache(n)
else
cacheMore(cache(highestPower), n)
@tailrec
def cacheMore(pow: Power5, n: Int): Power5 =
if (pow.index == n)
pow
else
val newPow = pow.next
cache(newPow.index) = newPow
highestPower += 1
cacheMore(newPow, n)
}
val log5 = math.log(5)
def logBase5(n: BigInt): Int =
math.ceil(math.log(n.toFloat)/log5).toInt
def ctfdf(n: BigInt, pow: Power5): BigInt =
pow match {
case Power5(i, power, sum) =>
if (n < 5)
0
else if (power > n)
ctfdf(n, PowersOfFive.get(i-1))
else
sum + ctfdf(n - power, pow)
}
def cachedTimesFiveDividesFactorial(n: BigInt): BigInt =
ctfdf(n, PowersOfFive.get(logBase5(n)))
val numberTrailingZerosOfFactorial2: Function[BigInt, BigInt] = cachedTimesFiveDividesFactorial
There's a lot going on there. An instance of Power5 stores
-
index
- the power we're raising 5 to -
power
- the power of 5 itself -
sum
- and the times 5 divides the factorial of that power which is the sum of the firstpower
terms of the sequence. It also has a method,next
, to make the next Power5 instance for the next index.
The object PowersOfFive is a singleton that is caching all those Power5 instances, and storing them in a map, so we only need to calculate those once. It's worth noticing that we don't do it in a thread safe way, which would be a problem in a production environment.
Then we have cachedTimesFiveDividesFactorial
which uses the helper function ctfdf
. It finds the biggest power of 5 less than n, then pulls up the Power5 data we need, and starts using that to sum up the terms in the sequence.
Is it much faster? I ran a pretty unscientific test, and for all that extra, confusing code that won't work in a production environment, we only went about 50% faster. Not super worth it.
Version 3
Because I'm a recovering mathematician, I noticed something odd.
Power5(0,1,0)
Power5(1,5,1)
Power5(2,25,6)
Power5(3,125,31)
Power5(4,625,156)
Look at the power and the sum (the second and third entries in the triple): power == sum * 4 + 1
is always true. And if you know a bit about geometric series you can see how to prove that.
So, we have a calculation that is blindingly fast, and is right for all the powers of 5.
def approxTimesFiveDividesFactorial(n: BigInt): BigInt =
(n-1)/4
But it starts getting stuff wrong pretty quickly. It thinks factorial(9)
will end with 2 zeros, its actually 362880, which only has 1. But then its right again for a while, but guessing high by 1 on 13, 14, 17, 18, 19, 21, 22, 23, 24, before getting it right again at 25.
But if you're looking for patterns, there's plenty to see. Its right on the multiples of 5. Its wrong a lot when its one less than a multiple of 5. The wrong streaks build up, initially just a one off, then two together, then three, then 4.
We can get a bit more info by running
(26 to 50).map( i =>
(i,
approxTimesFiveDividesFactorial(i) -
betterTimesFiveDividesFactorial(i))
)
which yields
Vector((26,0), (27,0), (28,0), (29,1), (30,0),
(31,0), (32,0), (33,1), (34,1), (35,0),
(36,0), (37,1), (38,1), (39,1), (40,0),
(41,1), (42,1), (43,1), (44,1), (45,1),
(46,1), (47,1), (48,1), (49,2), (50,0))
We've got our first multiple of 5 with some positive error at 45, and our first size 2 error at 49. But its all good again at 50.
Lets get a little bit more information.
def toBase5String(n: BigInt): String =
if (n == 0)
""
else
toBase5String(n/5) + (n % 5).toString
(26 to 50).map( i =>
(i, toBase5String(i),
approxTimesFiveDividesFactorial(i) - betterTimesFiveDividesFactorial(i))
)
which yields
Vector(
(26,101,0), (27,102,0), (28,103,0), (29,104,1), (30,110,0),
(31,111,0), (32,112,0), (33,113,1), (34,114,1), (35,120,0),
(36,121,0), (37,122,1), (38,123,1), (39,124,1), (40,130,0),
(41,131,1), (42,132,1), (43,133,1), (44,134,1), (45,140,1),
(46,141,1), (47,142,1), (48,143,1), (49,144,2), (50,200,0))
In the triples, the first term is the number base 10, the second is the number base 5, and the third how much our approximation overestimates the number of times 5 is a divisor.
It turns out that if the sum of the digits in the base 5 representation is between 1 and 4 inclusive, the approximation is accurate. If its between 5 and 8, its over estimating by 1, if its between 9 and 12, its over estimating by 2 etc. It took me a while to notice this, and it took longer to realise why it was the case, and this article is already a bit long and tedious, so I'll just write down our final version.
def base5digitsSum(n: BigInt): BigInt =
if (n == 0)
0
else
base5digitsSum(n/5) + (n % 5)
def error(n: BigInt): BigInt =
(base5digitsSum(n) - 1)/4
def sillyTimesFiveDividesFactorial(n: BigInt): BigInt =
approxTimesFiveDividesFactorial(n) - error(n)
val numberTrailingZerosOfFactorial3: Function[BigInt, BigInt] = sillyTimesFiveDividesFactorial
What is worth noting here is that even though we found out more about how the values are moving around, and related them to something else, it didn't end up reducing the amount of work that we needed to do to get to our answers: we're still doing all the divisions of v1.1, plus some extra work. Our approximation wasn't a profitable avenue to consider when looking for good solutions to this problem.
Conclusion
I was hoping to help the reader realise a few different things with this article.
Firstly, a solution is a solution. Our first solution might be slow, but that might be good enough for what we're doing at the time.
Secondly, that if we want to work out how to take a slow, inefficient solution and turn it into a fast one, we should look at what we're asking the computer to do, what it will have to build in memory, how many steps it will need to take to get through our implementation and then try to imagine ways we can take shortcuts or rearrange and regroup the steps performed. Creating some concrete examples of the things you're creating can be really helpful.
Thirdly, good code isn't the same as fast code. v2 was quicker than v1.1, but it was ugly and complicated and would be hard to maintain, as well as having problems with parallel computations. v1.1 was a better solution in the vast majority of real situations.
Fourthly, having a clever insight into the problem isn't quite the same as a good algorithm. v3 used an interesting idea, but to make it work you had to do the same calculations as you did for v1.1, and a few more besides. It was still interesting to pursue though. I enjoyed running down what was going on and how to make the approximation work, even if it wasn't productive in the end.
Finally, Scala3 is a complete rebuild of the language in a lot of ways. They've done a lot of stuff that makes it a really great way to transition from object oriented code, to functional flavoured OO code, to fully fledged functional approaches. Its syntax is pretty clean and expressive. I strongly recommend it if you're looking to get into a compiled, strongly typed language from loosely typed, interpreted languages. The type system mostly gets out of your way. IntelliJ CE has good support for the language, and the worksheets feature means that if you copy my code from that gist I linked at the start, you can start running things, and writing your own functions, very quickly. The getting started guide is a great place to get started with an interesting language that is trying to do interesting things in a performant and production ready way.
Finally finally, if you go to cassidoo's page you can subscribe to her newsletter. Its reliably got something that I find worth my time, either the interview question or an article or a thought provoking quote. I strongly recommend it.
Thanks for reading!
Top comments (0)