## DEV Community

Rodion Gorkovenko

Posted on

# Bash for Project Euler - really???

Whatever language and stack you prefer, learn bash also! - that wisdom come to me through years. I was doing electronics and POS-terminals in C++, backend in Java, web with PHP and JS, machine learning in BigData with Python, Scala and many horrible names...

And always there were chores to be done with linux/unix shell scripting!

Let's plan "miniseries" on bash, but let's start with math exercise. To see how bash rocks and know when it sucks :)

For example I pick Project Euler #7 - "10001-st Prime". There would be 3 versions of code and they all are in this git repo for reference.

### Let's Rock!

For generating primes we usually need array to collect them. I usually start by initializing first few elements manually (this helps avoiding spare stop condition inside the loop).

``````#!/usr/bin/env bash

primes=(2 3 5 7)
``````

As clever people warn, we add "shebang" header which allows script to properly start if it is executable.

Array is created by enclosing space-separated sequence in parentheses. (non-posix feature, though we can do without them)

Now we create main loop! Just make 20 first primes for test, not thousands...

``````next=9  # next number to try

while [[ \${#primes[@]} -lt 20 ]] ; do
#here we put more code
next=\$((next+2))
done

``````

we understand `while` and `do` and `done`, but what is this whimsical expression? Well, logical expressions are tested in square brackets, `-lt` is "numerical" comparison for "less than" and `\${#primes[@]}` is curious expression of the length of array. Bash has funny syntax sometimes.

Now the code inside.

At every iteration we are going to test if `next` is divisible by anything in array `primes`, starting from beginning.

``````    i=0
while true ; do
p=\${primes[i]}
if [[ \$((next % p)) -eq 0 ]] ; then
# next is divisible by p, not prime, halt
break
fi

# some end condition we add at next step...

i=\$((i+1))
done
``````

Here we see new features - `if - fi` block (it supports `else` too), `-eq` operator for numeric equality and forms `\$((...))` for evaluating numeric expressions. Shell heavily deals with text so usually variables are treated as strings, not numbers.

Now how we decide new prime is found? We don't need to go till end of `primes`, it's enough to stop when `p` is greater than `sqrt(next)` - let's add it to the loop:

``````        if [[ \$((p * p)) -gt \$next ]] ; then
primes+=(\$next)
break
fi
``````

Looks good! Let's print out `primes` after main loop ended to check:

``````        # various code before...
next=\$((next+2))
done

echo \${primes[@]}

``````

Curiously, syntax for "whole array" is quite close to syntax of taking length... Now I save this into file `pe-7.sh`, set executable flag and run:

``````\$ chmod a+x pe-7.sh
\$ ./pe-7.sh
2 3 5 7 11 13 17 19 23 29 31 37 41 43 47 53 59 61 67 71
``````

Wonderful! Now we need two small changes:

• replace `20` with `10001` in the main loop condition
• don't print whole array, but only its last element with `\${primes[-1]}`

Feel free to check the whole code in github - it's brief.

Run the program with `time` program to see how long it takes :)

``````\$ time ./pe-7.sh
104743

real    0m12.896s
user    0m12.881s
sys 0m0.004s

``````

Result is correct. But wow, 13 seconds - definitely bash is not intended to be the fastest. However there are faster shell versions (e.g. `dash`) and on other hand I'll show you how to make it much slower :)

### Refactor it

Bash supports functions. Let's extract inner loop into function. It can be checked that running time doesn't increase seriously.

``````tryprime() {
local try=\$1   # \$1 is the 1-st parameter
i=0
while true ; do
p=\${primes[i]}
if [[ \$((try % p)) -eq 0 ]] ; then
break
fi
if [[ \$((p * p)) -gt \$try ]] ; then
primes+=(\$try)
break
fi
i=\$((i+1))
done
}
``````

The function takes single parameter and puts it into local variable. Curiously, parameters are not passed in the parentheses :)

We don't return value, rather just add it to `primes` global array. I'll explain this bit later.

Now call the function in main loop:

``````next=9
while [[ \${#primes[@]} -lt 10001 ]] ; do
tryprime \$next
next=\$((next+2))
done

``````

Hey, that's neat! Full code in GH. And it still works - took 14 seconds...

### Performance Pitfall

Now just small change to make it sloooooow!

Let's return value from function. This is not done by `return` (which serves for other purpose). We rather print value in the function and capture it from output on call:

``````
# change this inside function
# primes+=(\$try)
# to this
echo \$try

#and main loop will be:
while [[ \${#primes[@]} -lt 10001 ]] ; do
found=\$(tryprime \$next)
primes+=(\$found)
next=\$((next+2))
done

``````

The complete code is here. Run it to get... 76 seconds!!!

Well, bash is shell version designed for reach user-friendly features, not for speed. But it is important to know such pitfalls :)

Thanks for reading that far! Let's meet next time to make some web-related work with bash :)