Ali Sherief

Posted on Jan 20, 2020

Correctness proofs can reveal software bugs

#computerscience

We write code, functions, modules and whole programs expecting them to work when we run the code. But when they don't they often give us much frustration debugging them. One particularly thorny area where developers mess up frequently is in algorithms. Algorithm writing is hard. That's why it's important to make sure they work as intended before running them.

Why before? Because some algorithms take so long, or use so much resources, that it's infeasible to reactively test them.

Correctness proofs are currently a niche subject (see this Stack Exchange question), but that doesn't mean they aren't useful. Rather, my goal in writing this is that maybe someone reading this could apply these techniques to their workplace.

Steps to create the proof

I found these steps in a computer science book. If anyone can link me other correctness proof guidelines on the internet, I would much appreciate it.

Formulate the loop invariant. It can be derived from:
- Making sure that the loop invariant is satisfied before entering the loop. This can be assured by correctly initializing the variables.
- Showing that the invariant is satisfied at the end of an iteration if it was satisfied before the iteration.
Find the termination test that ends the loop.
Derive the precondition that gives the termination test a possibility of being True. This prevents the loop from running forever.
Derive the postcondition from the loop invariant and the termination test. This is the equation we want the algorithm to do.

We do not need to make a table of example values to prove the algorithm works correctly. This is a major advantage of correctness proofs.

Correctness proof of a factorial function

I have written the function to be compatible with C and C++ but it can be adapted to Java or PHP easily with a few minor syntax changes. You might notice some obvious range checking not being done here, but the reason for this will be explained below.

int factorial(int n) {
    int i = 0;
    int k = 1;
    while (i != n) {
        k = k*(i+1);
        i = i + 1;
    }
    return k;
}

I'm going to be using factorial() below a lot to reference the mathematical operator, not the function I just wrote. It is hard to embed TeX equations in here.

Immediately before the loop, we have k = factorial(i) because i is zero and factorial(0) = 1. During an iteration, the above equation is still true because we increment i by 1 and we update k with the new computed value. So our loop invariant is k = factorial(i). The loop invariant is satisfied before the loop because k = factorial(i) = factorial(0) = 1. After an iteration is completed, k = factorial(previous_i + 1) = factorial(previous_i)*(previous_i+1).
The termination test is i == n. If this is True then the loop terminates.
The precondition which will give the termination test a chance of being true is n in (the set of integers) and n >= 0. I left out error checking in the function to demonstrate this.
Because i == n will terminate the loop, k = factorial(i) = factorial(n).

A Fibonacci sequence

Because material about correctness proofs on the internet is scarce, I will provide another example below, this one is a correctness proof for the Fibonacci sequence. Once again, fibonacci will be used in this post referring to the mathematical operator.

This example has a different loop construct to demonstrate that a correctness proof can still be done for it.

int fibonacci(int n) {
    int fi = 1;
    int fi1 = 0;
    int i = 0;
    int temp;
    while (i != n) {
        temp = fi;
        fi = fi + fi1;
        fi1 = temp;
        i = i + 1;
    }
    return fi1;
}

Before any iterations, fi1 = 0 = fibonacci(0) = fibonacci(i). After an iteration, fi = previous_fi + fi1 = fibonacci(i+1) = fibonacci(i) + fi1. So our loop invariant is fibonacci(i) = fi1.
The termination test here is i == n. Keep in mind that real-world practical algorithms have more complicated termination tests than this.
Because we set i to an integer equal to zero and we only increment it by one, the precondition here is also n in (the set of integers) and n >= 0.
Eventually we will calculate the value of fibonacci(n) (for we iterate through the loop with i values from 0 to n-1 inclusive). Because we proved that fi = fibonacci(i+1) and i = n after the loop, fi = fibonacci(n+1), but the postcondition is the value that is being returned and that is fi1 = fibonacci(i) = fibonacci(n).

Typing the proof

The proof is almost always heavily equation-based, so it is best to write it in LaTeX. You'll most likely need a software engineer on your team who knows LaTeX as a result.

Following the DRY principle (Don't Repeat Yourself), do not write any code in the proof. Instead you should make a comment in your code pointing to the correctness proof location.

Sharing

Now that you have made a correctness proof it is time to share it. However there isn't a standardized way to include math in arbitrary code files so you need to figure out a solution that is suitable for you.

Including TeX in your code files

This is a particularly difficult problem because we ideally want a language-independent solution, that is a way to embed TeX or LaTeX in a code file regardless of the language being used.

For scripting languages it wouldn't make sense to do this because these files are supposed to be read in production and extra TeX code just for proof is unnecessary and might even be against your company's policy.

Also there is the issue of coupling the document with code, so that when the algorithm changes, you can be notified about the correctness proof to update it if needed. Of course I'm not expecting that the proof validates the algorithm automatically, for it's just a plain document.

Sharing with Word or Google Docs

This amounts to installing a code editing add-in (Code Blocks on G-Suite, no viable alternative on Office but Content Mixr comes close), embedding your code inside a document and decorating it with equations from your correctness group. It's possible to make a nice document like this but the issue of synchronization immediately arises.

So suppose your code is in a private Github repository with webhooks. It will be difficult to connect your Google/Microsoft account to the webhooks and that might also be against your company's policy.

Separate document in source tree

This is the most viable option because you get to express yourself independently of the code while it gets integrated with any webhooks you have in place. The document will most likely be in LaTeX and a webhook could create a PDF document of the proof, although a drawback of this is that continuously verifying proofs every time the algorithm code changes is not something you want to do manually unless you have dedicated software engineers for that.

Concluding remarks

So if you ask me which option is the best for storing your correctness proof (and I'm not claiming to be an expert in this), I suggest that putting them in your source tree is the most doable option.

Material from this post was sourced from Computer Science with Mathematica by Roman E. Maeder.

Image by Gerd Altmann from Pixabay

DEV Community

Correctness proofs can reveal software bugs

Steps to create the proof

Correctness proof of a factorial function

A Fibonacci sequence

Typing the proof

Sharing

Including TeX in your code files

Sharing with Word or Google Docs

Separate document in source tree

Concluding remarks

Latest comments (0)

Read next

Docker Log Observability: Analyzing Container Logs in HashiCorp Nomad with Vector, Loki, and Grafana

API simples que gera arquivos de forma assíncrona, com Java e Spring? Aqui tem!

Motivation, the neglected superpower.

Kickstarting Your DevSecOps Career - The 4 Essential Certifications You Need