How many times should a self-compiling compiler be compiled in order to test itself?

#computerscience #compiler #theory #showdev

If a compiler is designed to compile itself, how many times should it be compiled to ensure that it is properly compiling itself?

I recently started to rewrite the rumi compiler in its own language, because it would be much easier to develop it this way. I covered it in more details in this post. While rewriting the simple compiler in rumi, this question hit me, how should I properly test it? So, I'm going to solve this question with (simplified) category theory, here we go:

A program is something that takes an input (could be anything, from stdin, to database, a set of http requests? whatever) and produces an output (again, anything). In term of basic Haskell, we have:

prog :: a -> b

Note that a and b could be anything, or a few anythings, we don't really care. So, with that definition, what is a compiler?

compiler :: source -> prog

A compiler takes some source code and produces another program (which in turn takes some other input and produces some other output), we can open things up a little:

compiler :: source -> (a -> b)

that looks like a proper compiled language, but just to make things more fun, what are interpreters (like python)?

python :: python_source -> input -> output

Note that we can remove the parenthesis, and give the python executable the source and the input in one step. But enough side track, let's get back to our own thing. So, how would compiling a compiler look like with this definition?

cc :: compiler_source -> source -> prog

Okay, and we want to ensure that our compiler that is compiled with rumi behaves just like the version that is written and compiled with C++, in other words we want these two functions to be the same:

c_compiler(compiler_source)
c_compiler(compiler_source)(compiler_source_in_rumi)

So, how do we ensure that two functions are the same? We can't exactly compare their binary code, since the C++ version and the rumi version might behave differently on optimization and debug information and many other stuff (which is intentional, by the way, since the C++ version is just meant as a start and is not user friendly in any shape or form). So we have to rely on good old set theory, how can we ensure two functions are the same in set theory? Well, first, they must be defined on the same domains (which they are) and they should produce the same output for all possible inputs. In other words:

for all s \in rumi's source
for all i \in inputs
c_compiler(compiler_source)(s)(i) == c_compiler(compiler_source)(compiler_source_in_rumi)(s)(i)

But that's a little but far fetched, we can't possibly test it on all source codes, nor can we try all possible inputs. But, there is something else that we can try.

Since our computers are a form of Von-Neumann architectures, and we know that each command in this architecture relies only on the config of the device at that point, we can test all possible commands and their configurations, in other words all possible statements of rumi with their inputs:

for all stmt \in statements
for all input \in input
c_compiler(compile_source)(stmt)(input) == c_compiler(compiler_source)(compiler_source_in_rumi)(stmt)(input)

But then again, it would be unrealistic to account for all possible inputs to all possible statements, take if as an example, there could be any number of inputs to it, but in general, there are two groups, those that match true and those that match false. So we can test only those two configurations for if, and the same two configurations for if/else and so on for all of the system. Let's assume that all of those configurations of all of those statements appear in rumi's source code (which is not a wrong assumption, by the way, since we only chose statements that were absolutely essential), so we need to test that the rumi's output for the rumi's source code is the same in both cases, ,i.e.,:

c_compiler(compiler_source)(compiler_source_in_rumi) == c_compiler(compiler_source)(compiler_source_in_rumi)(compiler_source_in_rumi)

And if this assumption holds, we can be sure that our program is behaving in a similar way to the C++ version! In plain English, the rumi's binary version that is compiled with the compiler written in C++, must be the same as the rumi's binary version that is compiled with the compiler written in rumi. That's confusing, yes, but that's why we have math, isn't it?

DEV Community

How many times should a self-compiling compiler be compiled in order to test itself?

Top comments (0)