DEV Community 👩‍💻👨‍💻

Basile B.
Basile B.

Posted on • Updated on

The story of an absurd optimization

Before the absurd

Once I was bored and decided to profile my software. It's a compiler for a hobby language called Styx.

In the profiler data obtained while boostraping I noticed a function that had not a constant cost

Image description

The function is

function isImported(Declaration* node): bool
{
    return node.symbol.parentUnit():u8* != unitNode:u8*;
} 
Enter fullscreen mode Exit fullscreen mode

We are in a compiler, this function tells if a declaration is declared in the compilation unit that's being processed or not.

It calls

function parentUnit(): UnitDeclaration*
{
    if kind == $unit do
        return *((&astNode):UnitDeclaration**);
    return parent.parentUnit();
} 
Enter fullscreen mode Exit fullscreen mode

That function is a member of the Symbol class whose instances are associated to declarations. It allows to create the "parent" relationship that does not exist in an AST. Usually it is used for identifier lookups but it also allows to find the compilation unit (aka "the module").

So we have here a O(n) situation, with n the count of lexical scopes between the declaration and its unit. Not great not terrible. But it's possible to make it O(1). Why ?

Each compilation unit is associated to a file name. File names can be retrieved directly through the scope that a declaration introduces. Scopes are associated to declarations in case they would need to be analyzed in an "adhoc way". On top of that strings are refcounted. That means that we only need to compare the pointer of the compilation unit file name and the one of the declaration scope. Great.

The function now looks like this:

function isImported(Declaration* node): bool
{
    return node.scope.filename.ptr != unitNode.filename.ptr;
} 
Enter fullscreen mode Exit fullscreen mode

Now let's check the change with the profiler UI

Image description

Neat ! A call to this function is now constant time.

The absurd part

So what's been done here ? A decent optimization. A function was O(n), it is now O(1). But what if it is pointless ?

Actually the optimization only changes a very marginal part of the software. Maybe 15% of the CPU time is caused by my own code. The 85% remaining are spent by a dynamic library, LLVM

Image description

Top comments (0)

Join us at DEV
Yes, this is technically an “ad”, but really we just want to ask if you want to join DEV. We have 900k+ developers reading, posting, and enjoying community, and would love to have you.   Create an account and continue your coding journey.