DEV Community

arinak1017
arinak1017

Posted on

GCC: Automatic Function Multi-Versioning Wrap Up

Before we start...

If you have just stumbled upon my SPO600 series of blog posts, it has been created to document and share my learnings as I progress through my Software Portability and Optimization course at Seneca Polytechnic.

As this term comes to an end and we officially wrap up our course project, implementing a new experimental feature (Automatic Function Multi-Versioning) in the GNU Compiler Collection, I want to reflect on the progress I’ve made and summarize my work. Throughout this project, I delved into GCC compiler passes, tackled the complexities of the GIMPLE intermediate representation, and developed initial logic for identifying and pruning redundant function clones.

If you are curious to learn more about the project, feel free to check out my previous blog posts:

How did I approach the experiments?

Over the course of the last few weeks, I spent a significant amount of time navigating the GCC documentation, and experimenting with the dummy passes provided to us by our Professor, Chris Tyler.

While working on my test pass, I approached the problem by isolating it from the complexity of the GCC compiler. Imagine designing a simple function that scans a piece of text or a file and flags redundant paragraphs. This analogy reflects what we’re trying to achieve but makes the concept easier to grasp. Instead of analyzing text, however, we’re examining the binaries of function clones and flagging them for pruning if they are redundant.

Let’s take a look at the draft logic

bool compare_gimple_functions(gimple_seq g1, gimple_seq g2) {
    gimple_stmt_iterator gsi1 = gsi_start(g1);
    gimple_stmt_iterator gsi2 = gsi_start(g2);

    while (!gsi_end_p(gsi1) && !gsi_end_p(gsi2)) {
        gimple *stmt1 = gsi_stmt(gsi1);
        gimple *stmt2 = gsi_stmt(gsi2);

        // Check if the statements are equivalent
        if (!stmt1 || !stmt2 || !simple_cst_equal(stmt1, stmt2)) {
            return false;
        }

        gsi_next(&gsi1);
        gsi_next(&gsi2);
    }

    return gsi_end_p(gsi1) == gsi_end_p(gsi2);
}

unsigned int
pass_testprune::execute (function *fun)
  {
     struct cgraph_node *node;

    FOR_EACH_FUNCTION(node) {
        if (!node->analyzed)
            continue;

        std::string function_name = IDENTIFIER_POINTER(DECL_NAME(node->decl));
        if (function_name.find("_1") != std::string::npos) {
            size_t pos = function_name.find_last_of("_");
            std::string base_func_name = function_name.substr(0, pos);

            struct cgraph_node *base_node = node->get_base_function();
            if (base_node && base_node != node) {
                if (compare_gimple_functions(base_node->get_body(), node->get_body())) {
                    fprintf(dump_file, "PRUNE: %s\n", base_func_name.c_str());
                } else {
                    fprintf(dump_file, "NOPRUNE: %s\n", base_func_name.c_str());
                }
            }
        }
    }
    return 0;

  }
} // anon namespace
Enter fullscreen mode Exit fullscreen mode

Understanding the proposed logic:

  • compare_gimple_functions
    The function compare_gimple_functions attempts to compare two GIMPLE statement sequences using a simple while loop. It iterates through the sequences with iterators and checks if the statements are equivalent using the simple_cst_equal function. This logic is designed to mimic comparing lines in two files.

  • execute function
    The execute function is the core of this pass. It aims to process function clones within the binaries. Clones are typically identified by a suffix like _1 or _2 in their names. The pass compares each clone to its base function, flagging it for pruning if it matches.

  • Step-by-step execution

    • Initial Safeguard: the if (!node->analyzed) block is intended to ensure that the compiler has analyzed the function before processing it. If not, the function is skipped. This check is meant to act as a safeguard.
    • Iterating Through Functions: inside the FOR_EACH_FUNCTION loop, the pass attempts to identify clones by checking for function names ending with _1 or similar suffixes.
    • Comparing Clones: the base function and its clone are compared using the compare_gimple_functions method. If they appear identical, the pass prints PRUNE: <function_name>. Otherwise, it prints NOPRUNE: <function_name>.

You can find the source code with my newly added pass (tree-testprune.cc) in my personal fork of gcc-mirror

To try running it:

  • Clone the repo
git clone https://github.com/arilloid/gcc.git
  • Switch to the afmv-function-clone-comparison branch
git checkout afmv-function-clone-comparison
  • Follow the standard build process to configure, build, and install the GCC compiler with the added pass.

Code refrences:

  • To see which files were modified to add a pass, refer to this commit: 5ece499
  • The proposed logic for the pass is added in this commit: 61428c7

The issues with the proposed solution

The primary issue stems from the incorrect use of GIMPLE functions and macros. To address this, I spent a substantial amount of time studying the GCC GIMPLE documentation. Through this effort, I gained a better understanding of how GIMPLE passes work. However, due to time constraints, I couldn’t get the pass to run successfully.

Advice for future students

My main advice to future students taking this course is to become very comfortable with configuring, building, and installing the GCC compiler. These foundational skills will save you a lot of time and frustration as you progress through the project.

Additionally, dedicate time to thoroughly reading the documentation, particularly on GIMPLE, as the project revolves around designing a GIMPLE pass. The more familiar you are with the documentation, the more confident you’ll feel working with the GCC compiler. If we had a clear understanding of the necessary macros and operands from the start, completing the pass would have been much easier. However, since the documentation is mostly incomplete or hard to navigate, patience and persistence are key.

Finally, get comfortable navigating the codebase and always use Git for version control. Even for small changes, commit your work regularly. Mistakes are inevitable, and being able to revert changes will save you countless hours. Trust me, this is an underrated yet crucial tip when working with a project as complex as GCC.

Afterthoughts

All in all, this journey has been a great experience. Although I wasn’t able to achieve a functional solution, I’m proud of the effort I put in, actively making changes, experimenting, and continuously rebuilding the GCC compiler. This process has given me confidence in my ability to work with large and complex codebases.

In the professional world, the reality is that most of the time we’ll be working on someone else’s code, relying on company-specific libraries, and navigating official documentation to understand and extend functionality. This project closely mirrored that experience, and I’m grateful for the valuable skills and knowledge I gained along the way.

Top comments (0)