DEV Community

Facol

Posted on • Originally published at zen.su

Amalgamating Nim programs

Intro

Over the last few years, there have been a couple of threads on the Nim forum about the possibility of making a single (more or less) self-contained C file from a Nim program. Most of the time, those threads were about participating in a competition that doesn't accept Nim, or about writing a C assignment in Nim.

Today I finally decided to try to do this myself and, to my surprise, succeeded.

Starting out

Amalgamation is a technique for combining a whole project (usually written in C or C++) into a few source files (usually one). It makes distributing libraries easier in some cases and can even improve performance, although you can usually achieve similar results by enabling Link-Time Optimization.

The most popular library out there that can be amalgamated is probably SQLite.
In SQLite's case, this is achieved by carefully separating the code into different files that still work when combined into a single file.
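
To see why the separation has to be careful, here is a small sketch of what goes wrong with plain concatenation. The two C files and the function names in them are invented for illustration; they are not taken from SQLite or Nim.

```shell
#!/usr/bin/env sh
# Hypothetical two-file project; file and function names are made up.
workdir=$(mktemp -d)

cat > "$workdir/a.c" <<'EOF'
static int helper(void) { return 1; }
int a_value(void) { return helper(); }
EOF

cat > "$workdir/b.c" <<'EOF'
static int helper(void) { return 2; }
int b_value(void) { return helper(); }
EOF

# Naive "amalgamation": just concatenate the translation units.
cat "$workdir/a.c" "$workdir/b.c" > "$workdir/combined.c"

# Both file-local helpers now live in one translation unit,
# so a C compiler would reject combined.c with a redefinition error.
clashes=$(grep -c 'static int helper' "$workdir/combined.c")
echo "definitions of helper in combined.c: $clashes"
```

Each file is valid on its own because `static` keeps `helper` private to its translation unit; once the files are glued together, the two definitions collide. An amalgamation either avoids such clashes by convention or needs a tool that renames them.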

It is certainly possible to adjust Nim's C backend to output C files that can be combined into a single one and compiled, but this requires some effort. So I didn't go this route and instead started searching for a program that can amalgamate separate C files by itself, and found one - C Intermediate Language (CIL).

The project is aimed at creating a simplified subset of C, but it has a feature called "merger" that we'll be using.

Building CIL

First of all, we need to compile CIL itself. I chose to use a maintained fork instead of the original so that it's more likely that it works :)

For building CIL, I just followed the installation section in the README. Here's the listing of what I did (might or might not work for you):

# Install OCaml and opam on Arch (opam is needed for the commands below)
$ sudo pacman -S ocaml opam

# Clone the repo
$ git clone https://github.com/goblint/cil && cd cil

# Create a local opam environment ("switch"). This will take a while
$ opam switch create . 

# Add the local opam environment to the current shell
# (fish syntax; in POSIX shells use $(...) instead of (...))
$ eval (opam env)

# Configure the build; the prefix is the _opam directory
# in the current working directory (because we use a local opam switch)
$ ./configure --prefix=(opam config var prefix)

# Build CIL
$ make
# The first `make install` fails for me, but it has to run before the commands below
$ make install
# To fix the error from the previous step, remove the logwrites directory
$ rm -r _opam/lib/goblint-cil/logwrites/
# Finally, install it into ./_opam/bin/
$ make install

# Add it to our PATH (your directory will of course be different)
$ set PATH /home/dian/Stuff/cil/_opam/bin $PATH
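
For readers on bash or another POSIX shell, the fish-specific lines above might translate roughly as follows. This is an untested sketch: `setup_cil_env` is a made-up helper name, and it assumes you run it from the cloned cil directory after the local switch has been created.

```shell
# setup_cil_env: hypothetical helper, a POSIX-shell take on the fish commands above.
# Run it from the cloned cil directory after `opam switch create .` has finished.
setup_cil_env() {
    # Bail out early if opam is not installed
    command -v opam >/dev/null 2>&1 || { echo "opam not found" >&2; return 1; }
    # Load the local opam switch into this shell
    eval "$(opam env)"
    # Same configure call, with $(...) instead of fish's (...)
    ./configure --prefix="$(opam config var prefix)" || return 1
    # Put the install location on PATH for this session
    PATH="$PWD/_opam/bin:$PATH"
    export PATH
}
```

After calling it, the `make` and `make install` steps stay the same as in the listing.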

Patching nimbase.h

Now we're almost ready to use it for Nim programs! There's one thing left - we need to patch nimbase.h (it's a file that provides some generic defines for different compilers on different platforms). You can usually find nimbase.h in your_nim_dist/lib/nimbase.h.

diff --git a/lib/nimbase.h b/lib/nimbase.h
index cbd35605b..7d3620881 100644
--- a/lib/nimbase.h
+++ b/lib/nimbase.h
@@ -75,9 +75,9 @@ __AVR__
 #endif
 /* ------------------------------------------------------------------------- */

-#if defined(__GNUC__)
-#  define _GNU_SOURCE 1
-#endif
+//#if defined(__GNUC__)
+//#  define _GNU_SOURCE 1
+//#endif

 #if defined(__TINYC__)
 /*#  define __GNUC__ 3
@@ -195,7 +195,7 @@ __AVR__
 #  define N_LIB_EXPORT_VAR  __declspec(dllexport)
 #  define N_LIB_IMPORT  extern __declspec(dllimport)
 #else
-#  define N_LIB_PRIVATE __attribute__((visibility("hidden")))
+#  define N_LIB_PRIVATE
 #  if defined(__GNUC__)
 #    define N_CDECL(rettype, name) rettype name
 #    define N_STDCALL(rettype, name) rettype name
@@ -324,7 +324,7 @@ namespace USE_NIM_NAMESPACE {
 typedef unsigned char NIM_BOOL; // best effort
 #endif

-NIM_STATIC_ASSERT(sizeof(NIM_BOOL) == 1, ""); // check whether really needed
+//NIM_STATIC_ASSERT(sizeof(NIM_BOOL) == 1, ""); // check whether really needed

 #define NIM_TRUE true
 #define NIM_FALSE false
@@ -543,7 +543,7 @@ static inline void GCGuard (void *ptr) { asm volatile ("" :: "X" (ptr)); }
 #endif

 // Test to see if Nim and the C compiler agree on the size of a pointer.
-NIM_STATIC_ASSERT(sizeof(NI) == sizeof(void*) && NIM_INTBITS == sizeof(NI)*8, "");
+//NIM_STATIC_ASSERT(sizeof(NI) == sizeof(void*) && NIM_INTBITS == sizeof(NI)*8, "");

 #ifdef USE_NIM_NAMESPACE
 }

The main change here is that we don't want to define _GNU_SOURCE because CIL doesn't support most GNU-specific constructs. It also doesn't support the NIM_STATIC_ASSERT macro, so we comment out both assertions, and we define N_LIB_PRIVATE as empty to drop the visibility attribute.

You can apply the patch with a simple git apply /path/to/nimbase.diff while in the Nim distribution directory.
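
If you'd rather not edit your Nim distribution blindly, something like the following might be a safer way to apply it. `apply_nimbase_patch` is a hypothetical helper, and both arguments are placeholders for your own paths (use an absolute path for the patch, since the helper changes directory).

```shell
# apply_nimbase_patch NIM_DIR PATCH_FILE
# Hypothetical helper; both arguments are placeholders for your own paths.
apply_nimbase_patch() {
    nim_dir=$1
    patch_file=$2
    # Keep a pristine copy so the change is easy to undo
    cp "$nim_dir/lib/nimbase.h" "$nim_dir/lib/nimbase.h.orig" || return 1
    # Dry run: report problems without touching any files
    (cd "$nim_dir" && git apply --check "$patch_file") || return 1
    # The patch applies cleanly, so apply it for real
    (cd "$nim_dir" && git apply "$patch_file")
}
```

To undo the change later, copy nimbase.h.orig back over nimbase.h.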

Hello, Amalgamation!

Now that we have all the necessary things prepared, we can test out our amalgamation setup.

First of all, since Nim expects a compiler binary to be an executable file, we need to create cilly.sh that will call cilly (CIL's main tool) with all needed arguments:

#!/usr/bin/env sh
# Quote "$@" so arguments containing spaces are passed through intact
cilly --noPrintLn --merge --keepmerged "$@"
  • --noPrintLn specifies that we don't want #line directives.
  • --merge for actually merging the C files.
  • --keepmerged to keep the amalgamated C file after compilation.

Save it somewhere in your $PATH.
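
Since the wrapper has to be executable, installing it might look like this. The helper name and the default `~/.local/bin` destination are assumptions for the sketch; any directory already on your PATH works.

```shell
# install_wrapper SRC [DEST_DIR]
# Hypothetical helper: copies the wrapper script to DEST_DIR/cilly.sh
# (default: ~/.local/bin, assumed to be on PATH) and marks it executable.
install_wrapper() {
    src=$1
    dest_dir=${2:-$HOME/.local/bin}
    mkdir -p "$dest_dir" || return 1
    cp "$src" "$dest_dir/cilly.sh" || return 1
    chmod +x "$dest_dir/cilly.sh"
}
```

For example, `install_wrapper ./cilly.sh` followed by `command -v cilly.sh` should print the installed path if the destination is on your PATH.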

Now let's create our first amalgamated Nim program:

# hello.nim
echo "Hello, Amalgamation!"

We also need to specify that we want to use cilly.sh as our C compiler, so let's go ahead and create the configuration for our Nim file:

# hello.nim.cfg
gcc.exe="cilly.sh"
gcc.linkerexe="cilly.sh"

We could've provided those on the CLI, but it's better to create a separate config file.
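
For reference, the CLI form would presumably look like the sketch below, with the same two settings passed as switches. `build_with_cilly` is just an illustrative wrapper name, and it assumes cilly.sh is on your PATH.

```shell
# build_with_cilly FILE.nim
# Hypothetical wrapper: the settings from hello.nim.cfg passed as command-line switches.
build_with_cilly() {
    nim c --gcc.exe:cilly.sh --gcc.linkerexe:cilly.sh -d:danger "$1"
}
```

The config file has the advantage that a plain `nim c hello.nim` picks the settings up automatically.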

And let's finally compile our Nim program and create an amalgamation:

nim c -d:danger hello.nim

There might be some warnings, but that's fine. After the compilation we'll have a new file called hello_comb.c in our directory, and that's the amalgamation that we wanted to get!
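
A quick sanity check is to compile the amalgamated file on its own with a regular C compiler and run the result. This is a sketch under assumptions: `check_amalgamation` is a made-up helper, and the `-ldl` link flag is my guess at what a Nim program needs on Linux, so adjust it for your platform.

```shell
# check_amalgamation [FILE.c]
# Hypothetical helper: builds the amalgamated file with the system C compiler
# and runs the resulting binary. The -ldl flag is an assumption for Linux.
check_amalgamation() {
    src=${1:-hello_comb.c}
    [ -f "$src" ] || { echo "missing $src" >&2; return 1; }
    out="./$(basename "${src%.c}")_check"
    # -w silences warnings, which generated C tends to produce a lot of
    cc -w "$src" -o "$out" -ldl && "$out"
}
```

If everything worked, running it in the directory with hello_comb.c should print the same greeting as the Nim-built binary.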

Now, if you check the line count of the file, you might be surprised - it's 6.6K lines long (at the time of writing).

Why is that? The main reason is that Nim treats C as a backend and generates many of its own types and functions.
Even so, that line count is not big: CIL merges all the C files of your Nim program, plus the system includes they pull in, so the final size is relatively small. To reduce it further, consider these options:

  • --gc:orc - use Nim's new GC, it has a much smaller runtime. If you're sure that your program doesn't have cycles, use --gc:arc for an even smaller line count.
  • -d:useMalloc - generally not preferred, but this makes the Nim runtime use C's memory functions instead of its own allocator.

After adding those two options, the resulting line count is down to 3.4K lines of code. You can check out the resulting C file in this gist.
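
Putting the two options together, a rebuild plus line count might look like this. Both helper names are made up for the sketch, and it assumes the amalgamated file ends up next to the source as NAME_comb.c, as it did for hello.nim.

```shell
# count_lines FILE: print the number of lines in FILE
count_lines() {
    wc -l < "$1" | tr -d '[:space:]'
}

# rebuild_small FILE.nim
# Hypothetical helper: rebuilds with both size-reducing options,
# then reports how long the amalgamated C file is.
rebuild_small() {
    nim c -d:danger --gc:orc -d:useMalloc "$1" &&
        echo "$(count_lines "${1%.nim}_comb.c") lines in ${1%.nim}_comb.c"
}
```

Running `rebuild_small hello.nim` before and after toggling an option is an easy way to see how much each one saves.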

Afterword

While this is an interesting matter, I think that amalgamations are rarely useful. It's really hard to make them portable, and they can lead to more weird bugs. That said, you can use them if you really, really need to have a single C file.

Some relevant discussions:

Thanks for reading!
