Today I Learned: the In-Tree, a data structure that I surprisingly never came across in over a decade despite working with things that use it. How?
So, some history, since not everyone knows my background. I'm not formally trained in Computer Science. I went to college for psychology and pre-med, and had to stop two classes short of a degree. Programming was always a hobby; a side hustle. Depending on how you look at it, I either started at 12 or 14; 12 if you count game mod scripts as programming. Up until 18 I was exclusively doing indie game development. For several reasons I'll talk about later, this pivoted during my time in college, into being far more interested in language, both natural and programming. For the next 12 years I grew increasingly knowledgeable about text manipulation, analyzers (the "front-end" of a compiler), and more. This area tends to have a lot of unique data structures to help support what's being done. One of these, the In-Tree, is widely used to implement call stacks, for tracing the function calls during execution, and in certain programming languages also used for scoping. You can also use an In-Tree for storing backwards traversal through a regular Tree, as an In-Tree is conceptually a splayed stack, and also where it gets its other names like Spaghetti Stack and Saguaro Stack.
So how did I never notice this in over a decade?
I've made contributions to GCC, Clang, and 9c. I should have noticed this in all that time. I believe the only reason I hadn't was that by the time I was providing fixes or tweaks, that part of the code was stable to the point of me not looking into it, and therefore not thinking about what I was seeing.
Now, I've written my own compilers too. Some, purely experimental, such as trying to work out how to implement Seed7-style syntax definition capabilities through clean-room techniques. More practically, I've written a really bad Ada compiler, both in that it's poorly written to the point of being unmaintainable, and it produced very poorly optimized code. It passed the ACATS, a test suite to determine an Ada compiler is conformant with the standard. A truly fantastic learning experience that I'm glad I did. But. It. Was. Bad.
So how did I implement these things without relying on an In-Tree? Graphs, mostly. Graphs are used a lot in game development, so I've grown extremely comfortable using them. This is the cool thing about independently solving problems. Sure, you have to work out tons of stuff on your own, and that's more difficult, but it also means you come up with unique solutions and everyone can then weigh those options. Utilization of directed graphs for call tracing is certainly more complex, but does allow for recursion to be built right in, and easily detectable. In fact, after learning this was normally implemented with an In-Tree, I took one of the compilers I have, compiled it, compiled a test program which recursed and would normally blow the stack (the In-Tree's current path), and ran it. No stack blowing. It just ran forever as expected if it was tail-call optimized. Only, to this day I still don't know how to tail-call optimize. So, like everything, this was a neat tradeoff, and I'm glad this option was uncovered.
What else have I worked on then? Sure, I mentioned indie game dev. But that was bad. Early part of my programming career. I thought I understood things better than I did. It was mostly turn-based-RPG's and action-RPG's. Towards the end I was favoring a Cross Code-like style, although nowhere near as polished as that gem is. And developed with what? RPG ToolKit. Getting that engine to do things it wasn't meant for was... an experience.
What outside of game dev though? Well, aside from contributions I mentioned earlier, there's more in the contributory area. I find low-level systems programming interesting as well, and so unsurprisingly have been involved in projects like Plan 9 and HaikuOS. HaikuOS is special to me as my first computer was a BeBox running BeOS, not Windows or Mac. I've done a lot of Ada development, although that doesn't happen anymore. I used to do a lot of libraries in Ada, such as advanced mathematics support or an alternatively designed containers/collections library. Those underwent numerous revisions over the years, constantly being improved, but spent the majority of their history as personal use libraries I'd then use for other things. A few years ago, 2016-2017 I think, I started reworking those libraries again, but actually putting them up publicly. At first on Chisel because I truly believe Fossil is the superior SCM, but later GitHub because more people use it. This led to two things: One, Ada has no default package manager which is just terrible and the project manager is compiler specific and in my opinion problematic, which led to an experiment with Ada-tools, a non-compiling analyzer for Ada providing these (this actually spawned the work in Stringier); and libanne, a push of my libraries into an actual replacement runtime, which was abandoned due to the difficulties getting it to work with an existing compiler that made assumptions about how the runtime would exist, beyond what the language standard asserted.
What I've got going on currently, is Stringier, as aforementioned; Consolator, which needs a major rewrite but is a replacement for System.Console; Defender, a guard-clause and other defensive programming library; and Langly, a DSL for rapidly creating language analyzers, which actually uses Stringier as its runtime.
Is this exhaustive? No. Due to some recent events I have gaps in my memory and I've probably been unable to recall parts of my own experiences. So, that's been interesting. But that's more or less my history.