Recently a blog post entitled "A Response to Hello World" by Caleb Doxsey has been making the rounds. In it he tries to dissect Drew DeVault's arguments for software simplicity in his own blog post "Hello World".
First things first: this isn't meant as a personal attack on Caleb Doxsey in any way. The points he makes are all reasonable and – depending on your domain and background – right. I just want to shed light on views and opinions that are, in my opinion, declining in popularity and acceptance. And I want to point out that this is probably the very point Drew wanted to make as well.
Meanwhile, Drew himself has posted a follow-up on the topic where he explains the motivation behind the initial article in a bit more detail. The biggest point he makes here is that his first post was in no way meant as a comparison or – as he puts it – a benchmark between the different programming languages he'd mentioned, but just as an emphasis on one point: complexity. And as his measurement of complexity he used the number of syscalls issued by a program for a specific task.
I agree with Drew's opinion that the software complexity induced by the abstraction and layering done by nearly all developers in recent years is eating up resources everywhere, and that only awareness of the problem can lead us into a better and more performant future in the field of software development. I want to elaborate on this by commenting on Caleb's blog post, because I think he gives pretty good responses to the points brought up by Drew, just from a different point of view.
In his first part, Caleb talks about the three downsides of complexity that Drew mentions and elaborates on them one after another.
I have to fully agree with Caleb here that higher-level languages such as Go are not necessarily harder to debug than lower-level ones just because they're issuing more syscalls. Debuggability is much more influenced by the architectural complexity of the software, by whether or not parallelism is involved and – most importantly! – by the tooling available.
The argument that a high-level language means more available Stack Overflow answers for your problems is questionable, though.
This argument is talked down by Caleb, but I have to stand up for it. The ever-increasing size of programs is not just annoying; for me as an embedded developer, it's a real problem. eMMCs are always too small, and they won't get bigger just because you want to ship that fancy Go application. Binary size heavily impacts the startup performance of embedded devices – yes, all these bits and bytes have to be read from a slow eMMC and put into memory when the device starts up! Storing build artifacts gets more and more expensive, and the demands on the network capacity of your build infrastructure increase steadily.
Have you ever been annoyed by the way-too-small integrated memory and the always-full RAM of your Android device? Well...
Drew points out that more complex programs lead to longer execution times and that, in turn, leads to a worse user experience. And this is true, full stop.
Caleb tries to relativize this argument by saying that, while some of the programs initially posted by Drew in fact need more time to finish, all the times are still fast enough. This may be right in this special case, but it's definitely not true in general. Furthermore, he tries to invalidate the argument by improving the Go version of the test provided by Drew and misses the actual point in multiple ways:
- He compares his optimized version against a not-in-the-same-way-optimized version of the assembly test.
- By optimizing the program in the first place he acknowledges Drew's main point that there's complexity in that program that the programmer needs to be aware of and that needs to be worked around in order to achieve acceptable performance.
- He works around startup complexity and argues that it's thus negligible. But startup performance in particular is essential! (Have you ever sat in front of your under-powered Windows machine waiting for the first call to Python that day to finish?)
He's right, though, that higher-level languages make it easier to develop multi-threaded programs and thus to actually utilize your whole hardware. But multi-threading isn't the solution to all problems, in particular when a more efficient environment could solve the same problem easily with just one thread.
In his next part Caleb talks about the costs of compiling programs into more efficient code. Of course it's true that optimization costs time and can disrupt your workflow, and may even cost you a nice little feature in your language. And indeed that's an argument I hear quite often. Still, I disagree.
The code I've written professionally may run millions to billions of times. If a compiler optimization takes me one or two seconds longer but improves the performance of every one of those countless runs, is that really that high of a cost? Maybe folks should get away from compiler-driven development and try to write (nearly) error-free code in the first place? That would probably reduce compile times far more than leaving out some optimizations.
Here, Caleb breaks down the syscall-induced complexity found by Drew. He categorizes and classifies the different syscalls found in the Go binary and comes to the conclusion that they're all useful. Again, their usefulness depends on what you're expecting, but nevertheless I have some comments.
Whilst it's nice that Go handles multi-threading at its core and makes it a first-class programming paradigm, forcing the developer to use it is no good in my opinion. Many developers would be really surprised what can be achieved by a single-threaded application when you just know the right tools and patterns. Multi-threading comes with its own costs, its own complexity, and increases the possibilities for errors by orders of magnitude. And all this just to find most of your threads hanging in poll(2) all of the time.
Yes, it's true, garbage collection can protect you from all kinds of memory errors. But it comes at a cost, and the developer should have the right to choose. Additionally, many of the bugs handled by garbage collection can be found by static analysis and/or valgrind without decreasing the runtime performance of your shipped binaries. Yes, most programmers don't use these tools, so maybe convincing them would be more appropriate than passing that debt on to the end user.
I don't get Caleb's point here. Yes, these file descriptors are non-blocking. Still, they can be poll(2)ed, and if you need blocking behaviour you can restore it with fcntl(2).
Converting all signals into run-time panics indeed sounds useful since proper signal handling is a permanent source of errors and the cost seems appropriate. Still, this should be optional (maybe opt-out).
This seems like solving a non-problem to me.
As I said in the beginning, all of the above – all the arguments used and disproved – heavily depends on who you are, on the domain you're in and on your goals. I don't want to say that my points are right, I just want to point out that these are valid points, too. Today, talking about software bloat puts you into a niche that it's hard to get out of. You're not one of the cool kids. And that's not justified: not all computers are your big iMac; in fact, IoT devices and edge computing are on the upswing. Software bloat is real for embedded development.
The one takeaway of this article is: whenever you're introducing complexity, step back and take a moment to think about whether it's necessary, think about your users and think about side effects. Often software bloat is just passing the bill for ease of development on to the end user, and that's not fair. Making your users buy new hardware just so you can use that fancy new language feature is mean and destructive. As a developer you're a service provider; software development is not an end in itself.