Graham Trott

Posted on May 6, 2019

Language and complexity

#language #software #javascript #complexity

This piece was prompted by several recent questions posted on Quora, asking about the future of JavaScript.

I should start with a disclaimer; I am not a linguist, though I have considerable expertise in my own language (naturally), varying levels of proficiency in three other European languages and a smattering of two more. I'm interested in the relationship between language and complexity in the computer world, compared to that in the world we all inhabit. So please bear with me while I approach the subject in a roundabout manner.

The word 'language' is interesting. In Italian it's lingua and in French, langue. Both of these are closely related to the word for "tongue", and in fact we sometimes use that in English too, for example as in "mother tongue".

But in the English-speaking world we never refer to a computer language as a tongue. The reason is simple; you can't speak most computer languages without sounding like a gibbering lunatic or an unfunny version of Victor Borge.

I recently finished reading a detective novel, written in English and set in Jakarta. It was peppered with Indonesian words and phrases, plus several Dutch ones (Indonesia having once been a Dutch colony). Although the meanings of these words were rarely explained, it was usually possible to hazard a guess at their meanings from the contexts in which they were set.

The point is, at no time did I regard the novel as complicated. Words themselves do not imbue a written piece with complexity. We either understand them or we don't, but the overall impression we get isn't one of complexity, no matter how many new words are introduced. Hard to read, maybe; irritating, sometimes, especially when strange names are thrown in just for effect. The English language itself has more words than pretty well any other, but it's much less complex than German or Italian. Complexity is something different; it's related to structure. A complex novel has multiple intertwined storylines, a huge number of principal characters or regular flashbacks. These are structural features.

Which brings me to JavaScript, a computer language originally designed to provide website builders with the means to add interactivity to their pages. The initial feature set was tantalizingly inadequate, so early users were quick to push it to its limits and wanted more than it could easily deliver.

The response to this came from two directions. The core feature set of JavaScript is pretty complete but it operates at quite a low level, so it's able to act as a platform for other software to deliver enhanced features. One direction is outward, in the form of frameworks which provide constructs that are not present in the core language; the other upward, in the form of higher-level languages that replace core JavaScript.

Direction: Outward

Frameworks bring structure to a language that was originally rather deficient in that respect. However, they also add complexity. Nothing is actually replaced, just added to, and the additions come with complex rules about how they are used. The additions and the rules vary widely from one framework to another and intuition can do little to help with the learning process. This is rather unfortunate since fashions change rapidly and without warning. A developer can spend the best part of a year learning one framework, only to find the only available jobs ask for a completely different one, requiring another massive investment in learning.

As time goes by, newer versions of JavaScript itself include many of the features that made a framework necessary in the first place, to the point that it has been suggested (e.g. here) that many of them are redundant. The response of framework builders is to add more dependent features, all requiring an understanding of the complex requirements for their effective use and all diverging wildly from each other.

This trend can be assumed to continue for the time being, but there has to be an upper limit on how much complexity is tolerable because it increasingly limits the number of skilled engineers able to deal with it. This impacts more on maintenance than development, making it hard to maintain products when the tools that were used to build them were overly complex and, worse still, have become obsolete.

Direction: Upward

The second direction is upward rather than outward. Instead of surrounding JavaScript with extra functionality, these additions provide more expressive substitutes for core features.

Some may find this surprising, but the first example of this approach came before any of the major frameworks and is still very popular today. jQuery provides a kind of "shorthand" for many of the features required of any coding system that is specifically designed for browsers. It's a half-way step to a full-blown programming language in its own right.

jQuery language features are quite intuitive once the basic principles have been learned. They match well to a non-technical user's view of the browser and the Document Object Model (DOM) so they're quite easy to learn. Having said that, they still leave the rest of the JavaScript syntax fully exposed, which is why I called it a half-way step in the previous paragraph.

Higher-level scripting

Half-way to what, though? Here we move from the present to the future, where all bets are off. Any confident prediction made today will most likely be overwhelmed by some left-field development nobody foresaw. As someone joked, "Prediction is difficult - especially the future".

Although JavaScript is the only language directly understood by browsers, the use of other languages is not ruled out. Various transpilers exist that will take code written in Python, for example, and convert it into JavaScript. The approach is not without its disadvantages, requiring transpilation to be done before the resulting code can be used, and debugging in the browser can be problematical because the code you see bears little resemblance to what you wrote. But it's likely these and other problems can be overcome, so it's a viable way to go.

It has to be said that although Python is often regarded as a higher-level language than JavaScript, neither of them gets near to plain old English. Both are unapologetically computer languages, for programmers. This isn't universally the case, though. A good example of a much higher-level approach is AppleScript, which itself descended from HyperTalk, from the early days of the Macintosh computer.

AppleScript and similar languages are very English-like in appearance, resembling somewhat terse instructions for cookery recipes, navigation or step-by-step car maintenance. (They don't attempt to handle truly natural language; that's the job of an AI system and way outside the scope of this article.) Their big advantage is they can be understood not only by programmers but by ordinary people, most importantly the owners of websites whose requirements are being captured and implemented in code. As Eric Raymond put it in what he named Linus's Law, after Linus Torvalds, "Given enough eyeballs, all bugs are shallow". As users of SQL will attest, a language that both domain experts and programmers can read brings many benefits.

English-like scripts do not look like conventional program code. One significant difference is there's less attention to structure and more to intent. In other words, scripts tend to read rather like the user stories from which they were written. When programming with React or Angular you need to build the structure first. The intent is still there but is often hard to find. With high-level scripts, implementation starts with a very broad overview and gradually fills in the detail.

Frequently the programmer encounters the need for a section of code that is clumsy or inefficient to implement in script. This is usually a signal that some new syntax is needed, so a well-designed scripting language has the ability to accept plug-ins that seamlessly extend the language. This process is very much the way human languages work, creating new shortcuts to describe any complex entities that can be encapsulated in a succinct manner. The human brain welcomes these additions, not regarding them as added complexity but as simplification; new information that can be slotted in with that already existing, often replacing far more clumsy descriptions. Without the word "laser" it would be harder to have a discussion about how a CD player works, for example.
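As a rough illustration of the plug-in idea, here is a minimal sketch in JavaScript. The `addKeyword` and `runScript` functions and the tiny `put`/`double` vocabulary are invented for this example; they are assumptions and do not describe any particular scripting product.

```javascript
// Hypothetical mini-language: the first word of each line selects a handler.
// None of these names come from a real library; they only illustrate plug-ins.
const handlers = new Map();

// A plug-in is just a keyword plus a function that carries out the command.
function addKeyword(keyword, handler) {
  handlers.set(keyword, handler);
}

// Run a script line by line, dispatching on the first word of each line.
function runScript(source, state = {}) {
  for (const line of source.split("\n")) {
    const words = line.trim().split(/\s+/);
    if (words[0] === "") continue; // skip blank lines
    const handler = handlers.get(words[0]);
    if (!handler) throw new Error(`Unknown keyword: ${words[0]}`);
    handler(words.slice(1), state);
  }
  return state;
}

// Core vocabulary: "put 5 into Total"
addKeyword("put", ([value, , name], state) => {
  state[name] = Number(value);
});

// Vocabulary added later by a plug-in: "double Total"
addKeyword("double", ([name], state) => {
  state[name] *= 2;
});

const result = runScript("put 5 into Total\ndouble Total");
console.log(result.Total); // 10
```

The interpreter itself is untouched when `double` is added, so the new word simply slots in alongside the existing ones, much as a new noun does in a human language.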

Self-compilation

Whether the source language is to be Python or something looking like AppleScript, it still has to be compiled, or at least interpreted. The latter is very inefficient so I won't consider it further, but the opportunities for compilation are steadily growing. I mentioned transpilers earlier, but as computer hardware gets ever more powerful and software techniques more advanced, self-compilation starts to become possible.

JavaScript is incredibly powerful and allows quite inefficient code to run at an acceptable speed. It's possible to write a compiler in JavaScript that can process 10 or more lines of input script per millisecond, even on a smartphone and particularly if the output format is not actually JavaScript but some intermediate form that can be handled efficiently by a runtime engine also written in JavaScript. It's usually possible to schedule much of the compilation to be done while the page loads images, so the effect on load time is insignificant.
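The split between compiler and runtime engine can be sketched in a few lines of JavaScript. This is only an illustration under stated assumptions: the `compile` and `run` functions, the `set`/`add`/`print` vocabulary and the shape of the intermediate form are all made up for the example.

```javascript
// Hypothetical two-stage design: compile() turns script text into a list of
// plain objects (the intermediate form) and run() executes that list.
// The tiny "set/add/print" vocabulary is invented for this sketch.
function compile(source) {
  const program = [];
  source.split("\n").forEach((text, index) => {
    const words = text.trim().split(/\s+/);
    if (words[0] === "") return; // ignore blank lines
    switch (words[0]) {
      case "set": // set Count to 3
        program.push({ op: "set", name: words[1], value: Number(words[3]) });
        break;
      case "add": // add 4 to Count
        program.push({ op: "add", name: words[3], value: Number(words[1]) });
        break;
      case "print": // print Count
        program.push({ op: "print", name: words[1] });
        break;
      default:
        throw new Error(`Line ${index + 1}: cannot compile "${text}"`);
    }
  });
  return program;
}

// The runtime never sees the source text, only the compiled steps.
function run(program, output = []) {
  const variables = {};
  for (const step of program) {
    if (step.op === "set") variables[step.name] = step.value;
    else if (step.op === "add") variables[step.name] += step.value;
    else if (step.op === "print") output.push(String(variables[step.name]));
  }
  return output;
}

const program = compile("set Count to 3\nadd 4 to Count\nprint Count");
console.log(run(program)); // [ '7' ]
```

Because the intermediate form here is plain data rather than generated JavaScript, the compiler only has to parse each line once, and the runtime loop stays small and predictable.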

Load on demand

Many large web pages are complex and essentially monolithic structures, the bulk of which is included in the initial page load. With the growing popularity of the Web App format, where everything happens in a single page, this approach can too easily result in a long load time that degrades SEO performance. Not everything is needed right at the start, so systems should be able to load what they want when they want it. Having an on-board compiler able to read in scripts on demand is one simple way of achieving this aim. Although you can load JavaScript modules on the fly you need to deal with browser caching and security (CORS) issues, whereas high-level scripts are just text; they can be compiled in the browser itself. The only JavaScript code needed up front is the compiler and the runtime engine; a current example of these, able to do the bulk of what most websites need, weighs in at under 200 KB. For even more performance enhancement you can precompile scripts and load the precompiled modules, avoiding the need for a compiler to be included when the page runs.

In this scenario, scripts are independent code modules that work with other scripts by passing messages back and forth, so there's no need to understand the whole structure in order to code for it. This reduction of complexity is a key benefit of a distributed code approach.
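To make the message-passing idea concrete, here is a minimal sketch in JavaScript. The `createHub` function and the `prices` and `basket` modules are hypothetical names chosen for the example, not part of any real library or of the system described above.

```javascript
// Hypothetical message hub: modules register under a name and only ever
// talk to each other through send(). Nothing here is a real library API.
function createHub() {
  const modules = new Map();
  return {
    register(name, onMessage) {
      modules.set(name, onMessage);
    },
    // Deliver a message and return whatever the receiving module replies.
    send(to, message) {
      const onMessage = modules.get(to);
      if (!onMessage) throw new Error(`No module named ${to}`);
      return onMessage(message);
    },
  };
}

const hub = createHub();

// Each module knows its own job and nothing about the others' internals.
hub.register("prices", (message) => {
  const table = { apple: 30, pear: 45 };
  return table[message.item];
});

hub.register("basket", (message) => {
  // The basket asks the price module instead of reading its data directly.
  return message.items.reduce(
    (total, item) => total + hub.send("prices", { item }),
    0
  );
});

console.log(hub.send("basket", { items: ["apple", "pear"] })); // 75
```

Either module could be rewritten or replaced without the other noticing, provided the messages keep the same shape; that is the sense in which nobody needs to understand the whole structure.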

My conclusion is that frameworks increase complexity but higher-level languages decrease it or at least hold it at a manageable level. For the time being the former are where all the action is, but they are steadily outgrowing our ability to keep up. It's time for alternatives to be created; preferably ones that increase the accessibility of coding, not preserve it as the domain of a shrinking pool of highly specialized professionals.

Title photo by Mark Rasmuson on Unsplash

Top comments (2)

Anwar • May 10 '19

Always a great time reading your articles. It goes without saying that I agree with your point. I guess this is the reason why we see more and more framework fatigue. Complexity sometimes forces us to think in a complicated manner to solve problems that were clearly specified.

I am noticing a behavior in my company that I had not paid attention to before reading your article: executives with only a light knowledge of programming understand SQL better than PHP. For example, when I try to explain how my program works, which most of the time depends on a concept we have modelled in a table, if I start from the PHP code I am more likely to lose the attention of my colleagues. But when I show them pure SQL, they quickly engage with me and I notice they pay closer attention. I guess that since SQL is one of the languages that comes closest to our natural language, everything seems clearer.

Great article, and an important one for sparking some thoughts. I think the programming ecosystem as a whole needs a paradigm shift, because tomorrow, with more and more automation, we might not be able to maintain such complex code bases efficiently.

Graham Trott • May 11 '19

Yes, it seems intuitively true that the closer we get to English the better we are able to communicate with each other. Computer languages like C, Java and Python (with or without frameworks) are great for expressing highly algorithmic structures but they are too complex to permit anyone but a programmer to fully understand what is going on, and that's where mistakes get made. Users of SQL understand well the benefits of being able not just to talk but also to show, as you describe.

My belief is we should pay more attention to 'containerization', where functionality is kept in components with well designed interfaces, and with a top-level controller that operates in something as close as possible to English.

My favorite analogy is of a roomful of people, all experts in different subjects. If I want to know something about waterproof fabrics I ask the materials expert; if it's a financial question I ask an accountant and if I have a sore toe I consult a chiropodist. Everything happens as 'messages' sent from one 'component' to another. Too many systems are built as if all the experts had their brains physically wired together, so when one moves the whole lot have to move. Sometimes the strain is too much, connections break and the system fails.