Update:
Be sure to read the improved 2nd draft of this article instead. The following content is the original article, only saved for posterity, and to respect the original URL. This article garnered some fruitful discussion on Hacker News, and some comments here on dev.to you can read at the end.
First, a few overarching guiding principles:
- All programming languages are, in actuality, built to overcome human limitations. Otherwise, one might as well be typing 0's and 1's, or a lower-level language like Assembly.
- Software architecture in general (and frameworks in specific) is a way to organize the mind of the developer(s), categorising the conceptual world into what's closely or merely remotely related (giving rise to principles like 'cohesion over coupling' etc.). (This might explain OOP's popularity.) The machine would be perfectly content with executing even spaghetti code. Inspired by Martin Fowler.
- A programming language has certain affordances, allowing you to talk specifically about/with some concepts (typically the first-class citizens of the language), and avoid having to talk about other things (e.g. memory management, language runtime concerns). This does not only apply to DSL's.
- "each programming language has a tendency to create a certain mind set in its programmers. ... you tend to have a mental model of how to do things based on that language. ... Such a mind set may make it difficult to conceive of solutions outside of the model defined by the language." - Dennis J. Frailey
Only by accounting for human limitations (like cognitive capacity, and familiarity), could one derive a specification for the ideal programming language.
- A bug is an error in thinking, either by the developer or by the language designer, for not sufficiently accounting for human psychology (Sapir-Whorf: the language you write/speak determines what you can and do think).
- To reduce bugs, a language should ensure simple, safe, and scalable ways of thinking. For instance:
- Type systems are a way to use the compiler to help us verify our beliefs about our own code: they help us think consistently.
- Closures enable the programmer to specify and share behaviors that are already half-way thought through and defined.
- Transducers allow the programmer to define and compose behaviors without having to think explicitly about what's behaving.
- Currying allows the programmer to take one grand behavior and break it down into smaller behaviors that can be reused independently, or used in sequence.
- Composition allows the programmer to think & build piece by piece, instead of all at once, and without the context influencing too much.
- "Open for extension, closed for modification": A programmer can recognize something useful, and add more pieces to it, without having to change the original thing (e.g. extension methods in C#), and without tying those new pieces too closely with the original thing (e.g. subclassing), thereby limiting their reuse.
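To make three of the items above concrete, here is a minimal sketch in Python (chosen only for illustration; it is not the dream language). All names (`make_greeter`, `add_tax`, `compose`, and so on) are hypothetical, and `functools.partial` is used as a stand-in for currying, since Python has partial application but no built-in currying.

```python
from functools import partial, reduce


def make_greeter(greeting):
    """Closure: the returned function remembers `greeting`."""
    def greet(name):
        return f"{greeting}, {name}!"
    return greet


def add_tax(rate, amount):
    """One 'grand' behavior that takes everything at once."""
    return round(amount * (1 + rate), 2)


# Partial application (stand-in for currying): fix `rate` now,
# and reuse the smaller behavior independently later.
add_vat = partial(add_tax, 0.25)


def apply_discount(amount):
    return amount - 10


def compose(*functions):
    """Compose left-to-right: compose(f, g)(x) == g(f(x))."""
    return lambda value: reduce(lambda acc, fn: fn(acc), functions, value)


# Composition: build the pipeline piece by piece.
price_pipeline = compose(apply_discount, add_vat)

say_hello = make_greeter("Hello")
```

Each piece can be read, tested, and reused on its own, which is the point of the list above.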
A language determines WHAT you can & have to think about, but also HOW you have to think about it.
"Things that are different should look different". Inspired by Larry Wall on Perl's postmodernism and my own frustrations with modern component frameworks like React, and my impression that Lisp/Clojure is hard to learn precisely because it has so little syntax: when everything looks the same it is hard to tell things apart. That said, it is praiseworthy to stay very frugal with syntax, since more syntax necessitates more learning/documentation (knowledge debt, info overload), more avenues for confusion (the best code is no code), and more complications (language intricacies can lead to software intricacies, which can lead to bugs). My philosophy leans more towards Golang (fewer features, readability is reliability, simplicity scales best) and Python (explicit over implicit, one way over multiple ways), than Ruby (provide sharp knives) and Perl (postmodern plurality, coolness/easiness is justification enough in itself). Even though I come from Ruby and love it, and also cannot help admiring Lisp for its elegance and crucial evolvability.
Purpose: What should this dream language of mine primarily be for?
- Webapp / app + systems development. In Rich Hickey's words: Information-driven situated programs. Ideally, also open to extension into more areas of programming.
- Scripting and prototyping, but also scalable to production use (app/webapp)
- Systems development (compiled)
So, from various sources of inspiration, and these principles in mind, here is the list of features that my dream programming language would have.
Features:
Features in bold are considered most important.
- Readability and reasonability as top priority. Reduce dev mind cycles > reduce CPU cycles. Human-oriented and DX-oriented. Willing to sacrifice some performance, but not much, and not to overly gain comparability with natural language (counter-inspired by SQL; inspired by Cypher). Willing to sacrifice immediate power in the language itself, esp. if that can be achieved through abstracted-away libraries.
- Should always be able to be read top-to-bottom, left-to-right. No `<expression> if <conditional>` like in Ruby.
- Readability should not imply a one-to-one match with natural language (counter-inspired by SQL), since natural language is inconsistent, duplicitous, ambivalent and multi-faceted. Consistency is key to a programming language. But it should borrow some similarities from natural language (like its popular Subject-Verb-Object structure) to make adoption easier (more at-hand/intuitive).
- Encapsulation. Everything should be able to be encapsulated (all code, whether on back-end and front-end), since encapsulation affords reasonability (and testability), by limiting places bugs can hide. Counter-inspired by Rails views (sharing a global scope) and instance variables. Inspired by testability of React components.
- No need to keep things in human memory for more than about 20 lines of code at a time. If extending that, then there should be a conceptual model which can be carried forward incrementally in the programmer's mind (implying state and mutation at the conceptual level, but not necessarily implemented as such).
- No need to manipulate data structures in the human mind. Programmer should always be able to see the data structure he/she is working on, at any given time, in the code. Inspired by Bret Victor, and Smalltalk. So the language should make that easy for tooling to support. But without being a whole isolated universe in its own right like a VM. Counter-inspired by Smalltalk. Some have described this as REPL-driven-development, or interactive-programming. Inspired by Clojure. But without having to leave your IDE to code in a console/prompt/terminal. Inspired by QuokkaJS.
- Params: Function parameters must be named, but no need to repeat yourself, if the argument is named the same as the parameter (i.e. keyword arguments can be omitted). Inspired by JS object params, and Ruby. Counter-inspired by the Mysterious Tuple Problem in Lisp. If currying, then input params should be explicit at every step (for clarity and refactorability). Counter-inspired by Point-free style in FP (since "explicit is better than implicit", inspired by Python).
- No Place-oriented programming (PLOP), iow. avoid order-dependence at almost any cost, since it isn't adaptable/scalable. This goes for parameter lists to functions etc. Don't want to have to use a '_' placeholder for places where there could be a parameter, just because you didn't supply one. Consequence: use primitive data structures like dictionaries over records/fields. Inspired by Clojure.
- No `unless` or other counter-intuitive-prone operators. Counter-inspired by Ruby.
- No abstract mathematical jargon. Counter-inspired by Haskell. Should be accessible for as wide a community as possible, with as little foreknowledge as possible. Inspired by Quorum.
- Do not presume contextual knowledge. In UX this is known as "No modes!". Code should be able to be read A to B without having been educated/preloaded with any foreknowledge (like 'in this context, you have these things implicitly available'). Counter-inspired by class inheritance and Ruby magic, and JavaScript's runtime-bound `this` keyword and associated scoping problems. Turns out too much dynamism (runtime contextualisation) can be harmful.
- Should facilitate and nudge programming in the language towards code with low Cognitive Complexity score.
- Dynamic Verboseness: Should be able to show more/less of syntax terms in the code. Beginners will want more explicitness, as code can be more self-explaining. Whereas professionals will want to write less and have more implicit/hidden from sight, so they can focus on their problem domain without clutter from the language. See: content-addressable code.
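The points above on named parameters and avoiding place-oriented programming can be approximated in Python with keyword-only parameters. This is only a sketch under that assumption: `create_user` and its fields are hypothetical, and Python has no shorthand for omitting the keyword when the argument variable has the same name as the parameter.

```python
def create_user(*, name, email, is_admin=False):
    """The bare `*` makes every parameter keyword-only (no positional calls)."""
    return {"name": name, "email": email, "is_admin": is_admin}


# Argument order is irrelevant, and no placeholder is needed
# for the parameter that was not supplied (`is_admin`).
user = create_user(email="ada@example.com", name="Ada")
```

A positional call such as `create_user("Ada", "ada@example.com")` raises a `TypeError`, so order-dependence cannot creep in.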
Not indentation based (counter-inspired by Python), since it is brittle. But also not require semicolons. Inspired by Ruby, and semi-colon-free JS.
- Fast feedback to the programmer is second top priority. Inspired by TypeScript hints, QuokkaJS (!), Webpack Hot Reload, and Expo Live Reload.
- REPL / interactive shell. Can be done even if compiled, by having an interpreter on top of a VM to the compiler.
- Refactorability / change-ability.
- Similar-looking and non-interacting code-lines should be able to change place without breaking anything. Counter-inspired by not being able to add a comma to the last line in JSON, not being able to reorder/extract from comma-separated multi-line variable declarations in JS, and also counter-inspired by the contextualised expression terminators in Erlang.
- Backward-compatible and forward-compatible. Should be able to not worry about (or make poor future tradeoffs due to) backwards-compatibility. (Counter-inspired by ECMAScript.) To make the language optimally and freely evolvable, and worry-free to upgrade. Backwards-compatibility and Forwards-compatibility: Code in one language version should be transformable (in a legible way) to another version (both ways; backwards and forwards). Solutions could be to either: have simple CLI tools to automatically refactor old code to new language versions, to always stay optimally adaptable, without having breaking changes. Maybe using some form of built-in self-to-self transpilation. Will likely need to be able to treat code-as-data. Might need compile-time macros. Or a solution could be to: with every breaking language revision, include an incremental language adapter, which would allow upgrading whilst ensuring backwards compatibility. Could be solved with Mechanical Source Transformation, enabled by gofmt, so developers can use gofix to automatically rewrite programs that use old APIs to use newer ones. Which is crucial in managing breaking changes. A breaking (aka. widely deviating) change should, in effect, not actually break anything (that current languages and systems do is considered a "pretty costly" design flaw).
- A language for library authors. Inspired by the success of C++. The language should be able to evolve by community convention, not by centralised specification: the language itself should be extensible with libraries (would probably need to have compile-time macros).
- Content-addressable code: names of functions are simply a uniquely identifiable hash of their contents. The name (and the type) is only materialized in a single place, and stored alongside the AST in the codebase. Avoids renaming leading to breaking third-parties, and avoids defensively supporting and deprecating several versions of functions. Avoids codebase-wide text-manipulation, eliminates builds and dependency conflicts, robustly supports dynamic code deployment. Code would also need to be stored immutably and append-only for this to work. All inspired by Unison.
- Convention over configuration (CoC). The language should be "open to extension" by any community, without permission. So that it can evolve and converge to a consensus, based on real-world experience and feedback. Not being completely and statically "configured" up front, which would entail predicting future uses, and consequences of those uses. (Thus, the feature-list you are now reading would just be a first draft, of course.) This is mirrored in the important talk Growing a Language, by Guy Steele, and the point on crucial evolvability. NB: "if you apply [CoC] dogmatically you end up with an awful lot of convention that you have to keep in your head. It's always a question of balance; Hard Coding vs. Configuring vs. Convention, and it's not easy to hit the optimum (which depends on the circumstances)." as Peer Reynders reminded us.
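The content-addressable code idea from the list above can be illustrated with a rough Python sketch. It is not how Unison does it: the hypothetical `content_hash` helper simply hashes a dump of the syntax tree with the function's own name blanked out, so renaming or reformatting a function leaves its identity unchanged. Local variable names still affect the hash here, and a recursive function that refers to itself by name would need extra handling.

```python
import ast
import hashlib


def content_hash(source: str) -> str:
    """Hash a function's syntax tree, ignoring its name and its formatting."""
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            node.name = "_"  # the name is metadata, not content
    # ast.dump omits line/column attributes by default, so layout is ignored.
    return hashlib.sha256(ast.dump(tree).encode("utf-8")).hexdigest()


original = "def total(items):\n    return sum(items)\n"
renamed = "def sum_all(items):\n\n    return sum(items)\n"
changed = "def total(items):\n    return max(items)\n"

same_identity = content_hash(original) == content_hash(renamed)
different_identity = content_hash(original) != content_hash(changed)
```

With identities like these, a rename would only touch the single place where the human-readable name is stored, not every caller.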
Modularity. Module system which is sensible. Inspired by the NodeJS controversy. Code-splittable and tree-shakeable. Inspired by Rollup.
- Quick to get started and produce something. Inspired by JS. Counter-inspired by JS tooling.
- Not too unfamiliar (to a large group of programmers, and to what they teach in universities). "Familiarity and a smooth upgrade path is a really big deal." source
Sensible, friendly, and directly helpful error messages. Inspired by Elm.
- Struct-Oriented Functional Programming (SOFP). Mimics the style of object-orientation, but is simply structs and functions under-the-hood. (Also: functional programming patterns over procedural code.) Because it is human to see the world in terms of verbs and objects. Focusing too heavily on only one of the paradigms (OOP or FP) will either lead to anti-patterns (God classes/objects, Factory objects, and Singletons, in OOP), or program structures far removed from the business domain model which also has linguistically unintuitive syntax (since only 12% of natural languages start with the verb, either Verb-Subject-Object or Verb-Object-Subject, ref), as in pure FP. So Subject-Verb-Object should be preferred. (88% of natural languages start with something concrete, the Subject/Object, which I think is a reason for OOP's success; it is more intuitive for beginners, which is vitally important for onboarding & growth.) "Objects and methods" could be merely syntax sugar for structs and functions (see: interchangeability of method-style and procedure-call-style, or the pipe first operator in ReScript, which also illustrates emulating object-oriented programming), if one leaves out troublesome inheritance (which might be good, since composition > inheritance). Inspired by Golang.
- Functional programming patterns like `.map`, `.filter`, over procedural code like `for`-loops etc., since the latter would encourage mutating state, and we want immutability.
- Tree-shakeable code (for client-server webapps). So it should need a source code dependency between the calling code and the called function. Which makes the language more FP than OOP, according to one definition of FP vs. OOP. In general, shifting concerns from runtime to compile-time is considered a good thing, as it makes the language more predictable, optimizable, and affords helpful coding tools. Having consequences of code changes appear at runtime is a bad thing (see: The Fragile Base Class problem of OOP).
- Referentially transparent expressions. Which means variables cannot be reassigned, so a name will always refer to the same value (see principle: "Things that are different should look different"). Inspired by Haskell. This feature should also lead to easy automatic parallelization and memoization.
- Formally verifiable / provable. Nice-to-have, not must-have.
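As a rough illustration of "structs and functions" combined with `map`/`filter` over loops, here is a Python sketch using a frozen dataclass as the struct. The `Order` type and the helper functions are hypothetical; Python lacks the pipe-first sugar mentioned above, so plain function calls are used.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Order:
    """A plain struct: data only, no behavior, no inheritance."""
    customer: str
    total: float
    paid: bool = False


def mark_paid(order: Order) -> Order:
    """Return a new struct instead of mutating the given one."""
    return replace(order, paid=True)


def is_large(order: Order) -> bool:
    return order.total >= 100


orders = (Order("Ada", 250.0), Order("Bob", 40.0), Order("Cy", 120.0))

# map/filter instead of a for-loop that mutates an accumulator.
large_paid = tuple(map(mark_paid, filter(is_large, orders)))
```

Because `Order` is frozen and `orders` is a tuple, the names `orders` and `large_paid` keep referring to the same values, which is the property the referential transparency bullet asks for.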
Parallelization made natural. Aided by pure functions. The language should make it easy/natural for programmers to use parallelism (nudges), through language constructs (like executing several sequential lines simultaneously), to avoid overly sequential thinking, which leads to suboptimal performance due to the fact that programmers/humans think sequentially. Inspired by Verilog's fork/join construct. But as opposed to the fork/join example, the language should enforce a deterministic order, guaranteed implicitly by the lines' sequential top-down order of appearance in the code (a novel idea, which would need to be experimented with thoroughly). Alternatively, take inspiration from Golang's elimination of the sync/async distinction and allow programming everything in a sequential manner, but do parallelism under the hood. The sync/async barrier elimination, however, doesn't necessarily nudge programmers towards using parallelization (spinning off new threads) within the context of a program (thread). That style might conflict, or it might be synergistic with the goal of nudging programmers towards making more use of parallelization. Ideally, the language runtime should be able to use parallelization to handle multiple independent processes (like client/server requests; goroutines for concurrency), but also automatically distribute a single program across multiple CPU cores (without special directives, like thread/go) when those cores are idle. To do that, the language should nudge towards natural use of multi-threading instead of single-threading. But not at the expense of readability/reasonability, which is the top priority. The programmer should be concerned with, and simply describe independent sets of causal/logical connections, and the language runtime should automatically take care of as much parallelization as possible/needed.
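One existing approximation of "run independent work simultaneously, but keep a deterministic order" is an order-preserving parallel map. The Python sketch below is only an illustration of that shape: `parallel_map` and `square` are hypothetical names, and in CPython threads mainly help with I/O-bound work rather than CPU-bound work.

```python
from concurrent.futures import ThreadPoolExecutor


def square(number):
    """A pure function: safe to run in any order, on any worker."""
    return number * number


def parallel_map(function, items, max_workers=4):
    """Run `function` over `items` concurrently.

    `Executor.map` yields results in the order of the inputs,
    regardless of which call happens to finish first.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        return list(executor.map(function, items))


results = parallel_map(square, [1, 2, 3, 4, 5])
```

The calling code still reads top-to-bottom and sequentially; only the execution is spread out.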
- Compiled, but also interpreted and/or incrementally compiled (for dev mode). Inspired by C++ and JS.
- Interpreted / incrementally compiled: So developer can write quick scripts and get fast feedback. Sacrifices runtime speed for compile-speed. Except it also needs quick startup/load time.
- Compiled: For production. Sacrifices compile-speed for runtime speed. Compiles to a binary. Inspired by Deno.
- Small core language: Compiled down to a small instruction set, which can be used/targeted as a starting point to generate code for other programming languages (i.e. generate JS).
- Portability: Be able to target and run on multiple computer architectures.
Mutable API, but immutable under-the-hood. Immutable/persistent data structures (like Lean-HAMT) and structural sharing, to allow incremental update, while also avoiding duplication of data. Inspired by Clojure. Alternatively: In-place mutation, where data structures only become immutable when they're shared (presumes keeping track of borrowing / reference counting). Inspired by Rust. The immutability will also facilitate concurrency and avoid race-conditions. As a bonus you could get time-series and thus time-travel for data. The desirability of a mutable API (mutating objects instead of always having to pass in functions) is inspired by the JS libraries Immer and Valtio.
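The "mutable API, immutable under the hood" idea can be sketched in Python in the spirit of Immer's draft pattern. This is a deliberately naive version with a hypothetical `produce` helper: it deep-copies the whole state instead of using structural sharing, and only the top level of the result is read-only.

```python
import copy
from types import MappingProxyType


def produce(base, recipe):
    """Let `recipe` mutate a private draft, then return a read-only result.

    Naive: copies everything (no structural sharing), and nested
    containers inside the result are still ordinary mutable objects.
    """
    draft = copy.deepcopy(dict(base))
    recipe(draft)
    return MappingProxyType(draft)


state = MappingProxyType({"user": "Ada", "tags": ["a"]})


def add_tag(draft):
    # Reads like in-place mutation, but only the draft is touched.
    draft["tags"].append("b")
    draft["user"] = "Ada L."


next_state = produce(state, add_tag)
```

After this, `state` is unchanged and `next_state` holds the updated values; keeping both around is what would enable the time-travel bonus mentioned above.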
- Very constrained. Since discipline doesn't scale. Inspired by Golang. Should assume the developer is an inexperienced, lazy, (immediately) forgetful, and habitual creature. As long as software development is done by mere humans. This assumption sets the bar (the worst case), and is a good principle for DX, as well as UX. The constrained nature of the language should allow for quick learning and proficiency. Complexity should lie in the system and domain, not the language. When the language restricts what can be done, it's easier to understand what was done (a smaller space of possibilities reduces ambiguity and increases predictability, which gives speed for everyone, at a small initial learning cost). No alias names in the language, except for in documentation. Inspired by Python (explicit over implicit, one way over multiple ways). Counter-inspired by Perl (postmodern plurality) and aliasing in the Ramda library. The language should favor one consistent vocabulary, since it increases predictability and reduces variability. Names should not mimic any other language per se, but attempt to cater to complete beginners, because notation has a large impact on novices, a principle inspired by Quorum.
- "`<insert your favorite programming paradigm here>` works extremely well if used correctly." as Willy Schott said. The ideal programming language should both work extremely well even when used incorrectly (which all powerful tools will be), and first and foremost be extremely hard to use incorrectly.
- Not overly terse. Counter-inspired by C. Maybe give compiler warnings if the programmer writes names with fewer than about 4 characters. Reading >>> writing, since time spent reading is well over 10x time spent writing (inspired by Robert C. Martin), and writing can be alleviated with auto-complete, text macro expansions, and snippets, in the IDE.
- No runtime reflection. Counter-inspired by meta-programming and runtime type inspection in Ruby.
- Not overly verbose. Counter-inspired by XML and Java. Maybe compiler warnings if the programmer writes names with more than about 20 characters.
- The Rule of Least Power (by W3C) suggests a language should be the least powerful language still suited for its purpose. To minimise its complexity and surface-area. For better reuse, but more importantly: to make programs, data, and (I will include) data flows, easier to analyse and predict. Inspired by FSM & XState. It needs, however, to be just powerful enough to be generally useful (and not limited to a DSL). Possibly Turing-complete. Given these considerations, a Lisp-style language comes to mind. But there are reasons Lisp never became hugely popular. My guess: readability. So while it could be a Lisp-language (or compile to one), it should read better than one.
- It should be small, but extensible by using simple primitives. Preferably, the language should be self-hosting, but if not, then probably built using OCaml, Rust, Racket or maybe Haskell (LLVM has bindings to these languages). Should do more with less. Inspired by Lisp. Since predictability is good for humans reading, and for machines interpreting, and if it's predictable to machines, humans also benefit. Important: "As one adds features to a language, it ramps up the complexity of the interpreter. The complexity of an analyzer rises in tandem." - Matt Might, on static analysis
- Code-Formatter, like gofmt, inspired by Golang. A tool to auto-format code into a standard. Since standardisation creates readability and faster onboarding of new developers. It also enables mechanical source transformation, which is crucial for language evolvability.
- Containability and explicitness. Inspired by pure functions. Perhaps the language should even restrict a function's scope only to what's sent in through its parameters. So no one can reference hidden inputs (i.e. side-causes). Thus enforcing more predictable functions, where it is always apparent at the place of use what the function takes in and what it returns. One way to achieve partial application of functions (i.e. useful closures), without addressing the outer scope implicitly, could be to supply constants from the outer scope as default/preset/front-loaded parameters. Since "explicit is better than implicit" (inspired by Python's principles). That way, they would be declared in the function signature, so you don't have to dive into the function to discover/investigate them. With the added benefit that the function could be customized by the caller through overriding the defaults.
- Memoization automatically. Aided by pure functions. The programmer shouldn't have to think about memoization when programming, but should be able to tune the degree of memoization (since it is a space/time tradeoff) through general configuration, for advanced cases not optimal from the default. Run time optimisations such as these are not critical features, but certainly nice to have, and should be considered in the language design.
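Both ideas above have rough counterparts in today's Python, sketched here with hypothetical names (`TAX_RATE`, `price_with_tax`, `fibonacci`). Outer-scope constants are front-loaded as keyword-only defaults, which Python evaluates once at definition time, and memoization is opt-in via `functools.lru_cache` rather than automatic, with `maxsize` as the space/time tuning knob.

```python
from functools import lru_cache

TAX_RATE = 0.25


def price_with_tax(amount, *, tax_rate=TAX_RATE):
    """The outer constant is visible in the signature and can be overridden."""
    return amount * (1 + tax_rate)


@lru_cache(maxsize=128)  # bounded cache: trades memory for speed
def fibonacci(n):
    """Pure function, so caching results cannot change its behavior."""
    return n if n < 2 else fibonacci(n - 1) + fibonacci(n - 2)


default_price = price_with_tax(100)
custom_price = price_with_tax(100, tax_rate=0.5)
thirtieth = fibonacci(30)
```

A reader of the call site or the signature sees every input; nothing has to be discovered by reading the function body.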
Pattern-matching. Inspired by Elixir. The expression-oriented nature of the language should make this natural, without extra/fancy syntax.
Not file boundary dependent. Can be split into files, but execution shouldn't be dependent on file boundaries. So the programmer is free to keep code tightly together. Inspired by SolidJS.
- No magic / hidden control. Control-flow should be easy to trace, because it makes it easy to understand and debug. Less magic. Counter-inspired by Ruby on Rails. Inspired by Elixir Phoenix routing / endpoint plugs. Testing isolated parts of the system is made possible by explicitness. Explicit is better than implicit. Inspired by Python's principle. So Explicit > Implicit, almost always. (Although implicitness is preferred when one may intuitively and robustly determine the convention through the context. E.g. needing `self.` references to access class variables inside the class methods would just be noise. This is counter-inspired by Python, and inspired by Ruby. However, using `self` and `this` is considered an anti-pattern in general.)
- Make Inversion of Control (IoC) hard/impossible (?). Should ideally always return control to the programmer, not take it away. To enable the programmer to always follow the control-flow by simply reading and following references. Thus, `yield` should also be avoided (counter-inspired by Ruby). Problem: Could make domain code dependent on integrations, which goes against the dependency inversion rule. So other patterns, like containing integration coupling in an intermediary abstraction ('port/adapter' function or library, or 'channels'), would need to be developed.
- Libraries over frameworks, as a strongly recommended community convention (because frameworks cannot be prevented by a language, afaik). Frameworks utilise inversion of control. That creates Stack Traces which are really hard to debug, because they reference the framework and not your own code, esp. problematic with concurrency. And when yielding control to various (micro-) frameworks, compatibility becomes a specific issue. The programmer shouldn't ever have to ask: "Is this library/framework compatible with this other one?". Counter-inspired by JS ("JS Fatigue"). Or have to ask "Where is the execution path of this program?". Counter-inspired by the magic of Ruby on Rails. When the control is always returned to the programmer (no IoC), he/she may likely mix and match more as pleased, without up-front worrying about compatibility (leading to analysis paralysis).
- Meta-programming: No first-class macros (runtime), since it is a too powerful footgun. But should have compile-time macros. Inspired by Clojure. So that the language can be extended by the community, and so that legacy code could be updated to latest language version by processing the code with macros to transform the syntax.
- Expressions over statements. The calling code should always get something back (Is. 55:11). Because the returned object can be further chained. Inspired by Clojure and Haskell. Counter-inspired by JavaScript. Statements suck, as even Brendan Eich, the inventor of JS, admits. A goal should be to eliminate the subjective/anthropocentric bias that afflicts programming (especially the Imperative kind), because: It is not you, the programmer, which is, or should be, calling code, but code should be calling code (and not terminating in the void, like as if it's you the programmer who is acting on the machine).
- Abstractions which are powerful, made from simple primitives. Maybe homoiconicity... since it would make writing the compiler easier, and make the language more readily available to evolve in the community on its own (permissionless). Inspired by Lisp and Clojure's Rich Hickey.
- But this would allow meta-programming, and the associated complexity..?
- The language should maybe also not be so powerful that programs become entirely composed by very high-level domain-specific abstractions, since it encourages esotericity and sociolects, but most importantly: code indirection when reading/browsing. Coding should not feel like designing an AST, so should try to encourage keeping the code flattened (by piping perhaps?) and as down-to-earth as possible. Could maybe be alleviated by an IDE plugin which would allow temporary automatic code inlining (editable previews).
- Reversible debugging / time-travel debugging (TTD). "Reverse debugging is the ability of a debugger to stop after a failure in a program has been observed and go back into the history of the execution to uncover the reason for the failure." - Jakob Engblom. Inspired by Elm. Re: Accounting for human limitations and affording the most natural way of thinking: "The problem you are trying to fix is at the end of a trail of breadcrumbs in the program's execution history. You know the endpoint but you need to find where the beginning is, so working backwards is the logical approach." source. Should at least have this. Could be enabled by, but not necessarily need:
- Reversible / invertible control flow: "A reversible programming language produces code that can be stopped at any point, reversed to any point and executed again. Every state change can be undone." source. Maybe. Might not be feasible, or desirable, when it comes down to it. Might be aided by immutability, and persistent data structures (if they are extended with history-traversal / operation logging features, in addition to structural sharing).
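A toy Python sketch of how immutable states plus an operation log could support walking backwards through history. The `History` class is hypothetical and stores every state in full; a real implementation would rely on structural sharing, and a real time-travel debugger records far more than application state.

```python
class History:
    """Append-only log of states; earlier states are never modified."""

    def __init__(self, initial_state):
        self._states = [initial_state]

    def apply(self, update):
        """Compute the next state from the latest one and record it."""
        next_state = update(self._states[-1])
        self._states.append(next_state)
        return next_state

    def at(self, step):
        """Return the state as it was after `step` updates."""
        return self._states[step]

    def steps(self):
        """Number of updates applied so far."""
        return len(self._states) - 1


history = History(())
history.apply(lambda events: events + ("login",))
history.apply(lambda events: events + ("add_to_cart",))
history.apply(lambda events: events + ("checkout_failed",))

# Work backwards from the failure to the state just before it.
before_failure = history.at(history.steps() - 1)
```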
- Transpiler, configurable, so it could translate between all language dialects and variations. So that the language could evolve in multiple directions, and consolidate later, without harm.
- Homoiconicity could perhaps enable this.
- Async: blocking/sync interface, but non-blocking I/O. Inspired by Golang, and to lesser extent JS / Node.js too. Should not have to litter code with async/await repeatedly (see: what color is your function? and the problem with function annotations, and async everything). NB: But hiding the async nature with synchronous seeming abstractions could create a dangerous model-code gap with a potential impedance-mismatch and cause for design errors and bugs (inspired by Simon Brown). So the language should make some abstractions around async simple (like channels and goroutines in Golang). But also inspired by declarative and easily statically analysable async contexts, made with JSX, like Suspense (async if-statement), in React and SolidJS.
- Ease of reasonability is first priority, and I believe it is best afforded by simple and clear abstractions (without model/code impedance mismatch, as made important by failures of ORMs and distributed contexts). The choice of sync interface here as opposed to async, is similar to how the wish for lazy evaluation by default was discarded for eager evaluation by default. One argument by Ryan Dahl of Node.js is that sync by default with explicit async (he mentions goroutines in Go) is a nicer programming model than async everything (like in Node). Because it's easier to think through what you're doing than jumping into other function calls like in Node.js. Reasonability is a top priority.
- Rich Hickey also has some good arguments against async by default (when implemented with callbacks), namely that it:
- fragments your logic (spread out into handlers), instead of keeping it together. Programmer has to deal with multiple contexts at once (complicated), instead of one overarching context (simple).
- callback handlers perform some action once in the future, but the state they are operating on may have mutated in the meantime. So it may give a false confidence in being able to get back to the state as it was when the callback was made. Want to avoid the dreaded Shared Mutable State. May be solved with only allowing immutable constructs.
- On the other hand, having sync by default, and async through Channels:
- gives the control back immediately (in line with functional composition) instead of functions that effectively evoke side-effects on the real world on the other end (as callback handlers do). In line with our principle: Always give control back to the programmer.
- channels are generalized pieces of code that can handle many connections (pub/sub).
- channels afford safe concurrency (thread handling), whilst with callback handlers (unless used in an event-loop system such as JS) the programmer has to ensure safe concurrency (which we don't want).
- channels afford choice on when to handle an event, whereas with a callback it gets called whenever it gets called (event-loop). Channels work in line with our principle: Always give control back to the programmer.
- All of the above have implications for reasonability. Needs to be investigated further... Golang's way of handling async seems to be the current gold standard, touted by many bright people, since "Golang has eliminated the distinction between synchronous and asynchronous code" (by letting the programmer code everything in a sync fashion, but doing async I/O under the hood). Golang's principle of "Don't communicate by sharing memory; share memory by communicating." avoids the dreaded Shared Mutable State and lends itself better to ensuring simple, safe, and scalable modes of thinking (our core principle): It's hard to think of something if it has changed the next time you think about it (thus: immutability). Or if thinking about it changes it (manifesting in code the cognitive equivalent of Heisenbugs): Programmers need to be able to reason about a program's state without simultaneously modifying that state (inspired by CQRS).
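To make the contrast with callbacks concrete, here is a minimal sketch of the channel style. It uses Python threads and `queue.Queue` as stand-ins for goroutines and channels (an assumption for illustration only; the dream language would have its own primitives), and the names `fetch` and `gather` are hypothetical. The code reads top to bottom like sync code, the messages are immutable tuples, and the caller decides when to receive.

```python
import queue
import threading


def fetch(name: str, out: queue.Queue) -> None:
    """Stand-in for a blocking I/O call. Sends an immutable tuple on the channel."""
    out.put((name, len(name)))  # hypothetical "result" of the I/O


def gather(names: list[str]) -> dict[str, int]:
    """Reads like sync code, but the work runs concurrently."""
    channel: queue.Queue = queue.Queue()
    workers = [threading.Thread(target=fetch, args=(n, channel)) for n in names]
    for worker in workers:
        worker.start()
    # The caller chooses when to receive; nothing is called back into it.
    results = dict(channel.get() for _ in names)
    for worker in workers:
        worker.join()
    return results


results = gather(["alpha", "be", "gam"])
```

No state is shared between the workers and the caller here: the only thing that crosses the boundary is a message, which is the "share memory by communicating" idea in miniature.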
-
Concurrency. For Multi-Core and Distributed. Probably a CSP model, or a similar or novel model, since it makes concurrency easier to debug. Inspired by Go.
- Async: Concurrency should integrate well with the async feature of the language. Default should be to easily ship tasks off to be completed elsewhere (other thread/process/worker/server). Inspired by Golang and JS.
- Probably not implemented as an Actor Model. Since sending events may be harder to reason about than stricter promise-based operations (using callbacks under-the-hood). Counter-inspired by StimulusJS. Inspired by ReactJS.
-
Scalable: From single-core to multi-core CPUs, and from one machine to a distributed set of machines. Inspired by the purpose of the Actor Model, from Erlang/Elixir. But rather implemented with "Machines", a novel concept that combines a Mailbox with a stateless function (executed asynchronously) as the universal primitive of concurrent operation. That way, instead of functions calling functions directly, which strongly couples them, they call each other by sending Messages (containing the parameters) to the other function's Mailbox. We call such functions "Machines". Each of them is in fact a mini-computer, or a computer-within-the-computer, if you will. Such Machines should be able to be moved to distributed systems without rewriting the code (inspired by Actor Model systems).
- Ideally, for performance, when code is compiled to be run on a single machine, the compiler should be able to optimise away the Mailboxes, so that Machines can be turned into (simpler and faster) synchronously executed functions.
- The language should facilitate and nudge developers towards creating "Functional Core, Imperative Shell" architectures (inspired by Bernhardt at 31:56 in his Boundaries talk), to preserve the purity of functions as far as possible, while also containing side-effects:
- By semantic rules: A function should either return a value, or not return anything (i.e. be simply a void procedure). And a procedure can never be placed within a function.
- Alternatively: use an IO action of an IO type (inaccurately named "IO Monad" at 30:44 in the Boundaries talk), transparently (without actually having to deal with the concept of a Monad). Where you effectively construct a sequence of I/O operations to be executed later. Inspired by Haskell. Something like this is needed because the Mailbox is stateful (it is constructive/destructive, like a queue), and I/O messaging would be a side-effect. The Machine/Mailbox is inspired by the Actor Model from Erlang.
- Alternative to the use of Monads and Immutability: Use Uniqueness Types, which allow mutability and pass-by-value while also preserving referential transparency (since side-effects are ok in a pure language as long as variables are never used more than once). Inspired by Clean and Idris. Possibly use Simplified Uniqueness Typing, inspired by Morrow.
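The "Machine" idea (a Mailbox in front of a stateless function) can be sketched roughly as follows. This is only an illustration under assumptions: the `Machine` class, its `send` method, and the `outbox` used to read the final result are all hypothetical names, and Python threads with `queue.Queue` stand in for whatever the language runtime would provide.

```python
import queue
import threading
from typing import Callable, Optional


class Machine:
    """Hypothetical 'Machine': a mailbox in front of a stateless function."""

    def __init__(self, fn: Callable, downstream: Optional["Machine"] = None):
        self.fn = fn
        self.downstream = downstream
        self.mailbox: queue.Queue = queue.Queue()
        self.outbox: queue.Queue = queue.Queue()  # used only when there is no downstream
        threading.Thread(target=self._run, daemon=True).start()

    def send(self, *args) -> None:
        """Callers never invoke fn directly; they post the parameters as a message."""
        self.mailbox.put(args)

    def _run(self) -> None:
        while True:
            args = self.mailbox.get()  # blocks until a message arrives
            result = self.fn(*args)    # fn keeps no state between messages
            if self.downstream is not None:
                self.downstream.send(result)
            else:
                self.outbox.put(result)


def double(x):
    return x * 2


def add_one(x):
    return x + 1


adder = Machine(add_one)
doubler = Machine(double, downstream=adder)
doubler.send(20)
answer = adder.outbox.get(timeout=1)

# What a compiler could reduce this to on a single machine:
direct = add_one(double(20))
```

Because `double` and `add_one` are pure, the message-passing version and the direct call give the same value, which is what would make the "optimise away the Mailboxes" step safe in principle.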
Reactive. Inspired by Functional Reactive Programming, and Elm, and The Reactive Manifesto. Though the latter is geared at distributed systems, it could also be a model for local computation (rf. Actor model, and Akka). The programming language should make default and implicit the features of reactivity and streaming, as opposed to preloading and batch processing. (Reactive Streaming Data: Asynchronous non-blocking stream processing with backpressure.)
No single-threaded event loop that can block the main thread. Counter-inspired by JS.
Transducers, under-the-hood, to compose and collate/reduce transformation functions (chains of map, filter etc. turn into a single function, visualised here). Chaining function calls should use language-supported transducers implicitly. Language should not require special `compose` syntax.
Eager evaluation, generally. Since it is more straightforward to reason about in most cases, simpler to analyze/monitor, and spreads memory consumption out more in time, than lazy evaluation which would pile up work and in worst case could overflow memory at an unexpected time (in any case, the programmer shouldn't have to worry about evaluation strategies, including space usage performance and evaluation stack usage). But should use the more efficient lazy evaluation when currying functions or chaining methods, unless intermediate error-handling or similar requires value realization (and even here, transducers could potentially alleviate unnecessary value realization). Inspired by Lazy.js. But this is an optimisation that could wait. Concurrent operations across threads/processes should never be lazy. Counter-inspired by Haskell.
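For readers unfamiliar with transducers, the following sketch shows what "chains of map, filter etc. turn into a single function" means. It spells out by hand what the dream language would do implicitly; the helper names `mapping`, `filtering`, `compose`, and `append` are hypothetical and not taken from any particular library.

```python
from functools import reduce


def mapping(f):
    """Transducer that transforms each item before passing it on."""
    def transducer(step):
        return lambda acc, x: step(acc, f(x))
    return transducer


def filtering(pred):
    """Transducer that only passes on items satisfying pred."""
    def transducer(step):
        return lambda acc, x: step(acc, x) if pred(x) else acc
    return transducer


def compose(*transducers):
    """Compose transducers so that data flows through them left to right."""
    def composed(step):
        for t in reversed(transducers):
            step = t(step)
        return step
    return composed


def append(acc, x):
    """The final reducing step: collect items into a list."""
    acc.append(x)
    return acc


# filter + map collapse into ONE reducing function, so the input is
# traversed once and no intermediate list is built.
xform = compose(filtering(lambda n: n % 2 == 0), mapping(lambda n: n * n))
squares_of_evens = reduce(xform(append), range(10), [])
```

The point of the wish above is that the programmer would simply write the chain of `filter` and `map` calls, and the language would perform this fusion under the hood.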
-
Gradually typed, as types can be boilerplate and create noise in the code (counter-inspired by TypeScript, and inspired by Elm). Most types should be inferred (inspired by Haskell and TypeScript).
- No runtime type errors. Inspired by Elm (and Haskell). See 'Error Handling & Nullability'.
- Types should be associative/commutative/composable/symmetric, inspired by Dotty/Scala3, and the 'Maybe Not' talk by Rich Hickey.
- Types should be enforced statically at program exit boundaries (so that external libraries and outgoing I/O can rely on the declared types).
- Strongly typed (checked at compile time), not weakly typed, since implicit type coercion (at runtime) can be unpredictable. Inspired by TypeScript. Counter-inspired by JavaScript.
- Structurally typed (inspired by TypeScript, OCaml), instead of nominally typed (counter-inspired by Java and Haskell).
- No generics, but static analysis of functions based on structural typing on the partial structure used (anywhere) inside it (disregarding dynamic/conditional use). If passing in dynamic values to a function, type would need to be statically declared where it is passed in, so static analysis at compile time can determine if those types match the partial structural types used within.
- Type inference, fully decidable. Inspired by OCaml. For increased readability and convenience (though not essential, cf. popularity of Rust). To not have to declare types everywhere. But local type inference for the body of functions is what's most important.
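As a rough illustration of structural typing on "the partial structure used", here is a sketch using Python's `typing.Protocol`. Note the gap to the wish above: in this sketch the partial structure (`HasName`) has to be declared by hand, whereas the dream language would infer it from how the function body uses its argument. The classes `User` and `Planet` are hypothetical examples.

```python
from dataclasses import dataclass
from typing import Protocol, runtime_checkable


@runtime_checkable
class HasName(Protocol):
    """The only part of the argument's structure that greet() actually uses."""
    name: str


def greet(thing: HasName) -> str:
    return f"Hello, {thing.name}"


@dataclass
class User:
    name: str
    email: str


@dataclass
class Planet:
    name: str
    radius_km: float


# Neither class inherits from HasName; they match by shape alone.
greeting_user = greet(User("Ada", "ada@example.com"))
greeting_planet = greet(Planet("Mars", 3389.5))
matches = isinstance(Planet("Mars", 3389.5), HasName)
```

A static checker such as mypy would accept both calls and reject an argument that lacks a `name` attribute, without any nominal relationship between the types.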
Composable. Favour composition over inheritance. Inspired by Robert C. Martin, Martin Fowler, and JSX in React. Composability entails it should be easy to write code that is declarative, isolated and order-independent. See "strongly typed".
-
Memory safe, ergonomic, and fast. Memory should be safely and implicitly handled by the language, without a runtime GC.
- No Garbage-Collector (GC), but also no garbage. Deterministic Object lifetimes, and Ownership tracking (affinity type system). Inspired by Rust and Carp.
- Memory-safe. Maybe a Borrow Checker and Reference Counting. Inspired by Rust. But ideally, Ownership and Borrowing should be implicit by the programming language, so the programmer wouldn't have to think about low-level concerns such as memory management. To avoid conceptual overhead of manual memory management (as with explicit borrowing semantics), the language should perhaps use or take inspiration from Koka's Perceus Optimized Reference Counting. Koka apparently allows even more precise reference counting (see sect: 2.2) than Rust.
Secure from the start. Secure runtime. Inspired by Deno. Safety has to be a built-in design-goal from the start, it cannot be added on later. As evidenced by the justification of existence of Deno (Node was unsafe), and Rust (C++ was unsafe). Also, see: memory safe.
-
Few core primitives, and based on very few fundamental concepts to learn. Inspired by Lisp. No limiting distinctions like only half-way interchangeable expressions vs. statements in JS.
- But avoid mini-languages/DSL's. Dialects/sociolects hinder generalised understanding and learnability (adds knowledge debt). Counter-inspired by Lisp and Ruby. Even though it might be true that "Domain-Specific Languages are the ultimate abstractions." as Paul Hudak put it in 1998, some cross-domain terms are usually helpful for onboarding programmers, since they afford familiar knobs on which to hang the other unfamiliar code. Even if you don't understand the domain (or its plethora of abstractions), you would at least understand something. From where you could build your further understanding.
Ergonomic to type. Prefer text over special characters like curly brackets (they are hard to tell apart from parentheses in JS). No littering of parentheses. Inspired by Ruby. Counter-inspired by JavaScript, Lisp, and JSON.
No super-powerful tools which may hurt you or others in the long run. Counter-inspired by meta-programming in Ruby.
Crash-safe. Can crash at any time and resume computation at exact same spot when restarted. Inspired by Erlang.
Piping, or some form of it. But always top-to-bottom or left-to-right. Inspired by Bash, and functional programming with pipes (Elixir, BuckleScript, and ts-belt). Data-first instead of data-last.
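A data-first pipe can be approximated in a few lines. This is a sketch only: `pipe` is a hypothetical helper, standing in for what would be an operator or built-in syntax in the dream language.

```python
from functools import reduce


def pipe(value, *steps):
    """Data-first pipe: the value comes first, then the steps in reading order."""
    return reduce(lambda acc, step: step(acc), steps, value)


slug = pipe(
    "  Hello World  ",
    str.strip,
    str.lower,
    lambda s: s.replace(" ", "-"),
)
```

The steps are applied top-to-bottom, in the same order they are read, instead of inside-out as with nested function calls.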
-
No Exceptions. Inspired by Go. But Recoverable and Unrecoverable errors. Inspired by Rust. (Definitely no checked exceptions, as it breaks encapsulation by imposing behavior on the callee. Counter-inspired by Java).
- Result data type, for error handling and validation. Inspired by Result from Rust and F#, and Either from Haskell and Elm, although it should not be called Either, as Either is a confusing misnomer.
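A minimal sketch of a Result type used for validation, assuming two hypothetical variants `Ok` and `Err` (named after Rust's). The function `parse_port` and its error messages are made up for illustration.

```python
from dataclasses import dataclass
from typing import Generic, TypeVar, Union

T = TypeVar("T")
E = TypeVar("E")


@dataclass(frozen=True)
class Ok(Generic[T]):
    value: T


@dataclass(frozen=True)
class Err(Generic[E]):
    error: E


Result = Union[Ok[T], Err[E]]


def parse_port(text: str) -> "Result[int, str]":
    """Recoverable errors are ordinary return values, not exceptions."""
    if not (text.isascii() and text.isdigit()):
        return Err(f"not a number: {text!r}")
    port = int(text)
    if not 0 < port < 65536:
        return Err(f"out of range: {port}")
    return Ok(port)


def describe(result: "Result[int, str]") -> str:
    """The caller handles both variants explicitly."""
    if isinstance(result, Ok):
        return f"listening on {result.value}"
    return f"config error: {result.error}"


good = describe(parse_port("8080"))
bad = describe(parse_port("http"))
```

The error path is visible in the function's signature, so a type checker can flag a caller that forgets to handle `Err`.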
-
Error handling & Nullability: No explicit `null` or `nil` value. Meaning no Null Errors (typically occurring far removed from their point of inception). Inspired by Elm and Rust. But without having to explicitly declare `Maybe` or `Option` types (inspired by Hickey's Maybe Not). Instead, automatically but statically infer and create/augment a function's return type to a "nullable reference type" indicated by a `?` after the typename, whenever there is an unhandled condition that could result in a null value. Or automatically create a `NullObject` (see: NullObject pattern) of the function's declared return type (which with type inference can avoid some timid coding patterns, like always checking for null, counter-inspired by Golang). Maybe even better, let every type declare and handle their own empty state. If all types are defined in terms of Monoids, then null can be replaced by the identity value (of each Monoid), so that combinations within that type never fail, and never alter the result. Resulting in no more timid coding patterns like null checks. Furthermore, the return type from functions using I/O (like `IOMonad` in Haskell), should always be augmented/inferred from static analysis.
- Variant Types for error-handling using return values (like `Result<Type, Error>`, inspired by Rust), instead of special syntax. Counter-inspired by Golang.
- So that you have fewer avenues to explore when debugging and fewer branches to check when programming, so you can write Confident Code focused on the happy-path.
- No possibility of failing silently during runtime (due to syntax errors). Counter-inspired by JS.
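The Monoid idea mentioned above (replace null with each type's identity value) can be sketched like this. The `Monoid` class and its `concat` method are hypothetical names for illustration; a real language would likely build this into its type definitions.

```python
from dataclasses import dataclass
from functools import reduce
from typing import Callable, Generic, Iterable, TypeVar

T = TypeVar("T")


@dataclass(frozen=True)
class Monoid(Generic[T]):
    """A type's 'empty state' (identity value) plus its combining operation."""
    empty: T
    combine: Callable[[T, T], T]

    def concat(self, items: Iterable[T]) -> T:
        # With no items, the identity is returned instead of null.
        return reduce(self.combine, items, self.empty)


total = Monoid(0, lambda a, b: a + b)
text = Monoid("", lambda a, b: a + b)

no_orders = total.concat([])          # identity, not None
some_orders = total.concat([1, 2, 3])
no_words = text.concat([])
```

Since combining with the identity never changes the result, callers can keep combining values without a null check in between.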
Compilation should be able to target some popular language & ecosystem, like transpile to JavaScript or compile to WASM, or potentially even the JVM, to get cross-platform interoperability. But not any target for any cost, if it would put unwieldy constraints on the language design. WASM seems like the best candidate.
Small standard library. To have some common ground of consolidation, and to provide the basic and most common utils. So usage will be fairly standard, and coming into a new codebase not feel too foreign.
Single package directory: Some sort of singular reference to a library package information service. So the community can organise around one common point, instead of scattering. Inspired by NPM. But doesn't necessarily need to be centralised package download/storage, the storage/download could be decentralised. But would need to be safe. Cert signing?
Runtime environment: Be able to run on some existing popular cross-platform runtime (like WASM or the JVM?). Inspired by Clojure. And/Or have a very minimal programming language runtime (without a GC). Inspired by Rust. But the runtime should in any case handle the scheduling of goroutines, inspired by Go.
Ecosystem: Interoperable with one or more existing programming language ecosystems. To import or reuse libraries. Without too much ceremony. So the ecosystem doesn't have to start from scratch.
Be general purpose enough to at least write scripts and CLIs, but also web servers/clients.
Editor integration: Should afford simple integration into editors/IDEs like VS Code. Syntax highlighting, a language server (for autocomplete, error-checking (diagnostics), jump-to-definition etc.), via the Language Server Protocol (LSP).
Well-documented. Documentation on language syntax should be accessible from the editor/IDE, via the LSP.
Well-tested.
All the while, the language should avoid the fate of the Vasa. Which means a feature creep resistant core language. (I am aware of the irony of this feature list, but read on...) Which should be designed and decided upon as early as possible (when the degrees of freedom in the design space are as wide as possible), with a holistic view. Boring > clever. Designed to reach an 80% sweet spot of the most important features, forgoing the most exotic and esoteric features, and forgoing the ability to solve edge-cases (such should be relegated to interoperability with other more specialized programming languages). Since 80% of the work and complexity would come from the last 20% (The Pareto Principle).
One or more of these requirements might be conflicting / mutually exclusive. Maybe. But maybe not?
One can always dream.
This is a list of my preferences. Some would probably be quite controversial. Like my dislike for certain features, which a lot of other people like (e.g. meta-programming). I might just not be familiar enough with them to have developed an appreciation for them.
I will try to keep this list updated if and when I change my mind on any point, which I am open to doing. I have already changed my mind from negative to positive on pattern-matching.
What features (or lacking features) would your dream programming language have?
Top comments (20)
As the saying goes, you achieve perfection not when you have nothing more to add, but when there is nothing left to remove.
I would start with something mature and comfortable, like C# or JS, and start removing things that I think are bad. Async/await, void returns, null values, for/while loops, etc. Kill as much as possible. Then try coding. See what doesn't work, then fix it.
The only reason we don't do that within any existing PL is backward compatibility, but you won't have that problem. Instead you get tons of existing code that you can refactor to test your design. Don't reinvent the wheel.
"Inside every big ugly language there is a small beautiful language trying to come out." -- sinelaw
Most valuable comment. True indeed, thanks for the reminder.
ReScript is the language I've found that seems to most closely fulfill the set of most important familiar features. It has:
(F# with Fable is another equivalent alternative, if you're into .NET)
What about a soundly typed language that transpiles to Clojure and ClojureScript (which again transpiles to JS)?
It could be based on Subject-Verb-Object syntax, and Data First Functional Programming, with sweet-expressions, written in OCaml, to get sound type inference.
That could get pretty close to the dream...!
Typed Clojure is not it... youtu.be/a0gT0syAXsY?t=94
The syntax is a non-starter. It also has only local type inference. The problem might be Clojure's ad-hoc polymorphism complicates things quite a bit... I'd rather have parametric polymorphism ala. OCaml and ReScript.
One syntactical feature I dreamed about in future languages is blanks or placeholder parameters for functions.
If I have a function which takes multiple parameters, for example a function which divides a number x by y and adds z to it, the current function structure will look like this
function util(x,y,z) // divides x by y and adds z
I envision something like the following
function divide( x )by( y )andadd( z ) { ... }
function zipfile( filepath )andEncrypt( withKey )andUploadToAWS( creds ) { ... }
Function names will be self-descriptive.
Self-descriptive function names is a very good idea and practise! Chaining atomic functions is another good idea.
In that regard, you might find the concept of piping in functional languages interesting.
Check out the ts-belt library for Typescript:
mobily.github.io/ts-belt/docs/gett...
Or the TC39 proposal for addition of the pipe operator to JavaScript:
github.com/tc39/proposal-pipeline-...
Chaining is a very powerful idea, I have used some excellent frameworks in Java.
What I am proposing is change the style of passing parameters only, instead of parameters at the end of function name, put gaps/blanks in the middle of functionname.
I look at the list and can't help but wonder, have you looked at Crystal?
Many of the bullets are defined by the author's dislike for specific aspects of Ruby. With this context, Crystal seems like a poor suggestion.
I love Ruby though, it's one of the most beautiful languages I know. :-) Elixir and Crystal have inherited some of its beauty in terms of syntax.
I think a language could be even more beautiful, though. By using FP and composition in a readable way. Railway-oriented programming in F# is particularly beautiful. A Clojure/Lisp like language, with some beautification fix to the s-expression syntax, might be the best way to start. One interesting avenue there is sweet-expressions by Dr. David A. Wheeler.
Yes, I have looked at it briefly.
Briefly I can say that I agree with you: if programming languages are just tools to convert ideas to machine executable bytes as efficiently and flexibly as possible, then they need features like those, and some of them already exist, but not all in one.
But unfortunately the world is not perfect, and we (engineers) need to solve problems with limited resources and with not so efficient tools.
Finally, since we are the last coders of the time, enjoying coding with our favorite language until rise of the machines seems to be the best option.
"Not overly terse" "No overly verbose"
There is a lot of good thinking here. Deep thinking by domain experts can inspire great progress, so thank you for sharing.
However, I believe your focus on the length of names belies an inadequate effort on the matter of naming. I know your proposals for modulating name length (compiler griping about either too short or too long) are unworkable; the ridicule it would produce alone dooms this concept, never mind the properties of the many real languages people use.
Since we're dreaming here I'll share with you my unconstrained dream of how to deal with names. An overview of my dream naming paradigm, followed by brief elaborations of each point:
The best names are no names. After that you have concise and universally recognized symbols. After utilizing the previous affordances you use a small number of short names with limited scope. Finally, after having discovered something that is actually worthy of a lengthy name (not subject to elision through expressions, not accommodated by a concise symbol and in need of more than a small number of characters due to scope) you invent a longer, meaningful name.
No names: expressions deliver this. It should rarely be necessary to create names to represent the value of intermediate computation. I imagine, in my dream language, an environment where expressions fold/unfold (similar to code folding in editors) allowing the programmer to dig into complex expressions when necessary and ignore detail when not. With such an affordance one would simply forgo some large number of names.
Universal symbols: We see from the mathematics used in (non-software) engineering that well known symbols provide great value through a large variety of concise symbols. Why should we need to type/read "struct" or "function" or "return" or many other ubiquitous things in our dream language? Can we not have concise symbols? Shouldn't DSLs represented in our dream language be able to introduce domain specific symbols? Do not concern yourself with the limits of keyboards or character sets; a dream language should consider voice input, gestures, non-Unicode representation and other advanced mechanisms.
Short (C like) names with limited scope: Within a limited scope a small number (up to about a half dozen) of concise names are fine. Lengthy names can make expressions harder to consume, violating the reading >>> writing view. For centuries mathematicians have utilized concision to convey fabulously complex expressions with great success. Programming should have this affordance as well; short names are not bad when confined to a limited context and long names do not automatically equate with improved comprehension.
Having dealt with the vast bulk of names through the above mechanisms we are left with those things for which we actually need to invent bespoke names; they can't be obviated through expressions, don't rise to the level of a concise abstract symbol and are not confined to a limited context. Only now do you write prose.
Changing topics:
R. Martin's view of the value of reading vs writing (the 10x time spent assertion) is insufficient to guide language design. The behavior of real programmers belies the high value of writing without regard to future comprehension. So while it is true that we spend more time later reading what has been written, it does not automatically follow that the value of this later work is greater than the former, even if it is 10x more time. A language that values reading to the detriment of writing may suffer by creating excessive burden on the writer. COBOL may be an example of such. Java as well.
It looks a lot like Nim.
I did look briefly at Nim and while it looks like a nice language in its own right, it is quite different from the language that I envision. I dream of a more pure functional language (e.g. uses `map` instead of `for`-loops etc.), for starters.
The problem with functional language constructs like `map` is that they are less intuitive/familiar for beginners. I have found it easier to teach loops than `map/filter/reduce`. A solution could be that the language suggests this to the developer and offers automatic refactoring of the for-loop, something like: "It might be better to use the `map` function instead -> refactor".
yeah, `map` has a problematic name for beginners. It should have been called something like `apply`. Because you "apply" an operation to some thing. "Mapping" a set of things to another set of things (by way of a function) is too much derived from abstract mathematics to be beginner friendly (as is much of FP lingo, sadly).
I mostly agree. And I have a series of ideas for how to actually implement it all, which I'm very slowly doing: dawn-lang.org