How Lua Banished the Semicolons

Joe Clay (@17cupsofcoffee) ・4 min read

My current pet project outside of work is developing a little programming language called Ein. I decided fairly early on in development that I didn't want Ein to have semicolons, so I've spent a fair chunk of the past week investigating how other languages make this work.

Lua's solution to this problem is (in my opinion) fairly nifty, so I thought I'd write about it on the off-chance that someone else will find it as interesting as I do 😄

The Problem

First, some background - why is getting rid of semicolons tricky? Can't we just remove them from our language's grammar and be done with it?

The answer to this question can be summed up in one snippet of pseudo-code:

let x = 1 // Does the statement end here?
- 1       // Or does it end here?

How does our language's parser decide whether this should be let x = 1; -1; or let x = 1 - 1;? In the parser's eyes, they're both perfectly valid statements!

The (Potential) Solutions

There are several ways that languages try to get around this problem. Some make the whitespace in their language significant, like Python. Others, like Go, helpfully insert the semicolons for you behind the scenes based on a set of rules.

Personally though, I'm not a fan of those solutions. Making whitespace have meaning rubs me the wrong way for reasons I don't quite understand, and automatic semicolon insertion feels like placing too much trust in the compiler to 'guess' where I meant for the statements to end.

Surely there's a way we can make things explicit without peppering our code with extra syntax?

How Lua Does It

Turns out we can! Lua's syntax is unambiguous, even if you leave out all the semicolons and nice formatting, and the main way it achieves this is by dropping a feature a lot of us take for granted - expressions-as-statements.

In most languages, it's perfectly valid to use an expression (a piece of code that can be evaluated to get a value, like adding two numbers together) in the same place that you would use a statement (a piece of code run for its side effects, like a variable declaration).

Lua takes a much more hardline stance on this - programs are a list of statements, some statements may contain expressions (like the condition of an if), but expressions are not allowed to be used as statements.

Let's go back to our original example, translated to Lua:

local x = 1 -- Does the statement end here?
- 1         -- Or does it end here?

In Lua, a variable declaration is a statement, but -1 is an expression - therefore, the only valid way of interpreting this code is local x = 1 - 1!
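
To check this for yourself, here is a small sketch that hands the snippet to Lua's own compiler through load. It assumes Lua 5.2 or later, where load accepts a source string; the trailing return x is only added here so the value can be inspected.

```lua
-- Assumes Lua 5.2+, where load() accepts a string of source code.
local source = "local x = 1\n- 1\nreturn x"

-- The chunk compiles, and the two lines are read as `local x = 1 - 1`.
local chunk = assert(load(source))
local result = chunk()
print(result) --> 0

-- On its own, `- 1` is only an expression, so it is not a valid chunk:
-- load returns nil plus an error message instead of a function.
local bad, err = load("- 1")
print(bad, err)
```

The second call shows why the parser never has a choice to make: a line starting with - can only be the continuation of the previous statement.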

What's The Catch?

Ah yes, there's always a catch, and in this case it's a fairly obvious one: what if I want to run an expression for its side effects?

For example, a lot of the time you'll want to use the return value of a function, but sometimes you'll just want to run it. Lua caters for this scenario by making an exception to the rule, allowing function calls to be used both in statement and expression position.
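
A quick way to see the exception in action is to ask the compiler which snippets it accepts. This is only a sketch: compiles is a hypothetical helper written for this example, and it again assumes Lua 5.2 or later so that load can take a string.

```lua
-- Hypothetical helper: true if Lua accepts the source as a chunk.
-- Assumes Lua 5.2+, where load() accepts a string.
local function compiles(source)
  return load(source) ~= nil
end

local callStatement = compiles("print('hi')")      -- function call as a statement
local callExpression = compiles("local n = #'hi'") -- expression inside a statement
local bareArithmetic = compiles("1 + 1")           -- expression on its own
local bareLiteral = compiles("'hi'")               -- literal on its own

print(callStatement, callExpression, bareArithmetic, bareLiteral)
--> true    true    false   false
```

Nothing is executed here; load only compiles the source, so print('hi') is never actually called.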

This is one of the few places that Lua bends the rules, however. In some languages, you can use the short-circuiting behavior of the logical AND/OR operators as short and sweet control flow statements:

// JavaScript
isActive && doSomething();

The equivalent in Lua isn't valid unless you assign the result to a temporary variable:

local _ = isActive and doSomething() -- _ has no special meaning - just a common
                                     -- Lua naming convention for throwing
                                     -- away variables!

That said, once I started thinking about it, I realized I don't write code like that all too often! I've gone through my phase of writing ternary soup, and I think I tend to prefer using more explicit/blocky syntax these days - it tends to convey my intent better. I'm starting to wonder if dropping expressions-as-statements might not be too bad a price to pay for having a completely unambiguous, semicolon-less grammar!
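
For comparison, here is a minimal sketch of the two styles side by side. isActive and doSomething are the placeholder names from the JavaScript snippet, given throwaway definitions here so the example runs; calls is just a counter added for illustration.

```lua
-- Placeholder definitions so the example is runnable.
local isActive = true
local calls = 0
local function doSomething()
  calls = calls + 1
  return "done"
end

-- Short-circuit style: needs a throwaway variable to form a statement.
local _ = isActive and doSomething()

-- Explicit/blocky style: a plain if statement wrapping a call statement.
if isActive then
  doSomething()
end

print(calls) --> 2
```

Both forms call doSomething once when isActive is truthy; the if version simply states the intent without a dummy assignment.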

Conclusion

Thank you for reading! I hope I didn't bore you to death rambling on about semicolons for $x words! ❤️

If you're interested in this kind of thing, I'd recommend taking a look at the full grammar for Lua from its reference manual - it's really short and really clean, and there's a lot of really interesting design decisions in there which I'm interested in digging deeper into.

Now, my questions to you, dear reader:

  • Do any other popular-ish languages disallow expressions-as-statements?
  • How do you feel about this solution compared to the others I mentioned? Do you think the trade-offs are worth it?

Discussion


Now, I rather like semicolons. I think they provide the punctuation needed. But then, I put in full stops (or periods, for our colonial friends), when I'm writing an SMS, let alone in text messages and so on, so maybe I'm the weird one.

 

I'm definitely not totally opposed to them - my favourite language by far is Rust, which does have semicolons!

I think for me it comes down to how they mesh with the overall feel of the language - languages like Rust have a lot going on syntax-wise, so the explicit syntax feels right, whereas Lua is quite minimalistic and 'light', for want of a better term.

I'd also say to a degree it just depends what mood I'm in, honestly :p

 

You missed a full stop at the end, there. twitches

I'm not trolling you, I swear! :D

(EDIT: Dev.to's formatting might be trolling me, though...)

 

F# does not have semicolons. I think it's different than the aforementioned categories, so you might want to check it out.

I like Lua. It's very tiny (about 100 KB footprint for the engine), and easy on the eyes. And is capable of doing large projects, like desktop applications.

For JavaScript, I like Standard as a formatter. Omits the superfluous semicolons.

I also like Python. No semicolons. (But it does have semantically meaningful indentation, which is my only twitch to the language.)

Line oriented programming languages, like Fortran or Basic, don't have semicolons. But they are line oriented. Which means you'd probably need some sort of "line continuation" marker.

I program in C++, which is a semicolon language. I also like D, which is another semicolon language.

 

I love F#, wish I got the chance to use it more often :) Definitely my favorite functional programming language. Looking very briefly at the language spec, it seems like they do some sort of filtering on the token stream to determine whether an operator is prefix or not? I was under the impression they just used the whitespace to figure that out, so that's super interesting - will have to investigate further :)

Yeah, the fact that Lua works well in a lot of different scenarios is one of the main things I like about it!

I think a good formatter is a must if you're going to use automatic semicolon insertion. This is probably why it's worked out quite well for Go - they bundle one with the language.

Python is a really cool language, and I love their attitude towards language design (Zen of Python, etc). That said, shakes fist in general direction of significant whitespace

D is a language I've heard a lot about, but never tried - need to check that out at some point!

 

There is also the solution of just adding an additional syntax to allow expressions as statements, a sort of lexical cast.

You could just define your language as (concrete syntax):

<exp>  := <exp> + <exp> | - <exp> | ...
<stmt> := <id> := <exp> | $ <exp>

or something along these lines.
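
As a rough illustration of why a grammar like this stays unambiguous without semicolons, here is a sketch in Lua that only finds statement boundaries. It assumes a toy token set (names, integers, :=, $, + and -), and tokenize and splitStatements are hypothetical helpers made up for this example; it does not validate or evaluate anything.

```lua
-- Split the source into tokens: ":=", names, integers, "$", "+", "-".
local function tokenize(source)
  local tokens = {}
  local pos = 1
  while true do
    pos = source:find("%S", pos) -- skip whitespace, including newlines
    if not pos then break end
    local token = source:match("^:=", pos)
      or source:match("^[%a_][%w_]*", pos)
      or source:match("^%d+", pos)
      or source:match("^[%$%+%-]", pos)
    if not token then
      error("unexpected character at position " .. pos)
    end
    tokens[#tokens + 1] = token
    pos = pos + #token
  end
  return tokens
end

-- A statement can only begin with "$" or with a name followed by ":=",
-- so those are the only places where a new statement starts.
local function splitStatements(tokens)
  local groups = {}
  for i, token in ipairs(tokens) do
    local startsAssignment = token:match("^[%a_]") and tokens[i + 1] == ":="
    if token == "$" or startsAssignment or #groups == 0 then
      groups[#groups + 1] = {}
    end
    local current = groups[#groups]
    current[#current + 1] = token
  end
  local statements = {}
  for i, group in ipairs(groups) do
    statements[i] = table.concat(group, " ")
  end
  return statements
end

local statements = splitStatements(tokenize("x := 1\n- 1\n$ - 1\ny := x + 2"))
for _, statement in ipairs(statements) do
  print(statement)
end
--> x := 1 - 1
--> $ - 1
--> y := x + 2
```

Line breaks play no part in the result: the - 1 on the second line joins the assignment, while the one after $ is marked as a statement of its own.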

 

Something along those lines is what I'm considering implementing in Ein, if the lack of expression statements causes issues :) Seems a lot nicer than having to create a temporary variable!

 

Personally I'm a beginner in programming, an app developer for Android to be exact.
So far I've coded using Java and C#, and since I'm new to programming I'd rather have the semicolons so I'm not confused to the core wondering "what is what?".
I'd love to learn Lua though, I find it very interesting! :D

 

There's definitely a lot of upsides to having stuff be explicit :)

Even as someone who doesn't write it that often, Lua is a super interesting language! Does a lot of things differently to other languages I've used.

 

JavaScript does not require semi colons either;

 
 

Well I explicitly disagree with Google.

I explicitly disagree with JavaScript. Either you have semicolons or you don't.

 

I've seen this rise in popularity lately, and frankly I just find it harder to read.

There are many statements which often span multiple lines (looking at you promises...) and it's just more clear to me when there is a dedicated stop to the line. Might be because I come from a C background, but I always make linters require semi-colons.

 
 

interesting article.
another point of view on the 'statements vs expressions' issue is to get rid of all statements altogether and have everything be an expression.
this is the stance most often taken by functional languages and the one I prefer.

 

Overall, this is my preferred approach too! And it can work in imperative languages as well - Rust is my favourite example of this.

That said, I think it works a bit better in strongly typed languages than in dynamically typed languages - in the former case it's a lot easier for the compiler/interpreter to safeguard you from accidentally returning something when you didn't mean to.

 

I myself have just never quite understood not wanting semicolons. To me I'm always like "it's one extra key, and it's even a part of the home keys when typing. So easy to type." Plus, in my mind, it makes the code even easier for your brain to parse, as you have a very good idea how the parser will read what you just wrote.

But I get it depends on the language. I've just never understood not wanting to use them. Even in languages where they are optional, I tend to use them. Of course this very well might just be because I was trained and groomed on the typical languages like C++, Java, and C# so it's just locked into my brain now.

 

Swift doesn't require semicolons either but you can add them if you want to put several statements on the same line.

I like Swift, it's a nice language.

 

Swift seems really cool! I'd like to check it out at some point, but they don't distribute Windows binaries, and I'm too lazy to compile from source... Eventually, though :p

Christopher Durham left a really interesting comment on my previous thread, explaining how Swift handles expressions-as-statements without creating syntax ambiguities. If I'm reading it right, they add a little bit of whitespace significance around operators to make it unambiguous whether they're infix, prefix or suffix. Another really clever solution to this problem :)