edA‑qa mort‑ora‑y

Posted on Mar 28, 2018

Help design a language: What about tuples without commas?

#language #coding #compiler #discuss

I see that you like to give feedback about language design. How about contributing to the design of a new and evolving language: Leaf.

Even if not interested in an actual new language, feel free to participate.

Let's start off with a simple syntax decision. Should you be allowed to write tuples without commas, much like I allow statements without semicolons:

A tuple with commas:

var items = [
    one,
    two,
    three,
]

A tuple without commas:

var items = [
    one
    two
    three
]

The second one relies on line endings to split values. If you need an inline tuple on one line, the comma form would always be available: [one, two, three].

Treat these as "arrays without commas" if you wish to ignore that "tuple" part of the question.

Ask, Comment, Criticise

I'm going to leave these questions a bit open, without too much background or motivation. It's helpful to get a kind of instinctive response. I obviously have opinions, but am quite open to changing my language.

Please ask questions about language design, syntax, compilers, or whatever you'd like. Sometimes there are details that help you decide, sometimes you may just be curious. For example, it's reasonable to ask "Are there any technical parsing problems that arise from removing these commas?"

Top comments (25)

Jason C. McDonald • Mar 28 '18 • Edited

Definitely with commas. Imagine if I had this...

tuple = [1, 2, 3, 4, 5, 6, 7, 8, 9]

...and something changed to where I felt each element needed to be on a separate line. I'd have to actually go back and REMOVE all my commas, which feels like unnecessary work.

(and so on).

Or what about a mix? Do I need commas at the END of the line, which I'd feel compelled to add out of habit with all other languages?

tuple = [1, 2, 3
         4, 5, 6
         7, 8, 9]

Now, if the commas were merely optional in this case, I might see that as being workable. From there, we just get into Perl (TMTOWTDI) v. Python (TOOWTDI) philosophy.

Joe Clay • Mar 28 '18

Your latter example (mandatory between items on the same line but optional at the end of a line) is pretty much exactly how CoffeeScript objects and arrays work, so there's definitely precedent for that!

Jason C. McDonald • Mar 28 '18 • Edited

Then again, CoffeeScript and its relatives aren't exactly widely considered a pinnacle of good language design. Useful, but not exactly elegant.

Joe Clay • Mar 28 '18

Yeah, it's precedent, but not necessarily good precedent :p

edA‑qa mort‑ora‑y • Mar 28 '18

The commas would be optional on line end. All of the syntaxes you provided here would be acceptable.

Though I'd argue that mixing within a single tuple is perhaps not good practice -- I wouldn't go out of my way to forbid it though.
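
To make the rule concrete (commas required between items on the same line, optional at a line end), here is a minimal Python sketch of how a tuple body could be split into items. This is not Leaf's parser: `split_items` is a hypothetical helper, it ignores string literals and comments, and it silently skips empty items (so `a,,b` is accepted), which a real parser would probably reject.

```python
def split_items(body: str) -> list[str]:
    """Split the text between [ and ] into items.

    A comma or a newline ends an item, but only at nesting depth 0,
    so a parenthesised expression may span several lines.
    """
    items, current, depth = [], [], 0
    for ch in body:
        if ch in "([{":
            depth += 1
        elif ch in ")]}":
            depth -= 1
        if depth == 0 and ch in ",\n":
            text = "".join(current).strip()
            if text:  # "a,\n" yields one item, not an extra empty one
                items.append(text)
            current = []
        else:
            current.append(ch)
    text = "".join(current).strip()
    if text:
        items.append(text)
    return items


print(split_items("1, 2, 3\n4, 5, 6\n7, 8, 9"))
print(split_items("one,\ntwo,\nthree,\n"))
```

Under these assumptions the mixed form, the comma form with a trailing comma, and the comma-less form all produce the same item list.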

Jonathan Apodaca • Mar 28 '18

I have been thinking about implementing this in my language as well, but I remain undecided as I already have implemented significant newlines if the next token is an operator on the "line-continuation" whitelist. I really like tuples without commas, but I fear this may muddy my syntax further (in my case). So I guess the answer is it depends on the current syntax!

edA‑qa mort‑ora‑y • Mar 28 '18

I chose to limit line continuation to only paired syntax, (...), {...}, and [...]. I was worried that operators continuing lines can lead to ambiguity and also make it a bit harder to read.

In Leaf a new-line is always significant in this regard. It's never ignored and always has some meaning.

Jonathan Apodaca • Mar 28 '18

That is an interesting take on a very polarizing issue. I quite like it. So you are saying that I cannot split an expression across lines, but if I want to, I can surround it in parentheses? Cool.

I chose to have "context-aware" newlines, so this works:

some
  .very(...)
  .long(...)
  .method(...)
  .chain(...)

The following are not "whitelisted" infix operators: (, [. So the following works as expected:

var x = y
(z)
// is equivalent to:
var x = y;
(z);

edA‑qa mort‑ora‑y • Mar 28 '18

The splitting across lines works with a common syntax style, which will basically become the de facto approach:

some.very(
  args
).long(
  args
).method(
  args
).chain(
  args
)

I'm tempted to make . a continuation operator as well though, but it will have to occur on the previous line:

some.
  very(...).
  long(...).
  method(...).
  chain(...)

The core idea is that continuation to the next line is decided solely on the current line; it never looks to the next line. This avoids a lot of ambiguity and avoids doing lookahead in the parser.
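
A rough Python sketch of that rule, for illustration only: physical lines are joined into logical lines by looking at nothing but the lines seen so far. The name `logical_lines` and the `dot_continues` switch are made up for this example, and the scan ignores string literals and comments, so it is a simplification rather than how Leaf actually does it.

```python
OPENERS = "([{"
CLOSERS = ")]}"


def logical_lines(source: str, dot_continues: bool = True) -> list[str]:
    """Join physical lines into logical lines without any lookahead.

    A line continues onto the next one when a bracket pair is still
    open, or (optionally) when the line ends with a trailing dot.
    """
    result = []
    pending = []  # physical lines of the logical line being built
    depth = 0     # number of currently open (, [ or {
    for line in source.splitlines():
        stripped = line.strip()
        if not stripped:
            continue
        pending.append(stripped)
        for ch in stripped:
            if ch in OPENERS:
                depth += 1
            elif ch in CLOSERS:
                depth -= 1
        continues = depth > 0 or (dot_continues and stripped.endswith("."))
        if not continues:
            result.append(" ".join(pending))
            pending = []
    if pending:  # unterminated continuation at end of input
        result.append(" ".join(pending))
    return result


print(logical_lines("var x = y\n(z)"))
print(logical_lines("some.very(\n  args\n).long(\n  args\n)"))
```

Because the decision is made when a line ends, `var x = y` followed by `(z)` stays two separate statements, while the bracketed method chain collapses into one.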

Jonathan Apodaca • Mar 28 '18

That is how Golang approaches it, and it seems to work well for them. In that respect, as long as you have a good, standardized code formatter (leafmt?) then I think that solution will work quite well.

I am a compiler noob, and this is what gets me: TBH, the concept of writing a compiler is not too bad, but what quickly spirals downhill are (a) too many choices to make, and (b) so many things to implement. It's hard to stay motivated, but rewarding to see things begin to come together and work.

edA‑qa mort‑ora‑y • Mar 28 '18

In this case I don't need a formatter, since if the continuation format is wrong it won't compile. :)

That in mind, I've kept the syntax simple to parse to make it easy to write formatters and high-level tools. The language can be parsed and manipulated without really knowing the semantics of it.

It takes a long time to create a language that does something useful. I'm finally at the point where I can do stuff. I've decided to use SDL and start doing graphics, to keep it interesting. Simple games in particular for now.

Jonathan Apodaca • Mar 28 '18

You still may benefit from a formatter. Imagine the case where the user types:

some .
  very( anArg).
  longMethodChain  ()

It would still be nice to normalize it and remove the unnecessary whitespace (assuming that the above is not a lexer error):

some.
  very(anArg).
  longMethodChain()

That's awesome that it's in a workable state! I look forward to following your development!

Joe Clay • Mar 28 '18

Happy to see my post sparked some more discussion :)

I'm not particularly fussy about having commas or not, as long as it's not too ambiguous where the expression ends (which, going by your other comments, doesn't seem to be much of a problem in Leaf).

If you do have commas, though, allowing a trailing comma after the last item is a must for me. Otherwise, you end up having to change the last line whenever you want to add more items, which makes the diff messier than it needs to be. It's a small thing, but it bugs me when languages don't let you do it!

edA‑qa mort‑ora‑y • Mar 28 '18

Definitely, a trailing comma is allowed. Probably the first thing the parser did! :)

var items = [
  a,
  b,
  c,
]

It's much diff friendlier, keeping those git PRs from touching lines they don't actually modify.
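
The diff argument can be checked with a small Python sketch using the standard `difflib` module. The helper name `changed_lines` is made up for this example; it lists only the added and removed lines when an item `c` is appended to a list.

```python
import difflib


def changed_lines(before: str, after: str) -> list[str]:
    """Return only the +/- lines of a unified diff with no context."""
    diff = list(difflib.unified_diff(
        before.splitlines(), after.splitlines(), lineterm="", n=0))
    body = diff[2:]  # skip the '---' and '+++' file header lines
    return [line for line in body if line.startswith(("+", "-"))]


# With a trailing comma, appending an item touches one line.
with_trailing = changed_lines("[\n  a,\n  b,\n]", "[\n  a,\n  b,\n  c,\n]")

# Without it, the previous last line has to change as well.
without_trailing = changed_lines("[\n  a,\n  b\n]", "[\n  a,\n  b,\n  c\n]")

print(with_trailing)
print(without_trailing)
```

The first diff contains a single added line, while the second also rewrites the line holding `b`, which is the noise the trailing comma avoids.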

Nested Software • Mar 28 '18 • Edited

Will this approach work okay with cases where the items are themselves multi-line expressions? That's the first question/concern that comes to mind.

This brings to mind the ; in JavaScript, where it can now be omitted at the end of a line. I haven't actually tried (besides for loops) but I assume you'd still need it if you want to have multiple statements on a single line. I don't know if it is equivalent though.

edA‑qa mort‑ora‑y • Mar 28 '18

Expressions in Leaf don't extend beyond a line unless they belong to an open pair, such as (), {}, or [].

For example, this is invalid:

var a = 5 +
   7

Single item value-lists can be used to extend beyond a line:

var a = (5 +
   7)
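
As a point of comparison (Python, not Leaf): Python applies a comparable rule, where an expression may only run past the end of a line while a bracket pair is open. The sketch below checks both forms with the built-in `compile`; the helper name `compiles` is just for this example.

```python
def compiles(source: str) -> bool:
    """Report whether Python accepts the given source text."""
    try:
        compile(source, "<example>", "exec")
        return True
    except SyntaxError:
        return False


bare = "a = 5 +\n    7\n"        # expression left dangling at end of line
wrapped = "a = (5 +\n    7)\n"   # same expression inside an open pair

print(compiles(bare))     # False
print(compiles(wrapped))  # True
```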

This will work inside tuples as well, in cases where you need extra formatting:

var items = [
   one
   (some_condition ?
       first_value |
       second_value
   )
   third
]

I'll be encouraging a style that discourages such inline complexity. Given that types are inferred anyway, it's usually not much of a problem to do this instead:

var two = (some_condition ?
   first_value |
   second_value
)

var items = [
   one
   two
   third
]

I'm not a big fan of cramming too much into a single expression as it makes the code harder to follow.

There'll potentially also be a let syntax:

let two = (some_condition ?
   first_value |
   second_value
)

var items = [
   one
   two
   third
]

I'm thinking of making this a kind of lazy-evaluation that doesn't actually introduce a real variable.

Bach Le • Mar 28 '18

How does it fit with the rest of the language? Do function calls need comma or is it like ML: f x y or io: f(x y)?

An array where each element may be some other type of expression like function call or map may look weird if they require comma.

edA‑qa mort‑ora‑y • Mar 28 '18

There are value-lists and tuples, which are treated very similarly:

value_list = (a, b, c)
tuple = [a, b, c]

Typically you see value-lists in function calls:

pine(a,b,c)

If tuples go comma-less, then this will apply to value-lists as well. But I'm only suggesting this for multi-line lists now, not inline.

pine(
   a
   b
   c
)

That would be equivalent to pine(a,b,c)

There is a possibility to support comma-less inline notation, such as [a b c] or pine( a b c ). I'm reluctant to do this now as it may artificially restrict the syntax too early in the design, and it creates a lot of ambiguities which may not be worth it.

Andy Zhao (he/him) • Mar 28 '18 • Edited

You probably know this already, but your question reminds me of how Ruby has two options for building certain arrays:

# the "conventional" way of creating an array of strings
words = ["apple", "banana", "Leaf"]
#=> ["apple", "banana", "Leaf"]

# another way of creating an array of strings
words = %w(apple banana Leaf)
#=> ["apple", "banana", "Leaf"]

I guess it's Perl-inspired, but one limitation is that you can't create an array of numbers with the percent literal notation. Maybe that's a solution that you'd be interested in. Personally, I like having both options, and also think the "conventional" way should have commas.

edA‑qa mort‑ora‑y • Mar 28 '18

That second one looks like a word splitting regex, or is it really a specialized syntax to create an array of strings?

Kasey Speakman • Mar 28 '18 • Edited

I like the way F# supports both \n and ; as separators for lists and arrays. (I do wish it were comma instead of semi-colon.)

Here's a list example. oneOf is given a list with 3 items. stopWith is given a list with 2 items.

let apiRoutes =
    oneOf [
        GET >=> path "/" >=> handler root
        POST >=> path "/upload" >=> requireAuth >=> handlerAsync upload
        stopWith [
            statusCode 404
            contentText "¯\\_(ツ)_/¯"
        ]
    ]

But I typically only use semi-colons to separate inline items. Here is an array example. Notice the last item can have an optional separator on the end:

let storyPoints = [| 1; 2; 3; 5; 8; 13; 21; |]

Since either works as a separator, you can also use both.

let board =
    [ O ; X ; O
      X ; X ; O
      X ; O ; X ]

You can also turn it into the Elm-like syntax if you really want.

let elements =
    [ i [ class "warning icon" ] []
    ; div [] [ text "asdf" ]
    ; div [] [ text "fdsa" ]
    ]

I'm not a fan of this because cutting/pasting the first line gets weird (e.g. moving it to the bottom). Git can also get confused when you swap the first line for another and merge with someone else's changes.

Note: F# actually requires commas for tuples. Probably because tuples are defined with parens, which would be hard to disambiguate from an expression with newlines.

edA‑qa mort‑ora‑y • Mar 28 '18

That git requirement/problem is in my mind. I'd like to ensure that the source files are source control friendly.

I never thought about using ; to separate lists, though there's no real reason why I would distinguish between ; and ,. Just a classic syntax I guess, though semi-colons seem "heavier" than commas.

Kasey Speakman • Mar 28 '18 • Edited

On the rare occasion that I have inline list elements, I instinctively use commas, then have to go back and replace them with semi-colons. The compiler thinks I have one list item that is a tuple, since commas indicate tuples in F# (even without parens).

I do like the range of choices that are given by using either a separator char or newline. However in practice, I nearly always use the first style (newlines only). It is the most copy-paste and git friendly, and secondarily I like the visual flow.

edA‑qa mort‑ora‑y • Mar 28 '18

I think the inline versus multiline choice depends on what type of code you are writing. For anything that involves a logical list of items, I prefer newlines.

var items = [
  item_one,
  item_two(args),
  1 + 2,
]

But, sometimes I have little tuples, like points, that would be burdensome, and unclear to do multiline:

var a : point = [1,2]
var b = pt_a + [3,4]

Ben Halpern • Mar 28 '18

My vote is for commas for the same reason as @codemouse92 . I'll also add that I love these threads. Definitely want to encourage you and @17cupsofcoffee and anyone else who is language designing to tap the community. I already feel more engaged with Leaf.
