re: Language Features: Best and Worst VIEW POST


Imagine, for instance, that instead of calculating x = 1/9 and truncating the result at some point to store it in the variable x, you instead keep a record of the formula which was used to construct x. If you then declare a variable y = x * 3, you could either store the formula in memory as y = (1/9) * 3 or, recognize that the user entered 3 and 9 as integers, and simplify the formulaic representation of y internally to 1/3.

I think Wolfram Alpha does something similar to this, and iirc most Computer Algebra Systems do this as well, or at least they achieve the same goal, possibly with a different implementation.

Also are you going to be using flex and bison for the language? Or is it going to be a PEG parser?


Also are you going to be using flex and bison for the language? Or is it going to be a PEG parser?

I don't know what any of this means. (Sorry, I'm new to language design.) Can you elaborate?


Alright, as far as I know, there are two general types of parsers, top down, and bottom up. Bottom up is more performant iirc, but is harder to get used to, while top down is less performant (O(n3) if you know what I mean, and iirc), but I hear that they're much simpler to work with.

Ohh before I elaborate further, I'll need to know what sorts of systems and tools you're used to working with. What language, OS, compiler?

In grad school I programmed in C/C++ for 5 years, now I'm mostly Java and R, but I'm teaching myself Haskell, as well. I work on Windows, Ubuntu, and macOS. If you're asking which compiler I use to compile my own code, just gcc or whatever's available. I haven't yet looked into anything for my language project.

I found this article while researching this and it mentions yacc, which I have heard of. I wasn't aware that bison is its successor. I don't know how yacc works, though, I just recognise the name.

Before you move forward with your project I strongly suggest you study Julia (as a replacement for R) and Elixir (as a replacement for Java). They're more modern languages so they've worked through a lot of issues in their recent, early pre 1.0 phases and have thought, hard and heatedly, about a lot of things you're looking at. Both are functional, and both are dynamic (I think strict adherence to staticity is a bit of a religion: Julia kills the idea that static = performance, because of its amazing compiler strategy, and Elixir + dialyzer kills the idea that static = correctness - I basically have no runtime errors with static typechecking engine, and the rock solid dynamic stability of OTP makes it way worth it - I'm a a relatively inexperienced dev, they just promoted me from junior directly to senior =) with a lot of freedom and in elixir I wrote a testing harness in three days for a senior coworker's go program (6 months in the making) that triggered both linux and go program panics/errors, while my harness (living on the same node and dispatching and monitoring tens of thousands of concurrent processes) was fine.

Moreover, both PLs focus on developer productivity and joy. I don't know if that's what you're optimizing for, but I suggest learning from some of the best!

Thanks, Isaac! I'll have to explore a few more languages before I make a decision one way or the other. I'll definitely check out Julia and Elixir.


A PEG is a parsing expression grammar. It is a way of specifying all valid strings in a language. It looks kinda like this

Statement <- FunctionCall
FunctionCall <- Identifier ( Arguments )

These would be converted into a Finite State Automata that can parse a string in LL(1) or O(n). It's basically a program that takes one character after another and changes state depending on the character. Each state can accept different characters leading to different states. This way, a state can encompass multiple non-terminals.

Let's say you define a language as having members "hero" and "hello".

MyLang <- hello | hero

The way to parse a given string without backtracking or lookahead is to construct this finite state machine. Opening state accepts an h, leading to state 1. State 1 accepts e, leading to state 2. State 2 accepts l, leading to state 3, or r leading to state 4. State 3 accepts l, leading to state 5, which accepts o, leading to a state which it is valid to end on. State 4 also accepts an o and is valid to end on. In this case you could reuse state 4 for 5 if desired.

Bison and Yaac are tools that allow you to write a grammar and it will spit out a finite state machine like I described.

code of conduct - report abuse