Language Features: Best and Worst

Andrew (he/him)

I'm interested in building my own programming language and I want to know: what are your most-loved and most-hated features of any programming languages?

Here are a few of mine:


When I create my own language (soon, hopefully), I would love for it to emulate R's paradigm where scalars are just length-1 vectors:

> x <- 3
> length(x)
[1] 1
> x <- c(1,2,3)
> length(x)
[1] 3

...this means (as you can see above) that you can use methods like length() on scalars, which are actually just length-1 vectors. I would like to extend this to matrices of any dimensionality and length, so that every bit of data is actually an N-dimensional matrix. This would unify data processing across data of any dimensionality. (Though performance would take a hit, of course.)
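As a sketch, that unification might look like this in Python (a toy `Val` type with R-style recycling; purely illustrative, not R's actual implementation):

```python
# Hypothetical sketch: every value is a vector, so a "scalar" is just a
# length-1 vector and the same operations work uniformly on both.
class Val:
    def __init__(self, *items):
        self.items = list(items)

    def __len__(self):
        return len(self.items)

    def __add__(self, other):
        # element-wise addition, recycling the shorter operand R-style
        n = max(len(self), len(other))
        return Val(*(self.items[i % len(self)] + other.items[i % len(other)]
                     for i in range(n)))

x = Val(3)            # a "scalar"
v = Val(1, 2, 3)      # a vector
print(len(x))         # 1
print(len(v))         # 3
print((x + v).items)  # [4, 5, 6] -- the scalar recycles across the vector
```

The same `Val` machinery would generalize to N-dimensional data by storing a shape alongside the items.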


I also love Haskell's Integer type, which is an easy-to-use, infinite-precision integer:

ghci> let factorial n = product [1..n]

ghci> factorial 100
93326215443944152681699238856266700490715968264381621468592963895217599993229915608941463976156518286253697920827223758251185210916864000000000000000000000000

Java has BigInteger and BigDecimal which are arbitrary-precision integers and floating point numbers, respectively. Since a user never enters an infinite-precision floating point number, it should be possible to keep track of the numbers entered and used, and only round / truncate the result when the user prints or exports the data to a file. You could also keep track of significant digits and use that as the default precision when printing.

Imagine, for instance, that instead of calculating x = 1/9 and truncating the result at some point to store it in the variable x, you instead keep a record of the formula which was used to construct x. If you then declare a variable y = x * 3, you could either store the formula in memory as y = (1/9) * 3 or, recognize that the user entered 3 and 9 as integers, and simplify the formulaic representation of y internally to 1/3.
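Python's stdlib `fractions.Fraction` already behaves this way, so the idea can be sketched directly (a minimal illustration, not a proposal for the new language's syntax):

```python
# Keep 1/9 as an exact ratio instead of a truncated float; multiplying by 3
# simplifies the internal representation to 1/3 with no precision lost.
from fractions import Fraction

x = Fraction(1, 9)
y = x * 3
print(y)         # 1/3
print(float(y))  # rounding happens only when a decimal is actually requested
```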

(The way I see it, if this were implemented in a programming language, it would mean that there's no such thing as a "variable", really. Every variable would instead be a small function, which is called and evaluated every time it is used.)

Forgoing that simplification, you could have y refer to x whenever it's calculated and implement a spreadsheet-like functionality, where updating one variable can have a ripple effect to other variables. When print-ing the variable, you could display the calculated value, but when inspect-ing it, you could display the formula used to calculate it. Or something.
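A minimal sketch of that thunk idea in Python (hypothetical names; a real implementation would hide the lambdas behind syntax):

```python
# A "variable" is a zero-argument function: its formula is re-evaluated on
# every use, so updating an upstream value ripples downstream.
cells = {"x": 1 / 9}
y = lambda: cells["x"] * 3   # y stores its formula, not a computed value

print(y())       # evaluated on demand from the current value of x
cells["x"] = 2   # update the upstream "cell"
print(y())       # 6 -- y is recomputed, spreadsheet-style
```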


Finally, one language feature which I would never wish to emulate is Java's number hierarchy.

In Java, there are primitives like int, float, boolean, which are not part of Java's class hierarchy. These are meant to emulate C's basic data types and can be used for fast calculations. They are some of the only types not descended from Java's overarching Object class. As Java does not support operator overloading, the arithmetical operations +, -, *, /, and so on are only defined for primitives (and + is overloaded internally for Strings). So if you want to do any math, you need one of these primitive types... got it?

Well, Java also has wrapper classes for each of the primitive types: Integer, Float, Boolean, and so on. (Note also that int is wrapped by Integer and not Int. Why? I don't know. If you do, please let me know in the comments.) These wrapper classes allow you to perform proper OOP with numbers. Java "boxes" and "unboxes" the number types automatically so you don't have to convert Integer to int manually to do arithmetic.

It gets worse! The number class hierarchy in Java is flat, meaning that all of the numeric classes (Byte, Integer, Double, Short, Float, Long) descend directly from Number, and Number implements no methods to do anything other than convert a given Number to a particular primitive type. This means that if you want to, for instance, do something as simple as define a method which finds the maximum of two Numbers, you need to first convert each number to a Double and then use Double.max() for comparison. You need to convert to Double so you don't lose precision converting to a "smaller" type (assuming you're not also accepting BigIntegers or BigDecimals, which makes this even more complicated). Number (for the love of god) doesn't even implement Java's Comparable interface, which means you can't even compare a Float to a Double without Java unboxing them to primitives, implicitly casting the float to a double and then performing the comparison.

I wouldn't wish Java's Number hierarchy on my worst enemy.

Latest comments (66)

Reece Dunham

Lambdas are the best

Hunter Drum

No semicolons.... That is all

A built in GUI library like JavaFX would be awesome too! And maybe Go's error handling style. As well as Python's strict policy on clean code

Anton

Pattern matching is a must.
Haskell like syntax (small but expressive).

Fred Ross

R conflating length 1 vectors and scalars is something to avoid. MATLAB does the same thing, and it was a bad idea there, too. Perl had the same error where it autoflattened an array of arrays unless you carefully inserted reference marks. So much of programming is getting data into the appropriate structure, and anything that gets in the way is a problem.

Boxing and unboxing is much more complicated than you would first think because of arrays. Say I have a class A with a subclass B. I can put an instance of B in an array of A. But when you say double[], you probably want a contiguous hunk of memory directly storing values. C++ fully exposes this semantic difference. Julia began its type system by insisting on unboxed arrays of doubles. Java compromised between making numeric computing not ridiculously inefficient and not complicating its semantics enormously.

Your gripe about Number not having any methods is spot on, and is why Stepanov invented the math that led to the Standard Template Library in C++ and part of why Common Lisp went with multiple dispatch in CLOS. This is a ubiquitous problem with single dispatch object systems.

Andrew (he/him)

How does it work in MATLAB and Perl? In R, they're pretty upfront about the fact that there's really no such thing as a scalar, but you can emulate it with a length-1 vector.

I'll definitely have to think about how I want data laid out for matrices and things. Lots of things to consider when you want to balance performance and syntax, etc.

What are your opinions on single vs. multiple dispatch mechanisms?

Fred Ross

MATLAB does the same thing as R (or really, historically, R does the same thing as MATLAB): scalars are length-1 vectors. Perl 5 doesn't do that, but if you write (1, 2, (3, 4)), which in most languages would give you a list of length three containing two scalars and another list, you instead get a flattened list of length four.

These choices make certain programming tasks more straightforward at the expense of all other programming tasks. Which is fine if you know the language is meant for just those tasks.

For single vs. multiple dispatch, multiple dispatch is the clear winner in all respects except familiarity.

For balancing syntax, I remember the BitC folks saying that their best decision was using S-expressions for syntax until the language semantics stabilized, because they ended up changing deep things that would have been a real pain if they had a more structured syntax. Then they built a syntax besides S-expressions after the semantics stabilized.

Adam Marshall

Both Perl 6 and Common Lisp have a concept of Rational Number, which exactly preserves the ratio of two integers. They also boast accurate floating point math. Perl 6 is relatively new -- and not much like Perl 5, if you have heard bad things about the latter.

Jesse Phillips

Well, much of what is in D. Mostly it would be nice to have different defaults.

  • static typing
  • templates
  • compile time evaluation
  • C calling convention support

On the other hand there is Lua. It embeds nicely into D, but there are things I don't like.

  • 1 based indexing
  • blocks using words over brackets
  • lack of ranges, iteration isn't very nice.
Andrew (he/him)

Ranges and slices are two things I definitely want to implement from the get-go.

1-based indexing isn't even on the table!

Larry Foobar • Edited

1) A native not operator that can be prepended to any bool expression.

if not contains() { ... }

A simple ! is so unremarkable and easy to miss when reading. But using false == ...() is ugly.

2) defer like in golang. And an option to defer a loop, not only a function, so an action can be executed right after break (not just after return).

3) a crazy idea, but I'm always thinking about having a break-from-if operator. When you have complex if-logic, this option lets you make it easier, flatter, and therefore more readable.

Andrew (he/him)

Literally today I tried to break from an if in Java and got a compiler error. It would be a super useful addition.

Isaac Yonemoto

For your first suggestion, I would seriously give this talk a watch, and learn from the very well-fought struggles of another PL:

youtube.com/watch?v=C2RO34b_oPM

I think julia's typesystem is quite fascinating, because it's used in a totally different way from pretty much every other PL.

Elixir is a language that really optimizes for programmer joy, one of the best PL features is the pipe operator.

I can do this:

result = value
|> IO.inspect(label: "value")
|> function_1
|> IO.inspect(label: "result 1")
|> function_2(with_param)
|> IO.inspect(label: "result 2")
|> function_3
|> IO.inspect(label: "result")

instead of:

value
println("value: $value")
r1 = function_1(value)
println("result 1: $r1")
r2 = function_2(r1, with_param)
println("result 2: $r2")
result = function_3(r2)

Indispensable for easy-to-read code and println debugging.

Andrew (he/him)

That is pretty neat. Thanks for the suggestions! I'll check out that video ASAP.

Validark

A difference between method syntax and calling a member function. Enums that are not numbers or strings (unless you say so), like enum class in C++. If I were making a language I would consider the idea of minimizing keywords by using built-in function calls instead. Just a thought.

Andrew (he/him)

Can you give an example of what you don't want to see vs. what you want to see? I'm not sure I understand.

Validark

In JavaScript, a function call implicitly passes this depending on the call site. In Lua there is a difference between a method call and a function being called from a namespace. This is accomplished through syntactic sugar.

In Lua:

Object:Move(1) is equivalent to
Object.Move(Object, 1)

This function can be declared like so:

function Object:Move(Amount)

The syntactic sugar puts a self as the first parameter. You could do this yourself as well:

function Object.Move(self, Amount) (you could name self whatever you want this way)

That means method calls can be localized and called later if necessary, because self is actually a parameter:

local f = Object.Move
f(Object)

You can't do this in JS, but it isn't just that. There isn't a need to implicitly pass this into namespaces.

The other solution could be to wrap localized functions with an implicit this, but that would make each instance of a class have a unique version of each method.
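Python works much like the Lua scheme described here: a method is just a function in the class namespace whose first parameter is the receiver, so it can be detached and called with self passed explicitly (toy `Object`/`move` names):

```python
class Object:
    def __init__(self):
        self.pos = 0

    def move(self, amount):
        self.pos += amount

obj = Object()
obj.move(1)       # sugar for Object.move(obj, 1)

f = Object.move   # detach the plain function from the class namespace
f(obj, 2)         # pass the receiver explicitly, like Lua's Object.Move
print(obj.pos)    # 3
```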

I would recommend using a data-oriented approach under the hood. Instances of classes wouldn't hold all their data next to each other; an instance of a class would just be an integer, the index at which its data can be retrieved from a set of arrays that each hold a single member of every instance. This is smarter design because it is much more common to iterate through one particular property of all instances, and with the way CPU caching works you can load up just the right array and use only that data, without fetching unnecessary data you aren't using. Ex:

Names = ["First", "Second"]
IDs = [1, 2]

new Class() // reference to index 0 in each of these arrays

new Class() // reference to index 1 in each of these arrays

(Obviously this example is a simplification)
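A hedged Python sketch of that struct-of-arrays layout (hypothetical `new_instance` helper; a real runtime would do this invisibly behind normal class syntax):

```python
# An "instance" is just an index into parallel arrays, one array per field.
names = []
ids = []

def new_instance(name, id_):
    names.append(name)
    ids.append(id_)
    return len(names) - 1   # the instance IS this index

a = new_instance("First", 1)
b = new_instance("Second", 2)

# Iterating one field touches one contiguous array -- cache-friendly.
print([n.upper() for n in names])
print(ids[b])
```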

Andrew (he/him)

Thanks for all the advice! All of these are good points which I'll definitely have to consider when designing my language.

Cameron Martin • Edited

One thing that's pretty similar to your "storing the formula used to calculate the number and then calculating the precision on-demand" idea is exact real arithmetic. Several implementations exist for Haskell. One of the downsides of this approach, besides performance, is that equality is undecidable. The best you can do is determine that two numbers are within a certain distance of each other.

Andrew (he/him)

Thanks for the link!

That's true, but it's also true with floating-point numbers in any programming language. Doing something like setting the default precision to the number of significant digits would eliminate this problem, I would think?

If you set x = 3.0 * 0.20 (= 0.60 @ 2 sig digits) and y = 0.599 * 1.0 (= 0.60 @ 2 sig digits) then y and x are equivalent when only significant figures are considered. Doing something like y - x would yield 0.599 * 1.0 - 3.0 * 0.20 = -0.001, which, to 2 significant figures, is zero. That's equality.
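That comparison can be sketched with a hypothetical `round_sig` helper that rounds both sides to two significant figures before testing equality:

```python
import math

def round_sig(value, digits=2):
    """Round to `digits` significant figures (hypothetical helper)."""
    if value == 0:
        return 0.0
    exponent = math.floor(math.log10(abs(value)))
    return round(value, digits - 1 - exponent)

x = 3.0 * 0.20   # 0.60 at 2 significant figures
y = 0.599 * 1.0  # also 0.60 at 2 significant figures
print(round_sig(x) == round_sig(y))  # True: equal once rounded
```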

What do you think?

Christopher Durham

I think the main thing that people look for in a modern "alternative" language is convenience and clarity.

Some specific possibilities:

  • Pick a "host" language and offer first-class interop. Immediate library ecosystem! (If non-idiomatic.)
  • Along the same lines, first-class project build and dependency management. Either integrate with an existing tool or make it as or more convenient than your favorite.
  • REPL. Almost a must-have for quickly understanding a new tool today.
  • LSP host and Jupyter kernel. Between the two you can support every development environment and awesome tooling in O(1) effort.

And the fun part, some anti-features:

  • Syntax overload. You'll pick up a language faster if you don't have to relearn everything.
  • Syntax uncanny valley. If it's too similar to a more popular language, people will just use that instead (or accidentally try to use it instead of yours).
  • Near-miss paradigms. Similar to your Java number hierarchy, adopting a paradigm almost everywhere but making concessions in some corner cases just makes everything feel rougher.

And one more thing: I think there's the most available space around asynchronous-by-default. Play with an async runtime once you're up and running. There's potential there I haven't seen anyone fully hit.

Andrew (he/him)

REPL is a good shout. That will definitely go hand-in-hand with developing the language at the beginning.

Gavin Fernandes • Edited

Imagine, for instance, that instead of calculating x = 1/9 and truncating the result at some point to store it in the variable x, you instead keep a record of the formula which was used to construct x. If you then declare a variable y = x * 3, you could either store the formula in memory as y = (1/9) * 3 or, recognize that the user entered 3 and 9 as integers, and simplify the formulaic representation of y internally to 1/3.

I think Wolfram Alpha does something similar to this, and iirc most Computer Algebra Systems do this as well, or at least they achieve the same goal, possibly with a different implementation.

Also are you going to be using flex and bison for the language? Or is it going to be a PEG parser?

Andrew (he/him)

Also are you going to be using flex and bison for the language? Or is it going to be a PEG parser?

I don't know what any of this means. (Sorry, I'm new to language design.) Can you elaborate?

Validark

A PEG is a parsing expression grammar. It is a way of specifying all valid strings in a language. It looks kinda like this

Statement <- FunctionCall
FunctionCall <- Identifier ( Arguments )

These would be converted into a finite state automaton that can parse a string LL(1), in O(n). It's basically a program that takes one character after another and changes state depending on the character. Each state can accept different characters leading to different states. This way, a state can encompass multiple non-terminals.

Let's say you define a language as having members "hero" and "hello".

MyLang <- hello | hero

The way to parse a given string without backtracking or lookahead is to construct this finite state machine. Opening state accepts an h, leading to state 1. State 1 accepts e, leading to state 2. State 2 accepts l, leading to state 3, or r leading to state 4. State 3 accepts l, leading to state 5, which accepts o, leading to a state which it is valid to end on. State 4 also accepts an o and is valid to end on. In this case you could reuse state 4 for 5 if desired.
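The walkthrough above translates directly into a transition table. A minimal Python sketch (a hand-built dict, not the output of any parser generator):

```python
# States and transitions for the language { "hello", "hero" }, accepting in
# one left-to-right pass with no backtracking or lookahead.
TRANSITIONS = {
    (0, "h"): 1, (1, "e"): 2,
    (2, "l"): 3, (3, "l"): 4, (4, "o"): 5,  # h-e-l-l-o
    (2, "r"): 6, (6, "o"): 5,               # h-e-r-o (reuses accept state 5)
}
ACCEPT = {5}

def matches(s):
    state = 0
    for ch in s:
        state = TRANSITIONS.get((state, ch))
        if state is None:      # no valid transition: reject immediately
            return False
    return state in ACCEPT     # must end in an accepting state

print(matches("hello"))  # True
print(matches("hero"))   # True
print(matches("help"))   # False
```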

Validark

Bison and Yacc are tools that let you write a grammar and will spit out a finite state machine like I described.

Gavin Fernandes

Alright, as far as I know, there are two general types of parsers: top down and bottom up. Bottom up is more performant, iirc, but is harder to get used to, while top down is less performant (O(n³), if you know what I mean, and iirc), but I hear they're much simpler to work with.

Ohh before I elaborate further, I'll need to know what sorts of systems and tools you're used to working with. What language, OS, compiler?

Andrew (he/him)

In grad school I programmed in C/C++ for 5 years, now I'm mostly Java and R, but I'm teaching myself Haskell, as well. I work on Windows, Ubuntu, and macOS. If you're asking which compiler I use to compile my own code, just gcc or whatever's available. I haven't yet looked into anything for my language project.

I found this article while researching this and it mentions yacc, which I have heard of. I wasn't aware that bison is its successor. I don't know how yacc works, though, I just recognise the name.

Isaac Yonemoto

Before you move forward with your project, I strongly suggest you study Julia (as a replacement for R) and Elixir (as a replacement for Java). They're more modern languages, so they've worked through a lot of issues in their recent, early pre-1.0 phases and have thought, hard and heatedly, about a lot of the things you're looking at. Both are functional, and both are dynamic. (I think strict adherence to staticity is a bit of a religion: Julia kills the idea that static = performance, because of its amazing compiler strategy, and Elixir + dialyzer kills the idea that static = correctness. I basically have no runtime errors with a static typechecking engine, and the rock-solid dynamic stability of OTP makes it well worth it. I'm a relatively inexperienced dev; they just promoted me from junior directly to senior, with a lot of freedom, and in Elixir I wrote a testing harness in three days for a senior coworker's Go program (6 months in the making) that triggered both Linux and Go program panics/errors, while my harness, living on the same node and dispatching and monitoring tens of thousands of concurrent processes, was fine.)

Moreover, both PLs focus on developer productivity and joy. I don't know if that's what you're optimizing for, but I suggest learning from some of the best!

Andrew (he/him)

Thanks, Isaac! I'll have to explore a few more languages before I make a decision one way or the other. I'll definitely check out Julia and Elixir.

Darren Burns • Edited

Features I like:

  • First-class support for package management. When I install a programming language and can't work out within the first hour how to install a 3rd party library I really lose interest. Being able to do something like pip install library_name and for it to just work™ is awesome.
  • Error message output for humans. Elm and Rust spring to mind.
  • Expression-based languages that let you do something like let x = if something { 1 } else { 2 }.
  • Pattern matching (Elixir, Rust, Scala, etc.).
  • Languages that explicitly avoid the concept of exceptions and try/catch. Managing a single means of passing values up to a caller is hard enough, so I like the idea of encoding errors in the return value. Rust does this by way of the Result type. In Elixir, you return a tuple which includes information on whether an error occurred.
  • match statements like in Rust and Scala that support pattern matching. Bonus points if the compiler enforces that matches are exhaustive.
  • Any form of null-safety (e.g. Option types, Elvis operator, etc.). I see a lot of Java code with nested if (variable != null) { ... } checks and find it really hard to read.
  • Solid abstractions around concurrency (Actor model, Goroutines etc, compiler-enforced safety guarantees like those in Rust, etc.)
  • Quality of life features around debugging (for example, in the latest Rust version, you can do dbg!(something) to print out an object and all of its data without having to implement a toString or similar)
  • Native async/await syntax
  • If the language is dynamically typed, some form of type annotation syntax to aid static analysis is really helpful. Python 3 has a typing module which you can use in conjunction with static analysis and I've found it to make code significantly more readable and correct.
  • The ability to compile to a native binary.

Features I dislike:

  • null, NullPointerException, etc.
  • Bloated standard libraries. With a first-class, well supported package management system, “official” libraries could be pulled in when they’re needed.
  • Inconsistent standard libraries (i.e. lack of convention).
  • Excessive verbosity. I’m personally not a huge fan of the do/end syntax in languages like Elixir and Ruby. This is super subjective though and with modern text editors/IDEs it just becomes an aesthetic thing.

I could probably go on all day, but those are the first things that spring to mind!

saint4eva

async/await e.g. C#

Idan Arye

In Elixir, you return a tuple which includes information on whether an error occurred.

But... Elixir has exceptions... Maybe you are thinking of Go?

Darren Burns

It has exceptions, but it's considered more idiomatic to return a tuple

Idan Arye

OK, I see now. It's not Go's abomination like the word "tuple" implies, but more like the dynamically typed version of enum types.

At any rate, I find it a weird design choice to add exceptions and then encourage a different method of error handling. One of the main complaints about C++'s exceptions was that, due to its legacy, it had three types of error handling:

  1. Function returned value + errno.
  2. Setting a field on an object.
  3. Exceptions.

I guess Elixir wanted to go with pattern matching for error handling, but had to support exceptions for Erlang interop?

Isaac Yonemoto • Edited

I guess Elixir wanted to go with pattern matching for error handling, but had to support exceptions for Erlang interop?

No. Elixir inherits from erlang's "let it fail" mentality. The PL itself supports supervision trees, restart semantics, etc. So in some cases, you want to just "stop what the thread is doing in its tracks and either throw it away or let the restart semantic kick in". In those cases, you raise an exception and monitor logs. The failure will be contained to the executing thread. You can then make a business decision as to whether or not you REALLY want to bother writing a handler for the error. Does it happen once in 10 thousand? 10 million? Once in a trillion? The scale and importance of the task will dictate whether or not you need to deal with it.

Other times you might want to raise an exception: 1) when you're scripting short tasks. Elixir lets you create "somewhat out of band tasks". Example: commands attached to your program that create/migrate/drop your databases.

In this case, most failures are 'total failures', and you don't care about the overall stability of the program, since it's "just an out-of-band task". So explicit full fledged error handling is more of a boilerplate burden.

2) when you're writing unit or integration tests. The test harness will catch errors anyways, so why bother with boilerplate. Use exceptions instead of error tuples.

Idan Arye

Yes, exceptions makes sense. I agree with that. Using the returned value for error handling also makes sense - but only if you don't use exceptions. If the language supports exceptions, and not just aborts/panics - actual exceptions you are expected to catch because they indicate things that can reasonably happen and you need to handle - then you can't argue that using the returned value makes everything clear and deterministic and safe and surprises-free - because some function down the call-chain can potentially throw an exception.

So, if that function can potentially throw, you already need to code in exception-friendly way - specifically put all your cleanup code in RAII/finally/defer/whatever the language offers. And if you already have to do this for all cases - why not just use exceptions in all cases and not suffer from the confusion that is multiple error handling schemes?

Isaac Yonemoto • Edited

You don't have to clean up. That's the point. I can't explain it except to say, if you watch enough IT Crowd, you might start to agree that sometimes it's okay to just "turn it off and back on again".

If your system is designed to tolerate thread aborts, it's really refreshing and liberating. Let's say I was making a life-critical application. In the event of a cosmic ray flipping a bit, I would much rather have a system that was architected where, say, of the less important subprocesses just panics and gets automatically restarted from a safe state, with the critical subprocesses still churning along, than a system that brings down everything because it expects to have everything exactly typechecked at runtime.

Idan Arye

There is still cleanup going on. Something has to close the open file descriptors and network connections. You may not need to write the cleanup code yourself, as it happens behind the scenes, but as you have said, you need to design your code in a way that does not jam that automatic cleanup. For example, you must prevent a crashing subprocess from leaving corrupted permanent state that the subprocess launched to replace it won't be able to handle.

One of the main arguments of the "exceptions are evil" movement is that having all these hidden control paths makes it hard to reason about the program's flow, especially cleanup code that needs to run in case of error. But... if you already need to design your program to account for the possibility of exceptions, you are losing the benefit of explicit flow control while paying the price of extra verbosity.

This convention in Elixir to prefer returning a tuple seems to me as more trendy than thoughtful...

Isaac Yonemoto • Edited

You really don't have to worry about it. The VM takes care of it for you. Unlike go, there are process listeners that keep track of what's going on. File descriptors are owned by a process id, and if the id goes down it gets closed.

As an FP language, most stuff is stateless, and in order to use state you have to be very careful about it, so there usually isn't a whole lot of cleanup to do in general. As I said, I wrote some sloppy code in three days as a multinode networked testbench for an internal product, and it was (it had to be) more stable than the code shipped by a senior dev (not in a BEAM language).

There is zero extra verbosity because you write zero lines of code to get these features.

As for the tuples, I wouldn't call it trendy since it's inherited from erlang, which has had it since the 80s.

I think you have been misinformed about elixir or erlang and suggest you give it a try before continuing to make assertions about it.

Idan Arye

You really don't have to worry about it. The VM takes care of it for you. Unlike go, there are process listeners that keep track of what's going on. File descriptors are owned by a process id, and if the id goes down it gets closed.

Yup - higher level languages do that basic stuff for you. But it can't do all cleanup for you. For example, if a subprocess needs to write two files, and it crashed after writing the first file due to some exception, there will only be one file on the disk. You need to either account for the possibility there will only be one file (when you expected there to be zero or two) or do something to clean up that already-written file.

There is zero extra verbosity because you write zero lines of code to get these features.

I talked about verbosity in the no-exceptions style error handling, not the one in the exceptions style.

As for the tuples, I wouldn't call it trendy since it's inherited from erlang, which has had it since the 80s.

Erlang had tuples, but didn't use them for returned value based error handling. At least, not from what I could see with a quick google. Elixir does use them for error handling.

I think you have been misinformed about elixir or erlang and suggest you give it a try before continuing to make assertions about it.

My "assertions" about Elixir is that it uses both exceptions and pattern-matching-on-returned-values for error handling. Is this incorrect?

Isaac Yonemoto • Edited

At least, not from what I could see with a quick google

ok tuples and error tuples are literally everywhere in erlang. The result type for gen_server start function, for example, is {ok, Pid} | ignore | {error, Error}.

Isaac Yonemoto

I find the Erlang/Elixir treatment of null to be acceptable.

It (nil) is an atom (as are false, and true), definitely not conflatable with zero.

The only thing that is "dangerously" affected is "if", which fails on "false" and "nil" exclusively. Everywhere else you have to treat nil as its own entity.

Idan Arye

Erlang and Elixir are dynamically typed languages. The billion dollar mistake does not apply to dynamically typed languages. Guaranteeing that a variable cannot be null is not very helpful when you can't guarantee that variable's type.

Isaac Yonemoto

You can definitely guarantee a variable's type in Erlang and Elixir.

Idan Arye

By doing explicit checks. How do these differ from null checks?

Andrew (he/him)

Maybe something similar to this is the best approach?

Continuing with my idea of making all data N-dimensional matrices, `nil` or `null` would just be an empty matrix. Then a statement like `if ([])` wouldn't make any sense, because an empty matrix shouldn't be truthy or falsy. It should throw a compiler error.
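A runtime sketch of that rule in Python (the `Matrix` class and `null` name are hypothetical, just to show the behavior):

```python
class Matrix:
    """Toy sketch: every value is an N-dimensional matrix, 'null' is the
    empty matrix, and a matrix is never truthy or falsy."""
    def __init__(self, *values):
        self.values = list(values)

    def __len__(self):
        return len(self.values)

    def __bool__(self):
        # refuse to be used in a boolean context at all
        raise TypeError("a matrix has no truth value; test len() explicitly")

null = Matrix()          # the empty matrix plays the role of nil/null
assert len(null) == 0    # length-style methods still work on "nothing"
```

In Python this only surfaces at runtime, when `if null:` forces a truth test; a statically typed language could reject the same expression at compile time.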

Harvey Thompson

Add to that:

  • Proper static type system with generics, abstract type members, variadic types, tuple types, function types, a sound and complete type system with set-like operations (and, or, not).
  • Flow typing: a variable's deduced type is refined through control flow.
  • Garbage collection (of some kind)
  • I agree on no exceptions, use types.
  • Compiler-as-a-library
  • Macros and other meta-programming
  • Incremental compilation
  • Interactive prompt
  • JIT compilation, AOT compilation and scripting
  • Support Language Server Protocol (and check it works in vim, emacs, vscode)
  • Support the Debug Adapter Protocol (which requires a whole debugger and debug-symbol system)
  • Support memory, CPU, and cache profiling tools out of the box (e.g. Valgrind et al.)
  • Support and/or built-in testing
  • Support and/or built-in quality metrics
  • Make it open and freely available
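As a concrete illustration of flow typing from the list above, here is a Python sketch; a static checker such as mypy refines the declared type inside each branch (the function name is just an example):

```python
def describe(x: "int | str") -> str:
    # With flow typing, the checker narrows x's declared type
    # through each branch of the isinstance() check:
    if isinstance(x, int):
        return f"number: {x + 1}"    # here x is known to be an int
    return f"text: {x.upper()}"      # here x is known to be a str
```

The same idea appears as "smart casts" in Kotlin and type narrowing in TypeScript.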

I've been designing and building languages (now full time) for many years. You'll find you can end up adding an infinite list of things.

My advice, try writing a very simple Lisp interpreter: it can be done in under a day. Then try adding a few things.
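For a sense of scale, a Lisp-style arithmetic evaluator really does fit in a few dozen lines. A minimal Python sketch (tokenizer, parser, evaluator; integers and four operators only):

```python
import operator as op

def tokenize(src):
    # pad parentheses with spaces so split() separates them
    return src.replace("(", " ( ").replace(")", " ) ").split()

def parse(tokens):
    token = tokens.pop(0)
    if token == "(":
        expr = []
        while tokens[0] != ")":
            expr.append(parse(tokens))
        tokens.pop(0)  # discard the closing ")"
        return expr
    try:
        return int(token)
    except ValueError:
        return token  # a symbol

ENV = {"+": op.add, "-": op.sub, "*": op.mul, "/": op.truediv}

def evaluate(expr):
    if isinstance(expr, int):
        return expr          # numbers evaluate to themselves
    if isinstance(expr, str):
        return ENV[expr]     # symbols look up their value
    fn = evaluate(expr[0])   # a list is a function call
    return fn(*[evaluate(arg) for arg in expr[1:]])

def lisp(src):
    return evaluate(parse(tokenize(src)))
```

From here, adding `define`, `lambda`, and conditionals is where the real lessons start.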

You might also want to check out LLVM, which has a tutorial Implementing A Language With LLVM

My other advice: as soon as possible, "Eat Your Own Dogfood". The programming language, its compiler services, and all its libraries should be written in the language itself. To do this, write a bare-minimum compiler from your language to C, C++, Java, or whatever (C++ did this initially with "cfront"). Then rewrite that simple pre-processor in your new language. Then add more features.

This is the best and most efficient way to validate your work: if you like using your own language more than some other, you are possibly on the right track.

Andrew (he/him)

Thanks for all the advice. I'm going to have a huge list of things to research before I even think about starting this project. I'm sure I'll come back to your comment more than a few times.

Harvey Thompson

I forgot to say: the most influential books for me are:

  • "Programming Languages: An Interpreter-Based Approach" by Samuel N. Kamin. Though it says it's about interpreters, it's really looking at how to implement language features for various languages. It makes you feel like you could do it yourself, because it explains each feature and gives example code. One of the first books I read on the subject.

  • "Types and Programming Languages" by Benjamin C. Pierce. Totally opposite and quite heavy reading; it assumes you can do degree-level set-theoretic logic, but it basically tells you how to build a proper type system from a mathematical perspective. The concepts are key, so it's possible to read it and gloss over the maths. Academia often has the future, or the bleeding edge, hidden in research papers, so it's worth reading those too, even if the maths goes way over one's head.

As I mentioned, Lisp is a great place to start because it has a very simple lexical and syntactic grammar, and its semantics can be expressed very minimally. A quick Google search gave me: Lisp in Less Than 200 Lines Of Code

Don't research absolutely everything to begin with; it's too overwhelming a subject. The basics are covered in the "Dragon Book":

  • Lexical analysis (see flex, antlr, re2c)
  • Syntax/Parsing (bison, antlr)
  • Semantic analysis (you're on your own here; it's too language-specific)
  • Type systems (if you're going static typing, which I would highly recommend)
  • Code Generation (see LLVM, it does everything you'd need)

Another thing I tend to do is read and use a lot of languages and steal... err... leverage... ideas. Most also have their compilers and libraries open-sourced. Some interesting languages: Swift, C#, Kotlin, Rust, Julia, Scala, Clojure, Erlang.

Don't be daunted; the subject, like most, is quite deep and involved when you really look into it.

Andrew (he/him)

Wow! Thanks a lot, Harvey! This is a great list of books. I really appreciate it. I'll definitely have a look at Lisp.

Andrew (he/him)

Wow! Thanks, Darren. I think I agree with most of the points you raise here. Getting a package management system up and running is clearly a step toward making the language accessible to the general public. I have next to no idea what that would entail, but it's certainly something I'd have to think about as the language matures.

pretzelhands

The struct tags in Go:

```go
type Invitation struct {
    InvitationToken string `json:"invitationToken" db:"invitation_token"`
}
```

The fact that I can have a variable called `InvitationToken` in Go that is called `invitationToken` in JSON and `invitation_token` in my database is just absolutely amazing. It resolves one of my biggest pet peeves in programming, which is mixing naming cases.

Andrew (he/him)

This is basically just metadata that you can attach to a variable, right?

related: stackoverflow.com/questions/108587...
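For comparison, the same field-level metadata idea can be sketched in Python with the standard `dataclasses` module (the field and key names are illustrative):

```python
from dataclasses import dataclass, field, fields

@dataclass
class Invitation:
    # metadata here plays the role of Go's struct tags: one field,
    # different names for JSON and for the database
    invitation_token: str = field(
        metadata={"json": "invitationToken", "db": "invitation_token"}
    )

def as_json_dict(obj):
    """Rename each field using the 'json' key from its metadata."""
    return {f.metadata["json"]: getattr(obj, f.name) for f in fields(obj)}
```

Unlike Go, nothing in the standard library consumes this metadata automatically; a serializer has to read it explicitly, as `as_json_dict` does.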

pretzelhands

That's the gist of it, yes. I think you could implement the same thing with tagged template literals in JS, but Go just offers it natively. 🤓

Valentin Berlier

In JS you would use decorators, which are native language features:

```js
class Invitation {
  @Serialize({ json: 'invitationToken', db: 'invitation_token' })
  token
}
```
Ben Halpern

Ruby has a lot of expressive language features that seem like aliases for other things but are actually a bit different.

For example, `and`, `or`, and `not` exist, which could be interesting alternatives to `&&`, `||`, and `!` (which are still more common in the language), except they have subtly different behavior, so you can't really interchange them.

Lots of little ways to cut yourself in that way.

Ben Halpern

Oh, and that's also the best feature of the language. It's been crafted to be expressive and intuitive, not settling for inelegant solutions. Tools like `[].empty?` and many, many more are really nice to have.

Of course, this all comes with performance and memory-bloat concerns, but it's still a great tool for many jobs.

Andrew (he/him)

One thing I would like to emphasize in my to-be-created language is that there should, generally speaking, only be one correct way to do something. I think it would make the syntax more uniform and make things less difficult to document, etc.*

So I suppose at some point I'll have to decide between `and` vs. `&&`, `not` vs. `!` vs. `~`, and so on. I'm leaning toward the English-like options.


*One interesting effect this would have is that there would only be one kind of loop. No `for` vs. `while`, just some kind of `loop` feature.

Thread Thread
 
ben profile image
Ben Halpern

Yeah, Ruby pretty much goes the other way completely with that. I try not to indulge it too much, but this mentality jibes with my personality, which probably helps create the right coder-language fit.

Ben Halpern