
Creating a new programming language

Sajjad Heydari ・ 2 min read

Rumi Programming Lang (2 Part Series)

1) Creating a new programming language 2) An update on the Rumi Programming Language, and an overview of the tools used

In 2019, I worked on a compiler called Rumi in my spare time (I think I spent about two working weeks on it!). It is a general-purpose language, with ideas that I got from Jonathan Blow's compiler videos. It's been a wonderful experience for me, and I occasionally streamed coding this compiler in my native language (Farsi).

Now the compiler has reached a state where I can actually write programs in it, but it is still way off from what I plan to do. Here is a program showcasing most of what we can do:

/*
Comments and /*nested*/ comments are a thing
*/

printf(T: string, c: ... any) -> int; // We just declare the signature; it is implemented in a C file

MyStruct: Struct{
  id: int;
  age: u8;
}

main := ()-> int{
  a : int;
  a =1;

  b: int = 2;

  c := a + b;

  printf("The value of c is %d\n", c); // 3

  arr : int[10];
  mys : MyStruct;
  mys.id = 2;
  arr[0] = mys.id;

  printf("The id is %d, and the 0th element is %d\n", mys.id, arr[0]); // 2, 2

  // We also have pointers:

  p : *int;
  p = &c;

  *p = 2;

  printf("The value of c is %d\n", c); // 2

  return 0;
}

As you can see, it is really basic at this stage, but I plan to turn it into something that I would use day-to-day. The compiler uses an LLVM backend, which lets me link its output with C files, either calling C functions or providing functions for C (or any other language that can call C) to call.
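To make the C side of that linking concrete, here is a minimal sketch of a C file that could sit next to a Rumi program. The function names `add_ints` and `fill_squares` are made up for illustration, and the Rumi declaration shown in the comment is only an assumption modeled on the `printf` declaration in the example above. It also assumes that Rumi's `int` matches C's `int`. The actual build and link steps live in the repo and are not shown here.

```c
/* helpers.c: hypothetical C helpers meant to be linked with a Rumi program.
 * Assumption: Rumi would declare the first one with a bodiless signature in
 * the style of the printf declaration above, for example
 *     add_ints(a: int, b: int) -> int;
 * That Rumi line is a guess, not taken from the compiler's documentation. */
#include <stddef.h>

/* Plain C linkage means no name mangling, so an object file produced by an
 * LLVM backend can reference the symbol "add_ints" directly. */
int add_ints(int a, int b)
{
    return a + b;
}

/* Writes i*i into out[i] for i in [0, n). Returns the number of elements
 * written, or 0 if out is NULL. */
size_t fill_squares(int *out, size_t n)
{
    if (out == NULL) {
        return 0;
    }
    for (size_t i = 0; i < n; i++) {
        out[i] = (int)(i * i);
    }
    return n;
}
```

The file deliberately has no `main`, since in this setup the entry point would come from the Rumi program.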

Now I'm looking for contributors and for suggestions on what to include in this language. Here is the repo:

GitHub: MCSH / rumi

The rumi compiler

Rumi

Everything you possess of skill, and wealth, and handicraft, wasn't it first merely a thought and a quest? - Rumi

Rumi is a WIP compiler.

The goal is to have a language that is low level, has functional properties, can be linked with C, doesn't make local functions a nightmare, has a compile-time language that is the same as the runtime language, and focuses on making programming joyful.

The current version is written in Rumi itself: you build a base compiler with C++, Flex, and Bison, then run self_compile.sh so that the compiler compiles itself. The previous, semi-complete version can be found in the old branch.

You can also share your ideas in the comments on this post! If you are interested, I can even walk through the compiler structure in an English YouTube video (or a live stream or a blog post) and implement some new features along the way.

Let me know if you have any suggestions or questions!


Posted by: Sajjad Heydari (@mcsh)

I'm a graduate student in Computer Science in machine learning, and a proud geek!

Discussion

 

Congratulations, Sajjad. What I find most interesting is the curiosity-driven mindset that motivates someone to write a new language.

There are about 6500 human languages in the world, which is all the richer for them. And this ignores dialects or variants used by specific interest groups.

So we can well afford a few more computer languages. Writing one is a mind-broadening job with unpredictable spin-offs. To anyone wanting an interesting project, I'd say "write a new language" and then "use it for something" (the only true test of value). You may be surprised how much you learn, you'll have a lot of fun and you never know where it might lead. Just remember, many of today's best computer languages started that way.

 

Thank you for your kind words!

I want to finish this language and then rewrite the compiler in the language itself. That will probably take longer than I'd like to admit, but I'm up for the challenge!

 

It may depend on how close your language is to the one it's currently written in. It could be that a lot of it is a relatively simple substitution exercise. Or not. I did this back in about 1985. I originally wrote my compiler (a variant of PL/M) in assembly language, then rewrote it in itself. High-level languages look nothing like assembler, so the new version was structurally very different from the old. However, much of the time in developing the original had been spent designing the syntax and figuring out what it should do, so little of this effort was needed the second time around. I think the second iteration actually took a lot less time than the first.

If you're going for self-compilation you can limit the scope of the initial product to the minimum needed to compile itself. All other features can wait until you have self-compilation working, after which you're operating entirely in your own code. This should flush out errors very quickly and your debugging takes place in a simpler environment than if you wait for everything to be in place. Well, that was my experience, at least.

 

Here's an amended snippet with triple backticks and Go syntax highlighting. I tried Perl, but it didn't look as good.

/*
Comments and /*nested*/ comments are a thing
*/

printf(T: string, c: ... any) -> int; // We just declare the signature; it is implemented in a C file

MyStruct: Struct{
  id: int;
  age: u8;
}

main := ()-> int{
  a: int;
  a = 1;

  b: int = 2;

  c := a + b;

  printf("The value of c is %d\n", c); // 3

  arr: int[10];
  mys: MyStruct;
  mys.id = 2;
  arr[0] = mys.id;

  printf("The id is %d, and the 0th element is %d\n", mys.id, arr[0]); // 2, 2

  // We also have pointers:

  p: *int;
  p = &c;

  *p = 2;

  printf("The value of c is %d\n", c); // 2

  return 0;
}
 

That's pretty cool!

As a feature suggestion (to be honest, I'm not likely to use the language, but I think what you're doing is still really cool), some of the most useful features that I've seen in a language nowadays are: Non-Nullable types, optional chaining, and nullish coalescing. I think that some form of these features would make any language leaps and bounds better than it would be otherwise.

 

Thanks for the suggestions! I already have plans for implementing these, do you have any syntax in mind?

 

Kotlin has nullable types, but its types are non-nullable by default. I'm personally not a fan of that default, and I don't agree with their reasoning for it. So I would suggest a "bang" in your type declaration for non-nullables (a: int! = 42). I love the question mark syntax usually used for optional chaining (blah?.blee?[42]?.blue?()). For nullish coalescing, I think the Elvis operator (?:) and a double question mark (??) are both intuitive and easy to use.
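For a low-level language, one way to read these suggestions is in terms of what the operators would expand to. The C sketch below is only an illustration under assumed names (`User`, `Profile`, and `age_or` are hypothetical); it does not describe how Rumi or Kotlin implements anything. It spells out by hand what an expression like `user?.profile?.age ?? fallback` would mean, with a NULL pointer standing in for a missing value.

```c
#include <stddef.h>

/* Hypothetical data model: every link in the chain may be missing (NULL). */
struct Profile {
    const int *age;                 /* NULL means the age is unknown */
};

struct User {
    const struct Profile *profile;  /* NULL means the user has no profile */
};

/* Hand-written equivalent of: user?.profile?.age ?? fallback
 * The || operator short-circuits left to right, so each pointer is only
 * dereferenced after it has been checked. */
int age_or(const struct User *user, int fallback)
{
    if (user == NULL || user->profile == NULL || user->profile->age == NULL) {
        return fallback;
    }
    return *user->profile->age;
}
```

With a non-nullable marker such as the suggested `int!`, a compiler could skip these checks wherever the type already rules out a missing value.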

 

Hey, I notice the code snippet has inconsistent spacing around : in assignments and type annotations, which makes the syntax harder to follow. I think you are in need of a style guide at this point. On the bright side, I can spot the mistakes without ever having seen this language before, and that's a good sign. Congratulations, I am on the same quest. How did you handle the grammar?

 

All the code is on GitHub. The grammar is defined in Bison syntax in parser.y; you can look at it there.

 

Are there plans to compile to JVM bytecode and integrate with a Maven or Gradle project, like Kotlin or Scala?

 

Not at the moment, but it is possible to convert LLVM's output to JVM bytecode.

 
 
 

Yawn. Yet another block-structured language, clearly modeled after C, C++, Java, et al. Not to take away from your obvious intellectual accomplishments, but each time I visit a link claiming a "new" language, it's more of the same.

Clojure was the last really innovative language I discovered. It's still the coolest new kid on the block.

David

 

This language looks very easy to learn and nice to read.

Congratulations!

I will try to keep an eye on it.

 

It's very cool to work on this kind of stuff: a compiler, a new programming language. Superb.

 

Thank you! I really appreciate it.