DEV Community



Making a Language - Theory

1 - Don't Skip Theory

People often skip the theory because they think it's boring, but it's very important to know it: without the theory, you can't put things into practice. I'll try to keep this theory part as short as possible.

2 - How does a language work?

Basically, a language can be represented by:

input > language > output

That may not tell you much on its own, so here is how a hello world would be represented:

print("hello world") > python > hello world
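The same pipeline can be sketched in Python itself. This is only a minimal illustration: the helper name `run` is made up for this example, and it uses Python as "the language" by handing the source to the built-in `exec` and capturing whatever gets printed.

```python
import contextlib
import io


def run(source):
    """Feed source code (input) to the language and return what it prints (output)."""
    buffer = io.StringIO()
    # Redirect anything the program prints into our buffer.
    with contextlib.redirect_stdout(buffer):
        exec(source, {})  # here "the language" is Python itself
    return buffer.getvalue()


# input > language > output
print(run('print("hello world")'), end="")
```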

I think that makes the idea clearer, but a language is not that simple, and that's what we'll see next.

3 - Ways to execute the code

Code can be executed in two ways: interpretation and compilation.

  • Interpretation: analyzes the source code and executes it directly
  • Compilation: analyzes the source code and generates executable code in a low-level language
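To make the difference concrete, here is a rough sketch for a toy language that only knows sums such as `1 + 2 + 3`. Everything in it is an assumption made for illustration: the function names and the `PUSH`/`ADD` instructions are hypothetical, not the instruction set of any real machine.

```python
def interpret(source):
    """Interpretation: analyze the source and execute it right away."""
    total = 0
    for part in source.split("+"):
        total += int(part)  # the work happens while we read the code
    return total


def compile_to_low_level(source):
    """Compilation: analyze the source and emit low-level code without running it."""
    instructions = []
    for index, part in enumerate(source.split("+")):
        instructions.append(f"PUSH {int(part)}")  # hypothetical instruction
        if index > 0:
            instructions.append("ADD")  # hypothetical instruction
    return instructions


print(interpret("1 + 2 + 3"))
print(compile_to_low_level("1 + 2 + 3"))
```

The interpreter returns a result, while the compiler returns another program that still has to be executed by something else.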

There are also two "variants" of compilation:

  • Bytecode: this is how Python and Java work. It is a form of compilation, but unlike the conventional kind, which generates code executed directly by the operating system, the generated bytecode is executed by the language's "virtual machine". This lets the same compiled code run on most systems and helps avoid platform-specific problems.
  • Transpilation: the same process as compilation, but the target is a high-level language (conventional compilation generates code in a low-level language). For example, taking a hello world written in Python and transpiling it to JavaScript.
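Both variants can be tried from Python. The sketch below uses the built-in `compile` and the standard `dis` module to look at real Python bytecode, then a deliberately tiny transpiler built on the standard `ast` module. The function `transpile_print_to_js` is a hypothetical helper written for this example; it only understands a single `print("...")` call and is nowhere near a real transpiler.

```python
import ast
import dis
import json

SOURCE = 'print("hello world")'

# Bytecode: compile() produces a code object holding bytecode, which is
# executed by the Python virtual machine rather than by the operating system.
code = compile(SOURCE, "<example>", "exec")
dis.dis(code)  # prints the instructions; the exact listing varies by Python version
exec(code)     # the virtual machine runs the bytecode


def transpile_print_to_js(source):
    """Transpile one `print("...")` statement to JavaScript (illustrative only)."""
    tree = ast.parse(source)
    if len(tree.body) != 1 or not isinstance(tree.body[0], ast.Expr):
        raise ValueError("expected exactly one expression statement")
    call = tree.body[0].value
    is_simple_print = (
        isinstance(call, ast.Call)
        and isinstance(call.func, ast.Name)
        and call.func.id == "print"
        and not call.keywords
        and len(call.args) == 1
        and isinstance(call.args[0], ast.Constant)
        and isinstance(call.args[0].value, str)
    )
    if not is_simple_print:
        raise ValueError("only print(\"...\") with one string literal is supported")
    # json.dumps gives a double-quoted string literal that is also valid JavaScript.
    return f"console.log({json.dumps(call.args[0].value)});"


print(transpile_print_to_js(SOURCE))
```

Note that the transpiler's output is still high-level source code: it needs a JavaScript engine to run it.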

4 - Lexical, syntactic and semantic analysis

The compiler or interpreter goes through 3 important analyses:

  • Lexical: the job of the lexical analyzer, or scanner, is to read the source code character by character and group the characters into tokens. This phase is also responsible for eliminating "decorative" elements from the program, such as white space, text formatting marks and comments.
  • Syntax: syntax analysis is the process responsible for verifying whether the tokens in the source program form a valid program or not.
  • Semantics: the role of the semantic analyzer is to provide methods by which the structures built by the syntax analyzer can be evaluated or executed. A typical task of the semantic analyzer is checking the types of variables in expressions.
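The three analyses can be sketched for a toy language that only has numbers, strings, `+` and `#` comments. This is a simplified illustration, not how any particular compiler does it: the token names and the functions `lex`, `parse` and `check_types` are all assumptions made for the example.

```python
import re

# Hypothetical token types for the toy language.
TOKEN_SPEC = [
    ("COMMENT", r"#[^\n]*"),
    ("SKIP", r"\s+"),
    ("NUMBER", r"\d+"),
    ("STRING", r'"[^"]*"'),
    ("PLUS", r"\+"),
]
TOKEN_RE = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def lex(source):
    """Lexical analysis: characters -> tokens, dropping white space and comments."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = TOKEN_RE.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at position {pos}")
        if match.lastgroup not in ("SKIP", "COMMENT"):
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


def parse(tokens):
    """Syntax analysis: accept only the shape `operand (+ operand)*`."""
    operands = []
    expect_operand = True
    for kind, text in tokens:
        if expect_operand:
            if kind not in ("NUMBER", "STRING"):
                raise SyntaxError(f"expected an operand, got {text!r}")
            operands.append((kind, text))
        elif kind != "PLUS":
            raise SyntaxError(f"expected '+', got {text!r}")
        expect_operand = not expect_operand
    if expect_operand:
        raise SyntaxError("expression is empty or ends with '+'")
    return operands


def check_types(operands):
    """Semantic analysis: all operands of the sum must have the same type."""
    kinds = {kind for kind, _ in operands}
    if len(kinds) > 1:
        raise TypeError("cannot add a NUMBER and a STRING")
    return kinds.pop()


tokens = lex("1 + 2  # a comment")
print(tokens)
print(check_types(parse(tokens)))
```

Here `1 +` fails in the syntax phase, while `1 + "a"` is syntactically valid and is only rejected by the semantic check.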

A compiler also goes through other phases, which not every language implementation performs. I highlight:

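  • Intermediate code generation: produces code that is not yet the object code (assembly or C), but that is easy for the object code generator to manipulate
  • Optimization: rewrites the intermediate code so that it runs faster or uses fewer resources without changing what the program does
  • Object code generation: here the intermediate code finally becomes object code, and if that object code is itself written in a language such as C, that language's compiler is then called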
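As a rough sketch of these three phases, the example below compiles a sum of constants and variables down to C source text. All of it is assumed for illustration: the `PUSH`/`ADD` intermediate instructions, the function names, and the choice of constant folding as the optimization are not taken from any real compiler.

```python
def generate_ir(operands):
    """Intermediate code: one PUSH per operand and one ADD per '+' (hypothetical IR)."""
    ir = []
    for index, operand in enumerate(operands):
        ir.append(("PUSH", operand))
        if index > 0:
            ir.append(("ADD",))
    return ir


def optimize(ir):
    """Optimization (constant folding): PUSH a, PUSH b, ADD -> PUSH a+b for constants."""
    out = []
    for instr in ir:
        can_fold = (
            instr[0] == "ADD"
            and len(out) >= 2
            and out[-1][0] == "PUSH" and isinstance(out[-1][1], int)
            and out[-2][0] == "PUSH" and isinstance(out[-2][1], int)
        )
        if can_fold:
            b = out.pop()[1]
            a = out.pop()[1]
            out.append(("PUSH", a + b))
        else:
            out.append(instr)
    return out


def generate_c(ir, variables):
    """Object code generation: turn the IR into C source for a function `compute`."""
    stack = []
    for instr in ir:
        if instr[0] == "PUSH":
            stack.append(str(instr[1]))
        else:  # ADD
            right = stack.pop()
            left = stack.pop()
            stack.append(f"({left} + {right})")
    params = ", ".join(f"int {name}" for name in variables)
    return f"int compute({params}) {{ return {stack.pop()}; }}"


# Integers are constants, strings are variable names: this stands for 1 + 2 + x.
ir = generate_ir([1, 2, "x"])
print(ir)
print(optimize(ir))
print(generate_c(optimize(ir), ["x"]))
```

The generated C text would then be handed to a C compiler, which is the last step described above.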
