

Lexical Analysis

Martin Corona (martincorona007)

The first phase of a compiler is called lexical analysis, and the component that performs it is the scanner. The lexical analyzer reads the stream of characters that make up the source file and groups them into meaningful sequences, known as lexemes. For each lexeme, it produces a token of the form:

                          <token class, attribute value>

For example, take the statement:
int doc=8;
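
To make the token pairs concrete, here is a minimal sketch in C that splits the statement above into `<token class, attribute value>` pairs. The names `TokenClass`, `Token`, and `tokenize` are hypothetical, and the classes are only an assumption, since the class table used in the exercise is not reproduced here.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical token classes, assumed for illustration only. */
typedef enum {
    TOK_KEYWORD, TOK_IDENTIFIER, TOK_ASSIGN, TOK_NUMBER, TOK_PUNCT, TOK_UNKNOWN
} TokenClass;

typedef struct {
    TokenClass cls;   /* token class */
    char lexeme[32];  /* attribute value: here, simply the lexeme text */
} Token;

/* Splits src into tokens, writing at most max of them to out.
   Returns the number of tokens written. */
size_t tokenize(const char *src, Token *out, size_t max) {
    size_t n = 0;
    while (*src != '\0' && n < max) {
        if (isspace((unsigned char)*src)) { src++; continue; }  /* skip blanks */
        const char *start = src;
        TokenClass cls;
        if (isalpha((unsigned char)*src) || *src == '_') {
            while (isalnum((unsigned char)*src) || *src == '_') src++;
            cls = TOK_IDENTIFIER;
        } else if (isdigit((unsigned char)*src)) {
            while (isdigit((unsigned char)*src)) src++;
            cls = TOK_NUMBER;
        } else {
            cls = (*src == '=') ? TOK_ASSIGN
                : (*src == ';') ? TOK_PUNCT
                : TOK_UNKNOWN;
            src++;
        }
        size_t len = (size_t)(src - start);
        if (len >= sizeof out[n].lexeme) len = sizeof out[n].lexeme - 1;
        memcpy(out[n].lexeme, start, len);
        out[n].lexeme[len] = '\0';
        /* Only one keyword is known to this sketch. */
        if (cls == TOK_IDENTIFIER && strcmp(out[n].lexeme, "int") == 0)
            cls = TOK_KEYWORD;
        out[n].cls = cls;
        n++;
    }
    return n;
}
```

Called on `"int doc=8;"`, this sketch yields five tokens: a keyword `int`, an identifier `doc`, an assignment `=`, a number `8`, and a punctuation mark `;`.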


The following table lists the lexeme classes used in this exercise:
[Table: lexeme classes]

Figure 1 shows the first automaton, A1, which recognizes identifiers:

[Figure 1: automaton A1]
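
An automaton like A1 can be coded as a small state machine. The sketch below is only an approximation: it assumes the usual C rule (a letter or underscore, followed by letters, digits, or underscores), which may differ from the exact transitions in the figure. The function name `is_identifier` is hypothetical.

```c
#include <ctype.h>
#include <stdbool.h>

/* Assumed rule: a letter or underscore, then letters, digits, or underscores. */
bool is_identifier(const char *s) {
    enum { START, IN_ID, REJECT } state = START;
    for (; *s != '\0' && state != REJECT; s++) {
        unsigned char c = (unsigned char)*s;
        switch (state) {
        case START:  /* first character: digits are not allowed here */
            state = (isalpha(c) || c == '_') ? IN_ID : REJECT;
            break;
        case IN_ID:  /* later characters: digits are allowed */
            state = (isalnum(c) || c == '_') ? IN_ID : REJECT;
            break;
        default:
            break;
        }
    }
    return state == IN_ID;  /* IN_ID is the only accepting state */
}
```

With this rule, `doc` and `_x1` are accepted, while `8doc` and the empty string are rejected.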

Figure 2 shows the automaton A2, which recognizes octal, hexadecimal, and real numbers:

[Figure 2: automaton A2]
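
A possible coding of an automaton like A2 is sketched below. The number formats are assumptions borrowed from C, not taken from the figure: octal is a leading `0` followed by octal digits, hexadecimal is `0x` followed by hex digits, and a real number is digits, a point, and more digits. Plain decimal integers are left unclassified because A2 is described as covering only these three classes. The names `NumClass` and `classify_number` are hypothetical.

```c
#include <ctype.h>

typedef enum { NUM_NONE, NUM_OCTAL, NUM_HEX, NUM_REAL } NumClass;

/* Classifies a whole string as octal, hexadecimal, real, or none of them. */
NumClass classify_number(const char *s) {
    enum { START, ZERO, OCT, HEX_X, HEX, INT_PART, DOT, FRAC, REJECT } st = START;
    for (; *s != '\0' && st != REJECT; s++) {
        unsigned char c = (unsigned char)*s;
        switch (st) {
        case START:
            st = (c == '0') ? ZERO : isdigit(c) ? INT_PART : REJECT;
            break;
        case ZERO:  /* a leading 0 may start an octal, a hex, or a real */
            if (c == 'x' || c == 'X') st = HEX_X;
            else if (c >= '0' && c <= '7') st = OCT;
            else if (c == '.') st = DOT;
            else st = REJECT;
            break;
        case OCT:
            st = (c >= '0' && c <= '7') ? OCT : REJECT;
            break;
        case HEX_X:  /* at least one hex digit must follow 0x */
        case HEX:
            st = isxdigit(c) ? HEX : REJECT;
            break;
        case INT_PART:
            st = isdigit(c) ? INT_PART : (c == '.') ? DOT : REJECT;
            break;
        case DOT:    /* at least one digit must follow the point */
        case FRAC:
            st = isdigit(c) ? FRAC : REJECT;
            break;
        default:
            break;
        }
    }
    switch (st) {  /* only these states are accepting */
    case ZERO:
    case OCT:  return NUM_OCTAL;
    case HEX:  return NUM_HEX;
    case FRAC: return NUM_REAL;
    default:   return NUM_NONE;
    }
}
```

Under these assumptions, `017` is octal, `0x1F` is hexadecimal, and `3.14` is real, while incomplete inputs such as `0x` or `3.` are rejected.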

Figure 3 shows the automaton A3, which recognizes delimiters, arithmetic operators, punctuation marks, and the assignment operator:

[Figure 3: automaton A3]
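
Since each of these lexemes is a single character, an automaton like A3 reduces to a lookup. The sketch below shows one way to do it. The character sets are assumptions for illustration, because the actual sets come from the class table of the exercise, and the names `SymClass` and `classify_symbol` are hypothetical.

```c
#include <string.h>

typedef enum {
    SYM_NONE, SYM_DELIMITER, SYM_ARITHMETIC, SYM_PUNCTUATION, SYM_ASSIGNMENT
} SymClass;

/* Classifies one character. The character sets below are assumed, not
   taken from the original class table. */
SymClass classify_symbol(int c) {
    if (c == '\0') return SYM_NONE;  /* strchr would match the terminator */
    if (c == '=') return SYM_ASSIGNMENT;
    if (strchr("(){}[]", c)) return SYM_DELIMITER;
    if (strchr("+-*/%", c)) return SYM_ARITHMETIC;
    if (strchr(";,.:", c)) return SYM_PUNCTUATION;
    return SYM_NONE;
}
```

For the earlier example, `classify_symbol('=')` gives the assignment class and `classify_symbol(';')` gives the punctuation class.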

Figure 4 shows the automaton A4, which recognizes comments:

[Figure 4: automaton A4]

In this automaton, the number 10 is the ASCII code for the newline character.
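
The role of the newline can be seen in a small sketch. It assumes that the comments are C-style line comments that start with `//` and run to the end of the line. That is a guess based on the newline transition, and the figure may also cover other comment forms. The name `line_comment_length` is hypothetical.

```c
#include <stddef.h>

#define NEWLINE 10  /* ASCII code for '\n' */

/* Returns the length of the line comment at the start of s, including the
   terminating newline if there is one, or 0 if s does not start with "//". */
size_t line_comment_length(const char *s) {
    enum { START, SLASH, BODY } st = START;
    size_t i = 0;
    for (; s[i] != '\0'; i++) {
        switch (st) {
        case START:  /* expect the first slash */
            if (s[i] != '/') return 0;
            st = SLASH;
            break;
        case SLASH:  /* expect the second slash */
            if (s[i] != '/') return 0;
            st = BODY;
            break;
        case BODY:   /* consume everything up to and including the newline */
            if (s[i] == NEWLINE) return i + 1;
            break;
        }
    }
    /* Input ended: a comment on the last line is still a comment. */
    return (st == BODY) ? i : 0;
}
```

A scanner could call this at each `/` and skip the returned number of characters before looking for the next token.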


In conclusion, handling EOF, blanks, and comments was the hardest part to understand and solve. When I stored the result of the fgetc() function in a char variable, the value was converted to a different number and never compared equal to EOF. That was freaking me out, but I finally solved it. Sometimes, when the lexical analyzer read a blank, it went into an infinite loop, but that was solved too. Overall, this exercise was a huge challenge, but I made it.
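
The EOF problem has a well-known cause: `fgetc()` returns an `int`, not a `char`. If the result is stored in a `char` on a platform where `char` is unsigned, EOF becomes an ordinary positive value and the comparison with `EOF` never succeeds. The sketch below shows the usual reading loop. It is not necessarily the fix used in this exercise, and the function name `count_nonblank` is hypothetical.

```c
#include <ctype.h>
#include <stdio.h>

/* Counts the non-blank characters in fp, reading until end of file. */
long count_nonblank(FILE *fp) {
    long count = 0;
    int c;  /* int, not char: it must hold every byte value plus EOF */
    while ((c = fgetc(fp)) != EOF) {
        if (isspace(c))
            continue;  /* the blank is already consumed, so the loop advances */
        count++;
    }
    return count;
}
```

Because every call to `fgetc()` consumes one character, skipping a blank with `continue` cannot loop forever. An infinite loop on blanks usually means that some branch re-examines the same character without reading the next one.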

The code:

