Make another lexer.js, or use the original one and update it?
The issue is updating lexer.js, and the maintainer of the open source project allowed me to modify the code structure. So, is it really a better idea to create a whole new lexer.js, or just to remove the useless parts and replace them with new code?
To decide, let's take a look at the code. What does lexer.js actually do in the program?
This program mainly has four parts:

tokens: pairs of keywords and values. This is the list of the syntax of the program to be compiled. If a word read from a file matches one of the tokens, the program knows that the word is meaningful syntax.

_lexer: basically converts the raw input into meaningful objects which can later be transpiled.

transpile: takes the objects from lexer.js. If all the syntax is correct, it turns the code into JavaScript code.

syntax checker (inside lexer.js): checks whether the objects from lexer.js are all correct; if not, it issues syntax errors.
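To make the "tokens" part concrete, here is a minimal sketch of what such a token table could look like. The identifiers and patterns below are invented for illustration; the project's actual tokens.js will differ.

```javascript
// Hypothetical token table: each identifier is paired with the source of
// the regular expression that matches it. Names and patterns are made up.
const tokens = {
  "print": { match: "print\\((.*?)\\)" },
  "declare": { match: "var\\s+[a-zA-Z_]\\w*" },
  "number": { match: "\\d+" }
};

// A word is "meaningful syntax" if some token's regex matches it entirely.
function isMeaningful(word) {
  return Object.keys(tokens).some(
    (id) => new RegExp(`^(${tokens[id].match})$`).test(word)
  );
}
```

So a word like `print(x)` or `42` would be recognized as syntax, while an unknown word would not.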
So, let's explain the lexer engine again, step by step:

1. Find all of the possible tokens (the words predefined in the program to be compiled later, e.g. 'print()', 'var', ...).
2. Transform the raw data into an array of tokens (this needs to be sorted: first in, first out).
3. Compare each token with an object that has an identifier (the name of the token) and a regex (regular expression). If they match, create an object (token and identifier). This object will be compiled later in transpiler.js (the final version of the object).
I should not modify steps 1 and 2, because if they were changed, all of the predefined tokens would become useless (a new token system would be needed). Also, the lexer engine has to pass the objects to transpiler.js so that the raw code can be successfully compiled later. So, I could only change step 3.
Things to be upgraded
So far, there is a lot of work to be done, like var ----> let or const, removing global variables, reforming the code structure, and so on.
This week, I changed the structure of the code first so that the upgrades can be done well later. First, I removed all of the global variables in lexer.js:
let colors = require("./colors");
let tokens = require("./tokens");

let regexTerms = [];
let rawRegex = "";
let mainRegex;
let resolverRegexes = [];

Object.keys(tokens.tokens).forEach((token) => { // Populate main regex
  regexTerms.push(`${tokens.tokens[token].match}`);
});

Object.keys(tokens.tokens).forEach((token) => { // Build per-token resolver regexes
  resolverRegexes.push({
    "match": new RegExp(`(${tokens.tokens[token].match})`, "gi"),
    "data": {
      "identifier": token
    }
  });
});
Then I created a new .js file called lexerToken.js. This new file generates the intermediate token objects, which help create the final version of the objects (used in transpiler.js).
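As a rough sketch of that split (the function name buildTokenData is my own; the actual lexerToken.js in the commit may be structured differently), the former globals can become local state returned from a builder function that takes the token table as an argument:

```javascript
// lexerToken.js (sketch): no module-level globals; the caller passes the
// token table in and gets the regex data back.
function buildTokenData(tokenTable) {
  const regexTerms = [];
  const resolverRegexes = [];

  Object.keys(tokenTable).forEach((token) => { // populate main regex terms
    regexTerms.push(`${tokenTable[token].match}`);
  });

  Object.keys(tokenTable).forEach((token) => { // build per-token resolvers
    resolverRegexes.push({
      match: new RegExp(`(${tokenTable[token].match})`, "gi"),
      data: { identifier: token }
    });
  });

  return { regexTerms, resolverRegexes };
}

module.exports = { buildTokenData };
```

This way lexer.js can require lexerToken.js and ask for the data it needs instead of sharing mutable globals.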
This commit can be found here.
After this work, lexerToken.js only handles the token-related jobs, and lexer.js only handles transforming the raw data into the completed objects using the token objects.
Next step
The thing is, it takes a long time to compile a big amount of code, so the lexer engine needs a smaller runtime. The main reason for the long runtime is probably too many loops. So, the next step is to find the extra loops, reduce them, and change the algorithm.
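One easy example of such a reduction (a sketch against the globals shown earlier, not the committed code): the two Object.keys(...).forEach loops iterate over the same token table twice, so they can be merged into a single pass:

```javascript
// Single pass over the token table instead of two separate loops.
function buildTokenData(tokenTable) {
  const regexTerms = [];
  const resolverRegexes = [];
  for (const token of Object.keys(tokenTable)) {
    regexTerms.push(`${tokenTable[token].match}`);       // main regex term
    resolverRegexes.push({                               // per-token resolver
      match: new RegExp(`(${tokenTable[token].match})`, "gi"),
      data: { identifier: token }
    });
  }
  return { regexTerms, resolverRegexes };
}
```

Merging loops like this does not change the asymptotic complexity, but it halves the number of passes over the table, and the same idea applies to any bigger loops hiding in the lexer engine.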