In this post, I'll implement a simple parser that counts the words and lines in its input.
First, we need to define what we consider whitespace.
// The backslashes are doubled because this string
// will be passed to the `char` parser later.
const whitespace = ' \\r\\n\\t\\f\\v';
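To see why the backslashes are doubled, it helps to look at what the string actually contains. The sketch below assumes the pattern ends up being read like a regular expression character class. That is an assumption about how `char` treats its argument, so plain `RegExp` stands in for it here, and `wsPattern` and `wordPattern` are names made up for this illustration.

```javascript
// The doubled backslashes keep `\r`, `\n`, ... as two-character escape
// sequences inside the string instead of turning them into control characters.
const whitespace = ' \\r\\n\\t\\f\\v';

// Assumption: the pattern is interpreted like a regex character class.
// Plain RegExp is used here as a stand-in for pari's `char`.
const wsPattern = new RegExp(`^[${whitespace}]$`);
const wordPattern = new RegExp(`^[^${whitespace}]$`);

console.log(whitespace.length);      // 11: a space plus five two-character escapes
console.log(wsPattern.test('\n'));   // true
console.log(wordPattern.test('a'));  // true
console.log(wordPattern.test('\t')); // false
```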
Next, we define what a word is: a sequence of non-whitespace characters.
The definitions of the word and space parsers follow from that.
import { char, oneOrMore } from 'pari';
// ...
const wordChar = char(`[^${whitespace}]`);
const wsChar = char(`[${whitespace}]`);
const word = oneOrMore(wordChar);
const space = oneOrMore(wsChar);
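As a rough illustration of what these two definitions recognize, here is a sketch that splits an input into alternating word and space runs using only built-in regular expressions. It does not use pari at all; `token` and `tokenize` are hypothetical names introduced for this example.

```javascript
const whitespace = ' \\r\\n\\t\\f\\v';

// One or more non-whitespace characters, or one or more whitespace characters.
const token = new RegExp(
  `(?<word>[^${whitespace}]+)|(?<space>[${whitespace}]+)`,
  'g'
);

// Split the input into runs and label each run as a word or a space.
function tokenize(input) {
  return Array.from(input.matchAll(token), match => ({
    kind: match.groups.word !== undefined ? 'word' : 'space',
    text: match[0],
  }));
}

console.log(tokenize('hi  there\n').map(t => t.kind));
// [ 'word', 'space', 'word', 'space' ]
```

Every character of the input falls into exactly one run, which is the same property the combined parser relies on later.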
So, how do we keep count? We need to define a custom parser State.
import { State, ... } from 'pari';
class CounterState extends State {
  #wordsCount = 0;
  #linesCount = 0;
  // State must have a `clone` method.
  clone() {
    const state = new CounterState(
      this.input,
      this.index,
      this.status
    );
    // Carry the counters over to the copy.
    state.#wordsCount = this.#wordsCount;
    state.#linesCount = this.#linesCount;
    return state;
  }
  get wordsCount() {
    return this.#wordsCount;
  }
  get linesCount() {
    return this.#linesCount;
  }
  withIncWords() {
    this.#wordsCount += 1;
    return this;
  }
  withIncLines() {
    this.#linesCount += 1;
    return this;
  }
}
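The reason `clone` has to copy the counters is easier to see in isolation. The sketch below uses a hypothetical stand-in for the `State` base class that only stores the three constructor arguments used above; the real pari class may do more. It keeps a single counter for brevity and shows that incrementing a clone leaves the original untouched.

```javascript
// Hypothetical stand-in for pari's State, for illustration only.
class State {
  constructor(input, index = 0, status = 'ok') {
    this.input = input;
    this.index = index;
    this.status = status;
  }
}

class CounterState extends State {
  #wordsCount = 0;

  clone() {
    const state = new CounterState(this.input, this.index, this.status);
    // Private fields are not copied automatically, so copy them by hand.
    state.#wordsCount = this.#wordsCount;
    return state;
  }

  get wordsCount() {
    return this.#wordsCount;
  }

  withIncWords() {
    this.#wordsCount += 1;
    return this;
  }
}

const before = new CounterState('a b');
const after = before.clone().withIncWords();
console.log(before.wordsCount, after.wordsCount); // 0 1
```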
// ...
In the space parser, we increase the line count by one each time we encounter a newline character, and increase the word count by one once the whole run of (possibly multiple consecutive) whitespace characters has been consumed.
// ...
// Count a line for every newline in the run, then one word for the whole run.
const space = oneOrMore(wsChar.ok(state =>
  state.charAt(state.index - 1) === '\n'
    ? state.withIncLines()
    : state
)).ok(state => state.withIncWords());
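In the word parser, we need to handle one edge case: the end of the input. If the input ends right after a word, no whitespace follows to trigger the counts, so the last word and the last line are counted here instead.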
//...
// At the end of the input, count the final word and the final line.
const word = oneOrMore(wordChar.ok(state =>
  state.charAt(state.index) === ''
    ? state.withIncWords().withIncLines()
    : state
));
Finally, we define our word counter parser and pass it a state holding the input.
// ...
import { firstOf, ... } from 'pari';
const wc = oneOrMore(firstOf([word, space]));
const input = ...;
const result = wc.process(
  new CounterState(input)
);
// print word and line counts
console.log(
  result.wordsCount,
  'words',
  result.linesCount,
  'lines'
);
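To cross-check the numbers, the same counting rules can be written as a plain loop with no parser library involved. This is only a sketch: `countLikeParser` is a hypothetical helper, and it assumes, as the snippets above suggest, that the state's index points just past the character that was last consumed.

```javascript
// Plain-loop version of the counting rules described in this post.
function countLikeParser(input) {
  const isSpace = ch => ' \r\n\t\f\v'.includes(ch);
  let words = 0;
  let lines = 0;
  let i = 0;

  while (i < input.length) {
    if (isSpace(input[i])) {
      // Whitespace run: one line per newline, one word for the whole run.
      while (i < input.length && isSpace(input[i])) {
        if (input[i] === '\n') lines += 1;
        i += 1;
      }
      words += 1;
    } else {
      // Word run: only counted here when it reaches the end of the input.
      while (i < input.length && !isSpace(input[i])) i += 1;
      if (i === input.length) {
        words += 1;
        lines += 1;
      }
    }
  }
  return { words, lines };
}

console.log(countLikeParser('one two\nthree')); // { words: 3, lines: 2 }
console.log(countLikeParser('one two\n'));      // { words: 2, lines: 1 }
```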
Thank you for reading 😄. If you have any questions, do not hesitate to leave a comment.