Vicente Maldonado

Originally published at Medium

Use JFlex to Count Words

In the previous story we met JFlex, a tool for generating lexers in Java. That example lexer was contrived and not very useful, so let’s put JFlex to better use with a slightly more practical example: counting the words, lines and characters the user enters.

The first part of the JFlex file is the same as in the first example:

import java.io.*;

%%

We just import all the java.io classes. To start with, we’ll need a way to store our word, line and char count:

%{

public int chars = 0;
public int words = 0;
public int lines = 0;

chars, words and lines will become public members of the generated class, accessible to the rest of the code. The main method is much the same as in the first example:

public static void main(String[] args) throws IOException
{
    InputStreamReader reader =
        new InputStreamReader(System.in);

    Lexer lexer = new Lexer(reader);

    lexer.yylex();

    System.out.format(
        "Chars: %d\nWords: %d\nLines: %d\n",
        lexer.chars, lexer.words, lexer.lines);
}

%}

There are two differences though:

  • There is no infinite loop. A single call to yylex() keeps reading from System.in until it reaches end of input, which you signal with Ctrl-D on Linux or Ctrl-Z followed by Enter on Windows. This lets the user type several lines of text in the terminal.
  • We don’t use the value returned by yylex().
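The first point depends on the input source reporting end of input. As a rough illustration in plain Java (this is not the code JFlex generates, and the class and method names are made up for this sketch), a Reader reports end of input by returning -1 from read(), which is what eventually lets a read-until-the-end loop stop:

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

// Hypothetical helper class, for illustration only.
class EofSketch {
    /** Reads until read() reports end of input (-1) and returns the number of chars seen. */
    static int countUntilEof(Reader reader) throws IOException {
        int count = 0;
        while (reader.read() != -1) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        // A StringReader stands in for System.in so the sketch needs no typing.
        System.out.println(countUntilEof(new StringReader("two\nlines\n")));
    }
}
```

With System.in wrapped in an InputStreamReader instead of the StringReader, the loop would only finish once the terminal sends end of input.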

In the next part:

%class Lexer
%type Integer

%%

The generated Java class will be named Lexer and yylex() will return a Java Integer value. We declare the type only because yylex() needs to return something, and by default it returns an object of type Yytoken. Unless you create that class yourself, javac will complain that the Yytoken type doesn’t exist.

Finally, in the lexical rules part:

[a-zA-Z]+ { words++; chars += yytext().length(); }
\n { chars++; lines++; }
. { chars++; }

If you type a word, recognized by the [a-zA-Z]+ regex, the word count is incremented and the char count is increased by the length of the word. If you press Enter, i.e. \n, both the char count and the line count are incremented. Any other single character, such as a space, a digit, * or &, is matched by the . rule and only increments the char count.
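To see how the three rules add up without running JFlex, here is a minimal sketch that imitates them with java.util.regex. It is only an approximation: the class name, the method name and the named groups are invented for this example, and Java's regex alternation picks the first matching alternative rather than the longest match that JFlex uses, which happens to give the same result for these three patterns.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in for the generated lexer, for illustration only.
class RuleSketch {
    // The alternatives mirror the three lexical rules, in the same order.
    private static final Pattern TOKEN =
        Pattern.compile("(?<word>[a-zA-Z]+)|(?<newline>\\n)|(?<other>.)");

    /** Returns {chars, words, lines} for the given input. */
    static int[] count(String input) {
        int chars = 0, words = 0, lines = 0;
        Matcher m = TOKEN.matcher(input);
        while (m.find()) {
            if (m.group("word") != null) {
                words++;                      // [a-zA-Z]+ rule
                chars += m.group().length();
            } else if (m.group("newline") != null) {
                chars++;                      // \n rule
                lines++;
            } else {
                chars++;                      // . rule
            }
        }
        return new int[] { chars, words, lines };
    }

    public static void main(String[] args) {
        int[] c = count("The quick brown fox\njumps over the lazy dog.\n");
        System.out.format("Chars: %d%nWords: %d%nLines: %d%n", c[0], c[1], c[2]);
    }
}
```

For the two-line pangram used in main, the sketch reports 45 chars, 9 words and 2 lines: 19 and 24 characters of text plus one newline each.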

This allows us to print out the final count of chars, words and lines (back in main):

System.out.format(
 "Chars: %d\nWords: %d\nLines: %d\n",
 lexer.chars, lexer.words, lexer.lines);

Here is the complete file:

import java.io.*;

%%

%{

public int chars = 0;
public int words = 0;
public int lines = 0;

public static void main(String[] args) throws IOException
{
    InputStreamReader reader =
        new InputStreamReader(System.in);

    Lexer lexer = new Lexer(reader);

    lexer.yylex();

    System.out.format(
        "Chars: %d\nWords: %d\nLines: %d\n",
        lexer.chars, lexer.words, lexer.lines);
}

%}

%class Lexer
%type Integer

%%

[a-zA-Z]+ { words++; chars += yytext().length(); }
\n { chars++; lines++; }
. { chars++; }

As in the previous example, you need to run JFlex on the specification file, compile the generated Java file, and then run the resulting class:

[johnny@test example1]$ jflex Lexer.flex
[johnny@test example1]$ javac Lexer.java
[johnny@test example1]$ java Lexer

Here is a simple demo terminal session:

The quick brown fox
jumps over the lazy dog.
Chars: 45
Words: 9
Lines: 2

You can download the full code from GitHub.

Discussion (1)

Sebastian

Do you know how to count the vowels in what is entered in the terminal? Thanks.