Use JFlex to Count Words

Vicente Maldonado. Originally published at Medium.

In the previous story we met JFlex, a tool for generating lexers in Java. The example lexer was contrived and not all that useful, so let’s put JFlex to better use with a slightly more practical example: we’ll count the words, lines and characters the user enters.

The first part of the JFlex file is the same as in the first example:

import java.io.*;

%%

We just import all the java.io classes. To start with, we’ll need a way to store our word, line and char count:

%{

public int chars = 0;
public int words = 0;
public int lines = 0;

chars, words and lines will become public members of the generated class, accessible to the rest of the code. The main method is much the same as in the first example:

public static void main(String[] args) throws IOException
{
    InputStreamReader reader =
        new InputStreamReader(System.in);

    Lexer lexer = new Lexer(reader);

    lexer.yylex();

    System.out.format(
        "Chars: %d\nWords: %d\nLines: %d\n",
        lexer.chars, lexer.words, lexer.lines);
}

%}

There are two differences though:

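  • There is no infinite loop: the lexer reads from System.in until it reaches end of input, which you signal with Ctrl-D on Linux or Ctrl-Z followed by Enter on Windows. This allows the user to enter several lines of text in the terminal.
  • We don’t use the value yylex() returns. None of the rule actions contains a return statement, so a single call keeps scanning until end of input.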
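As a rough illustration of what "reading until end of input" means on the Java side, the sketch below drains a Reader until read() returns -1. The class EofSketch and its method countUntilEof are made-up names for this example and are not part of JFlex or of the generated lexer. A StringReader stands in for System.in so the example runs without a terminal.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical helper, only meant to show the end-of-input convention.
class EofSketch {
    // Consumes the reader one character at a time until read() reports
    // end of input by returning -1, and returns how many characters it saw.
    static int countUntilEof(Reader reader) {
        try {
            int seen = 0;
            while (reader.read() != -1) {
                seen++;
            }
            return seen;
        } catch (IOException e) {
            // Rethrow unchecked so callers do not need a throws clause.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // StringReader stands in for System.in; its end plays the role of Ctrl-D.
        System.out.println(countUntilEof(new StringReader("two\nlines\n")));
    }
}
```

With System.in wrapped in an InputStreamReader, the same loop would only finish once the terminal signals end of input.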

In the next part:

%class Lexer
%type Integer

%%

The generated Java class will be named Lexer and yylex() will return a Java Integer value. This is only because yylex() needs to return something, and by default it returns an object of type Yytoken. Unless you define that class yourself, javac will complain that the Yytoken type doesn’t exist. With %type Integer, yylex() returns null when it reaches end of input.

Finally, in the lexical rules part:

[a-zA-Z]+ { words++; chars += yytext().length(); }
\n { chars++; lines++; }
. { chars++; }

If you type a word, matched by the [a-zA-Z]+ regex, the word count is incremented and the char count is increased by the length of the word. If you press Enter, i.e. \n, both the char count and the line count are incremented. And if you enter any other character, such as *, & or a space, only the char count is incremented.
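To make the bookkeeping concrete, here is a minimal plain-Java sketch that imitates the three rules with java.util.regex. It is a hand-written stand-in, not the code JFlex generates; the class name WordCountSketch and the method count are assumptions made for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hand-written imitation of the three lexer rules (not JFlex output).
class WordCountSketch {
    // One capture group per rule: a run of letters, a newline, any other character.
    // Note: Java's "." skips all line terminators, so a stray \r matches
    // nothing here and is silently ignored by find().
    private static final Pattern TOKEN =
        Pattern.compile("([a-zA-Z]+)|(\\n)|(.)");

    // Returns {chars, words, lines} for the given input.
    static int[] count(String input) {
        int chars = 0, words = 0, lines = 0;
        Matcher m = TOKEN.matcher(input);
        while (m.find()) {
            if (m.group(1) != null) {          // [a-zA-Z]+
                words++;
                chars += m.group(1).length();
            } else if (m.group(2) != null) {   // \n
                chars++;
                lines++;
            } else {                           // . (any other character)
                chars++;
            }
        }
        return new int[] {chars, words, lines};
    }

    public static void main(String[] args) {
        int[] c = count("The quick brown fox\njumps over the lazy dog.\n");
        System.out.format("Chars: %d%nWords: %d%nLines: %d%n", c[0], c[1], c[2]);
    }
}
```

For that two-line input the sketch reports 45 chars, 9 words and 2 lines: 19 + 1 characters for the first line and 24 + 1 for the second, with each newline counted once.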

This allows us to print out the final count of chars, words and lines (back in main):

System.out.format(
    "Chars: %d\nWords: %d\nLines: %d\n",
    lexer.chars, lexer.words, lexer.lines);

Here is the complete file:

import java.io.*;

%%

%{

public int chars = 0;
public int words = 0;
public int lines = 0;

public static void main(String[] args) throws IOException
{
    InputStreamReader reader =
        new InputStreamReader(System.in);

    Lexer lexer = new Lexer(reader);

    lexer.yylex();

    System.out.format(
        "Chars: %d\nWords: %d\nLines: %d\n",
        lexer.chars, lexer.words, lexer.lines);
}

%}

%class Lexer
%type Integer

%%

[a-zA-Z]+ { words++; chars += yytext().length(); }
\n { chars++; lines++; }
. { chars++; }

As in the previous example, you need to run JFlex on the specification file and then compile the generated Java file:

[johnny@test example1]$ jflex Lexer.flex
[johnny@test example1]$ javac Lexer.java
[johnny@test example1]$ java Lexer

Here is a simple demo terminal session:

The quick brown fox
jumps over the lazy dog.
Chars: 45
Words: 9
Lines: 2

You can download the full code from GitHub.

Discussion

Sebastian

Do you know how to count the vowels in what is entered in the terminal? Thanks.