DEV Community

Jalal 🚀

The extraordinary behavior of match()

If you had a very rough week like I did, let's do some coding therapy: something healing and productive at the same time.

There are many ways to describe my relationship with regex. Complicated at best, confusing most of the time.

It's something I usually try to avoid, but eventually you have to face it and get over it. And yes, no matter how much I pretend to know it, deep down I know I don't.

That was the case when our friends from the lovely dev community pointed out another solution for counting and getting statistics from a string: using match instead of split.

String.prototype.match()

match() itself is simple. It searches a given string and returns an array of results based on a regular expression, or null when nothing matches.

const regex = /cat/g;
"fat cat, fast cat".match(regex);

// (2) ["cat", "cat"]
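
The g flag is doing real work in that snippet. Here's a small sketch (the variable names are mine, just for illustration) comparing the same pattern with and without it. Without g, match() returns only the first match, along with some extra details such as its index.

```javascript
const text = "fat cat, fast cat";

// With the g flag: every match, as a plain array of strings.
const allMatches = text.match(/cat/g);
console.log(allMatches); // ["cat", "cat"]

// Without the g flag: only the first match, plus details like its position.
const firstMatch = text.match(/cat/);
console.log(firstMatch[0]);    // "cat"
console.log(firstMatch.index); // 4
```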

/cat/g looks for c followed by a followed by t, anywhere in the string. Let's see the result for this one:

- "fat cat, fast cat".match(regex);
+ "category: fat cat, fast cat".match(regex);
"category: fat cat, fast cat".match(/cat/g);

// (3) ["cat", "cat", "cat"];

Unexpected? Maybe. But it's also clear: you got what you asked for. cat is in category. Do you need a different output? Use a more specific pattern.

Let's change the pattern. I need to match cat preceded by whitespace \s, then c followed by a followed by t, and ending with whitespace, a dot, or a comma [\s.,]. Inside a character class no | is needed; a | there would be matched as a literal pipe.

const regex = /\s(cat)[\s.,]/g;
"category. fat cat, fast cat. category".match(regex);

// (2) [" cat,", " cat."]

A better result indeed. At least category is not counted.
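
If you don't want the surrounding space and punctuation in the results, one possible alternative (not the approach used in this post, just a sketch) is the word boundary \b. It matches a position rather than a character, so nothing extra ends up in the output.

```javascript
const sentence = "category. fat cat, fast cat. category";

// \b matches the position between a word character and a non-word character,
// so the "cat" inside "category" is skipped and no punctuation is captured.
const wholeWords = sentence.match(/\bcat\b/g);
console.log(wholeWords); // ["cat", "cat"]
```

Unlike the \s version, this also matches a cat sitting at the very start or end of the string.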

So, to continue what we've already started in the previous post, let's recap some shorthands we need to know before we start counting:

\w: matches a word character: a letter, a digit, or an underscore [a-zA-Z0-9_]
+: matches the preceding token one or more times

Which means \w+ matches a whole word.

"fat cat".match(/\w+/g);
// (2) ["fat", "cat"]
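
One thing to keep in mind: "word" here means a run of [a-zA-Z0-9_] only. The sketch below shows where that breaks down, and one possible workaround using Unicode property escapes with the u flag. That workaround is my own assumption, not something this post relies on.

```javascript
// \w only covers [a-zA-Z0-9_], so apostrophes and accented letters split a word.
const asciiWords = "don't stop".match(/\w+/g);
console.log(asciiWords); // ["don", "t", "stop"]

const accented = "caf\u00e9 au lait"; // "café au lait"
const splitWords = accented.match(/\w+/g);
console.log(splitWords); // ["caf", "au", "lait"]

// Possible alternative: treat any Unicode letter or digit as part of a word.
const unicodeWords = accented.match(/[\p{L}\p{N}_]+/gu);
console.log(unicodeWords); // ["café", "au", "lait"]
```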

\n: matches newline

"fat cat".match(/\n/g);
// null

"fat cat \n fast cat".match(/\n/g);
// (1) ["↵"]

A string with no newline is still one line, but it gives zero matches. So the number of lines is the number of matches plus 1.
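
As a quick sketch of that rule (countLines is a hypothetical helper name, not part of the final code):

```javascript
// Hypothetical helper: the number of lines is the number of newlines plus one.
function countLines(text) {
  // match() returns null when nothing matches, so fall back to an empty array.
  const newlines = text.match(/\n/g) || [];
  return newlines.length + 1;
}

console.log(countLines("fat cat"));             // 1
console.log(countLines("fat cat \n fast cat")); // 2
```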

\s: matches a whitespace character including newline \n and tab \t

"fat cat, fast cat".match(/\s/g);
// (3) [" ", " ", " "]

"fat cat\n fast cat".match(/\s/g);
// (4) [" ", "↵", " ", " "]

Since \s also counts newlines, we subtract them to get the spaces:

Spaces = (number of \s matches) - (number of \n matches)
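
Here is that subtraction as a runnable sketch. countSpaces is a hypothetical name, and the tab example at the end is my own addition to show a limitation: \s also matches tabs, so they get counted as spaces.

```javascript
// Hypothetical helper: whitespace matches minus newline matches.
function countSpaces(text) {
  const whitespace = (text.match(/\s/g) || []).length;
  const newlines = (text.match(/\n/g) || []).length;
  return whitespace - newlines;
}

console.log(countSpaces("fat cat\n fast cat")); // 3

// Caveat: a tab is whitespace too, so it is counted as a "space" here.
console.log(countSpaces("fat\tcat")); // 1

// Matching a literal space character avoids that.
const literalSpaces = ("fat\tcat".match(/ /g) || []).length;
console.log(literalSpaces); // 0
```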

Building count()

const str = "Hello World\n How are you doing";

function count(str) {
  const lines = (str.match(/\n/g) || []).length;
  // (1) ["↵"]

  const spaces = (str.match(/\s/g) || []).length;
  // (6) [" ", "↵", " ", " ", " ", " "]
  // 6 - 1 = 5

  const words = str.match(/\w+/g) || [];
  // (6) ["Hello", "World", "How", "are", "you", "doing"]

  const total = str.length;
  // 30

  return {
    lines: lines + 1,
    spaces: spaces - lines,
    words,
    total,
  };
}

Note: str.match(reg) || [] is there in case no match is found, because match() then returns null instead of an empty array.
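
To see it all together, here's a usage sketch. It repeats a condensed copy of count() so it can run on its own, and the empty-string call is my own example of the null fallback in action.

```javascript
// Condensed copy of count() from above, so this snippet is self-contained.
function count(str) {
  const lines = (str.match(/\n/g) || []).length;
  const spaces = (str.match(/\s/g) || []).length;
  const words = str.match(/\w+/g) || [];
  return { lines: lines + 1, spaces: spaces - lines, words, total: str.length };
}

const stats = count("Hello World\n How are you doing");
console.log(stats.lines, stats.spaces, stats.words.length, stats.total); // 2 5 6 30

// Nothing matches here: every match() call returns null, and || [] keeps it safe.
const empty = count("");
console.log(empty); // { lines: 1, spaces: 0, words: [], total: 0 }
```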

Here's a good resource for learning regex: github/learn-regex. You can also practice regex live via regexr.


Please leave ⭐️ if you like it. Feedback is more than welcome 👋👋👋
