Will Preble

Regular Expressions in JavaScript

Instantiation

I find it strange that all of the JavaScript courses I've taken online, as well as my current immersion program, have skipped over regular expressions. I first encountered them in solution code for some beginner algorithm puzzles on Free Code Camp. Frankly, they looked like witchcraft. That's because JavaScript borrows its regex syntax from Perl. Although the syntax can be strange at first, regular expressions are powerful tools for matching patterns in strings.

In JavaScript, regular expressions are built-in objects, like Array, Date, and Promise, and have their own prototype methods. They can be instantiated with RegExp literal syntax or via a constructor.

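const myFirstRegExp = /[abc]/g; // literal syntax
const mySecondRegExp = new RegExp('[abc]', 'g'); // constructor syntax
myFirstRegExp === mySecondRegExp; // false: two separate objects, even though the pattern is the same
myFirstRegExp.source === mySecondRegExp.source; // true: both patterns are '[abc]'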
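As a small sketch of when the constructor earns its keep: it accepts a string, so the pattern can be built from a value that is only known at runtime. The variable names below are made up for illustration.

```javascript
// Hypothetical example: the word to search for is only known at runtime.
const animal = 'dog';
const animalRegex = new RegExp(animal, 'g'); // same pattern as /dog/g

const literalRegex = /dog/g;

// Two regex objects are never strictly equal, even with identical patterns,
// so compare their source and flags instead.
const sameObject = animalRegex === literalRegex; // false
const samePattern =
  animalRegex.source === literalRegex.source &&
  animalRegex.flags === literalRegex.flags; // true

console.log(sameObject, samePattern);
```

If the runtime string can contain characters such as . or *, they would need to be escaped first, otherwise they keep their special regex meaning.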

They can also be inserted as arguments into common string methods such as replace or match.

const sentence = 'I walked my dog to the store. My dog is happy.';
console.log(sentence.replace(/dog/g, 'hippo'));
// logs => I walked my hippo to the store. My hippo is happy.
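The match method works along the same lines. Here is a minimal sketch using the same sentence (the variable names are mine, not from any library):

```javascript
const petSentence = 'I walked my dog to the store. My dog is happy.';

// With the g flag, match returns an array of every matching substring.
const dogMatches = petSentence.match(/dog/g);
console.log(dogMatches); // ['dog', 'dog']

// When nothing matches, match returns null rather than an empty array.
const catMatches = petSentence.match(/cat/g);
console.log(catMatches); // null
```

The null return is worth remembering, since calling .length on it directly would throw.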

Whether you're a JavaScript beginner or an experienced developer who's been putting it off, now is the time to learn regular expressions.

Tools for Writing Regexes

Before we dive into the details, I want to introduce a couple of tools that I use every single time I need to build a regular expression.

The first is RegExPal. RegExPal provides a fabulous GUI for visualizing regexes and matches in sample text. It is very easy to write a regular expression that matches too often with false positives or one that misses a specific edge case. It can also be difficult to determine the order of operations of complicated regular expressions since the syntax is very compressed and characters can serve different purposes in different contexts. When hovering your cursor over characters in the regular expression you write, RegExPal will tell you exactly how that character is behaving in your regular expression.


Next is this cheat sheet at Debuggex. Even as I become more familiar with writing regexes, I always have a cheat sheet close at hand. It's much more efficient than browsing the wealth of material in the MDN docs. I like this cheat sheet a little more than the one that comes with RegExPal. Just make sure you have the correct language selected! You will notice that Perl Compatible Regular Expressions (PCREs) don't change much between languages, although Python adds a few extra features over the JavaScript set.

If you find yourself looking for a deeper explanation, don't forget about the MDN JavaScript docs. This is my default reference for any native JavaScript features, and regexes are no exception.

Flags

The most common way to instantiate a regular expression is via the literal syntax. The entire expression is contained between a pair of forward slashes. Flags can be tacked on after the closing slash. The two flags to know are global, g, and case insensitive, i. With the global flag, your regex will match every occurrence. Without it, it will only match the first. The case insensitive flag lets your regex match both upper and lower case letters when that is desirable.

const sentence = 'My dog is named Dog. He is friends with lots of dogs.';

const dogRegex1 = /dog/; // matches the first dog 
const dogRegex2 = /dog/g; // matches the first dog and the dog inside of dogs
const dogRegex3 = /dog/gi; // matches all three instances of dog
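To see the flags in action, here is a rough sketch that counts matches for each regex. The countMatches helper is a hypothetical name I made up for this example.

```javascript
const dogSentence = 'My dog is named Dog. He is friends with lots of dogs.';

// Hypothetical helper: match returns null when nothing is found,
// so fall back to an empty array before reading length.
const countMatches = (text, regex) => (text.match(regex) || []).length;

const noFlagCount = countMatches(dogSentence, /dog/);      // 1
const globalCount = countMatches(dogSentence, /dog/g);     // 2
const globalInsensitiveCount = countMatches(dogSentence, /dog/gi); // 3

console.log(noFlagCount, globalCount, globalInsensitiveCount);
```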

Groups and Ranges

If you want to match one of a group of possible characters, wrap the group of characters in square brackets. Ranges can be established with a hyphen between characters. Ranges are intuitive for letters of the same case and digits, e.g. a-z, A-Z, or 0-9.

Say you want to find all the individual digits in a document. The following two regular expressions are equivalent.

const allDigits1 = /[0123456789]/g;
const allDigits2 = /[0-9]/g;
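Here is a quick sketch confirming that both forms find the same digits, plus one example of combining several ranges inside a single pair of brackets. The sample strings are made up.

```javascript
const roomLabel = 'Room 101, floor 3';

const digitsLongForm = roomLabel.match(/[0123456789]/g);
const digitsRangeForm = roomLabel.match(/[0-9]/g);
console.log(digitsRangeForm); // ['1', '0', '1', '3']

// Several ranges can sit side by side in one pair of brackets.
// This one matches any single hexadecimal digit.
const hexDigits = '#1aF'.match(/[0-9a-fA-F]/g);
console.log(hexDigits); // ['1', 'a', 'F']
```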

Character Groups

But there's an even more concise way: special characters that represent character groups! Digits are represented by a lowercase d. To give a letter this special meaning, it must be escaped with a preceding backslash. The following regex is equivalent to the previous two.

const allDigits3 = /\d/g;

If you wanted to match every character that is not a digit, you could use the special character \D.

const nonDigits = /\D/g;
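A short sketch of both in use, with a made-up phone number as the sample text:

```javascript
const phoneNumber = '555-0199';

// \d picks out each digit individually.
const digitList = phoneNumber.match(/\d/g);
console.log(digitList.length); // 7

// \D matches everything else, which makes it handy for stripping
// formatting characters out of a string.
const digitsOnly = phoneNumber.replace(/\D/g, '');
console.log(digitsOnly); // '5550199'
```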

This is a common form with character groups. The inverse is often the capital version of the special character. Check out this table in the Debuggex cheat sheet I referenced above.
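Two other pairs that follow the same lowercase/uppercase pattern are \w and \W (word characters, meaning letters, digits, and underscore, versus everything else) and \s and \S (whitespace versus non-whitespace). A minimal sketch with a made-up sample string:

```javascript
const greeting = 'Hi there_42!';

const wordChars = greeting.match(/\w/g).join('');        // 'Hithere_42'
const nonWordChars = greeting.match(/\W/g);              // [' ', '!']
const whitespaceChars = greeting.match(/\s/g);           // [' ']
const nonWhitespaceChars = greeting.match(/\S/g).join(''); // 'Hithere_42!'

console.log(wordChars, nonWordChars, whitespaceChars, nonWhitespaceChars);
```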


Quantifiers

What if you wanted to match a string of 16 digits, such as a credit card number? Quantifiers have you covered. The most useful form is {n}, where n is the number of repetitions to match. A quantifier always applies to the character or character group immediately preceding it. Add a comma after n to match n or more repetitions. Add a second integer after the comma to match a range of quantities.

const sixteenDigits = /\d{16}/;
const atLeastThreeDigits = /\d{3,}/;
const fiveToTenDigits = /\d{5,10}/;
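Here is a rough sketch of how these behave against sample input. One thing to watch: without any assertions, a quantified pattern is happy to match inside a longer run of digits. The sample strings are made up.

```javascript
// Exactly 16 digits in a row, anywhere in the string.
const exactlySixteen = /\d{16}/.test('1234567890123456'); // true

// A 20-digit string still matches, because it contains 16 digits in a row.
const sixteenInsideTwenty = /\d{16}/.test('12345678901234567890'); // true

// Three or more digits.
const twoDigits = /\d{3,}/.test('12');     // false
const fiveDigits = /\d{3,}/.test('12345'); // true

// Quantifiers are greedy: given 13 digits, {5,10} grabs the first 10.
const greedyMatch = '1234567890123'.match(/\d{5,10}/)[0]; // '1234567890'

console.log(exactlySixteen, sixteenInsideTwenty, twoDigits, fiveDigits, greedyMatch);
```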

Assertions

Say you only wanted to match strings that begin or end with a number. You will need an assertion to accomplish this. The ^ character asserts that the string begins with the pattern following it. The $ character asserts that the string ends with the pattern preceding it.

const beginWithALetter = /^[a-zA-Z]/;
const endWithAWhiteSpace = /\s$/;
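A minimal sketch of these assertions using the test method, including the "begins or ends with a number" case from the paragraph above. The sample strings are made up.

```javascript
// Begins with a letter?
const startsWithLetter = /^[a-zA-Z]/.test('hello 123');    // true
const startsWithLetterNo = /^[a-zA-Z]/.test('123 hello');  // false

// Ends with whitespace?
const endsWithSpace = /\s$/.test('trailing space ');       // true
const endsWithSpaceNo = /\s$/.test('no trailing space');   // false

// Begins or ends with a digit?
const beginsWithDigit = /^\d/.test('101 Dalmatians');      // true
const endsWithDigit = /\d$/.test('Room 101');              // true
const roomBeginsWithDigit = /^\d/.test('Room 101');        // false

console.log(startsWithLetter, endsWithSpace, beginsWithDigit, endsWithDigit);
```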

Remember that this is the entire string, not the beginnings of each word! Look into the special characters for word boundaries if you want all words that begin with the letter s, for example.
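As a sketch of that word boundary idea: \b matches the position between a word character and a non-word character, so one possible way to collect every word starting with s is \bs followed by any number of word characters. The sample sentence is made up.

```javascript
const tongueTwister = 'She sells seashells by the shore';

// ^ only looks at the very start of the whole string.
const stringStartsWithS = /^s/i.test(tongueTwister); // true

// \b anchors at the start of each word instead.
// \w* then consumes the rest of that word.
const wordsStartingWithS = tongueTwister.match(/\bs\w*/gi);
console.log(wordsStartingWithS); // ['She', 'sells', 'seashells', 'shore']
```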

Putting it All Together

Let's write a simplified version of a regular expression to identify a MasterCard number given the following conditions:

  • Starts with either 51, 52, 53, 54, or 55
  • Must be 16 characters in length

We will use an assertion to ensure the first digit is 5. We will use a character group with a range to ensure the second digit is 1, 2, 3, 4, or 5. Then we use a special character to specify digits, followed by a quantifier for the remaining 14. And finally we will add an assertion to ensure that the string ends right after those 14 digits. But don't take my word for it, be sure to test it out in RegExPal!

const regexMasterCard = /^5[1-5]\d{14}$/;
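If you would rather check it in code, here is a quick sketch. The card numbers are made-up strings built only to exercise each condition, not real card numbers, and this simplified pattern does no checksum validation.

```javascript
const masterCardPattern = /^5[1-5]\d{14}$/;

// Made-up inputs: '51' followed by fourteen zeros is 16 digits long.
const validShape = masterCardPattern.test('51' + '0'.repeat(14));      // true

// Wrong second digit (6 is outside the 1-5 range).
const wrongPrefix = masterCardPattern.test('56' + '0'.repeat(14));     // false

// Too short (15 digits) and too long (17 digits).
const tooShort = masterCardPattern.test('51' + '0'.repeat(13));        // false
const tooLong = masterCardPattern.test('51' + '0'.repeat(15));         // false

// Spaces are not digits, so a formatted number fails as well.
const withSpaces = masterCardPattern.test('5100 0000 0000 0000');      // false

console.log(validShape, wrongPrefix, tooShort, tooLong, withSpaces);
```

The two assertions are what rule out the too-long case. Without ^ and $, the 17-digit string would still contain a matching 16-digit run.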

Conclusion

In this blog, I introduced some useful tools for building your own custom regular expressions in JavaScript. I recommend using RegExPal and the Debuggex cheat sheet. I covered instantiation syntax, flags, groups, ranges, character classes, special characters, quantifiers, and assertions. This will allow you to match most patterns in a string of any length. Once you have these down, be sure to look into capturing groups!

Thanks for reading!
