
Regular Expressions (RegEx) in JavaScript

JeffUbayi ・6 min read

Regular expressions are a way to describe patterns in string data. They form a small language of their own, which is part of many programming languages such as JavaScript, Perl, Python, PHP, and Java.

A regex is written in a specific syntax and then usually applied to a larger string of text to see whether the string meets the conditions it defines. Regular expressions have the general syntax of a pattern followed by optional modifiers, like so:

/pattern/modifiers

The pattern is the sequence of characters to match, and the modifiers (also called flags) are optional single letters that change how the whole expression is applied.

Creating a Regular Expression
There are two ways to create a regular expression in JavaScript. It can be created either with the RegExp constructor or by enclosing the pattern in forward slashes ( / ).

Regular expression using constructor:

let regex = new RegExp('abc');

Regular expression using literal:

let regex = /abc/;

No matter which method you choose, the result is a regex object. Both regex objects have the same methods and properties attached to them.

Since forward slashes enclose the pattern in the literal form, you have to escape a forward slash ( / ) with a backslash ( \ ) if you want to use it as part of the regex.
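As a small sketch of the difference between the two forms, the snippet below matches a path that contains forward slashes. The URL and variable names are made up for illustration. Because the constructor takes an ordinary string, the slashes need no escaping there, but any backslash in the pattern has to be doubled.

```javascript
// Literal form: each forward slash inside the pattern is escaped with a backslash.
const literalPath = /\/api\/v1/;

// Constructor form: the pattern is a string, so forward slashes need no escaping.
const constructedPath = new RegExp('/api/v1');

// Backslashes, however, must be doubled inside a string pattern.
const constructedDigits = new RegExp('\\d+'); // same pattern as /\d+/

console.log(literalPath.test('https://example.com/api/v1/users'));     // true
console.log(constructedPath.test('https://example.com/api/v1/users')); // true
console.log(constructedDigits.test('room 42'));                        // true
```

The constructor form is mostly useful when the pattern is only known at run time; for fixed patterns the literal form is usually easier to read.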

RegEx Methods
We have two methods for testing regular expressions:

1. test()
This method tests whether a match is found.
It returns a boolean: true or false.

let regex = /hello/;
let text = 'hello devs';
let result = regex.test(text);
console.log(result);
// returns true

2. exec()
This method returns an array containing the full match followed by any captured groups, or null if there is no match.


let regex = /hello/;
let text = 'hello devs';
let results = regex.exec(text);
console.log(results);
// returns [ 'hello', index: 0, input: 'hello devs', groups: undefined ]

// 'hello' -> is the matched text.
// index: -> is the position in the string where the match starts.
// input: -> is the actual string passed.
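To illustrate the "matched groups" part, here is a minimal sketch that uses parentheses to capture parts of the match and also shows what exec() returns when nothing matches. The pattern and sample strings are illustrative only.

```javascript
// Two capture groups: the greeting and the word that follows it.
const greeting = /(hello) (\w+)/;
const found = greeting.exec('say hello devs');

console.log(found[0]);    // hello devs  (the full match)
console.log(found[1]);    // hello       (first capture group)
console.log(found[2]);    // devs        (second capture group)
console.log(found.index); // 4           (position where the match starts)

// When nothing matches, exec() returns null rather than an empty array.
const missing = greeting.exec('goodbye');
console.log(missing);     // null
```

Because of the null case, it is worth checking the result before indexing into it.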

Simple regex patterns
The most basic pattern is literal text: the regex matches if that exact sequence of characters appears anywhere in the test string.

let regex = /hello/;
console.log(regex.test('hello devs'));
// true

Special characters
Now, let’s tap into the full power of regular expressions to handle more complex cases.
There are special symbols and characters that you need to know in order to fully understand regular expressions.

Flags
Regular expressions have several optional flags, or modifiers. Let's work with the two most important ones.
i: This makes the search case-insensitive.
g: This makes the search global, so it does not stop after the first match.

let regexGlobal = /abc/g;
console.log(regexGlobal.test('abc abc'));
// returns true. The g flag matters for methods that collect matches:
// they process every occurrence of 'abc' instead of stopping at the first.
let regexInsensitive = /abc/i;
console.log(regexInsensitive.test('Abc'));
// returns true, because the case of the characters doesn't matter
// in a case-insensitive search.
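The effect of g is easier to see with a method that returns the matches themselves. The sketch below uses the standard String.prototype.match() method, which the article has not covered, on a made-up sample string. It also shows one side effect worth knowing: a global regex stores its position in lastIndex, so repeated test() calls on the same object can give different answers.

```javascript
const sample = 'abc Abc abc';

// Without g, match() reports only the first match.
console.log(sample.match(/abc/).length); // 1
// With g, match() returns every match.
console.log(sample.match(/abc/g));       // [ 'abc', 'abc' ]
// Flags can be combined.
console.log(sample.match(/abc/gi));      // [ 'abc', 'Abc', 'abc' ]

// A global regex remembers where the last match ended.
const globalAbc = /abc/g;
console.log(globalAbc.test('abc')); // true  (lastIndex is now 3)
console.log(globalAbc.test('abc')); // false (search resumed at index 3; lastIndex resets to 0)
```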

Character groups:
Character set [xyz] — A character set is a way to match different characters in a single position. It matches any single character in the string that is one of the characters inside the brackets.

let regex = /[bt]ear/;
console.log(regex.test('tear'));
// returns true
console.log(regex.test('bear'));
// returns true
console.log(regex.test('fear'));
// returns false

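Note — Most special characters (such as ., +, * and ?) lose their special meaning inside a character set. The exceptions are the caret (^), which negates the set when it is the first character, the hyphen (-), which forms a range between two characters, the backslash (\), and the closing bracket (]).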
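A short sketch of both points in the note, reusing the 'ear' words from the example above (the variable names are illustrative): a leading caret negates the set, while characters like . and + are treated literally inside the brackets.

```javascript
// [^bt] matches any single character that is NOT 'b' or 't'.
const notBorT = /[^bt]ear/;
console.log(notBorT.test('fear')); // true
console.log(notBorT.test('bear')); // false
console.log(notBorT.test('tear')); // false

// Inside brackets, '.' and '+' are plain characters, not special symbols.
const dotOrPlus = /[.+]/;
console.log(dotOrPlus.test('a+b')); // true
console.log(dotOrPlus.test('abc')); // false
```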

Ranges [a-z] — Suppose we want to match any letter of the alphabet in a single position. We could write all the letters inside the brackets, but there is an easier way: ranges.

let regex = /[a-z]ear/;
console.log(regex.test('fear'));
// returns true
console.log(regex.test('tear'));
// returns true

Meta-characters — Meta-characters are characters with a special meaning. There are many meta-characters, but I am going to cover the most important ones here.

\d — Match any digit character ( same as [0-9] ).
\w — Match any word character. A word character is any letter, digit, or underscore (same as [a-zA-Z0-9_]), i.e. an alphanumeric character or underscore.
\s — Match a whitespace character (spaces, tabs, newlines, etc.).
\t — Match a tab character only.
\b — Find a match at the beginning or end of a word. Also known as a word boundary.
. — (period) Matches any single character except a newline.
\D — Match any non-digit character (same as [^0-9]).
\W — Match any non-word character (same as [^a-zA-Z0-9_]).
\S — Match a non whitespace character.
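The following sketch tries several of these meta-characters against one made-up sample line. It uses String.prototype.match(), a standard string method, to show the matched text rather than just true or false.

```javascript
// Sample text containing digits, a tab character and an underscore.
const line = 'Order 66 shipped\tto Ann_B';

console.log(line.match(/\d+/)[0]); // 66
console.log(line.match(/\w+/g));   // [ 'Order', '66', 'shipped', 'to', 'Ann_B' ]
console.log(/\t/.test(line));      // true  (there is a tab before 'to')
console.log(/\bship/.test(line));  // true  ('ship' starts a word)
console.log(/\bhip/.test(line));   // false ('hip' only occurs inside 'shipped')
console.log(/O.der/.test(line));   // true  ('.' stands for the 'r')
console.log(/\D/.test('2024'));    // false (every character is a digit)
```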

Quantifiers — Quantifiers are symbols that specify how many times the preceding expression must occur for a match.

+ — Matches the preceding expression 1 or more times.

let regex = /\d+/;
console.log(regex.test('8'));
// true
console.log(regex.test('88899'));
// true
console.log(regex.test('8888845'));
// true

* — Matches the preceding expression 0 or more times.

let regex = /go*d/;
console.log(regex.test('gd'));
// true
console.log(regex.test('god'));
// true
console.log(regex.test('good'));
// true
console.log(regex.test('goood'));
// true

? — Matches the preceding expression 0 or 1 time; that is, the preceding expression is optional.

let regex = /goo?d/;
console.log(regex.test('god'));
// true
console.log(regex.test('good'));
// true
console.log(regex.test('goood'));
// false

^ — An anchor rather than a quantifier: it matches the beginning of the string, so the pattern that follows it must appear at the start of the test string.

let regex = /^g/;
console.log(regex.test('good'));
// true
console.log(regex.test('bad'));
// false
console.log(regex.test('tag'));
// false

$ — An anchor that matches the end of the string, so the pattern that precedes it must appear at the end of the test string.

let regex = /\.com$/; // \. matches a literal dot; an unescaped . would match any character
console.log(regex.test('test@testmail.com'));
// true
console.log(regex.test('test@testmail'));
// false

{N} — Matches exactly N occurrences of the preceding regular expression.

let regex = /go{2}d/;
console.log(regex.test('good'));
// true
console.log(regex.test('god'));
// false

{N,} — Matches at least N occurrences of the preceding regular expression.

let regex = /go{2,}d/;
console.log(regex.test('good'));
// true
console.log(regex.test('goood'));
// true
console.log(regex.test('gooood'));
// true

{N,M} — Matches at least N occurrences and at most M occurrences of the preceding regular expression (where M >= N).

let regex = /go{1,2}d/;
console.log(regex.test('god'));
// true
console.log(regex.test('good'));
// true
console.log(regex.test('goood'));
// false

Alternation X|Y — Matches either X or Y. For example:


let regex = /(green|red) apple/;
console.log(regex.test('green apple'));
// true
console.log(regex.test('red apple'));
// true
console.log(regex.test('blue apple'));
// false

Note — If you want to use a special character as a literal part of the expression, for example to match a literal + or ., you have to escape it with a backslash ( \ ). For example:

let wrong = /a+b/;  // This won't work: + is a quantifier here, so it matches 'ab', 'aab', ...
let regex = /a\+b/; // This will work: \+ matches a literal plus sign
console.log(regex.test('a+b')); // true

Practicing Regex:
Let’s practice some of the concepts that we have learned above.

Match any 10-digit number:

let regex = /^\d{10}$/;
console.log(regex.test('9995484545'));
// true

Let’s break that down and see what’s going on up there.

  1. To enforce that the match spans the whole string, we add the anchors ^ and $.
  2. The caret ^ matches the start of the input string, whereas the dollar sign $ matches the end, so the regex does not match if the string contains anything besides the 10 digits.
  3. \d matches any digit character, and {10} repeats the previous expression (here \d) exactly 10 times. So if the test string contains fewer or more than 10 digits, the result is false.
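To see what the anchors contribute, the sketch below compares the pattern with and without them on a few made-up inputs.

```javascript
const unanchored = /\d{10}/;
const anchored = /^\d{10}$/;

// Without anchors, 10 digits anywhere in the string are enough.
console.log(unanchored.test('call 9995484545 now')); // true
console.log(unanchored.test('99954845450'));         // true  (11 digits contain a run of 10)

// With anchors, the whole string must be exactly 10 digits.
console.log(anchored.test('call 9995484545 now'));   // false
console.log(anchored.test('99954845450'));           // false
console.log(anchored.test('9995484545'));            // true
```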

Match a date with following format DD-MM-YYYY or DD-MM-YY:

let regex = /^(\d{1,2}-){2}\d{2}(\d{2})?$/;
console.log(regex.test('01-01-1990'));
// true
console.log(regex.test('01-01-90'));
// true
console.log(regex.test('01-01-190'));
// false

Let’s break that down and see what’s going on up there.

  1. Again, we have wrapped the entire regular expression in ^ and $ so that the match spans the entire string.
  2. ( starts the first subexpression.
  3. \d{1,2} matches at least 1 digit and at most 2 digits, so single-digit days and months such as 1-1-90 are also accepted.
  4. - matches the literal hyphen character.
  5. ) ends the first subexpression.
  6. {2} matches the first subexpression exactly two times.
  7. \d{2} matches exactly two digits.
  8. (\d{2})? matches exactly two more digits, but it is optional, so the year contains either 2 or 4 digits.
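If the goal is to read the day, month and year out of the string rather than just validate it, one option is a variant of the pattern with a separate capture group for each part. This is a sketch, not the pattern from the article; parseDate and datePattern are hypothetical names.

```javascript
// Variant pattern: one capture group each for day, month and year.
const datePattern = /^(\d{1,2})-(\d{1,2})-(\d{2}|\d{4})$/;

// Hypothetical helper: returns the three parts as strings, or null if the shape is wrong.
function parseDate(text) {
  const parts = datePattern.exec(text);
  if (parts === null) {
    return null;
  }
  // parts[0] is the full match, so skip it when destructuring.
  const [, day, month, year] = parts;
  return { day, month, year };
}

console.log(parseDate('01-01-1990')); // { day: '01', month: '01', year: '1990' }
console.log(parseDate('01-01-90'));   // { day: '01', month: '01', year: '90' }
console.log(parseDate('01-01-190'));  // null
```

Like the original pattern, this only checks the shape of the string: a value such as 99-99-9999 still passes, so real date validation needs an extra check.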

Conclusion
Regular expressions can be fairly complex at times, but a proper understanding of the above concepts will help you read more complex regex patterns easily. You can learn more about regex here and practice it here.
