Picture this: you are attempting to sign up for a website. You enter your email and password, and you get the message "The username or password you entered is invalid. Please try again."
How exactly does the computer know that the username or password you entered is invalid? In many cases, the answer is a powerful tool called a regular expression.
What is a Regular Expression?
A regular expression, or regex for short, is a pattern that describes which sequences of characters to look for in a text. The beauty of a regular expression is that you can design it to search for almost anything you want.
How RegEx Patterns Work
RegEx patterns can be as simple as searching for an exact sequence of characters in a text. For example, the pattern:
password
This pattern matches the literal text "password". It is case sensitive, so the text "Password" would not match because the P is capitalized.
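To see this in action, here is a minimal sketch using Python's built-in re module. The choice of Python and the helper names contains_password and is_exactly_password are assumptions made for illustration, not something the pattern itself requires.

```python
import re

# The literal pattern from above, compiled once for reuse.
PATTERN = re.compile(r"password")

def contains_password(text):
    """True if the pattern appears anywhere in the text."""
    return PATTERN.search(text) is not None

def is_exactly_password(text):
    """True only if the whole text matches the pattern."""
    return PATTERN.fullmatch(text) is not None

print(contains_password("my password is secret"))  # True
print(contains_password("Password"))               # False: the capital P does not match
print(is_exactly_password("password"))             # True
print(is_exactly_password("password123"))          # False: extra characters after the match
```

Note the difference between searching for the pattern inside a longer text and requiring the entire text to match it.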
Complex RegEx Patterns
Many websites require your password to meet requirements like these:
- Must contain a lowercase letter
- Must contain a capital letter
- Must contain a digit
- Must contain a non-word character
- Must be at least 8 characters long
How would we go about making a pattern that checks for these requirements? The key lies in regex metacharacters and character classes. Metacharacters are characters with a special meaning in a pattern, and many of them act as predefined shorthands that match a type of character.
Meeting The Requirements
The pattern we can use to meet the first requirement is the character class:
[a-z]
The square brackets define a character class, which matches any single character listed inside them. The hyphen in a-z defines a range, so [a-z] matches any one lowercase letter from a to z.
[A-Z]
Similar to a-z, A-Z matches any uppercase letter, which meets the second requirement.
\d
This metacharacter meets the third requirement: it tells the computer to match any digit.
\W
This metacharacter can be used to meet the fourth requirement because it matches any non-word character. Word characters are letters (uppercase or lowercase), digits, and the underscore, so \W matches everything else, including punctuation and whitespace.
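The four patterns so far can be tried out one at a time. The following is a small sketch in Python, where the CLASSES dictionary and the matching_classes helper are hypothetical names introduced only for this example.

```python
import re

# Hypothetical labels mapped to the four patterns discussed above.
CLASSES = {
    "lowercase": r"[a-z]",
    "uppercase": r"[A-Z]",
    "digit": r"\d",
    "non-word": r"\W",
}

def matching_classes(text):
    """Return the labels whose pattern matches at least one character in text."""
    return [name for name, pattern in CLASSES.items() if re.search(pattern, text)]

print(matching_classes("a"))     # ['lowercase']
print(matching_classes("Q"))     # ['uppercase']
print(matching_classes("7"))     # ['digit']
print(matching_classes("_"))     # []  (the underscore counts as a word character)
print(matching_classes("!"))     # ['non-word']
print(matching_classes(" "))     # ['non-word']  (whitespace is a non-word character too)
print(matching_classes("aB3$"))  # ['lowercase', 'uppercase', 'digit', 'non-word']
```

The underscore and space cases are worth noticing: a password like "Passw0rd_" would not satisfy the non-word requirement, while one containing a space would.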
{8,}
The curly brackets are a quantifier: they tell the computer how many times to match whatever came before them. With a single number, as in {x}, the preceding pattern must match exactly x times. If a comma is included, as in {x,}, it must match at least x times. If a second number is put after the comma, as in {x,y}, it will match the preceding pattern a maximum of y times. If no number is put after the comma, there is no upper limit.
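The three forms of the quantifier can be compared side by side. This is a rough sketch in Python, and the helper name fully_matches is an assumption made for the example.

```python
import re

def fully_matches(pattern, text):
    """True if the entire text matches the pattern."""
    return re.fullmatch(pattern, text) is not None

# {3}: exactly three times
print(fully_matches(r"a{3}", "aaa"))       # True
print(fully_matches(r"a{3}", "aaaa"))      # False

# {2,}: at least two times, no upper limit
print(fully_matches(r"a{2,}", "a"))        # False
print(fully_matches(r"a{2,}", "aaaaa"))    # True

# {2,4}: between two and four times
print(fully_matches(r"a{2,4}", "aaaa"))    # True
print(fully_matches(r"a{2,4}", "aaaaa"))   # False

# .{8,}: any eight or more characters, as in the length requirement
print(fully_matches(r".{8,}", "short"))        # False
print(fully_matches(r".{8,}", "long enough"))  # True
```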
Putting It All Together
We also need a few additional metacharacters:
^
Used to indicate the beginning of a text
$
Used to indicate the end of a text
()
Used to group expressions
(?=)
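Used to look ahead in a text: it checks that the pattern inside the parentheses can be matched from the current position, without consuming any characters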
.
Matches any character (except a line break, by default)
*
Matches the previous token between 0 and unlimited times
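We can now put our password regex pattern together. Each lookahead uses .* to scan ahead from the start of the text for one required type of character, and .{8,} then requires at least 8 characters in total. The final result will look like this: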
^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*\W).{8,}$
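As a quick check, the pattern can be wrapped in a small validation function. This is a minimal sketch in Python, not a production password policy. The names PASSWORD_RE and is_valid_password are assumptions introduced for the example.

```python
import re

# The finished pattern from above: four lookaheads plus a length check.
PASSWORD_RE = re.compile(r"^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*\W).{8,}$")

def is_valid_password(candidate):
    """True if the candidate meets all five requirements."""
    return PASSWORD_RE.search(candidate) is not None

print(is_valid_password("Passw0rd!"))   # True: meets every requirement
print(is_valid_password("password"))    # False: no capital letter, digit, or non-word character
print(is_valid_password("PASSWORD1!"))  # False: no lowercase letter
print(is_valid_password("Password!"))   # False: no digit
print(is_valid_password("Passw0rd"))    # False: no non-word character
print(is_valid_password("Pw0!"))        # False: shorter than 8 characters
```

Because each lookahead checks its requirement independently, the required characters can appear in any order within the password.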
Conclusion
At this point you have probably realized just how powerful regular expressions can be. This is just the tip of the iceberg: they can be used to validate input, match text, and search and replace text, among other things.
If this article piqued your interest, I recommend you check out some of the links below.
To learn more about regular expressions check out these articles:
Regular Expressions
Python Regular Expressions-Google Education
To experiment with writing your own regular expressions, visit regex101