DEV Community

Himanshu Gupta

What Is a Regular Expression?

A regular expression (also known as "regex" or "regexp") is a sequence of characters that defines a search pattern. This search pattern can be used to match (and sometimes replace) specific strings, or to check whether a string contains the specified pattern.

In other words, regular expressions are a way to describe patterns in strings, and they are widely used in text processing, data validation, and a variety of programming languages (such as Python, Perl, and JavaScript) for pattern matching with strings.

Here are a few common building blocks of regular expressions:

\d matches any digit (equivalent to [0-9])
\w matches any word character (equivalent to [a-zA-Z0-9_])
\s matches any whitespace character (space, tab, newline, etc.)
. matches any character except a newline
\b matches a word boundary (such as the boundary between a word and a space)

Note that the meaning of a character in a regular expression can depend on the context in which it is used, as well as on the regex engine that is interpreting the expression.
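To see the shorthand classes above in action, here is a small sketch in PHP (the language used for the examples later in this post). The sample strings and variable names are made up for illustration.

```php
<?php
// Sample text chosen purely for illustration.
$text = "Room 42 is open";

// \d matches one digit at a time, so "42" counts as two matches.
$digitCount = preg_match_all('/\d/', $text);

// \w+ matches each run of word characters (letters, digits, underscore).
preg_match_all('/\w+/', $text, $words);

// \s matches each whitespace character.
$spaceCount = preg_match_all('/\s/', $text);

// . matches any character except a newline (when the "s" modifier is not set).
$dotMatchesLetter  = preg_match('/a.c/', "abc");
$dotMatchesNewline = preg_match('/a.c/', "a\nc");

// \b requires a word boundary on both sides, so "cat" inside a longer word is skipped.
$wholeWord  = preg_match('/\bcat\b/', "the cat sat");
$insideWord = preg_match('/\bcat\b/', "concatenate");

print_r($words[0]);
```

preg_match returns 1 when the pattern matches and 0 when it does not, while preg_match_all returns the number of matches found.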

Quantifiers: These characters allow you to specify how many times a character or group of characters should be matched. For example, a{3} matches the character "a" exactly three times, while \d{1,3} matches one to three digits.

Character classes: These allow you to match any one character from a set of characters. For example, [aeiou] matches any lowercase vowel, while [A-Z] matches any uppercase letter.
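As a rough sketch of quantifiers and character classes together, the snippet below tries the patterns from the two paragraphs above in PHP. The input strings are invented for the example.

```php
<?php
// a{3} needs three consecutive "a" characters somewhere in the string.
$hasTripleA = preg_match('/a{3}/', 'baaad');
$noTripleA  = preg_match('/a{3}/', 'baad');

// \d{1,3} takes one to three digits; quantifiers are greedy,
// so the first match grabs as many digits as the limit allows.
preg_match('/\d{1,3}/', 'code 12345', $digits);

// [aeiou] matches one lowercase vowel per match.
$vowelCount = preg_match_all('/[aeiou]/', 'education');

// [A-Z] matches one uppercase letter per match.
preg_match_all('/[A-Z]/', 'Hello World', $capitals);

echo $digits[0], "\n";              // first run of up to three digits
echo implode(',', $capitals[0]), "\n";
```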

Alternation: This allows you to match one of several alternatives. For example, (dog|cat) matches either the string "dog" or the string "cat".

Grouping: This allows you to group characters and apply quantifiers, alternation, or other operators to the entire group. For example, (abc)+ matches one or more consecutive occurrences of the string "abc".
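Alternation and grouping can be tried out the same way. This is only a sketch with made-up input strings; the last pattern (without parentheses) is added here for contrast and does not appear in the text above.

```php
<?php
// (dog|cat) matches whichever alternative appears first in the subject.
preg_match('/(dog|cat)/', 'my cat naps', $animal);

// (abc)+ applies the + quantifier to the whole group.
preg_match('/(abc)+/', 'xabcabcabcx', $grouped);

// Without the parentheses, + applies only to the "c".
preg_match('/abc+/', 'abcccabc', $ungrouped);

echo $animal[0], "\n";     // the alternative that matched
echo $grouped[0], "\n";    // all consecutive repetitions of "abc"
echo $ungrouped[0], "\n";  // "ab" followed by as many "c" characters as possible
```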

Anchors: These characters allow you to specify the position of the match relative to the beginning or end of the string. For example, ^ matches the beginning of the string, while $ matches the end of the string.
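The effect of anchors is easiest to see by comparing an anchored and an unanchored pattern. The sketch below uses PHP with invented inputs. One detail that goes beyond the paragraph above: in PHP's PCRE engine, $ also matches just before a trailing newline by default, and \z can be used when the match must end at the very end of the string.

```php
<?php
// Without anchors, the pattern may match anywhere inside the string.
$unanchored = preg_match('/\d{3}/', 'abc123def');

// With ^ and $, the whole string has to be exactly three digits.
$anchoredMiss = preg_match('/^\d{3}$/', 'abc123def');
$anchoredHit  = preg_match('/^\d{3}$/', '123');

// By default, $ still matches when a single newline follows the digits.
$dollarBeforeNewline = preg_match('/^\d{3}$/', "123\n");

// \z matches only at the absolute end of the string.
$strictEnd = preg_match('/^\d{3}\z/', "123\n");

var_dump($unanchored, $anchoredMiss, $anchoredHit, $dollarBeforeNewline, $strictEnd);
```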

Example

In PHP, a regular expression pattern is written as a string enclosed in delimiters, most commonly the / (slash) symbol. Within the pattern, you can use a combination of ordinary characters, special characters (also known as "metacharacters"), and character classes to define the desired pattern.

Here's an example of creating a regular expression pattern to match dates in the format YYYY-MM-DD:

$pattern = "/^[0-9]{4}-(0[1-9]|1[0-2])-(0[1-9]|[1-2][0-9]|3[0-1])$/";

The ^ character at the start of the pattern matches the start of the string. The remaining parts of this pattern are explained after the email example below.
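Here is a minimal sketch of how the date pattern could be used with preg_match. The helper name isDateShaped and the test strings are assumptions made for this example, not part of PHP.

```php
<?php
// The date pattern from the article, unchanged.
$pattern = "/^[0-9]{4}-(0[1-9]|1[0-2])-(0[1-9]|[1-2][0-9]|3[0-1])$/";

// Hypothetical helper: true when $value has the shape YYYY-MM-DD.
function isDateShaped(string $value, string $pattern): bool
{
    return preg_match($pattern, $value) === 1;
}

$ok        = isDateShaped('2023-07-15', $pattern); // well-formed date
$badMonth  = isDateShaped('2023-13-01', $pattern); // month 13 is rejected
$noPadding = isDateShaped('2023-7-5', $pattern);   // single digits are rejected
$feb31     = isDateShaped('2023-02-31', $pattern); // right shape, so it passes

var_dump($ok, $badMonth, $noPadding, $feb31);
```

The last case shows that the pattern only checks the shape of the string: it does not know how many days each month has.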

Another Example

This pattern accepts only personal email addresses from Gmail, Yahoo, or Hotmail. Any other address is treated as not valid:

$pattern = "/^[a-zA-Z0-9._%+-]+@(gmail|yahoo|hotmail)\.(com|co\.uk)$/";

Explanation of the pattern:

The ^ character at the start of the pattern matches the start of the string.

The [a-zA-Z0-9._%+-]+ matches one or more characters, consisting of letters (a-z and A-Z), digits (0-9), period (.), underscore (_), percent (%), plus (+), and dash (-) characters. This represents the username part of the email address.

The @ character matches the @ symbol in an email address.

The (gmail|yahoo|hotmail) matches either "gmail", "yahoo", or "hotmail". This represents the domain name part of the email address.

The \. sequence matches a literal period (.) in the domain name. The backslash is needed because an unescaped . would match any character.

The (com|co\.uk) matches either "com" or "co.uk". This represents the top-level domain part of the email address.

The $ character at the end of the pattern matches the end of the string.

So, the pattern will match an email address that has a username, followed by the "@" symbol, followed by a domain name of "gmail", "yahoo", or "hotmail", followed by a period, followed by a top-level domain of "com" or "co.uk". Any email address that does not match this pattern will be considered not valid.
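The behaviour described above can be checked with a short script. This is a sketch: the function name isPersonalEmail and the sample addresses are invented for illustration, and the case-insensitive variant at the end (using the i modifier) is an addition that the original pattern does not include.

```php
<?php
// The email pattern from the article, unchanged.
$pattern = "/^[a-zA-Z0-9._%+-]+@(gmail|yahoo|hotmail)\.(com|co\.uk)$/";

// Hypothetical helper: true when $email matches the given pattern.
function isPersonalEmail(string $email, string $pattern): bool
{
    return preg_match($pattern, $email) === 1;
}

$gmail     = isPersonalEmail('jane.doe@gmail.com', $pattern);  // allowed provider
$yahooUk   = isPersonalEmail('sam_99@yahoo.co.uk', $pattern);  // allowed provider, co.uk
$other     = isPersonalEmail('jane@example.com', $pattern);    // provider not in the list
$wrongTld  = isPersonalEmail('jane@gmail.org', $pattern);      // ending not in the list
$upperCase = isPersonalEmail('Jane@Gmail.com', $pattern);      // domain part is case-sensitive

// Assumed variant: the i modifier makes the whole pattern case-insensitive.
$patternIgnoreCase = $pattern . 'i';
$upperCaseAllowed  = isPersonalEmail('Jane@Gmail.com', $patternIgnoreCase);

var_dump($gmail, $yahooUk, $other, $wrongTld, $upperCase, $upperCaseAllowed);
```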

Returning to the date pattern from the first example, the rest of it works as follows:

The [0-9]{4} matches exactly 4 digits (representing the year).

The -(0[1-9]|1[0-2])- matches either "01" to "09" (0[1-9]) or "10" to "12" (1[0-2]), with a dash on each side. This represents the month.

The (0[1-9]|[1-2][0-9]|3[0-1]) matches either "01" to "09" (0[1-9]) or "10" to "29" ([1-2][0-9]) or "30" to "31" (3[0-1]), representing the day. Note that the pattern checks only the format, so an impossible date such as "2023-02-31" still matches.

The $ character at the end of the pattern matches the end of the string.

You can use either pattern with PHP functions such as preg_match (to test whether a string matches the pattern) or preg_replace (to replace the parts of a string that match the pattern).
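As a sketch of both functions, the script below first reads the captured month and day with preg_match, then rewrites dates with preg_replace. The second pattern, $looseDate, is an assumed variant written for this example: it drops the anchors and also captures the year, so dates can be found inside longer text.

```php
<?php
// The date pattern from the article, unchanged.
$datePattern = "/^[0-9]{4}-(0[1-9]|1[0-2])-(0[1-9]|[1-2][0-9]|3[0-1])$/";

$month = null;
$day   = null;

// preg_match fills $matches: index 0 is the whole match,
// index 1 the first group (month), index 2 the second group (day).
if (preg_match($datePattern, '2024-03-09', $matches) === 1) {
    $month = $matches[1];
    $day   = $matches[2];
}

// Assumed variant for replacing: no anchors, and the year is captured too.
$looseDate = '/([0-9]{4})-(0[1-9]|1[0-2])-(0[1-9]|[12][0-9]|3[01])/';

// $1, $2 and $3 in the replacement refer to the captured year, month and day.
$rewritten = preg_replace($looseDate, '$3/$2/$1', 'Due 2024-03-09, shipped 2024-03-11.');

echo $rewritten, "\n";
```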
