DEV Community

Genne23v


Useful Regex for Web Developers

Regular expressions are hard, but they are useful for web developers who need to restrict user input. There are many ready-made patterns for validating input, but sometimes we need to understand them well enough to customize them with our own logic. Let me introduce some regular expressions that I often use for user input validation.

Limiting domain name input

For the Starchart project, I needed to restrict users' subdomain names according to the following rules.

  1. Domain name pattern should be [name].[studentId].rootDomain.com
  2. Domain name can contain only alphanumerical characters, along with '-' and '_'
  3. Domain name should not start or end with -
  4. Domain name cannot contain multiple consecutive '-' or '_'
  5. studentId only contains alphanumerical characters

After writing pass/fail test cases, I came up with the regex below. Let me explain each block of the expression for better understanding.

/^(?!.*[-_]{2,})(?!^[-])[a-z0-9_-]+[^-_]\.[a-z0-9]+$/
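  • ^ means the beginning of the string.
  • (?!.*[-_]{2,}) -> ?! is a negative lookahead: the pattern inside the parentheses must not match for the overall regex to match. .* means zero or more of any character, and [-_]{2,} means two or more consecutive - or _ characters (including mixed pairs such as -_). So the string must not contain consecutive - or _ anywhere.
  • (?!^[-]) -> another negative lookahead, as explained above. ^[-] matches a - at the start of the string, so the string must not start with -.
  • [a-z0-9_-]+ -> One or more lowercase letters, numbers, -, or _ are allowed.
  • [^-_]\. -> The character right before . cannot be - or _. Note that [^-_] is a negated class, so it matches any other character, not only lowercase letters and numbers; use [a-z0-9] here if the name must stay strictly alphanumeric. Together with the previous block, it also means the name must be at least two characters long.
  • [a-z0-9]+$ -> One or more lowercase letters and numbers are allowed until the end of the string.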
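To see the blocks above in action, here is a small sketch that runs the regex against a few sample inputs. The function name `isValidSubdomain`, the sample strings, and `strictSubdomainPattern` are my own illustrations, not code from the Starchart project. The stricter variant only swaps `[^-_]` for `[a-z0-9]`, which is one possible way to stop characters like `!` from slipping through.

```javascript
// The regex from this post, unchanged.
const subdomainPattern = /^(?!.*[-_]{2,})(?!^[-])[a-z0-9_-]+[^-_]\.[a-z0-9]+$/;

// Hypothetical stricter variant (an assumption, not the project's code):
// the character before the dot must be a lowercase letter or a digit.
const strictSubdomainPattern = /^(?!.*[-_]{2,})(?!^[-])[a-z0-9_-]+[a-z0-9]\.[a-z0-9]+$/;

// Hypothetical helper: returns true when the value matches the given pattern.
function isValidSubdomain(value, pattern = subdomainPattern) {
  return pattern.test(value);
}

console.log(isValidSubdomain("my-site.abc123"));  // true
console.log(isValidSubdomain("-site.abc123"));    // false: starts with -
console.log(isValidSubdomain("my--site.abc123")); // false: consecutive -
console.log(isValidSubdomain("site-.abc123"));    // false: - right before the dot
console.log(isValidSubdomain("my!.abc123"));      // true: [^-_] accepts "!"
console.log(isValidSubdomain("my!.abc123", strictSubdomainPattern)); // false
```

Keeping a list of inputs like this next to the regex makes it much easier to notice surprises such as the `!` case.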

It's still difficult, but it's easier when you break down the expression.

Password requirement regex

Here's another example that I use for password validation.

^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[@$!%*?&])[A-Za-z\d@$!%*?&]{8,}$
  • ?= is a positive lookahead assertion: it checks that the pattern inside the parentheses matches somewhere ahead, without consuming any characters. The first block, (?=.*[a-z]), checks for at least one lowercase letter. The next block, (?=.*[A-Z]), checks for at least one uppercase letter.
  • (?=.*\d) checks for at least one digit. \d represents a numerical character, the same as [0-9].
  • The next block, (?=.*[@$!%*?&]), checks for at least one of the special characters @$!%*?&.
  • The last block, [A-Za-z\d@$!%*?&]{8,}, requires the whole string to consist only of letters, digits, and those special characters, and to be at least 8 characters long.
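As a rough sketch of how these blocks work together, the snippet below tests the full regex and also checks each rule separately, so a form could tell the user which requirement is missing. The names `passwordRules` and `failedPasswordRules` and the sample passwords are hypothetical, made up for this illustration.

```javascript
// The regex from this post, unchanged.
const passwordPattern =
  /^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[@$!%*?&])[A-Za-z\d@$!%*?&]{8,}$/;

// Hypothetical breakdown: one small regex per block of the pattern above.
const passwordRules = [
  { name: "lowercase letter", pattern: /[a-z]/ },
  { name: "uppercase letter", pattern: /[A-Z]/ },
  { name: "digit", pattern: /\d/ },
  { name: "special character", pattern: /[@$!%*?&]/ },
  { name: "8+ allowed characters only", pattern: /^[A-Za-z\d@$!%*?&]{8,}$/ },
];

// Hypothetical helper: returns the names of the rules the password breaks.
function failedPasswordRules(password) {
  return passwordRules
    .filter((rule) => !rule.pattern.test(password))
    .map((rule) => rule.name);
}

console.log(passwordPattern.test("Passw0rd!"));  // true
console.log(failedPasswordRules("Passw0rd!"));   // []
console.log(passwordPattern.test("password1!")); // false
console.log(failedPasswordRules("password1!"));  // [ 'uppercase letter' ]
```

Splitting the rules like this is a design choice, not a requirement: the single regex is enough for a yes/no answer, while the list gives friendlier error messages.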

Email validation

Here's another one for email validation.

^[\w-\.]+@([\w-]+\.)+[\w-]{2,4}$
  • [\w-\.]+ allows one or more word characters (letters, digits, and _), hyphens, and dots before the @.
  • ([\w-]+\.)+ matches one or more domain labels, each made of word characters and - and followed by a single dot.
  • [\w-]{2,4} matches the last part of the domain: only word characters and -, between 2 and 4 characters long. This means longer top-level domains such as .museum are rejected.
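Here is a quick sketch that tries the email regex on a few addresses. The helper `isValidEmail`, the sample addresses, and `relaxedEmailPattern` are assumptions for illustration only. The relaxed variant simply drops the upper limit of 4 on the last part, which is one possible tweak and not a complete email validator.

```javascript
// The regex from this post, unchanged.
const emailPattern = /^[\w-\.]+@([\w-]+\.)+[\w-]{2,4}$/;

// Hypothetical relaxed variant (an assumption): no upper limit on the last
// part, and the hyphen is placed last in the class so it reads as a literal.
const relaxedEmailPattern = /^[\w.-]+@([\w-]+\.)+[\w-]{2,}$/;

// Hypothetical helper: returns true when the value matches the given pattern.
function isValidEmail(value, pattern = emailPattern) {
  return pattern.test(value);
}

console.log(isValidEmail("user.name@example.com")); // true
console.log(isValidEmail("user@mail.example.co"));  // true
console.log(isValidEmail("user@example"));          // false: no dot after @
console.log(isValidEmail("user@example.museum"));   // false: last part is 6 characters
console.log(isValidEmail("user@example.museum", relaxedEmailPattern)); // true
```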

Conclusion

Regular expressions are hard even for very experienced developers, and you don't want to put one into production that hasn't been proven by thousands of users. I once couldn't add my website address on the website of one of the biggest Canadian companies because my address was treated as invalid. It was frustrating. So it's usually a better idea to use solutions that already exist and have been proven by many users. You can use a library like validator, although you still have to add your own logic to customize it. There are also better libraries for specific types of validation such as IP addresses, zip codes, etc. Still, it's good to keep reading and practicing regex so that you can add your own logic on top of an existing pattern.
