DEV Community

Fakorede Damilola

Common Questions in Regular Expression

In my last article, an introduction to regular expressions, I explained what a regular expression is, some of the methods involved, and so on. In this article I will go through a few regular expression questions that should help you get comfortable with regex. Note that these might not be the interview questions you are expecting, but I hope they give you an edge when it comes to solving regex problems.
As the famous saying in programming goes, you can do one thing in a thousand different ways, and I will be using just one of those ways here. If you feel there is a better way to solve a particular problem, let me know in the comment section below.
Let's get down to business.

  1. Email Validation :
    Create a function that tests whether a given input is an email or not. Note that emails can come in different formats, for example devto123@gmail.com, dev.to@example.com and so on. We will use the Gmail format, that is, only letters, numbers and periods are allowed. Return a boolean.

    Solution
    From the question above, we are testing whether an input is an email or not, so we will end up using a regex together with the test method.
    Basically we have to write a regex that matches the different formats of email. When solving questions like this, it is better to start from ground zero: what do you know about the question, and what were we told? Here are a few things we know about emails.

    1. It should start with an alphanumeric character and casing does not matter.
    2. A period can appear somewhere in the string, but it is optional. If it is present, it must be immediately followed by one or more characters.
    3. There must be an @ after which a few other characters must follow.
    4. It must end with a .com or .co and so on.

    This might look like a really long process for just one question, and it actually is. I wouldn't recommend writing it all out in an examination or interview. But when you are practicing at your own pace, it can really help you understand the question, especially if you are a beginner. You don't have to write it out like I did, but it won't hurt.
    So now that we know what the email looks like, let's move on and see how this can help us.

    1. Emails must start with letters or numbers and casing doesn't matter. In regex, "must start with" is the ^, and we can match alphanumeric characters with \w, which is the same as [a-zA-Z0-9_]. But there is a problem with this: it also allows the underscore. So we have to write the set out ourselves, that is [a-zA-Z0-9]. To match more than one character, we use the +. All together /^[a-zA-Z0-9]+/.
    2. A period can be somewhere in the string, but it is optional. In regex, a period is a wild card. To make it a normal character, we need the backslash to escape it. To make it optional we use ?. At this point, you have this /^[a-zA-Z0-9]+\.?/. If there is a period, it should be followed by one or more characters, so we are basically repeating step one. /^[a-zA-Z0-9]+\.?[a-zA-Z0-9]+/.
    3. @ should follow. This is pretty straightforward /^[a-zA-Z0-9]+\.?[a-zA-Z0-9]+@/. After that, at least three letters or numbers should follow, that is /^[a-zA-Z0-9]+\.?[a-zA-Z0-9]+@[a-zA-Z0-9]{3,}/.
    4. Must end with .com, .ca and so on, that is /(\.[a-zA-Z0-9]{2,3})$/. The parentheses are just used to group part of the regex together. Summing it all up we have this.
    function validateEmail(str){
    let regex = /^[a-zA-Z0-9]+\.?[a-zA-Z0-9]+@[a-zA-Z0-9]{3,}(\.[a-zA-Z0-9]{2,3})$/;
    return regex.test(str)
    }
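
To see how the four pieces fit together, here is a minimal sketch that builds the same kind of pattern from named parts with the RegExp constructor. The names localPart, domain, tld and validateEmailSketch are made up for this illustration, and the character class is written as [a-zA-Z0-9] so that the digit 0 is accepted.

```javascript
// Hypothetical names, used only for this illustration.
const localPart = "[a-zA-Z0-9]+\\.?[a-zA-Z0-9]+"; // letters/digits with one optional period in between
const domain = "[a-zA-Z0-9]{3,}";                 // at least three letters/digits after the @
const tld = "\\.[a-zA-Z0-9]{2,3}";                // a period plus two or three characters

const emailRegex = new RegExp("^" + localPart + "@" + domain + tld + "$");

function validateEmailSketch(str) {
  return emailRegex.test(str);
}

console.log(validateEmailSketch("devto123@gmail.com")); // true
console.log(validateEmailSketch("dev.to@example.com")); // true
console.log(validateEmailSketch("dev_to@gmail.com"));   // false, underscore is not allowed
console.log(validateEmailSketch("devto@gmail"));        // false, no ending such as .com
```

Note that a pattern built this way needs at least two characters before the @, so something like a@gmail.com would be rejected.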
    

    I know this is pretty long, and I won't be doing this for the other questions. I just wanted to show you a way to approach questions, especially algorithm questions. I hope it helps you out when solving other problems.

  2. Date validation :
    Create a function to test if a string is in a valid date format. The format is DD-MM-YYYY or D-M-YY. Note that the separator can be :, _, - or /.

    Solution
    Like what we did above, splitting this question will make it easier.

    • DD / D : On our calendar, the day is always between 1 and 31, so we are matching 01-31 or 1-31. /0?[1-9]/ matches the numbers less than 10 while making the leading 0 optional, /[12][0-9]/ matches 10-29 (remember [12] is a character set and it means either 1 or 2) and /3[01]/ matches 30 and 31, since we can't have more than 31 days. All together /(0?[1-9]|[12][0-9]|3[01])/. Remember that | stands for "or".
    • MM / M : There are 12 months in the calendar, so we are matching 1-12 or 01-12. We can't match that with a single character set, so we use /0?[1-9]/ and /1[0-2]/. All together /(0?[1-9]|1[0-2])/. The parentheses keep the | from splitting the rest of the regex.
    • YY / YYYY : Since the year has no specific range, it is pretty straightforward. Just remember it is 4 or 2 digits. That is /([0-9]{4}|[0-9]{2})/.
    • Separator : Piece of cake, right? /[:\/_-]/. All together we have this.
    function validateDate(str){
    let regex = /^(0?[1-9]|[12][0-9]|3[01])[:\/_-](0?[1-9]|1[0-2])[:\/_-]([0-9]{4}|[0-9]{2})$/
    return regex.test(str)
    }
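
Here is a small sketch that runs a date pattern of this shape against a few inputs. The names dateRegex and validateDateSketch are made up for this illustration. Each part is wrapped in parentheses so that | only applies inside its own group, and [1-9] is used so that a day or month of 0 is rejected.

```javascript
// Hypothetical names, used only for this illustration.
const dateRegex = /^(0?[1-9]|[12][0-9]|3[01])[:\/_-](0?[1-9]|1[0-2])[:\/_-]([0-9]{4}|[0-9]{2})$/;

function validateDateSketch(str) {
  return dateRegex.test(str);
}

console.log(validateDateSketch("25-12-2020")); // true
console.log(validateDateSketch("5/1/20"));     // true
console.log(validateDateSketch("32-01-2020")); // false, there is no day 32
console.log(validateDateSketch("12-13-2020")); // false, there is no month 13
console.log(validateDateSketch("12-12-202"));  // false, the year must be 2 or 4 digits
console.log(validateDateSketch("31-02-2020")); // true, the format is fine even though the date does not exist
```

As the last line shows, this only checks the format. It does not know how many days each month has.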
    
  3. Vowel count:
    Create a function that returns the number of vowels in a given string.

    Solution
    Try it yourself first!
    There are quite a number of ways you can do this. A for loop will work just fine, but right now you have the power of regex, so why not try that?
    The closest thing you can use to get a count in regex is the match method, which returns an array of all the matches when the g flag is set, and then you can simply read the .length of that array.

    function vowel(str){
    //match returns null when there is no vowel, so fall back to an empty array
    return (str.match(/[aeiou]/ig) || []).length
    }
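
One detail to keep in mind is that match returns null, not an empty array, when nothing matches. Here is a minimal sketch that makes the null case explicit (countVowels is a made-up name for this illustration).

```javascript
// Hypothetical name, used only for this illustration.
function countVowels(str) {
  const matches = str.match(/[aeiou]/gi); // array of matches, or null if there are none
  return matches === null ? 0 : matches.length;
}

console.log(countVowels("Regular Expression")); // 7
console.log(countVowels("rhythm"));             // 0
console.log("rhythm".match(/[aeiou]/gi));       // null
```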
    

    Don't forget the i and g flags: i ignores casing and g finds every match instead of just the first one. Piece of cake, right?

  4. Palindrome :
    Create a function to test if a string is a palindrome. Note that special characters, spaces and letter casing should not be considered when testing the string. For example, race_-+C ar and m-.um are both palindromes.

    Solution
    Before we move forward we need to understand what a palindrome is. A palindrome is basically a string that spells out the same thing when it is reversed, for example racecar. If there is a special character or space in the string, it might no longer read the same in both directions, for example ra_-ce car != rac ec-_ar. That is why the question says all non-alphanumeric characters should be ignored when testing.
    Even though the question says we should test if a string is a palindrome, we cannot do the whole thing with a regex method like test. I mean, what would you be matching or testing against? So that is not an option.
    The first thing we can do now is remove all the non-alphanumeric characters. Regex comes in pretty handy here with the replace method.

    let str="ra c e-_.c;+-a.?).;#r"
    str.replace(/[\W_]/g,"") //\W matches all non-alphanumeric characters except the underscore, which we also need to match, so _ is added to the set. Everything matched is replaced with an empty string, giving "racecar".
    

    With this we should have exactly the string we are supposed to test, without the other characters.
    Since a palindrome is the same as its own reverse, we can check just that. Convert the string to an array with the split method and call the reverse method on that array. Then join the array back into a string with the join method, and you have the reversed string, which you can compare with the cleaned string to see if they are the same.

    function palindrome(str){
    let string = str.replace(/[\W_]/g,"").toLowerCase() //lowercase so that C and c compare as equal
    let array = string.split("")
    let str2 = array.reverse()
    let string2 = str2.join("")
    return string === string2
    }
    //shorter version
    function palindrome(str){
    let string = str.replace(/[\W_]/g,"").toLowerCase()
    return string === string.split("").reverse().join("")
    }
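
The sketch below runs the clean, reverse and compare steps on the two examples from the question. The name isPalindromeSketch is made up for this illustration, and lowercasing the cleaned string is an assumption made so that "C" and "c" count as the same letter.

```javascript
// Hypothetical name; lowercasing is an assumption so that "C" and "c" compare equal.
function isPalindromeSketch(str) {
  const cleaned = str.replace(/[\W_]/g, "").toLowerCase();
  const reversed = cleaned.split("").reverse().join("");
  return cleaned === reversed;
}

console.log("race_-+C ar".replace(/[\W_]/g, "")); // raceCar
console.log(isPalindromeSketch("race_-+C ar"));   // true
console.log(isPalindromeSketch("m-.um"));         // true
console.log(isPalindromeSketch("regex"));         // false
```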
    
  5. Hexadecimal colors :
    Create a function to test if the given string is a hexadecimal color, for example #333 or #333333.

    Solution
    So we are back to testing, and at this point you should know that we will be using regex. Can you give it a try?
    Here is what we know about hexadecimal colors. They can be three (3) or six (6) characters long, and each character must be between 0-9 or A-F, that is, sixteen different characters.
    A hexadecimal color must start with a # and can be followed by A-F or 0-9 three times, so basically /^#([A-Fa-f0-9]){3}/. But it can also be six such characters, that is /^#([A-Fa-f0-9]){6}/. Since it is either three or six, and nothing else may follow, we can do this.

    function validateHexadecimal(str){
    let regex = /^#([A-Fa-f0-9]{3}|[A-Fa-f0-9]{6})$/
    return regex.test(str)
    }
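
A quick sketch of how this pattern behaves on a few inputs, and of why the ^ and $ anchors matter. The name hexRegex is made up for this illustration.

```javascript
// Hypothetical name, used only for this illustration.
const hexRegex = /^#([A-Fa-f0-9]{3}|[A-Fa-f0-9]{6})$/;

console.log(hexRegex.test("#333"));    // true
console.log(hexRegex.test("#a1B2c3")); // true
console.log(hexRegex.test("#3333"));   // false, four characters
console.log(hexRegex.test("#ggg"));    // false, g is not a hexadecimal digit
console.log(hexRegex.test("333"));     // false, missing #

// Without ^ and $, the pattern only needs to match somewhere in the string.
console.log(/#([A-Fa-f0-9]{3}|[A-Fa-f0-9]{6})/.test("#3333")); // true
```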
    
  6. Spinal case :
    Create a function to convert a string to spinal case. For example This Is A Javascript_String = this-is-a-javascript-string, thisIsJavascript = this-is-javascript

    Solution
    Try it out first.
    This question is tricky in a way because the strings can come in very different formats. Basically the task is to put a hyphen between the words. If the words were always separated with _ or a space, it would be pretty easy. But the string can also be in camelCase like the example above. In situations like this, you have to split it at every occurrence of a capital letter and then add the hyphen. Note that the string returned should always be in lowercase.
    Now that we know what should and shouldn't be there, we can move forward.

    • The easiest thing to do is to replace all the non-alphanumeric characters with a space first, so that every separator looks the same. That is str.replace(/[\W_]+/g," "). If we removed them completely, words like a_string would be glued together and we would lose the boundary.
    • Now that the special characters are gone, we can split the words either at spaces or at capital letters. That is str.split(/(?=[A-Z])|\s/). So basically, when going through the string, it either looks ahead to see if the next letter is in uppercase or it checks if there is a space, and splits at that point.
    • With the array returned from the split method, the map method can be called to convert every word to lowercase, and then we join the words with a hyphen. Summing it all together we have this.
    function spinalCase(str){
    str=str.replace(/[\W_]+/g," ").trim()
    return str.split( /(?=[A-Z])|\s/)
    .map(str=>str.toLowerCase())
    .join("-")
    }
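
Here is a sketch of the three steps with the intermediate values spelled out. The name spinalCaseSketch is made up for this illustration, and it assumes the separators are replaced with a space (not removed) so that the word boundaries survive until the split.

```javascript
// Hypothetical name. Assumption: separators are replaced with a space, not removed,
// so the word boundaries survive until the split.
function spinalCaseSketch(str) {
  const spaced = str.replace(/[\W_]+/g, " ").trim(); // "a_b-c" becomes "a b c"
  const words = spaced.split(/(?=[A-Z])|\s/);        // split before capitals or at spaces
  return words.map(word => word.toLowerCase()).join("-");
}

console.log(spinalCaseSketch("thisIsJavascript"));            // this-is-javascript
console.log(spinalCaseSketch("This Is A Javascript_String")); // this-is-a-javascript-string
console.log(spinalCaseSketch("JavaScript"));                  // java-script
```

Because every capital letter starts a new word, an input such as JavaScript comes out as java-script.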
    
  7. Check HTML :
    Create a function to test if the given string is an HTML tag or not. Examples include ,<> .

    Solution
    An HTML tag will always have an opening and a closing angle bracket with zero or more letters in between, and a forward slash before the closing bracket is optional.

    function validateHTML(str){
    let regex = /<([A-Za-z]*) *\/?>/
    return regex.test(str)
    }
    

    Breaking this down, we are basically saying

    • It should start with <
    • Zero or more letters [A-Za-z]*
    • Zero or more spaces " *"
    • An optional forward slash \/? and the final closing >.
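
Running the pattern on a few inputs shows what it accepts and where it stops. This is only a sketch, and tagRegex is a made-up name for the illustration.

```javascript
// Hypothetical name, used only for this illustration.
const tagRegex = /<([A-Za-z]*) *\/?>/;

console.log(tagRegex.test("<p>"));       // true
console.log(tagRegex.test("<br />"));    // true, spaces and a slash before >
console.log(tagRegex.test("<>"));        // true, zero letters are allowed
console.log(tagRegex.test("p>"));        // false, no opening <
console.log(tagRegex.test("</p>"));      // false, a slash right after < is not covered
console.log(tagRegex.test("hello <b>")); // true, there are no ^ and $ anchors
```

So this pattern checks that the string contains a simple opening or self-closing tag. Closing tags and tags with attributes would need a longer pattern.
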
  8. Password validator :
    Create a function to check if a given password follows this format: at least 8 characters, with at least one lowercase character, one uppercase character, one digit and one special character.

    Solution
    A password validator can be tricky. But let's start from the easier part, which should be making sure the password is 8 characters or above.

    /[\w\W]{8,}/ //\w to match alphanumeric and underscore and \W to match special character 
    

    Now we need to make sure that at least one of each type of character actually appears in the password. We have to do this for each of the different types, but it is basically the same thing every time, so I will explain just one of them.
    Uppercase:
    To match an uppercase character, we need to use a lookahead (?=...). A lookahead checks that what comes after the current position matches a given pattern, without consuming any characters, and the match succeeds or fails based on that. That is /^(?=.*[A-Z])/. Starting from the beginning of the string, it checks whether zero or more characters (the period is a wild card used to match any character) are followed by an uppercase character. It uses the asterisk because the uppercase character may be the very first character.
    We do the same for all the other types of characters that must occur at least once in the string.

    /^(?=.*\d)(?=.*[a-z])(?=.*[A-Z])(?=.*\W)/
    

    If any one of these lookaheads fails, for example because a digit cannot be found, the whole match fails.
    But if all of them succeed, we can then go ahead and match for the quantity, that is, the number of characters in the string. Putting it all together:

    function validatePassword(str){
    let regex = /^(?=.*\d)(?=.*[a-z])(?=.*[A-Z])(?=.*\W)[\w\W]{8,}$/
    return regex.test(str)
    }
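
To see what each lookahead contributes, here is a sketch that keeps one regex per rule and reports which rules a password fails. The names passwordRules and failedPasswordRules are made up for this illustration. They are not part of any library.

```javascript
// Hypothetical helper: one regex per rule, so each check can be inspected on its own.
const passwordRules = {
  digit: /^(?=.*\d)/,
  lowercase: /^(?=.*[a-z])/,
  uppercase: /^(?=.*[A-Z])/,
  special: /^(?=.*\W)/,
  length: /^[\w\W]{8,}$/,
};

// Returns the names of the rules that the given string does not satisfy.
function failedPasswordRules(str) {
  return Object.keys(passwordRules).filter(name => !passwordRules[name].test(str));
}

console.log(failedPasswordRules("Passw0rd!").length);    // 0, every rule passes
console.log(failedPasswordRules("password").join(", ")); // digit, uppercase, special
console.log(failedPasswordRules("Pw0!").join(", "));     // length
```

A single combined regex is shorter, but splitting the rules like this can make it easier to tell a user what is missing.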
    

So we have come to the end of this rather long article. I really hope you learnt something and that you are more comfortable with regular expressions now. With this, algorithms and regex should, to a level, pose no threat to you again. Just follow the patterns we used to solve some of these questions and you will be fine. If you have any suggestions or questions, let me know in the comment section.
If you enjoyed this, smash that like button and share it with your friends. You can also follow me on Twitter @fakoredeDami.
Thank you for reading.
