Praise Idowu

How to create a regex version of strip()

Project 1: Practice Projects

Welcome to the third project in this tutorial series. In this project, you will create a Regex version of strip(). As usual, you will have examples and exercises to work on. Once you are done, you can compare your solutions by clicking here.

Exercise

Project A: Strong Password Detection

This project is available in the project's README. Once you have completed it, return here to continue.

Welcome back; now it's time to tackle Project B.
In this project, you will create a Regex Version of strip().

Prerequisites

Before starting this project, it's essential to have a good understanding of how strip() works. Some developers use it frequently, while others seldom use it. Experiment with it to gain a clear idea of its functionality.
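
If you want a quick refresher, the short sketch below shows the three behaviours of the built-in strip() that matter for this project. The sample strings are made up for illustration.

```python
# Default: whitespace is removed from both ends only.
text = "  Hello, World!  "
default_result = text.strip()

# With an argument, strip() treats it as a set of characters, not a substring.
set_result = "ABCDEFHello, World!ABCDEF".strip("ABCDEF")

# Characters in the middle of the string are left alone.
middle_result = "xxhexxlloxx".strip("x")

print(repr(default_result))  # 'Hello, World!'
print(repr(set_result))      # 'Hello, World!'
print(repr(middle_result))   # 'hexxllo'
```

The last case is the one to keep in mind: a faithful regex version should only touch the start and the end of the string.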

Write Code

Now that you understand it, let us move on to writing code.

Step 1: Create a Function
In this step, define a function called regex_strip with two parameters: input_string and chars_to_remove. The second parameter defaults to None, so callers can pass the characters to remove or omit the argument entirely.

Step 2: Check for chars_to_remove Being None
Here, you check if the chars_to_remove variable is None. If it is, you create a regex pattern that matches leading and trailing whitespace (similar to how strip() works by default). This step ensures that if no characters are specified for removal, the function behaves like the standard strip() method.
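
As a rough illustration of the whitespace case on its own, the snippet below applies such a pattern with re.sub() and compares the result with the built-in strip(). The constant name WHITESPACE_ENDS and the sample string are assumptions made for this sketch.

```python
import re

# ^\s+ matches whitespace at the start; \s+$ matches whitespace at the end.
WHITESPACE_ENDS = r'^\s+|\s+$'

sample = "  \tHello,   World!  \n"
stripped = re.sub(WHITESPACE_ENDS, '', sample)

print(repr(stripped))              # 'Hello,   World!'
print(stripped == sample.strip())  # True
```

Note that the run of spaces between the two words survives, because neither alternative in the pattern can match in the middle of the string.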

Step 3: Create a Pattern
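You might notice that the pattern is a plain string passed straight to re.sub() rather than a pattern object built with re.compile(). Because the pattern is used only once per call, and Python's re module keeps an internal cache of recently compiled patterns, precompiling gains very little here. To learn more, click here.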
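
For comparison, here is a minimal sketch of both styles side by side. The variable names are illustrative only; both calls produce the same result, and the compiled form mainly pays off when one pattern is reused many times.

```python
import re

pattern_text = r'^\s+|\s+$'
sample = "   Hello, World!   "

# Module-level call: re compiles the pattern (and caches it) behind the scenes.
via_module = re.sub(pattern_text, '', sample)

# Precompiled pattern object: handy when the same pattern is reused many times.
compiled = re.compile(pattern_text)
via_compiled = compiled.sub('', sample)

print(via_module == via_compiled)  # True
```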

Step 4: Execute the Else Statement
If chars_to_remove is specified (not None), the else block runs instead. It escapes the characters with re.escape() so that none of them is treated as a regex metacharacter, then builds a character-class pattern from them.
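
To see why the escaping matters, consider characters that have a special meaning inside square brackets. The sketch below uses a deliberately awkward, made-up set of characters; without re.escape(), the resulting pattern would be read as a negated class followed by a literal bracket rather than as the three characters.

```python
import re

# These three characters are all special inside a character class.
chars_to_remove = "^-]"

escaped = re.escape(chars_to_remove)
char_class = f'[{escaped}]+'

# Find every run made up of the escaped characters.
matches = re.findall(char_class, "^^-]hello]-^")

print(escaped)  # \^\-\]
print(matches)  # ['^^-]', ']-^']
```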

Step 5: Substitute the Pattern
Finally, you use re.sub() to replace every match of the pattern with an empty string, effectively removing it from input_string.
Here's the complete code:

import re

def regex_strip(input_string, chars_to_remove=None):
    if chars_to_remove is None:
        # If chars_to_remove is not specified, remove leading and trailing whitespace
        pattern = r'^\s+|\s+$'
    elif chars_to_remove == '':
        # An empty set of characters strips nothing, just like str.strip('')
        return input_string
    else:
        # Escape characters that are special inside a character class (such as ^, - and ])
        escaped = re.escape(chars_to_remove)
        # Anchor the character class to both ends so that, like strip(), only
        # leading and trailing runs are removed. \Z matches only at the very end,
        # whereas $ would also match just before a trailing newline.
        pattern = f'^[{escaped}]+|[{escaped}]+\\Z'

    return re.sub(pattern, '', input_string)

if __name__ == '__main__':
    input_string = input('Input a string: ')
    # Pressing Enter without typing anything falls back to stripping whitespace
    chars = input('Input the characters to strip (leave blank for whitespace): ') or None

    # TEST
    # input_string = "ABCDEFHello, World!ABCDEF"
    # chars = "ABCDEF"

    # input_string = "   Hello, World!   "

    print(regex_strip(input_string, chars))

The code provides you with a tool to strip specific characters from a given string using regular expressions. Feel free to experiment with different input strings and characters to remove to see how it works.
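
One way to experiment systematically is to run a regex-based strip and the built-in strip() over the same inputs and compare the results. The sketch below does that with a hypothetical helper, strip_like, which anchors its pattern to both ends of the string; the helper name and the test cases are assumptions for illustration, not part of the project code.

```python
import re

def strip_like(text, chars=None):
    """Hypothetical helper: regex-based strip that only touches both ends."""
    if chars is None:
        pattern = r'\A\s+|\s+\Z'
    elif chars == '':
        # str.strip('') removes nothing, so mirror that here.
        return text
    else:
        escaped = re.escape(chars)
        pattern = rf'\A[{escaped}]+|[{escaped}]+\Z'
    return re.sub(pattern, '', text)

# (text, chars) pairs; None means "strip whitespace".
cases = [
    ("   Hello, World!   ", None),
    ("ABCDEFHello, World!ABCDEF", "ABCDEF"),
    ("xxhexxlloxx", "x"),
    ("--[tag]--", "-[]"),
    ("no change", "z"),
]

# Pair each regex result with what the built-in returns for the same input.
results = [(strip_like(t, c), t.strip(c)) for t, c in cases]
for got, expected in results:
    print(repr(got), got == expected)
```

If any line prints False, the pattern and the built-in disagree on that input, which is a good hint about where the regex needs attention.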

Exercise

Here are some additional exercises to further practice your regex skills:

  1. Parsing CSV Data: Write a regex pattern to extract data from a CSV (Comma-Separated Values) file. Test it on a sample CSV file with different data.
  2. Extracting Hashtags from Social Media Posts: Create a regex pattern to extract hashtags from social media posts, such as #regex or #PythonProgramming.
  3. Parsing Log Files: Develop a regex pattern to extract specific information from log files, like timestamps, error codes, and messages.
  4. Validating ISBNs (International Standard Book Numbers): Write a regex pattern to validate ISBNs with different formats (e.g., ISBN-10 or ISBN-13).

These exercises will help you apply your regex skills to real-world scenarios and expand your knowledge.

Conclusion

In this project, you created a regex version of strip(), building upon your previous skills. Keep practicing, experimenting, and applying regex patterns in various contexts to become more proficient.

If you have any questions, want to connect, or just fancy a chat, feel free to reach out to me on LinkedIn and Twitter.

Top comments (2)

Gustavo Felisberto

Adding a cache of compiled regex would mean a good performance boost in the long run no?

Praise Idowu

Fixed. I admit my mistake.
quora.com/What-are-the-performance....