DEV Community

Matt Ryan

How to use Regex with Python

Regular expressions, commonly known as regex, are a compact notation for matching patterns in text. Python is a popular language for data processing, and its standard library includes a module for working with regular expressions: the re module. In this article, we will cover the basics of using regex with Python and provide some examples to help you get started.

Importing the re Module

To use regex in Python, you first need to import the re module. This can be done with a simple import statement:

import re

Once you have imported the re module, you can use its functions to match patterns in text.

The Basic Syntax of Regular Expressions

Regular expressions use a combination of characters and special symbols to create patterns that can match specific strings of text. Here are some of the most commonly used symbols:

  • . matches any single character (except a newline, by default)
  • ^ matches the beginning of a string
  • $ matches the end of a string
  • * matches zero or more occurrences of the preceding character or group
  • + matches one or more occurrences of the preceding character or group
  • ? matches zero or one occurrence of the preceding character or group
  • {m} matches exactly m occurrences of the preceding character or group
  • {m,n} matches between m and n occurrences of the preceding character or group
  • [] matches any single character listed within the brackets
  • () creates a group that can be referenced later
  • \ escapes special characters so they can be matched as literal characters
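As a rough illustration of the symbols above, the sketch below runs one small pattern per symbol through re.findall(). The sample strings are made up for this example, and grouping with () is left out here because re.findall() returns only the group contents when a pattern contains a group.

```python
import re

# Each tuple: (pattern, sample text, what re.findall is expected to return).
examples = [
    (r"c.t", "cat cot cut", ["cat", "cot", "cut"]),       # . = any single character
    (r"^The", "The end", ["The"]),                        # ^ = start of string
    (r"end$", "The end", ["end"]),                        # $ = end of string
    (r"ab*", "a ab abb", ["a", "ab", "abb"]),             # * = zero or more
    (r"ab+", "a ab abb", ["ab", "abb"]),                  # + = one or more
    (r"colou?r", "color colour", ["color", "colour"]),    # ? = zero or one
    (r"\d{3}", "12 345 6789", ["345", "678"]),            # {m} = exactly m
    (r"\d{2,3}", "1 22 333 4444", ["22", "333", "444"]),  # {m,n} = between m and n
    (r"[aeiou]", "regex", ["e", "e"]),                    # [] = any one listed character
    (r"\.", "a.b.c", [".", "."]),                         # \ = escape, a literal dot
]

# Run every pattern against its sample text.
results = [re.findall(pattern, text) for pattern, text, _ in examples]

for (pattern, text, _), found in zip(examples, results):
    print(f"{pattern!r:12} on {text!r:18} -> {found}")
```

The patterns are written as raw strings (the r prefix) so that Python does not interpret the backslashes before the regex engine sees them.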

Using Regular Expressions in Python

Now that we have covered the basics of regular expression syntax, let's look at some examples of how to use them in Python.

Matching a Specific String

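To match a specific string, you can use the re.search() function. This function takes two required arguments: the pattern you want to match and the string you want to search in. It scans the string and returns a match object for the first match, or None if the pattern is not found anywhere.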

For example, to match the string "hello" in the text "hello world", you can use the following code:

import re

text = "hello world"
pattern = "hello"

result = re.search(pattern, text)

if result:
    print("Match found!")
else:
    print("Match not found.")

The output of this code will be "Match found!", since the string "hello" is present in the text "hello world".
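The match object returned by re.search() carries more than a yes/no answer. The following sketch, using the same "hello world" text, shows how the matched text and its position might be read, and how re.search() compares with re.match() and re.fullmatch(). The variable names are only illustrative.

```python
import re

text = "hello world"

# re.search scans the whole string and returns a Match object or None.
found = re.search(r"world", text)
print(found.group())   # the matched text
print(found.span())    # (start, end) indexes of the match

# A pattern that is absent yields None, so test the result before using it.
missing = re.search(r"python", text)
print(missing is None)

# re.match only tries the start of the string; re.fullmatch requires the
# pattern to cover the entire string.
at_start = re.match(r"world", text)          # None: "world" is not at index 0
whole = re.fullmatch(r"hello world", text)   # Match: the pattern spans everything
print(at_start, whole is not None)
```

Because a failed search returns None, calling .group() on the result without checking it first raises an AttributeError.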

Matching a Range of Characters

To match a range of characters, you can use square brackets. For example, to match any single lowercase ASCII letter, you can use the pattern [a-z]. The re.findall() function returns every non-overlapping match as a list.

import re

text = "The quick brown fox jumps over the lazy dog."
pattern = "[a-z]"

result = re.findall(pattern, text)

print(result)

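The output of this code will be a list containing each lowercase letter in the text as a separate one-character string, in order of appearance. The uppercase "T", the spaces, and the final period are not matched.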
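Character sets become more useful when combined with a quantifier or negation. The sketch below reuses the same sentence; the three patterns are just examples of what square brackets can express, not the only way to split text into words.

```python
import re

text = "The quick brown fox jumps over the lazy dog."

# Adding + turns single letters into runs of consecutive lowercase letters.
lowercase_runs = re.findall(r"[a-z]+", text)

# Two ranges in one set: uppercase and lowercase letters.
words = re.findall(r"[A-Za-z]+", text)

# A leading ^ inside the brackets negates the set:
# any character that is neither a letter nor a space.
others = re.findall(r"[^A-Za-z ]", text)

print(lowercase_runs)
print(words)
print(others)
```

Note that the first pattern yields "he" rather than "The", because the uppercase "T" is outside the [a-z] range.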

Matching a Pattern in a String

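To find a pattern inside a longer string, you can again use re.search(). Patterns can be as simple as a literal word or can combine groups and alternatives: parentheses create a group and | separates alternatives, so the pattern (cat|dog)\w* matches "cat" or "dog" followed by any further word characters. The simpler example below searches for the literal word "fox":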

import re

text = "The quick brown fox jumps over the lazy dog."

pattern = r"fox"

result = re.search(pattern, text)

print(result.group()) # Output: "fox"


This code will search for the word "fox" in the string text using the regex pattern r"fox". re.search() returns a match object for the first occurrence of the pattern it finds, and group() returns the matched text.
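The (cat|dog)\w* pattern mentioned above can be tried out as follows. This is only a sketch: the sample sentence is invented, and the \b word boundary and the non-capturing (?:...) group are additions made here so that re.findall() returns whole words that start with "cat" or "dog".

```python
import re

# Hypothetical sample sentence, chosen so both alternatives occur.
text = "The catalog lists a dogsled, a cat, and a hotdog."

# With re.search, group() is the whole match and group(1) is the
# part captured by the parentheses.
first = re.search(r"(cat|dog)\w*", text)
print(first.group(), first.group(1))

# \b anchors the match at a word boundary, so "hotdog" is skipped.
# (?:...) groups the alternatives without capturing, so findall
# returns whole matches rather than just the group contents.
matches = re.findall(r"\b(?:cat|dog)\w*", text)
print(matches)
```

Without the \b, the pattern would also match the "dog" at the end of "hotdog", since nothing in (cat|dog)\w* requires the match to begin at the start of a word.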

Extracting Data from a String

In this example, we're using regex to extract an email address from a string:

import re

text = "My email is john.doe@example.com"

pattern = r"([\w\.-]+)@([\w\.-]+)"

result = re.search(pattern, text)

print(result.group()) # Output: "john.doe@example.com"
print(result.group(1)) # Output: "john.doe"
print(result.group(2)) # Output: "example.com"

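We're using the pattern r"([\w\.-]+)@([\w\.-]+)", which matches one or more word characters, dots, or hyphens, followed by @ and another run of the same characters. This is a simplified approximation of an email address, not a full validator. We're then using the group() method to extract the entire email address with group(), as well as the username and domain separately with group(1) and group(2), which correspond to the two pairs of parentheses.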
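When a string may contain several addresses, one possible approach is to compile the pattern once and iterate over every match with finditer(). The sketch below uses the same simplified pattern with named groups; the group names user and domain and the sample text are assumptions made for this illustration.

```python
import re

# Hypothetical input containing more than one address.
text = "Contact john.doe@example.com or jane_smith@mail.example.org today"

# Same simplified shape as before, with named groups for readability.
# Inside [] a dot is already literal, so it does not need escaping.
pattern = re.compile(r"(?P<user>[\w.-]+)@(?P<domain>[\w.-]+)")

# finditer yields one Match object per address found.
addresses = [(m.group("user"), m.group("domain")) for m in pattern.finditer(text)]
print(addresses)
```

Named groups make the extraction code easier to read than numeric indexes, especially once a pattern has more than two or three groups.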

Replacing Text in a String

In this example, we're using regex to replace the word "brown" with "red" in a string.

import re

text = "The quick brown fox jumps over the lazy dog."

pattern = r"brown"

new_text = re.sub(pattern, "red", text)

print(new_text) # Output: "The quick red fox jumps over the lazy dog."

We're using the re.sub() function, which takes the regex pattern, the replacement text, and the string we want to search and replace text in. It replaces every occurrence of the pattern, not just the first, and returns a new string with the replacements made; the original string is left unchanged.
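re.sub() also accepts a few optional extras that are worth knowing about. The sketch below, again on the same sentence, shows the count and flags arguments, a backreference in the replacement text, and a function used as the replacement. The variable names are only illustrative.

```python
import re

text = "The quick brown fox jumps over the lazy dog."

# By default every match is replaced; re.IGNORECASE also catches "The".
all_the = re.sub(r"the", "a", text, flags=re.IGNORECASE)

# count limits how many replacements are made (here, only the first).
first_the = re.sub(r"the", "a", text, count=1, flags=re.IGNORECASE)

# \1 and \2 in the replacement refer back to the captured groups.
swapped = re.sub(r"(quick) (brown)", r"\2 \1", text)

# A function can compute each replacement from the Match object.
shouted = re.sub(r"fox|dog", lambda m: m.group().upper(), text)

for line in (all_the, first_the, swapped, shouted):
    print(line)
```

Passing count and flags by keyword keeps the call readable and avoids mixing up their positions.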
