DEV Community

Cover image for Find Users With Valid E-Mails | LeetCode | MSSQL
Retiago Drago
Retiago Drago

Posted on • Updated on

Find Users With Valid E-Mails | LeetCode | MSSQL

The Problem

Given a Users table containing user information, including potentially invalid emails, the task is to write an SQL query to find users who have valid emails.

The Users table is structured as follows:

Column Name Type
user_id int
name varchar
mail varchar

The user_id is the primary key for this table. A valid email has a prefix and a domain with specific constraints. The prefix may contain upper or lower case letters, digits, underscore '_', period '.', and dash '-', but it must start with a letter. The domain must be '@leetcode.com'.

The goal is to return the users with valid emails in any order.

Here's a sample input for illustration:

user_id name mail
1 Winston winston@leetcode.com
2 Jonathan jonathanisgreat
3 Annabelle bella-@leetcode.com
4 Sally sally.come@leetcode.com
5 Marwan quarz#2020@leetcode.com
6 David david69@gmail.com
7 Shapiro .shapo@leetcode.com

The expected output from the above Users table would be:

user_id name mail
1 Winston winston@leetcode.com
3 Annabelle bella-@leetcode.com
4 Sally sally.come@leetcode.com

The mail of user 2 does not have a domain, the mail of user 5 has the # sign which is not allowed, the mail of user 6 does not have the leetcode domain, and the mail of user 7 starts with a period.

The Solution

We'll explore six different SQL solutions, each with a unique approach to the problem. We'll be using the LIKE, PATINDEX, RIGHT, LEFT, and LEN functions, as well as subqueries and WHERE clauses to implement our solutions.

Source Code 1

The first solution selects all users whose emails start with a letter and end with '@leetcode.com', and do not contain any disallowed characters. It checks the email format in the WHERE clause.

SELECT *
FROM Users
WHERE mail LIKE '[A-Za-z]%@leetcode.com'
    AND PATINDEX('%[^A-Za-z0-9\-\_\.\-]%@leetcode.com', mail) = 0
Enter fullscreen mode Exit fullscreen mode

This query finishes in 1675ms, beating 50% of submissions.

s1

Source Code 2

The second solution uses a subquery to first filter emails that start with a letter and end with '@leetcode.com'. The main query then ensures that there are no disallowed characters.

SELECT *
FROM (
    SELECT *
    FROM Users
    WHERE mail LIKE '[A-Za-z]%@leetcode.com'
) [u]
WHERE PATINDEX('%[^A-Za-z0-9\-\_\.\-]%@leetcode.com', mail) = 0
Enter fullscreen mode Exit fullscreen mode

This query runs in 2006ms, beating 17.62% of submissions.

s2

Source Code 3

The third solution also filters users with emails ending with '@leetcode.com' and starting with a letter. Instead of PATINDEX, it uses LEFT and LEN to filter out disallowed characters.

SELECT *
FROM Users
WHERE RIGHT(mail, 13) = '@leetcode.com'
    AND mail LIKE '[a-zA-Z]%'
    AND LEFT(mail, LEN(mail)-13)  NOT LIKE '%[^0-9a-zA-Z\_\.\-]%'
Enter fullscreen mode Exit fullscreen mode

This query finishes in 2022ms, beating 16.66% of submissions.

s3

Source Code 4

The fourth solution is similar to the first, but it explicitly selects the columns to return.

SELECT
    user_id,
    name,
    mail
FROM Users
WHERE mail LIKE '[A-Za-z]%@leetcode.com'
    AND PATINDEX('%[^A-Za-z0-9\-\_\.\-]%@leetcode.com', mail) = 0
Enter fullscreen mode Exit fullscreen mode

This query finishes in 2212ms, beating 9.76% of submissions.

s4

Source Code 5

The fifth solution, similar to the second, uses a subquery to first filter the emails, then applies the PATINDEX in the main query. It explicitly selects the columns to return.

SELECT
    user_id,
    name,
    mail
FROM (
    SELECT
        user_id,
        name,
        mail
    FROM Users
    WHERE mail LIKE '[A-Za-z]%@leetcode.com'
) [u]
WHERE PATINDEX('%[^A-Za-z0-9\-\_\.\-]%@leetcode.com', mail) = 0
Enter fullscreen mode Exit fullscreen mode

This query finishes in 1524ms, beating 68.9% of submissions.

s5

Source Code 6

The final solution is similar to the third, but it also explicitly selects the columns to return.

SELECT
    user_id,
    name,
    mail
FROM Users
WHERE RIGHT(mail, 13) = '@leetcode.com'
    AND mail LIKE '[a-zA-Z]%'
    AND LEFT(mail, LEN(mail)-13)  NOT LIKE '%[^0-9a-zA-Z\_\.\-]%'
Enter fullscreen mode Exit fullscreen mode

This query finishes in 1851ms, beating 29.52% of submissions.

s6

Conclusion

While all these solutions accomplish the same goal, their performance varies. It's evident that the use of subqueries (Source Code 2 and 5) can affect the query's efficiency, but in some cases, the performance difference is marginal and may not be a significant concern, especially when working with smaller datasets.

In this particular LeetCode problem, the best performing solution overall was Source Code 5, followed by Source Code 1, Source Code 6, Source Code 3, Source Code 2, and Source Code 4. However, it's crucial to remember that the efficiency of SQL operations can significantly differ based on the database's specifics, including its size, structure, indexing, and the database management system itself.

You can find the original problem at LeetCode.

For more insightful solutions and tech-related content, feel free to connect with me on my Beacons page.

ranggakd - Link in Bio & Creator Tools | Beacons

@ranggakd | center details summary summary Oh hello there I m a an Programmer AI Tech Writer Data Practitioner Statistics Math Addict Open Source Contributor Quantum Computing Enthusiast details center.

favicon beacons.ai

Top comments (0)