## DEV Community

Retiago Drago

Posted on • Updated on

# Customer Who Visited but Did Not Make Any Transactions | LeetCode | MSSQL

## The Problem

The task at hand is to analyze two tables: `Visits` and `Transactions`.

`Visits` contains information about customers who visited a mall. The structure of the `Visits` table is as follows:

Column Name Type
visit_id int
customer_id int

`visit_id` is the primary key for this table.

`Transactions` contains information about transactions made during a particular visit. The structure of the `Transactions` table is:

Column Name Type
transaction_id int
visit_id int
amount int

`transaction_id` is the primary key for this table.

The objective is to write a SQL query to find the customer IDs of users who visited the mall without making any transactions, as well as the number of times they made such visits. The result should be returned in any order.

For instance, consider the following input:

Visits

visit_id customer_id
1 23
2 9
4 30
5 54
6 96
7 54
8 54

Transactions

transaction_id visit_id amount
2 5 310
3 5 300
9 5 200
12 1 910
13 2 970

The expected output is:

customer_id count_no_trans
54 2
30 1
96 1

The explanation for this output is as follows:

• Customer 23 visited the mall once and made one transaction during visit 1.
• Customer 9 visited the mall once and made one transaction during visit 2.
• Customer 30 visited the mall once but did not make any transactions.
• Customer 54 visited the mall three times. During two of these visits, they did not make any transactions, but during one visit, they made three transactions.
• Customer 96 visited the mall once but did not make any transactions.

Hence, customers 30 and 96 visited the mall once without making any transactions. Customer 54 visited twice without making any transactions.

## The Solution

We have six source codes to address this problem. All six solutions use different SQL join types and methods for filtering records. The solutions use SQL functions such as LEFT JOIN, RIGHT JOIN, DISTINCT, COUNT, GROUP BY, and NOT EXISTS in different combinations. The different strategies employed by these codes include joining tables and filtering the result to only include visits without transactions, using the NOT IN clause to exclude visits that have corresponding transactions, and using the NOT EXISTS clause to exclude visits that have corresponding transactions.

### Source Code 1

This code uses a LEFT JOIN to combine the `Visits` and `Transactions` tables and then filters the result to only include visits that don't have a corresponding transaction. The `COUNT` function is then used to find the number of such visits for each customer.

``````SELECT
v.customer_id,
COUNT(v.visit_id) [count_no_trans]
FROM Visits v LEFT JOIN Transactions t ON v.visit_id = t.visit_id
WHERE t.visit_id IS NULL
GROUP BY v.customer_id
``````

This code has a runtime of 2934ms, beating 13.82% of submissions.

### Source Code 2

This code is similar to Source Code 1, but it uses a RIGHT JOIN instead of a LEFT JOIN.

``````SELECT
v.customer_id,
COUNT(v.visit_id) [count_no_trans]
FROM Transactions t RIGHT JOIN Visits v ON v.visit_id = t.visit_id
WHERE t.visit_id IS NULL
GROUP BY v.customer_id
``````

This code has a runtime of 2931ms, also beating 13.82% of submissions.

### Source Code 3

Source Code 3 uses a subquery in the WHERE clause to filter out visits that have corresponding transactions. The `NOT IN` clause excludes visit IDs that are in the `Transactions` table.

``````SELECT
customer_id,
COUNT(visit_id) [count_no_trans]
FROM Visits
WHERE visit_id NOT IN (
SELECT DISTINCT visit_id
FROM Transactions
)
GROUP BY customer_id
``````

This code has a runtime of 2848ms, beating 16.79% of submissions.

### Source Code 4

This code is similar to Source Code 3, but it also uses the `DISTINCT` keyword in the main query, not just in the subquery. Additionally, it uses the `OVER` clause with the `PARTITION BY` clause to compute the count of visits for each customer.

``````SELECT DISTINCT
customer_id,
COUNT(visit_id) OVER(PARTITION BY customer_id) [count_no_trans]
FROM Visits
WHERE visit_id NOT IN (
SELECT DISTINCT visit_id
FROM Transactions
)
``````

This code has a runtime of 2444ms, beating 39.85% of submissions.

### Source Code 5

Source Code 5 uses the `NOT EXISTS` clause with a subquery to exclude visits that have corresponding transactions. It also uses the `OVER` clause with the `PARTITION BY` clause to compute the count of visits for each customer.

``````SELECT DISTINCT
customer_id,
COUNT(visit_id) OVER(PARTITION BY customer_id) [count_no_trans]
FROM Visits v
WHERE NOT EXISTS (
SELECT 1
FROM Transactions t
WHERE t.visit_id = v.visit_id
)
``````

This code has a runtime of 2464ms, beating 38.63% of submissions.

### Source Code 6

Source Code 6 is similar to Source Code 5, but it uses the `GROUP BY` clause instead of the `OVER` clause with the `PARTITION BY` clause to compute the count of visits for each customer.

``````SELECT
customer_id,
COUNT(visit_id) [count_no_trans]
FROM Visits v
WHERE NOT EXISTS (
SELECT 1
FROM Transactions t
WHERE t.visit_id = v.visit_id
)
GROUP BY customer_id
``````

This code has a runtime of 3669ms, beating 5.2% of submissions.

## Conclusion

These solutions demonstrate different ways to solve the problem using SQL. Some methods are more efficient than others in terms of runtime. Based on these examples, it appears that using `NOT EXISTS` with a subquery, and using the `OVER` clause with the `PARTITION BY` clause for counting, provides the most efficient solution. However, the efficiency of a solution can depend on the specifics of the database management system being used, as well as the data distribution. Hence, different solutions may be more efficient in different circumstances.

Ranking the solutions from best to worst in terms of LeetCode performance, we get:

1. Source Code 4
2. Source Code 5
3. Source Code 3
4. Source Code 1 and 2
5. Source Code 6

Keep in mind that performance on LeetCode might not always translate directly to performance in a real-world RDBMS. LeetCode measures how quickly a solution runs in its specific environment, which may differ from the environment of a production database. Factors like indexing, query optimization by the database engine, the hardware the database is running on, and more can all affect the real-world performance of a solution.

You can find the original problem at LeetCode.

For more insightful solutions and tech-related content, feel free to connect with me on my Beacons page.

## ranggakd - Link in Bio & Creator Tools | Beacons

@ranggakd | center details summary summary Oh hello there I m a an Programmer AI Tech Writer Data Practitioner Statistics Math Addict Open Source Contributor Quantum Computing Enthusiast details center.

beacons.ai