The Problem
Table: Users
Column Name | Type |
---|---|
user_id | int |
user_name | varchar |
user_id
is the primary key for this table.
Each row of this table contains the name and the id of a user.
Table: Register
Column Name | Type |
---|---|
contest_id | int |
user_id | int |
(contest_id, user_id) is the primary key for this table.
Each row of this table contains the id of a user and the contest they registered into.
Problem: Write an SQL query to find the percentage of the users registered in each contest, rounded to two decimals. Return the result table ordered by percentage in descending order. In case of a tie, order it by contest_id in ascending order.
Explanation
Input:
Users table:
user_id | user_name |
---|---|
6 | Alice |
2 | Bob |
7 | Alex |
Register table:
contest_id | user_id |
---|---|
215 | 6 |
209 | 2 |
208 | 2 |
210 | 6 |
208 | 6 |
209 | 7 |
209 | 6 |
215 | 7 |
208 | 7 |
210 | 2 |
207 | 2 |
210 | 7 |
Output:
contest_id | percentage |
---|---|
208 | 100.0 |
209 | 100.0 |
210 | 100.0 |
215 | 66.67 |
207 | 33.33 |
Explanation:
All the users registered in contests 208, 209, and 210. The percentage is 100% and we sort them in the answer table by contest_id in ascending order. Alice and Alex registered in contest 215 and the percentage is ((2/3) * 100) = 66.67%. Bob registered in contest 207 and the percentage is ((1/3) * 100) = 33.33%.
The Solution
We present two different SQL queries to solve the problem. Both utilize Common Table Expressions (CTE) and window functions but differ in how they handle the join operation.
Source Code 1
This code uses a cross product join, which is a combination of all rows from both tables (Users and Register). It calculates the total number of users and then determines the percentage for each contest.
WITH total_users AS (
SELECT COUNT(DISTINCT user_id) [total]
FROM Users
)
SELECT DISTINCT
contest_id,
ROUND(COUNT(user_id) OVER(PARTITION BY contest_id) * 100.0 / total, 2) [percentage]
FROM Register, total_users
ORDER BY
percentage DESC,
contest_id
The runtime of this code is 4932ms, beating 68.97% of submissions.
Source Code 2
This code uses a CROSS JOIN operation instead of a simple cross product, which performs better when the tables have a large number of rows. The rest of the code is the same as in Source Code 1.
WITH total_users AS (
SELECT COUNT(DISTINCT user_id) [total]
FROM Users
)
SELECT DISTINCT
contest_id,
ROUND(COUNT(user_id) OVER(PARTITION BY contest_id) * 100.0 / total, 2) [percentage]
FROM Register r CROSS JOIN total_users t
ORDER BY
percentage DESC,
contest_id
The runtime of this code is 5418ms, beating 28.49% of submissions.
Conclusion
These solutions illustrate two approaches to the problem: using a simple cross product join and a more performant CROSS JOIN. While the latter is generally better for larger data sets, in this case, the LeetCode benchmarks show that the simple cross product is faster.
Here are the solutions ranked from best to worst based on their LeetCode performance:
- Source Code 1
- Source Code 2
However, it's important to note that the performance on LeetCode might not always translate directly to performance in a real-world RDBMS. Factors such as database engine optimizations, hardware, and the distribution of data can affect performance in ways that may not be reflected in these rankings.
You can find the original problem at LeetCode.
For more insightful solutions and tech-related content, feel free to connect with me on my Beacons page.
Top comments (0)