We are given three tables:
Students table structure:
student_id is the primary key for this table. Each row contains the ID and the name of one student in the school.
Subjects table structure:
subject_name is the primary key for this table. Each row contains the name of one subject in the school.
Examinations table structure:
This table doesn't have a primary key and may contain duplicates. Each row indicates that a student with the
student_id attended the exam of
The task is to write an SQL query to find the number of times each student attended each exam. Return the result table ordered by
Alice attended the Math exam 3 times, the Physics exam 2 times, and the Programming exam 1 time. Bob attended the Math exam 1 time, the Programming exam 1 time, and didn't attend the Physics exam. Alex didn't attend any exams. John attended the Math exam 1 time, the Physics exam 1 time, and the Programming exam 1 time.
We'll discuss three SQL solutions with varying degrees of complexity. The primary differences among these solutions lie in the SQL features they use, such as conditional statements, CROSS JOIN, and RIGHT JOIN.
This query starts by joining the
Subjects tables on a constant condition (1=1), effectively generating all possible combinations of students and subjects. Next, a subquery is used to calculate the attendance count for each student and subject from the
Examinations table. This subquery result is then left joined with the combination of students and subjects on matching student_id and subject_name. A CASE statement is used to handle scenarios where no examination record exists for a student-subject pair, replacing NULL with 0.
SELECT s.student_id, s.student_name, u.subject_name, CASE WHEN e.counter IS NULL THEN 0 ELSE e.counter END [attended_exams] FROM Students s JOIN Subjects u ON 1=1 LEFT JOIN ( SELECT *, COUNT(student_id) [counter] FROM Examinations GROUP BY student_id, subject_name ) e ON e.student_id = s.student_id AND e.subject_name = u.subject_name ORDER BY s.student_id, u.subject_name
The runtime for this solution is 715ms, beating 53.70% of submissions on LeetCode.
This query also generates all possible combinations of students and subjects, but it uses the CROSS JOIN operation instead of joining on a constant condition. It then LEFT JOINs the
Examinations table on matching student_id and subject_name. The attendance count for each student-subject pair is calculated using the COUNT function within an IIF statement to handle nulls.
SELECT s.student_id, s.student_name, sb.subject_name, IIF(COUNT(e.student_id) IS NULL, 0, COUNT(e.student_id)) [attended_exams] FROM Students s CROSS JOIN Subjects sb LEFT JOIN Examinations e ON s.student_id = e.student_id AND sb.subject_name = e.subject_name GROUP BY s.student_id, s.student_name, sb.subject_name ORDER BY s.student_id, sb.subject_name
The runtime for this solution is 874ms, beating 24.33% of submissions on LeetCode.
In this solution, we first generate a subquery from
Examinations that calculates the attendance count for each student and subject. We also generate a subquery from the CROSS JOIN of
Subjects tables. Then, we RIGHT JOIN the two subqueries on matching student_id and subject_name. The ISNULL function is used to handle scenarios where no examination record exists for a student-subject pair, replacing NULL with 0.
SELECT s_u.student_id, s_u.student_name, s_u.subject_name, ISNULL(e.attended_exams, 0) [attended_exams] FROM ( SELECT student_id, subject_name, COUNT(*) [attended_exams] FROM Examinations GROUP BY student_id, subject_name ) e RIGHT JOIN (SELECT * FROM Students CROSS JOIN Subjects) s_u ON s_u.student_id = e.student_id AND s_u.subject_name = e.subject_name ORDER BY s_u.student_id, s_u.subject_name
The runtime for this solution is 732ms, beating 49.77% of submissions on LeetCode.
All three solutions solve the problem effectively but with different performance results. The solution using CROSS JOIN (Source Code 2) performs the slowest due to the overhead of generating all combinations of students and subjects. Both Source Code 1 and 3 are faster due to their more efficient handling of the attendance count calculations.
Based on LeetCode performance metrics, Source Code 1 is the fastest solution, followed by Source Code 3, and then Source Code 2. However, in real-world scenarios, the performance may vary based on the RDBMS in use and the specific database structure and data volume.
You can find the original problem at LeetCode.
For more insightful solutions and tech-related content, feel free to connect with me on my Beacons page.