Retiago Drago

Posted on

# Find Followers Count | LeetCode | MSSQL

## The Problem

Consider a table `Followers` with the following schema:

Column Name Type
user_id int
follower_id int

`(user_id, follower_id)` is the primary key for this table. This table contains the IDs of a user and a follower in a social media app where the follower follows the user. The task is to write an SQL query that will, for each user, return the number of followers. The result table should be ordered by `user_id` in ascending order.

## Explanation

Consider the following `Followers` table:

user_id follower_id
0 1
1 0
2 0
2 1

For this input, the expected output would be:

user_id followers_count
0 1
1 1
2 2

Here, each row represents a distinct `user_id`, and the count of followers for each user.

## The Solution

We will discuss two different approaches to solve this problem, highlight their strengths and weaknesses, and explain their underlying structures.

### Using GROUP BY

This approach directly counts the `follower_id` for each `user_id` using GROUP BY clause.

``````SELECT
user_id,
COUNT(follower_id) [followers_count]
FROM Followers
GROUP BY user_id
``````

This query is fairly straightforward and easy to understand. It works by grouping the data by `user_id`, and then it counts the followers within each group. The advantage of this method is its simplicity and directness. However, GROUP BY clause can be slow on large data sets. This query runtime is 994ms, beating 86.63% of other submissions on LeetCode.

### Using COUNT() with PARTITION BY

This approach counts `follower_id` for each `user_id` using window function.

``````SELECT DISTINCT
user_id,
COUNT(follower_id) OVER(PARTITION BY user_id) [followers_count]
FROM Followers
``````

This query, while similar to the first, leverages the power of window functions to achieve the same result. The PARTITION BY clause essentially creates a "window" of rows with the same `user_id`, and the COUNT function then operates on this window. This approach can sometimes offer performance improvements over the GROUP BY method, especially when dealing with larger data sets. However, in this case, the performance was slightly less, with a runtime of 1416ms, beating 24.50% of other submissions on LeetCode.

## Conclusion

Both methods solve the problem, but their performances can vary based on the size and distribution of the data set. While the first solution using GROUP BY clause performed better on LeetCode, in a real-world scenario with a larger data set and a well-tuned database, the second method using window function might perform better. This example serves to illustrate the importance of understanding different methods and when each can be advantageous. Performance in LeetCode and real-world applications may not align perfectly, but practicing with various techniques helps us better prepare for different scenarios.

You can find the original problem at LeetCode.

For more insightful solutions and tech-related content, feel free to connect with me on my Beacons page.