Retiago Drago

Posted on

# SQL Project Planning | HackerRank | MSSQL

## The Problem

You are given a table, `Projects`, containing three columns: `Task_ID`, `Start_Date`, and `End_Date`. It is guaranteed that the difference between the `End_Date` and the `Start_Date` is equal to 1 day for each row in the table.

If the `End_Date` of the tasks are consecutive, then they are part of the same project. Samantha is interested in finding the total number of different projects completed.

Write a query to output the start and end dates of projects listed by the number of days it took to complete the project in ascending order. If there is more than one project that have the same number of completion days, then order by the start date of the project.

## Explanation

Input:

1 2015-10-01 2015-10-02
2 2015-10-02 2015-10-03
3 2015-10-03 2015-10-04
4 2015-10-13 2015-10-14
5 2015-10-14 2015-10-15
6 2015-10-28 2015-10-29
7 2015-10-30 2015-10-31

Output:

Project_Start_Date Project_End_Date
2015-10-28 2015-10-29
2015-10-30 2015-10-31
2015-10-13 2015-10-15
2015-10-01 2015-10-04

## The Solution

Here, we'll discuss two different SQL approaches that make use of window functions and date arithmetic to solve this problem. Both approaches have different methods of grouping tasks into projects.

### Source Code 1

This approach groups tasks into projects based on their `Start_Date`. If a task's `Start_Date` is consecutive to another task's `End_Date`, they belong to the same project. The `ROW_NUMBER` window function generates a sequence of numbers, which is subtracted from the `Start_Date` to create a grouping identifier.

``````WITH CTE AS (
SELECT
Start_Date,
End_Date,
DATEADD(day, - ROW_NUMBER() OVER(ORDER BY Start_Date), Start_Date) [Grouping]
FROM Projects
)
SELECT
MIN(Start_Date) [Project_Start_Date],
MAX(End_Date) [Project_End_Date]
FROM CTE
GROUP BY Grouping
ORDER BY DATEDIFF(day, MIN(Start_Date), MAX(End_Date)), MIN(Start_Date)
``````

### Source Code 2

This approach, on the other hand, groups tasks into projects based on their `End_Date`. Similar to the first approach, `ROW_NUMBER` is used to generate a sequence of numbers, which is subtracted from the `End_Date` to create a grouping identifier.

``````WITH CTE AS (
SELECT
Start_Date,
End_Date,
DATEADD(day, - ROW_NUMBER() OVER(ORDER BY End_Date), End_Date) [Grouping]
FROM Projects
)
SELECT
MIN(Start_Date) [Project_Start_Date],
MAX(End_Date) [Project_End_Date]
FROM CTE
GROUP BY Grouping
ORDER BY DATEDIFF(day, MIN(Start_Date), MAX(End_Date)), MIN(Start_Date)
``````

## Conclusion

Both of these solutions offer an efficient way to tackle the problem using window functions and date arithmetic. They provide an excellent showcase of some of the more advanced features of SQL and how they can be used to solve complex problems.

While these solutions have been shown to be effective on HackerRank, it's important to keep in mind that performance can vary based on different factors like the specific RDBMS used, the database structure, and the size of the data.

You can find the original problem at HackerRank.

For more insightful solutions and tech-related content, feel free to connect with me on my Beacons page.

## ranggakd - Link in Bio & Creator Tools | Beacons

@ranggakd | center details summary summary Oh hello there I m a an Programmer AI Tech Writer Data Practitioner Statistics Math Addict Open Source Contributor Quantum Computing Enthusiast details center.

beacons.ai