Welcome to the SQL showdown series!
I'm committing to publishing a SQL challenge every day for 30 days. In each post I'll describe my solution to the last day's challenge. I'll follow that up with a full description of the challenge for the next day.
Write your own solution in the comments! Let's see who can come up with the most creative solutions.
I'll add connection details for a PostgreSQL database containing test data for the challenge. Solutions are by no means limited to PostgreSQL, but it's there if you want an easy way to test your query!
Today we're going to tackle dealing with hierarchical or graph-like data. Imagine something like the Oracle of Bacon that finds the most direct relationship from one actor to Kevin Bacon. In our case, we'll use a made-up corporate org chart to represent a hierarchy of relationships.
Here's the challenge:
Can you take a specific employee ID and return a path up the org chart to the CEO, listing all of the employees on that path?
The only table in this challenge is the employee table. These are the columns that make up the employee table:
Here's an example to give you a better idea of the output you're after:
In this case, we start with Rudy and proceed up the chain until we reach Tonie, who has no manager, and who we assume is the CEO.
I have a PostgreSQL database ready for you to play with.
The password is the same as the username! To query the employee table:
SELECT * FROM day5.employee;
This is the question we were trying to answer with yesterday's SQL challenge about video game sales:
Can you produce a report that displays one year per row, and the aggregated global sales by genre for that year as columns?
The task is to turn rows into columns, so that we end up with every genre as a column. This is kind of like an Excel pivot table.
Some databases, such as SQL Server, have a PIVOT function that does exactly this. PostgreSQL, however, does not have the PIVOT function. The tablefunc extension module provides a CROSSTAB function for accomplishing the same thing, but let's stick with a vanilla PostgreSQL installation.
Here's how I went about solving this challenge:
SELECT year AS "Year", SUM(CASE WHEN genre = 'Action' THEN global_sales END) AS "Action", SUM(CASE WHEN genre = 'Adventure' THEN global_sales END) AS "Adventure", SUM(CASE WHEN genre = 'Fighting' THEN global_sales END) AS "Fighting", SUM(CASE WHEN genre = 'Misc' THEN global_sales END) AS "Misc", SUM(CASE WHEN genre = 'Platform' THEN global_sales END) AS "Platform", SUM(CASE WHEN genre = 'Puzzle' THEN global_sales END) AS "Puzzle", SUM(CASE WHEN genre = 'Racing' THEN global_sales END) AS "Racing", SUM(CASE WHEN genre = 'Role-Playing' THEN global_sales END) AS "Role-Playing", SUM(CASE WHEN genre = 'Shooter' THEN global_sales END) AS "Shooter", SUM(CASE WHEN genre = 'Simulation' THEN global_sales END) AS "Simulation", SUM(CASE WHEN genre = 'Sports' THEN global_sales END) AS "Sports", SUM(CASE WHEN genre = 'Strategy' THEN global_sales END) AS "Strategy" FROM day4.videogame GROUP BY year ORDER BY year;
genre == "Action" ? global_sales : 0;
Usually you'll see CASE statements that have an ELSE clause, but here I'm relying on the fact that the default ELSE clause returns NULL, which when passed to the SUM function is equivalent to zero.
By grouping on year, each of the SUM functions is only adding up the sales for one genre over that year after the CASE statement filters out other genres.
This approach is obviously a little tedious, and only works when you know your desired columns ahead of time.
Have fun, and I can't wait to see what you come up with! I'll be back tomorrow with a solution to this problem and a new challenge.