yaswanthteja

Posted on Oct 12, 2022

8 Week SQL Challenge: Case Study #2 Pizza Runner

#beginners #mysql #database #sqlchallenges

Introduction

Danny was scrolling through his Instagram feed when something really caught his eye — “80s Retro Styling 🎸 and Pizza 🍕 Is The Future!”

Danny was sold on the idea, but he knew that pizza alone was not going to help him get seed funding to expand his new Pizza Empire — so he had one more genius idea to combine with it — he was going to Uberize it — and so Pizza Runner was launched!

Danny started by recruiting “runners” to deliver fresh pizza from Pizza Runner Headquarters (otherwise known as Danny’s house) and also maxed out his credit card to pay freelance developers to build a mobile app to accept orders from customers.

Table Relationship

customer_orders — Customers’ pizza orders with 1 row each for individual pizza with topping exclusions and extras, and order time.
runner_orders — Orders assigned to runners documenting the pickup time, distance and duration from Pizza Runner HQ to customer, and cancellation remark. runners — Runner IDs and registration date
pizza_names — Pizza IDs and name
pizza_recipes — Pizza IDs and topping names
pizza_toppings — Topping IDs and name

Case Study Questions

This case study has LOTS of questions — they are broken up by area of focus including:

A. Pizza Metrics
B. Runner and Customer Experience
C. Ingredient Optimisation
D. Pricing and Ratings
E. Bonus DML Challenges (DML = Data Manipulation Language)

Data Cleaning and Transformation

Before I start with the solutions, I investigate the data and found that there are some cleaning and transformation to do, specifically on the

null values and data types in the customer_orders table
null values and data types in the runner_orders table
Alter data type in pizza_names table

Firstly, to clean up exclusions and extras in the customer_orders — we create TEMP TABLE #customer_orders and use CASE WHEN.

SELECT order_id, customer_id, pizza_id, 
  CASE 
    WHEN exclusions IS null OR exclusions LIKE 'null' THEN ' '
    ELSE exclusions
    END AS exclusions,
  CASE 
    WHEN extras IS NULL or extras LIKE 'null' THEN ' '
    ELSE extras 
    END AS extras, 
  order_time
INTO #customer_orders -- create TEMP TABLE
FROM customer_orders;

Then, we clean the runner_orders table with CASE WHEN and TRIM and create TEMP TABLE #runner_orders.

In summary,

pickup_time — Remove nulls and replace with ‘ ‘
distance — Remove ‘km’ and nulls
duration — Remove ‘minutes’ and nulls
cancellation — Remove NULL and null and replace with ‘ ‘

SELECT order_id, runner_id,
  CASE 
    WHEN pickup_time LIKE 'null' THEN ' '
    ELSE pickup_time 
    END AS pickup_time,
  CASE 
    WHEN distance LIKE 'null' THEN ' '
    WHEN distance LIKE '%km' THEN TRIM('km' from distance) 
    ELSE distance END AS distance,
  CASE 
    WHEN duration LIKE 'null' THEN ' ' 
    WHEN duration LIKE '%mins' THEN TRIM('mins' from duration) 
    WHEN duration LIKE '%minute' THEN TRIM('minute' from duration)        
    WHEN duration LIKE '%minutes' THEN TRIM('minutes' from duration)       
    ELSE duration END AS duration,
  CASE 
    WHEN cancellation IS NULL or cancellation LIKE 'null' THEN ''
    ELSE cancellation END AS cancellation
INTO #runner_orders
FROM runner_orders;

Then, we alter the date according to its correct data type.

pickup_time to DATETIME type
distance to FLOAT type
duration to INT type

ALTER TABLE #runner_orders
ALTER COLUMN pickup_time DATETIME,
ALTER COLUMN distance FLOAT, 
ALTER COLUMN duration INT;

Now that the data has been cleaned and transformed, let’s move on solving the questions! 😉

Solution

A. Pizza Metrics

How many pizzas were ordered?

SELECT COUNT(*) AS pizza_order_count
FROM #customer_orders;

Answer:

Total pizzas ordered are 14.
How many unique customer orders were made?

SELECT COUNT(DISTINCT order_id) AS unique_order_count
FROM #customer_orders;

Answer:

There are 10 unique customer orders made.

How many successful orders were delivered by each runner?

SELECT runner_id, COUNT(order_id) AS successful_orders
FROM #runner_orders
WHERE distance != 0
GROUP BY runner_id;

Answer:

Runner 1 has 4 successful delivered orders.
Runner 2 has 3 successful delivered orders.
Runner 3 has 1 successful delivered order.

How many of each type of pizza was delivered?

SELECT p.pizza_name, COUNT(c.pizza_id) AS delivered_pizza_count
FROM #customer_orders AS c
JOIN #runner_orders AS r
 ON c.order_id = r.order_id
JOIN pizza_names AS p
 ON c.pizza_id = p.pizza_id
WHERE r.distance != 0
GROUP BY p.pizza_name;

Answer:

There are 9 delivered Meatlovers pizzas.
There are 3 delivered Vegetarian pizzas.

How many Vegetarian and Meatlovers were ordered by each customer?

SELECT c.customer_id, p.pizza_name, COUNT(p.pizza_name) AS order_count
FROM #customer_orders AS c
JOIN pizza_names AS p
 ON c.pizza_id= p.pizza_id
GROUP BY c.customer_id, p.pizza_name
ORDER BY c.customer_id;

Answer:

Customer 101 ordered 2 Meatlovers pizzas and 1 Vegetarian pizza.
Customer 102 ordered 2 Meatlovers pizzas and 2 Vegetarian pizzas.
Customer 103 ordered 3 Meatlovers pizzas and 1 Vegetarian pizza.
Customer 104 ordered 1 Meatlovers pizza.
Customer 105 ordered 1 Vegetarian pizza.

What was the maximum number of pizzas delivered in a single order?

WITH pizza_count_cte AS
(
 SELECT c.order_id, COUNT(c.pizza_id) AS pizza_per_order
 FROM #customer_orders AS c
 JOIN #runner_orders AS r
  ON c.order_id = r.order_id
 WHERE r.distance != 0
 GROUP BY c.order_id
)

SELECT MAX(pizza_per_order) AS pizza_count
FROM pizza_count_cte;

Answer:

Maximum number of pizza delivered in a single order is 3 pizzas.

For each customer, how many delivered pizzas had at least 1 change and how many had no changes?

SELECT c.customer_id,
 SUM(CASE 
  WHEN c.exclusions <> ' ' OR c.extras <> ' ' THEN 1
  ELSE 0
  END) AS at_least_1_change,
 SUM(CASE 
  WHEN c.exclusions = ' ' AND c.extras = ' ' THEN 1 
  ELSE 0
  END) AS no_change
FROM #customer_orders AS c
JOIN #runner_orders AS r
 ON c.order_id = r.order_id
WHERE r.distance != 0
GROUP BY c.customer_id
ORDER BY c.customer_id;

Answer:

Customer 101 and 102 likes his/her pizzas per the original recipe.
Customer 103, 104 and 105 have their own preference for pizza topping and requested at least 1 change (extra or exclusion topping) on their pizza.

How many pizzas were delivered that had both exclusions and extras?

SELECT  
 SUM(CASE
  WHEN exclusions IS NOT NULL AND extras IS NOT NULL THEN 1
  ELSE 0
  END) AS pizza_count_w_exclusions_extras
FROM #customer_orders AS c
JOIN #runner_orders AS r
 ON c.order_id = r.order_id
WHERE r.distance >= 1 
 AND exclusions <> ' ' 
 AND extras <> ' ';

Answer:

Only 1 pizza delivered that had both extra and exclusion topping. That’s one fussy customer!

What was the total volume of pizzas ordered for each hour of the day?

SELECT DATEPART(HOUR, [order_time]) AS hour_of_day, 
 COUNT(order_id) AS pizza_count
FROM #customer_orders
GROUP BY DATEPART(HOUR, [order_time]);

Answer:

Highest volume of pizza ordered is at 13 (1:00 pm), 18 (6:00 pm) and 21 (9:00 pm).
Lowest volume of pizza ordered is at 11 (11:00 am), 19 (7:00 pm) and 23 (11:00 pm).
What was the volume of orders for each day of the week?

SELECT FORMAT(DATEADD(DAY, 2, order_time),'dddd') AS day_of_week, 
-- add 2 to adjust 1st day of the week as Monday
 COUNT(order_id) AS total_pizzas_ordered
FROM #customer_orders
GROUP BY FORMAT(DATEADD(DAY, 2, order_time),'dddd');

Answer:

There are 5 pizzas ordered on Friday and Monday.
There are 3 pizzas ordered on Saturday.
There is 1 pizza ordered on Sunday.

B. Runner and Customer Experience

How many runners signed up for each 1 week period? (i.e. week starts 2021-01-01)

SELECT DATEPART(WEEK, registration_date) AS registration_week,
 COUNT(runner_id) AS runner_signup
FROM runners
GROUP BY DATEPART(WEEK, registration_date);

Answer:

On Week 1 of Jan 2021, 2 new runners signed up.
On Week 2 and 3 of Jan 2021, 1 new runner signed up.

What was the average time in minutes it took for each runner to arrive at the Pizza Runner HQ to pickup the order?

WITH time_taken_cte AS
(
 SELECT c.order_id, c.order_time, r.pickup_time, 
  DATEDIFF(MINUTE, c.order_time, r.pickup_time) AS pickup_minutes
 FROM #customer_orders AS c
 JOIN #runner_orders AS r
  ON c.order_id = r.order_id
 WHERE r.distance != 0
 GROUP BY c.order_id, c.order_time, r.pickup_time
)

SELECT AVG(pickup_minutes) AS avg_pickup_minutes
FROM time_taken_cte
WHERE pickup_minutes > 1;

Answer:

The average time taken in minutes by runners to arrive at Pizza Runner HQ to pick up the order is 15 minutes.

Is there any relationship between the number of pizzas and how long the order takes to prepare?

WITH prep_time_cte AS
(
 SELECT c.order_id, COUNT(c.order_id) AS pizza_order, 
  c.order_time, r.pickup_time, 
  DATEDIFF(MINUTE, c.order_time, r.pickup_time) AS prep_time_minutes
 FROM #customer_orders AS c
 JOIN #runner_orders AS r
  ON c.order_id = r.order_id
 WHERE r.distance != 0
 GROUP BY c.order_id, c.order_time, r.pickup_time
)

SELECT pizza_order, AVG(prep_time_minutes) AS avg_prep_time_minutes
FROM prep_time_cte
WHERE prep_time_minutes > 1
GROUP BY pizza_order;

Answer:

On average, a single pizza order takes 12 minutes to prepare.
An order with 3 pizzas takes 30 minutes at an average of 10 minutes per pizza.
It takes 16 minutes to prepare an order with 2 pizzas which is 8 minutes per pizza — making 2 pizzas in a single order the ultimate efficiency rate.

What was the average distance travelled for each customer?

SELECT c.customer_id, AVG(r.distance) AS avg_distance
FROM #customer_orders AS c
JOIN #runner_orders AS r
 ON c.order_id = r.order_id
WHERE r.duration != 0
GROUP BY c.customer_id;

Answer:

(Assuming that distance is calculated from Pizza Runner HQ to customer’s place)

Customer 104 stays the nearest to Pizza Runner HQ at average distance of 10km, whereas Customer 105 stays the furthest at 25km.

What was the difference between the longest and shortest delivery times for all orders?

Firstly, let’s see all the durations for the orders.

SELECT order_id, duration
FROM #runner_orders
WHERE duration not like ' ';

Then, we find the difference by deducting the shortest (MIN) from the longest (MAX) delivery times.

SELECT 
    MAX(duration::NUMERIC) - MIN(duration::NUMERIC) AS delivery_time_difference
FROM #runner_orders
WHERE duration not like '% %'

Answer:

The difference between longest (40 minutes) and shortest (10 minutes) delivery time for all orders is 30 minutes.

What was the average speed for each runner for each delivery and do you notice any trend for these values?

SELECT r.runner_id, c.customer_id, c.order_id, 
 COUNT(c.order_id) AS pizza_count, 
 r.distance, (r.duration / 60) AS duration_hr , 
 ROUND((r.distance/r.duration * 60), 2) AS avg_speed
FROM #runner_orders AS r
JOIN #customer_orders AS c
 ON r.order_id = c.order_id
WHERE distance != 0
GROUP BY r.runner_id, c.customer_id, c.order_id, r.distance, r.duration
ORDER BY c.order_id;

Answer:

(Average speed = Distance in km / Duration in hour)

Runner 1’s average speed runs from 37.5km/h to 60km/h.
Runner 2’s average speed runs from 35.1km/h to 93.6km/h. Danny should investigate Runner 2 as the average speed has a 300% fluctuation rate!
Runner 3’s average speed is 40km/h

What is the successful delivery percentage for each runner?

SELECT runner_id, 
 ROUND(100 * SUM
  (CASE WHEN distance = 0 THEN 0
  ELSE 1
  END) / COUNT(*), 0) AS success_perc
FROM #runner_orders
GROUP BY runner_id;

Answer:

Runner 1 has 100% successful delivery.
Runner 2 has 75% successful delivery.
Runner 3 has 50% successful delivery
(It’s not right to attribute successful delivery to runners as order cancellations are out of the runner’s control.)
I will continue with Part A, B and C soon!

C. Ingredient Optimisation
What are the standard ingredients for each pizza?
What was the most commonly added extra?
What was the most common exclusion?
Generate an order item for each record in the customers_orders table in the format of one of the following:
Meat Lovers
Meat Lovers - Exclude Beef
Meat Lovers - Extra Bacon
Meat Lovers - Exclude Cheese, Bacon - Extra Mushroom, Peppers

Generate an alphabetically ordered comma separated ingredient list for each pizza order from the customer_orders table and add a 2x in front of any relevant ingredients

For example: "Meat Lovers: 2xBacon, Beef, ... , Salami"

What is the total quantity of each ingredient used in all delivered pizzas sorted by most frequent first?

D. Pricing and Ratings

If a Meat Lovers pizza costs $12 and Vegetarian costs $10 and there were no charges for changes — how much money has Pizza Runner made so far if there are no delivery fees?
What if there was an additional $1 charge for any pizza extras?

Add cheese is $1 extra

The Pizza Runner team now wants to add an additional ratings system that allows customers to rate their runner, how would you design an additional table for this new dataset — generate a schema for this new table and insert your own data for ratings for each successful customer order between 1 to 5.
Using your newly generated table — can you join all of the information together to form a table which has the following information for successful deliveries?

customer_id
order_id
runner_id -rating -order_time
pickup_time
Time between order and pickup
Delivery duration
Average speed
Total number of pizzas

If a Meat Lovers pizza was $12 and Vegetarian $10 fixed prices with no cost for extras and each runner is paid $0.30 per kilometre travelled — how much money does Pizza Runner have left over after these deliveries?

E. Bonus Questions
If Danny wants to expand his range of pizzas — how would this impact the existing data design? Write an INSERT statement to demonstrate what would happen if a new Supreme pizza with all the toppings was added to the Pizza Runner menu?

Top comments (2)

Aaron Reese • Oct 13 '22

I've not read the whole article (too long!) But your first two code blocks could have been achieved using ISNULL(), REPLACE() and CAST() and avoided the CASE statements and ALTER column types. Cleaner code and less steps.

yaswanthteja • Oct 13 '22 • Edited

Hi Aaron thanks for your suggestion, i'm just started MySql .

DEV Community

8 Week SQL Challenge: Case Study #2 Pizza Runner

Introduction

Table Relationship

Case Study Questions

Data Cleaning and Transformation

Solution

A. Pizza Metrics

B. Runner and Customer Experience

C. Ingredient Optimisation

D. Pricing and Ratings

Top comments (2)

Read next

Elixir: Concurrency & Fault-Tolerance

🗂️ Monorepo vs. Polyrepo: Choosing the Right Strategy for Your Projects 🚀

How to Become a Solana Developer: A Comprehensive Guide

Axios Or Fetch in NextJs