Hey! I'm Katie, a product data scientist here at Hubble 👩🏼💻
Something I've noticed since joining Hubble a little less than two months ago is how much SQL I have stuck in my head. These are queries that I find myself reusing over and over again at every place I work and once I used them once, I couldn't forget them.
So, I thought it might be worth sharing these queries in case it's useful for others 🤗 And that's exactly what I'm going to do in this series of blog posts.
I'm going to go through some of the most common SQL queries that I whip up when digging into a product's performance. I'll also talk a bit about how you can use this data to help you make better product decisions.
So expect to get some insight into building:
- event funnels
- gapless date grids
- tables that let you do time-series analyses on your users
Once you've built these common queries, you can use Beacon to save them so all of your team can access them without ever having to write the SQL. You can also use Beacon to share the results from your shiny, new queries right in Slack! You can check out Beacon here.
Being a "data driven" company can feel like a pipe dream at times, but hopefully these blog posts can help your team make it a reality 💪.
One of the most useful bits of data I dig up when first launching a product comes from an event funnel.
An event funnel is a tool that lets you measure the completion rate of a series of steps (events). For example, you might create a funnel of the steps you need to go through to create an account, and then track the completion rate of each step.
With these funnels, you're able to see which steps are the biggest drop-off points in the user journey. This information is really useful, because:
- You can make data-informed decisions about what parts of the product you need to change to increase completion of an action (instead of relying only on your intuition).
- You're able to estimate the impact that your change made on the completion rate because you can compare the funnel before + after the change.
How we used event funnels to drive higher engagement with Beacon.
Once we launched it, we wanted to understand how users were engaging with the product. Specifically, we wanted to figure out why people weren't running a query with Beacon after installing it.
So, we have an action ("execute a query") that we're trying to optimize the completion rate for, and we know the steps leading up to that action. Enter...the event funnel.
I built an event funnel for users starting at "I installed the app" and going to "I executed a query". I used our awesome tool, Beacon, to run the query and share the results directly with the team in Slack:
This event funnel makes it really clear that the biggest drop off point is between steps 3 and 4: seeing the
connect modal (i.e. the screen giving instructions about how to connect to a database) and successfully getting a
So, we reached out to some of the users who got stuck at this step and brainstormed some improvements that we could make to help reduce the friction in getting a database connected. Basically, the event funnel showed us what part of the product we should be looking to change if we wanted to drive the highest increase in user engagement.
To be able to build a funnel, you need to generate a table with one row per step in your funnel, and then a count of the number of users who made it to that step.
Once you've figured out the steps that you want in your funnel and the order that you want them in, you can use this SQL template below to build your event funnel. Our team loves seeing these funnels now, it drives a lot of our work. We use Beacon to share the latest event funnel data in our team Slack every morning before stand up.
I tapped out at 3 events, but you can follow the pattern in this template and add more events as additional sub-queries (these are actually called "CTEs") to fulfil your funnel needs!
To run a query using this event funnel, you'll need to make sure you've got a few thing sorted:
user_idfield needs to be the identifier for your users (or the unit that you're counting in the funnel - e.g.
team_idif you want to know the rate of conversion for teams doing something)
eventfield should be the name of the step that you're referencing in each sub-query (CTE).
timestampfield should represent the time when the event occurred.
- You'll need to change the
FROM project_name.dataset_name.event_name_1to reference the table where your event data is stored for each given step.
I've commented out some code at the bottom which lets you specify a range of dates for the funnel (see:
IF YOU WANT TO SPECIFY A RANGE OF DATES, YOU CAN DO THAT HERE). You can add it in if you need it, but it's not required for the query to work! 🙂
WITH event_1 AS ( SELECT user_id, -- identifier of the thing you're counting in your funnel 'get_in_rocket' AS event, -- name of the step in your funnel MIN(timestamp) AS timestamp_first -- timestamp of when the event occurred FROM `project_name.dataset_name.event_name_1` -- table where your event data is stored for step 1 GROUP BY 1, 2 ), event_2 AS ( SELECT s2.user_id, 'push_button' AS event, MIN(timestamp) AS timestamp_first FROM `project_name.dataset_name.event_name_2` AS s2 INNER JOIN event_1 s1 ON s1.user_id = s2.user_id WHERE s2.timestamp >= s1.timestamp_first GROUP BY 1,2 ), event_3 AS ( SELECT s3.user_id, 'blastoff' AS event, MIN(timestamp) AS timestamp_first FROM `project_name.dataset_name.event_name_3` AS s3 INNER JOIN event_2 s2 ON s2.user_id = s3.user_id WHERE s3.timestamp >= s2.timestamp_first GROUP BY 1,2 ), all_events AS ( SELECT CONCAT('01: "', event, '"') as step_name, user_id, timestamp_first FROM event_1 UNION ALL SELECT CONCAT('02: "', event, '"') as step_name, user_id, timestamp_first FROM event_2 UNION ALL SELECT CONCAT('03: "', event, '"') as step_name, user_id, timestamp_first FROM event_3 ) SELECT step_name, COUNT(user_id) AS num_users_making_it_to_step, ROUND(COUNT(user_id) * 100 / (MAX(COUNT(user_id)) OVER(PARTITION BY NULL)), 2) AS percent_of_users_making_it_to_step FROM all_events LEFT JOIN event_1 USING(user_id) -- IF YOU WANT TO SPECIFY A RANGE OF DATES, YOU CAN DO THAT HERE -- This is the "From" date: -- WHERE event_1.timestamp_first >= '2020-12-01' -- This is the "To" date: -- AND event_1.timestamp_first <= '2020-12-03' GROUP BY 1 ORDER BY 1 ASC
Once you've installed Beacon into your Slack workspace and you've connected to your database, just type
/run-query, then your SQL query and hit enter.
Once your query is posted in Slack, you can hit
Save Query to make the SQL query for your event funnel available to everyone else in your team!
And that's it!