DEV Community

loading...
Cover image for SQL CROSS JOINs

SQL CROSS JOINs

sneeu profile image Johnß ・2 min read

I found a use for CROSS JOIN in SQL recently when generating content for a pivot table.

Given a fixtures table:

CREATE SEQUENCE IF NOT EXISTS fixtures_id_seq;

CREATE TABLE "public"."fixtures" (
  "id" int4 NOT NULL DEFAULT nextval('fixtures_id_seq'::regclass),
  "start_time" TIMESTAMPTZ,
  "sport" VARCHAR(40),
  PRIMARY KEY ("id")
);
Enter fullscreen mode Exit fullscreen mode

The SQL is something similar to the below:

WITH sports AS (
  SELECT
    DISTINCT sport
  FROM
    fixtures
  ORDER BY
    sport
),
days AS (
  SELECT
    day
  FROM
    generate_series(
      date_trunc('day', '2020-09-09'::date)::date,
      date_trunc('day', '2020-09-21'::date)::date,
      '1 day'::interval
    ) AS day
),
fixtures_per_day AS (
  SELECT
    date_trunc('day', start_time) AS day,
    sport,
    COUNT(*) AS c
  FROM
    fixtures
  GROUP BY
    date_trunc('day', start_time),
    sport
)
SELECT
  days.day,
  sports.sport,
  COALESCE(fixtures_per_day.c, 0) AS fixture_count
FROM
  days
  CROSS JOIN sports
  INNER JOIN fixtures_per_day ON (
    days.day = fixtures_per_day.day
    AND sports.sport = fixtures_per_day.sport
  )
Enter fullscreen mode Exit fullscreen mode

There are three CTEs (Common Table Expressions) which are like sub-queries but can be re-used. I’ve used them here to keep the query tidy.

The first “table”, sports just gathers us a list of each of the sports in a fixtures table. Next, days generates a sequence of dates from 9/Sep/2020 to 21/Sep/2020. Finally fixtures_per_day create a table with the count of events on any given day for any given sport.

Our query then CROSS JOINs the days and sports “tables” to create a cartesian product, a maths-y name for the result of the following Python code:

result = []

for day in days:
    for sport in sports:
        result.append((day, sport))
Enter fullscreen mode Exit fullscreen mode

Once we have the CROSS JOIN we can LEFT JOIN on the fixtures_per_day to get something which would make sense in a pivot table or chart without having to fill in blanks by hand. We use the COALESCE function to make sure that any NULL rows become 0.

The results for some test data look like the following:

day sport fixture_count
2020-09-12 00:00:00+01 EPL 3
2020-09-12 00:00:00+01 NFL 0
2020-09-12 00:00:00+01 SPL 6
2020-09-13 00:00:00+01 EPL 2
2020-09-13 00:00:00+01 NFL 3
2020-09-13 00:00:00+01 SPL 0
2020-09-14 00:00:00+01 EPL 2
2020-09-14 00:00:00+01 NFL 1
2020-09-14 00:00:00+01 SPL 0
2020-09-15 00:00:00+01 EPL 0
2020-09-15 00:00:00+01 NFL 0
2020-09-15 00:00:00+01 SPL 0

Discussion

pic
Editor guide