DEV Community

Franck Pachot for AWS Heroes

Posted on

PostgreSQL: An example of using WITH clauses to decompose the problem

Here is a simple example using WITH clauses (aka CTE - Common Table Expressions) to decompose a problem. A question from the YugabyteDB forum was about storing hotel room bookings in a table like:


create table Rooms (PK int primary key, name text, roomNo int);

create table BookedRooms (PK int primary key
, checkInDate timestamptz, checkOutDate timestamptz
, roomId int references Rooms(PK))
;

insert into Rooms values (2,'Deluxe',102),(3,'King',103);
insert into BookedRooms values (1,'2022-05-26T00:00:00Z','2022-05-29T00:00:00Z',2),(2,'2022-05-29T00:00:00Z','2022-05-30T00:00:00Z',3);
Enter fullscreen mode Exit fullscreen mode

Question

My suggestion was the following query:

with 
input_month as (
 select 4 as month --> the input months (can be a parameter in a prepared statement)
),
day_generator as (
select 
 date_trunc('year', now())      --> I guess you query for the current year
 + m.month * interval '1 month' --> first day of the month
 + n * interval '1 day'         --> adding 31 days to cover any month
 as day
 from input_month m , generate_series(0,31) n
),
days_of_month as (
select *
from input_month m, day_generator d
 where d.day <                  --> removes the days in next month
 date_trunc('year', now())  + (m.month + 1) * interval '1 month' 
),
result as (
select distinct d.day, r.roomNo 
from days_of_month d, BookedRooms b join Rooms r on (b.roomId=r.PK)
where not(d.day between b.checkInDate and b.checkOutDate) --> remove non available rooms
order by day, roomno
) 
select * from result

Enter fullscreen mode Exit fullscreen mode

I start by defining the input parameters as input_month. It can be passed as parameter, but I find it more readable to have it at only one place, in the start of the query, rather than referenceing $1in multiple places

Then I need to generate all days for a month and I do it in two steps. day_geneator generates the maximum (31 days a months) starting from the input_month. And days_of_month filters to get only the days of the mons (for months with less than 31 days).

Finally, with all that, the query is a cartesian join between the dates and the bookings, filtering on the date interval to show only those available. I've added a DISTINCT in case there are overlaps, but this should not happen and should be checked with constraints. This may be a topic for a next blog post.

I put the final query in result so that the main select is the simplest one, this makes it easy to debug by selecting each step.

I find many advantages in decomposing into CTE:

  • easier to read and comment each step
  • easier to process by the query planner (possibility to materialize or not each query block)
  • easier to test, just by changing the final select Do not fear large SQL queries as long as they are well designed. You can think of it like piping Linux commands one after the other to resolve a problem with small steps.

Latest comments (0)