Hello, coders! 💻
Today's quiz taken from a real world task is SQL related.
Database structure
There are two tables (very simplified for the sake of this quiz):
-
companies
with two columns:id
andname
-
users
with two columns:company_id
andis_pro
(boolean)
Task
write a query to find all companies that only have pro users (
is_pro=1
).
My take on this
There you have it! How would you have done it? 📝 Edit Click to see my solution using MySQL
SELECT
c.name as company
FROM
users u
JOIN companies c ON
c.id = u.company_id
GROUP BY
c.id
HAVING
SUM(IF(is_pro, 0, 1)) = 0;
IF(is_pro, 0, 1)
means that I give 0 points to users that are pros (counter-intuitive but bear with me!). If all users are pros then the sum of points should be 0, right? That's what HAVING IF(is_pro, 0, 1)) = 0
does.
As pointed out by @geraldew using SUM
to COUNT
is hacky and as such I'd better use COUNT
and CASE
:
SELECT
c.name as company,
COUNT(
CASE WHEN can_use_app = 1 THEN 'x'
END
) isPro,
COUNT(
CASE WHEN can_use_app = 0 OR can_use_app IS NULL THEN 'x'
END
) isNotPro
FROM
users u
JOIN companies c ON
c.id = u.company_id
GROUP BY
c.id
HAVING
isPro > 0 and isNotPro = 0;
Happy coding! ⌨️
Top comments (7)
I don't know which SQL dialects have such an IF function (none that I use do), so my question is: what happens when is_pro is null?
What dialect do you use?
IF
has been available in MySQL since at least v5.1(released in 2008!).I should have mentioned MySQL in my solution, I'll edit.
in SQL Server you have
IIF
for instance.If
is_pro
isnull
then it's not considered truthy soIF
will return1
, which is what we want.Official MySQL documentation on IF
Well my question comes more from the fact that there is no IF in the SQL standard, so any article not clearly saying it is about a particular dialect perhaps should not use it.
The standard has the CASE expression for that purpose (note: expression, not function).
But to answer your question, most of my work has been with Teradata, HiveQL, Impala and occasionally PostgreSQL and SQLite. In some of these, there is an "IF" construct but it is used outside of SQL statements to provide a kind of conditional execution that standard SQL itself lacks.
An obvious caveat is to say that all vendors vary the dialect and that's probably a good thing overall and has been part of the evolution of SQL.
The issue of what functions should do when encountering a Null is .. debatable - and really is a whole other topic. My personal preference is for Nulls to cascade except where a construct is explicit about handling them.
Similarly I don't ever write SQL code that counts by summing the number 1 - even though I see that done a lot. When I'm investigating problems, code that does this is an immediate red flag.
And having said all that, I probably should back that up with code. A first cut of how I'd tackle it (i.e. just air code as I'm now at home) would be:
Where for clarity I've shown the subcounts as columns that are then quoted by name in the HAVING clause. As each is only used once, they could just be expressed in the HAVING clause and thus not output as columns.
Also, partly to make it clear - both to myself and a later reader of the code - I've put in the parts that would help deal with Nulls. This particularly helps when other requirements might prompt a change from using an INNER JOIN to using a LEFT OUTER JOIN, in which case you really do get a Null for the is_pro column via non-matches of the outer join. All the more reason to always be explicit about Null handling with SQL.
p.s. FWIW I don't personally find my
THEN 'x'
to be elegant but any non-null value will do there. I don't use the number1
precisely to avoid confusions with places where people might use the SUM of1
approach.Thanks for your valuable input.
It's indeed more portable to use
COUNT
in conjunction withCASE
. I wrote that query as a one-off, for stats, after wondering how many companies only had pro users. It's not production code. That being said, I'll keepCOUNT
andCASE
in mind!In my use case,
NULL
should be treated the same as0
so I would write it as:Ah yes, because you said
is_pro
was Boolean I just used that. Which is ironic as many data engines and their SQL dialects don't have a Boolean type at all, including the ones I mostly use. The joys of a language standard that is largely observed in the breach.Yeah by Boolean I meant 0 or 1 (up to you to decide if
null
is allowed or not).Boolean type in MySQL is sugar syntax for
TINYINT(1)
.I’ll be more precise next time!