loading...
Cover image for How to add IF-ELSE logic to SQL queries

How to add IF-ELSE logic to SQL queries

helenanders26 profile image Helen Anderson Updated on ・3 min read

CASE is SQL’s answer to other programming languages IF-ELSE statement.

If you need to test a condition and return a value, CASE is the expression you need.


Introduction
Syntax
CASE with one condition
CASE with multiple conditions
Limitations and gotchas


Introduction

In this example, I’m going to load a subset of the top-selling Sega Genesis games dataset into SQL Server.


Alt Text


Syntax

Each ‘WHEN’ tests to see if a condition has been met. If it has, it returns the ‘RESULT’, if not it moves on to check the next condition.

If none of the specified conditions is met it returns what we specify after ‘ELSE’.

Every CASE expression needs to finish with ‘END’.

An optional extra is to add a column name to our result.


case
    when condition1 then result1
    when condition2 then result2
    when condition3 then result3
    else result end
as columnname;

This makes a lot more sense with examples, so let's get started with our Sega Genesis dataset.


CASE with one condition

In our first example, we are using CASE to test one condition. We want to create a new column that shows if a game is from the Sonic series or not.

When the game name contains the string ‘Sonic’ the expression returns ‘Y’, if not it returns ‘N’.

The last step is to create the column name ‘Sonic Series’


select 
  game,
  year,
  case 
    when game like 'sonic%' then 'Y'
    else 'N'
    end 
  as 'Sonic Series'
from 
  sega_sales
order by 
  'Sonic Series' desc

Alt Text


CASE with multiple conditions

In this example, we are using CASE to test multiple conditions. We want to create a new column that shows if Global sales are low, mid or high.

  • When the games Global sales are greater than or equal to 4.00 the expression returns ‘high’, if not it moves on to the next step.

  • If the games Global sales are less than or equal to 3.99 and greater than 2.00 the expression returns ‘mid’.

  • If the games Global sales do not meet either of these conditions the expression returns ‘low’.

The last step is to create the column name ‘Sales Range’


select
  game, 
  global, 
  case 
    when global >= 4.00 then 'high' 
    when global > 2.00 and global <= 3.99 then 'mid' 
    else 'low' 
    end
  as 'Sales Range' 
from
  sega_sales

Alt Text


Limitations and gotchas

The CASE Expression is powerful and makes evaluating conditions simple. However, there are some limitations to be aware of:

  • CASE expressions cannot check for NULLs.
  • The expression works sequentially in the order specified so when a condition is satisfied it will stop.

Having said that, the CASE expression is flexible and can be used in ORDER BY, views, aggregates with HAVING and to UPDATE data. If you need to perform IF-ELSE conditions in SQL, give it a try.


Read more


Posted on by:

helenanders26 profile

Helen Anderson

@helenanders26

Making applications go faster at Raygun, AWS Data Hero, and tag moderator on Dev.to.

Discussion

pic
Editor guide
 

COALESCE is handy too depending on the nature of your if-else and your data. Since it takes in multiple arguments and returns the first non-null value, it's good for "if this field is null then give me that other field". Good write up though, thanks!

 

Oh yeah!
Also
COALESCE(NULLIF(...

 

Thank you for shaing, this is interesting; however you can do this, but its a bad idea. This is very anti-design. If you need analytical continuance on union joins, use stored functions with a reducer algorithm. Using sql in this fashion is very slow and does not utilize indices.. However, if your in a jam and need to say inject a SQL program into a foreign DB without permissions,, ya i would use this.. I literary spent 6 months trying to rip this stuff out of a database and into java store functions.

 

Thanks for the tips.

I should have mentioned that I'm looking at this from a data analysts point of view so the use case is writing a query on the database directly to pull back a dataset.

 

Unfortunately is a power that analysts often abuse (cases with many condition in long queries), my suggestion is usually to try to put the case logic in a function. It can be maintained more easily, the query is shorter and less prone to errors

 

Gotcha. Have you ever tried using XQuery and eXist?eXist

exist-db.org/exist/apps/doc/xquery

its designed more for queries with logic.. BY integrating logic operands into SQL your kinda fighting against alot of the algorithms that perform optimizations. They have some tools which let you write advanced DA style queries and it spits out java JPA SQL magic that you just dumb into your functions module..

That db is very fast and can work on very large datasets.. Its commonly used in the ontological auto classification field..

great example you post up before..

Im interested in seeing some benchmarks on using cases directly in SQL. I don't think anyone has put effort into a JPerormace like website but for SQL.. You know that website where it tells you if your function is fast.

GL;HC; :)

You may find that this depends very much on the platform. I feel I can attest from experience that on Teradata the optimisation definitely favours the CASE expression as long as it doesn't get so overly complex as to blow the memory of the distribution element (Teradata has its own internal terminology which I won't quote).

In that vein, while I wouldn't defend it, I've certainly seen CASE expressions with a hundred WHEN clauses implementing the quirky logic combinations that come from bureaucratic classifications. There are times when any attempt to abstract that to a function or a table join would be even more unwieldy and introduce more overheads than the CASE expression.

great tools come with great responsibility. :) Thank you for that reply, useful.. :)

CASE statements will perform fine on most SQL RDBMS engines. It is much better to have these rules in a view or stored procedure than embedded in client code. Even better, create Rules tables that treat the metadata as data and use JOINs in place of CASE statements.

I have tried to avoid speaking in terms of absolutes, but here is one that comes close:
Almost never embed SQL queries that call dedicated back end database server objects in modules of an imperative language like JAVA.

Well, there's an irony there. If we set aside performance issues for a moment, the advantage of moving the conditional evaluation from a "hard coded" CASE expression to "rules tables" (I'd probably call that "control tables" personally) is that you can then conveniently change this by editing the data instead of recoding.

Except, often the editing of that data requires building more infrastructure and arranging access rights, whereas once the CASE expression is in a view or stored procedure, usually whoever had access to write the view or stored procedure can readily replace it - even (gasp) in production.

Personally I'm with you though, my ideal is to write/build analytical systems where the code (SQL) is comprehensive and settled, and all behaviour modification is done by editing control data. Alas I usually seem to be alone in that ambition.

 

Excellent post! I always follow your articles.
I believe that it's possible to use CASE expressions to check for NULL. The following statement should execute properly.

SELECT game, 
       CASE WHEN year IS NULL THEN '2020' 
            ELSE year 
       END AS year
  FROM sega_sales
 

Sorry to nitpick but you're leaving a gap between 3.99 and 4.00, which will thereby cascade into the ELSE clause and be miscategorised as 'low'.

The cascade of tests in a CASE express allow for each boundary to be given only once, which is the neatest way to avoid such a slip.

 SELECT
     game, 
     global, 
     CASE 
         WHEN global >= 4.00 THEN 'high' 
         WHEN global >= 2.00 THEN 'mid' 
         WHEN global >= 0.00 THEN 'low' 
         ELSE '!!!' 
         END AS 'Sales Range' 
 FROM
     sega_sales
 ;

And yes, I've also snuck in a common practice of mine, the ELSE that you don't really expect to ever occur. The idea is that when it does occur*, you will notice and you will then work out what you will need to do about presumptions being wrong. In this case it would pose the question: "are my ranges still meaningful if negative values show up". A similar argument could lead to adding an extra clause in front of the first:

         WHEN global >= 40.00 THEN '###' 

.. where I've arbitrarily applied a factor of 10 as "game changer" detector.

In a away, this is definitely me nit-picking some code, but, to be frank I also think this is a way of showing how useful CASE expressions are in regular data analysis work, and a good habit to ingrain early on.

Where * = Gerald's (half tongue in cheek) Law, that in data work, everything happens (sometime).

 

There is also the possibility to write cases without else, that will return null. Sometimes it can be useful, for example sqlfiddle.com/#!9/ae6d108/5