DEV Community

Cover image for Every SQL Join You’ll Ever Need
Katie
Katie

Posted on • Originally published at katiekodes.com on

Every SQL Join You’ll Ever Need

All right ... maybe not every join. But this cheat sheet helped me score 104% on my Database 101 midterm exam, so it must be worth something.

NOTE: There's a lot of markup I can't do on dev.to. So although I've let this post cross-post in its entirety, a lot of things I say in the text don't make sense anymore without my "pay attention to this!" background highlights, cross-links within the text, table of contents, etc.

I recommend reading it here, on my original blog, instead of on dev.to, and then coming back over to dev.to to discuss!


It took a bit of work to find my old school notes, because I hadn't even put "join" in the title. Instead, I called my notes "Set Operations In SQL." And that's really what this is all about, because it doesn't matter if you use SQL, Excel VLOOKUP, Python, or a pen and paper to compare one list of records to another. Combining one "set" of things with another, as a business problem, is math. Better yet, it's fun math, not hard math! It's Venn diagrams -- the kind of math that gets you doodling like you're in elementary school.

SQL code is useful to know when the two sets of records (each "record" representing some real-world entity, like a person, an invoice, or a country) that your business problem requires you to combine both exist within a single relational database management system (RDBMS), usually simply referred to as a "database." All such systems come with the ability to extract your data using the SQL programming language.

  • If you're a Python data analyst, be sure to learn some SQL so you can leverage its power against the databases where your data already resides.

  • If you're a Salesforce analyst or sysadmin, you've probably run into report and SOQL limitations. Perhaps you solved them by exporting two tables to Excel, then performing a VLOOKUP operation to connect them. To level up from there, you'll probably want to explore Python, not SQL -- unless your IT department is already backing up your Salesforce org's data to a traditional database. Then you'll want to know all about SQL, because it is going to save you a lot of time!

I do a lot of work in Oracle databases, so I find Oracle-flavored SQL the fastest to type and will use it as an example. You can run Oracle SQL code online if you sign up for a free account and try their Live SQL tool.

To translate this into another "flavor" of SQL, you may need to Google the keywords I've used, together with the name of your database. For example, when my full outer join command wouldn't work in a MySQL database, I Googled "'full outer join' mysql" and discovered that MySQL-flavored SQL doesn't include a full outer join command. Instead, you have to get clever combining smaller set operations (it's like carefully coloring in your Venn diagram one entire circle at a time, rather than running your crayon over the whole diagram at once).

Pro tip: Bookmark this article and treat it as a reference guide.

Whether you know a little SQL or a lot, use the Table of Contents at the top of this article to read one section at a time.

  • Though simple, this guide is long, so if you're learning, read about one "Join Type" per day.(Unless you have a long bus ride!)
  • If you know the ropes and just need to compare commands to each other, jump straight to the command that interests you.

Our Sample Data

Table "A" Contents (CONVENTION)

CONVENTION is a list of people who attended a convention and where they work.

(Yes, it's a very nosy event, asking its attendees for social security number.)

ssn name_lf ph em company
A1A1A1 Amjit, Anush 1111111 111@example.com Apple
B2B2B2 Borges, Benita 2222222 222@example.com Boiron
C3C3C3 Combs, Cathy 3333333 333@example.com CVS
D4D4D4 Daher, Darweesh 4444444 444@example.com Dell
E5E5E5 Ellis, Ezra 5555555 555@example.com EDF
F6F6F6 Fulvia, Frances 6666666 666@example.com Firestone

Table "B" Contents (RESTAURANT)

RESTAURANT is a list of regular customers at a restaurant, their favorite food, and their age.

(Yes, it's a very nosy restaurant, asking its regular customers for social security number.)

  • Note the "ccc," rather than "333," email address for Cathy Combs.
  • Note the typo in the social security number for Frances Fulvia.
social name_fl phone email fav_food age
C3C3C3 Cathy Combs 3333333 ccc@example.com Carrots 33
D4D4D4 Darweesh Daher 4444444 444@example.com Doritos 44
E5E5E5 Ezra Ellis 5555555 555@example.com Endives 55
FFF666 Frances Fulvia 6666666 666@example.com Fries 66
G7G7G7 Grace Gao 7777777 777@example.com Garlic 77
H8H8H8 Helen Hopper 8888888 888@example.com Hummus 88

Code To Create This Data ("DDL" & "DML")

Here's code to create CONVENTION and RESTAURANT tables in an Oracle database and populate them with our sample data, if you want to try it yourself.

drop table CONVENTION;
create table CONVENTION ( 
    ssn     varchar2(6) not null 
    , name_lf   varchar2(30) not null 
    , ph        varchar2(7) not null 
    , em        varchar(15) not null 
    , company   varchar(15) not null 
);
insert into CONVENTION values ('A1A1A1', 'Amjit, Anush', '1111111', '111@example.com', 'Apple');
insert into CONVENTION values ('B2B2B2', 'Borges, Benita', '2222222', '222@example.com', 'Boiron');
insert into CONVENTION values ('C3C3C3', 'Combs, Cathy', '3333333', '333@example.com', 'CVS');
insert into CONVENTION values ('D4D4D4', 'Daher, Darweesh', '4444444', '444@example.com', 'Dell');
insert into CONVENTION values ('E5E5E5', 'Ellis, Ezra', '5555555', '555@example.com', 'EDF');
insert into CONVENTION values ('F6F6F6', 'Fulvia, Frances', '6666666', '666@example.com', 'Firestone');
drop table RESTAURANT;
create table RESTAURANT ( 
    social      varchar2(6) not null 
    , name_fl   varchar2(30) not null 
    , phone     varchar2(7) not null 
    , email     varchar2(15) not null 
    , fav_food  varchar2(10) not null 
    , age       int not null 
);
insert into RESTAURANT values ('C3C3C3', 'Cathy Combs', '3333333', 'ccc@example.com', 'Carrots', 33);
insert into RESTAURANT values ('D4D4D4', 'Darweesh Daher', '4444444', '444@example.com', 'Doritos', 44);
insert into RESTAURANT values ('E5E5E5', 'Ezra Ellis', '5555555', '555@example.com', 'Endives', 55);
insert into RESTAURANT values ('FFF666', 'Frances Fulvia', '6666666', '666@example.com', 'Fries', 66);
insert into RESTAURANT values ('G7G7G7', 'Grace Gao', '7777777', '777@example.com', 'Garlic', 77);
insert into RESTAURANT values ('H8H8H8', 'Helen Hopper', '8888888', '888@example.com', 'Hummus', 88);
Enter fullscreen mode Exit fullscreen mode

Who overlaps and how

Let's say we're a large corporation. We're so large that we run the convention and we own the restaurant. We want to look for "overlaps" between our two customer lists an analyze them accordingly.

Actually, that's all the context I'll give about "why" I chose the queries and output I chose. They make no business sense at all. I chose them to demonstrate what the different styles of SQL query can and can't do!

But back to the business of combining our sets:

As always when comparing "people" records from two different data sets, decisions must be made about what "proves" two records represent the same "real-world person." Do you match on name? Email? Social security number (remember, they are re-used after people die)? Some combination thereof?

For our examples, we'll look at two different ways of combining our data, just to flex our code muscles.

Before reading the code, be sure to look at CONVENTION and RESTAURANT and take note that:

  1. When we join on A.ssn=B.social, Cathy-Ezra (3-5) match
    • Anush, Benita, and "Frances w/ ssn F6F6F6" remain unique to CONVENTION
    • "Frances w/ ssn FFF666," Grace, & Helen remain unique to RESTAURANT
  2. When we join on A.ph=B.phone and A.em=B.email, Darweesh-Frances (4-6) match
    • Anush, Benita, and "Cathy w/ em 333@..." remain unique to CONVENTION
    • "Cathy w/ email ccc@...," Grace, & Helen remain unique to RESTAURANT

You're fully debriefed! Let's take a look at our first of 5 "join types."


Join Type: "In A"

Venn diagram with the left circle shaded, including the overlap

For this join type, we'd expect 6 rows of output (Anush-Frances -- all people from CONVENTION).

To the extent that any of these people "also exist" in RESTAURANT (as defined by the "matching criteria" we choose), we'd expect to see details about them from RESTAURANT.

LEFT OUTER JOIN

SQL Query, left outer join example 1/4

First, we'll match on social security number, so Cathy has details from "table B" but Frances doesn't.

select A.name_lf, A.company, B.name_fl, B.fav_food
from CONVENTION A
left outer join RESTAURANT B on A.ssn = B.social;
Enter fullscreen mode Exit fullscreen mode

Query Output, left outer join example 1/4

name_lf company name_fl fav_food
Amjit, Anush Apple
Borges, Benita Boiron
Combs, Cathy CVS Cathy Combs Carrots
Daher, Darweesh Dell Darweesh Daher Doritos
Ellis, Ezra EDF Ezra Ellis Endives
Fulvia, Frances Firestone

SQL Query, left outer join example 2/4

Next, we'll match on phone and email, so Frances has details from "table B" but Cathy doesn't.)

select A.name_lf, A.company, B.name_fl, B.fav_food
from CONVENTION A
left outer join RESTAURANT B on A.ph = B.phone and A.em = B.email;
Enter fullscreen mode Exit fullscreen mode

Query Output, left outer join example 2/4

name_lf company name_fl fav_food
Amjit, Anush Apple
Borges, Benita Boiron
Combs, Cathy CVS
Daher, Darweesh Dell Darweesh Daher Doritos
Ellis, Ezra EDF Ezra Ellis Endives
Fulvia, Frances Firestone Frances Fulvia Fries

SQL Query, left outer join example 3/4

Next, we'll try (incorrectly) to suppress name_fl and fav_food details for anyone younger than 50.

select A.name_lf, A.company, B.name_fl, B.fav_food
from CONVENTION A
left outer join RESTAURANT B on A.ssn = B.social
where B.age >= 50;
Enter fullscreen mode Exit fullscreen mode

Query Output, left outer join example 3/4

name_lf company name_fl fav_food
Ellis, Ezra EDF Ezra Ellis Endives

Note

Even though "rows from B" were optional in our join, in adding a filter on B, we've effectively turned this "left join" into an "inner join" where our data must appear in both tables.

What we actually did was filter out the entire record of anyone who didn't have an age (because they aren't in table B) or was younger than 50.

This is a common mistake. We'll fix it in the next example.

In math, when you subtract numbers from each other, the order in which you subtract them matters.

Similarly, it's important to think about the order in which you "join" and "filter" data as you solve your business problems with SQL.

SQL Query, left outer join example 4/4

Now we'll actually suppress name_fl and fav_food details for anyone 50 or younger.

select A.name_lf, A.company, B.name_fl, B.fav_food
from CONVENTION A
left outer join (select * from RESTAURANT where age >= 50) B on A.ssn = B.social;
Enter fullscreen mode Exit fullscreen mode

Query Output, left outer join example 4/4

name_lf company name_fl fav_food
Amjit, Anush Apple
Borges, Benita Boiron
Combs, Cathy CVS
Daher, Darweesh Dell
Ellis, Ezra EDF Ezra Ellis Endives
Fulvia, Frances Firestone

Note

In our corrected query, we pre-filtered RESTAURANT by age before treating it as B, which fixed our problem from the previous example.

That was the first "join type." Let's take a look at the second.


Join Type: "In A But Not In B"

Venn diagram with only the portion of the left circle that does not overlap shaded

For this join type, we'd expect 3 rows of output (Anush+Benita+Cathy or Anush+Benita+Frances, depending on the "matching criteria" we choose).

Table B is, by definition, not part of the final output from your business question, so remember that all of the approaches below will only include "details" from table A.

NOT IN

SQL Query, not in example 1/3

not in is great when you're only matching on one column.

select A.name_lf, A.company
from CONVENTION A
where A.ssn not in ( select B.social from RESTAURANT B );
Enter fullscreen mode Exit fullscreen mode

Query Output, not in example 1/3

name_lf company
Amjit, Anush Apple
Borges, Benita Boiron
Fulvia, Frances Firestone

SQL Query, not in example 2/3

not in gets dicey when you need to match on multiple columns -- you have to concatenate multiple columns into a single column and responsibly handle blank/null values.

It is NOT PREFERRED for matching on multiple columns. Use NOT EXISTS instead.

This query is just for show!

select A.name_lf, A.company
from CONVENTION A
where nvl(A.ph,'')||'~'||nvl(A.em,'') not in ( select nvl(B.phone,'')||'~'||nvl(B.email,'') from RESTAURANT B );
Enter fullscreen mode Exit fullscreen mode

Query Output, not in example 2/3

name_lf company
Amjit, Anush Apple
Borges, Benita Boiron
Combs, Cathy CVS

SQL Query, not in example 3/3

Our third not in example isn't even a real join between tables: note that RESTAURANT isn't even in the query.

I just wanted to point out that this is typed the same way as the not in that you can use when adding a hand-typed value filter to a query.

select A.name_lf, A.company
from CONVENTION A
where A.ssn not in ( 'B2B2B2','XYZZY','E5E5E5' );
Enter fullscreen mode Exit fullscreen mode

Query Output, not in example 3/3

name_lf company
Amjit, Anush Apple
Combs, Cathy CVS
Daher, Darweesh Dell
Fulvia, Frances Firestone

<> ALL

This is exactly like not in, only you type <> all instead. Every example from the section on not in could have been written with <> all.

SQL Query

select A.name_lf, A.company
from CONVENTION A
where A.ssn <> all ( select B.social from RESTAURANT B );
Enter fullscreen mode Exit fullscreen mode

Query Output

name_lf company
Amjit, Anush Apple
Borges, Benita Boiron
Fulvia, Frances Firestone

MINUS

With minus, we can't output columns that aren't in our "match criteria."

That said, such limited output might suffice if our business problem is a simple "dummy check" list of "all values that appear only in A."

SQL Query, minus example 1/2

select A.ssn
from CONVENTION A
minus
select B.social
from RESTAURANT B;
Enter fullscreen mode Exit fullscreen mode

Query Output, minus example 1/2

ssn
A1A1A1
B2B2B2
F6F6F6

SQL Query, minus example 2/2

select A.ph, A.em
from CONVENTION A
minus
select B.phone, B.email
from RESTAURANT B;
Enter fullscreen mode Exit fullscreen mode

Query Output, minus example 2/2

ph em
1111111 111@example.com
2222222 111@example.com
3333333 111@example.com

NOT EXISTS

USE THIS if you need to do "in A, but not B" with multi-column "matching criteria" in a complicated query!

where not exists (...) is the most useful syntax for writing a complex query with "matching criteria" that involve a multi-column match.

This approach supports "further filters" on A -- just include them as additional and and or clauses beyond the where clause responsible for the "matching" operation.

If the rest of your filters have or in them, be sure to use parentheses to ensure that you don't accidentally make your where not exists (...) optional.

SQL Query

SELECT A.name_lf, A.company
  FROM CONVENTION A
WHERE NOT EXISTS (
  SELECT NULL
    FROM RESTAURANT B
  WHERE A.ph = B.phone and A.em = B.email
);
Enter fullscreen mode Exit fullscreen mode

Query Output

name_lf company
Amjit, Anush Apple
Borges, Benita Boiron
Combs, Cathy CVS

LEFT OUTER JOIN ... B.... IS NULL

I can't think of a single reason to use the "LEFT OUTER JOIN plus B.joinColumn IS NULL" syntax in practice. It's completely overwrought.

But ... it's conceptually related to a common approach to "In A Or B, But Not Both" so it's an interesting mental exercise to make sure you understand.

It's also a good "cheat" pattern to be familiar with when you run into a report-writing tool that doesn't have a NOT EXISTS syntax, like Cognos Report Studio or Python+Pandas.

With this approach, it is vital to use the IS NULL filter for all columns of B that were involved in the LEFT OUTER JOIN operation.

Otherwise, if your data set has a lot of NULL values in it, you could get unexpected results.

This approach supports "further filters" on A -- just include them as additional and and or clauses beyond the where clause responsible for the "matching" operation.

If the rest of your filters have or in them, be sure to use parentheses to ensure that you don't accidentally make your where ... is null filter(s) optional.

SQL Query

select A.name_lf, A.company
from CONVENTION A
left outer join RESTAURANT B on A.ph = B.phone and A.em = B.email
where B.phone is null
and B.email is null;
Enter fullscreen mode Exit fullscreen mode

Query Output

name_lf company
Amjit, Anush Apple
Borges, Benita Boiron
Combs, Cathy CVS

Congratulations getting through the second "join type." On to the third!


Join Type: "In A, B, Or Both"

Venn diagram with both circles and their overlap shaded

For this join type, we'd expect 9 rows of output (Anush-Helen, with Cathy or Frances doubled up depending on which "matching criteria" we choose).

UNION

SQL Query, union example 1/2

select A.ssn
from CONVENTION A
union
select B.social
from RESTAURANT B;
Enter fullscreen mode Exit fullscreen mode

Query Output, union example 1/2

ssn
A1A1A1
B2B2B2
C3C3C3
D4D4D4
E5E5E5
F6F6F6
FFF666
G7G7G7
H8H8H8
  • Do you see the doubled-up "Frances" data?

Note

As with minus, we can't output columns that weren't in our "match criteria."

A single column of output might suffice if our business problem is a simple "dummy check" list of "all possible values from either table."

SQL Query, union example 2/2

select A.ph, A.em
from CONVENTION A
union
select B.phone, B.email
from RESTAURANT B;
Enter fullscreen mode Exit fullscreen mode

Query Output, union example 2/2

  • Do you see the doubled-up "Cathy" data?

FULL OUTER JOIN

SQL Query, full outer join example 1/2

select A.name_lf, A.company, B.name_fl, B.fav_food
from CONVENTION A
full outer join RESTAURANT B on A.ssn = B.social;
Enter fullscreen mode Exit fullscreen mode

Query Output, full outer join example 1/2

name_lf company name_fl fav_food
Amjit, Anush Apple
Borges, Benita Boiron
Combs, Cathy CVS Cathy Combs Carrots
Daher, Darweesh Dell Darweesh Daher Doritos
Ellis, Ezra EDF Ezra Ellis Endives
Fulvia, Frances Firestone
Frances Fulvia Fries
Grace Gao Garlic
Helen Hopper Hummus
  • Do you see the doubled-up "Frances" data?

SQL Query, full outer join example 2/2

select A.name_lf, A.company, B.name_fl, B.fav_food
from CONVENTION A
full outer join RESTAURANT B on A.ph = B.phone and A.em = B.email;
Enter fullscreen mode Exit fullscreen mode

Query Output, full outer join example 2/2

name_lf company name_fl fav_food
Amjit, Anush Apple
Borges, Benita Boiron
Combs, Cathy CVS
Daher, Darweesh Dell Darweesh Daher Doritos
Ellis, Ezra EDF Ezra Ellis Endives
Fulvia, Frances Firestone Frances Fulvia Fries
Cathy Combs Carrots
Grace Gao Garlic
Helen Hopper Hummus
  • Do you see the doubled-up "Cathy" data?

MySQL fake FULL OUTER JOIN

This is the clever trickery I mentioned earlier to compensate for the fact that MySQL-flavored SQL doesn't have a full outer join command.

SQL Query

select A.name_lf, A.company, B.name_fl, B.fav_food
from CONVENTION A
left outer join RESTAURANT B on A.ssn = B.social
union
select A.name_lf, A.company, B.name_fl, B.fav_food
from CONVENTION A
right outer join RESTAURANT B on A.ssn = B.social;
Enter fullscreen mode Exit fullscreen mode

Query Output

name_lf company name_fl fav_food
Amjit, Anush Apple
Borges, Benita Boiron
Combs, Cathy CVS Cathy Combs Carrots
Daher, Darweesh Dell Darweesh Daher Doritos
Ellis, Ezra EDF Ezra Ellis Endives
Fulvia, Frances Firestone
Frances Fulvia Fries
Grace Gao Garlic
Helen Hopper Hummus
  • Do you see the doubled-up "Frances" data?

Over halfway there! Keep reading to learn about the fourth "join type."

This one has a lot of options, so it'll be a bit of a longer read.


Join Type: "In Both A And B"

Venn diagram with only the overlap of the two circles shaded

For this join type, we'd expect 3 rows of output (Cathy-Ezra or Darweesh-Frances, depending on the "matching criteria" we choose).

IN

SQL Query, in example 1/3

select A.name_lf, A.company
from CONVENTION A
where A.ssn in ( select B.social from RESTAURANT B );
Enter fullscreen mode Exit fullscreen mode

Query Output, in example 1/3

name_lf company
Combs, Cathy CVS
Daher, Darweesh Dell
Ellis, Ezra EDF

Note

  • in is great when:
    • you're only matching on one column, and
    • you don't need to see, or filter further on, any details from B

SQL Query, in example 2/3

select A.name_lf, A.company
from CONVENTION A
where nvl(A.ph,'')||'~'||nvl(A.em,'') in ( select nvl(B.phone,'')||'~'||nvl(B.email,'') from RESTAURANT B );
Enter fullscreen mode Exit fullscreen mode

Query Output, in example 2/3

name_lf company
Daher, Darweesh Dell
Ellis, Ezra EDF
Fulvia, Frances Firestone

Note

  • in gets dicey when you need to match on multiple columns -- you have to concatenate multiple columns into a single column and responsibly handle blank/null values. It is NOT PREFERRED for matching on multiple columns. Use exists instead. This is just for show!

SQL Query, in example 3/3

select A.name_lf, A.company
from CONVENTION A
where A.ssn in ( 'B2B2B2','XYZZY','E5E5E5' );
Enter fullscreen mode Exit fullscreen mode

Query Output, in example 3/3

name_lf company
Borges, Benita Boiron
Ellis, Ezra EDF

Note

  • This isn't actually a real join between tables: note that RESTAURANT isn't even in the query. I just wanted to point out that this is the same in that you can use when adding a manual value filter to a query.

= ANY

SQL Query

select A.name_lf, A.company
from CONVENTION A
where A.ssn = any ( select B.social from RESTAURANT B );
Enter fullscreen mode Exit fullscreen mode

Query Output

name_lf company
Combs, Cathy CVS
Daher, Darweesh Dell
Ellis, Ezra EDF

Note

This is exactly like in, only you type = any instead. All 3 examples from in apply.

INTERSECT

SQL Query, intersect example 1/2

select A.ssn
from CONVENTION A
intersect
select B.social
from RESTAURANT B;
Enter fullscreen mode Exit fullscreen mode

Query Output, intersect example 1/2

ssn
C3C3C3
D4D4D4
E5E5E5

Note

As with minus and union, we can't output columns that weren't in our "match criteria."

A single column of output might suffice if our business problem is a simple "dummy check" list of "all values that appear in both tables."

SQL Query, intersect example 2/2

select A.ph, A.em
from CONVENTION A
intersect
select B.phone, B.email
from RESTAURANT B;
Enter fullscreen mode Exit fullscreen mode

Query Output, intersect example 2/2

ph em
4444444 444@example.com
5555555 555@example.com
6666666 666@example.com

EXISTS

SQL Query

SELECT A.name_lf, A.company
  FROM CONVENTION A
WHERE EXISTS (
  SELECT NULL
    FROM RESTAURANT B
  WHERE A.ph = B.phone and A.em = B.email
);
Enter fullscreen mode Exit fullscreen mode

Query Output

name_lf company
Daher, Darweesh Dell
Ellis, Ezra EDF
Fulvia, Frances Firestone

Note

  • exists is better than in when your "matching criteria" involve multiple columns.
  • Like in, exists doesn't let you see, or further filter upon, details from B.

INNER JOIN

SQL Query

select A.name_lf, A.company, B.name_fl, B.fav_food
from CONVENTION A
inner join RESTAURANT B on A.ph = B.phone and A.em = B.email;
Enter fullscreen mode Exit fullscreen mode

Query Output

name_lf company name_fl fav_food
Daher, Darweesh Dell Darweesh Daher Doritos
Ellis, Ezra EDF Ezra Ellis Endives
Fulvia, Frances Firestone Frances Fulvia Fries

= {#equals}

The simple = approach to joining tables is extraordinarily widespread and often taught first in tutorials to joining tables with SQL.

It supports "further filters" -- just include them as additional and and or clauses beyond the where clause responsible for the "matching" operation.

If the rest of your filters have or in them, be sure to use parentheses to ensure that you don't accidentally make your "join filter(s)" optional.

Avoiding such concerns is one reason I slightly prefer inner join to the "=" approach.

I almost always use explicit inner join commands when I need to experiment to determine what query I want to write. It helps me avoid careless backspacing errors while I play.

SQL Query

select A.name_lf, A.company, B.name_fl, B.fav_food
from CONVENTION A, RESTAURANT B
where A.ph = B.phone
and A.em = B.email;
Enter fullscreen mode Exit fullscreen mode

Query Output

name_lf company name_fl fav_food
Daher, Darweesh Dell Darweesh Daher Doritos
Ellis, Ezra EDF Ezra Ellis Endives
Fulvia, Frances Firestone Frances Fulvia Fries

So close you can taste it. One more to go -- below is the fifth and final "join type."


Join Type: "In A Or B, But Not Both"

Venn diagram with the unique portions of the circles shaded, but not the overlap

For this join type, we'd expect 6 rows of output (Anush+Benita+Cathy1+Cathy2+Grace+Helen or Anush+Benita+Frances1+Frances2+Grace+Helen, depending on the "matching criteria" we choose).

There aren't any SQL commands dedicated to this join type, but take a close look at the Venn diagram representing it. Can you see the "bite out of the middle" of what's otherwise an "In A, B, Or Both" join? Your task is to write code that takes that "bite out of the middle."

(Side note: it is very difficult to make this Venn diagram as a sugar cookie that won't break when bringing treats for your Database 101 class's final exam. Making it in SQL is much easier!)

Choose your own adventure

"Cut out the middle" of your overlapping circles by designing a query according to this architecture:

  1. Build an "In A, B, Or Both" query using your favorite approach and think of it as your new A.
  2. Build an "In Both A And B" query using your favorite approach and think of it as your new B.
  3. Stitch together your new A and your new B using your favorite "In A But Not In B" approach.

FULL OUTER JOIN ... A.... IS NULL OR B.... IS NULL

With this approach, it is vital to use the IS NULL filter for all columns of A and for all columns of B that were involved in the FULL OUTER JOIN operation.

Otherwise, if your data set has a lot of NULL values in it, you could get unexpected results.

When you do this, "AND" together filters on columns from the same table, but "OR" together columns from different tables, as in the sample code below.

This approach supports "further filters" -- just include them as additional and and or clauses beyond the where clause(s) responsible for the "matching" operation.

Be sure to surround the entire ... or ... filters responsible that "take a bite out of the middle" in their own set of parentheses, to avoid unexpected behavior.

If the rest of your filters have or in them, use parentheses around them, too, to ensure that you don't accidentally make your "bite out of the middle" filter optional.

SQL Query

select A.name_lf, A.company, B.name_fl, B.fav_food
from CONVENTION A
full outer join RESTAURANT B on A.ph = B.phone and A.em = B.email
where 
    (
    A.ph is null
    and A.em is null
    )
or
    (
    B.phone is null
    and B.email is null
    );
Enter fullscreen mode Exit fullscreen mode

Query Output

name_lf company name_fl fav_food
Amjit, Anush Apple
Borges, Benita Boiron
Combs, Cathy CVS
Cathy Combs Carrots
Grace Gao Garlic
Helen Hopper Hummus

You made it! Can you feel how much smarter you are yet?


More Resources

If you liked this post, other links that may interest you are:

Top comments (2)

Collapse
 
helenanders26 profile image
Helen Anderson

Wow! Thank you for such an incredibly thorough post. This should be on everyone's reading list.

I'll be adding it as a link in my post for beginners as a logical next step to understanding how powerful joins can be.

Collapse
 
mbzmak profile image
MAK

Hi Katie, great article, thx for taking the time to share it.
Concerning the <> ALL syntax, in Oracle, it is possible to use it when matching multiple columns without having to concatenate them. It's done using sets :
... WHERE (COL_A1, COL_A2, COL_A3) <> ALL (SELECT COL_B1, COL_B2, COL_B3 FROM B)