My work includes a lot of database views because they work as a kind of public API in our case. Since I started talking about self-testing and utPLSQL, the MINUS-comparison has always had its place when assuring that a new, reconstructed view has the exact same content as the old one.
However, I did it wrong until recently when I discovered a major pitfall.
For I like to tell stories let’s start with a Star Wars scenario again: We have a table which contains all the Star Wars characters and another table which holds information in which of the “Episode”-movies they appeared.
We have the following sample data:
|1||Darth Vader||3, 4, 5, 6|
|2||Luke Skywalker||4, 5, 6, 7, 8|
Our current view is this:
create or replace view all_movie_characters as select sw_char.id, sw_char.name from star_wars_characters sw_char inner join appearance_in_episode ep on sw_char.id = ep.character_fk group by sw_char.id, sw_char.name;
However we just found a neat little function called
appears_in_movie(i_character_id) – it looks like we could make our view more readable by using it (don’t start an argument about performance, this is just a made-up example!):
create or replace view all_movie_character_2 as select sw_char.id, sw_char.name from star_wars_characters sw_char where appears_in_movie(sw_char.id) = 1;
Because we are very careful we didn’t just replace our initial view but gave it a new (very creative) name so we can compare both views and make sure they have the same content:
select * from all_movie_characters minus select * from all_movie_character_2 union all select * from all_movie_character_2 minus select * from all_movie_characters;
0 Rows returned. Great! We can ship this neat little improvement to production with confidence!
Never trust yourself. Let’s look at the
create or replace function appears_in_movie( i_character_id integer ) return integer result_cache as l_count integer; begin select count(*) into l_count from appearance_in_episode where character_fk = i_character_id and episode_no between 4 and 6; if l_count > 0 then return 1; else return 2; end if; end;
Ooops. Seems like the person who created this function had a different context and did only care about the characters who appear in the original 3 episodes 4 to 6! (I want to claim hereby that I completely endorse the episodes 7+ so far)
Results of selecting the new view:
| 1 | Darth Vader |
| 2 | Luke Skywalker |
But what happened? Why didn’t we notice that Rey is completely missing in the new view?
The problem is that
UNION ALL and
MINUS have the same precedence, which means that they are applied in the order of occurence. Here’s what happened:
Darth Vader Luke Skywalker Rey MINUS Darth Vader Luke Skywalker ----------------- Rey UNION ALL Darth Vader Luke Skywalker ---------------- Rey Darth Vader Luke Skywalker MINUS Darth Vader Luke Skywalker Rey --------------- 0 Rows.
To avoid this, we need to wrap the two
MINUS comparisons in parenthesis:
( select * from all_movie_characters minus select * from all_movie_character_2 ) union all ( select * from all_movie_character_2 minus select * from all_movie_characters );
1 Row returned: 3 Rey
Yes, the version on github is a completely commented, runnable example, as the previous ones.
But I want to experiment a bit with making the examples on the blog more enjoyable and approachable by adding some story and only highlighting the crucial parts.