This article is relevant to MYSQL. Other flavours of SQL (eg SQL Server , PostgreSQL , SQLite) may be slightly different, SQL Server in particular has Indexed Views which can improve performance.
Recently I have had the opportunity to trial several third party BI tools which integrate with my database. I tend to err on the side of caution and do not want to expose my table structure to these applications.
Some of these BI tools also allow users to create their own "Queries" with the tables, which can be confusing as they wont know the schema.
The common solution across their documentation for both issues is to create Views to simplify tables, hide joins etc.
While this sounds great it can be an absolute disaster performance-wise for your database.
What is a View ?
First off what exactly is a Database View ?
It is the result set of a stored query in the database server that can then be queried like a table.
Consider this example query to return actors and films they are in ( using the Sakilla sample Database ).
select
first_name , last_name, title,description,release_year
from actor a
inner join film_actor fa on fa.actor_id = a.actor_id
inner join film f on f.film_id = fa.film_id;
We can create a View from this Query by adding Create View 'View_name' as
to the start of the query like the below.
CREATE VIEW `actor_films` AS
select
first_name , last_name, title,description,release_year
from actor a
inner join film_actor fa on fa.actor_id = a.actor_id
inner join film f on f.film_id = fa.film_id;
We can then query this like we would a table like
select * from actor_films ;
Seems pretty straight forward so far right ?
Types of Processing Algorithms
There is an additional optional clause for the Create View statement - the Algorithm. There are three possible values for this - Merge
, Temptable
and Undefined
. This clause defines how MySQL will process the view query, and this is where your performance can be hit as hard as if hit by the hammer of Thor.
Merge
Lets say you need to need to query our new view - how exactly will MySQL do this ? With the merge algorithm it will simply merge your query with the query that creates the view.
Example - we add a where condition to query our view:
select *
from actor_films
where release_year > 2006 ;
This will be combined with the View Query as:
CREATE VIEW `actor_films` AS
select
first_name , last_name, title,description,release_year
from actor a
inner join film_actor fa on fa.actor_id = a.actor_id
inner join film f on f.film_id = fa.film_id
where release_year > 2006
One thing to be away of is that you cannot use the merge
algorithm if you have any of the following in your View Query:
1.Aggregate functions (SUM(), MIN(), MAX(), COUNT(), and so forth)
2.DISTINCT
3.GROUP BY
4.HAVING
5.LIMIT
6.UNION or UNION ALL
7.Subquery in the select list
8.Assignment to user variables
Trying to use any of these will force MySQL to use the below Temptable
Algorithm.
So when using the Merge
algorithm, the performance of your View is only as good as the query that creates it.
Temptable
With the Temptable
option MySQL will run the View Query and store the results in a temporary table. Your query will then be run against this table.
This can be absolutely dire for performance if you don't have a where clause that significantly narrows down the result set in the View Query.
Consider an example view that contains Customer Orders and Details. Without a date clause limiting the results to eg the most recent 2years worth of orders , you will get all orders for all time. If you then try to add a few clauses when querying this view you will essentially be running the query twice - once where the temporary table is created and then again when your query is used against it.
This can be a disaster when using large datasets.
Undefined
This is the default option set when the Algorithm hasn't been set in the create statement. It allows MySQL to automatically select the best of the Merge
and Temptable
options with a preference to use the Merge
option.
What ? more dangers ?!
So what happens when users start to join various Views together into new Views ? If they use the Merge
algorythmn then not much, but if they use Temptable
then you will start to take a hit again with performance as your performance issue begins to cascade.
And it gets worse. and worse. and worse.
Le Solutions
- Plan the Views correctly and avoid any queries that will use the
Temptable
algorithm. - If you can, use aggregate functions when querying the View rather than in the Create View Statement.
- Add
Where
conditions that will reduce the View result set.
And if you're feeling generous you can buy me a coffee with the link below ( and yes its all for coffee, I drink a copius amount of it while writing ☕ )
Top comments (1)
VIEWs in general, I've personally not liked. What I ended up doing was building a SQL query generator that handled all of the JOIN logic automatically, but also dynamically, so it could be re-used for different parts of the application easily. This also allowed us to update the "application" and the "view" (handled application side) in a single Git commit, rather than having a potentially inconsistent database vs application access. Everything being maintained this way helped reduce errors, and as you've described, helped prevent a ton of performance headaches. Using this dynamic SQL query generator, it was pushed to around 30 table JOINs, and could still crank out results in single to double digit milliseconds. Each page on the web app was also able to have its own minor tweeks to this virtual "VIEW" without worrying about effecting any other page!
You can check out the PHP code here: github.com/darkain/pudl