Alejandra Quetzalli 🐾 for AWS

Posted on Mar 14, 2020 • Edited on Apr 23, 2020

Have you heard about Materialized Views in Amazon Redshift?

#aws #beginners #sql #tutorial

Materialized Views helps improve performance of analytical workloads such as dashboarding, queries from BI (Business Intelligence) tools, and ELT (Extract, Load, Transform) data processing. Materialized Views store the pre-computed results of queries and maintain them by incrementally processing latest changes from base tables. Future queries referencing these Materialized Views can then use the pre-computed results to run🏃🏻‍♀️much faster.

You can create Materialized Views based on one or more source tables by using filters, projections, inner joins, aggregations, grouping, functions, etc. (👉🏽Read more in the docs, here.)

What customer problem does Materialized Views solve?🧐

In a data warehouse (system used for reporting and data analysis) environment, applications often perform complex queries on large tables. A common example would be using a SELECT statement to perform multiple-table joins and aggregations (process where data is collected and presented in summarized format) on tables that contain billions of rows. Due to the complexity and large volume of data, processing these queries can be very time-consuming!

Enter Materialized Views in Amazon Redshift.🙌🏽

A Materialized View stores the result of the SELECT statement that defines the Materialized View. You can then issue a SELECT statement to query the Materialized View, in the same way that you query other tables or views in the database. When you query the Materialized View, you’re now querying that pre-computed result, that was based on an SQL query over one or more base tables. The difference is that now Amazon Redshift can process the query based on the pre-computed data stored in the Materialized View, without having to process the base tables at all!😅 This is a win🏆, because now query results are returned much faster compared to when retrieving the same data from the base tables.

How can I create and manage Materialized Views?👩🏻‍💻📊

First, let me point you to the docs that detail SQL commands used to create and manage Materialized Views...

CREATE MATERIALIZED VIEW
REFRESH MATERIALIZED VIEW
DROP MATERIALIZED VIEW

Second, let’s walk through a basic example on how to CREATE Materialized Views and REFRESH it after data ingestion. For the purpose of this blog post, I'm going to pretend you’ve already created and connected to your cluster. Let’s run some queries on the AWS Management Console with the Redshift query editor...

I CREATE a simple table with the following SQL command, making sure I hit the run button for each individual query. (To see my results afterwards, I select the table base_table.)

CREATE table base_table (a int, b int);

Now I want to INSERT more values into our base_table.

INSERT into base_table values (1,2);

Time⏳ to CREATE our first Materialized View!😅

CREATE MATERIALIZED VIEW mv_test as (select a from base_table);

Ok, let’s INSERT a few more values into our base_table...

INSERT into base_table values (2,3);

Let's SELECT all our content from base_table.

SELECT * FROM base_table;

Now let's SELECT all our content from mv_test. (This command will retrieve our pre-computed results, which at this point are stale. The base_table has changed, but the changes have not yet been reflected.)

SELECT * FROM mv_test;

Let’s get this up to date. (A Materialized View stores the query result. When you query the Materialized View, you’re querying that pre-computed result.)

refresh materialized view mv_test;

Now let’s SELECT all our content from mv_test again...

SELECT * FROM mv_test;

And now we see it’s up to date! And this is what we call incremental maintenance.💁🏻‍♀️

What if I want to test a more complex scenario?👩🏻‍🔬🔬🥼🧪

Want to replicate a more detailed scenario to better appreciate how this new feature comes in handy?

Check out this example from our documentation. It walks you through the process of creating tables, uploading sample data, and querying the database in your Amazon Redshift cluster.

And lastly...

This new feature is available at no additional cost, in all regions where Amazon Redshift is available.☁️

¡Gracias por tu tiempo!
~Alejandra💁🏻‍♀️ y Canela🐾

Top comments (1)

Ashish Kaushal • Jun 21 '20

Hi Alejandra,

It was nice article, very helpful for me.
I have further question, should i need to run refresh command again and again whenever record ingest in table or is this one time activity if not than how can I scheduled it.
refresh materialized view mv_test

Thanks In advance