Wendy Wong for AWS Community Builders

Posted on Nov 8, 2021 • Edited on Mar 7, 2022

Build a data warehouse quickly with Amazon Redshift - Part 2

#aws #database #cloud #tutorial

The previous post presented Amazon Redshift use cases for business intelligence and machine learning.

Pre-requisite ✨

I assume you have already created your AWS account. If not, you may open an account by following the steps here.

Let's start with a quick warm up 🙆‍♀️🙆‍♂️

👇

Tutorial 1: Create an Amazon Redshift cluster using sample data

Objective: In this tutorial, we will create an Amazon Redshift cluster using sample data that has already been loaded into Amazon S3.

You may follow this tutorial by downloading the Amazon Redshift Getting Started Guide.

Step 1: Sign back in to the Management Console with your email address and password: https://aws.amazon.com/console/

Step 2: In the Management Console, type Amazon Redshift in the search bar, or open it from AWS Services if you have visited it recently.

Step 3: On the Amazon Redshift dashboard click the orange button Create cluster

Step 4: Choose dashboard and click the orange button Create cluster

Step 5: In the Cluster configuration section, provide a name in the Cluster identifier field. The name must use lowercase letters and may also contain digits and hyphens (-).

For this tutorial, enter the name 'redshift-cluster-1' in the Cluster identifier field.
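If you want to check a name before typing it into the console, the naming rule can be sketched in a few lines of Python. This is only an illustration: the helper name `is_valid_cluster_identifier` is made up for this post, and the rules encoded here (1 to 63 characters, lowercase letters, digits and hyphens, starting with a letter, no trailing or doubled hyphen) are my reading of the AWS documentation, so treat the console's own validation as the final word.

```python
import re

# Assumed rules: starts with a lowercase letter, then lowercase letters,
# digits, or single hyphens; never ends with a hyphen; at most 63 characters.
_IDENTIFIER_RE = re.compile(r"[a-z](?:-?[a-z0-9])*")


def is_valid_cluster_identifier(name: str) -> bool:
    """Return True if `name` looks like an acceptable cluster identifier."""
    return len(name) <= 63 and _IDENTIFIER_RE.fullmatch(name) is not None
```

With this sketch, 'redshift-cluster-1' passes, while names such as 'Redshift-Cluster' (upper case) or 'cluster-' (trailing hyphen) are rejected.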

Step 6: If your organization has never created an Amazon Redshift cluster before, you will be eligible for the Free Trial, which entitles you to 750 hours per month over a 2-month period.

If you are eligible for the Free Trial, select the Free Trial radio button.

Important note: when your free trial ends after 2 months, make sure you delete your cluster to avoid incurring large costs at the end of the month.

Under this Free Trial, create a cluster with a node type dc2.large.
After the node type is selected, load the sample data stored in the Amazon S3 bucket.

In Sample data, choose Load sample data to load the sample dataset Tickit to the default database called 'dev' which has a public schema.

Step 7: In the Database configuration section, enter the details below:

Admin user name: awsuser
Admin user password: create a password
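For readers who prefer scripting, the console choices in Steps 5 to 7 map roughly onto the `create_cluster` call of the boto3 Redshift client. The sketch below is not part of the Getting Started Guide. The function names are hypothetical, `ClusterType="single-node"` is my assumption about the Free Trial configuration, and the Free Trial and Load sample data options are console choices that this call does not reproduce. Running `create_cluster` needs boto3 installed and AWS credentials configured, and it creates a billable resource.

```python
def build_create_cluster_params(identifier: str, admin_password: str) -> dict:
    """Collect the tutorial's console choices as create_cluster keyword arguments."""
    return {
        "ClusterIdentifier": identifier,
        "NodeType": "dc2.large",       # node type chosen in Step 6
        "ClusterType": "single-node",  # assumption: a one-node cluster
        "DBName": "dev",               # default database used in this tutorial
        "MasterUsername": "awsuser",   # admin user name from Step 7
        "MasterUserPassword": admin_password,
    }


def create_cluster(identifier: str, admin_password: str) -> dict:
    """Send the request to AWS; requires boto3 and configured credentials."""
    import boto3  # imported here so the helper above works without boto3

    client = boto3.client("redshift")
    params = build_create_cluster_params(identifier, admin_password)
    return client.create_cluster(**params)
```

Keeping the parameters in a separate helper makes it easy to review what will be requested before anything is sent to AWS.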

Step 8: Click close and then click Create Cluster.

It will take a few minutes for the cluster to be created and you can check the status in the blue banner message.

After your cluster has been created successfully, the banner message will turn green and the cluster will show a green tick with the status 'available'.
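The same status shown in the banner can also be read from a script. Below is a minimal sketch, assuming a boto3 Redshift client created with `boto3.client("redshift")` is passed in as `client`. The two function names are hypothetical, while `describe_clusters` and the `cluster_available` waiter are existing boto3 features.

```python
def cluster_status(client, identifier: str) -> str:
    """Return the status string of one cluster, e.g. 'creating' or 'available'."""
    response = client.describe_clusters(ClusterIdentifier=identifier)
    return response["Clusters"][0]["ClusterStatus"]


def wait_until_available(client, identifier: str) -> None:
    """Block until the cluster reports 'available', using boto3's built-in waiter."""
    client.get_waiter("cluster_available").wait(ClusterIdentifier=identifier)
```

For example, `cluster_status(client, "redshift-cluster-1")` should return 'available' once the green tick appears in the console.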

On the right, you may download the JDBC or ODBC drivers to connect third-party business intelligence tools such as Tableau.
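Tools that connect over JDBC usually ask for a connection URL. As a rough illustration, the helper below builds one in the `jdbc:redshift://endpoint:port/database` form. The function name is hypothetical, 5439 is the default Redshift port, and the endpoint is something you copy from your own cluster's properties page.

```python
def jdbc_url(endpoint: str, database: str = "dev", port: int = 5439) -> str:
    """Build a Redshift JDBC URL from the cluster endpoint host name."""
    return f"jdbc:redshift://{endpoint}:{port}/{database}"
```

Pass the host name part of your cluster endpoint (without the port and database suffix) as `endpoint`.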

Step 9: Check the properties of your cluster

You may resize the cluster.

Tutorial 2: Use the query editor to run example queries in SQL

After the Amazon Redshift cluster has been created, click the 'Query data' button.

Step 1: From the query editor, you may view the sample dataset table Tickit by choosing the cluster, default dev database and public schema.

In the query editor, connect to the database and choose a cluster name in the tree-view panel. You will be prompted to enter the database user name and password.

Enter the admin user name and password you set in the Database configuration section.

Step 2: Connect to the database from the query editor

Step 3: Expand the tables

Step 4: Run SQL queries

Try using the query editor to enter sample queries provided here
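If you would rather run a query from code than from the query editor, the Redshift Data API is one option. The sketch below is an illustration only: `run_query` and `SAMPLE_SQL` are names made up for this post, the query assumes the `sales` table of the Tickit sample dataset with its `eventid` and `qtysold` columns, and `client` is expected to be a boto3 client created with `boto3.client("redshift-data")`. It reads only the first page of results.

```python
import time

# Illustrative query against the Tickit sample data: top five events by tickets sold.
SAMPLE_SQL = (
    "SELECT eventid, SUM(qtysold) AS tickets_sold "
    "FROM sales GROUP BY eventid ORDER BY tickets_sold DESC LIMIT 5;"
)


def run_query(client, sql, cluster="redshift-cluster-1", database="dev",
              db_user="awsuser", poll_seconds=1.0):
    """Submit a statement through the Data API and return its first page of records."""
    statement_id = client.execute_statement(
        ClusterIdentifier=cluster, Database=database, DbUser=db_user, Sql=sql
    )["Id"]

    # The Data API is asynchronous, so poll until the statement finishes.
    while True:
        status = client.describe_statement(Id=statement_id)["Status"]
        if status == "FINISHED":
            break
        if status in ("FAILED", "ABORTED"):
            raise RuntimeError(f"Query ended with status {status}")
        time.sleep(poll_seconds)

    return client.get_statement_result(Id=statement_id)["Records"]
```

Each record comes back as a list of typed fields, for example `{"longValue": 10}`, so you may want to unpack them before printing.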

Additional Reading

Additional resources on page 31 will help you navigate the data warehouse efficiently, and page 27 explains how to delete resources you no longer need so that you avoid unexpected costs 👀
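Clean-up can also be scripted. The following is a minimal sketch, assuming a boto3 Redshift client is passed in as `client`. The function name and the '-final' snapshot suffix are hypothetical, and `delete_cluster` with `SkipFinalClusterSnapshot` is the existing boto3 call. Deleting a cluster is permanent, so double-check the identifier first.

```python
def delete_cluster(client, identifier: str, keep_final_snapshot: bool = False) -> str:
    """Delete a cluster and return the status AWS reports, typically 'deleting'."""
    kwargs = {
        "ClusterIdentifier": identifier,
        "SkipFinalClusterSnapshot": not keep_final_snapshot,
    }
    if keep_final_snapshot:
        # A final snapshot needs a name; this suffix is only an example.
        kwargs["FinalClusterSnapshotIdentifier"] = f"{identifier}-final"
    response = client.delete_cluster(**kwargs)
    return response["Cluster"]["ClusterStatus"]
```

Skipping the final snapshot is the default here because the tutorial only uses sample data. Keep a snapshot if the cluster holds anything you care about, bearing in mind that stored snapshots can also incur charges.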

Next Tutorial: Bringing in your own data to Amazon Redshift

Create an Amazon S3 bucket
Load your data into Amazon Redshift using the COPY command
Create a table
Use the query editor to run SQL queries

Happy Learning!
