The previous post presented Amazon Redshift use cases for business intelligence and machine learning.
I assume you have already created your AWS account; if not, you may open one by following the steps here.
Objective: In this tutorial, we will create an Amazon Redshift cluster using sample data that has already been loaded into Amazon S3.
You may follow along with this tutorial by downloading the Amazon Redshift Getting Started Guide.
Step 1: Sign back in to the Management Console with your email address and password: https://aws.amazon.com/console/
Step 2: In the Management Console, type 'Amazon Redshift' into the search bar, or open it from AWS Services if you have visited it recently.
Step 3: On the Amazon Redshift dashboard, click the orange Create cluster button.
Step 4: Choose Dashboard and click the orange Create cluster button.
Step 5: In the Cluster configuration section, enter a name in the Cluster identifier field. The name must use lowercase letters and may also contain digits and hyphens (-).
For this tutorial, enter 'redshift-cluster-1' as the cluster identifier.
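The naming rule from Step 5 can be checked before you type a name into the console. The sketch below is a hypothetical helper, not part of any AWS library, and the exact limits it encodes (1 to 63 characters, starting with a letter, no trailing hyphen, no two hyphens in a row) are my understanding of the Redshift naming constraints, so treat them as assumptions.

```python
import re

# Assumed constraints: 1-63 characters, lowercase letters, digits and
# hyphens only, and the first character must be a letter.
_IDENTIFIER = re.compile(r"[a-z][a-z0-9-]{0,62}")


def is_valid_cluster_identifier(name: str) -> bool:
    """Return True if `name` looks like an acceptable cluster identifier."""
    if not _IDENTIFIER.fullmatch(name):
        return False
    # Assumed extra rules: no trailing hyphen, no consecutive hyphens.
    return not name.endswith("-") and "--" not in name
```

For example, `is_valid_cluster_identifier("redshift-cluster-1")` passes, while a name containing uppercase letters is rejected.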
Step 6: If your organization has never created an Amazon Redshift cluster before, you are eligible for the Free Trial, which entitles you to 750 hours per month for two months.
If you are eligible for the Free Trial, select the Free Trial radio button.
Important note: when your Free Trial ends after two months, make sure you delete your cluster to avoid incurring large costs at the end of the month.
Under the Free Trial, create a cluster with the node type dc2.large.
After the node type is selected, load the sample data stored in the Amazon S3 bucket.
In Sample data, choose Load sample data to load the Tickit sample dataset into the default database, 'dev', which has a public schema.
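To see why the 750 hours per month quoted in Step 6 is enough for one node running continuously, here is a small arithmetic sketch. It assumes the allowance is counted in node-hours, which is my reading of the Free Trial rather than something stated in this tutorial; the function names are made up for illustration.

```python
FREE_TRIAL_HOURS_PER_MONTH = 750  # allowance quoted in Step 6


def node_hours(days_in_month: int, nodes: int = 1) -> int:
    """Hours consumed by `nodes` nodes running around the clock."""
    return days_in_month * 24 * nodes


def within_allowance(days_in_month: int, nodes: int = 1) -> bool:
    """True if continuous use stays inside the assumed monthly allowance."""
    return node_hours(days_in_month, nodes) <= FREE_TRIAL_HOURS_PER_MONTH
```

A single node running all day for a 31-day month uses 31 × 24 = 744 hours, which fits; two nodes would use 1,488 hours and would not.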
Step 7: In the Database configuration section, enter the details below:
- Admin user name: awsuser
- Admin user password: create a password
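If you prefer scripting to clicking, the console choices made so far map roughly onto the parameters of the `create_cluster` call in boto3, the AWS SDK for Python. This is a minimal sketch, not the method used in this tutorial: it assumes a single-node cluster, assumes your AWS credentials and region are already configured, and does not load the Tickit sample data, which is a console option. The password is whatever you pass in; nothing here chooses one for you.

```python
def build_create_cluster_request(admin_password: str) -> dict:
    """Keyword arguments for boto3's Redshift `create_cluster` call."""
    return {
        "ClusterIdentifier": "redshift-cluster-1",
        "ClusterType": "single-node",  # assumption: one node only
        "NodeType": "dc2.large",
        "DBName": "dev",
        "MasterUsername": "awsuser",
        "MasterUserPassword": admin_password,
    }


def create_cluster(admin_password: str) -> dict:
    """Send the request; this creates a real, billable cluster."""
    import boto3  # third-party SDK, imported here so the builder works without it

    client = boto3.client("redshift")
    return client.create_cluster(**build_create_cluster_request(admin_password))
```

Keeping the request builder separate from the call makes it easy to inspect the parameters before anything is created.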
Step 8: Click Close and then click Create cluster.
The cluster takes a few minutes to create, and you can check its status in the blue banner message.
After your cluster has been created successfully, the banner message turns green and the cluster shows a green tick with the status 'available'.
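The same status check can be done from code instead of watching the banner. The sketch below polls the `describe_clusters` call of a boto3 Redshift client until the cluster reports 'available'. The function name, polling interval, and attempt limit are my own choices for illustration; boto3 also offers a built-in `cluster_available` waiter that may suit you better.

```python
import time


def wait_until_available(client, cluster_id, poll_seconds=30, max_attempts=40):
    """Poll `describe_clusters` until the cluster status is 'available'.

    `client` is expected to be a boto3 Redshift client, for example
    boto3.client("redshift"). Returns False if the attempts run out.
    """
    for _ in range(max_attempts):
        response = client.describe_clusters(ClusterIdentifier=cluster_id)
        status = response["Clusters"][0]["ClusterStatus"]
        if status == "available":
            return True
        time.sleep(poll_seconds)
    return False
```

Passing the client in as an argument keeps the function easy to try out with a stand-in object before pointing it at a real account.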
On the right, you may download the JDBC or ODBC drivers to connect to third-party business intelligence tools such as Tableau.
Step 9: Check the properties of your cluster
You may resize the cluster.
After the cluster is created, click the 'Query data' button to open the Amazon Redshift query editor.
Step 1: From the query editor, you may view the tables of the Tickit sample dataset by choosing the cluster, the default 'dev' database, and the public schema.
In the query editor, connect to the database and choose your cluster name in the tree-view panel. You will be prompted to enter the database user name and password.
Enter the database details you set earlier (the admin user name and password).
Step 2: Connect to the database from the query editor
Step 3: Expand the tables
Step 4: Run SQL queries
Try entering the sample queries provided here in the query editor.
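To get a feel for the kind of query you can run in Step 4, here is a local sketch that uses Python's built-in sqlite3 module as a stand-in for Redshift. The `sales` and `users` tables are cut down to a few columns that I believe match the Tickit schema (`buyerid`, `qtysold`, `userid`, `firstname`, `lastname`), and the rows are made up for illustration; they are not the real sample data.

```python
import sqlite3

# Top buyers by total tickets bought; the same shape of query can be
# pasted into the Redshift query editor against the Tickit tables.
TOP_BUYERS_SQL = """
SELECT u.firstname, u.lastname, SUM(s.qtysold) AS total_quantity
FROM sales AS s
JOIN users AS u ON u.userid = s.buyerid
GROUP BY u.userid, u.firstname, u.lastname
ORDER BY total_quantity DESC
LIMIT 10;
"""


def demo_top_buyers():
    """Run the query against a tiny in-memory stand-in database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE users (userid INTEGER PRIMARY KEY, firstname TEXT, lastname TEXT);
        CREATE TABLE sales (salesid INTEGER PRIMARY KEY, buyerid INTEGER, qtysold INTEGER);
        -- Made-up rows, not taken from the Tickit dataset.
        INSERT INTO users VALUES (1, 'Ada', 'Example'), (2, 'Ben', 'Sample');
        INSERT INTO sales VALUES (1, 1, 2), (2, 2, 4), (3, 1, 3);
        """
    )
    rows = conn.execute(TOP_BUYERS_SQL).fetchall()
    conn.close()
    return rows
```

On the real cluster you would skip the Python wrapper and paste only the SQL into the query editor.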
Additional resources on page 31 of the guide will help you navigate the data warehouse efficiently, and page 27 explains how to delete resources you no longer need so that you avoid unexpected costs 👀
- Create an Amazon S3 bucket
- Load your data into Amazon Redshift using the COPY command
- Create a table
- Use the query editor to run SQL queries
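For the COPY item in the list above, the statement has a fairly regular shape, so it can be assembled from a few values. This is a sketch under assumptions: the helper name is made up, and the bucket path and IAM role ARN in the usage note are placeholders, not real resources. Only pass values you control, since the helper does plain string formatting.

```python
def build_copy_statement(table, s3_uri, iam_role_arn, region, delimiter="|"):
    """Return a Redshift COPY statement that loads a delimited file from S3."""
    return (
        f"COPY {table}\n"
        f"FROM '{s3_uri}'\n"
        f"IAM_ROLE '{iam_role_arn}'\n"
        f"DELIMITER '{delimiter}'\n"
        f"REGION '{region}';"
    )
```

For example, `build_copy_statement("users", "s3://example-bucket/tickit/allusers_pipe.txt", "arn:aws:iam::123456789012:role/ExampleRedshiftRole", "us-west-2")` produces a statement you could paste into the query editor once the table exists and the role has read access to the bucket.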