DEV Community

Vivekanand Gujjeti
Vivekanand Gujjeti

Posted on

Querying Data in S3 with Amazon Athena

Architecture Diagram

Image description

Task Details

  1. Log into AWS Management Console.

    1. Setup workgroup
    2. Create a database in Glue.
    3. Query table in Athena

Task 1:-
i. Login to AWS Console

Task 2:-
i. Make sure you are in the N.Virginia Region.
ii. Navigate to menu in the top, then click on Athena.
iii. Click on the Get started button.

Image description
iv. In the menu bar, click on Workgroup:primary.

Image description

v. To create a workgroup, Click on the Create workgroup button.

Image description
vi. Provide details to create a workgroup:
- Workgroup Name: Enter WhizWorkgroup
- Description: Enter Workgroup for Athena lab
- Query result location: Select the S3 bucket, whose name is starting with whizlabs...

Image description
- Leave other settings as default.
- Click on the Create Workgroup button.

vii. Now it will list all the Workgroups

Image description
viii. Select WhizWorkgroup by checking on the option and click on the Switch workgroup button.

Image description
ix. Now it will show the opening page, click on Get started

Image description

x. Now you can verify your workspace.

Image description

Task 3: Create a database in Glue

 1. Make sure you are in the **<u>N.Virginia</u>** Region.
 2. Navigate to   menu at the top, then click on **<u>AWS Glue</u>** in 
    the   section.
 3. By default, you will be able to see the **<u>tables </u>**present. In the left sidebar, Under Data catalog, Click on **<u>Databases</u>**.
 4. Click on the **<u>ADD DATABASE</u>**
 5. In the pop-up menu, enter the database name as 
   **<u>whizgluedatabase </u>**and click on the **<u>Create </u>**button.
Enter fullscreen mode Exit fullscreen mode

Image description
6. The database is now created.

Image description
Task 4: Create a table in Glue

  1. In the left sidebar, Under Data catalog, Click on Tables.
  2. To create a table, click on the Add tables button and select Add table manually.

Image description

  1. In the Set up your table's properties section, do the following: • Enter the Table name as whiz-sample-table • select the Database, select whizgluedatabase • Click on the Next button.

Image description

  1. In the Add a data store section, do the following: • Select the type of source: S3 (default) • Data is located in: Specified path in my account (default) • Include path: Select the S3 bucket name starting with whizlabs... • Click on the Next button below to proceed further...

Image description

  1. In the Choose a data format section, do the following: • Select Classification as CSV • Choose delimiter as Comma: , • Click on the Next button.

Image description

  1. In the Define a schema section, we will add 2 columns. • Click on the ADD Column button. • Column name: Enter Expense_Type and Column Type: Select string • Click on the Add button below.

• Click on the ADD Column button again.
• Column name: Enter Expense_Category and Column Type: Select string
• Click on the Add button below.
• After adding both the columns, click on the Next button.

Image description

  1. Add partition indices: Leave everything as default and click on Next button.
  2. Review the configuration of the table and click on the Finish button.

The table is now created.

Image description

Task 5: Query table in Athena

  1. Make sure you are in the N.Virginia Region.
  2. Navigate to menu in the top, then click on Athena in the section.
  3. Click on the Get started button, if asked.
  4. In the left sidebar, under data source, Select the database as whizgluedatabase.
  5. Then you will see our table, whiz-sample-table.

  6. To preview the data of whiz-sample-table table, select the Preview table.

Image description

  1. Query editor will automatically generate the SQL statement for querying the first 10 columns.

Image description

  1. The result of the query is shown below.

Image description

  1. To get the results of all expenses types under expense_category of food, paste the following SQL statement into the query editor. • SELECT * FROM "whizgluedatabase"."whiz-sample-table" where expense_category = 'Food'; Note: To execute the queries through the keyboard directly use the shortcut Ctrl + Enter (For windows) or Tab + Enter (For Mac)
  2. You can play around with some of the queries like:
  3. Getting a total number of rows present by running the following SQL statement in the query editor. • SELECT count(*) FROM "whizgluedatabase"."whiz-sample-table";

Once you execute this command you will get the result.

Thank you.

Top comments (0)