Banjo Obayomi

Posted on Feb 22, 2023

How I used GPT-3 to Build 1,000 AWS Quiz Questions

#ai #chatgpt #aws

As demand for AWS-certified professionals continues to grow, having a solid understanding of the services and architecture is becoming increasingly essential. However, preparing for certification exams can be challenging, especially when trying to find relevant and engaging study materials.

In this blog post, I'll show you how I leveraged GPT-3 and the AWS Well-Architected Framework to create a quiz experience that significantly improves on the one I made in 2020.

This time around, I was able to generate 1,000 scenario-based multiple choice questions, each accompanied by an explanation for the correct answer and additional resources for further study.

Compared to my previous quiz, which was based on a knowledge graph and offered limited question variety, this new approach allowed me to create sophisticated questions that test your understanding of real-world scenarios within each domain of the exam guide.

The release of ChatGPT showed that generative AI is ready, and my vision can finally be implemented.

To accomplish this, I'll walk you through the following steps:

Data Collection
Creating Text Embeddings
Prompt Engineering
Question Generation
Creating the Quiz

You can try out the quiz here

And the GitHub repo with the code here

Data Collection

I started by using Selenium and BeautifulSoup to scrape the HTML pages of the Well-Architected Framework, and stored all the text from each page in a CSV. I also extracted the title and URL of each page so they could be referenced later.

Here is some of the code used to gather data for each page.

from bs4 import BeautifulSoup
from selenium import webdriver

# Assumption: the post does not show how `browser` is created;
# any Selenium WebDriver instance works here.
browser = webdriver.Chrome()


def get_page_info(url):
    """Return the URL, title, and text sections of one documentation page."""
    browser.get(url)
    html = browser.page_source
    # Have soup parse the website
    soup = BeautifulSoup(html, "html.parser")

    # Get title
    title = soup.find("title").string

    main_article = soup.find(id="main-col-body")  # main text of article
    # Get text sections
    text_list = [p.text for p in main_article.find_all("p")]

    # Get info in tables: header cells first, then data cells
    for table in main_article.find_all("table"):
        text_list.extend(th.text for th in table.find_all("th"))
        text_list.extend(td.text for td in table.find_all("td"))

    return {"url": url, "title": title, "sections": text_list}
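Storing the scraped pages in a CSV could look something like the sketch below. It assumes each page is a dictionary with "url", "title", and "sections" keys, as returned by get_page_info, and writes one row per text section. The helper name write_pages_csv, the column names, and the sample data are illustrative, not taken from the original repo.

```python
import csv
import io


def write_pages_csv(pages, out_file):
    """Write one CSV row per text section, keeping the page URL and title."""
    writer = csv.writer(out_file)
    writer.writerow(["url", "title", "text"])  # hypothetical column names
    for page in pages:
        for section in page["sections"]:
            writer.writerow([page["url"], page["title"], section])


# Stand-in data shaped like the dictionary returned by get_page_info
sample_pages = [
    {
        "url": "https://example.com/page",
        "title": "Example page",
        "sections": ["First paragraph.", "Second paragraph."],
    }
]

# An in-memory buffer keeps the sketch side-effect free;
# swap it for open("pages.csv", "w", newline="") to write a real file.
buffer = io.StringIO()
write_pages_csv(sample_pages, buffer)
print(buffer.getvalue())
```

Keeping the URL and title on every row makes it easy to link a generated question back to its source page later.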

Data Cleaning

After collecting the raw data, I did some light cleanup to normalize the text and to split up any sections over 5,000 tokens, since OpenAI's models have an upper limit on the number of tokens they can process. I also removed text that had fewer than 13 tokens, as there were a bunch of snippets that contained only service names such as "Amazon S3" with no context around what the service does.
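Here is a minimal sketch of that cleanup step. The 5,000 and 13 token thresholds come from the post, but everything else is an assumption: the function names are made up, and whitespace splitting stands in for the model's real tokenizer, so the counts are only approximate.

```python
MAX_TOKENS = 5000  # upper bound per section, from the post
MIN_TOKENS = 13    # sections shorter than this are dropped, from the post


def count_tokens(text):
    """Rough token count. Assumption: whitespace splitting approximates the real tokenizer."""
    return len(text.split())


def split_section(text, max_tokens=MAX_TOKENS):
    """Split a section into chunks of at most max_tokens (approximate) tokens."""
    words = text.split()
    return [
        " ".join(words[i:i + max_tokens])
        for i in range(0, len(words), max_tokens)
    ]


def clean_sections(sections, min_tokens=MIN_TOKENS, max_tokens=MAX_TOKENS):
    """Normalize whitespace, split oversized sections, and drop very short ones."""
    cleaned = []
    for section in sections:
        normalized = " ".join(section.split())  # collapse runs of whitespace
        for chunk in split_section(normalized, max_tokens):
            if count_tokens(chunk) >= min_tokens:
                cleaned.append(chunk)
    return cleaned


# A bare service name, a short paragraph, and an oversized section
sections = ["Amazon S3", "word " * 20, "token " * 12000]
result = clean_sections(sections)
print([count_tokens(chunk) for chunk in result])
```

In this sketch the bare "Amazon S3" entry is dropped, the short paragraph passes through, and the oversized section is split into three chunks.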

Creating Text Embeddings

Next, I created text embeddings for each of the pages using OpenAI's embeddings API.

Text embeddings measure the relatedness of text strings. Embeddings are commonly used for:

Search (where results are ranked by relevance to a query string)
Clustering (where text strings are grouped by similarity)
Recommendations (where items with related text strings are recommended)
Anomaly detection (where outliers with little relatedness are identified)
Diversity measurement (where similarity distributions are analyzed)
Classification (where text strings are classified by their most similar label)

An embedding is a vector (list) of floating point numbers. The distance between two vectors measures their relatedness. Small distances suggest high relatedness and large distances suggest low relatedness.
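To make the idea of distance concrete, here is a small sketch using cosine distance. The choice of metric is an assumption (the text above only says "distance"), and the three-dimensional vectors are made up for illustration; real embeddings have far more dimensions.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity; smaller values mean more closely related vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


# Made-up vectors standing in for the embeddings of three text strings
s3_bucket = [0.9, 0.1, 0.0]
object_storage = [0.8, 0.2, 0.1]
vpc_subnet = [0.0, 0.2, 0.9]

print(cosine_distance(s3_bucket, object_storage))  # small distance: related
print(cosine_distance(s3_bucket, vpc_subnet))      # large distance: unrelated
```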

This OpenAI Notebook provides a full end-to-end example of creating text embeddings.

Here is an example of what the text looks like as an embedding: a list of 1,536 numbers that represents the text.

With text embeddings, we can now search all of the text based on an input query. For example, querying "Designing VPC architectures with security components" returns a list of the documents whose text is relevant to the query.
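An embedding search of this kind can be sketched as a ranking by similarity. The snippet below assumes the query and the documents have already been embedded; the page titles and vectors are hypothetical stand-ins, and the search function is an illustration rather than the code from the repo.

```python
import math


def normalize(vector):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    length = math.sqrt(sum(x * x for x in vector))
    return [x / length for x in vector]


def search(query_embedding, documents, top_k=2):
    """Return the top_k (title, score) pairs ranked by similarity to the query."""
    query = normalize(query_embedding)
    scored = []
    for title, embedding in documents.items():
        doc = normalize(embedding)
        score = sum(q * d for q, d in zip(query, doc))
        scored.append((title, score))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical page titles with made-up embeddings
documents = {
    "Protecting networks": [0.9, 0.3, 0.1],
    "Storage selection": [0.1, 0.9, 0.2],
    "Controlling traffic at all layers": [0.8, 0.2, 0.3],
}
query_embedding = [1.0, 0.2, 0.1]  # stand-in for the embedded query string

for title, score in search(query_embedding, documents):
    print(f"{score:.3f}  {title}")
```

The text of the top-ranked documents is what later gets injected into the prompt as context.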

Prompt Engineering

Prompt engineering is about designing prompts that elicit the most relevant and desired response from a Large Language Model (LLM) such as GPT-3.

Crafting these prompts is an art that many are still figuring out, but a rule of thumb is that the more detailed the prompt, the better the outcome.

I first started out with a simple prompt:

Generate a multiple choice question that can appear on a AWS Certified Solutions Architect - Associate Exam

The responses varied in quality because the prompt had no context. Here's an example of one:

Q1. Which of the following is a feature of Amazon EC2?
A. Automated scaling
B. Automated backups
C. Automated patching
D. Automated deployment

Answer: A. Automated scaling. Explanation: Amazon EC2 provides automated scaling, which allows users to scale their compute resources up or down based on their application needs.

Here is where prompt engineering can shine. I need to "overfit" my prompt so I can get the desired outcome of a question that is derived from the study material.

For this quiz, I want the questions to cover the specific knowledge areas tested on the exam. To do that, I inject the relevant content and a random scenario into the prompt, and ask for the data to be returned in a JSON format so it can be used in different applications.

Question Generation

To construct my prompt, I made up around 20 different scenarios, such as "A healthcare company wants to store and process sensitive patient information in the cloud." I then ran a text embedding search on a topic that will show up on the exam, such as "Determining when to federate a directory service with IAM roles", and injected the text from the retrieved documents to make a robust prompt:

Generate a scenario-based multiple-choice question for the AWS Certified Solutions Architect Associate Exam using the provided scenario, context, and knowledge area. 

The response must be returned in the specified JSON format with nothing else. 

There should be one correct answer and 3 incorrect answers. The incorrect answers should be response options that a candidate with incomplete knowledge or skill might choose. 

Provide an explanation for the answer to each question as well. The question must be about a scenario, and not a simple definition question such as What type of storage is Amazon S3. 

The answers must also be action-oriented and not just the name of a service.

Scenario: A healthcare company wants to store and process sensitive patient information in the cloud.

Context:
Require identities to dynamically acquire temporary credentials. For workforce identities, use AWS IAM Identity Center (successor to AWS Single Sign-On)....

Knowledge Area: Determining when to federate a directory service with IAM roles

JSON Format: {"question": "","answer_choices": [{"answer": "","is_correct": "","explanation": ""},{"answer": "","is_correct": "","explanation": ""},{"answer": "","is_correct": "","explanation": ""},{"answer": "","is_correct": "","explanation": ""}]}
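Assembling a prompt like the one above from its parts could be sketched as follows. The build_prompt helper and the constant names are hypothetical, and the instruction text is shortened to its first sentence to keep the example small; the real prompt includes the full set of instructions shown above.

```python
import json

# Shortened instructions; the full prompt contains several more sentences
INSTRUCTIONS = (
    "Generate a scenario-based multiple-choice question for the AWS Certified "
    "Solutions Architect Associate Exam using the provided scenario, context, "
    "and knowledge area."
)

# Empty template mirroring the JSON format requested in the prompt
ANSWER_TEMPLATE = {"answer": "", "is_correct": "", "explanation": ""}
JSON_FORMAT = json.dumps({"question": "", "answer_choices": [ANSWER_TEMPLATE] * 4})


def build_prompt(scenario, context, knowledge_area):
    """Combine the fixed instructions with the per-question inputs."""
    return "\n\n".join([
        INSTRUCTIONS,
        f"Scenario: {scenario}",
        f"Context:\n{context}",
        f"Knowledge Area: {knowledge_area}",
        f"JSON Format: {JSON_FORMAT}",
    ])


prompt = build_prompt(
    scenario="A healthcare company wants to store and process sensitive patient information in the cloud.",
    context="Require identities to dynamically acquire temporary credentials.",
    knowledge_area="Determining when to federate a directory service with IAM roles",
)
print(prompt)
```

Keeping the instructions fixed and varying only the scenario, context, and knowledge area makes it straightforward to generate many prompts in a loop.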

Using OpenAI's Completions API with this prompt, I wrote a Python script that iterates through all 189 domains on the exam guide, combined with the example scenarios, to generate 1,000 questions.
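The generation loop might be sketched like this. To avoid guessing at the exact API call, the model request is passed in as a callable named complete, which is a hypothetical wrapper and not part of any library; fake_complete stands in for it so the sketch runs offline. Cycling through the knowledge areas until the target count is reached is also an assumption about how 189 areas turn into 1,000 questions.

```python
import itertools
import json
import random


def generate_questions(knowledge_areas, scenarios, complete, total, seed=0):
    """Cycle through the knowledge areas, pairing each with a random scenario."""
    rng = random.Random(seed)
    questions = []
    for area in itertools.islice(itertools.cycle(knowledge_areas), total):
        scenario = rng.choice(scenarios)
        # Simplified prompt; the real one also carries instructions and context
        prompt = f"Scenario: {scenario}\n\nKnowledge Area: {area}"
        raw = complete(prompt)  # hypothetical callable wrapping the completion request
        questions.append({"knowledge_area": area, "scenario": scenario, "raw": raw})
    return questions


def fake_complete(prompt):
    """Stand-in for the model call so the sketch runs without network access."""
    return json.dumps({"question": f"Placeholder for: {prompt[:30]}", "answer_choices": []})


areas = [
    "Determining when to federate a directory service with IAM roles",
    "Designing VPC architectures with security components",
]
scenarios = [
    "A healthcare company wants to store and process sensitive patient information in the cloud.",
]

questions = generate_questions(areas, scenarios, fake_complete, total=5)
print(len(questions), "questions generated")
```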

Here is an example question:

Which of the following is the best practice for granting consumers access to a healthcare company's AWS resources, while storing and processing sensitive patient information in the cloud?

A. Federate with AWS IAM roles to access AWS accounts
B. Use Amazon Cognito identity pools and assign them a set of temporary, limited privilege credentials to access AWS resources
C. Grant consumers access to AWS resources using IAM users with long-term access keys
D. Grant consumers access to AWS resources using AWS Single Sign-On

Each question also comes with explanations and a link to the reference material.

Pitfalls

Not every question generated by GPT-3 was perfect. Sometimes the JSON did not have a question, or there were two correct answers or none at all, despite the prompt saying "There should be one correct answer and 3 incorrect answers." This required me to manually review and correct the data. Some of the questions may also simply be wrong, but pairing each explanation with the documentation gives us a way to verify the information.
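Some of these structural problems can be flagged automatically before the manual review. The sketch below checks a raw response against the JSON format from the prompt. The find_problems helper is hypothetical, and because the format leaves the type of is_correct open, the check accepts either booleans or strings such as "true" (an assumption).

```python
import json


def is_true(value):
    """Accept booleans or strings such as "true"/"True" for the is_correct field."""
    if isinstance(value, bool):
        return value
    return str(value).strip().lower() == "true"


def find_problems(raw_response):
    """Return a list of problems found in one generated question (empty if none)."""
    try:
        data = json.loads(raw_response)
    except json.JSONDecodeError:
        return ["response is not valid JSON"]
    if not isinstance(data, dict):
        return ["response is not a JSON object"]

    problems = []
    question = data.get("question") or ""
    if not str(question).strip():
        problems.append("missing question")

    choices = [c for c in (data.get("answer_choices") or []) if isinstance(c, dict)]
    if len(choices) != 4:
        problems.append(f"expected 4 answer choices, got {len(choices)}")

    correct = sum(1 for choice in choices if is_true(choice.get("is_correct")))
    if correct != 1:
        problems.append(f"expected 1 correct answer, got {correct}")
    return problems


# A made-up response with two answers marked correct and no question text
bad = json.dumps({
    "question": "",
    "answer_choices": [
        {"answer": "A", "is_correct": "true", "explanation": ""},
        {"answer": "B", "is_correct": "true", "explanation": ""},
        {"answer": "C", "is_correct": "false", "explanation": ""},
        {"answer": "D", "is_correct": "false", "explanation": ""},
    ],
})
print(find_problems(bad))
```

A check like this only catches malformed output; whether the marked answer is actually right still needs a human comparing it against the documentation.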

Creating the Quiz

I developed this quiz using my go-to tool for building web applications with Python, Streamlit. Streamlit allows builders to easily create interactive web apps that provide instant feedback on user responses. The quiz data is loaded from a JSON file and rendered using interactive widgets, making it simple for users to interact with the quiz. Additionally, I included links to documentation for each question, allowing developers to further explore the topics covered in the quiz.
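The feedback logic behind a quiz like this can be sketched independently of the UI. The snippet below is not the Streamlit app itself: it assumes the quiz data follows the JSON format from the prompt, uses placeholder question text, and introduces a hypothetical grade helper that a widget callback could call with the index of the selected answer.

```python
import json

# Placeholder quiz data in the same shape as the generated questions
QUIZ_JSON = """
[
  {
    "question": "Placeholder question text",
    "answer_choices": [
      {"answer": "Option A", "is_correct": "False", "explanation": "Why A is wrong."},
      {"answer": "Option B", "is_correct": "True", "explanation": "Why B is right."}
    ]
  }
]
"""


def grade(question, selected_index):
    """Return (is_correct, explanation) for the answer the user picked."""
    choice = question["answer_choices"][selected_index]
    # Assumption: is_correct may be a boolean or a string such as "True"
    correct = str(choice["is_correct"]).strip().lower() == "true"
    return correct, choice["explanation"]


quiz = json.loads(QUIZ_JSON)  # in the app, this would be read from the JSON file
for index in range(2):
    correct, explanation = grade(quiz[0], index)
    print("Correct!" if correct else "Incorrect.", explanation)
```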

Conclusion

In this blog post, I shared how I used GPT-3 and the AWS Well-Architected Framework to generate 1,000 scenario-based multiple choice questions for AWS certification exams.

I covered the entire process, from data collection to quiz creation, and provided insights into the challenges I faced along the way.

You can check out the quiz here, and use it as a study aid for your own AWS certification journey.

If you're interested in building your own GPT-3-powered applications, I hope this post has provided you with some helpful tips and guidance.

DEV Community
