ChatGPT and other generative AI services have been around for less than a year, and they've already taken the world by storm. We see more and more use cases popping up every day on social media showing how these services solve a cool new problem. It's thrilling to see this surge of innovation appear seemingly overnight.
A couple weeks ago I wrote an article about how ChatGPT changed the way I write software. The article discussed some new use cases available at our fingertips and shared a couple of pragmatic examples to get your gears turning.
Today we're going to take it a step further. We're going to discuss how you can write software to take advantage of services like ChatGPT. To do that, let's walk step-by-step through an example: an unanswered question chat bot.
What We're Building
Whenever you go into a busy livestream or popular chatroom, you see hundreds of messages flying across the screen every minute. It's impossible to carry on a conversation, let alone get an answer to a question. Moderators do a decent job wading through questions, but that's a manual task relying on their judgment and ability to parse through dozens of messages a second. Questions get missed and wrong answers are given... a lot.
I want to fix that with an AI-powered bot that answers these questions. We'll call it the "no question left behind" bot. Here are the requirements:
- Answer any questions that have been asked but received no answer
- Answer any questions that have been asked but received an incorrect answer
- Tag the person who asked the unanswered or incorrect question in the response
- Do not respond to questions that were answered correctly, either by a human or the bot
- Run automatically on a one-minute interval
Before the introduction of generative AI, this project would have been next to impossible to build. But with proper prompt engineering, we can build it in just a couple of hours.
DISCLAIMER - I do not have the full source code available for this project. The purpose of this article is to show you how to structure prompts to generative AI services.
Setting Perspective
Have you ever told someone to "put their engineer hat" or their "product hat" on before asking them a question? This helps them get in the right frame of mind, or perspective, to answer your question.
When building a house, you'd get very different answers from the buyer, the architect, and the construction crew when you ask them to "describe the kitchen." To get the answer you're looking for, you need to ask the right person.
It's very much the same way with ChatGPT. When prompting the service, you can optionally give it a system role which sets the perspective of the model and directly affects the type of answer you get back.
But before we go any further, let's take a step back and look at the inputs for a chat completion using the OpenAI Node.js SDK.
const result = await openai.createChatCompletion({
  model: 'gpt-4',
  temperature: .7,
  messages: messages
});
This is the call we use to communicate with OpenAI, specifically the gpt-4 model, which we are all familiar with at this point. The temperature field indicates the "creativeness" of the answers the model provides. The higher the number, the more creative/wild the answers are; the lower the number, the more focused and deterministic they are. The API accepts values from 0 to 2, and I've found in general that a value of .7 will consistently provide quality results.
The messages field is where the power comes into play. You can pass entire conversations into this array, giving the model meaningful context (more on this later).
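For context, here's how the openai client used in that call might be initialized. This is a minimal sketch assuming the v3 flavor of the OpenAI Node.js SDK (the one that exposes createChatCompletion) and an API key stored in an environment variable:
// Minimal client setup, assuming the v3 OpenAI Node.js SDK
const { Configuration, OpenAIApi } = require('openai');

const configuration = new Configuration({
  apiKey: process.env.OPENAI_API_KEY
});
const openai = new OpenAIApi(configuration);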
To set the perspective of our call, we need to pass in a message with the following properties:
{
  "role": "system",
  "content": "You are a master trivia player. You provide quick, to-the-point answers to any and all questions you see. You aren't afraid to correct others when they are wrong."
}
By indicating a role of system, we set the perspective of the AI model. For our "no question left behind" bot, we want the perspective of someone who is really good at trivia and provides succinct answers. So we tell it exactly that.
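In code, that just means the system message goes in first when we build up the messages array. A minimal sketch of how that might look:
// Start the conversation with the system message to set the model's perspective
const messages = [
  {
    "role": "system",
    "content": "You are a master trivia player. You provide quick, to-the-point answers to any and all questions you see. You aren't afraid to correct others when they are wrong."
  }
];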
Context Setting
Now that we've told the AI model how we want it to approach our prompt, we need to give it some data to process. We're building a chat bot, so let's feed it an array of json objects representing the chat history. Our raw data would look something like this:
[
  {
    "username": "allenheltondev",
    "message": "Does anyone know what color you get when you mix purple and green?"
  },
  {
    "username": "andmoredev",
    "message": "definitely pink"
  },
  {
    "username": "astuyve",
    "message": "How many pounds are in a stone?"
  },
  {
    "username": "allenheltondev",
    "message": "Thanks. And how would you center a div in css?"
  },
  {
    "username": "astuyve",
    "message": "display: flex; align-items: center; justify-content: center;"
  }
]
There's a lot going on in that conversation. We have a clearly incorrect answer in there, a question that wasn't answered at all, and a question answered correctly. Our chat bot needs to correctly answer the color question, give an answer to the weight question, and skip over the CSS one. But before we do that, we need to provide the data to ChatGPT. To do that, we must add another message to the messages array.
const chatHistory = await getChatHistory(chatId);

const historyMessage = {
  "role": "user",
  "content": `Here is a chat history for the past minute. It's an array of json objects indicating the user that sent the message and what message they sent. ${JSON.stringify(chatHistory)}`
};

messages.push(historyMessage);
This will give ChatGPT the necessary information for it to do what we want. We still haven't asked it to do anything yet, but we've told it how we want it to approach the incoming prompt and have given it all the data it needs to do something meaningful.
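The getChatHistory function above is whatever mechanism you use to load the last minute of messages. I don't have the full source for it, so here is a hypothetical sketch assuming the messages are stored somewhere you can query by chat id, with loadMessages standing in for your actual data access:
// Hypothetical helper - loads the last minute of chat messages for a given chat id.
// In a real implementation this would query your chat platform or database.
const getChatHistory = async (chatId) => {
  const oneMinuteAgo = Date.now() - 60 * 1000;

  // 'loadMessages' is a placeholder for however your chat messages are stored
  const allMessages = await loadMessages(chatId);

  // Keep only recent messages and trim them down to the fields the prompt needs
  return allMessages
    .filter(m => m.timestamp >= oneMinuteAgo)
    .map(m => ({ username: m.username, message: m.message }));
};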
Output Format
We're writing an app. Apps need consistent behavior. As I mentioned before, ChatGPT wanders on occasion and provides some interesting answers every now and then.
To get around this, we need to provide it with a schema to structure its output. We want a strongly defined schema that we can validate to guarantee the answers we are receiving have all the information we expect. So let's define our desired output as a json schema.
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "title": "MessageFormat",
  "type": "array",
  "items": {
    "type": "object",
    "properties": {
      "username": {
        "type": "string",
        "description": "Username of the person sending a message"
      },
      "message": {
        "type": "string",
        "description": "Message sent to all users in the chat room"
      }
    },
    "required": [ "username", "message" ]
  }
}
We can provide this schema to ChatGPT as another message in our messages array so it knows how to format the generated response. Additionally, we can use the schema to validate the output afterward. This way we can guarantee downstream services are getting the data they expect.
const messageSchema = require('./messages.json');
// existing code
const outputMessage = {
  "role": "user",
  "content": `Here is a MessageFormat json schema. Please provide an answer in this format when asked to respond with the "MessageFormat" schema. ${JSON.stringify(messageSchema)}`
};

messages.push(outputMessage);
Asking the Question
Now that we've set the perspective, provided the contextual data, and given the desired output format, we can ask our question. Believe it or not, this is the easy part! We just need to remember the other pieces of data we've provided in the messages array so we can reference them appropriately. The question is built in the same format as our other messages, but this time we'll ask it to do what we want.
{
  "role": "user",
  "content": `Answer any unanswered or incorrect questions from that chat history. Be sure to tag the user who asked the question in your answers (use @{username} to tag the user). The answers you come up with should come from a bot with the username "NoQuestionsLeftBehindBot". Structure your answer in the MessageFormat schema.`
}
You see here that we reference the contextual data simply as "that chat history." The model knows from the full array of provided data what the chat history is, so we can refer to it that way. The same goes for the output format - we've already told it how to structure the response when told to use the MessageFormat schema.
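To wrap up the prompt, that question object (call it questionMessage here) gets pushed onto the same array, so the final order is: system perspective, chat history, output schema, then the actual ask. Something like:
// The question is the last message pushed before calling the API, giving a final
// order of: system perspective, chat history, MessageFormat schema, question
messages.push(questionMessage);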
Validating the Response
As I mentioned earlier, you should validate the format of the response to guarantee you received data in the correct shape. To do this in Node.js, we use ajv as our validator.
const Ajv = require('ajv');
const addFormats = require('ajv-formats');
const messageSchema = require('./messages.json');

const ajv = new Ajv();
addFormats(ajv);

// code to create message array

const result = await openai.createChatCompletion({
  model: 'gpt-4',
  temperature: .7,
  messages: messages
});

try {
  // The model returns its answer as a string, so parse it before validating
  const answerArray = JSON.parse(result.data.choices[0].message.content);
  const validate = ajv.compile(messageSchema);
  const valid = validate(answerArray);
  if (!valid) {
    throw new Error('Invalid data format');
  }

  return answerArray;
} catch (err) {
  console.error(err);
  throw new Error('Invalid data format');
}
If we don't receive data in the correct format, we can retry up to a set number of times before failing execution and notifying a human. But in the likely event that we do receive data in the format we expect, we can push it on to downstream services. In the case of our bot, that means sending these messages over a WebSocket connection to push the answers straight to the browsers of the people connected to the chat room.
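I don't have the full retry code, but a simple wrapper around the call-and-validate logic above might look something like this. It assumes a helper named getAnswers that performs the completion and validation and throws when the format is invalid:
// Hypothetical retry wrapper - getAnswers is assumed to call OpenAI and validate the response
const getAnswersWithRetry = async (messages, maxAttempts = 3) => {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await getAnswers(messages);
    } catch (err) {
      console.error(`Attempt ${attempt} failed:`, err.message);
    }
  }

  // All attempts failed - notify a human however you see fit
  throw new Error('Could not get a valid response from the model');
};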
Holding Conversations
Everything we walked through was for a single prompt. The combination of the perspective, data, output format, and question is considered a prompt. But there are many use cases where you need to continue the conversation and build on the answers provided by ChatGPT.
When ChatGPT gives you an answer, it will return a message in the same format we've been building ourselves. The only difference is the role property will be set to assistant to indicate the answer came from ChatGPT itself.
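To keep a conversation going, you simply push that assistant message onto the array along with your next user message and call the API again. A minimal sketch, reusing the result from the earlier call (the follow-up prompt here is just an example):
// Append the model's answer to the conversation, then ask a follow-up
const assistantMessage = result.data.choices[0].message;
messages.push(assistantMessage);

messages.push({
  "role": "user",
  "content": "Now summarize the questions you just answered in one sentence."
});

const followUp = await openai.createChatCompletion({
  model: 'gpt-4',
  temperature: .7,
  messages: messages
});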
I built a conversational Lambda function in my serverless toolbox that will maintain the back and forth between you and ChatGPT. It records all messages received from ChatGPT and all the questions you've asked it in chronological order. Every time you call it, it adds to the existing conversation, allowing you to provide the full conversation to the model in a hands-off way. You can use the skills we just learned above in conjunction with this Lambda function to build powerful conversations with AI.
Summary
Building prompts is much more difficult than you'd expect. The easy-to-use interface of OpenAI's website has given a false impression of how simple it is to communicate with a generative AI service and get answers back.
Prompting requires multiple pieces, checks and balances, and a little bit of know-how to get answers in the perspective and shape you expect. Much like programming, we aren't going to be experts on day one. It takes practice and experimentation to know exactly how to get the type of data you want out of these services. For example, it took many iterations before the chat bot we built above produced the following output:
[
  {
    "username": "NoQuestionsLeftBehindBot",
    "message": "@allenheltondev, you get dark gray when mixing purple and green."
  },
  {
    "username": "NoQuestionsLeftBehindBot",
    "message": "There are 14 pounds in a stone, @astuyve."
  }
]
As with most SaaS offerings, ChatGPT and other generative AI services will continue to get better over time. More things will be handled for us, reasoning will become better, and it might even tell you how to prompt it better!
But one thing is for sure - AI is not going away. It's going to continue to grow in popularity and availability. So get in now while we're still pioneering. Jump in, build a couple apps, and be amazed at the new capabilities we've just recently unlocked.
Happy coding!
Top comments (2)
Great article on prompt engineering
Great write-up! Covers all the important parts of prompt engineering with the OpenAI API. I've been playing around with the API and found that it can also understand jsdoc type definitions. It's a slightly more compact annotation than a JSON schema, which can be handy if we're running out of tokens.
But now that they've updated the API, the gpt-3.5-turbo-0613 and gpt-4-0613 models can take in a functions param, where you can define the output schema. And gpt-3.5-turbo-16k has a much larger token limit. Very exciting stuff!