Raja Osama
GPT Models Can Do More Than Just Talk: How to Make Them Brew You Coffee ☕

Intent Classification 🎯

Intent classification is the task of figuring out what the user wants to do from their natural language input. For example, if the user says "Can you make me a coffee?", the intent could be "Make a coffee". This way, we can map the user's input to a specific action or function that we want our GPT model to perform.

To do this, I used OpenAI's Ada embedding models, which I wrote about in a previous article. These models return an embedding of the text: a numerical representation of its meaning. The embedding captures the semantic and syntactic features of the text, such as the words, phrases, and context.

By using a simple formula, we can find the closest match for the input's embedding among a set of predefined categories, which indicates the most likely intent. The formula is based on the Euclidean distance, which measures how far apart two vectors are in a multidimensional space. The smaller the distance, the more similar the vectors are.
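As a quick sketch (this is illustrative code, not from the article's source), the Euclidean distance between two equal-length embedding vectors can be computed like this:

```javascript
// Euclidean distance: sqrt(sum((a[i] - b[i])^2)) over all dimensions.
function euclideanDistance(a, b) {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same length");
  }
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    const diff = a[i] - b[i];
    sum += diff * diff;
  }
  return Math.sqrt(sum);
}
```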


For example, if we have three categories: "Make a coffee", "Make a tea", and "Make a sandwich", and we have an embedding for each category, we can compare the embedding of the user's input with each category embedding and find the one with the smallest distance. If the user says "Can you brew me a coffee?", the embedding of their input will be closer to the embedding of "Make a coffee" than to the other two categories, so we can infer that their intent is "Make a coffee".

One way to find the intent is to categorize intents into actions. For example, if you have an action like Make a coffee, you can define it as a category with a user-provided prompt and a function to execute:

Make a coffee => (User Provided Prompt = "Can you make me a coffee?") => MakeACoffee()

However, to increase the accuracy and flexibility of the system, we don't just define one category for each action, but several different phrasings of the same intent. For example:

Make me a coffee => (User Provided Prompt = "Can you make me a coffee?") => MakeACoffee()
Bring me a coffee => (User Provided Prompt = "Can you bring me a coffee?") => MakeACoffee()
Buy me a coffee => (User Provided Prompt = "Can you buy me a coffee?") => MakeACoffee()

As you can see, all these categories will trigger the same function: MakeACoffee().

Rule-Based System 📜

The rule-based system is like any other chatbot system, but ours is dynamic and uses AI from start to end. The concept is to build a map with intent categories as keys and handler objects as values:

const intents = {
  categories: {
    "Make a coffee": MakeACoffee,
    "Make me a coffee": MakeACoffee,
    "Bring me a coffee": MakeACoffee,
    "Buy me a coffee": MakeACoffee,
  },
};

Each object has an action, which can be either an API call or a local action; a URL to call if it's an API call; and a response function to handle the result of the call. For example:

const MakeACoffee = {
  action: "LOCAL_ACTION",
  response: (response) => "Your coffee is here!",
};

For API calls, a function makes a fetch request to the handler's URL and passes the result to the response function. You can customize the response as much as you like.

Now, whenever you say "Make me a coffee", the result will be "Your coffee is here!".
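Putting the pieces together, a dispatcher might look like the sketch below. The handler shape matches the objects above, but `executeIntent` and the fetch-based API branch are my own illustrative naming, not necessarily the article's actual code:

```javascript
// Hypothetical dispatcher: looks up the matched category and runs its handler.
async function executeIntent(category, intents) {
  const handler = intents.categories[category];
  if (!handler) return "Sorry, I don't know how to do that.";

  if (handler.action === "API_CALL") {
    // Fetch the data first, then hand it to the response function.
    const res = await fetch(handler.url);
    const data = await res.json();
    return handler.response(data);
  }

  // LOCAL_ACTION: no network call needed.
  return handler.response(null);
}
```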

But what if you have follow-up questions or options? How would you handle that?

For example, when you ask the system to MakeACoffee, you might want to ask the user what kind of coffee they want. To handle that, we take a recursive approach.

Each object can also have an ask function to generate a follow-up question and a categories object to define the possible options for the user. For example:

const MakeACoffee = {
  action: "API_CALL",
  url: "http://localhost:3000/coffees",
  ask: [
    (coffees) => {
      return `What kind of coffee would you like: ${coffees.join(", ")}?`;
    },
  ],
  categories: () => ({
    Mocha: MochaCoffee,
    Karak: KarakCoffee,
    Espresso: EspressoCoffee,
  }),
};

Now with this approach, the response will include a list of coffees, a follow-up question, and another object to handle the user's choice.

const MochaCoffee = {
  action: "LOCAL_ACTION",
  response: (response) => "Your mocha coffee is here!",
};
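The recursive flow described above can be sketched as follows. The handler objects mirror the ones defined earlier, while `runIntent`, the `data` helper, and the hardcoded coffee list are assumptions made here for the sake of a self-contained, runnable example (standing in for the real API call):

```javascript
// Sub-intents for each coffee choice.
const MochaCoffee = {
  action: "LOCAL_ACTION",
  response: () => "Your mocha coffee is here!",
};
const EspressoCoffee = {
  action: "LOCAL_ACTION",
  response: () => "Your espresso coffee is here!",
};

const MakeACoffee = {
  action: "LOCAL_ACTION", // would be "API_CALL" with a real coffee API
  // Hardcoded stand-in for the API's list of coffees.
  data: () => ["Mocha", "Espresso"],
  ask: (coffees) => `What kind of coffee would you like: ${coffees.join(", ")}?`,
  categories: () => ({ Mocha: MochaCoffee, Espresso: EspressoCoffee }),
};

// Recursively walk an intent: if it has follow-up categories, ask the
// question and recurse into the user's choice; otherwise return the response.
function runIntent(intent, answers) {
  if (intent.categories) {
    const coffees = intent.data();
    console.log(intent.ask(coffees)); // follow-up question
    const choice = answers.shift();   // simulated user reply
    return runIntent(intent.categories()[choice], answers);
  }
  return intent.response();
}

runIntent(MakeACoffee, ["Mocha"]); // "Your mocha coffee is here!"
```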


And that's it! You have successfully integrated your GPT model with your own data sources and actions. You can now enjoy your coffee while chatting with your AI friend. ☕

Behind the Scenes 🕵️‍♂️

Now let's look at the behind-the-scenes workings of this application. I showed you the configuration, but let's look at the internals.

The second part of this article is here: GPT Models Can Do More Than Just Talk: How to Make Them Brew You Coffee
