Return JSON from OpenAI to build AI enhanced APIs

#openai #webdev #programming #javascript

When building APIs on top of OpenAI, you're usually getting plain text back.
This is fine when a human interacts with the API, because they can easily "parse" the text even though it's not structured.
But what about building APIs on top of OpenAI that should be consumed by other machines?
How can we build APIs and document them using OpenAPI while still using OpenAI to generate the response?

The Problem: How to return structured data (JSON) from OpenAI?

Let's say we want to build an API that returns the weather of a given country.
We don't want to manually write the integration code but rather use OpenAI to generate the response.
However, LLMs like GPT-3 simply return plain text, not structured data (JSON).
So how can we force OpenAI to return an answer that conforms to a JSON Schema so that we can expose it as an API,
documented using OpenAPI?

The Solution: With zod / JSON Schema and OpenAI Functions, you can return structured data (JSON) from OpenAI

OpenAI has a new feature called "Functions".
Functions are a way to define Operations that can be called from within an LLM.
Functions can be described using JSON Schema.
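To make this more concrete, here's a rough sketch of what such a function definition could look like for the weather use case. The function name `out`, the description text and the three fields are assumptions for illustration, and the JSON Schema is written by hand so that you can see what it contains.

```typescript
// Hand-written JSON Schema describing the parameters of a hypothetical "out" function.
// The field names are illustrative assumptions, not a fixed contract.
const weatherParameters = {
    type: 'object',
    properties: {
        temperature: { type: 'number' },
        city: { type: 'string' },
        description: { type: 'string' },
    },
    // all three fields must be present in the generated arguments
    required: ['temperature', 'city', 'description'],
    additionalProperties: false,
};

// A function definition: a name, a description for the LLM, and the parameter schema.
const outFunction = {
    name: 'out',
    description: 'Receives the final result as structured data',
    parameters: weatherParameters,
};
```

The description matters, because it's the only hint the LLM gets about when and how to use the function.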

However, the LLM does not call the function itself; it generates the inputs for the function and returns them to you.
So you can create a prompt and add the available functions as context.
You then call the OpenAI API and the response "might" contain instructions to call a function with certain inputs.

This is a bit hard to understand, so let's look at an example.

const completions = await openai.createChatCompletion({
    model: this.model,
    messages: [
        {
            role: 'user',
            // this is the actual prompt from the user
            content: 'What is the weather like in Berlin?',
        },
        {
            role: 'assistant',
            // this is the assumed response from the agent (LLM) in text form
            content: 'The weather in Berlin is sunny and 25 degrees',
        },
        {
            role: 'user',
            // this is our "injected" prompt to trigger the agent to "send" the result to the out function
            content: 'Set the result to the out function',
        }
    ],
    functions: [
        {
            name: 'out',
            // here we define our function to get the result from the agent in JSON form
            description: 'This is the function that returns the result of the agent',
            // we use zod and a zod-to-json-schema converter to define the JSON Schema very easily
            parameters: zodToJsonSchema(z.object({
                temperature: z.number(),
                city: z.string(),
                description: z.string(),
            })),
        },
    ],
});
const structuredResponse = JSON.parse(completions.data.choices[0].message!.function_call!.arguments!);
const validatedOut = this.outputZodSchema.parse(structuredResponse);
console.dir(validatedOut); // { temperature: 25, city: 'Berlin', description: 'sunny' }

So, what happens here?
We ask OpenAI to create a chat completion for a prompt that has already been answered.
We then instruct the LLM to "send" the result to a function.
The parameters of the function are defined as JSON Schema, which we generate from a zod schema.

As a result of this prompt, we get a response from OpenAI that it wants to call the function with a JSON encoded string as input (completions.data.choices[0].message!.function_call!.arguments!).

This string can be parsed using JSON.parse and then validated using zod.
After that, we can be sure that the response is valid and follows the schema we've defined.
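Keep in mind that the response only "might" contain a function call, and that the arguments are just a string generated by the model. Here's a minimal, dependency-free sketch of a more defensive version of the parse and validate steps. The helper `parseWeather`, the `Result` type and the `AssistantMessage` shape are hypothetical names for this example, and the hand-written checks stand in for the zod validation.

```typescript
// Minimal shape of the returned message; only the fields used here (an assumption for this sketch).
interface AssistantMessage {
    content?: string | null;
    function_call?: { name?: string; arguments?: string };
}

interface Weather {
    temperature: number;
    city: string;
    description: string;
}

type Result<T> = { ok: true; value: T } | { ok: false; error: string };

// Hypothetical helper: returns the validated weather or a reason why it could not be extracted.
function parseWeather(message: AssistantMessage | undefined): Result<Weather> {
    const args = message?.function_call?.arguments;
    // the model may answer with plain text instead of calling our function
    if (message?.function_call?.name !== 'out' || args === undefined) {
        return { ok: false, error: 'model answered without calling the out function' };
    }
    let raw: unknown;
    try {
        // the arguments are a JSON encoded string, which is not guaranteed to be well-formed
        raw = JSON.parse(args);
    } catch {
        return { ok: false, error: 'function arguments are not valid JSON' };
    }
    if (typeof raw !== 'object' || raw === null) {
        return { ok: false, error: 'function arguments are not a JSON object' };
    }
    const { temperature, city, description } = raw as Record<string, unknown>;
    if (typeof temperature !== 'number' || typeof city !== 'string' || typeof description !== 'string') {
        return { ok: false, error: 'function arguments do not match the schema' };
    }
    return { ok: true, value: { temperature, city, description } };
}
```

With zod, `safeParse` covers the last step without throwing. To make a function call more likely, the chat completion request also accepts a `function_call: { name: 'out' }` parameter that asks the model to call that specific function.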

What's left is to put all of the pieces together and add code generation on top.
Then we have a fully automated way to build APIs on top of OpenAI.

Final solution to expose an AI-enhanced API via OpenAPI

The WunderGraph Agent SDK does all of this for you out of the box.
Define an Operation using TypeScript, add an agent to execute your prompt, and you're done.
The framework infers the JSON Schema from the TypeScript types and generates the OpenAPI documentation for you.

// .wundergraph/operations/openai/GetWeatherByCountry.ts
export default createOperation.query({
    input: z.object({
        country: z.string(),
    }),
    description: 'This operation returns the weather of the capital of the given country',
    handler: async ({ input, openAI, log }) => {
        const agent = openAI.createAgent({
            // functions takes an array of functions that the agent can use
            // these are our existing WunderGraph Operations that we've previously defined
            // A WunderGraph Operation can interact with your APIs and databases
            // You can use GraphQL and TypeScript to define Operations
            // Typescript Operations (like this one right here) can host Agents
            // So you can also call other Agents from within an Agent
            functions: [{ name: 'CountryByCode' }, { name: 'weather/GetCityByName' }],
            // We want to get structured data (JSON) back from the Agent
            // so we define the output schema using zod again
            structuredOutputSchema: z.object({
                city: z.string(),
                country: z.string(),
                temperature: z.number(),
            }),
        });
        // Finally, we execute the agent with a prompt
        // The Agent will automatically fetch country data from the CountryByCode Operation
        // and the weather data from the weather/GetCityByName Operation
        // It will then generate a response using the schema we've defined
        return agent.execWithPrompt({
            prompt: `What's the weather like in the capital of ${input.country}?`,
        });
    },
});

We can now call this Operation with any HTTP client, like Postman, or even just curl.

curl --request GET 'http://localhost:9991/operations/openai/GetWeatherByCountry?country=Germany'

The response will be a JSON object that conforms to the schema we've defined.

{
  "city": "Berlin",
  "country": "Germany",
  "temperature": 25
}
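Because the response is plain JSON, another service can consume it like any other API. Here's a small sketch of such a client. It assumes the response body is exactly the object shown above; `buildWeatherUrl`, `getWeatherByCountry` and the `Fetcher` type are hypothetical names for this example and not part of any SDK.

```typescript
interface WeatherByCountry {
    city: string;
    country: string;
    temperature: number;
}

// Minimal subset of the fetch API that we rely on, so a fake can be passed in tests.
type Fetcher = (url: string) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

// Endpoint taken from the curl example; adjust host and port to your setup.
const BASE_URL = 'http://localhost:9991/operations/openai/GetWeatherByCountry';

function buildWeatherUrl(country: string): string {
    // encode the input so that names like "South Africa" stay valid in the query string
    return `${BASE_URL}?country=${encodeURIComponent(country)}`;
}

async function getWeatherByCountry(country: string, fetcher: Fetcher): Promise<WeatherByCountry> {
    const response = await fetcher(buildWeatherUrl(country));
    if (!response.ok) {
        throw new Error(`request failed with status ${response.status}`);
    }
    const body = await response.json();
    if (typeof body !== 'object' || body === null) {
        throw new Error('response is not a JSON object');
    }
    const { city, country: name, temperature } = body as Record<string, unknown>;
    // check the fields again on the client, assuming the response shape shown above
    if (typeof city !== 'string' || typeof name !== 'string' || typeof temperature !== 'number') {
        throw new Error('response does not match the expected schema');
    }
    return { city, country: name, temperature };
}
```

In Node 18+ or in the browser, you can pass the global `fetch` as the fetcher, e.g. `getWeatherByCountry('Germany', fetch)`.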

Your OpenAPI documentation will be generated in the following directory:

.wundergraph/generated/wundergraph.openapi.json

Learn more about the WunderGraph Agent SDK

If you want to learn more about the Agent SDK in general,
have a look at the announcement blog post here.

If you're looking for instructions on how to get started with the Agent SDK,
have a look at the documentation.

Conclusion

OpenAI is a powerful tool to build APIs on top of.
With the new Functions feature, we can even return structured data (JSON) from OpenAI.
This allows us to build APIs on top of OpenAI that can be consumed by other machines.
We've also demonstrated how to use the WunderGraph Agent SDK to wire up the agents and generate OpenAPI documentation automatically.

You can check out the source code on GitHub and leave a star if you like it.
Follow me on Twitter,
or join the discussion on our Discord server.