If you have struggled to get consistent output from the OpenAI API when dealing with structured data, this post is for you. As AI and machine learning continue to evolve, it can be both exciting and challenging to extract valuable insights from data using OpenAI's API. One of the biggest challenges developers face is ensuring consistency in the results returned by the API. In this post, we will explore how to achieve consistent results by combining OpenAI's API with structured output parsing, using a practical example: analyzing a journal entry to extract its mood and subject.
Challenges of Using the OpenAI API Directly
- Inconsistent Formatting: OpenAI's API generates responses based on the input prompt, which can lead to varied formatting in the output. For instance, if you ask for a summary of a journal entry without specifying a structure, one response might list insights in complete sentences, while another might return a bullet-point list. This inconsistency complicates extracting specific pieces of information from the responses.
- Variable Detail Level: The level of detail in the responses can also vary significantly. For example, a prompt requesting an analysis of a journal entry's mood might return a simple "happy" in one instance and a more nuanced "generally content with moments of excitement" in another. Such variability makes it difficult to categorize and act upon the AI's insights systematically.
- Lack of Standardization: Without a predefined schema, parsing the AI's output to fit into a database or other structured data storage system becomes challenging. For instance, if you're collecting weather information, you'd want each data point to have a consistent format like {"temperature": "72°F", "humidity": "50%"}. If you make API calls without structured prompts, the responses might not align with this desired format, making the data hard to work with. Even if you define the structure in the prompt, the model may still return inconsistent results, as the sketch after this list illustrates.
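To illustrate the problem, here is a minimal sketch of a direct, unstructured call using the official openai npm package (v4-style API); the journal text and prompt wording are made up for this example:

import OpenAI from 'openai'

// Reads OPENAI_API_KEY from the environment.
const client = new OpenAI()

const main = async () => {
  const completion = await client.chat.completions.create({
    model: 'gpt-3.5-turbo',
    messages: [
      {
        role: 'user',
        // Made-up journal entry; no output format is specified.
        content:
          'What is the mood of this journal entry? "Went for a run, then spent the evening fixing a flaky test."',
      },
    ],
  })

  // A free-form string: maybe "happy", maybe a full sentence or a list.
  console.log(completion.choices[0].message.content)
}

main()

The returned content is just a string, and nothing forces it into a shape your code can rely on from one call to the next.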
The Structured Approach with Zod and Langchain
Introducing Zod and Langchain into the workflow allows us to define explicit schemas that describe the exact structure of the data we're working with. For instance, if the desired output is a list of product objects, each containing a name and a price, a Zod schema can enforce this structure.
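To make that concrete, here is a minimal sketch of such a schema; the productListSchema name and its fields are illustrative and not part of the example we build below:

import { z } from 'zod'

// A list of product objects, each with a name and a price.
const productListSchema = z.object({
  products: z.array(
    z.object({
      name: z.string().describe('the product name'),
      price: z.number().describe('the price as a number'),
    })
  ),
})

// .parse() throws if the data does not match the declared shape.
productListSchema.parse({ products: [{ name: 'Coffee', price: 4.5 }] })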
We will provide a complete coding example, covering the entire process, to make it more accessible for you to implement on your own.
- Define the Schema: We start by defining a Zod schema for our expected output. In this example, the schema describes the analysis of a journal entry: the writer's mood and the entry's subject.
import { z } from 'zod'
// OpenAI and PromptTemplate are used further below; the paths assume the same
// langchain package version as the parsers.
import { OpenAI } from 'langchain/llms/openai'
import { PromptTemplate } from 'langchain/prompts'
import {
  StructuredOutputParser,
  OutputFixingParser,
} from 'langchain/output_parsers'

const moodSchema = z.object({
  mood: z
    .string()
    .describe('the mood of the person who wrote the journal entry.'),
  subject: z.string().describe('the subject of the journal entry.'),
})
By modifying this Zod schema, you can make it align exactly with your database structure.
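For instance, if your database also stores a short summary and a flag for negative entries, a hypothetical extended schema might look like this (the extra summary and negative fields are illustrative and not used in the rest of the example):

const extendedMoodSchema = z.object({
  mood: z
    .string()
    .describe('the mood of the person who wrote the journal entry.'),
  subject: z.string().describe('the subject of the journal entry.'),
  // Hypothetical extra columns, added only to show how the schema can grow.
  summary: z.string().describe('a one-sentence summary of the entry.'),
  negative: z
    .boolean()
    .describe('whether the entry contains negative emotions.'),
})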
Next, we use StructuredOutputParser to enforce and validate the data structure defined in moodSchema. This ensures that the API response adheres to the expected format, maintains data consistency, and makes downstream processing and analysis easier.
const parser = StructuredOutputParser.fromZodSchema(moodSchema)
- Crafting the Prompt: Next, we craft a prompt that guides OpenAI to generate structured responses; this step is crucial. By specifying the desired output format in the prompt, we can influence how OpenAI's model structures its response, making it easier to parse and use the data.
const getPrompt = async (content) => {
  const format_instructions = parser.getFormatInstructions()

  const prompt = new PromptTemplate({
    template:
      'Analyze the following journal entry. Follow the instructions and format your response to match the format instructions, no matter what! \n{format_instructions}\n{entry}',
    inputVariables: ['entry'],
    partialVariables: { format_instructions },
  });

  return await prompt.format({ entry: content });
};
This code defines a getPrompt function that generates the prompt for analyzing a journal entry. The template is the overall message structure, and the format instructions are the formatting rules the parser derives from our Zod schema. The function combines these with the entry content to produce the final, formatted prompt.
This approach ensures that the AI's response aligns closely with our predefined schema, significantly reducing the variability of the output format.
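For example, calling getPrompt with a made-up entry (inside an async function) returns a single string that bundles the template text, the parser's format instructions, and the entry itself:

// Illustrative usage; the journal text is made up.
const example = await getPrompt(
  'Went for a long walk this morning and finally finished the book I was reading.'
)

// One combined string: template + format instructions + entry.
console.log(example)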
- Analyzing the Entry and Parsing the Response
Once we have crafted our prompt, we invoke the OpenAI model and parse its output. Here, the structured approach reveals its full value, as we can directly parse and validate the AI-generated response against our Zod schema.
const analyzeEntry = async (entry) => {
  const input = await getPrompt(entry.content);
  const model = new OpenAI({ temperature: 0, modelName: 'gpt-3.5-turbo' });
  const output = await model.call(input);

  try {
    // Validate the raw model output against moodSchema.
    return await parser.parse(output);
  } catch (e) {
    // If parsing fails, ask the model to fix its own output so it matches the schema.
    const fixParser = OutputFixingParser.fromLLM(model, parser);
    return await fixParser.parse(output);
  }
};
This approach not only ensures that the output aligns with our expectations, but also simplifies the integration of AI-generated data into various applications, databases, and analytical processes.
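To tie it together, here is an illustrative call; the entry object and its content are made up, and the exact values in the result depend on the model:

// Inside an async function.
const result = await analyzeEntry({
  content: 'Spent the afternoon debugging, but shipping the fix felt great.',
})

// With moodSchema, `result` is a plain object with the declared keys,
// e.g. { mood: '...', subject: '...' }, ready to store or query.
console.log(result.mood, result.subject)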
The entire code is available in this GitHub Gist: Code
Documentation:
Langchain JavaScript
Langchain Python
Thank you for taking the time to read till the end. I hope this post was helpful. If you want more content like this, you can follow me on DEV. Happy Coding!