TL;DR
ChatGPT is trained on data only up to 2022.
But what if you want it to give you information specifically about your website? Until recently, that wasn’t possible, but now it is:
OpenAI introduced their new feature: assistants.
You can now easily index your website and then ask ChatGPT questions about it. In this tutorial, we will build a system that indexes your website and lets you query it. We will:
- Scrape the documentation sitemap.
- Extract the information from all the pages on the website.
- Create a new assistant with the new information.
- Build a simple ChatGPT frontend interface and query the assistant.
Your background job platform 🔌
Trigger.dev is an open-source library that enables you to create and monitor long-running jobs for your app with NextJS, Remix, Astro, and so many more!
Please help us with a star 🥹.
It would help us to create more articles like this 💖
Star the Trigger.dev repository ⭐️
Let’s get started 🔥
Let’s set up a new NextJS project.
npx create-next-app@latest
💡 We use the new Next.js app router. Please make sure you have Node.js version 18+ before installing the project.
Let's create a new database to save the assistant and the scraped pages.
For our example, we will use Prisma with SQLite.
It is super easy to install, just run:
npm install prisma @prisma/client --save
And then add a schema and a database with
npx prisma init --datasource-provider sqlite
Go to prisma/schema.prisma and replace it with the following schema:
// This is your Prisma schema file,
// learn more about it in the docs: https://pris.ly/d/prisma-schema
generator client {
provider = "prisma-client-js"
}
datasource db {
provider = "sqlite"
url = env("DATABASE_URL")
}
model Docs {
id Int @id @default(autoincrement())
content String
url String @unique
identifier String
@@index([identifier])
}
model Assistant {
id Int @id @default(autoincrement())
aId String
url String @unique
}
And then run:
npx prisma db push
That will create a new SQLite database (a local file) with two main tables: Docs and Assistant.
- Docs contains all the scraped pages.
- Assistant contains the URL of the docs and the internal ChatGPT assistant ID.
Let’s add our Prisma client.
Create a new folder called helper and add a new file called prisma.ts with the following code inside:
import {PrismaClient} from '@prisma/client';
export const prisma = new PrismaClient();
We can later use that prisma variable to query our database.
Scrape & Index
Create a Trigger.dev account
Scraping and indexing the pages is a long-running task. We need to:
- Scrape the main website URL to find the sitemap.
- Extract all the pages inside the sitemap.
- Go to each page and extract the content.
- Save everything to the ChatGPT assistant.
For that, let’s use Trigger.dev!
Sign up for a Trigger.dev account.
Once registered, create an organization and choose a project name for your job.
Select Next.js as your framework and follow the process for adding Trigger.dev to an existing Next.js project.
Otherwise, click Environments & API Keys on the sidebar menu of your project dashboard.
Copy your DEV server API key and run the code snippet below to install Trigger.dev.
Follow the instructions carefully.
npx @trigger.dev/cli@latest init
Run the following code snippet in another terminal to establish a tunnel between Trigger.dev and your Next.js project.
npx @trigger.dev/cli@latest dev
Install ChatGPT (OpenAI)
We will use the OpenAI assistant, so we must install it in our project.
Create a new OpenAI account and generate an API Key.
Click View API key from the dropdown to create an API key.
Next, install the OpenAI package by running the code snippet below.
npm install @trigger.dev/openai
Add your OpenAI API key to the .env.local file.
OPENAI_API_KEY=<your_api_key>
Inside the helper directory we created earlier, add a new file, open.ai.tsx, with the following content:
import {OpenAI} from "@trigger.dev/openai";
export const openai = new OpenAI({
id: "openai",
apiKey: process.env.OPENAI_API_KEY!,
});
That’s our OpenAI client, wrapped by the Trigger.dev integration.
Building the background jobs
Let’s go ahead and create a new background job!
Go to the jobs folder and create a new file called process.documentation.ts. Add the following code:
import { eventTrigger } from "@trigger.dev/sdk";
import { client } from "@openai-assistant/trigger";
import {object, string} from "zod";
import {JSDOM} from "jsdom";
import {openai} from "@openai-assistant/helper/open.ai";
client.defineJob({
// This is the unique identifier for your Job; it must be unique across all Jobs in your project.
id: "process-documentation",
name: "Process Documentation",
version: "0.0.1",
// This is triggered by an event using eventTrigger. You can also trigger Jobs with webhooks, on schedules, and more: https://trigger.dev/docs/documentation/concepts/triggers/introduction
trigger: eventTrigger({
name: "process.documentation.event",
schema: object({
url: string(),
})
}),
integrations: {
openai
},
run: async (payload, io, ctx) => {
}
});
We have defined a new job triggered by the process.documentation.event event, and we added a required parameter called url: that’s the documentation URL we will send later.
As you can see, the job is empty, so let’s add the first task to it.
We need to grab the website’s sitemap and return it.
Scraping the website returns HTML, which we need to parse.
To do that, let’s install JSDOM.
npm install jsdom --save
And import it at the top of our file:
import {JSDOM} from "jsdom";
Now, we can add our first task.
It’s important to wrap our code with runTask, which lets Trigger.dev separate it from the other tasks. Trigger.dev’s special architecture splits the tasks into different processes so the Vercel serverless timeout does not affect them. Here is the code for the first task:
const getSiteMap = await io.runTask("grab-sitemap", async () => {
const data = await (await fetch(payload.url)).text();
const dom = new JSDOM(data);
const sitemap = dom.window.document.querySelector('[rel="sitemap"]')?.getAttribute('href');
return new URL(sitemap!, payload.url).toString();
});
- We grab the entire HTML from the URL with an HTTP request.
- We parse it into a DOM with JSDOM.
- We find the sitemap URL.
- We resolve it against the base URL and return it.
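The resolution step matters because sitemap links are often relative (e.g. /sitemap.xml). Here is a standalone sketch of what new URL(sitemap, payload.url) does; the example URLs are made up:

```typescript
// Relative hrefs are resolved against the page URL; absolute hrefs pass through.
const pageUrl = "https://example.com/docs";

const fromRelative = new URL("/sitemap.xml", pageUrl).toString();
const fromAbsolute = new URL("https://example.com/sitemap.xml", pageUrl).toString();

console.log(fromRelative);
console.log(fromAbsolute);
```

Both calls return "https://example.com/sitemap.xml", which is why the task works no matter how the site declares its sitemap.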
Going forward, we need to scrape the sitemap, extract all the URLs, and return them.
Let’s install Lodash, a utility library with helpful functions for working with arrays.
npm install lodash @types/lodash --save
Here is the code of the task:
export const makeId = (length: number) => {
let text = '';
const possible = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789';
for (let i = 0; i < length; i += 1) {
text += possible.charAt(Math.floor(Math.random() * possible.length));
}
return text;
};
const {identifier, list} = await io.runTask("load-and-parse-sitemap", async () => {
const urls = /(http|ftp|https):\/\/([\w_-]+(?:(?:\.[\w_-]+)+))([\w.,@?^=%&:\/~+#-]*[\w@?^=%&\/~+#-])/g;
const identifier = makeId(5);
const data = await (await fetch(getSiteMap)).text();
// @ts-ignore
return {identifier, list: chunk(([...new Set(data.match(urls))] as string[]).filter(f => f.includes(payload.url)).map(p => ({identifier, url: p})), 25)};
});
- We create a new function called makeId to generate a random identifier for all our pages.
- We create a new task and add a regex to extract every possible URL.
- We send an HTTP request to load the sitemap and extract all its URLs.
- We chunk the URLs into arrays of 25 elements (if we have 100 elements, we will have four arrays of 25 elements).
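To see what the dedupe/filter/chunk pipeline produces, here is a standalone re-trace on a fake sitemap body. The chunk helper is a minimal re-implementation of Lodash’s, included only so the snippet is self-contained, and "abc12" stands in for the random identifier from makeId(5); the regex is the same one used in the task:

```typescript
// Minimal stand-in for lodash's chunk, so this snippet has no dependencies.
const chunk = <T,>(arr: T[], size: number): T[][] => {
  const out: T[][] = [];
  for (let i = 0; i < arr.length; i += size) out.push(arr.slice(i, i + size));
  return out;
};

// Same regex as in the task above.
const urls = /(http|ftp|https):\/\/([\w_-]+(?:(?:\.[\w_-]+)+))([\w.,@?^=%&:\/~+#-]*[\w@?^=%&\/~+#-])/g;

const siteUrl = "https://example.com";
const sitemapBody = `
  https://example.com/docs/a
  https://example.com/docs/b
  https://example.com/docs/a
  https://other-site.com/page
`;

// Set removes the duplicate URL, filter drops the foreign domain,
// chunk splits the result into batches of 25.
const pages = [...new Set(sitemapBody.match(urls))].filter((u) => u.includes(siteUrl));
const batches = chunk(pages.map((url) => ({ identifier: "abc12", url })), 25);

console.log(pages);
console.log(batches.length);
```

With the four fake entries above, pages ends up holding only the two unique same-domain URLs, and they fit in a single batch.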
Next, let’s create a new job to process each URL.
Here is the complete code:
function getElementsBetween(startElement: Element, endElement: Element) {
let currentElement = startElement;
const elements = [];
// Traverse the DOM until the endElement is reached
while (currentElement && currentElement !== endElement) {
currentElement = currentElement.nextElementSibling!;
// If there's no next sibling, go up a level and continue
if (!currentElement) {
// @ts-ignore
currentElement = startElement.parentNode!;
startElement = currentElement;
if (currentElement === endElement) break;
continue;
}
// Add the current element to the list
if (currentElement && currentElement !== endElement) {
elements.push(currentElement);
}
}
return elements;
}
const processContent = client.defineJob({
// This is the unique identifier for your Job; it must be unique across all Jobs in your project.
id: "process-content",
name: "Process Content",
version: "0.0.1",
// This is triggered by an event using eventTrigger. You can also trigger Jobs with webhooks, on schedules, and more: https://trigger.dev/docs/documentation/concepts/triggers/introduction
trigger: eventTrigger({
name: "process.content.event",
schema: object({
url: string(),
identifier: string(),
})
}),
run: async (payload, io, ctx) => {
return io.runTask('grab-content', async () => {
// We first grab a raw html of the content from the website
const data = await (await fetch(payload.url)).text();
// We load it with JSDOM so we can manipulate it
const dom = new JSDOM(data);
// We remove all the scripts and styles from the page
dom.window.document.querySelectorAll('script, style').forEach((el) => el.remove());
// We grab all the titles from the page
const content = Array.from(dom.window.document.querySelectorAll('h1, h2, h3, h4, h5, h6'));
// We grab the last element so we can get the content between the last element and the next element
const lastElement = content[content.length - 1]?.parentElement?.nextElementSibling!;
const elements = [];
// We loop through all the elements and grab the content between each title
for (let i = 0; i < content.length; i++) {
const element = content[i];
const nextElement = content?.[i + 1] || lastElement;
const elementsBetween = getElementsBetween(element, nextElement);
elements.push({
title: element.textContent, content: elementsBetween.map((el) => el.textContent).join('\n')
});
}
// We create a raw text format of all the content
const page = `
----------------------------------
url: ${payload.url}\n
${elements.map((el) => `${el.title}\n${el.content}`).join('\n')}
----------------------------------
`;
// We save it to our database
await prisma.docs.upsert({
where: {
url: payload.url
}, update: {
content: page, identifier: payload.identifier
}, create: {
url: payload.url, content: page, identifier: payload.identifier
}
});
});
},
});
- We grab the content from the URL (previously extracted from the sitemap).
- We parse it with JSDOM.
- We remove every possible <script> or <style> that exists on the page.
- We grab all the titles on the page (h1, h2, h3, h4, h5, h6).
- We iterate over the titles and take the content between them. We don’t want to take the entire page content because it might contain irrelevant content.
- We create our version of the raw text of the page and save it to our database.
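The grouping logic is easier to see without a DOM. Below is a simplified stand-in (not the job code itself) that mirrors the idea: walk a flat list of nodes and attach every text block to the most recent heading:

```typescript
// Simplified model: each node is either a heading or a text block.
type DocNode = { heading?: string; text?: string };

const nodes: DocNode[] = [
  { heading: "Install" },
  { text: "Run npm install." },
  { heading: "Usage" },
  { text: "Import the client." },
  { text: "Call run()." },
];

// Every text node is collected under the most recent heading,
// mirroring how the job gathers elements between one title and the next.
const sections: { title: string; content: string[] }[] = [];
for (const node of nodes) {
  if (node.heading) sections.push({ title: node.heading, content: [] });
  else if (sections.length) sections[sections.length - 1].content.push(node.text!);
}

const page = sections.map((s) => `${s.title}\n${s.content.join("\n")}`).join("\n");
console.log(page);
```

The real job does the same thing, except the "nodes" are sibling DOM elements collected by getElementsBetween.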
Now, let’s run this task for every sitemap URL.
Trigger.dev provides something called batchInvokeAndWaitForCompletion.
It allows us to send batches of 25 items for processing, and it will process all of them simultaneously. Here are the next lines of code:
let i = 0;
for (const item of list) {
await processContent.batchInvokeAndWaitForCompletion(
  'process-list-' + i,
  item.map(
    payload => ({
      payload,
      timeoutInSeconds: 86_400,
    }),
  ),
);
i++;
}
We manually trigger the previously created job in batches of 25.
Once that’s completed, let’s take all the content we have saved to our database and concatenate it:
const data = await io.runTask("get-extracted-data", async () => {
return (await prisma.docs.findMany({
where: {
identifier
},
select: {
content: true
}
})).map((d) => d.content).join('\n\n');
});
We use the identifier we specified before.
Now, let’s upload a new file to OpenAI containing all the scraped data:
const file = await io.openai.files.createAndWaitForProcessing("upload-file", {
purpose: "assistants",
file: data
});
createAndWaitForProcessing is a task created by Trigger.dev to upload files to the assistant. If you use openai directly without the integration, you must stream the files yourself.
Now let’s create or update our assistant:
const assistant = await io.openai.runTask("create-or-update-assistant", async (openai) => {
const currentAssistant = await prisma.assistant.findFirst({
where: {
url: payload.url
}
});
if (currentAssistant) {
return openai.beta.assistants.update(currentAssistant.aId, {
file_ids: [file.id]
});
}
return openai.beta.assistants.create({
name: identifier,
description: 'Documentation',
instructions: 'You are a documentation assistant, you have been loaded with documentation from ' + payload.url + ', return everything in an MD format.',
model: 'gpt-4-1106-preview',
tools: [{ type: "code_interpreter" }, {type: 'retrieval'}],
file_ids: [file.id],
});
});
- We first check if we have an assistant for that specific URL.
- If we have one, let’s update the assistant with the new file.
- If not, let’s create a new assistant.
- We pass the instruction of “you are a documentation assistant.” It’s essential to note that we want the final output to be in MD format so we can display it more nicely later.
For the final piece of the puzzle, let’s save the new assistant into our database.
Here is the code:
await io.runTask("save-assistant", async () => {
await prisma.assistant.upsert({
where: {
url: payload.url
},
update: {
aId: assistant.id,
},
create: {
aId: assistant.id,
url: payload.url,
}
});
});
If the URL already exists, we update the record with the new assistant ID; otherwise, we create a new one.
Here is the full code of the page:
import { eventTrigger } from "@trigger.dev/sdk";
import { client } from "@openai-assistant/trigger";
import {object, string} from "zod";
import {JSDOM} from "jsdom";
import {chunk} from "lodash";
import {prisma} from "@openai-assistant/helper/prisma.client";
import {openai} from "@openai-assistant/helper/open.ai";
const makeId = (length: number) => {
let text = '';
const possible = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789';
for (let i = 0; i < length; i += 1) {
text += possible.charAt(Math.floor(Math.random() * possible.length));
}
return text;
};
client.defineJob({
// This is the unique identifier for your Job; it must be unique across all Jobs in your project.
id: "process-documentation",
name: "Process Documentation",
version: "0.0.1",
// This is triggered by an event using eventTrigger. You can also trigger Jobs with webhooks, on schedules, and more: https://trigger.dev/docs/documentation/concepts/triggers/introduction
trigger: eventTrigger({
name: "process.documentation.event",
schema: object({
url: string(),
})
}),
integrations: {
openai
},
run: async (payload, io, ctx) => {
// The first task to get the sitemap URL from the website
const getSiteMap = await io.runTask("grab-sitemap", async () => {
const data = await (await fetch(payload.url)).text();
const dom = new JSDOM(data);
const sitemap = dom.window.document.querySelector('[rel="sitemap"]')?.getAttribute('href');
return new URL(sitemap!, payload.url).toString();
});
// We parse the sitemap; instead of using some XML parser, we just use regex to get the URLs and we return it in chunks of 25
const {identifier, list} = await io.runTask("load-and-parse-sitemap", async () => {
const urls = /(http|ftp|https):\/\/([\w_-]+(?:(?:\.[\w_-]+)+))([\w.,@?^=%&:\/~+#-]*[\w@?^=%&\/~+#-])/g;
const identifier = makeId(5);
const data = await (await fetch(getSiteMap)).text();
// @ts-ignore
return {identifier, list: chunk(([...new Set(data.match(urls))] as string[]).filter(f => f.includes(payload.url)).map(p => ({identifier, url: p})), 25)};
});
// We go into each page and grab the content; we do this in batches of 25 and save it to the DB
let i = 0;
for (const item of list) {
await processContent.batchInvokeAndWaitForCompletion(
  'process-list-' + i,
  item.map(
    payload => ({
      payload,
      timeoutInSeconds: 86_400,
    }),
  ),
);
i++;
}
// We get the data that we saved in batches from the DB
const data = await io.runTask("get-extracted-data", async () => {
return (await prisma.docs.findMany({
where: {
identifier
},
select: {
content: true
}
})).map((d) => d.content).join('\n\n');
});
// We upload the data to OpenAI with all the content
const file = await io.openai.files.createAndWaitForProcessing("upload-file", {
purpose: "assistants",
file: data
});
// We create a new assistant or update the old one with the new file
const assistant = await io.openai.runTask("create-or-update-assistant", async (openai) => {
const currentAssistant = await prisma.assistant.findFirst({
where: {
url: payload.url
}
});
if (currentAssistant) {
return openai.beta.assistants.update(currentAssistant.aId, {
file_ids: [file.id]
});
}
return openai.beta.assistants.create({
name: identifier,
description: 'Documentation',
instructions: 'You are a documentation assistant, you have been loaded with documentation from ' + payload.url + ', return everything in an MD format.',
model: 'gpt-4-1106-preview',
tools: [{ type: "code_interpreter" }, {type: 'retrieval'}],
file_ids: [file.id],
});
});
// We update our internal database with the assistant
await io.runTask("save-assistant", async () => {
await prisma.assistant.upsert({
where: {
url: payload.url
},
update: {
aId: assistant.id,
},
create: {
aId: assistant.id,
url: payload.url,
}
});
});
},
});
export function getElementsBetween(startElement: Element, endElement: Element) {
let currentElement = startElement;
const elements = [];
// Traverse the DOM until the endElement is reached
while (currentElement && currentElement !== endElement) {
currentElement = currentElement.nextElementSibling!;
// If there's no next sibling, go up a level and continue
if (!currentElement) {
// @ts-ignore
currentElement = startElement.parentNode!;
startElement = currentElement;
if (currentElement === endElement) break;
continue;
}
// Add the current element to the list
if (currentElement && currentElement !== endElement) {
elements.push(currentElement);
}
}
return elements;
}
// This job will grab the content from the website
const processContent = client.defineJob({
// This is the unique identifier for your Job; it must be unique across all Jobs in your project.
id: "process-content",
name: "Process Content",
version: "0.0.1",
// This is triggered by an event using eventTrigger. You can also trigger Jobs with webhooks, on schedules, and more: https://trigger.dev/docs/documentation/concepts/triggers/introduction
trigger: eventTrigger({
name: "process.content.event",
schema: object({
url: string(),
identifier: string(),
})
}),
run: async (payload, io, ctx) => {
return io.runTask('grab-content', async () => {
try {
// We first grab a raw HTML of the content from the website
const data = await (await fetch(payload.url)).text();
// We load it with JSDOM so we can manipulate it
const dom = new JSDOM(data);
// We remove all the scripts and styles from the page
dom.window.document.querySelectorAll('script, style').forEach((el) => el.remove());
// We grab all the titles from the page
const content = Array.from(dom.window.document.querySelectorAll('h1, h2, h3, h4, h5, h6'));
// We grab the last element so we can get the content between the last element and the next element
const lastElement = content[content.length - 1]?.parentElement?.nextElementSibling!;
const elements = [];
// We loop through all the elements and grab the content between each title
for (let i = 0; i < content.length; i++) {
const element = content[i];
const nextElement = content?.[i + 1] || lastElement;
const elementsBetween = getElementsBetween(element, nextElement);
elements.push({
title: element.textContent, content: elementsBetween.map((el) => el.textContent).join('\n')
});
}
// We create a raw text format of all the content
const page = `
----------------------------------
url: ${payload.url}\n
${elements.map((el) => `${el.title}\n${el.content}`).join('\n')}
----------------------------------
`;
// We save it to our database
await prisma.docs.upsert({
where: {
url: payload.url
}, update: {
content: page, identifier: payload.identifier
}, create: {
url: payload.url, content: page, identifier: payload.identifier
}
});
}
catch (e) {
console.log(e);
}
});
},
});
We have finished creating the background job to scrape and index the files 🎉
Question the assistant
Now, let’s create the job to question our assistant.
Go to the jobs folder and create a new file, question.assistant.ts. Add the following code:
import {eventTrigger} from "@trigger.dev/sdk";
import {client} from "@openai-assistant/trigger";
import {object, string} from "zod";
import {openai} from "@openai-assistant/helper/open.ai";
client.defineJob({
  // This is the unique identifier for your Job; it must be unique across all Jobs in your project.
  id: "question-assistant",
  name: "Question Assistant",
  version: "0.0.1",
  // This is triggered by an event using eventTrigger. You can also trigger Jobs with webhooks, on schedules, and more: https://trigger.dev/docs/documentation/concepts/triggers/introduction
  trigger: eventTrigger({
    name: "question.assistant.event",
    schema: object({
      content: string(),
      aId: string(),
      threadId: string().optional(),
    })
  }),
  integrations: {
    openai
  },
  run: async (payload, io, ctx) => {
// Create or use an existing thread
const thread = payload.threadId ? await io.openai.beta.threads.retrieve('get-thread', payload.threadId) : await io.openai.beta.threads.create('create-thread');
// Create a message in the thread
await io.openai.beta.threads.messages.create('create-message', thread.id, {
content: payload.content,
role: 'user',
});
// Run the thread
const run = await io.openai.beta.threads.runs.createAndWaitForCompletion('run-thread', thread.id, {
model: 'gpt-4-1106-preview',
assistant_id: payload.aId,
});
// Check the status of the thread
if (run.status !== "completed") {
console.log('not completed');
throw new Error(`Run finished with status ${run.status}: ${JSON.stringify(run.last_error)}`);
}
// Get the messages from the thread
const messages = await io.openai.beta.threads.messages.list("list-messages", run.thread_id, {
query: {
limit: "1"
}
});
const content = messages[0].content[0];
if (content.type === 'text') {
return {content: content.text.value, threadId: thread.id};
}
}
});
- The event takes three parameters:
  - content: the message we want to send to our assistant.
  - aId: the internal ID of the assistant we previously created.
  - threadId: the thread ID of the conversation. As you can see, this is an optional parameter because, on the first message, we will not have a thread ID yet.
- Then, we create a new thread or retrieve the previous one.
- We add a new message to the thread with the question we want to ask the assistant.
- We run the thread and wait for it to finish.
- We get the list of messages (and limit it to 1) as the first message is the last one in the conversation.
- We return the message content and the thread ID we just created.
Add routing
We need to create 3 API routes for our application:
- Send a new assistant for processing.
- Get a specific assistant by URL.
- Add a new message to an assistant.
Create a new folder inside of app/api called assistant, and inside it, create a new file called route.ts. Add the following code inside:
import {client} from "@openai-assistant/trigger";
import {prisma} from "@openai-assistant/helper/prisma.client";
export async function POST(request: Request) {
const body = await request.json();
if (!body.url) {
return new Response(JSON.stringify({error: 'URL is required'}), {status: 400});
}
// We send an event to the trigger to process the documentation
const {id: eventId} = await client.sendEvent({
name: "process.documentation.event",
payload: {url: body.url},
});
return new Response(JSON.stringify({eventId}), {status: 200});
}
export async function GET(request: Request) {
const url = new URL(request.url).searchParams.get('url');
if (!url) {
return new Response(JSON.stringify({error: 'URL is required'}), {status: 400});
}
const assistant = await prisma.assistant.findFirst({
where: {
url: url
}
});
return new Response(JSON.stringify(assistant), {status: 200});
}
The first POST method gets a URL and triggers the process.documentation.event job with the URL sent from the client.
The second GET method fetches the assistant from our database by the URL sent from the client.
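The GET handler pulls the url query parameter out of the request with the standard URL API; that parsing step works on its own (the request URL below is made up):

```typescript
// How the GET route extracts the `url` query parameter from the request URL.
const requestUrl = "http://localhost:3000/api/assistant?url=https://example.com/docs";
const url = new URL(requestUrl).searchParams.get("url");

console.log(url);
```

If the parameter is missing, searchParams.get returns null, which is exactly what the handler’s "URL is required" check guards against.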
Now, let’s create the route to add a message to our assistant.
Inside of app/api, create a new folder message and add a new file called route.ts, then add the following code:
import {prisma} from "@openai-assistant/helper/prisma.client";
import {client} from "@openai-assistant/trigger";
export async function POST(request: Request) {
const body = await request.json();
// Check that we have the assistant id and the message
if (!body.id || !body.message) {
return new Response(JSON.stringify({error: 'Id and Message are required'}), {status: 400});
}
// get the assistant id in OpenAI from the id in the database
const assistant = await prisma.assistant.findUnique({
where: {
id: +body.id
}
});
// We send an event to the trigger to process the documentation
const {id: eventId} = await client.sendEvent({
name: "question.assistant.event",
payload: {
content: body.message,
aId: assistant?.aId,
threadId: body.threadId
},
});
return new Response(JSON.stringify({eventId}), {status: 200});
}
That’s very basic code: we get the message, assistant ID, and thread ID from the client and send it to our previously created question.assistant.event.
The last thing to do is create a function to get all our assistants.
Inside of helper, create a new file called get.list.ts and add the following code:
import {prisma} from "@openai-assistant/helper/prisma.client";
// Get the list of all the available assistants
export const getList = () => {
return prisma.assistant.findMany({
});
}
Very simple code to get all the assistants.
We have finished with the backend 🥳
Let’s move to the front.
Creating the Frontend
We are going to create a basic interface to add URLs and show the list of the added URLs:
The main page
Replace the content of app/page.tsx with the following code:
import {getList} from "@openai-assistant/helper/get.list";
import Main from "@openai-assistant/components/main";
export default async function Home() {
const list = await getList();
return (
<Main list={list} />
)
}
That’s a straightforward code that grabs the list from the database and passes it to our Main component.
Next, let’s create the Main component.
Inside app, create a new folder components and add a new file called main.tsx. Add the following code:
"use client";
import {Assistant} from '@prisma/client';
import {useCallback, useState} from "react";
import {FieldValues, SubmitHandler, useForm} from "react-hook-form";
import {ChatgptComponent} from "@openai-assistant/components/chatgpt.component";
import {AssistantList} from "@openai-assistant/components/assistant.list";
import {TriggerProvider} from "@trigger.dev/react";
export interface ExtendedAssistant extends Assistant {
pending?: boolean;
eventId?: string;
}
export default function Main({list}: {list: ExtendedAssistant[]}) {
const [assistantState, setAssistantState] = useState(list);
const {register, handleSubmit} = useForm();
const submit: SubmitHandler<FieldValues> = useCallback(async (data) => {
const assistantResponse = await (await fetch('/api/assistant', {
body: JSON.stringify({url: data.url}),
method: 'POST',
headers: {
'Content-Type': 'application/json'
}
})).json();
setAssistantState([...assistantState, {...assistantResponse, url: data.url, pending: true}]);
}, [assistantState])
const changeStatus = useCallback((val: ExtendedAssistant) => async () => {
const assistantResponse = await (await fetch(`/api/assistant?url=${val.url}`, {
method: 'GET',
headers: {
'Content-Type': 'application/json'
}
})).json();
setAssistantState([...assistantState.filter((v) => v.id), assistantResponse]);
}, [assistantState])
return (
<TriggerProvider publicApiKey={process.env.NEXT_PUBLIC_TRIGGER_PUBLIC_API_KEY!}>
<div className="w-full max-w-2xl mx-auto p-6 flex flex-col gap-4">
<form className="flex items-center space-x-4" onSubmit={handleSubmit(submit)}>
<input className="flex-grow p-3 border border-black/20 rounded-xl" placeholder="Add documentation link" type="text" {...register('url', {required: 'true'})} />
<button className="flex-shrink p-3 border border-black/20 rounded-xl" type="submit">
Add
</button>
</form>
<div className="divide-y-2 divide-gray-300 flex gap-2 flex-wrap">
{assistantState.map(val => (
<AssistantList key={val.url} val={val} onFinish={changeStatus(val)} />
))}
</div>
{assistantState.filter(f => !f.pending).length > 0 && <ChatgptComponent list={assistantState} />}
</div>
</TriggerProvider>
)
}
Let’s see what’s going on here:
- We created a new interface called ExtendedAssistant with two parameters, pending and eventId. When we create a new assistant, we don’t have the final value; we store only the eventId and listen to the job processing until it’s finished.
- We get the list from the server component and set it to our new state (so we can modify it later).
- We added a TriggerProvider to help us listen for event completion and update it with data.
- We use react-hook-form to create a new form for adding new assistants.
- We added a form with one input, URL, to submit new assistants for processing.
- We iterate over and show all the assistants that exist.
- On form submission, we send the information to the previously created route to add the new assistant.
- Once the event is completed, we trigger changeStatus to load the assistant from the database.
- In the end, we have the ChatGPT component, displayed only if we don’t have assistants waiting to be processed (!f.pending).
Let’s create our AssistantList component.
Inside components, create a new file assistant.list.tsx and add the following content there:
"use client";
import {FC, useEffect} from "react";
import {ExtendedAssistant} from "@openai-assistant/components/main";
import {useEventRunDetails} from "@trigger.dev/react";
export const Loading: FC<{eventId: string, onFinish: () => void}> = (props) => {
const {eventId} = props;
const { data, error } = useEventRunDetails(eventId);
useEffect(() => {
if (!data || error) {
return ;
}
if (data.status === 'SUCCESS') {
props.onFinish();
}
}, [data]);
return <div className="pointer bg-yellow-300 border-yellow-500 p-1 px-3 text-yellow-950 border rounded-2xl">Loading</div>
};
export const AssistantList: FC<{val: ExtendedAssistant, onFinish: () => void}> = (props) => {
const {val, onFinish} = props;
if (val.pending) {
return <Loading eventId={val.eventId!} onFinish={onFinish} />
}
return (
<div key={val.url} className="pointer relative bg-green-300 border-green-500 p-1 px-3 text-green-950 border rounded-2xl hover:bg-red-300 hover:border-red-500 hover:text-red-950 before:content-[attr(data-content)]" data-content={val.url} />
)
}
We iterate over all the assistants we created. If an assistant has already been created, we just display its name. If not, we render the <Loading /> component.
The Loading component shows “Loading” on the screen and long-polls the server until the event is finished.
We use the useEventRunDetails hook created by Trigger.dev to know when the event is finished.
Once the event is finished, it triggers the onFinish function to update our client with the newly created assistant.
Chat interface
Now, let’s add the ChatGPT component and question our assistant!
- Select the assistant we would like to use
- Show the list of messages
- Add input for the message we want to send and the submit button.
Inside of components, add a new file called chatgpt.component.tsx and let’s draw our ChatGPT chat box:
"use client";
import {FC, useCallback, useEffect, useRef, useState} from "react";
import {ExtendedAssistant} from "@openai-assistant/components/main";
import Markdown from 'react-markdown'
import {useEventRunDetails} from "@trigger.dev/react";
interface Messages {
message?: string
eventId?: string
}
export const ChatgptComponent = ({list}: {list: ExtendedAssistant[]}) => {
const url = useRef<HTMLSelectElement>(null);
const [message, setMessage] = useState('');
const [messagesList, setMessagesList] = useState([] as Messages[]);
const [threadId, setThreadId] = useState<string>('' as string);
const submitForm = useCallback(async (e: any) => {
e.preventDefault();
setMessagesList((messages) => [...messages, {message: `**[ME]** ${message}`}]);
setMessage('');
const messageResponse = await (await fetch('/api/message', {
method: 'POST',
body: JSON.stringify({message, id: url.current?.value, threadId}),
})).json();
if (!threadId) {
setThreadId(messageResponse.threadId);
}
setMessagesList((messages) => [...messages, {eventId: messageResponse.eventId}]);
}, [message, messagesList, url, threadId]);
return (
<div className="border border-black/50 rounded-2xl flex flex-col">
<div className="border-b border-b-black/50 h-[60px] gap-3 px-3 flex items-center">
<div>Assistant:</div>
<div>
<select ref={url} className="border border-black/20 rounded-xl p-2">
{list.filter(f => !f.pending).map(val => (
<option key={val.id} value={val.id}>{val.url}</option>
))}
</select>
</div>
</div>
<div className="flex-1 flex flex-col gap-3 py-3 w-full min-h-[500px] max-h-[1000px] overflow-y-auto overflow-x-hidden messages-list">
{messagesList.map((val, index) => (
<div key={index} className={`flex border-b border-b-black/20 pb-3 px-3`}>
<div className="w-full">
{val.message ? <Markdown>{val.message}</Markdown> : <MessageComponent eventId={val.eventId!} onFinish={setThreadId} />}
</div>
</div>
))}
</div>
<form onSubmit={submitForm}>
<div className="border-t border-t-black/50 h-[60px] gap-3 px-3 flex items-center">
<div className="flex-1">
<input value={message} onChange={(e) => setMessage(e.target.value)} className="read-only:opacity-20 outline-none border border-black/20 rounded-xl p-2 w-full" placeholder="Type your message here" />
</div>
<div>
<button className="border border-black/20 rounded-xl p-2 disabled:opacity-20" disabled={message.length < 3}>Send</button>
</div>
</div>
</form>
</div>
)
}
export const MessageComponent: FC<{eventId: string, onFinish: (threadId: string) => void}> = (props) => {
const {eventId} = props;
const { data, error } = useEventRunDetails(eventId);
useEffect(() => {
if (!data || error) {
return;
}
if (data.status === 'SUCCESS') {
props.onFinish(data.output.threadId);
}
}, [data]);
if (!data || error || data.status !== 'SUCCESS') {
return (
<div className="flex justify-end items-center pb-3 px-3">
<div className="animate-spin rounded-full h-3 w-3 border-t-2 border-b-2 border-blue-500" />
</div>
)
}
return <Markdown>{data.output.content}</Markdown>;
};
A few exciting things are going on here:
- When we create a new message, we immediately render it on the screen as "our" message. But when we send it to the server, we only get an event id back, since the reply doesn't exist yet, so we push the event id into the list instead. That's why we use
{val.message ? <Markdown>{val.message}</Markdown> : <MessageComponent eventId={val.eventId!} onFinish={setThreadId} />}
- We wrap our messages with a `Markdown` component. If you remember, we told ChatGPT in the previous steps to output everything in Markdown format so we can render it correctly.
- Once the event has finished processing, we update the thread id so that the following messages keep the context of the same conversation.
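The optimistic-update flow from the first point can be sketched as two pure helpers. This is illustrative only — the component inlines this logic with `setMessagesList` — but it shows the two state transitions: append the user's text plus a pending event id, then swap the pending entry for the assistant's reply once the run finishes:

```typescript
// Mirrors the Messages interface in the component: either text we already
// have, or an event id we are still waiting on.
interface Message {
  message?: string;
  eventId?: string;
}

// Optimistically append the user's message, followed by the pending event id.
function appendExchange(list: Message[], userText: string, eventId: string): Message[] {
  return [...list, { message: `**[ME]** ${userText}` }, { eventId }];
}

// Once the run succeeds, replace the pending entry with the assistant's reply.
function resolveEvent(list: Message[], eventId: string, content: string): Message[] {
  return list.map((m) => (m.eventId === eventId ? { message: content } : m));
}
```

Keeping the pending entry keyed by event id is what lets `<MessageComponent />` find and render the right reply when its run completes.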
And we are done 🎉
Let's connect! 🔌
As an open-source developer, you can join our community to contribute and engage with maintainers. Don't hesitate to visit our GitHub repository to contribute and create issues related to Trigger.dev.
The source for this tutorial is available here:
https://github.com/triggerdotdev/blog/tree/main/openai-assistant
Thank you for reading!