Using vector embeddings for context
ChatGPT is a great resource for learning and growth when it knows about the technology you are using.
Initially, I wanted ChatGPT to help me build Hilla applications. However, since it was trained on data only up to 2021, it made up answers that did not correspond to reality.
Another complication is that Hilla supports both React and Lit for the front end, so I needed to ensure that the replies took the relevant framework into account.
Here's my approach to building an assistant that supplies the most recent documentation to ChatGPT as context so it can give relevant answers.
Key Concept: Embeddings
ChatGPT, like other large language models, has a limited context size that must fit your question, relevant background information, and the response. For example, gpt-3.5-turbo
has a maximum of 4,096 tokens, roughly equivalent to 3,000 words. Including only the most helpful pieces of documentation in the prompt is therefore critical to getting meaningful replies.
Embeddings are a practical way to identify these critical documentation sections. An embedding encodes the meaning of a text as a vector that represents a location in a multidimensional space. Texts with similar meanings end up close together, while texts with different meanings end up further apart.
The idea is similar to a color picker. Each color can be represented by a three-element vector of red, green, and blue values. Colors with similar vectors look alike, whereas colors with very different vectors look distinct.
For this article, it's enough to know that the OpenAI API can convert text into embeddings. If you want to understand how embeddings work in more depth, this article is a good place to start.
Once you've generated embeddings for your content, you can quickly find the most relevant pieces to include in the prompt by locating the sections whose embeddings are closest to the embedding of the query.
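To make "closest" concrete: similarity between two embeddings is usually measured with cosine similarity. The snippet below is only an illustration with made-up three-element vectors; real OpenAI embeddings have 1,536 dimensions, and in this project the vector database does the comparison for us.

// Cosine similarity between two vectors: close to 1 = similar direction/meaning
function cosineSimilarity(a, b) {
  const dot = a.reduce((sum, value, i) => sum + value * b[i], 0);
  const magnitude = v => Math.sqrt(v.reduce((sum, value) => sum + value * value, 0));
  return dot / (magnitude(a) * magnitude(b));
}

// Toy example with 3-dimensional "embeddings"
const query = [0.9, 0.1, 0.2];
const sectionA = [0.85, 0.15, 0.25]; // similar meaning -> high score
const sectionB = [0.1, 0.9, 0.4];    // different meaning -> low score

console.log(cosineSimilarity(query, sectionA)); // ~1.00
console.log(cosineSimilarity(query, sectionB)); // ~0.28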
Overview: Supplying Documentation as Context for ChatGPT
Here are the high-level steps required to have ChatGPT use your documentation as context when answering questions:
Creating Embeddings for Your Documentation
- Divide your documentation into smaller chunks, such as by heading, then generate an embedding (vector) for each one.
- Save the embedding, source text, and other metadata in a vector database.
Providing Responses with Documentation as Context
- Create an embedding for the user's question.
- Using the embedding, search the vector database for the N documentation sections most relevant to the question.
- Create a prompt that instructs ChatGPT to answer the question using only the provided documentation.
- Call the OpenAI API to generate a completion for the prompt.
In the sections that follow, I'll go into more detail about how I implemented each of these steps.
Tools used
The project uses the OpenAI API for embeddings and chat completions, Pinecone as the vector database, Asciidoctor and JSDOM for processing the documentation, and Next.js for the application itself.
Source code
I'll only highlight the most important parts of the code below. You can find the full source code on GitHub.
Documentation Processing
The Hilla documentation is written in AsciiDoc. The steps required to convert it into embeddings are:
- Process the AsciiDoc files with Asciidoctor to resolve code snippets and other includes.
- Split the resulting document into sections based on the HTML document structure.
- Convert the content to plain text to save tokens.
- Split sections into smaller chunks if needed.
- Create an embedding vector for each chunk of text.
- Save the embedding vectors and the source text in Pinecone.
Processing AsciiDoc
async function processAdoc(file, path) {
  console.log(`Processing ${path}`);

  const frontMatterRegex = /^---[\s\S]+?---\n*/;

  const namespace = path.includes('articles/react') ? 'react' : path.includes('articles/lit') ? 'lit' : '';
  if (!namespace) return;

  // Remove front matter. The JS version of asciidoctor doesn't support removing it.
  const noFrontMatter = file.replace(frontMatterRegex, '');

  // Run through asciidoctor to get includes
  const html = asciidoctor.convert(noFrontMatter, {
    attributes: {
      root: process.env.DOCS_ROOT,
      articles: process.env.DOCS_ARTICLES,
      react: namespace === 'react',
      lit: namespace === 'lit'
    },
    safe: 'unsafe',
    base_dir: process.env.DOCS_ARTICLES
  });

  // Extract sections
  const dom = new JSDOM(html);
  const sections = dom.window.document.querySelectorAll('.sect1');

  // Convert section html to plain text to save on tokens
  const plainText = Array.from(sections).map(section => convert(section.innerHTML));

  // Split section content further if needed, filter out short blocks
  const docs = await splitter.createDocuments(plainText);
  const blocks = docs.map(doc => doc.pageContent)
    .filter(block => block.length > 200);

  await createAndSaveEmbeddings(blocks, path, namespace);
}
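processAdoc() receives a file's contents and its path. The driver that walks the documentation tree isn't shown above; a minimal sketch of what it could look like, assuming the docs live under DOCS_ARTICLES and the helper names match, is:

import { promises as fs } from 'fs';
import path from 'path';

// Hypothetical driver: recursively walk the docs folder and process every .adoc file.
// The real repository may organize this differently.
async function processAllDocs(dir) {
  const entries = await fs.readdir(dir, { withFileTypes: true });
  for (const entry of entries) {
    const fullPath = path.join(dir, entry.name);
    if (entry.isDirectory()) {
      await processAllDocs(fullPath);
    } else if (entry.name.endsWith('.adoc')) {
      const file = await fs.readFile(fullPath, 'utf8');
      await processAdoc(file, fullPath);
    }
  }
}

await processAllDocs(process.env.DOCS_ARTICLES);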
Create embeddings and save them
async function createAndSaveEmbeddings(blocks, path, namespace) {
  // OpenAI suggests removing newlines for better performance when creating embeddings.
  // Don't remove them from the source.
  const withoutNewlines = blocks.map(block => block.replace(/\n/g, ' '));
  const embeddings = await getEmbeddings(withoutNewlines);
  const vectors = embeddings.map((embedding, i) => ({
    id: nanoid(),
    values: embedding,
    metadata: {
      path: path,
      text: blocks[i]
    }
  }));
  await pinecone.upsert({
    upsertRequest: {
      vectors,
      namespace
    }
  });
}
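The pinecone object above is a handle to a Pinecone index. Its initialization isn't shown in the snippet; with the Pinecone JavaScript client that was current at the time, it would look roughly like this (the index name and environment variables are assumptions):

import { PineconeClient } from '@pinecone-database/pinecone';

const client = new PineconeClient();
await client.init({
  apiKey: process.env.PINECONE_API_KEY,
  environment: process.env.PINECONE_ENVIRONMENT
});

// The rest of the code calls upsert/query on this index handle
const pinecone = client.Index(process.env.PINECONE_INDEX);

Note that the vectors are stored in a react or lit namespace, which is what later allows the search to be restricted to the framework the user has selected.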
Get embeddings from OpenAI
export async function getEmbeddings(texts) {
  const response = await openai.createEmbedding({
    model: 'text-embedding-ada-002',
    input: texts
  });
  return response.data.data.map((item) => item.embedding);
}
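The search code later calls a createEmbedding() function for a single string. It isn't listed in this article, but it can be a thin wrapper around getEmbeddings(); a minimal sketch, assuming the same newline handling as during indexing:

export async function createEmbedding(text) {
  // Embed a single string; strip newlines as recommended by OpenAI
  const [embedding] = await getEmbeddings([text.replace(/\n/g, ' ')]);
  return embedding;
}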
Searching with context
So far, we've divided the documentation into manageable chunks and stored them in a vector database. When a user asks a question, we need to do the following:
- Create an embedding for the question.
- Search the vector database for the ten documentation sections most relevant to the question.
- Create a prompt with as many documentation sections as fit into 1,536 tokens, leaving 2,560 tokens for the response.
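The vector search itself happens in findSimilarDocuments(), which is used by getMessagesWithContext() below but not listed in this article. With the same Pinecone client as above, a sketch of it could look like this (the parameter and field names are assumptions):

async function findSimilarDocuments(embedding, topK, namespace) {
  // Query Pinecone for the closest document sections in the given namespace
  const result = await pinecone.query({
    queryRequest: {
      vector: embedding,
      topK,
      includeMetadata: true,
      namespace
    }
  });
  // Each match carries the original text and path in its metadata
  return result.matches ?? [];
}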
async function getMessagesWithContext(messages: ChatCompletionRequestMessage[], frontend: string) {
  // Ensure that there are only messages from the user and assistant, trim input
  const historyMessages = sanitizeMessages(messages);

  // Send all messages to OpenAI for moderation.
  // Throws exception if flagged -> should be handled properly in a real app.
  await moderate(historyMessages);

  // Extract the last user message to get the question
  const [userMessage] = historyMessages
    .filter(({role}) => role === ChatCompletionRequestMessageRoleEnum.User)
    .slice(-1);

  // Create an embedding for the user's question
  const embedding = await createEmbedding(userMessage.content);

  // Find the most similar documents to the user's question
  const docSections = await findSimilarDocuments(embedding, 10, frontend);

  // Get at most 1536 tokens of documentation as context
  const contextString = await getContextString(docSections, 1536);

  // The messages that set up the context for the question
  const initMessages: ChatCompletionRequestMessage[] = [
    {
      role: ChatCompletionRequestMessageRoleEnum.System,
      content: codeBlock`
        ${oneLine`
          You are Hilla AI. You love to help developers!
          Answer the user's question given the following
          information from the Hilla documentation.
        `}
      `
    },
    {
      role: ChatCompletionRequestMessageRoleEnum.User,
      content: codeBlock`
        Here is the Hilla documentation:
        """
        ${contextString}
        """
      `
    },
    {
      role: ChatCompletionRequestMessageRoleEnum.User,
      content: codeBlock`
        ${oneLine`
          Answer all future questions using only the above
          documentation and your knowledge of the
          ${frontend === 'react' ? 'React' : 'Lit'} library
        `}
        ${oneLine`
          You must also follow the below rules when answering:
        `}
        ${oneLine`
          - Do not make up answers that are not provided
            in the documentation
        `}
        ${oneLine`
          - If you are unsure and the answer is not explicitly
            written in the documentation context, say
            "Sorry, I don't know how to help with that"
        `}
        ${oneLine`
          - Prefer splitting your response into
            multiple paragraphs
        `}
        ${oneLine`
          - Output as markdown
        `}
        ${oneLine`
          - Always include code snippets if available
        `}
      `
    }
  ];

  // Cap the messages to fit the max token count, removing earlier messages if necessary
  return capMessages(
    initMessages,
    historyMessages
  );
}
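getContextString() isn't shown above. It concatenates the retrieved sections until the 1,536-token budget is used up. A minimal sketch, assuming the matches returned by Pinecone and a GPT tokenizer such as gpt-3-encoder (the field names and tokenizer choice are assumptions):

import { encode } from 'gpt-3-encoder';

// Pack as many documentation sections as fit within maxTokens
async function getContextString(docSections: any[], maxTokens: number) {
  let tokenCount = 0;
  let contextText = '';

  for (const section of docSections) {
    const text = section.metadata?.text ?? '';
    tokenCount += encode(text).length;
    if (tokenCount > maxTokens) break;
    contextText += `${text.trim()}\n---\n`;
  }

  return contextText;
}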
When a user asks a question, we use getMessagesWithContext()
to get the messages that need to be sent to ChatGPT. We then call the OpenAI API to get a completion and stream the response back to the client.
export default async function handler(req: NextRequest) {
  // All the non-system messages up until now along with
  // the framework we should use for the context.
  const {messages, frontend} = (await req.json()) as {
    messages: ChatCompletionRequestMessage[],
    frontend: string
  };
  const completionMessages = await getMessagesWithContext(messages, frontend);
  const stream = await streamChatCompletion(completionMessages, MAX_RESPONSE_TOKENS);
  return new Response(stream);
}
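On the client, the streamed response can be read incrementally and rendered as it arrives. The endpoint path and callback below are placeholders; this is only a sketch of consuming the stream returned by the handler above:

import { ChatCompletionRequestMessage } from 'openai';

async function askQuestion(
  messages: ChatCompletionRequestMessage[],
  frontend: string,
  onChunk: (partialAnswer: string) => void
) {
  // '/api/chat' is a placeholder for wherever the handler above is mounted
  const response = await fetch('/api/chat', {
    method: 'POST',
    headers: {'Content-Type': 'application/json'},
    body: JSON.stringify({messages, frontend})
  });

  // Read the streamed completion chunk by chunk as it arrives
  const reader = response.body!.getReader();
  const decoder = new TextDecoder();
  let answer = '';

  while (true) {
    const {done, value} = await reader.read();
    if (done) break;
    answer += decoder.decode(value);
    onChunk(answer); // e.g. update the assistant's message in the chat UI
  }

  return answer;
}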
Thank you for sticking with me till the end. You're a fantastic reader!
Ahsan Mangal
I hope you found it informative and engaging. If you enjoyed this content, please consider following me for more articles like this in the future. Stay curious and keep learning!