DEV Community



GPT Loop Prompting: How to Summarize Text that Exceeds the Token Limit

Never get blocked by max token restrictions again

Let's say you want GPT to summarize a transcript of a meeting, a chapter of a book, or any piece of long-form content that doesn't fit into the message length constraints of the prompt.

How do you overcome the token limitation to get your desired summary?

Grab your OpenAI key and let's do it with the following JavaScript code:

import axios from 'axios';

let text = "Insert your long transcript or piece of text here";
let splitSize = 2048; // The size of each chunk (in characters)

// Split the long-form text into an array of chunks
let chunks = [];
for (let i = 0; i < text.length; i += splitSize) {
    chunks.push(text.slice(i, i + splitSize));
}

// Initialize the conversation with the system instruction.
let messages = [{
    role: "system",
    content: "Your task is to summarize the transcript. You'll receive the transcript in many parts (identified by Part X/Y, e.g. 'Part 1/10'). With each part that you receive, summarize that part into bullet point format highlighting the key points and append it in the correct order to whatever the 'Existing Summary' is, should it exist."
}];

let totalParts = chunks.length;
let summarySoFar = '';

const generateSummary = async () => {
    // Iterate through chunks and generate GPT prompts
    for (let i = 0; i < chunks.length; i++) {
        let partNumber = i + 1;

        // Add a user message with the existing summary.
        messages.push({
            role: "user",
            content: `Existing Summary:\n${summarySoFar}`
        });

        // Add a user message with the transcript part (e.g. Part 1/10)
        messages.push({
            role: "user",
            content: `Transcript Part ${partNumber}/${totalParts}:\n${chunks[i]}`
        });

        // Make an API call to the GPT model of your choice
        const response = await axios.post('https://api.openai.com/v1/chat/completions', {
            model: 'gpt-3.5-turbo',
            messages: messages,
            temperature: 0.5
        }, {
            headers: {
                'Authorization': `Bearer ${process.env.OPENAI_KEY}`,
                'Content-Type': 'application/json'
            }
        });

        const partSummary = response.data.choices[0].message.content;

        // Update the existing summary with the newly generated summary
        summarySoFar += '\n' + partSummary;

        // Reset messages for the next chunk with the assistant's latest message.
        messages = [{ role: "assistant", content: partSummary }];
    }
    console.log('Final Summary:', summarySoFar);
};

generateSummary();


The way this works is as follows:

  1. We split the long-form text into an array of chunks.
  2. We loop over the array of chunks, feeding each one through GPT and sequentially building the summary with each iteration.
  3. In the algorithm, we start with a System prompt, giving GPT the high-level task and what to expect.
  4. For the User messages, we provide the existing state of the summary (which starts off as blank) along with the next chunk of the transcript.
  5. GPT generates a summarySoFar for us, which feeds into the next prompt.
  6. We repeat this process until we get the final summary output.
  7. (As a BONUS, I would recommend sending the final output into one more prompt for final processing. I excluded this from the code here, but simply take the final summarySoFar and pass that into a fresh GPT prompt with your own custom instructions.)
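That bonus step can be sketched as a single follow-up request. Below is one possible shape — `buildFinalPassRequest` and the instruction text are names I'm inventing for illustration, not part of the script above:

```javascript
// Build the payload for a final "polish" pass over the assembled summary.
// buildFinalPassRequest and the instruction wording are hypothetical --
// adapt them to your own formatting needs.
const buildFinalPassRequest = (summarySoFar, instructions) => ({
    model: 'gpt-3.5-turbo',
    temperature: 0.5,
    messages: [
        { role: "system", content: instructions },
        { role: "user", content: `Summary to refine:\n${summarySoFar}` }
    ]
});

const payload = buildFinalPassRequest(
    "- point one\n- point two",
    "Merge duplicate bullet points and group them under clear headings."
);
// Send `payload` with the same axios.post call used in the main loop.
```

Because it's a fresh prompt, GPT sees the whole summary at once and can deduplicate or restructure across chunk boundaries.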

To optimize this for your particular use case, I recommend adjusting the system prompt to your desired organization, formatting, style, etc., along with the temperature setting and choice of model (as expected, gpt-4 performs WAY better than gpt-3.5, but I left gpt-3.5 as the default since I don't want you to rack up unexpected usage charges :) ).
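One other tweak worth considering: the script above splits at fixed character offsets, which can cut a sentence in half right at a chunk boundary. Here's a rough sketch of boundary-aware chunking — `chunkOnBoundaries` is a helper I'm making up for this example, not part of the original code:

```javascript
// Split text into chunks of up to maxLen characters, breaking at sentence
// boundaries where possible so no chunk cuts a sentence in half.
// This is an illustrative sketch, not part of the original script.
const chunkOnBoundaries = (text, maxLen) => {
    // Greedily grab runs ending in ., !, or ? (plus trailing whitespace).
    const sentences = text.match(/[^.!?]+[.!?]*\s*/g) || [text];
    const chunks = [];
    let current = '';
    for (const sentence of sentences) {
        // Start a new chunk if adding this sentence would overflow.
        if (current && (current.length + sentence.length) > maxLen) {
            chunks.push(current);
            current = '';
        }
        current += sentence;
    }
    if (current) chunks.push(current);
    return chunks;
};
```

You can drop this in as a replacement for the `for` loop that builds `chunks`, keeping everything else the same.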

Otherwise, all you need to do is set your API key in the OPENAI_KEY environment variable (referenced as ${process.env.OPENAI_KEY} in the code) and assign your long-form content to the text variable at the top.

Happy Summarizing!
