This is a very basic guide to help you get started with HTTP streaming using LangChain and Express.
Server
Let's start by creating a basic Express app:
// server.js
import * as dotenv from "dotenv";
import express from "express";
import cors from "cors";
dotenv.config();
const app = express();
app.use(cors({ origin: "*" }));
app.listen(3333, () => {
console.log("Server running on http://localhost:3333");
});
We will use dotenv to manage the environment variables our program needs. For this example we need an API key from OpenAI.
If you don't know how to get one, have a look at the official docs.
Add the OpenAI key to .env:
OPENAI_API_KEY="..."
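Note that dotenv.config() silently does nothing if .env is missing, so it can help to fail fast at startup instead of hitting a confusing authentication error on the first request. A minimal sketch, using a hypothetical helper hasOpenAIKey that is not part of the original code:

```javascript
// Hypothetical helper: returns true when an OpenAI key is configured.
function hasOpenAIKey(env = process.env) {
  return typeof env.OPENAI_API_KEY === "string" && env.OPENAI_API_KEY.length > 0;
}

// Call this after dotenv.config() so a missing key surfaces at startup.
if (!hasOpenAIKey()) {
  console.error("OPENAI_API_KEY is not set; add it to .env");
}
```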
Then we can add a /chat handler to our server.
// ...
import { ChatOpenAI } from "langchain/chat_models/openai";
import { HumanChatMessage } from "langchain/schema";
const chat = new ChatOpenAI({
temperature: 0.9,
openAIApiKey: process.env.OPENAI_API_KEY,
});
app.get("/chat", async (req, res) => {
const message = req.query.message;
if (message && typeof message === "string") {
const response = await chat.call([
new HumanChatMessage(message),
]);
res.json(response);
} else {
res.status(400).json({ error: "No message provided" });
}
});
This implementation will feel slow because the server waits for the entire response to be generated before sending anything back to the client.
To fix this we can tell LangChain to respond with a stream, which can be intercepted using the handleLLMNewToken callback. On every new token we will use res.write to stream the response to the client.
app.get("/chat", async (req, res) => {
const message = req.query.message;
if (typeof message === "string" && message) {
// Send tokens as plain text as soon as they are written
res.setHeader("Content-Type", "text/plain; charset=utf-8");
// To stream the response we need access to the Response object.
// For this reason we need to create a new ChatOpenAI instance
// for each request.
const chat = new ChatOpenAI({
temperature: 0.9,
openAIApiKey: process.env.OPENAI_API_KEY,
streaming: true,
callbacks: [
{
handleLLMNewToken(token) {
console.log("New token:", token);
res.write(token);
},
},
],
});
// We need to await the call to ensure that the
// connection is closed after the whole response
// is sent.
await chat.call([new HumanChatMessage(message)]);
res.end();
} else {
res.status(400).json({ error: "No message provided" });
}
});
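The res.write mechanics can also be tried in isolation, without LangChain or OpenAI. The following is a self-contained sketch (none of it comes from the original code; it uses plain node:http on an OS-assigned port) that streams three chunks and reads them back with the same reader pattern the clients below use:

```javascript
import { createServer } from "node:http";
import { once } from "node:events";

// A server that writes its response in three separate chunks
const server = createServer(async (req, res) => {
  res.writeHead(200, { "Content-Type": "text/plain; charset=utf-8" });
  for (const token of ["Hello", " ", "world"]) {
    res.write(token);
    // Small delay so each write tends to be flushed as its own chunk
    await new Promise((resolve) => setTimeout(resolve, 50));
  }
  res.end();
});

server.listen(0); // let the OS pick a free port
await once(server, "listening");
const { port } = server.address();

// Read the streamed body chunk by chunk, as a client would
const response = await fetch(`http://localhost:${port}/`);
const reader = response.body.getReader();
const decoder = new TextDecoder("utf-8");
let text = "";
while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  text += decoder.decode(value, { stream: true });
}
console.log(text); // "Hello world"
server.close();
server.closeAllConnections?.();
```

Chunk boundaries are not guaranteed by HTTP, so the client may receive the tokens coalesced; only the concatenated text is deterministic.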
The server part is complete, now let's see how clients can consume the response.
Command Line Client
This function accepts a message and a callback that will be called with every token in the stream. If we use it in Node.js with readline we can build a very simple chatbot in the terminal:
#!/usr/bin/env node
import { createInterface } from "node:readline/promises";
async function chat(
message: string,
callback: (token: string) => void
) {
// Send request to the server
const url = new URL("http://localhost:3333/chat");
url.searchParams.append("message", message);
const response = await fetch(url);
if (!response.body) throw new Error("No response body");
// Read the response body as a stream
const reader = response.body.getReader();
const decoder = new TextDecoder("utf-8");
let text = "";
while (true) {
const { done, value } = await reader.read();
if (done) break;
// stream: true keeps multi-byte characters that span
// chunk boundaries from being corrupted
const token = decoder.decode(value, { stream: true });
text += token;
callback(token);
}
return text;
}
const readline = createInterface({
input: process.stdin,
output: process.stdout,
});
console.log();
console.log("Welcome to the LangChain chat demo!");
console.log(
"Type a message and press enter to chat with the AI."
);
console.log();
// We read the user input from stdin
readline.on("line", async (line) => {
await chat(line, (token) => {
// We write the response tokens to stdout as they come in
process.stdout.write(token);
});
// Move to a new line once the full response has arrived
process.stdout.write("\n");
});
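One subtlety when reading the body as a stream: decoding each chunk with a fresh TextDecoder (or without { stream: true }) corrupts any multi-byte character that happens to straddle a chunk boundary. A small self-contained sketch of the difference, splitting the two UTF-8 bytes of "é" across two "chunks":

```javascript
const bytes = new TextEncoder().encode("é"); // two bytes in UTF-8: 0xC3 0xA9

// Decoding each half independently yields replacement characters
const broken =
  new TextDecoder("utf-8").decode(bytes.slice(0, 1)) +
  new TextDecoder("utf-8").decode(bytes.slice(1));

// A single decoder with { stream: true } buffers the partial sequence
const decoder = new TextDecoder("utf-8");
const correct =
  decoder.decode(bytes.slice(0, 1), { stream: true }) +
  decoder.decode(bytes.slice(1));

console.log(broken);  // "\uFFFD\uFFFD"
console.log(correct); // "é"
```

For plain English tokens this rarely bites, but it matters as soon as the model emits accented characters, CJK text, or emoji.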
Browser Client
If you need to create an interface for the browser, here is a reference implementation using React.
/* eslint-disable no-constant-condition */
import { useCallback, useState } from "react";
export default function Chat() {
const [prompt, setPrompt] = useState("");
const [lastMessage, setLastMessage] = useState("");
const handleSubmit = useCallback(async (e) => {
// Prevent the default full-page form submission
e.preventDefault();
// Reset UI
setPrompt("");
setLastMessage("");
// Send request
const url = new URL("/chat", "http://localhost:3333");
url.searchParams.append("message", prompt);
const response = await fetch(url);
// Stream response
if (!response.body) throw new Error("No response body");
const reader = response.body.getReader();
const decoder = new TextDecoder("utf-8");
while (true) {
const { done, value } = await reader.read();
if (done) break;
// stream: true keeps multi-byte characters that span chunks intact
const text = decoder.decode(value, { stream: true });
setLastMessage((prevText) => prevText + text);
}
}, [setLastMessage, setPrompt, prompt]);
return (
<section>
<form onSubmit={handleSubmit}>
<label htmlFor="message">Type a question</label>
<input
id="message"
value={prompt}
onChange={(e) => setPrompt(e.target.value)}
/>
<button type="submit">Send</button>
</form>
<p>{lastMessage}</p>
</section>
);
}