PubNub Developer Relations for PubNub

Chat Moderation with OpenAI

Any application containing in-app chat needs some way to regulate and moderate the messages that users exchange.  Because it is not feasible to review all inappropriate content with human moderators, the moderation system must be automatic, and because users will frequently try to circumvent moderation, machine learning, generative AI, and large language models (LLMs), including GPT models such as GPT-3 and GPT-4, have become popular ways to moderate content.

Moderation is a complex topic, and PubNub offers various solutions to meet all of our developers’ use cases.

The OpenAI Moderation Endpoint

This article will look at OpenAI’s Moderation API, a REST API that uses artificial intelligence (AI) to determine whether the provided text contains potentially harmful terms.  The API is intended to let developers filter or remove harmful content; at the time of writing, it is provided free of charge, though it only supports English.
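Concretely, the endpoint accepts a POST request with a Bearer token and a JSON body whose `input` field holds the text to classify. Here is a minimal sketch of the request shape (the helper name `buildModerationRequest` is illustrative, not part of any SDK):

```javascript
//  Sketch of the HTTP request the Moderation endpoint expects: a POST with a
//  Bearer token and a JSON body whose `input` field is the text to classify.
//  `buildModerationRequest` is an illustrative helper, not part of any SDK.
function buildModerationRequest(messageText, apiKey) {
  return {
    url: 'https://api.openai.com/v1/moderations',
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'Authorization': `Bearer ${apiKey}`
    },
    body: JSON.stringify({ input: messageText })
  };
}
```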

The model behind the Moderation API will categorize the provided text as follows (taken from the API documentation):

  • Hate: Content that expresses, incites, or promotes hate based on race, gender, ethnicity, religion, nationality, sexual orientation, disability status, or caste. Hateful content aimed at non-protected groups (e.g., chess players) is harassment.

  • Hate / Threatening: Hateful content that also includes violence or serious harm towards the targeted group based on race, gender, ethnicity, religion, nationality, sexual orientation, disability status, or caste.

  • Harassment: Content that expresses, incites, or promotes harassing language towards any target.

  • Harassment / Threatening: Harassment content that also includes violence or serious harm towards any target.

  • Self-Harm: Content that promotes, encourages, or depicts acts of self-harm, such as suicide, cutting, and eating disorders.

  • Self-Harm / Intent: Content where the speaker expresses that they are engaging or intend to engage in acts of self-harm, such as suicide, cutting, and eating disorders.

  • Self-Harm / Instructions: Content that encourages performing acts of self-harm, such as suicide, cutting, and eating disorders, or that gives instructions or advice on how to commit such acts.

  • Sexual: Content meant to arouse sexual excitement, such as the description of sexual activity, or that promotes sexual services (excluding sex education and wellness).

  • Sexual / Minors: Sexual content that includes an individual who is under 18 years old.

  • Violence: Content that depicts death, violence, or physical injury.

  • Violence / Graphic: Content that depicts death, violence, or physical injury in graphic detail.

Results are provided within a JSON structure as follows (again, taken from the API documentation):

{
  "id": "modr-XXXXX",
  "model": "text-moderation-007",
  "results": [
    {
      "flagged": true,
      "categories": {
        "sexual": false,
        "hate": false,
        "harassment": false,
        "self-harm": false,
        "sexual/minors": false,
        "hate/threatening": false,
        "violence/graphic": false,
        "self-harm/intent": false,
        "self-harm/instructions": false,
        "harassment/threatening": true,
        "violence": true
      },
      "category_scores": {
        //  Out of scope for this article
      }
    }
  ]
}
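The most important field is `results[0].flagged`, which is `true` when any category was tripped. A small sketch of how you might pull the offending category names out of a parsed response (`flaggedCategories` is an illustrative helper):

```javascript
//  Given a parsed Moderation API response, return the list of category names
//  whose value is true. `flaggedCategories` is an illustrative helper.
function flaggedCategories(moderation) {
  const result = moderation.results[0];
  if (!result || !result.flagged) return [];
  return Object.keys(result.categories).filter(name => result.categories[name]);
}
```

For the sample response above, this would return `harassment/threatening` and `violence`.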

Calling the OpenAI Moderation API from PubNub

Integrating the Moderation API into any PubNub application is easy using PubNub Functions, as this step-by-step tutorial shows.

Functions allow you to capture real-time events happening on the PubNub platform, such as messages being sent and received; you can then write custom serverless code within those functions to modify, re-route, augment, or filter messages as needed.

You will need to use the “Before Publish or Fire” event type; this function type will be invoked before the message is delivered and must finish executing before the message is released to be delivered to its recipients.  The PubNub documentation provides more background and detail, but in summary: “Before Publish or Fire” is a synchronous call that can alter a message or its payload.
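The shape of such a handler is simple: it receives the in-flight message, may inspect or modify it, and releases it with `request.ok()`. A minimal sketch follows (in the PubNub portal the handler is written as `export default request => { ... }`; the `moderated` key is purely illustrative):

```javascript
//  Minimal sketch of a "Before Publish or Fire" handler. In a real PubNub
//  Function this is written as `export default request => { ... }`.
const beforePublishOrFire = request => {
  //  The message can be inspected or modified here before delivery;
  //  the `moderated` key is illustrative only
  request.message.moderated = true;
  //  Release the (possibly modified) message to its recipients
  return request.ok();
};
```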

Create the PubNub Function

  1. Log into the PubNub admin portal and select the application and keyset for the app you want to moderate.

  2. Select ‘Functions’, which can be found under the ‘Build’ tab.

  3. Select ‘+ CREATE NEW MODULE’ and give the module a name and description.

  4. Select ‘+ CREATE NEW FUNCTION’ and give the function a name.

  5. For the event type, select ‘Before Publish or Fire’.

  6. For the Channel name, enter * (this demo will use *, but your application may choose to specify only the channels you want to moderate).

Having created the PubNub Function, you need to provide your OpenAI API key as a secret.

  1. Select ‘MY SECRETS’ and create a new key with the name ‘OPENAI_API_KEY’.

  2. Generate an OpenAI API key and ensure that the key has access to the Moderation API.

  3. Provide the generated API key as the value of the PubNub Function secret you just created.

The body of the PubNub function will look as follows:

const xhr = require('xhr');
const vault = require('vault');

export default request => {
  if (request.message && request.message.text) {
    const messageText = request.message.text;
    return getOpenaiApiKey().then(apiKey => {
      return openAIModeration(messageText, apiKey).then(aiResponse => {
        //  Append the moderation response to the message
        request.message.openAiModeration = aiResponse;
        //  If the message was harmful, you might also choose to report the message here.
        return request.ok();
      });
    });
  }
  return request.ok();
};

let OPENAI_API_KEY = null;
function getOpenaiApiKey() {
  //  Use the cached key if it has already been retrieved
  if (OPENAI_API_KEY) {
    return Promise.resolve(OPENAI_API_KEY);
  }
  //  Otherwise, fetch the key from the vault and cache it
  return vault.get('OPENAI_API_KEY').then(apiKey => {
    OPENAI_API_KEY = apiKey;
    return apiKey;
  });
}

function openAIModeration(messageText, apiKey) {
  const url = 'https://api.openai.com/v1/moderations';
  const httpOptions = {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'Authorization': `Bearer ${apiKey}`
    },
    body: JSON.stringify({
      input: messageText
    }),
    timeout: 9500,
    retries: 0
  };
  return xhr.fetch(url, httpOptions)
    .then(resp => JSON.parse(resp.body))
    .catch(err => {
      //  On error or timeout, log the failure and deliver the message
      //  unmoderated (null), so the client-side flagged check stays safe
      console.log(err);
      return null;
    });
}

The function itself is quite straightforward. For each message received:

  • Pass the message text to the OpenAI moderation function.

  • Append the returned moderation object as a new key on the message (JSON) object.

Save your function and make sure your module is started.
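With the module running, a message published as `{ "text": "..." }` arrives at subscribers with the moderation verdict attached, along these lines (the message text and category values are illustrative, and only a subset of the categories object is shown for brevity):

```json
{
  "text": "an example chat message",
  "openAiModeration": {
    "id": "modr-XXXXX",
    "model": "text-moderation-007",
    "results": [
      {
        "flagged": false,
        "categories": {
          "harassment": false,
          "violence": false
        }
      }
    ]
  }
}
```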

Latency

The PubNub Function you have just created will be executed synchronously every time a message is sent, and that message will not be delivered until the function has finished executing.  Since the function calls an external API, the delivery latency will depend on how quickly the call to OpenAI returns, which is outside PubNub’s control and could be quite high.

There are several ways to mitigate any degradation in the user experience. Most deployments provide immediate feedback to the sender that the message was sent and then rely on read receipts to indicate that the message is delivered (or reported). 

Update the Client Application

Let’s consider what would be required to handle the moderation payload within your application using the Chat Demo, which is a React application that uses the PubNub Chat SDK to show most of the features of a typical chat app.  

Set up an attribute to track whether or not a potentially harmful message should be displayed:

 const [showHarmfulMessage, setShowHarmfulMessage] = useState(false)

And add some logic to not show a potentially harmful message by default, in this case within message.tsx:

{(!message.content.openAiModeration ||
  !message.content.openAiModeration?.results[0].flagged ||
  showHarmfulMessage) &&
  message.content.text}
{!showHarmfulMessage &&
  message.content.openAiModeration?.results[0].flagged && (
    <span>
      Message contains potentially harmful content{' '}
      <span
        className="text-blue-400 cursor-pointer"
        onClick={() => setShowHarmfulMessage(true)}
      >
        (Reveal)
      </span>
    </span>
  )}


Note that these changes are not present in the hosted version of the Chat Demo, but the README contains full instructions to build and run it yourself with your own keyset.

Wrap up

And there you have it: a quick, easy, and free way to add moderation to your application using OpenAI.

To learn more about integrating OpenAI with PubNub, check out our other resources on PubNub.com.

Feel free to reach out to the DevRel team at devrel@pubnub.com or contact our Support team for help with any aspect of your PubNub development.

How can PubNub help you?

This article was originally published on PubNub.com

Our platform helps developers build, deliver, and manage real-time interactivity for web apps, mobile apps, and IoT devices.

The foundation of our platform is the industry's largest and most scalable real-time edge messaging network. With over 15 points-of-presence worldwide supporting 800 million monthly active users, and 99.999% reliability, you'll never have to worry about outages, concurrency limits, or any latency issues caused by traffic spikes.

Experience PubNub

Check out the Live Tour to understand the essential concepts behind every PubNub-powered app in less than 5 minutes.

Get Setup

Sign up for a PubNub account for immediate access to PubNub keys for free

Get Started

The PubNub docs will get you up and running, regardless of your use case or SDK
