DEV Community

CaptainPickles

Distributed web chat and NLP

Hi there! My name is Lucas Levesque and I am a French web development student. As part of a group project, I recently worked on a distributed web chat app that uses natural language processing (NLP) to classify and flag messages that contain hate speech or may be related to suicide. We built this app because we believe that it is important to create safe and supportive online communities, and we wanted to do our part to help prevent the spread of harmful or dangerous content. In this article, I will share more about our project and the challenges we faced while building the app. I hope that our work can inspire others to create similar tools to help create a safer and more supportive online environment.

What is a web chat

Web chat is a form of online communication in which users can exchange messages in real-time through a web browser. Web chat can be used for a variety of purposes, such as customer service, online education, or socializing with friends and family.

An example would be Slack.

But as developers, we want our app to scale, and for that we built it as a distributed application.

A distributed application (or distributed app) is a software application that runs on multiple devices or computers, often connected over a network or the internet. Distributed apps are designed to provide a single, cohesive user experience, even though the underlying software and data may be distributed across multiple devices or servers.

As one of the leads on the project, I was in charge of designing the architecture of the app and implementing the natural language processing (NLP) services. One of the main challenges we faced was how to effectively classify messages for hate speech and suicidal content. To address this challenge, I created two separate work queues: one for hate speech classification and one for suicidal content classification.

Architecture of our distributed app

In order to scale our chat app, we implemented a message queue using Redis, a popular in-memory data structure store. Specifically, we used Redis' PUB/SUB (publish-subscribe) feature, which allows clients to send and receive messages asynchronously.

To use Redis' PUB/SUB feature, we created two separate channels: one for hate speech classification and one for suicidal content classification. Whenever a user sent a message in the chat app, it was added to the appropriate work queue (the hate speech queue or the suicidal content queue) and then processed by the corresponding NLP service.
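To make the publishing side more concrete, here is a minimal sketch in Python. It is an illustration rather than the project's actual code: the channel names, the JSON payload shape, and the choice to send every message to both channels are assumptions. The `client` argument is expected to behave like a `redis.Redis` object from the redis-py package, which exposes `publish(channel, message)`.

```python
import json

# Hypothetical channel names; the real ones are not given in this article.
HATE_SPEECH_CHANNEL = "nlp:hate-speech"
SUICIDAL_CONTENT_CHANNEL = "nlp:suicidal-content"


def publish_for_classification(client, user_id, text):
    """Publish one chat message to both classification channels.

    `client` must expose publish(channel, message), as redis.Redis does.
    Returns the JSON payload that was sent (assumed payload shape).
    """
    payload = json.dumps({"user_id": user_id, "text": text})
    for channel in (HATE_SPEECH_CHANNEL, SUICIDAL_CONTENT_CHANNEL):
        client.publish(channel, payload)
    return payload


# Usage with redis-py (third-party, `pip install redis`), assuming a local server:
#   import redis
#   publish_for_classification(redis.Redis(), "user-42", "hello everyone")
```

Passing the client in as a parameter keeps the function easy to try out without a running Redis server.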

Redis' PUB/SUB feature let us process messages asynchronously and in parallel, which improved the speed and reliability of the app. It also let us add processing power by adding more nodes to the network, which helped us handle increases in traffic and data volume.
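On the consuming side, a worker could look roughly like the sketch below. Again, this is an assumption-laden illustration and not the project's code: `run_worker` and `handle` are hypothetical names, and messages are assumed to be JSON. The `pubsub` argument is expected to behave like the object returned by `redis.Redis().pubsub()` in redis-py, which exposes `subscribe()` and `listen()`. Keep in mind that with Pub/Sub every subscriber on a channel receives every message, so this sketch assumes one worker per channel.

```python
import json


def run_worker(pubsub, channel, handle):
    """Subscribe to `channel` and pass each decoded chat message to `handle`.

    `pubsub` must expose subscribe() and listen(), as the object returned
    by redis.Redis().pubsub() does. `handle` is a hypothetical callback
    that would run the classifier on the message.
    """
    pubsub.subscribe(channel)
    for item in pubsub.listen():
        # listen() also yields subscribe confirmations; skip those.
        if item["type"] != "message":
            continue
        # The payload is assumed to be JSON; json.loads accepts bytes or str.
        handle(json.loads(item["data"]))


# Usage with redis-py (third-party), assuming a local server:
#   import redis
#   run_worker(redis.Redis().pubsub(), "nlp:hate-speech", print)
```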

For the NLP component of the project, we decided to use the Hugging Face library, which provides access to a wide range of pre-trained NLP models. We found that these models provided excellent results and allowed us to quickly prototype and iterate on our classifiers.
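As a rough idea of how a pre-trained model can be called, here is a small sketch that assumes the `transformers` package and its `pipeline` helper. The function names `load_classifier` and `top_label` are hypothetical, and the model identifier is deliberately left as a parameter because it is not spelled out in this article.

```python
def load_classifier(model_name):
    """Build a text-classification pipeline for a model on the Hugging Face Hub."""
    # Imported lazily so the rest of the module works without the package.
    from transformers import pipeline  # third-party: pip install transformers

    return pipeline("text-classification", model=model_name)


def top_label(classifier, text):
    """Return (label, score) for the classifier's best prediction on `text`."""
    # For a single string, a text-classification pipeline returns a list
    # holding one dict with "label" and "score" keys.
    result = classifier(text)[0]
    return result["label"], result["score"]


# Usage (the model id is a placeholder to fill in):
#   clf = load_classifier("<model id>")
#   label, score = top_label(clf, "some chat message")
```

Because `top_label` only needs a callable, it can be tried with a stub before any model is downloaded.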

The hate speech classifier allowed us to ban users who post hate speech. The model classifies each message with one of three labels: "normal", "offensive", or "hate speech". If a message is classified as hate speech, we ban its author; if it is offensive, we show a warning before displaying the message; if it is normal, we do nothing.
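The three-way policy above can be sketched as a simple lookup. The `Action` enum and `moderation_action` function are hypothetical names used for illustration, and the label strings follow the ones quoted in this article; the exact strings a given model emits may differ.

```python
from enum import Enum


class Action(Enum):
    BAN_USER = "ban_user"
    WARN_BEFORE_SHOWING = "warn_before_showing"
    SHOW = "show"


# Label strings as quoted in the article (assumed to match the model output).
POLICY = {
    "hate speech": Action.BAN_USER,
    "offensive": Action.WARN_BEFORE_SHOWING,
    "normal": Action.SHOW,
}


def moderation_action(label):
    """Map a classifier label to a moderation action.

    Raises ValueError for labels outside the three known ones, so an
    unexpected model output is never silently treated as "normal".
    """
    try:
        return POLICY[label.strip().lower()]
    except KeyError:
        raise ValueError(f"unknown label: {label!r}") from None
```

Failing loudly on an unknown label is a design choice for this sketch: it avoids letting a renamed or misspelled label slip through as harmless.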

The suicidal content classifier allows us to offer help to people in need of support.

Hate speech classifier model

Suicidal speech classifier model

Overall, it was a challenging but rewarding project, and I am proud of what we were able to accomplish. I hope that our app can serve as a model for other developers looking to build similar tools to help create safer and more supportive online communities.

Git of the project
