DEV Community

Daniel Brian

FAKE NEWS DETECTION WITH PYTHON

Fake news is one of the biggest problems on online social platforms. We can use machine learning to detect it. In this article, I will walk you through the task of fake news detection with machine learning using Python. Here is the basic workflow.

# Load the necessary libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import warnings
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB


After importing the relevant libraries, we have to load in our data.

# Read the news dataset into a pandas DataFrame
data=pd.read_csv('/content/drive/MyDrive/news.csv')

After loading the data, let us examine it. Throwing all the data into a predictive model without first understanding it is usually not recommended. Looking at the data first often helps us improve our model.

# Check the first five rows (all columns are shown)
data.head()
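
Besides `head()`, a few more quick checks can help at this stage. The snippet below is only a sketch: `toy` is a made-up stand-in for `news.csv`, and it assumes the real file has `title`, `text` and `label` columns, as the rest of this article does.

```python
import pandas as pd

# Hypothetical stand-in for news.csv (assumed columns: title, text, label)
toy = pd.DataFrame({
    "title": ["Budget passes", "Aliens endorse mayor",
              "Rain expected", "Miracle cure found"],
    "text": ["The council approved the budget.",
             "Visitors from space back the mayor.",
             "Forecasters expect rain on Friday.",
             "One weird trick cures everything."],
    "label": ["REAL", "FAKE", "REAL", "FAKE"],
})

# Number of rows and columns
print(toy.shape)

# Data type of each column
print(toy.dtypes)

# How many articles fall into each class; a strong imbalance would
# make plain accuracy a misleading score later on
label_counts = toy["label"].value_counts()
print(label_counts)
```

On the real dataset the same calls would be made on `data` instead of `toy`.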

Missing values can really hurt our model. It's crucial to check whether any data is missing and to fill in the gaps. In our case, we didn't have any missing values.

# Count the missing values in each column
data.isnull().sum()
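
Our dataset had nothing missing, but here is a minimal sketch of what handling gaps could look like if it did. The `toy` frame and its gaps are invented for illustration, and dropping unlabeled rows while filling empty text with an empty string is just one reasonable choice, not the only one.

```python
import pandas as pd

# Hypothetical data with deliberate gaps (not taken from news.csv)
toy = pd.DataFrame({
    "title": ["Budget passes", None, "Rain expected"],
    "text": ["The council approved the budget.",
             "Visitors from space back the mayor.",
             None],
    "label": ["REAL", "FAKE", None],
})

# Same check as above: number of missing values per column
missing_per_column = toy.isnull().sum()
print(missing_per_column)

# A row without a label cannot be used for supervised training, so drop it
cleaned = toy.dropna(subset=["label"]).copy()

# Missing titles or texts are replaced by an empty string
cleaned[["title", "text"]] = cleaned[["title", "text"]].fillna("")
print(cleaned)
```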


Machine learning algorithms cannot work directly with text (categorical values), so we have to convert it to numeric data.

We will convert our 'label' column into 1 and 0 using a lambda function: 'REAL' becomes 1 and everything else becomes 0.

data['label'] = data['label'].apply(lambda x: 1 if x == 'REAL' else 0)
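
One thing to keep in mind is that the lambda turns every value other than the exact string 'REAL' into 0, including typos or differently cased labels. As a sketch of a stricter alternative, the snippet below uses `Series.map` with an explicit dictionary, so unexpected labels show up as NaN instead of silently becoming "fake". The `labels` series is made up for illustration.

```python
import pandas as pd

# Hypothetical labels, including a lowercase variant and an unexpected value
labels = pd.Series(["REAL", "FAKE", "real", "satire"])

# The lambda approach: anything that is not exactly 'REAL' becomes 0
encoded_lambda = labels.apply(lambda x: 1 if x == "REAL" else 0)
print(encoded_lambda.tolist())

# Normalise the case, then map known labels only;
# values missing from the dictionary become NaN
encoded_map = labels.str.upper().map({"REAL": 1, "FAKE": 0})
print(encoded_map.tolist())

# Count the labels that could not be mapped
print(encoded_map.isna().sum())
```

If the last count is not zero, it is worth looking at those rows before training.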

For our other categorical features, we will use the get_dummies function from the pandas library.

dummies=pd.get_dummies(data[['title','text']])

Afterwards we split our data into x and y variables.

#we split our data into x and y
x=data.drop('label',axis=1)
y=data['label']

Lastly, we will fit our model using MultinomialNB, a Naive Bayes classifier, and score it on the held-out test data.

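# These imports repeat the ones at the top so this cell can run on its own
from sklearn.naive_bayes import MultinomialNB
from sklearn.model_selection import train_test_split

# Seed NumPy's global random state so the split is reproducible
np.random.seed(42)
# Hold out 20% of the rows for testing
x_train,x_test,y_train,y_test=train_test_split(dummies,y,test_size=0.2)
model=MultinomialNB()
model.fit(x_train,y_train)
# Mean accuracy on the held-out test set
model.score(x_test,y_test)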
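
For comparison, here is a sketch of the same train and score step built on word counts rather than dummies. It is a variation under stated assumptions, not the code used above: the `toy` data is invented, `make_pipeline` chains `CountVectorizer` and `MultinomialNB`, and `random_state` and `stratify` are passed directly to `train_test_split`.

```python
import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Hypothetical toy data; 1 = real, 0 = fake, as in the label step above
toy = pd.DataFrame({
    "text": [
        "council approves city budget",
        "aliens secretly run the city council",
        "forecasters expect rain on friday",
        "miracle pill cures every disease overnight",
        "parliament votes on new tax bill",
        "celebrity clone spotted on the moon",
        "central bank holds interest rates",
        "secret trick doctors do not want you to know",
        "local school opens new library",
        "shocking proof the earth is hollow",
    ],
    "label": [1, 0, 1, 0, 1, 0, 1, 0, 1, 0],
})

# stratify keeps the real/fake ratio the same in both splits;
# random_state makes the split reproducible
x_train, x_test, y_train, y_test = train_test_split(
    toy["text"], toy["label"],
    test_size=0.2, random_state=42, stratify=toy["label"],
)

# The vectorizer is fitted on the training text only,
# then reused unchanged on the test text
pipeline = make_pipeline(CountVectorizer(), MultinomialNB())
pipeline.fit(x_train, y_train)

# Mean accuracy and raw predictions on the held-out rows
accuracy = pipeline.score(x_test, y_test)
predictions = pipeline.predict(x_test)
print(accuracy, predictions)
```

With only ten rows the score itself means very little; the point is the shape of the workflow.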


So this is how we can train a machine learning model for the task of fake news detection using the Python programming language. I hope you liked this article on fake news detection with machine learning using Python.
