Ganesh Kumar

Posted on Jul 25, 2019

AI to Control “The Fake News” Market!

#ai #fake #machinelearning

Fake news is a major reason why the Internet is going in the wrong direction: whether it comes from dodgy websites or from fake social media profiles, fake news is still one of the biggest (if not the biggest) problems of the internet today. When Mark Zuckerberg testified before Congress, he argued, in essence, that stopping the whole fake news market requires an AI capable of understanding which content comes from relevant sources and which pieces of information do not report true facts. The question right now is: are we already at a point where AI can understand what is "true" and what is not? Let's try to break it down in a simpler way.

How does it work?

The whole process is based on what this post calls ultrafast split testing, that is, rapid cross-checking. When people want to find out whether an article is legit, they usually run a simple Google search; if many articles report the same news in the same way, it is probably true. The AI will be required to do this instantly, looking at multiple results as ranked by Google's algorithm and weighting each website by a trust signal such as Trust Flow (a Majestic SEO metric rather than a Google one). Once all the results are processed, the AI decides whether the article agrees with what relevant sources, like the NY Times for example, are reporting.
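To make the cross-checking idea concrete, here is a minimal sketch in Python. It assumes the search results have already been fetched and that each one carries a trust score between 0 and 1 and a flag saying whether it reports the story the same way. The `SearchResult` type, the source names and the scores are all hypothetical, and this is not how any real search engine scores pages.

```python
from dataclasses import dataclass


@dataclass
class SearchResult:
    source: str    # hypothetical source name
    trust: float   # assumed trust score in [0, 1]
    agrees: bool   # does this result report the story the same way?


def corroboration_score(results: list[SearchResult]) -> float:
    """Trust-weighted share of results that agree with the article."""
    total_trust = sum(r.trust for r in results)
    if total_trust == 0:
        return 0.0  # no trusted evidence either way
    agreeing_trust = sum(r.trust for r in results if r.agrees)
    return agreeing_trust / total_trust


# Made-up results for one story.
results = [
    SearchResult("trusted-daily.example", 0.9, True),
    SearchResult("wire-service.example", 0.8, True),
    SearchResult("dodgy-blog.example", 0.1, False),
]
print(f"corroboration: {corroboration_score(results):.2f}")
```

Weighting by trust means that one disagreeing low-trust blog barely moves the score, while disagreement from a trusted source would pull it down sharply.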

Correlating the Linguistic Features

One of the most reliable ways to detect fake news is to examine the linguistic features that articles from a given source have in common. Those features include sentiment, complexity, and structure. Sources that often produce fake news are more likely to use words that are exaggerated, subjective, and full of emotion.

Another common feature of fake news is a sensational headline. Because headlines are the key to capturing the audience's attention, they have become a tool for attracting interest from a wider population. Fake news almost always uses a sensational headline, since its authors are only partially limited by actual facts.
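A few of these signals can be turned into numbers quite easily. The sketch below is only an illustration: the `EMOTIVE_WORDS` list is a made-up placeholder (a real system would learn its vocabulary from labeled data), and the three features are an assumption about what "sensational" might look like, not a validated feature set.

```python
import re

# Hypothetical word list; a real system would learn this from labeled data.
EMOTIVE_WORDS = {"shocking", "unbelievable", "outrageous", "miracle", "destroyed"}


def headline_features(headline: str) -> dict[str, float]:
    """Return simple sensationalism signals for a headline."""
    words = re.findall(r"[A-Za-z']+", headline)
    n = len(words) or 1  # avoid division by zero on empty headlines
    return {
        # share of words that appear in the emotive word list
        "emotive_ratio": sum(w.lower() in EMOTIVE_WORDS for w in words) / n,
        # share of words written entirely in capitals (ignoring one-letter words)
        "caps_ratio": sum(w.isupper() and len(w) > 1 for w in words) / n,
        # raw count of exclamation marks
        "exclamations": float(headline.count("!")),
    }


print(headline_features("SHOCKING miracle cure!!"))
print(headline_features("City council approves school budget"))
```

Features like these would normally feed a trained classifier rather than be used as a verdict on their own.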

There is already a type of AI (machine learning algorithms) widely deployed to fight spam email. Those algorithms analyze the text and determine if the email comes from a genuine person or if it is a mass-distributed message, designed to sell something or spread a political message. Refining and adjusting those algorithms would make it possible to examine and compare the title and the text of a post with the actual content of the article. This can be another clue in assessing the post’s accuracy.
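One simple way to compare a title with the content behind it is word overlap. The sketch below assumes that a headline whose key words never show up in the body is suspicious; the stopword list is a tiny hypothetical one, and a real system would use something far more robust than exact word matching.

```python
import re

# Tiny hypothetical stopword list, just enough for the example.
STOPWORDS = {"the", "a", "an", "of", "in", "on", "to", "and", "is", "for"}


def content_words(text: str) -> set[str]:
    """Lowercased words of the text, minus stopwords."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}


def headline_support(headline: str, body: str) -> float:
    """Share of the headline's content words that also appear in the body."""
    head = content_words(headline)
    if not head:
        return 0.0
    return len(head & content_words(body)) / len(head)


print(headline_support("Mayor opens new bridge",
                       "The mayor opens the bridge in town"))
print(headline_support("Aliens land in Paris",
                       "The city council met on Tuesday"))
```

A low score does not prove the post is fake, it is just one more clue to combine with the others.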

Because machine learning can learn from past examples, the best approach is to train the algorithms on past articles already proven to be fake. Doing so makes it possible to identify the most apparent commonalities and build a foundation for predicting the likelihood that an article is fake.
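Training on labeled past articles can be sketched with a small Naive Bayes classifier, the same family of algorithm long used in spam filters. This is a toy version written with the standard library only. The four training sentences are invented, the `NaiveBayes` class is a name chosen for this example, and it assumes at least one article of each label has been seen before `prob_fake` is called.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayes:
    """Two-class bag-of-words classifier with add-one smoothing."""

    def __init__(self) -> None:
        self.word_counts = {"fake": Counter(), "real": Counter()}
        self.doc_counts = Counter()

    def train(self, text: str, label: str) -> None:
        self.doc_counts[label] += 1
        self.word_counts[label].update(tokenize(text))

    def _log_score(self, tokens: list[str], label: str, vocab_size: int) -> float:
        counts = self.word_counts[label]
        total = sum(counts.values())
        # log prior + sum of smoothed log likelihoods
        score = math.log(self.doc_counts[label] / sum(self.doc_counts.values()))
        for token in tokens:
            score += math.log((counts[token] + 1) / (total + vocab_size))
        return score

    def prob_fake(self, text: str) -> float:
        tokens = tokenize(text)
        vocab = set(self.word_counts["fake"]) | set(self.word_counts["real"])
        log_fake = self._log_score(tokens, "fake", len(vocab))
        log_real = self._log_score(tokens, "real", len(vocab))
        # turn the two log scores into a probability
        return 1 / (1 + math.exp(log_real - log_fake))


# Invented training examples, for illustration only.
model = NaiveBayes()
model.train("shocking miracle cure doctors hate", "fake")
model.train("you won't believe this shocking secret", "fake")
model.train("city council approves budget for schools", "real")
model.train("central bank holds interest rates steady", "real")

print(model.prob_fake("shocking secret cure"))
print(model.prob_fake("council approves school budget"))
```

With real data the training set would contain thousands of articles, and the output would be treated as a likelihood to combine with other signals, not a final judgment.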

Weighing the Facts

Weighing the facts that a news story relies on is another important aspect. Artificial intelligence has developed to a stage where it is possible to extract the facts in an article (a Natural Language Processing engine can go through the headline and the subject of the story, including the geolocation) and compare them with the facts from other sites covering the same subject. The facts can then be weighed against each other, which adds another dimension to the credibility of the story.

In crypto, weighing the facts could also backfire, because in many cases the initial report comes from within a crypto project itself. When the news spreads, the untrue parts of the story are replicated too. That’s why it is important to add another crucial aspect of news assessment: keeping good track of source reputation.
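Assuming an NLP step has already pulled the facts out of each article as simple key/value pairs (that extraction is the hard part and is not shown here), the weighing itself can be sketched in a few lines. The `weigh_facts` function and the sample facts are hypothetical.

```python
def weigh_facts(article: dict[str, str],
                others: list[dict[str, str]]) -> dict[str, float]:
    """For each fact in the article, the share of other reports that
    mention the same fact and give the same value."""
    scores = {}
    for key, value in article.items():
        reported = [other[key] for other in others if key in other]
        if reported:  # facts nobody else mentions are left unscored
            scores[key] = reported.count(value) / len(reported)
    return scores


# Invented facts extracted from one article and three other reports.
article = {"location": "Paris", "casualties": "12"}
others = [
    {"location": "Paris", "casualties": "3"},
    {"location": "Paris"},
    {"casualties": "3"},
]
print(weigh_facts(article, others))
```

Here the location is confirmed by every report that mentions it, while the casualty figure is contradicted, which is exactly the kind of discrepancy worth flagging.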

Source Reputation

Focusing on the news sources themselves is a very important aspect of assessing the news. Machine learning algorithms have already been successful in examining the accuracy and political bias of news sources.

Artificial intelligence can also be used to find correlations with a source’s Wikipedia page, which can be examined and rated based on various criteria. For example, a longer Wikipedia description of a source is associated with higher credibility. Furthermore, words like “extreme” or “conspiracy” are often used when describing an unreliable source. Another thing to look at is the source’s URL text. A less reliable source is more likely to have a URL with lots of special characters and complex subdirectories.
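Both kinds of signal are easy to extract. The sketch below uses Python's standard `urllib.parse.urlparse` for the URL and a plain word count for the description. What counts as a "special" character and the two red-flag words are assumptions made for this example, and the sample URLs are invented.

```python
import re
from urllib.parse import urlparse

# Hypothetical red-flag vocabulary taken from the two examples in the text.
RED_FLAG_WORDS = {"extreme", "conspiracy"}


def url_features(url: str) -> dict[str, int]:
    """Count subdirectory depth and unusual characters in a source URL."""
    parsed = urlparse(url)
    segments = [s for s in parsed.path.split("/") if s]
    # Letters, digits, dots, slashes, colons and hyphens count as ordinary.
    special = re.findall(r"[^A-Za-z0-9./:-]", parsed.netloc + parsed.path)
    return {"path_depth": len(segments), "special_chars": len(special)}


def description_features(description: str) -> dict[str, int]:
    """Length of a source description and how many red-flag words it has."""
    words = re.findall(r"[a-z]+", description.lower())
    return {
        "length_words": len(words),
        "red_flags": sum(w in RED_FLAG_WORDS for w in words),
    }


print(url_features("https://news.example/world"))
print(url_features("http://real-news-4u.example/a/b/c/win_$$$.html"))
print(description_features("A site known for conspiracy theories and extreme views"))
```

None of these counts means much alone, but they are cheap to compute and can be fed into a classifier together with the other signals.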

Keeping a good track record of news sources is also very important, as it is necessary to constantly update the source reputation. Every piece of news should influence the overall source score, for it is important to assess every situation in a quick and accurate fashion.
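One possible way to let every piece of news influence the overall source score is an exponential moving average. This is an assumption about how such a score could be kept, not a description of an existing system, and the `weight` of 0.1 is an arbitrary choice.

```python
def update_reputation(current: float, article_was_accurate: bool,
                      weight: float = 0.1) -> float:
    """Blend the latest article's outcome into a reputation score in [0, 1].

    A higher weight makes the score react faster to recent articles.
    """
    observation = 1.0 if article_was_accurate else 0.0
    return (1 - weight) * current + weight * observation


# A source starts at a neutral 0.5 and publishes three articles.
score = 0.5
for accurate in (True, True, False):
    score = update_reputation(score, accurate)
    print(f"{score:.3f}")
```

The update is a single multiplication and addition per article, so it stays fast even when scores are refreshed constantly.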

AI as the Creator of Fake News

One of the biggest challenges of using artificial intelligence to combat fake news is the arms race with itself. Artificial intelligence is already used to create incredible “deepfakes” (photos and video in which someone’s face is replaced or footage is manipulated, making it appear as if the person said something they never said). Even smartphone apps are capable of this kind of manipulation, making the technology accessible to nearly anyone.

Researchers have already developed artificial intelligence capable of recognizing manipulated images and video. For example, through video magnification, it is possible to detect human pulse patterns to confirm whether a person is real or computer generated. This is just the beginning, as the technology on both sides is only going to get better.

What about Google?

As mentioned in the split testing paragraph, Google's algorithm determines which websites have good and reliable content, and because of that the AI responsible for reporting fake news must be "Google trained". Following the crawlers' criteria, all pages that list news about sensitive subjects such as politics, medicine, war etc. must be inspected first, to see whether they conflict with what is reported elsewhere.

Human Intelligence is Still Crucial

Humans will still play an important role in the process of news assessment. There are complex cases where humans will have to work together with technology to efficiently address the situation. The evolution of artificial intelligence should reduce the number of such situations, but it is likely human intervention will still be required for quite some time.

Audience awareness and critical thinking are additional aspects of human intelligence. People should be encouraged to always investigate information rather than simply sharing it. Sharing gives credibility to an article: people who know you personally and trust you are more likely to believe the shared post and won’t necessarily question its factuality.

The good news is that the audience exposed to factual reporting is much more likely to differentiate between real and fake information. Therefore, a lot can be done just by sharing true information as much as possible.

Once it's found

If the AI finds a viral article online with distorted information, it will be required to report it to the relevant authorities. Currently, this is done manually, and that's why many such articles slip through while they keep collecting shares and comments. A fast, automated AI would be a game changer here.

Some Data

During the last year, interest in the subject of fake news has risen steadily. The fake news market currently fluctuates between 70.8 thousand and 118 thousand clicks per month on Google Search and has over 251.2 thousand mentions on Twitter per month. In a recent poll, 64% of the people interviewed said that fake news caused "a great deal of confusion", as included in the infographic below.

Is it possible right now?

Let's put it straight: this kind of AI technology can already be built, managed and developed, since the engineering process is basically the same as in mobile app development. The difficulty lies with the platforms, Facebook, Twitter and Tumblr (the last of these after the recent scandal with Russian accounts spreading fake information on the micro-blogging platform), since each one has its own architecture and its own reporting process for content that does not follow its guidelines.

It's a startup thing

Not many companies are trying to develop an AI that can handle an insane amount of traffic while constantly split testing against many other results to guarantee both veracity and user experience for a news article. The brightest one, without any doubt, is Factmata: a London-based startup backed by Mark Cuban. Factmata has attracted not only Craigslist founder Craig Newmark but also Twitter co-founder Biz Stone. It has been listed among the top startups to watch in 2018.

Fake news infographic: numbers

Conclusions

AI will most likely become central to news and content administration in the near future, following Google's and Facebook's guidelines, terms and conditions. What we should expect in the next 5 years is a user-friendly interface that tells us which articles are dodgy and where they are.

Top comments (2)

Vicente G. Reyes • Jul 25 '19

Interesting!

However, the media sometimes lies about the truth. How would the AI know if it's fake news or not?

Ganesh Kumar • Jul 25 '19

"To know your enemy, you must become your enemy."
To detect fake news, this AI/ML should first learn to write it using an “adversarial” system, wherein one part of the model generates content and another rates how convincing it is. If the content doesn’t meet a threshold, the generator tries again, and eventually it learns what is convincing and what isn’t. Of course, some NLP is also needed!