DEV Community

Purity-E

Named Entity Recognition with Spacy

Introduction

Named Entity Recognition, also known as NER, is a Natural Language Processing (NLP) task that identifies and classifies named entities in a text. Named entities are real-world objects that have been assigned a name. They include people's names, locations, works of art, organizations, days and dates, among many others.

Named Entity Recognition is usually used to extract key information from a text as part of a larger task, such as topic identification. It can also be used on its own when the goal is simply to pull the important information out of a text.

In this article, I am going to explain how to perform Named Entity Recognition using Spacy.

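Prerequisites

  • Python installed
  • Spacy installed (pip install spacy)
  • The "en_core_web_sm" model downloaded (python -m spacy download en_core_web_sm)
  • Basic knowledge of Python programming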

What is Spacy?

Spacy is an open-source NLP library that is used for performing various NLP tasks.
It has a built-in mechanism that is used for identifying and classifying named entities.

NER using Spacy

First, let's import the Spacy library

import spacy

Then load the "en_core_web_sm" model and assign it to a variable named nlp

nlp = spacy.load("en_core_web_sm")
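
If the model has not been downloaded yet, spacy.load() fails with an OSError. The sketch below is one possible way to turn that into a friendlier message. The function name load_english_pipeline is hypothetical and only used for illustration here.

```python
import spacy


def load_english_pipeline(name="en_core_web_sm"):
    """Load a Spacy pipeline by name, with a clearer hint if it is missing.

    Hypothetical helper for illustration; it only wraps spacy.load().
    """
    try:
        return spacy.load(name)
    except OSError as err:
        # spacy.load() raises OSError when the named model cannot be found.
        raise RuntimeError(
            f"Model '{name}' is not installed. "
            f"Try running: python -m spacy download {name}"
        ) from err
```

With this in place, nlp = load_english_pipeline() behaves like the spacy.load() call above once the model is installed.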

Let's create a sample text which we will extract named entities from

sample_text = "Over 200 youth from Kisumu County in Kenya, have today gotten a chance to take  part in a Golf programme by Safaricom held at Lolwe Grounds."

Then create a Spacy document by passing the sample text into nlp()

doc = nlp(sample_text)

To extract the named entities from the document we will use '.ents'

print(doc.ents)

Output: (200, Kisumu County, Kenya, today, Safaricom, Lolwe Grounds)

Let's now print each entity together with the category (label) it has been assigned.

for ent in doc.ents:
    print(ent, ent.label_)

Output
200 CARDINAL
Kisumu County GPE
Kenya GPE
today DATE
Safaricom ORG
Lolwe Grounds FAC
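
Once entities carry labels, a common next step is to collect them by label. The sketch below shows one way this might look. To keep it runnable without a downloaded model, it uses a blank pipeline and marks the entities by hand on a shortened version of the sample text. The helpers group_entities_by_label and mark are hypothetical names, and with "en_core_web_sm" you would pass the doc created earlier instead.

```python
from collections import defaultdict

import spacy


def group_entities_by_label(doc):
    """Map each entity label to the list of entity texts that carry it."""
    grouped = defaultdict(list)
    for ent in doc.ents:
        grouped[ent.label_].append(ent.text)
    return dict(grouped)


def mark(doc, phrase, label):
    """Hypothetical helper: labelled span for the first occurrence of phrase."""
    start = doc.text.index(phrase)
    return doc.char_span(start, start + len(phrase), label=label)


# Stand-in for the article's doc: a blank pipeline has no NER component,
# so the entities are set by hand (an assumption made for this sketch).
nlp = spacy.blank("en")
doc = nlp("Over 200 youth from Kisumu County in Kenya joined a programme by Safaricom.")
doc.ents = [
    mark(doc, "Kisumu County", "GPE"),
    mark(doc, "Kenya", "GPE"),
    mark(doc, "Safaricom", "ORG"),
]

grouped = group_entities_by_label(doc)
print(grouped)
```

Grouping this way makes it easy to answer questions such as "which places are mentioned?" by reading a single key, for example grouped["GPE"].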

The explain() function

Spacy provides a function, 'spacy.explain()', that takes a label/category and returns a short description of it.
It is a quick way to look up what a label means.

Let's try it out with the labels we got

spacy.explain("CARDINAL")

Output: Numerals that do not fall under another type

spacy.explain("GPE")

Output: Countries, cities, states

spacy.explain("DATE")

Output: Absolute or relative dates or periods

spacy.explain("FAC")

Output: Buildings, airports, highways, bridges, etc.
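
Instead of calling explain() once per label, the lookups can be done in a loop. This is a minimal sketch that covers the labels from our output, including ORG, which was not explained above. The variable names are only illustrative.

```python
import spacy

# Labels that appeared in the output for our sample text.
labels = ["CARDINAL", "GPE", "DATE", "ORG", "FAC"]

# spacy.explain() returns a short description string for a known label.
descriptions = {label: spacy.explain(label) for label in labels}

for label, description in descriptions.items():
    print(f"{label}: {description}")
```

Outside a notebook, the print() call is needed because a bare spacy.explain(...) expression only displays its result in an interactive session.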

Visualizing Named Entities using Displacy

Displacy is Spacy's built-in visualizer for dependency parses and named entities.
With the "ent" style, it highlights the named entities directly in the text.

Let's import Displacy

from spacy import displacy

Then, we will create the visual

displacy.render(doc, style="ent", jupyter=True)

Output

Displacy output
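
The call above assumes a Jupyter notebook. Outside a notebook, displacy.render() can return the markup as a string when jupyter=False is passed. The sketch below illustrates this. As an assumption made to keep it runnable without a downloaded model, it marks one entity by hand on a blank pipeline, where normally you would pass the doc created earlier.

```python
import spacy
from spacy import displacy

# Stand-in doc: a blank pipeline has no NER, so one entity is set by hand
# (an assumption for this sketch, not how the article's doc was built).
nlp = spacy.blank("en")
doc = nlp("A Golf programme by Safaricom was held in Kenya.")
start = doc.text.index("Safaricom")
doc.ents = [doc.char_span(start, start + len("Safaricom"), label="ORG")]

# jupyter=False makes render() return the HTML instead of displaying it;
# page=True wraps the markup in a full HTML page.
html = displacy.render(doc, style="ent", jupyter=False, page=True)
print(html[:200])
```

The returned string can then be written to a file, for example with pathlib.Path("entities.html").write_text(html, encoding="utf-8"), and opened in a browser. The file name here is arbitrary.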

Conclusion

Named Entity Recognition is one of the methods that can be used to gain insights from a text while carrying out NLP tasks. It has several use cases, such as recommendation systems, more efficient search, and customer support.

In this article, we looked at Named Entity Recognition using Spacy. However, Spacy is not the only library that can be used for NER. Other open-source options include NLTK and Stanford NER.
