Named Entity Recognition, also known as NER, is a Natural Language Processing (NLP) task that identifies and classifies named entities in a text. Named entities are real-world objects that have been assigned a name. They include people's names, locations, works of art, organizations, days, dates, and many others.
Named Entity Recognition is usually used to extract key information from a text while performing tasks such as topic identification. It can also be used on its own when the goal is simply to pull the important information out of a text.
In this article, I will explain how to perform Named Entity Recognition using spaCy.
To follow along, you will need:
- spaCy installed
- Python installed
- Basic knowledge of Python programming
spaCy is an open-source NLP library used for a variety of NLP tasks.
It has a built-in component for identifying and classifying named entities.
First, let's import the spaCy library
import spacy
Then load the "en_core_web_sm" model and assign it to a variable named nlp
nlp = spacy.load("en_core_web_sm")
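If the model package has not been downloaded yet, 'spacy.load' raises an OSError. The following is a minimal sketch, not part of the original walkthrough, of wrapping the load step so that the error points at the download command. The helper names 'install_hint' and 'load_model' are hypothetical.

```python
def install_hint(model_name):
    """Build the shell command that downloads a spaCy model package."""
    return f"python -m spacy download {model_name}"


def load_model(model_name="en_core_web_sm"):
    """Load a spaCy pipeline, turning a missing-model error into an actionable message."""
    import spacy  # imported here so install_hint() works even before spaCy is set up

    try:
        return spacy.load(model_name)
    except OSError as err:
        # spacy.load raises OSError when the model package cannot be found
        raise RuntimeError(
            f"Model {model_name!r} is not installed. Try: {install_hint(model_name)}"
        ) from err


print(install_hint("en_core_web_sm"))
# prints: python -m spacy download en_core_web_sm
```

With this in place, 'nlp = load_model()' could stand in for the direct 'spacy.load' call.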
Let's create a sample text from which we will extract named entities
sample_text = "Over 200 youth from Kisumu County in Kenya, have today gotten a chance to take part in a Golf programme by Safaricom held at Lolwe Grounds."
Then create a spaCy document by passing the sample text to nlp()
doc = nlp(sample_text)
To extract the named entities from the document, we will use its '.ents' attribute
print(doc.ents)
Output: (200, Kisumu County, Kenya, today, Safaricom, Lolwe Grounds)
Let's now print each entity together with the category (label) it has been assigned.
for ent in doc.ents:
    print(ent, ent.label_)
Kisumu County GPE
Lolwe Grounds FAC
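For longer texts it is often handier to group the entities by label than to print them one per line. The sketch below assumes only that each entity exposes '.text' and '.label_', as spaCy's entity spans do. The 'FakeEnt' stand-in and the 'group_by_label' helper are hypothetical; the stand-in lets the example run without a model and uses the two pairs shown in the output above.

```python
from collections import defaultdict, namedtuple


def group_by_label(ents):
    """Map each label to the list of entity texts carrying it, in document order."""
    grouped = defaultdict(list)
    for ent in ents:
        grouped[ent.label_].append(ent.text)
    return dict(grouped)


# Stand-in for spaCy entity spans (hypothetical), so the sketch runs without a model.
FakeEnt = namedtuple("FakeEnt", ["text", "label_"])
sample_ents = [FakeEnt("Kisumu County", "GPE"), FakeEnt("Lolwe Grounds", "FAC")]

print(group_by_label(sample_ents))
# prints: {'GPE': ['Kisumu County'], 'FAC': ['Lolwe Grounds']}
```

With a real document, the call would be 'group_by_label(doc.ents)'.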
spaCy provides a function, 'spacy.explain()', that takes a label/category and returns a short description of it.
Let's try it out with the labels we got
spacy.explain("CARDINAL")
Output: Numerals that do not fall under another type
spacy.explain("GPE")
Output: Countries, cities, states
spacy.explain("DATE")
Output: Absolute or relative dates or periods
spacy.explain("FAC")
Output: Buildings, airports, highways, bridges, etc.
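Rather than looking up labels one at a time, the lookup can be applied to every distinct label found in a document. This is a rough sketch: 'describe_labels' is a hypothetical helper, and the small 'stand_in_glossary' dictionary stands in for 'spacy.explain' so the example runs without spaCy. Like 'spacy.explain', its '.get' returns None for a label it does not know.

```python
def describe_labels(labels, explain, missing="(no description)"):
    """Map each distinct label to its description, keeping first-seen order."""
    return {label: explain(label) or missing for label in dict.fromkeys(labels)}


# Stand-in lookup (hypothetical); with spaCy installed, pass spacy.explain instead.
stand_in_glossary = {
    "GPE": "Countries, cities, states",
    "FAC": "Buildings, airports, highways, bridges, etc.",
}

# "MADE_UP" is a deliberately unknown label that shows the fallback text.
labels = ["GPE", "FAC", "GPE", "MADE_UP"]
for label, description in describe_labels(labels, stand_in_glossary.get).items():
    print(f"{label}: {description}")
```

With a real document, this might look like 'describe_labels((ent.label_ for ent in doc.ents), spacy.explain)'.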
displaCy is spaCy's built-in visualizer for dependency parses and named entities.
It can highlight the named entities directly in the text.
Let's import displaCy
from spacy import displacy
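Then, we will create the visual. In a Jupyter notebook this displays the highlighted text inline; in a plain script it returns the HTML markup as a string.
displacy.render(doc, style="ent")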
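Where no notebook or browser is available, the same idea can be approximated in plain text. The sketch below is not displaCy; it is a hypothetical 'annotate' helper that rewrites each entity as [text|LABEL] from character offsets. The offsets are derived here with 'str.index' for the two entities whose labels appear earlier in this article, and the sketch assumes the spans do not overlap.

```python
def annotate(text, spans):
    """Return text with each (start, end, label) span rewritten as [entity|LABEL]."""
    pieces, cursor = [], 0
    for start, end, label in sorted(spans):
        pieces.append(text[cursor:start])  # untouched text before the entity
        pieces.append(f"[{text[start:end]}|{label}]")
        cursor = end
    pieces.append(text[cursor:])  # remainder after the last entity
    return "".join(pieces)


def span_of(text, phrase, label):
    """Locate the first occurrence of phrase and return its (start, end, label)."""
    start = text.index(phrase)
    return (start, start + len(phrase), label)


sample_text = (
    "Over 200 youth from Kisumu County in Kenya, have today gotten a chance "
    "to take part in a Golf programme by Safaricom held at Lolwe Grounds."
)
spans = [
    span_of(sample_text, "Kisumu County", "GPE"),
    span_of(sample_text, "Lolwe Grounds", "FAC"),
]
print(annotate(sample_text, spans))
```

With a spaCy document, the spans could be built as '[(ent.start_char, ent.end_char, ent.label_) for ent in doc.ents]'.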
Named Entity Recognition is one of the methods that can be used to gain insights from a text while carrying out NLP tasks. It has several use cases, such as recommendation systems, efficient search algorithms, and customer support.