DEV Community

Cover image for Three Ways to Decode Text in NLP

Three Ways to Decode Text in NLP

jamescalam profile image James Briggs ・1 min read

One of the often-overlooked parts of sequence generation in natural language processing (NLP) is how we select our output tokens — otherwise known as decoding.

You may be thinking — we select a token/word/character based on the probability of each token assigned by our model.

This is half-true — in language-based tasks, we typically build a model which outputs a set of probabilities to an array where each value in that array represents the probability of a specific word/token.

At this point, it might seem logical to select the token with the highest probability? Well, not really — this can create some unforeseen consequences — as we will see soon.

When we are selecting a token in machine-generated text, we have a few alternative methods for performing this decode — and options for modifying the exact behavior too.

In this article we will explore three different methods for selecting our output token, these are:

Greedy Decoding
Random Sampling
Beam Search

It’s pretty important to understand how each of these works — often-times in language applications, the solution to a poor output can be a simple switch between these four methods.

Discussion (0)

Editor guide