DEV Community

Cover image for Reinforcement Learning in 5 Minutes
Temp-insta
Temp-insta

Posted on

Reinforcement Learning in 5 Minutes

We want our AI model to make right decisions, and one powerful approach to achieve this is through reinforcement learning. Reinforcement Learning is a subfield of machine learning that involves training an agent to make decisions by selecting actions from its action space within a specific environment, with the goal of maximizing cumulative rewards over time.

How Do We Define Reinforcement Learning?

Reinforcement Learning is a subfield of machine learning that teaches an agent how to choose an action from its action space, within a particular environment, in order to maximize rewards over time.

Reinforcement Learning has four essential elements:

  • Agent. The program you train, with the aim of doing a job you specify.
  • Environment. The world, real or virtual, in which the agent performs actions.
  • Action. A move made by the agent, which causes a status change in the environment.
  • Rewards. The evaluation of an action, which can be positive or negative.

Supervised, Unsupervised, and Reinforcement Learning: What are the Differences?

Difference #1: Static Vs.Dynamic

The goal of supervised and unsupervised learning is to search for and learn about patterns in training data, which is quite static. RL, on the other hand, is about developing a policy that tells an agent which action to choose at each step — making it more dynamic.

Difference #2: No Explicit Right Answer

In supervised learning, the right answer is given by the training data. In Reinforcement Learning, the right answer is not explicitly given: instead, the agent needs to learn by trial and error. The only reference is the reward it gets after taking an action, which tells the agent when it is making progress or when it has failed.

Difference #3: RL Requires Exploration

A Reinforcement Learning agent needs to find the right balance between exploring the environment, looking for new ways to get rewards, and exploiting the reward sources it has already discovered. In contrast, supervised and unsupervised learning systems take the answer directly from training data without having to explore other answers.

Difference #4: RL is a Multiple-Decision Process

Reinforcement Learning is a multiple-decision process: it forms a decision-making chain through the time required to finish a specific job. Conversely, supervised learning is a single-decision process: one instance, one prediction.

Reference: https://www.epicprogrammer.org/reinforcement-learning/

Top comments (0)