You might have read about reinforcement learning (RL) when browsing through stories about AlphaGo (the algorithm that taught itself to play the game of Go and beat an expert human player) and might have found it fascinating.
However, because the subject is inherently complex and doesn’t seem that promising from a business point of view, you might not have thought it worth exploring deeply.
Well, it turns out that the idea that RL lacks practical benefits is a misconception; there are actually quite a few ways companies can use the technology right now.
In this post, we’ll list possible reinforcement learning applications and explain without technical jargon how RL works in general.
So, in conventional supervised learning, as per our recent post, we have input/output (x/y) pairs (i.e., labeled data) that we use to train machines. Knowing the correct result for every input, we let the algorithm determine a function that maps each x to its y, and we keep correcting the model every time it makes a prediction or classification mistake (for example, by doing backpropagation and tweaking the function). We continue this kind of training until the results the algorithm produces are satisfactory.
In conventional unsupervised learning, we have data without labels, and we introduce the dataset to our algorithm hoping that it’ll uncover some hidden structure within it.
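To make the "predict, measure the mistake, tweak the function" loop concrete, here is a minimal sketch in plain Python. It is an illustration only: the toy data, the straight-line model, and the names `train`, `lr`, and `epochs` are our assumptions, and it uses simple gradient descent rather than full backpropagation through a neural network.

```python
# Toy labeled data (an assumption for illustration): every input x has a known output y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]


def train(xs, ys, lr=0.05, epochs=2000):
    """Fit y ~ w * x + b by repeatedly nudging w and b to shrink the squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Prediction mistakes on every labeled example with the current w and b.
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        # Gradients of the mean squared error with respect to w and b.
        grad_w = 2.0 / n * sum(e * x for e, x in zip(errors, xs))
        grad_b = 2.0 / n * sum(errors)
        # "Tweak the function": step against the gradient.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


w, b = train(xs, ys)
print(f"learned function: y = {w:.3f} * x + {b:.3f}")
```

The loop stops after a fixed number of passes here; in practice you would stop once the error is small enough to count as "satisfactory."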
Reinforcement learning solves a different kind of problem. In RL, there’s an agent that interacts with a certain environment, thus changing its state, and receives rewards (or penalties) for its actions. Its goal is to find, by trying different actions and comparing the results, the patterns of actions that yield the most reward points.
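The agent/environment/reward loop can be sketched with tabular Q-learning, one common RL algorithm (the post itself does not prescribe a specific one). Everything below is a hypothetical toy setup: a five-cell corridor where the agent starts in cell 0 and earns a reward of 1 for reaching cell 4. For simplicity the agent picks actions at random while learning; practical agents usually mix random exploration with the actions that currently look best.

```python
import random

N_STATES = 5        # corridor cells 0..4; cell 4 is the goal (toy assumption)
ACTIONS = (-1, +1)  # index 0 = move left, index 1 = move right


def step(state, action):
    """Environment: apply the action and return (next_state, reward, done)."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def train(episodes=500, alpha=0.5, gamma=0.9, seed=0):
    """Learn a table q[state][action] of expected future reward by trial and error."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            a = rng.randrange(len(ACTIONS))  # try an action at random
            next_state, reward, done = step(state, ACTIONS[a])
            # Compare the result with the current estimate and move toward it.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q


q_table = train()
# Greedy policy: in each non-goal cell, pick the action with the higher estimate.
policy = ["left" if left > right else "right" for left, right in q_table[:-1]]
print(policy)
```

After training, the greedy policy should point toward the goal from every cell, even though only the very last move of each episode is rewarded directly.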
One of the key features of RL is that an agent’s action might not affect the environment’s immediate state but can still impact subsequent ones. So, sometimes, the machine doesn’t learn whether a certain action was effective until much later in the episode.
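One common way to give credit to early actions for a reward that arrives late is the discounted return: each step is scored by its own reward plus a shrinking share of everything that follows. The sketch below is an assumption-laden illustration; the function name `discounted_returns`, the discount factor `gamma = 0.9`, and the sample episode are ours, not part of any particular library.

```python
def discounted_returns(rewards, gamma=0.9):
    """For each step t, compute reward[t] + gamma * reward[t+1] + gamma^2 * reward[t+2] + ..."""
    returns = [0.0] * len(rewards)
    running = 0.0
    # Walk backward so each step can reuse the return of the step after it.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


# Hypothetical episode: nothing happens for three steps, then a reward of 1 arrives.
episode_rewards = [0.0, 0.0, 0.0, 1.0]
returns = discounted_returns(episode_rewards)
print([round(g, 3) for g in returns])
```

The first action receives a smaller but non-zero score, which is how an action can be judged effective even though its payoff only shows up later in the episode.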