alice

Read Along: Probabilistic Machine Learning, An Introduction by Kevin P. Murphy (1.1 -1.2)

In this series of blog posts, I am reading through the book "Probabilistic Machine Learning: An Introduction" by Kevin P. Murphy, and writing it down like this to help me understand and remember the material. It's a mix of summary plus my own examples.

1. Introduction

1.1 What is Machine Learning?

Machine Learning (ML) is when a program learns from experience (E) with respect to some task (T) and performance measure (P), such that its performance at the task, as measured by P, improves with experience.

There are many different kinds of ML, depending on the nature of the task and the measurement of performance. This book covers the most common ones, from a probabilistic perspective: all unknown quantities are treated as random variables with probability distributions, so the model can still make predictions (and express its uncertainty) even when some variables are unobserved. Reasons for this approach are:

  1. Decision making under uncertainty: real-life scenarios don't always provide all the features required to determine a label, so we need a principled way to act despite the missing information.

  2. Probabilistic modeling is the common language of many other fields of science and engineering, giving machine learning a shared framework with them.

1.2 Supervised Learning

The most common form of ML. Here the task is to learn a mapping f from inputs (described by their features) to labels. In the simplest setting, each input is a fixed-dimensional vector of numbers:

X = R^D

where X is the input space, R represents real numbers, and D is the number of dimensions, or the number of features in each input data point. In traditional machine learning, D is predefined, but in deep learning, the model can learn to identify and create new higher-level features.

Let's use a hypothetical example of Alice training ChatGPT to cook (imaginary) breakfast eggs. To clarify: ChatGPT is a pretrained LLM, so nothing it "learns" in an individual user session affects the base model; the base model is only updated periodically with new training data.

[Image: Alice with a robot cook making her eggs]

Alice: "Hi Chat, I want me some eggs sunny side up today, I'm happy, the weather is sunny, and it's Sunday!"

Here we have 3 input features (Alice's mood, the weather, the day of the week) and one output label: the egg type.
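To make this concrete, here's a minimal Python sketch of how such a request could be turned into a vector in R^D. The numeric encodings (and treating the egg type as the label y rather than a feature) are my own illustrative choices, not from the book:

```python
# Encode one of Alice's requests as a fixed-dimensional input vector x (D = 3)
# plus a label y. All numeric encodings below are arbitrary illustrative choices.
MOODS = {"sad": 0, "neutral": 1, "happy": 2}
WEATHER = {"rainy": 0, "cloudy": 1, "sunny": 2}
DAYS = {"Mon": 0, "Tue": 1, "Wed": 2, "Thu": 3, "Fri": 4, "Sat": 5, "Sun": 6}
EGG_TYPES = {"sunny side up": 0, "hard boiled": 1, "poached": 2, "scrambled": 3}

def encode(mood, weather, day):
    """Map a breakfast request to a 3-dimensional numeric feature vector."""
    return [MOODS[mood], WEATHER[weather], DAYS[day]]

# "I'm happy, the weather is sunny, and it's Sunday!" -> x; "sunny side up" -> y
x = encode("happy", "sunny", "Sun")
y = EGG_TYPES["sunny side up"]
print(x, y)  # [2, 2, 6] 0
```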

So now ChatGPT can learn a mapping from these features to Alice's breakfast egg preferences: each time Alice tells it what she wants for breakfast, it gets one input with values for the respective features. Now let's say one day Alice gives Chat only 2 of the features: "Hey Chat, I'm happy today and it's Monday, I can't see the weather but feed me some eggs please!"

Here, with a probabilistic model, Chat can use the weights learned from previous data to predict which eggs Alice is most likely to enjoy, even though one feature is missing.
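One way to sketch this: a tiny count-based (naive-Bayes-style) model that simply skips a missing feature when scoring each egg class. Both the training history and the model choice here are invented for illustration; the book's models come later.

```python
from collections import Counter, defaultdict

# Toy history of (mood, weather, day) -> egg type, invented for illustration.
data = [
    (("happy", "sunny", "Sun"), "sunny side up"),
    (("happy", "sunny", "Sat"), "sunny side up"),
    (("happy", "cloudy", "Mon"), "sunny side up"),
    (("neutral", "rainy", "Mon"), "scrambled"),
    (("sad", "cloudy", "Tue"), "scrambled"),
]

class_counts = Counter(label for _, label in data)
# feature_counts[i][label][value]: how often feature i took `value` under `label`
feature_counts = [defaultdict(Counter) for _ in range(3)]
for features, label in data:
    for i, value in enumerate(features):
        feature_counts[i][label][value] += 1

def predict(features):
    """Naive-Bayes-style scoring with add-one smoothing; a feature given as
    None is treated as missing and simply skipped."""
    scores = {}
    for label, n in class_counts.items():
        score = n / len(data)                       # prior p(label)
        for i, value in enumerate(features):
            if value is None:
                continue                            # unknown feature: skip it
            score *= (feature_counts[i][label][value] + 1) / (n + 2)
        scores[label] = score
    return max(scores, key=scores.get)

# Alice gives only her mood and the day; the weather is unknown.
print(predict(("happy", None, "Mon")))  # sunny side up
```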

ok moving on...

1.2.1 Classification

Classification is a problem where the output space is a set of C unordered and mutually exclusive labels known as classes: Y = {1, 2, ..., C}.

Taking the breakfast egg example: when Alice asks for eggs, ChatGPT decides on what class of eggs she wants: sunny side up, hard boiled, poached, scrambled... and so on. Each class of eggs is mutually exclusive here, so in this example, when Alice asks for an egg breakfast, she'll only receive one type of egg.
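Concretely, the output space is just a finite set of mutually exclusive classes indexed by integers; a minimal sketch (class names are mine, from the example above):

```python
# Output space Y = {1, 2, ..., C}: each integer indexes exactly one egg class.
EGG_CLASSES = ["sunny side up", "hard boiled", "poached", "scrambled"]
C = len(EGG_CLASSES)

def label_to_class(y):
    """Map an integer label y in {1, ..., C} to its class name."""
    assert 1 <= y <= C, "labels are mutually exclusive: exactly one of C classes"
    return EGG_CLASSES[y - 1]

print(C, label_to_class(1))  # 4 sunny side up
```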

Another type of classification is binary classification, where C = 2; it is useful for things like email filtering: spam / not spam.
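A binary classifier can be sketched as a score squashed through a sigmoid into a probability, then thresholded. The keyword weights below are hand-picked for illustration, not learned from real data:

```python
import math

# Hand-picked weights for a few indicator features -- purely illustrative.
WEIGHTS = {"free": 2.0, "winner": 1.5, "urgent": 1.0, "meeting": -2.0}
BIAS = -1.0

def spam_probability(text):
    """Binary classification sketch: logistic function over keyword indicators."""
    z = BIAS + sum(w for word, w in WEIGHTS.items() if word in text.lower())
    return 1 / (1 + math.exp(-z))  # sigmoid squashes the score into (0, 1)

def classify(text, threshold=0.5):
    return "spam" if spam_probability(text) >= threshold else "not spam"

print(classify("You are a WINNER, claim your FREE prize"))  # spam
print(classify("Agenda for tomorrow's meeting"))            # not spam
```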

Pattern recognition is the problem of predicting the class label given an input.

In our egg-sample:

  1. Input Data (Features): The input data would be the various features that ChatGPT observes about Alice's preferences: the day of the week, Alice's mood, the weather.

  2. Identifying Patterns: Over time, as Alice makes more breakfast requests, ChatGPT starts to notice patterns. For instance, it might recognize that Alice tends to prefer sunny side up eggs on sunny days, or scrambled eggs when she's in a hurry.

  3. Classification (Pattern Recognition): Once ChatGPT has learned these patterns, it can start predicting the type of eggs Alice might want, based on the input features of each new day. If one morning Alice says she's feeling great and it's a sunny Sunday, but doesn't specify her egg preference, ChatGPT can use the patterns it has recognized to predict that she might like sunny side up eggs.

  4. Outcome: The outcome of pattern recognition is the classification or prediction of the class label (type of egg preparation) based on the observed patterns in the input features.
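The four steps above can be sketched end-to-end with a toy nearest-match predictor: store the observed (features, label) pairs, then classify a new day by the egg types of the most similar past requests. The history and the matching rule are invented for illustration; a real system would learn a proper model:

```python
from collections import Counter

# Step 1: input data -- past requests as (day, mood, weather) -> egg type.
history = [
    (("Sun", "happy", "sunny"), "sunny side up"),
    (("Sat", "happy", "sunny"), "sunny side up"),
    (("Mon", "neutral", "rainy"), "scrambled"),
    (("Tue", "hurried", "cloudy"), "scrambled"),
]

def predict_eggs(query):
    """Steps 2-4: find the past requests that best match today's features
    (most shared feature values) and return their majority egg type."""
    best = max(sum(q == h for q, h in zip(query, feats))
               for feats, _ in history)
    matches = [label for feats, label in history
               if sum(q == h for q, h in zip(query, feats)) == best]
    return Counter(matches).most_common(1)[0][0]

# A sunny Sunday with Alice feeling great, no egg type specified:
print(predict_eggs(("Sun", "happy", "sunny")))  # sunny side up
```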
