Introduction to Machine Learning

Welcome to my Azure Machine Learning series, where we will cover Machine Learning from scratch and how to implement it in Azure.

In this blog, we will discuss:

  • What is Machine Learning?
  • Machine Learning Techniques
  • Overview of Machine Learning Workflows


Prerequisites

  • Basic Python Programming Knowledge
  • Azure Account

The buzz about Machine Learning

We have been hearing the buzzword “Machine Learning” very often lately. Why is that? Why not two or three decades ago? Did machine learning only come up recently?

No. Machine learning is not new: the term was coined by Arthur Samuel back in 1959. So, back to the same question: why is there so much hype now?

The answer is the huge volume of data we generate nowadays and the increased computational power at our disposal.


You may be confused by the terms “Artificial Intelligence” and “Machine Learning”. Are they the same?
Simply put, Machine Learning is a subset of Artificial Intelligence.

The growth of machine learning has created a large number of jobs in the IT industry, across all sectors.

Data Scientist has been called “The Sexiest Job of the 21st Century”, and with good reason.

It is not only data scientists: there are multiple related jobs, such as Data Analyst, Data Engineer, Machine Learning Engineer and so on.
Each of these roles calls for its own mix of expertise and skills.


What is Machine Learning?

Machine learning (ML) is the process of using mathematical models of data to help a computer learn without direct instruction. It’s considered a subset of artificial intelligence (AI). Machine learning uses algorithms to identify patterns within data, and those patterns are then used to create a data model that can make predictions.


How does Machine Learning differ from Classical Programming?


In classical programming, the programmer constructs the logic/algorithm that processes the input data and produces the output.
A machine learning model instead learns the rules/algorithm by itself, by looking at training examples of inputs and their outputs.
At the end of training, it has formulated a hypothesis.
That hypothesis is then used to generate the output for new test inputs.

Let us take a simple example.
Assume that the training pairs given are
<1,1>, <2,4>, <3,9>, <4,16>, <5,25>
The machine learning model considers several candidate hypotheses, say f(x) = 2x, g(x) = x*x, h(x) = x*4.
The best hypothesis in the hypothesis space is g(x), because its output matches the target for every training pair.
So g(x) is the hypothesis generated by the model. If we then give it the input 10, the hypothesis is applied to it, i.e., g(10) = 10*10 = 100.
This is how a machine learning model works.
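
The example above can be sketched in a few lines of Python. This is only a toy illustration: a real model searches a far larger hypothesis space, and the squared_error helper is an assumption made here to score each candidate.

```python
# Training pairs (input, target) from the example above.
training_pairs = [(1, 1), (2, 4), (3, 9), (4, 16), (5, 25)]

# Candidate hypotheses: the "hypothesis space" of this toy example.
hypotheses = {
    "f(x) = 2x": lambda x: 2 * x,
    "g(x) = x*x": lambda x: x * x,
    "h(x) = x*4": lambda x: x * 4,
}


def squared_error(hypothesis, pairs):
    """Sum of squared differences between predictions and targets."""
    return sum((hypothesis(x) - y) ** 2 for x, y in pairs)


# Pick the hypothesis with the smallest error on the training pairs.
best_name = min(
    hypotheses,
    key=lambda name: squared_error(hypotheses[name], training_pairs),
)
best = hypotheses[best_name]

print(best_name)  # g(x) = x*x
print(best(10))   # 100
```

Here g(x) wins because its error on the training pairs is zero, while f(x) and h(x) both miss several targets.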

Machine Learning Techniques

There are three main types of learning:

  1. Supervised Learning
  2. Unsupervised Learning
  3. Reinforcement Learning

Supervised Learning

A type of learning where we train the model by giving it both the inputs and the target (output). The example above is supervised learning.
For instance, in your childhood, your father or someone else probably taught you to ride a bicycle. We learnt it only because someone taught us.
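
As a rough sketch of supervised learning on labelled data, here is a tiny 1-nearest-neighbour classifier in plain Python. The heights, weights and labels are made up for illustration, and nearest neighbour is just one simple technique chosen here, not the only way to learn from labelled examples.

```python
import math

# Hypothetical labelled data: (height_cm, weight_kg) -> label.
training_data = [
    ((150.0, 50.0), "small"),
    ((155.0, 55.0), "small"),
    ((180.0, 85.0), "large"),
    ((185.0, 90.0), "large"),
]


def predict(features, labelled_examples):
    """Return the label of the closest training example (1-nearest neighbour)."""
    _, nearest_label = min(
        labelled_examples,
        key=lambda example: math.dist(features, example[0]),
    )
    return nearest_label


print(predict((152.0, 52.0), training_data))  # small
print(predict((182.0, 88.0), training_data))  # large
```

The model never sees a rule such as "tall and heavy means large". It only sees inputs paired with targets, which is what makes this supervised.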

Unsupervised Learning

A type of learning where there is no target. The model tries to learn the structure of the data by itself, without any target labels.
This is similar to the skills we pick up on our own. Take swimming, for example: most of us were pushed into the pool and somehow figured out how to swim to the other end.
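
To make this concrete, here is a minimal sketch of clustering, one common unsupervised technique, using a plain k-means loop on one-dimensional data. The numbers and starting centroids are assumptions picked for illustration.

```python
def k_means_1d(values, centroids, iterations=10):
    """Group numbers around the given starting centroids (plain k-means)."""
    centroids = list(centroids)
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        # Assignment step: each value joins its nearest centroid.
        clusters = [[] for _ in centroids]
        for value in values:
            nearest = min(
                range(len(centroids)),
                key=lambda i: abs(value - centroids[i]),
            )
            clusters[nearest].append(value)
        # Update step: move each centroid to the mean of its cluster.
        centroids = [
            sum(cluster) / len(cluster) if cluster else centroid
            for cluster, centroid in zip(clusters, centroids)
        ]
    return centroids, clusters


# Hypothetical unlabelled measurements that form two natural groups.
data = [1.0, 1.5, 2.0, 9.0, 10.0, 11.0]
centroids, clusters = k_means_1d(data, centroids=[0.0, 5.0])
print(centroids)  # [1.5, 10.0]
print(clusters)   # [[1.0, 1.5, 2.0], [9.0, 10.0, 11.0]]
```

No labels are given anywhere. The two groups emerge purely from how close the values are to each other.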

Reinforcement Learning

A type of learning based on reward and punishment: the model tries out actions and learns from the rewards or penalties it receives.
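
A small sketch of the idea, assuming a made-up environment with three actions that pay a reward (+1) or a punishment (-1) with different hidden probabilities. The epsilon-greedy strategy used here is one simple choice among many.

```python
import random


def run_bandit(reward_probabilities, steps=5000, epsilon=0.1, seed=0):
    """Epsilon-greedy agent: learn which action pays off from feedback alone."""
    rng = random.Random(seed)
    n_actions = len(reward_probabilities)
    estimates = [0.0] * n_actions  # running average feedback per action
    counts = [0] * n_actions       # how often each action was tried

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: try a random action.
            action = rng.randrange(n_actions)
        else:
            # Exploit: take the action that looks best so far.
            action = max(range(n_actions), key=lambda a: estimates[a])
        # Reward (+1) with the action's hidden probability, else punishment (-1).
        feedback = 1.0 if rng.random() < reward_probabilities[action] else -1.0
        counts[action] += 1
        # Incremental mean update of the estimate for the chosen action.
        estimates[action] += (feedback - estimates[action]) / counts[action]

    return estimates, counts


# Hypothetical environment: three actions with different reward chances.
estimates, counts = run_bandit([0.2, 0.5, 0.8])
best_action = max(range(len(estimates)), key=lambda a: estimates[a])
print(best_action)
```

The agent is never told which action is best. It has to work that out by balancing exploration with sticking to what has paid off so far.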

A few use cases of Machine Learning are


Benefits of Machine Learning

  • Uncover insight
  • Improve data integrity
  • Enhance user experience
  • Reduce risk
  • Anticipate customer behavior
  • Lower costs

Overview of Stages in Machine Learning

Data Collection & Preprocessing

  • Identify data source
  • Data collection
  • Data Transformation
  • Anomaly Detection
  • Cleaning the data
  • Domain understanding

Understand the domain the problem belongs to. Domain knowledge can be acquired from domain experts, and it can be applied later in the process for better results.

The next step is to go through the data.
Identify the features, the target and the type of problem. Think of ways to transform the data so that it is easier to understand.

Visualize the data. Plotting it in a graph can give you a lot of insight. A little knowledge of statistics would be a big help.

The results we get from our model will be good only if the quality of the data is good.

The quality measures of data are as follows:

  • Accuracy: is the data correct or wrong, accurate or not?
  • Completeness: not recorded, unavailable, …
  • Consistency: some modified but some not, dangling, …
  • Timeliness: is the data updated in a timely manner?
  • Believability: how far can the data be trusted to be correct?
  • Interpretability: how easily can the data be understood?
  • Accessibility

The major steps in data preprocessing are

  • Data cleaning
    Fill in missing values, smooth noisy data, identify or remove outliers, and resolve inconsistencies

  • Data integration
    Integration of multiple databases, data cubes, or files

  • Data reduction

    • Dimensionality reduction
    • Numerosity reduction
    • Data compression
  • Data transformation and data discretization

    • Normalization
    • Concept hierarchy generation
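
Two of the steps listed above, filling in missing values (data cleaning) and normalization (data transformation), can be sketched as follows. The ages column is hypothetical, and mean imputation with min-max scaling is just one simple choice for each step.

```python
def fill_missing(values):
    """Data cleaning: replace None with the mean of the known values."""
    known = [v for v in values if v is not None]
    mean = sum(known) / len(known)
    return [mean if v is None else v for v in values]


def min_max_normalize(values):
    """Data transformation: rescale values linearly into the range [0, 1]."""
    low, high = min(values), max(values)
    if high == low:
        # All values are identical; avoid dividing by zero.
        return [0.0 for _ in values]
    return [(v - low) / (high - low) for v in values]


# Hypothetical column of ages with one missing entry.
ages = [20.0, None, 40.0, 60.0]
cleaned = fill_missing(ages)
scaled = min_max_normalize(cleaned)
print(cleaned)  # [20.0, 40.0, 40.0, 60.0]
print(scaled)   # [0.0, 0.5, 0.5, 1.0]
```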

Train the model

We split the preprocessed data into train and test sets, and train a suitable machine learning model on the train dataset.
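
A minimal sketch of such a split in plain Python. The 80/20 ratio, the seed and the function name are assumptions for illustration; libraries such as scikit-learn ship a ready-made helper for this.

```python
import random


def train_test_split(rows, test_ratio=0.2, seed=42):
    """Shuffle a copy of the rows and split it into train and test lists."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_ratio)
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical dataset of (input, target) pairs with target = input squared.
dataset = [(x, x * x) for x in range(1, 11)]
train_set, test_set = train_test_split(dataset)
print(len(train_set), len(test_set))  # 8 2
```

Shuffling before splitting keeps the test set from being just the last few rows, which matters when the data is stored in some sorted order.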

Validating the model

  • Validating on test dataset
  • Evaluating results
  • Finalising the data model

We evaluate how well the model performs on the test data by calculating metrics. We can then try hyper-parameter tuning (changing the configuration settings of the model) or different models to see if we can improve the accuracy.
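
As a rough sketch, here is how accuracy can be calculated and how a few settings can be compared. The exam scores, labels and the threshold "model" are all made up for illustration. In practice the tuning is usually done on a separate validation set, so that the test set stays untouched for the final check.

```python
def accuracy(predictions, targets):
    """Fraction of predictions that match the targets."""
    correct = sum(1 for p, t in zip(predictions, targets) if p == t)
    return correct / len(targets)


def threshold_model(threshold):
    """A toy classifier: predict 1 (pass) when the score reaches the threshold."""
    return lambda score: 1 if score >= threshold else 0


# Hypothetical held-out data: exam scores and whether the student passed.
held_out_scores = [35, 42, 48, 55, 61, 70]
held_out_labels = [0, 0, 0, 1, 1, 1]

# Try several values of the setting and keep the most accurate one.
results = {}
for threshold in (40, 50, 60):
    model = threshold_model(threshold)
    predictions = [model(score) for score in held_out_scores]
    results[threshold] = accuracy(predictions, held_out_labels)

best_threshold = max(results, key=results.get)
print(best_threshold)           # 50
print(results[best_threshold])  # 1.0
```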

Deployment of ML Model

  • Deploying the ML Model
  • Real-time prediction
  • Model monitoring
  • Visualizations

If you have read this far, I hope you liked the article and now understand the basics of Machine Learning, with a few myths demystified along the way.

Follow the series to learn more about Machine Learning in Azure.
