# Demystifying machine learning for beginners

Code_Jedi
Javascript, Node.js, Python, PHP, React and Vue. Coding since 2017

### If you're a confused beginner, like I was when I first started out with machine learning in Python, then stick around, because today I'll do my best to demystify and simplify machine learning for you!

To start off, I presume that you would like to learn machine learning for the following reasons:

1. Working with datasets
2. Visualizing data
3. Predicting data
4. Classifying data

In this tutorial, we're going to write a Python script that will:

• Load a dataset
• Visualize the dataset
• Classify a new piece of data given the dataset

## Let's get started!

First, let's import the required libraries:

``````
import pandas
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.neighbors import KNeighborsClassifier
from sklearn import preprocessing
``````

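If you're missing any of these, you can install them with `pip install pandas matplotlib seaborn scikit-learn` (or `pip3 install` with the same package names). Note that the package behind `sklearn` is called `scikit-learn`.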

Next, we're going to load the dataset we'll be using for this project into a DataFrame called `df`:

``````
import pandas
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.neighbors import KNeighborsClassifier
from sklearn import preprocessing

# Assumed filename: point this at wherever you saved the Iris CSV
df = pandas.read_csv("iris.csv")
``````

For this project, we're going to use the classic Iris dataset, which you can download here.
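If you don't have the CSV at hand, one possible alternative is the copy of Iris that ships with scikit-learn. This is only a sketch, not the loading step from this tutorial: the column names and the "Iris-" prefix on the species names are assumptions, chosen to match the names used in the rest of this post.

```python
import pandas
from sklearn.datasets import load_iris

# Load scikit-learn's bundled copy of the Iris dataset (150 flowers).
iris = load_iris()

# Assumed column names, picked to match the ones used in this tutorial.
df = pandas.DataFrame(
    iris.data,
    columns=["sepal_length", "sepal_width", "petal_length", "petal_width"],
)

# scikit-learn stores each species as an integer code (0, 1 or 2).
# Turn the codes into names and add the "Iris-" prefix (an assumption,
# so the labels look like 'Iris-setosa').
df["species"] = ["Iris-" + iris.target_names[code] for code in iris.target]

print(df.head())                     # first five rows
print(df["species"].value_counts())  # how many flowers per species
```

Printing the first few rows and the per-species counts is a quick sanity check that the data loaded the way you expect before you train anything on it.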

### Now comes the tricky bit...

Add these lines of code to your Python script:

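``````
# k-nearest neighbors classifier that lets the 3 closest points vote
model = KNeighborsClassifier(n_neighbors=3)

# One (sepal_length, sepal_width) pair per flower
features = list(zip(df["sepal_length"], df["sepal_width"]))

# Train on the measurements and the species label of each flower
model.fit(features, df["species"])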
``````

Let me explain...

• First, we define our model and tell it to look at the 3 nearest neighbors (`n_neighbors=3`) when classifying a new piece of data. Note that this is the number of neighbors that get a vote, not the number of classes; the model works out the possible classes from the labels we give it.
• We then define the "features" variable, which pairs up the "sepal_length" and "sepal_width" columns. These are the characteristics the model will compare in order to classify new pieces of data.
• Finally, we fit our model with each flower's "sepal_length" and "sepal_width" values, along with the species name of that flower.
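To get a feel for what a k-nearest neighbors classifier does with `n_neighbors=3`, here is a minimal pure-Python sketch of the idea: find the 3 training points closest to the new point and let them vote. It is not scikit-learn's actual implementation, the `knn_predict` helper is a hypothetical name, and the toy measurements are made up for illustration.

```python
import math
from collections import Counter

# Hypothetical toy data: (sepal_length, sepal_width) -> species.
# These rows are invented for illustration, not taken from the Iris CSV.
training_points = [
    ((5.0, 3.5), "setosa"),
    ((4.8, 3.1), "setosa"),
    ((5.2, 3.8), "setosa"),
    ((6.4, 2.9), "versicolor"),
    ((6.0, 2.7), "versicolor"),
    ((7.1, 3.0), "virginica"),
    ((6.9, 3.1), "virginica"),
]


def knn_predict(points, query, k=3):
    """Return the majority label among the k points closest to `query`."""
    # Sort the labelled points by straight-line distance to the query.
    by_distance = sorted(points, key=lambda item: math.dist(item[0], query))
    # Keep the labels of the k nearest points and count the votes.
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


prediction = knn_predict(training_points, (5.1, 3.4), k=3)
print(prediction)  # prints: setosa
```

The three points nearest to (5.1, 3.4) are all labelled "setosa", so that label wins the vote. A larger `k` smooths out noisy points, at the cost of letting more distant flowers have a say.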

Before we start predicting new pieces of data, let's graph our dataset using a scatter plot. In our graph, the X axis will represent the "sepal_length" and the Y axis will represent the "sepal_width". We're also going to color-code the different species of Iris flowers by adding `hue='species'`. Finally, we'll tell seaborn to graph our Iris dataset by adding `data=df` to the end:

``````
sns.scatterplot(x='sepal_length', y='sepal_width',
                hue='species', data=df)

# Pin the legend's upper-right corner to the top-right corner of the plot
plt.legend(bbox_to_anchor=(1, 1), loc=1)

plt.show()
``````

Here's what the scatter plot should look like:

To start classifying new pieces of data, first comment out the plotting code like so:

``````
import pandas
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.neighbors import KNeighborsClassifier
from sklearn import preprocessing

# Assumed filename: point this at wherever you saved the Iris CSV
df = pandas.read_csv("iris.csv")

model = KNeighborsClassifier(n_neighbors=3)

features = list(zip(df["sepal_length"], df["sepal_width"]))

model.fit(features, df["species"])

# sns.scatterplot(x='sepal_length', y='sepal_width',
#                 hue='species', data=df)
#
# plt.legend(bbox_to_anchor=(1, 1), loc=1)
#
# plt.show()
``````

Then add these 2 lines of code to the end of your script:

``````
predicted = model.predict([[4.6, 5.8]])
print(predicted)
``````

This will predict the species of an Iris flower that has a sepal_length of 4.6 and a sepal_width of 5.8.

#### Now if you run your code, your output should look like this:

``````
['Iris-setosa']
``````

This means that our new mystery Iris flower has been classified as an "Iris-setosa".
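If you're curious why the model picked that label, you can ask it which neighbors it consulted. The sketch below is self-contained, so it uses scikit-learn's bundled copy of Iris instead of the CSV. That swap and the "Iris-" prefix on the labels are assumptions made for illustration, while `kneighbors` and `predict_proba` are standard `KNeighborsClassifier` methods.

```python
from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier

iris = load_iris()

# Columns 0 and 1 of iris.data are sepal length and sepal width (in cm).
features = iris.data[:, :2]
# Assumed label format, chosen to match the 'Iris-setosa' style output.
labels = ["Iris-" + iris.target_names[code] for code in iris.target]

model = KNeighborsClassifier(n_neighbors=3)
model.fit(features, labels)

query = [[4.6, 5.8]]

# kneighbors returns the distances to, and row indices of, the 3 nearest flowers.
distances, indices = model.kneighbors(query)
for distance, index in zip(distances[0], indices[0]):
    print(features[index], labels[index], round(float(distance), 2))

# predict_proba gives the share of neighbor votes each species received.
probabilities = model.predict_proba(query)[0]
print(dict(zip(model.classes_, probabilities)))
```

With these measurements, all three nearest flowers are "Iris-setosa", which is why the vote is unanimous.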

You've made your first machine learning project!

You can now experiment with this code and try it on some new datasets (you can find lots of great ones on https://www.kaggle.com/).

If you're a beginner who likes discovering new things about Python, try my weekly Python newsletter.

Byeeeee👋