
Kendi Muriuki


INTRODUCTION TO PYTHON FOR DATA SCIENCE!

Python is a programming language that was created by Guido van Rossum, a Dutch programmer, in the late 1980s; the first version was released in 1991. The language has been improved steadily since then, with Python 3.10, for example, released in 2021. Python is used for data science and web development, among other things. In today's article we will focus on how Python is used for data science. Without further ado, let's get into it.
In this comprehensive guide, we will provide an in-depth introduction to Python for data science. We will cover the basics of the language, including data types, functions, control structures, and object-oriented programming. We will also explore some of the most popular libraries and frameworks used in data science, including NumPy, Pandas, Matplotlib, and Scikit-learn.
Let's begin with data types. Understanding data types is key to writing bug-free code. Python has several built-in data types:
1. Numbers: Python supports several types of numbers, including integers, floats, and complex numbers. Integers are whole numbers, while floats are numbers with a decimal point. Complex numbers have a real and an imaginary part.

2. Strings: Strings are used to represent text in Python. They are created by enclosing a sequence of characters in single or double quotes. String manipulation is a common task in Python, and there are many built-in string methods and functions for this purpose.

3. Booleans: Booleans are a special data type in Python that can only have two values: True and False. They are often used in conditional statements and loops to control the flow of a program.

4. Lists: Lists are collections of values that can be of any data type. They are created using square brackets and can be modified by adding or removing items.

5. Tuples: Tuples are similar to lists, but they are immutable, which means that their contents cannot be changed once they are created. Tuples are created using parentheses.

6. Sets: Sets are collections of unique values. They are created using curly braces or the set() function.

7. Dictionaries: Dictionaries are collections of key-value pairs. They are created using curly braces, with colons separating the keys from the values.

Variables are used in Python to store values. A value is assigned to a variable with the assignment operator (=). For example, to assign the value 10 to y, we would write:

```python
y = 10
```

You can also assign multiple values in a single line of code. For example:

```python
w, x, y, z = 2, 5, 10, 15
```

This assigns the value 2 to w, 5 to x, 10 to y, and 15 to z.


Here are some examples of how to create variables of the basic data types:

```python
x = 10               # integer
y = 3.14             # floating-point number
z = "Hello, world!"  # string
a = True             # boolean
```
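
The examples above cover numbers, strings, and booleans. As a minimal sketch, the remaining types from the list of data types (plus a complex number and two string methods) could be created as follows. All variable names and values here are hypothetical and chosen only for illustration.

```python
# Hypothetical sample values, chosen only for illustration
greeting = "Hello, world!"
print(greeting.upper())        # HELLO, WORLD!
print(greeting.split(", "))    # ['Hello', 'world!']

scores = [88, 92, 79]          # list: ordered and mutable
scores.append(95)              # lists can grow after creation

point = (3, 4)                 # tuple: ordered and immutable

tags = {"python", "data", "python"}  # set: the duplicate "python" is dropped

ages = {"Alice": 25, "Bob": 30}      # dictionary: key-value pairs
ages["Charlie"] = 35                 # add a new key-value pair

c = 2 + 3j                     # complex number with real part 2 and imaginary part 3

print(scores)          # [88, 92, 79, 95]
print(len(tags))       # 2
print(ages["Bob"])     # 30
print(c.real, c.imag)  # 2.0 3.0
```

Trying to change a tuple in place, for instance with `point[0] = 5`, raises a TypeError, which is what immutability means in practice.
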

Python also supports mathematical operations, including addition, subtraction, multiplication, and division.
Here are some examples:

```python
x = 10
y = 5
print(x + y)  # 15
print(x - y)  # 5
print(x * y)  # 50
print(x / y)  # 2.0
```
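
Note that the division above prints 2.0 rather than 2, because the / operator always returns a float. Python has a few more arithmetic operators that often come up in data work. The sketch below uses illustrative values that are not from the article.

```python
x = 10
y = 3   # illustrative value, different from the example above

print(x // y)  # 3     floor division: the quotient rounded down
print(x % y)   # 1     modulo: the remainder of the division
print(x ** y)  # 1000  exponentiation: 10 to the power of 3

quotient = x / y              # true division
print(type(quotient))         # <class 'float'>
```
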


Having looked at data types, variable assignment, and mathematical operations in Python, let's now look at control structures.
Control structures in Python are used to control the flow of execution of a program. They are used to decide whether to execute a particular block of code or not, based on certain conditions. There are three main types of control structures in Python:

  1. Conditional statements: Conditional statements check whether a certain condition is true or false and execute a block of code accordingly. The two main types of conditional statements in Python are the if statement and the if-else statement. The syntax for the if statement is:

```python
if condition:
    # code to be executed if the condition is true
```

The syntax for the if-else statement is:

```python
if condition:
    # code to be executed if the condition is true
else:
    # code to be executed if the condition is false
```

For example:

```python
x = 10
if x > 5:
    print("x is greater than 5")
else:
    print("x is less than or equal to 5")
```
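
Beyond the two forms described above, Python also has an elif clause for checking several conditions in order. The following is a minimal sketch; the variable names and thresholds are hypothetical.

```python
# Hypothetical temperature reading in degrees Celsius
temperature = 22

if temperature > 30:
    label = "hot"
elif temperature > 15:   # checked only if the first condition was false
    label = "mild"
else:                    # runs only if no condition above was true
    label = "cold"

print(label)  # mild
```
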

2. Loops: Loops are used to execute a block of code repeatedly, based on certain conditions. There are two main types of loops in Python: the while loop and the for loop. The syntax for the while loop is:

```python
while condition:
    # code to be executed while the condition is true
```

The syntax for the for loop is:

```python
for variable in iterable:
    # code to be executed for each item in the iterable
```

Here is an example of a for loop:

```python
for i in range(5):
    print(i)
```

This code will print the numbers 0 through 4.
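
A for loop works with any iterable, not only range(). As a sketch, the loop below walks over a list of hypothetical measurements to compute their average, and then uses the built-in enumerate() to get each item together with its position.

```python
# Hypothetical measurements, used only to illustrate iteration
temperatures = [21.5, 23.0, 19.8]

total = 0.0
for value in temperatures:   # visit each item in the list
    total += value

average = total / len(temperatures)
print(round(average, 2))

# enumerate() yields (index, item) pairs
for index, value in enumerate(temperatures):
    print(index, value)
```
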

While Loops

While loops are used to execute a block of code as long as a condition is true. Here is an example:

```python
x = 0
while x < 5:
    print(x)
    x += 1
```

This will print the numbers 0 through 4.

Functions

Functions are a key aspect of Python and are used to group related code and simplify code reuse. Here is an example of how to create a function in Python:

```python
def add_numbers(x, y):
    return x + y
```

This function takes two arguments, x, and y, and returns their sum. You can call the function like this:

```python
result = add_numbers(5, 10)
print(result)  # 15
```
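
Functions can also take parameters with default values, which makes them easier to reuse. The sketch below defines a hypothetical helper, mean_of, that is not part of the article or of any library; it is only meant to show the pattern.

```python
def mean_of(values, decimals=2):
    """Return the mean of a non-empty sequence of numbers, rounded to `decimals` places."""
    return round(sum(values) / len(values), decimals)


print(mean_of([10, 20, 30]))        # 20.0  (uses the default of 2 decimals)
print(mean_of([1, 2], decimals=1))  # 1.5   (overrides the default by keyword)
```
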

Object-Oriented Programming

Object-oriented programming (OOP) is an essential part of Python. It allows developers to create classes, which can be used to create objects. Objects are instances of a class and have attributes and methods. Here is an example of how to create a class in Python:

```python
class Rectangle:
    def __init__(self, length, width):
        # Store the dimensions as attributes of the object
        self.length = length
        self.width = width

    def area(self):
        # The area of a rectangle is its length times its width
        return self.length * self.width
```

This class represents a rectangle. It has two attributes, length and width, and one method, area. You can create an object of this class like this:

```python
rectangle = Rectangle(10, 5)
print(rectangle.area())  # 50
```

3. Control statements: Control statements are used to change the flow of execution of a program. There are three main types of control statements in Python: break, continue, and pass. The break statement is used to break out of a loop, the continue statement is used to skip the current iteration of a loop, and the pass statement is used as a placeholder when you don't want to execute any code.
These control structures are essential for building more complex programs in Python, and they provide the ability to make decisions and repeat actions based on certain conditions.
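
The three control statements can be sketched in a single loop. The list of numbers and the name kept are hypothetical and serve only to make the behaviour visible.

```python
numbers = [3, -1, 7, 0, 12, 5]   # hypothetical input values

kept = []
for n in numbers:
    if n < 0:
        continue   # skip negative numbers and move on to the next item
    if n > 10:
        break      # leave the loop entirely at the first value above 10
    if n == 0:
        pass       # placeholder: nothing special is done for zero yet
    kept.append(n)

print(kept)  # [3, 7, 0]
```

The value 5 never reaches the list because the loop has already stopped at 12.
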
Now let's look at some of the popular Python libraries and frameworks.

NumPy

NumPy is a Python library that is used for scientific computing. It provides a variety of mathematical functions and data structures, including arrays and matrices. Here is an example of how to use NumPy:

```python
import numpy as np

# Create a NumPy array
arr = np.array([1, 2, 3, 4, 5])

# Perform basic arithmetic with the array
print(arr + 1)  # [2 3 4 5 6]
print(arr * 2)  # [ 2  4  6  8 10]
```

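
Since NumPy also handles matrices, here is a small sketch of a two-dimensional array with a few common operations. The values are illustrative only.

```python
import numpy as np

# A 2-D array (matrix) with 2 rows and 3 columns; the values are illustrative
matrix = np.array([[1, 2, 3],
                   [4, 5, 6]])

print(matrix.shape)         # (2, 3)
print(matrix.sum())         # 21
print(matrix.mean(axis=0))  # mean of each column: [2.5 3.5 4.5]
print(matrix.T.shape)       # (3, 2) after transposing
```
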
Pandas

Pandas is a Python library that is used for data manipulation and analysis. It provides a variety of functions and data structures, including data frames and series. Here is an example of how to use Pandas:

```python
import pandas as pd

# Create a Pandas data frame
df = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie'], 'Age': [25, 30, 35]})

# Print the data frame
print(df)
```
This code will create a data frame with two columns, Name and Age, and print it to the console.
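
Building on the same data, the sketch below shows how a single column becomes a Series and how rows can be filtered by a condition. The threshold of 28 is an arbitrary value picked for illustration.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie'], 'Age': [25, 30, 35]})

ages = df['Age']               # selecting one column returns a Series
print(ages.mean())             # 30.0

older = df[df['Age'] > 28]     # boolean filtering keeps only the matching rows
print(older['Name'].tolist())  # ['Bob', 'Charlie']
```
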

Matplotlib

Matplotlib is a Python library that is used for data visualization. It provides a variety of functions for creating charts and graphs. Here is an example of how to use Matplotlib:

```python
import matplotlib.pyplot as plt

# Create a line chart
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]
plt.plot(x, y)

# Add labels and a title to the chart
plt.xlabel('X-axis')
plt.ylabel('Y-axis')
plt.title('Line Chart')

# Display the chart
plt.show()
```

Scikit-learn

Scikit-learn is a Python library that is used for machine learning. It provides a variety of functions and algorithms for training and testing machine learning models. Here is an example of how to use Scikit-learn:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Load the Iris dataset
iris = load_iris()

# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(iris.data, iris.target, test_size=0.2)

# Train a K-nearest neighbors classifier
knn = KNeighborsClassifier()
knn.fit(X_train, y_train)

# Test the classifier
score = knn.score(X_test, y_test)
print(score)
```

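
Because the split above is random, the printed score can vary from run to run. As a hedged sketch, passing random_state to train_test_split makes the split repeatable, and predict can then be used to classify an individual sample. The value 42 is an arbitrary assumption, not something the article specifies.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

iris = load_iris()

# random_state=42 is an arbitrary choice that makes the shuffle repeatable
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=42
)

knn = KNeighborsClassifier(n_neighbors=5)  # 5 is also scikit-learn's default
knn.fit(X_train, y_train)

# Predict the species of the first test sample and map the label to its name
predicted_label = knn.predict(X_test[:1])[0]
print(iris.target_names[predicted_label])
```
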
Conclusion

Python is a powerful language for data science. Its simplicity, flexibility, and extensive collection of libraries and frameworks make it an ideal choice for beginners, experienced data scientists, and web developers alike.
