DEV Community

Chris Achinga

Originally published at chrisdevcode.hashnode.dev

Reading Data From Files Using Python

Hi There! Welcome to Data 101.

In this article, I will take you through reading files using Python as you prepare to analyze them.

I will be using Google's Colaboratory tool as my IDE. You don't have to install or set up anything on your computer to use it. Simply go to https://research.google.com/colaboratory/ and create a new notebook.

Reading CSV Files

You'll need to upload a data file. To do so, click on the folder icon on the far left of the Notebook:

upload.png

I will be using two CSV files: airlines.csv and airports.csv.
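Before reading anything, it can help to confirm that the uploads actually landed where the notebook expects them. Here is a minimal sketch, assuming the files sit in the notebook's current working directory; the helper name missing_files is my own and not part of any library:

```python
from pathlib import Path


def missing_files(names, directory="."):
    """Return the names that do not exist as files inside `directory`."""
    base = Path(directory)
    return [name for name in names if not (base / name).is_file()]


# Assumption: both files were uploaded to the current working directory.
expected = ["airlines.csv", "airports.csv"]
print(missing_files(expected))  # an empty list means both files are in place
```

If a name shows up in the printed list, upload that file again before moving on.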

In the new notebook, import pandas. pandas is a Python library used for data manipulation and analysis.

import pandas as pd


Since we have two files, let's read each one into its own variable:

airlines = pd.read_csv('airlines.csv')
airports = pd.read_csv('airports.csv')


The pd.read_csv() function from the pandas library reads a CSV file and returns its contents as a DataFrame. (That simple!)
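If you don't have the files at hand, you can still try the same call on a small piece of text. The sketch below uses made-up rows (the column names and values are my own placeholders, not the real airlines data) and wraps them in io.StringIO so that pandas treats the text like a file:

```python
import io

import pandas as pd

# Hypothetical sample data standing in for an uploaded CSV file.
csv_text = """id,name,country
1,Sample Air,Kenya
2,Demo Airways,Uganda
3,Test Airlines,Tanzania
"""

# read_csv accepts a file path or any file-like object.
sample = pd.read_csv(io.StringIO(csv_text))

print(sample.shape)          # (number of rows, number of columns)
print(list(sample.columns))  # column names taken from the header row
```

The first line of the text becomes the column names, and every following line becomes one row.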

To view the contents of the files we just read, we'll use the DataFrame's .head() method, which returns the first 5 rows of the loaded data.

airlines.head()


airlines.png

airports.head()


airports.png
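.head() is not limited to 5 rows. As a small sketch on a made-up ten-row DataFrame (the column names here are placeholders of mine), you can pass the number of rows you want, and .tail() does the same from the end:

```python
import pandas as pd

# Hypothetical data: ten rows so the difference between the calls is visible.
df = pd.DataFrame({"row": range(10), "value": [n * n for n in range(10)]})

first_five = df.head()    # default: the first 5 rows
first_three = df.head(3)  # pass a number to change how many rows come back
last_two = df.tail(2)     # tail() works the same way, counting from the end

print(first_three)
print(last_two)
```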

Reading JSON Files

Next, upload the file airports.json.

We'll read the file into a DataFrame:

airports_json = pd.read_json('airports.json')


To view the first 5 rows:

airports_json.head()


airportjson.png
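To see how JSON maps onto rows and columns, here is a minimal sketch with made-up records. I'm assuming the JSON is a list of objects, which may not match the layout of the real airports.json; the keys and values are placeholders:

```python
import io

import pandas as pd

# Hypothetical JSON: a list of objects, one object per row.
json_text = """[
  {"code": "AAA", "city": "Sampleville", "elevation": 100},
  {"code": "BBB", "city": "Demotown", "elevation": 250}
]"""

# read_json accepts a file path or a file-like object such as StringIO.
airports_sample = pd.read_json(io.StringIO(json_text))

print(airports_sample.head())
```

Each object becomes one row, and the keys become the column names.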

Conclusion

Reading data is probably the very first step in any data analysis, and we nailed it!

Here is the whole demo file:

https://github.com/achingachris/datasciencelearninghub/blob/master/data_cleaning.ipynb
