DEV Community

Dendi Handian
Dendi Handian

Posted on • Edited on

SQL Query into Pandas DataFrame - Part 1

The Playground Database

I'm using a database from sqlitetutorial.net and SQLite Browser to play with the queries. From the database, I will only use one table for this post, let's use only the Albums table here.

For The Python playground, I'm using Jupyter Notebook by Anaconda. And to import a table into Pandas DataFrame, we just need to export a table into csv file and this could be done using SQLite Browser, or you can get it from my gist.

Or if you know Google Colab then It's great alternative for Jupyter Notebook.

Preparing the Dataframe

Let's assume the albums.csv file is in the same directory as your python or ipynb file, then we can load it into pandas dataframe like this:

import pandas as pd

albums_df = pd.read_csv("albums.csv")
Enter fullscreen mode Exit fullscreen mode

Basic SQL Queries into Pandas Dataframe

For Part 1, I will only cover SELECT, WHERE, LIMIT and ORDER BY of SQL in DataFrame syntax.

Select All Columns and Rows

SELECT * 
FROM albums
Enter fullscreen mode Exit fullscreen mode
albums_df
Enter fullscreen mode Exit fullscreen mode

Select One Column

SELECT Title 
FROM albums
Enter fullscreen mode Exit fullscreen mode
albums_df[['Title']]
Enter fullscreen mode Exit fullscreen mode

Select More Columns

SELECT Title, ArtistId 
FROM albums
Enter fullscreen mode Exit fullscreen mode
albums_df[['Title', 'ArtistId']]
Enter fullscreen mode Exit fullscreen mode

Filtering with One Condition

SELECT * 
FROM albums
WHERE Title = 'The Best Of Van Halen, Vol. I'
Enter fullscreen mode Exit fullscreen mode
albums_df[albums_df['Title'] == 'The Best Of Van Halen, Vol. I']
Enter fullscreen mode Exit fullscreen mode

Filtering With More Conditions

SELECT * 
FROM albums
WHERE ArtistId = 2 AND AlbumId = 3
Enter fullscreen mode Exit fullscreen mode
albums_df[(albums_df['ArtistId'] == 2) & (albums_df['AlbumId'] == 3) ]
Enter fullscreen mode Exit fullscreen mode

Filtering With More Values

SELECT * 
FROM albums
WHERE ArtistId IN (8, 9, 10)
Enter fullscreen mode Exit fullscreen mode
albums_df[albums_df['ArtistId'].isin([8,9,10])]
Enter fullscreen mode Exit fullscreen mode

Search on a String

In Between:

SELECT * 
FROM albums
WHERE Title LIKE '%The%'
Enter fullscreen mode Exit fullscreen mode
albums_df[albums_df['Title'].str.contains('The')]
Enter fullscreen mode Exit fullscreen mode

Starts With:

SELECT * 
FROM albums
WHERE Title LIKE 'The%'
Enter fullscreen mode Exit fullscreen mode
albums_df[albums_df['Title'].str.contains('^The')]
Enter fullscreen mode Exit fullscreen mode

Ends With:

SELECT * 
FROM albums
WHERE Title LIKE '% Hits'
Enter fullscreen mode Exit fullscreen mode
albums_df[albums_df['Title'].str.contains(' Hits$')]
Enter fullscreen mode Exit fullscreen mode

Limit The First 10 Rows

SELECT * 
FROM albums
LIMIT 10
Enter fullscreen mode Exit fullscreen mode
albums_df[0:10]
Enter fullscreen mode Exit fullscreen mode

or

albums_df.head(10)
Enter fullscreen mode Exit fullscreen mode

Sort by a Column

Ascending:

SELECT *
FROM albums
ORDER BY Title ASC
Enter fullscreen mode Exit fullscreen mode
albums_df.sort_values(['Title'], ascending=True)
Enter fullscreen mode Exit fullscreen mode

Descending:

SELECT *
FROM albums
ORDER BY Title DESC
Enter fullscreen mode Exit fullscreen mode
albums_df.sort_values(['Title'], ascending=False)
Enter fullscreen mode Exit fullscreen mode

To the Part 2

Top comments (0)