Abstract
SingleStoreDB Cloud now supports notebooks. These provide a convenient way to perform Data Science directly with SingleStoreDB without installing external tools. In this short article, we'll use a simple example to demonstrate some of the benefits of this new built-in integration.
The notebook file used in this article is available on GitHub.
Introduction
Jupyter-style notebooks are a great interactive way to test code, analyse data and visualise results. In this short article, we'll demonstrate some of the capabilities of the SingleStoreDB Cloud built-in notebooks environment using the Iris flower data set.
Create a SingleStoreDB Cloud account
A previous article showed the steps to create a free SingleStoreDB Cloud account. We'll use the following settings:
- Workspace Group Name: Iris Demo Group
- Cloud Provider: AWS
- Region: US East 1 (N. Virginia)
- Workspace Name: iris-demo
- Size: S-00
- Advanced Settings: MarTech Application deselected
From the left navigation pane, we'll select DEVELOP > SQL Editor to create a new database, as follows:
CREATE DATABASE IF NOT EXISTS iris_demo;
New notebook
From the left navigation pane, we'll select DEVELOP > Notebooks.
In the top right of the web page, we'll select New Notebook > New Notebook, as shown in Figure 1.
We'll call the notebook iris_demo, and select a Blank notebook template from the available options.
We also need to select the Connection and Database using the drop-down menus just above the notebook, as shown in Figure 2.
Fill out the notebook
Create Table
First, we'll create a new table, as follows:
%%sql
USE iris_demo;
DROP TABLE IF EXISTS iris;
CREATE TABLE IF NOT EXISTS iris (
sepal_length FLOAT,
sepal_width FLOAT,
petal_length FLOAT,
petal_width FLOAT,
species VARCHAR(20)
);
We can enter SQL statements directly into a notebook cell using the %%sql
magic command.
Load Data
We'll now load the data into the table, as follows:
%%sql
USE iris_demo;
INSERT INTO iris VALUES
(5.1,3.5,1.4,0.2,'Iris-setosa'),
(4.9,3,1.4,0.2,'Iris-setosa'),
(4.7,3.2,1.3,0.2,'Iris-setosa'),
(4.6,3.1,1.5,0.2,'Iris-setosa'),
(5,3.6,1.4,0.2,'Iris-setosa'),
(5.4,3.9,1.7,0.4,'Iris-setosa'),
(4.6,3.4,1.4,0.3,'Iris-setosa'),
(5,3.4,1.5,0.2,'Iris-setosa'),
(4.4,2.9,1.4,0.2,'Iris-setosa'),
(4.9,3.1,1.5,0.1,'Iris-setosa'),
(5.4,3.7,1.5,0.2,'Iris-setosa'),
(4.8,3.4,1.6,0.2,'Iris-setosa'),
(4.8,3,1.4,0.1,'Iris-setosa'),
(4.3,3,1.1,0.1,'Iris-setosa'),
(5.8,4,1.2,0.2,'Iris-setosa'),
(5.7,4.4,1.5,0.4,'Iris-setosa'),
(5.4,3.9,1.3,0.4,'Iris-setosa'),
(5.1,3.5,1.4,0.3,'Iris-setosa'),
(5.7,3.8,1.7,0.3,'Iris-setosa'),
(5.1,3.8,1.5,0.3,'Iris-setosa'),
(5.4,3.4,1.7,0.2,'Iris-setosa'),
(5.1,3.7,1.5,0.4,'Iris-setosa'),
(4.6,3.6,1,0.2,'Iris-setosa'),
(5.1,3.3,1.7,0.5,'Iris-setosa'),
(4.8,3.4,1.9,0.2,'Iris-setosa'),
(5,3,1.6,0.2,'Iris-setosa'),
(5,3.4,1.6,0.4,'Iris-setosa'),
(5.2,3.5,1.5,0.2,'Iris-setosa'),
(5.2,3.4,1.4,0.2,'Iris-setosa'),
(4.7,3.2,1.6,0.2,'Iris-setosa'),
(4.8,3.1,1.6,0.2,'Iris-setosa'),
(5.4,3.4,1.5,0.4,'Iris-setosa'),
(5.2,4.1,1.5,0.1,'Iris-setosa'),
(5.5,4.2,1.4,0.2,'Iris-setosa'),
(4.9,3.1,1.5,0.1,'Iris-setosa'),
(5,3.2,1.2,0.2,'Iris-setosa'),
(5.5,3.5,1.3,0.2,'Iris-setosa'),
(4.9,3.1,1.5,0.1,'Iris-setosa'),
(4.4,3,1.3,0.2,'Iris-setosa'),
(5.1,3.4,1.5,0.2,'Iris-setosa'),
(5,3.5,1.3,0.3,'Iris-setosa'),
(4.5,2.3,1.3,0.3,'Iris-setosa'),
(4.4,3.2,1.3,0.2,'Iris-setosa'),
(5,3.5,1.6,0.6,'Iris-setosa'),
(5.1,3.8,1.9,0.4,'Iris-setosa'),
(4.8,3,1.4,0.3,'Iris-setosa'),
(5.1,3.8,1.6,0.2,'Iris-setosa'),
(4.6,3.2,1.4,0.2,'Iris-setosa'),
(5.3,3.7,1.5,0.2,'Iris-setosa'),
(5,3.3,1.4,0.2,'Iris-setosa'),
(7,3.2,4.7,1.4,'Iris-versicolor'),
(6.4,3.2,4.5,1.5,'Iris-versicolor'),
(6.9,3.1,4.9,1.5,'Iris-versicolor'),
(5.5,2.3,4,1.3,'Iris-versicolor'),
(6.5,2.8,4.6,1.5,'Iris-versicolor'),
(5.7,2.8,4.5,1.3,'Iris-versicolor'),
(6.3,3.3,4.7,1.6,'Iris-versicolor'),
(4.9,2.4,3.3,1,'Iris-versicolor'),
(6.6,2.9,4.6,1.3,'Iris-versicolor'),
(5.2,2.7,3.9,1.4,'Iris-versicolor'),
(5,2,3.5,1,'Iris-versicolor'),
(5.9,3,4.2,1.5,'Iris-versicolor'),
(6,2.2,4,1,'Iris-versicolor'),
(6.1,2.9,4.7,1.4,'Iris-versicolor'),
(5.6,2.9,3.6,1.3,'Iris-versicolor'),
(6.7,3.1,4.4,1.4,'Iris-versicolor'),
(5.6,3,4.5,1.5,'Iris-versicolor'),
(5.8,2.7,4.1,1,'Iris-versicolor'),
(6.2,2.2,4.5,1.5,'Iris-versicolor'),
(5.6,2.5,3.9,1.1,'Iris-versicolor'),
(5.9,3.2,4.8,1.8,'Iris-versicolor'),
(6.1,2.8,4,1.3,'Iris-versicolor'),
(6.3,2.5,4.9,1.5,'Iris-versicolor'),
(6.1,2.8,4.7,1.2,'Iris-versicolor'),
(6.4,2.9,4.3,1.3,'Iris-versicolor'),
(6.6,3,4.4,1.4,'Iris-versicolor'),
(6.8,2.8,4.8,1.4,'Iris-versicolor'),
(6.7,3,5,1.7,'Iris-versicolor'),
(6,2.9,4.5,1.5,'Iris-versicolor'),
(5.7,2.6,3.5,1,'Iris-versicolor'),
(5.5,2.4,3.8,1.1,'Iris-versicolor'),
(5.5,2.4,3.7,1,'Iris-versicolor'),
(5.8,2.7,3.9,1.2,'Iris-versicolor'),
(6,2.7,5.1,1.6,'Iris-versicolor'),
(5.4,3,4.5,1.5,'Iris-versicolor'),
(6,3.4,4.5,1.6,'Iris-versicolor'),
(6.7,3.1,4.7,1.5,'Iris-versicolor'),
(6.3,2.3,4.4,1.3,'Iris-versicolor'),
(5.6,3,4.1,1.3,'Iris-versicolor'),
(5.5,2.5,4,1.3,'Iris-versicolor'),
(5.5,2.6,4.4,1.2,'Iris-versicolor'),
(6.1,3,4.6,1.4,'Iris-versicolor'),
(5.8,2.6,4,1.2,'Iris-versicolor'),
(5,2.3,3.3,1,'Iris-versicolor'),
(5.6,2.7,4.2,1.3,'Iris-versicolor'),
(5.7,3,4.2,1.2,'Iris-versicolor'),
(5.7,2.9,4.2,1.3,'Iris-versicolor'),
(6.2,2.9,4.3,1.3,'Iris-versicolor'),
(5.1,2.5,3,1.1,'Iris-versicolor'),
(5.7,2.8,4.1,1.3,'Iris-versicolor'),
(6.3,3.3,6,2.5,'Iris-virginica'),
(5.8,2.7,5.1,1.9,'Iris-virginica'),
(7.1,3,5.9,2.1,'Iris-virginica'),
(6.3,2.9,5.6,1.8,'Iris-virginica'),
(6.5,3,5.8,2.2,'Iris-virginica'),
(7.6,3,6.6,2.1,'Iris-virginica'),
(4.9,2.5,4.5,1.7,'Iris-virginica'),
(7.3,2.9,6.3,1.8,'Iris-virginica'),
(6.7,2.5,5.8,1.8,'Iris-virginica'),
(7.2,3.6,6.1,2.5,'Iris-virginica'),
(6.5,3.2,5.1,2,'Iris-virginica'),
(6.4,2.7,5.3,1.9,'Iris-virginica'),
(6.8,3,5.5,2.1,'Iris-virginica'),
(5.7,2.5,5,2,'Iris-virginica'),
(5.8,2.8,5.1,2.4,'Iris-virginica'),
(6.4,3.2,5.3,2.3,'Iris-virginica'),
(6.5,3,5.5,1.8,'Iris-virginica'),
(7.7,3.8,6.7,2.2,'Iris-virginica'),
(7.7,2.6,6.9,2.3,'Iris-virginica'),
(6,2.2,5,1.5,'Iris-virginica'),
(6.9,3.2,5.7,2.3,'Iris-virginica'),
(5.6,2.8,4.9,2,'Iris-virginica'),
(7.7,2.8,6.7,2,'Iris-virginica'),
(6.3,2.7,4.9,1.8,'Iris-virginica'),
(6.7,3.3,5.7,2.1,'Iris-virginica'),
(7.2,3.2,6,1.8,'Iris-virginica'),
(6.2,2.8,4.8,1.8,'Iris-virginica'),
(6.1,3,4.9,1.8,'Iris-virginica'),
(6.4,2.8,5.6,2.1,'Iris-virginica'),
(7.2,3,5.8,1.6,'Iris-virginica'),
(7.4,2.8,6.1,1.9,'Iris-virginica'),
(7.9,3.8,6.4,2,'Iris-virginica'),
(6.4,2.8,5.6,2.2,'Iris-virginica'),
(6.3,2.8,5.1,1.5,'Iris-virginica'),
(6.1,2.6,5.6,1.4,'Iris-virginica'),
(7.7,3,6.1,2.3,'Iris-virginica'),
(6.3,3.4,5.6,2.4,'Iris-virginica'),
(6.4,3.1,5.5,1.8,'Iris-virginica'),
(6,3,4.8,1.8,'Iris-virginica'),
(6.9,3.1,5.4,2.1,'Iris-virginica'),
(6.7,3.1,5.6,2.4,'Iris-virginica'),
(6.9,3.1,5.1,2.3,'Iris-virginica'),
(5.8,2.7,5.1,1.9,'Iris-virginica'),
(6.8,3.2,5.9,2.3,'Iris-virginica'),
(6.7,3.3,5.7,2.5,'Iris-virginica'),
(6.7,3,5.2,2.3,'Iris-virginica'),
(6.3,2.5,5,1.9,'Iris-virginica'),
(6.5,3,5.2,2,'Iris-virginica'),
(6.2,3.4,5.4,2.3,'Iris-virginica'),
(5.9,3,5.1,1.8,'Iris-virginica');
Install Libraries
Next, we'll install some libraries, as follows:
!pip install matplotlib --quiet
!pip install scikit-learn --quiet
!pip install seaborn --quiet
Connect to Database
We'll now connect to our database, as follows:
from sqlalchemy import *
db_connection = create_engine(connection_url)
There are no complicated connection strings. The connection_url
contains everything we need.
Read from Database
We'll now read the table contents into a Pandas Dataframe, as follows:
import pandas as pd
iris_df = pd.read_sql_query("SELECT * FROM iris", db_connection)
Perform Data Analysis
We can now perform some Exploratory Data Analysis (EDA), such as:
# https://plotly.com/python/pca-visualization/
import plotly.express as px
from sklearn.decomposition import PCA
X = iris_df[[
"sepal_length",
"sepal_width",
"petal_length",
"petal_width"
]]
pca = PCA(n_components = 2)
components = pca.fit_transform(X)
pca_fig = px.scatter(
components,
x = 0,
y = 1,
color = iris_df["species"]
)
pca_fig.show()
The output should be similar to Figure 3.
And further analysis, as follows:
# More about correlations:
# https://www.w3schools.com/python/pandas/pandas_correlations.asp
import matplotlib.pyplot as plt
import seaborn as sns
sns.heatmap(
iris_df.corr(numeric_only = True),
cmap = "OrRd",
annot = True
)
plt.title("Correlations")
plt.plot()
The output should be similar to Figure 4.
The SingleStoreDB Cloud notebook environment provides many more capabilities. Check out the documentation for further details.
Summary
This short article used one approach to load data into SingleStoreDB Cloud. We also performed some data analysis of the data stored in our database, and created several visualisations. All of this was achieved using the built-in notebook environment.
Top comments (0)