DEV Community

Akmal Chaudhri for SingleStore

Posted on

Quick tip: Using the built-in notebooks with SingleStoreDB

Abstract

SingleStoreDB Cloud now supports notebooks. These provide a convenient way to perform Data Science directly with SingleStoreDB without installing external tools. In this short article, we'll use a simple example to demonstrate some of the benefits of this new built-in integration.

The notebook file used in this article is available on GitHub.

Introduction

Jupyter-style notebooks are a great interactive way to test code, analyse data and visualise results. In this short article, we'll demonstrate some of the capabilities of the SingleStoreDB Cloud built-in notebooks environment using the Iris flower data set.

Create a SingleStoreDB Cloud account

A previous article showed the steps to create a free SingleStoreDB Cloud account. We'll use the following settings:

  • Workspace Group Name: Iris Demo Group
  • Cloud Provider: AWS
  • Region: US East 1 (N. Virginia)
  • Workspace Name: iris-demo
  • Size: S-00
  • Advanced Settings: MarTech Application deselected

From the left navigation pane, we'll select DEVELOP > SQL Editor to create a new database, as follows:

CREATE DATABASE IF NOT EXISTS iris_demo;
Enter fullscreen mode Exit fullscreen mode

New notebook

From the left navigation pane, we'll select DEVELOP > Notebooks.

In the top right of the web page, we'll select New Notebook > New Notebook, as shown in Figure 1.

Figure 1. New Notebook.

Figure 1. New Notebook.

We'll call the notebook iris_demo, and select a Blank notebook template from the available options.

We also need to select the Connection and Database using the drop-down menus just above the notebook, as shown in Figure 2.

Figure 2. Connection and Database.

Figure 2. Connection and Database.

Fill out the notebook

Create Table

First, we'll create a new table, as follows:

%%sql

USE iris_demo;
DROP TABLE IF EXISTS iris;
CREATE TABLE IF NOT EXISTS iris (
    sepal_length FLOAT,
    sepal_width FLOAT,
    petal_length FLOAT,
    petal_width FLOAT,
    species VARCHAR(20)
);
Enter fullscreen mode Exit fullscreen mode

We can enter SQL statements directly into a notebook cell using the %%sql magic command.

Load Data

We'll now load the data into the table, as follows:

%%sql

USE iris_demo;
INSERT INTO iris VALUES
(5.1,3.5,1.4,0.2,'Iris-setosa'),
(4.9,3,1.4,0.2,'Iris-setosa'),
(4.7,3.2,1.3,0.2,'Iris-setosa'),
(4.6,3.1,1.5,0.2,'Iris-setosa'),
(5,3.6,1.4,0.2,'Iris-setosa'),
(5.4,3.9,1.7,0.4,'Iris-setosa'),
(4.6,3.4,1.4,0.3,'Iris-setosa'),
(5,3.4,1.5,0.2,'Iris-setosa'),
(4.4,2.9,1.4,0.2,'Iris-setosa'),
(4.9,3.1,1.5,0.1,'Iris-setosa'),
(5.4,3.7,1.5,0.2,'Iris-setosa'),
(4.8,3.4,1.6,0.2,'Iris-setosa'),
(4.8,3,1.4,0.1,'Iris-setosa'),
(4.3,3,1.1,0.1,'Iris-setosa'),
(5.8,4,1.2,0.2,'Iris-setosa'),
(5.7,4.4,1.5,0.4,'Iris-setosa'),
(5.4,3.9,1.3,0.4,'Iris-setosa'),
(5.1,3.5,1.4,0.3,'Iris-setosa'),
(5.7,3.8,1.7,0.3,'Iris-setosa'),
(5.1,3.8,1.5,0.3,'Iris-setosa'),
(5.4,3.4,1.7,0.2,'Iris-setosa'),
(5.1,3.7,1.5,0.4,'Iris-setosa'),
(4.6,3.6,1,0.2,'Iris-setosa'),
(5.1,3.3,1.7,0.5,'Iris-setosa'),
(4.8,3.4,1.9,0.2,'Iris-setosa'),
(5,3,1.6,0.2,'Iris-setosa'),
(5,3.4,1.6,0.4,'Iris-setosa'),
(5.2,3.5,1.5,0.2,'Iris-setosa'),
(5.2,3.4,1.4,0.2,'Iris-setosa'),
(4.7,3.2,1.6,0.2,'Iris-setosa'),
(4.8,3.1,1.6,0.2,'Iris-setosa'),
(5.4,3.4,1.5,0.4,'Iris-setosa'),
(5.2,4.1,1.5,0.1,'Iris-setosa'),
(5.5,4.2,1.4,0.2,'Iris-setosa'),
(4.9,3.1,1.5,0.1,'Iris-setosa'),
(5,3.2,1.2,0.2,'Iris-setosa'),
(5.5,3.5,1.3,0.2,'Iris-setosa'),
(4.9,3.1,1.5,0.1,'Iris-setosa'),
(4.4,3,1.3,0.2,'Iris-setosa'),
(5.1,3.4,1.5,0.2,'Iris-setosa'),
(5,3.5,1.3,0.3,'Iris-setosa'),
(4.5,2.3,1.3,0.3,'Iris-setosa'),
(4.4,3.2,1.3,0.2,'Iris-setosa'),
(5,3.5,1.6,0.6,'Iris-setosa'),
(5.1,3.8,1.9,0.4,'Iris-setosa'),
(4.8,3,1.4,0.3,'Iris-setosa'),
(5.1,3.8,1.6,0.2,'Iris-setosa'),
(4.6,3.2,1.4,0.2,'Iris-setosa'),
(5.3,3.7,1.5,0.2,'Iris-setosa'),
(5,3.3,1.4,0.2,'Iris-setosa'),
(7,3.2,4.7,1.4,'Iris-versicolor'),
(6.4,3.2,4.5,1.5,'Iris-versicolor'),
(6.9,3.1,4.9,1.5,'Iris-versicolor'),
(5.5,2.3,4,1.3,'Iris-versicolor'),
(6.5,2.8,4.6,1.5,'Iris-versicolor'),
(5.7,2.8,4.5,1.3,'Iris-versicolor'),
(6.3,3.3,4.7,1.6,'Iris-versicolor'),
(4.9,2.4,3.3,1,'Iris-versicolor'),
(6.6,2.9,4.6,1.3,'Iris-versicolor'),
(5.2,2.7,3.9,1.4,'Iris-versicolor'),
(5,2,3.5,1,'Iris-versicolor'),
(5.9,3,4.2,1.5,'Iris-versicolor'),
(6,2.2,4,1,'Iris-versicolor'),
(6.1,2.9,4.7,1.4,'Iris-versicolor'),
(5.6,2.9,3.6,1.3,'Iris-versicolor'),
(6.7,3.1,4.4,1.4,'Iris-versicolor'),
(5.6,3,4.5,1.5,'Iris-versicolor'),
(5.8,2.7,4.1,1,'Iris-versicolor'),
(6.2,2.2,4.5,1.5,'Iris-versicolor'),
(5.6,2.5,3.9,1.1,'Iris-versicolor'),
(5.9,3.2,4.8,1.8,'Iris-versicolor'),
(6.1,2.8,4,1.3,'Iris-versicolor'),
(6.3,2.5,4.9,1.5,'Iris-versicolor'),
(6.1,2.8,4.7,1.2,'Iris-versicolor'),
(6.4,2.9,4.3,1.3,'Iris-versicolor'),
(6.6,3,4.4,1.4,'Iris-versicolor'),
(6.8,2.8,4.8,1.4,'Iris-versicolor'),
(6.7,3,5,1.7,'Iris-versicolor'),
(6,2.9,4.5,1.5,'Iris-versicolor'),
(5.7,2.6,3.5,1,'Iris-versicolor'),
(5.5,2.4,3.8,1.1,'Iris-versicolor'),
(5.5,2.4,3.7,1,'Iris-versicolor'),
(5.8,2.7,3.9,1.2,'Iris-versicolor'),
(6,2.7,5.1,1.6,'Iris-versicolor'),
(5.4,3,4.5,1.5,'Iris-versicolor'),
(6,3.4,4.5,1.6,'Iris-versicolor'),
(6.7,3.1,4.7,1.5,'Iris-versicolor'),
(6.3,2.3,4.4,1.3,'Iris-versicolor'),
(5.6,3,4.1,1.3,'Iris-versicolor'),
(5.5,2.5,4,1.3,'Iris-versicolor'),
(5.5,2.6,4.4,1.2,'Iris-versicolor'),
(6.1,3,4.6,1.4,'Iris-versicolor'),
(5.8,2.6,4,1.2,'Iris-versicolor'),
(5,2.3,3.3,1,'Iris-versicolor'),
(5.6,2.7,4.2,1.3,'Iris-versicolor'),
(5.7,3,4.2,1.2,'Iris-versicolor'),
(5.7,2.9,4.2,1.3,'Iris-versicolor'),
(6.2,2.9,4.3,1.3,'Iris-versicolor'),
(5.1,2.5,3,1.1,'Iris-versicolor'),
(5.7,2.8,4.1,1.3,'Iris-versicolor'),
(6.3,3.3,6,2.5,'Iris-virginica'),
(5.8,2.7,5.1,1.9,'Iris-virginica'),
(7.1,3,5.9,2.1,'Iris-virginica'),
(6.3,2.9,5.6,1.8,'Iris-virginica'),
(6.5,3,5.8,2.2,'Iris-virginica'),
(7.6,3,6.6,2.1,'Iris-virginica'),
(4.9,2.5,4.5,1.7,'Iris-virginica'),
(7.3,2.9,6.3,1.8,'Iris-virginica'),
(6.7,2.5,5.8,1.8,'Iris-virginica'),
(7.2,3.6,6.1,2.5,'Iris-virginica'),
(6.5,3.2,5.1,2,'Iris-virginica'),
(6.4,2.7,5.3,1.9,'Iris-virginica'),
(6.8,3,5.5,2.1,'Iris-virginica'),
(5.7,2.5,5,2,'Iris-virginica'),
(5.8,2.8,5.1,2.4,'Iris-virginica'),
(6.4,3.2,5.3,2.3,'Iris-virginica'),
(6.5,3,5.5,1.8,'Iris-virginica'),
(7.7,3.8,6.7,2.2,'Iris-virginica'),
(7.7,2.6,6.9,2.3,'Iris-virginica'),
(6,2.2,5,1.5,'Iris-virginica'),
(6.9,3.2,5.7,2.3,'Iris-virginica'),
(5.6,2.8,4.9,2,'Iris-virginica'),
(7.7,2.8,6.7,2,'Iris-virginica'),
(6.3,2.7,4.9,1.8,'Iris-virginica'),
(6.7,3.3,5.7,2.1,'Iris-virginica'),
(7.2,3.2,6,1.8,'Iris-virginica'),
(6.2,2.8,4.8,1.8,'Iris-virginica'),
(6.1,3,4.9,1.8,'Iris-virginica'),
(6.4,2.8,5.6,2.1,'Iris-virginica'),
(7.2,3,5.8,1.6,'Iris-virginica'),
(7.4,2.8,6.1,1.9,'Iris-virginica'),
(7.9,3.8,6.4,2,'Iris-virginica'),
(6.4,2.8,5.6,2.2,'Iris-virginica'),
(6.3,2.8,5.1,1.5,'Iris-virginica'),
(6.1,2.6,5.6,1.4,'Iris-virginica'),
(7.7,3,6.1,2.3,'Iris-virginica'),
(6.3,3.4,5.6,2.4,'Iris-virginica'),
(6.4,3.1,5.5,1.8,'Iris-virginica'),
(6,3,4.8,1.8,'Iris-virginica'),
(6.9,3.1,5.4,2.1,'Iris-virginica'),
(6.7,3.1,5.6,2.4,'Iris-virginica'),
(6.9,3.1,5.1,2.3,'Iris-virginica'),
(5.8,2.7,5.1,1.9,'Iris-virginica'),
(6.8,3.2,5.9,2.3,'Iris-virginica'),
(6.7,3.3,5.7,2.5,'Iris-virginica'),
(6.7,3,5.2,2.3,'Iris-virginica'),
(6.3,2.5,5,1.9,'Iris-virginica'),
(6.5,3,5.2,2,'Iris-virginica'),
(6.2,3.4,5.4,2.3,'Iris-virginica'),
(5.9,3,5.1,1.8,'Iris-virginica');
Enter fullscreen mode Exit fullscreen mode

Install Libraries

Next, we'll install some libraries, as follows:

!pip install matplotlib --quiet
!pip install scikit-learn --quiet
!pip install seaborn --quiet
Enter fullscreen mode Exit fullscreen mode

Connect to Database

We'll now connect to our database, as follows:

from sqlalchemy import *

db_connection = create_engine(connection_url)
Enter fullscreen mode Exit fullscreen mode

There are no complicated connection strings. The connection_url contains everything we need.

Read from Database

We'll now read the table contents into a Pandas Dataframe, as follows:

import pandas as pd

iris_df = pd.read_sql_query("SELECT * FROM iris", db_connection)
Enter fullscreen mode Exit fullscreen mode

Perform Data Analysis

We can now perform some Exploratory Data Analysis (EDA), such as:

# https://plotly.com/python/pca-visualization/

import plotly.express as px
from sklearn.decomposition import PCA

X = iris_df[[
    "sepal_length",
    "sepal_width",
    "petal_length",
    "petal_width"
]]

pca = PCA(n_components = 2)
components = pca.fit_transform(X)

pca_fig = px.scatter(
    components,
    x = 0,
    y = 1,
    color = iris_df["species"]
)

pca_fig.show()
Enter fullscreen mode Exit fullscreen mode

The output should be similar to Figure 3.

Figure 3. PCA Visualisation.

Figure 3. PCA Visualisation.

And further analysis, as follows:

# More about correlations:
# https://www.w3schools.com/python/pandas/pandas_correlations.asp

import matplotlib.pyplot as plt
import seaborn as sns

sns.heatmap(
    iris_df.corr(numeric_only = True),
    cmap = "OrRd",
    annot = True
)

plt.title("Correlations")
plt.plot()
Enter fullscreen mode Exit fullscreen mode

The output should be similar to Figure 4.

Figure 4. Correlations.

Figure 4. Correlations.

The SingleStoreDB Cloud notebook environment provides many more capabilities. Check out the documentation for further details.

Summary

This short article used one approach to load data into SingleStoreDB Cloud. We also performed some data analysis of the data stored in our database, and created several visualisations. All of this was achieved using the built-in notebook environment.

Top comments (0)