DEV Community

Akmal Chaudhri for SingleStore

Posted on • Updated on

Quick tip: Accessing SingleStoreDB using the Python Client

Abstract

The Python Client enables developers to get up and running with SingleStoreDB quickly. In this short article, we'll discuss how to configure and use the Python Client. We'll use Deepnote as our development environment.

Introduction

Python is one of the most popular programming languages today. It is also widely used by Data Scientists. We'll look at several small code examples of how we can use Python with SingleStoreDB and some of the data analyses we might be interested in performing.

Create a Deepnote account

We'll create a free account on the Deepnote website. Once logged in, we'll create a new Deepnote project to give us a new notebook. We'll also need to create a data folder. In the data folder we'll store a CSV file containing the Iris flower data set.

Create a SingleStoreDB Cloud account

A previous article showed the steps required to create a free SingleStoreDB Cloud account. We'll use Iris Demo Group as our Workspace Group Name and iris-demo as our Workspace Name. We'll make a note of our password and host name. Finally, we'll create a new database using the SQL Editor:

CREATE DATABASE iris_demo;
Enter fullscreen mode Exit fullscreen mode

Deepnote notebook

Let's now start to fill out our notebook.

First, we'll need to install the Python Client:

!pip install singlestoredb
Enter fullscreen mode Exit fullscreen mode

Next, we'll import the library:

import singlestoredb as s2
Enter fullscreen mode Exit fullscreen mode

We'll now create a connection to SingleStoreDB:

conn = s2.connect(
            host = "<host>",
            port = "3306",
            user = "admin",
            password = "<password>",
            database = "iris_demo",
            results_type = "tuples"
        )
Enter fullscreen mode Exit fullscreen mode

We'll replace the <host> and <password> with the values from our SingleStoreDB Cloud account. The results_type provides several options and, in this example, we'll work with tuples.

Let's now read our CSV data into a Pandas Dataframe, as follows:

import pandas as pd

df = pd.read_csv("data/iris.csv", index_col = False)
Enter fullscreen mode Exit fullscreen mode

and we can view the data, as follows:

print(df.head(5))
Enter fullscreen mode Exit fullscreen mode

The output should be similar to the following:

   sepal_length  sepal_width  petal_length  petal_width      species
0           5.1          3.5           1.4          0.2  Iris-setosa
1           4.9          3.0           1.4          0.2  Iris-setosa
2           4.7          3.2           1.3          0.2  Iris-setosa
3           4.6          3.1           1.5          0.2  Iris-setosa
4           5.0          3.6           1.4          0.2  Iris-setosa
Enter fullscreen mode Exit fullscreen mode

Let's now prepare the SQL statement to insert the data into SingleStoreDB:

stmt = """
    INSERT INTO iris (
        sepal_length,
        sepal_width,
        petal_length,
        petal_width,
        species
    ) VALUES (%s, %s, %s, %s, %s)
"""
Enter fullscreen mode Exit fullscreen mode

Next, we'll create the table, insert the data from the Pandas Dataframe into SingleStoreDB and then retrieve the data back from SingleStoreDB:

with conn:
    conn.autocommit(True)
    with conn.cursor() as cur:
        cur.execute("""
            CREATE TABLE iris (
                sepal_length FLOAT,
                sepal_width FLOAT,
                petal_length FLOAT,
                petal_width FLOAT,
                species VARCHAR(20)
            )
        """)
        cur.executemany(stmt, df)
        cur.execute("""
            SELECT *
            FROM iris
        """)
        rows = cur.fetchall()
Enter fullscreen mode Exit fullscreen mode

Next, we'll convert the returned data into a new Pandas Dataframe:

iris_df = pd.DataFrame(rows, columns = [
    "sepal_length",
    "sepal_width",
    "petal_length",
    "petal_width",
    "species"
])
Enter fullscreen mode Exit fullscreen mode

We can now perform some data analysis, such as PCA:

# https://plotly.com/python/pca-visualization/

import plotly.express as px
from sklearn.decomposition import PCA

X = iris_df[[
    "sepal_length",
    "sepal_width",
    "petal_length",
    "petal_width"
]]

pca = PCA(n_components = 2)
components = pca.fit_transform(X)

pca_fig = px.scatter(
    components,
    x = 0,
    y = 1,
    color = iris_df["species"]
)

pca_fig.show()
Enter fullscreen mode Exit fullscreen mode

The output should be similar to Figure 1.

Figure 1. PCA

Figure 1. PCA

We can also check for correlations:

import matplotlib.pyplot as plt
import seaborn as sns

sns.heatmap(
    iris_df.corr(),
    cmap = "OrRd",
    annot = True
)

plt.title("Correlations")
plt.plot()
Enter fullscreen mode Exit fullscreen mode

The output should be similar to Figure 2.

Figure 2. Correlations.

Figure 2. Correlations.

And so on.

Summary

The Python Client provides a wide range of options and, in this article, we have used tuples as they are a convenient way to work with bulk data. The documentation provides examples of how to use the other options.

Top comments (0)