
Akmal Chaudhri for SingleStore


AdTech using SingleStoreDB and Looker

Abstract

In this article, we'll see how to run a SingleStore GitHub project for implementing an AdTech application in SingleStoreDB Cloud. We'll also use Looker to create some visualisations.

Introduction

In this article, we'll take an existing SingleStore GitHub repo and demonstrate the ease with which it can be deployed and run on SingleStoreDB Cloud. The complete updated code can be found in a forked GitHub repo for ease of use.

Create a SingleStoreDB Cloud account

A previous article showed the steps to create a free SingleStoreDB Cloud account. We'll use AdTech Demo Group as our Workspace Group Name and adtech-demo as our Workspace Name. We'll make a note of our password and host name.

Install Go

First, we'll download the Go 1.20.1 Linux tarball from the Go website, then remove any previous installation and extract the new one, as follows:

rm -rf /usr/local/go && tar -C /usr/local -xzf go1.20.1.linux-amd64.tar.gz

We may need to use sudo, as follows:

sudo rm -rf /usr/local/go && sudo tar -C /usr/local -xzf go1.20.1.linux-amd64.tar.gz

We'll add the following to the PATH variable:

export PATH=$PATH:/usr/local/go/bin

and then check the version:

go version

This will print the installed version of Go. For example:

go version go1.20.1 linux/amd64

Clone GitHub repo

We'll now clone the following GitHub repo:

git clone https://github.com/VeryFatBoy/starter_kits

Data Generator

Let's first start with the AdTech data generator and change to the directory:

cd starter_kits/adtech/generator

We'll now build the code, as follows:

go mod init generator

go mod tidy

go build

This will create an executable file called generator.

Let's now create a small dataset for initial testing:

./generator small

This will create four files with tab-separated values:

campaigns-small.tsv
events-small.tsv
networks-small.tsv
platforms-small.tsv

We'll gzip these files:

gzip *.tsv

We can also pass the parameter medium or large to generate larger datasets.
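Before loading, we can sanity-check a generated file with a short Go sketch. This is illustrative only: the file name and three-column rows below are stand-ins, not the generator's actual schema. The sketch decompresses a .tsv.gz file and reports how many rows and columns it contains:

```go
package main

import (
	"bufio"
	"compress/gzip"
	"fmt"
	"os"
	"strings"
)

// writeSample creates a tiny gzipped TSV file as a stand-in for
// generator output such as events-small.tsv.gz, and returns its path.
func writeSample() string {
	path := "sample.tsv.gz"
	f, err := os.Create(path)
	if err != nil {
		panic(err)
	}
	gz := gzip.NewWriter(f)
	gz.Write([]byte("1\tus\timpression\n2\tgb\tclick\n"))
	gz.Close()
	f.Close()
	return path
}

// countRows decompresses a .tsv.gz file and returns the number of
// rows and the widest row's tab-separated column count.
func countRows(path string) (rows, cols int, err error) {
	f, err := os.Open(path)
	if err != nil {
		return 0, 0, err
	}
	defer f.Close()

	gz, err := gzip.NewReader(f)
	if err != nil {
		return 0, 0, err
	}
	defer gz.Close()

	sc := bufio.NewScanner(gz)
	for sc.Scan() {
		rows++
		if c := len(strings.Split(sc.Text(), "\t")); c > cols {
			cols = c
		}
	}
	return rows, cols, sc.Err()
}

func main() {
	path := writeSample()
	defer os.Remove(path)

	rows, cols, err := countRows(path)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%d rows, up to %d columns\n", rows, cols)
}
```

Running the same check against the real generator output would confirm the row counts before the load step below.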

Load and run SQL

We'll move up one directory level:

cd ..

In the file setup.sql, the following four code blocks each use a pipeline and have been commented out:

...
-- create or replace pipeline load_events as
-- load data s3 'reference-solutions.memsql.com/events-large.tsv.gz'
-- config '{"region": "us-east-1"}'
-- credentials '{"aws_access_key_id": "", "aws_secret_access_key": ""}'
-- into table events;
-- start pipeline load_events;
...
-- create or replace pipeline load_campaigns as
-- load data s3 'reference-solutions.memsql.com/campaigns-large.tsv.gz'
-- config '{"region": "us-east-1"}'
-- credentials '{"aws_access_key_id": "", "aws_secret_access_key": ""}'
-- skip duplicate key errors
-- into table campaigns;
-- start pipeline load_campaigns;
...
-- create or replace pipeline load_networks as
-- load data s3 'reference-solutions.memsql.com/networks-large.tsv.gz'
-- config '{"region": "us-east-1"}'
-- credentials '{"aws_access_key_id": "", "aws_secret_access_key": ""}'
-- skip duplicate key errors
-- into table networks;
-- start pipeline load_networks;
...
-- create or replace pipeline load_platforms as
-- load data s3 'reference-solutions.memsql.com/platforms-large.tsv.gz'
-- config '{"region": "us-east-1"}'
-- credentials '{"aws_access_key_id": "", "aws_secret_access_key": ""}'
-- skip duplicate key errors
-- into table platforms;
-- start pipeline load_platforms;
...

If we decide to use pipelines later, we can uncomment and modify the code to suit our needs.

The following four code blocks have been added in their place, so that we can load the locally generated data instead:

...
load data local infile "generator/events-small.tsv.gz"
into table events
columns terminated by '\t';
...
load data local infile "generator/campaigns-small.tsv.gz"
into table campaigns
skip duplicate key errors
columns terminated by '\t';
...
load data local infile "generator/networks-small.tsv.gz"
into table networks
skip duplicate key errors
columns terminated by '\t';
...
load data local infile "generator/platforms-small.tsv.gz"
into table platforms
skip duplicate key errors
columns terminated by '\t';
...

Next, we'll run the commands in the setup.sql file to create the adtech database and tables and load the data:

mysql -u admin -h <host> -P 3306 -p<password> --local-infile < setup.sql

We'll replace the <host> and <password> with the values from our SingleStoreDB Cloud account.
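The same connection details can also be reused from Go code. Here is a minimal sketch, assuming the go-sql-driver/mysql DSN format (user:password@tcp(host:port)/database); the host and password below are placeholders, exactly as in the mysql command above:

```go
package main

import "fmt"

// singlestoreDSN builds a go-sql-driver/mysql style DSN for the adtech
// database. The resulting string can be passed to sql.Open("mysql", dsn)
// after importing github.com/go-sql-driver/mysql.
func singlestoreDSN(user, password, host string, port int, db string) string {
	return fmt.Sprintf("%s:%s@tcp(%s:%d)/%s", user, password, host, port, db)
}

func main() {
	// <host> and <password> are placeholders for the values from our
	// SingleStoreDB Cloud account.
	dsn := singlestoreDSN("admin", "<password>", "<host>", 3306, "adtech")
	fmt.Println(dsn)
}
```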

From our SingleStoreDB Cloud account, we can use the SQL Editor to check that the tables (campaigns, events, networks and platforms) were correctly created:

USE adtech;

SHOW TABLES;

Benchmarking using dbbench (optional)

The dbbench tool can be used to run workloads against a SingleStoreDB database. A discussion of the tool is beyond the scope of this article, but the SingleStore GitHub repo is a great place to start and contains a tutorial.

To build and run the dbbench program, we'll first change to the directory starter_kits/adtech/benchmarks.

Next, we'll set up some environment variables:

export GOPATH=$HOME/go
export PATH=$PATH:$GOPATH/bin

and then install dbbench, as follows:

go install github.com/memsql/dbbench@latest

We can run the program from the starter_kits/adtech/benchmarks directory, as follows:

dbbench --host=<host> --username=admin --password=<password> --intermediate-stats=false adtech.ini

We'll replace the <host> and <password> with the values from our SingleStoreDB Cloud account.

Here is an example of the results using the small dataset with an S-00 Workspace Size:

Connected
starting impressions per campaign
starting ctr_per_country
starting campaign facts
starting spend per advertiser
stopping spend per advertiser
stopping ctr_per_country
stopping campaign facts
stopping impressions per campaign

ctr_per_country: latency 59.327792ms±15.229575ms; 30 transactions (16.847 TPS); 7470 rows (4194.792 RPS)
 16.777216ms -  33.554432ms [    12]: ██████████████████████████████████████████████████
 33.554432ms -  67.108864ms [     7]: █████████████████████████████▏
 67.108864ms - 134.217728ms [    10]: █████████████████████████████████████████▋
134.217728ms - 268.435456ms [     1]: ████▏

spend per advertiser: latency 58.056977ms±15.261319ms; 30 transactions (17.215 TPS); 15870 rows (9106.741 RPS)
  8.388608ms -  16.777216ms [     1]: ███▎
 16.777216ms -  33.554432ms [     6]: ████████████████████
 33.554432ms -  67.108864ms [    15]: ██████████████████████████████████████████████████
 67.108864ms - 134.217728ms [     7]: ███████████████████████▎
134.217728ms - 268.435456ms [     1]: ███▎

impressions per campaign: latency 116.803395ms±35.36871ms; 30 transactions (8.559 TPS); 30 rows (8.559 RPS)
  8.388608ms -  16.777216ms [     2]: ██████▋
 16.777216ms -  33.554432ms [     2]: ██████▋
 33.554432ms -  67.108864ms [     2]: ██████▋
 67.108864ms - 134.217728ms [    15]: ██████████████████████████████████████████████████
134.217728ms - 268.435456ms [     7]: ███████████████████████▎
268.435456ms - 536.870912ms [     2]: ██████▋

campaign facts: latency 64.628849ms±30.600693ms; 30 transactions (15.464 TPS); 0 rows (0.000 RPS)
  8.388608ms -  16.777216ms [     5]: █████████████████████████
 16.777216ms -  33.554432ms [     5]: █████████████████████████
 33.554432ms -  67.108864ms [    10]: ██████████████████████████████████████████████████
 67.108864ms - 134.217728ms [     9]: █████████████████████████████████████████████
134.217728ms - 268.435456ms [     0]: 
268.435456ms - 536.870912ms [     1]: █████

Using dbbench, we can build reference workloads and then measure database performance against them.

Visualisations using Looker

Finally, we'll create some visualisations.

To use Looker, we can use an existing account or request a free trial.

Once logged into the Looker account, from Account > Admin > Connections we'll create a New Connection, as follows:

  • Name: reference_solutions
  • Dialect: SingleStore 7+
  • Remote Host: <host> Port: 3306
  • Database: adtech
  • Username: admin
  • Password: <password>

We'll replace the <host> and <password> with the values from our SingleStoreDB Cloud account.

We can test that the connection is successful before saving it.

Next, from Account > Develop, we'll ensure that Development Mode is toggled to ON, select Manage LookML Projects, and choose New LookML Project.

We'll now enter the following:

  • Project Name: adtech
  • Starting Point: Generate Model from Database Schema
  • Connection: reference_solutions
  • Build Views From: All Tables

Next, we'll select New > Dashboard from the Looker home page. We'll use the name Events and then click Create Dashboard.

We'll click Edit Dashboard and then Add Visualization.

From the left nav, we'll select the Events table, which will appear expanded in a new window. Under DIMENSIONS, we'll select Country; under MEASURES, we'll choose Count. We'll select Google Maps on the right-hand side and then choose Run.

This will create a heatmap as shown in Figure 1.

Figure 1. Count per Country.


Hovering over the different countries will provide the number of events for that country.

From the Looker home page, we'll select New > Dashboard. We'll use the name Campaigns and then click Create Dashboard.

We'll click Edit Dashboard and then Add Visualization.

We'll select the Campaigns table from the left nav, which will appear expanded in a new window. Under DIMENSIONS, we'll select Campaign Name; under MEASURES, we'll choose Count. On the right-hand side, we'll set the Row Limit to 10, select Pie Chart and then choose Run.

This will create a chart as shown in Figure 2.

Figure 2. Count per Campaign.


Additional dashboards could be created to get further insights.

Summary

We have generated some example AdTech data and loaded the small dataset into SingleStoreDB Cloud. The next step would be to create and load the medium or large dataset. We could also store the AdTech data in the cloud and use the pipelines feature to stream the data into SingleStoreDB.

We have built and tested dbbench. This tool could be handy for creating reference test results.

Finally, we have created two quick dashboards using Looker to gain insights into some of our AdTech data. Additional visualisations could be easily created using this powerful browser-based tool.
