
Ajeet Singh Raina for Redis


How to import CSV data into a Redis database

Image2

CSV stands for "comma-separated values". A CSV file is a plain text file in which each line is a data record, and each record consists of one or more fields separated by commas or another delimiter. The format is commonly used by spreadsheet apps.
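To make the record and field structure concrete, here is a minimal sketch that parses a small CSV string with Python's standard csv module. The sample data and the read_records helper are made up for illustration and are not part of any Redis tooling.

```python
import csv
import io

# Hypothetical sample: a header line followed by two records.
SAMPLE = "id,name,age\n0,Alice,30\n1,Bob,25\n"


def read_records(text):
    """Parse CSV text into a list of dicts keyed by the header fields."""
    return list(csv.DictReader(io.StringIO(text)))


records = read_records(SAMPLE)
for record in records:
    # Every field comes back as a string; convert types yourself if needed.
    print(record["id"], record["name"], record["age"])
```

Each dict corresponds to one line of the file, and each key corresponds to one column of the header.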

Sometimes Redis instances need to be loaded with a large amount of preexisting or user-generated data in a short amount of time, so that millions of keys are created as fast as possible. This is called "mass insertion".

https://stackoverflow.com/questions/32149626/how-to-insert-billion-of-data-to-redis-efficiently/32165090

For example, the following command uses the Linux utility "awk" to import data into Redis.

cat data.csv | awk -F',' '{print " SET \""$1"\" \""$2"\" \n"}' | redis-cli --pipe 

Using redis-cli and common Linux tools like "awk" to perform mass insertion might not always be a good idea. Imagine you have a ton of graph data that you want to insert into a Redis database. If the generated commands are sent one after the other without pipelining, you pay the round-trip time for every single command, which results in slow performance. Expressing nodes and relationships as hand-written commands also gets unwieldy quickly.
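For plain key-value data, one way to avoid per-command round trips is to generate raw Redis protocol (RESP) and stream it through redis-cli --pipe. The sketch below assumes a two-column CSV of keys and values; resp_command and csv_to_resp are illustrative helper names, not part of redis-cli.

```python
import csv
import io


def resp_command(*args):
    """Encode one command in the Redis serialization protocol (RESP)."""
    parts = [b"*%d\r\n" % len(args)]
    for arg in args:
        data = str(arg).encode("utf-8")
        # Each argument is sent as a bulk string: $<byte length>, then the bytes.
        parts.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(parts)


def csv_to_resp(text):
    """Turn every row with at least two columns into a SET command."""
    out = []
    for row in csv.reader(io.StringIO(text)):
        if len(row) >= 2:
            out.append(resp_command("SET", row[0], row[1]))
    return b"".join(out)


# Hypothetical two-row input, for illustration only.
payload = csv_to_resp("key1,value1\nkey2,value2\n")
```

To use it, write the payload to standard output with sys.stdout.buffer.write(payload) and pipe the script's output into redis-cli --pipe. This only covers simple SET commands; it does not help with graph data.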

Introducing RedisGraph Bulk Loader

If you have a bunch of CSV files that you want to load into a RedisGraph database, try the Bulk Loader utility. Aptly named RedisGraph Bulk Loader, this tool is written in Python and builds RedisGraph databases from CSV inputs. It requires a Python 3 interpreter.

Follow the steps below to load CSV data into a graph database hosted on Redis Enterprise Cloud.

Step 1. Create a free Redis Enterprise Cloud account

Image6

Create your free Redis Enterprise Cloud account by visiting this link. Use coupon code MATRIX200 to get $200 in free credit for Redis Enterprise Cloud.

Image1

Step 2. Create a subscription and database

Follow this link to create a Redis Enterprise Cloud subscription and database as shown below:

Image5

The database endpoint URL is unique to each database, so yours will differ from the one shown. Save it for future reference.
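If you plan to connect from scripts later, it can help to turn the saved endpoint into a single connection URL. The sketch below assumes the endpoint has the form host:port; the hostname, port and password are placeholders, and build_redis_url is a hypothetical helper.

```python
from urllib.parse import quote


def build_redis_url(endpoint, password):
    """Build a redis:// URL from a 'host:port' endpoint and a password."""
    host, _, port = endpoint.rpartition(":")
    if not host or not port.isdigit():
        raise ValueError(f"expected 'host:port', got {endpoint!r}")
    # Percent-encode the password so characters like '@' or '/' stay valid.
    return f"redis://:{quote(password, safe='')}@{host}:{port}"


# Placeholder endpoint and password, for illustration only.
url = build_redis_url("redis-12345.example.com:12345", "s3cret/pw")
print(url)
```

Clients such as redis-py accept a URL of this shape through Redis.from_url.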

Step 3. Clone the Bulk Loader Utility

  $ git clone https://github.com/RedisGraph/redisgraph-bulk-loader

Step 4. Install the RedisGraph Bulk Loader tool

The bulk loader can be installed using pip:

   pip3 install redisgraph-bulk-loader

Or install it directly from the GitHub repository:

  pip3 install git+https://github.com/RedisGraph/redisgraph-bulk-loader.git@master

Step 5. Create a Python virtual environment

  python3 -m venv redisgraphloader

Step 6. Activate the virtual environment:

  source redisgraphloader/bin/activate
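If you are unsure whether the activation worked, a quick check from Python is to compare the interpreter prefixes. This is a small sketch using only the standard library; in_virtualenv is an illustrative name.

```python
import sys


def in_virtualenv():
    """Return True when the interpreter runs inside a venv."""
    # In a venv, sys.prefix points at the environment while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


print("virtual environment active:", in_virtualenv())
```

Run it with the same python3 you intend to use for the loader, since packages installed with pip3 go into whichever environment is active.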

Step 7. Install the dependencies for the bulk loader (run this from the root of the cloned repository, where requirements.txt lives):

  pip3 install -r requirements.txt

If the above command doesn’t work, install the following modules individually:

  pip3 install pathos
  pip3 install redis
  pip3 install click

Step 8. Install Groovy and generate the CSV files

Install Groovy, make sure the script points to your Redis Enterprise Cloud endpoint URL and password, and then run it to generate the sample CSV files:

  groovy generateCommerceGraphCSVForImport.groovy

Step 9. Verify the .csv files created

 head -n2 *.csv
 ==> addtocart.csv <==
 src_person,dst_product,timestamp
 0,1156,2010-07-20T16:11:20.551748

 ==> contain.csv <==
 src_person,dst_order
 2000,1215

 ==> order.csv <==
 _internalid,id,subTotal,tax,shipping,total
 2000,0,904.71,86.40,81.90,1073.01

 ==> person.csv <==
 _internalid,id,name,address,age,memberSince
 0,0,Cherlyn Corkery,146 Kuphal Isle South Jarvis MS 74838-0662,16,2010-03-18T16:25:20.551748

 ==> product.csv <==
 _internalid,id,name,manufacturer,msrp
 1000,0,Sleek Plastic Car,Thiel Hills and Leannon,385.62

 ==> transact.csv <==
 src_person,dst_order
 2,2000

 ==> view.csv <==
 src_person,dst_product,timestamp
 0,1152,2012-04-14T11:23:20.551748
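Beyond eyeballing the first rows, you may want to check that every relation row points at nodes that actually exist. The sketch below assumes the loader matches the first two columns of a relation file against the first column of the node files, which is consistent with the _internalid, src_ and dst_ headers above. The inline samples are trimmed or invented for illustration, and both helper names are hypothetical.

```python
import csv
import io


def first_column_ids(node_csv_text):
    """Collect the identifiers from the first column of a node file."""
    reader = csv.reader(io.StringIO(node_csv_text))
    next(reader)  # skip the header row
    return {row[0].strip() for row in reader if row}


def dangling_relations(relation_csv_text, source_ids, dest_ids):
    """Return relation rows whose endpoints are missing from the node files."""
    reader = csv.reader(io.StringIO(relation_csv_text))
    next(reader)  # skip the header row
    return [
        row
        for row in reader
        if row
        and (row[0].strip() not in source_ids or row[1].strip() not in dest_ids)
    ]


# Tiny inline samples; the last transact row is deliberately broken.
person_csv = "_internalid,id,name\n0,0,Cherlyn Corkery\n2,2,Sample Person\n"
order_csv = "_internalid,id,total\n2000,0,1073.01\n"
transact_csv = "src_person,dst_order\n2,2000\n7,2000\n"

people = first_column_ids(person_csv)
orders = first_column_ids(order_csv)
bad = dangling_relations(transact_csv, people, orders)
print(bad)
```

For real files, read each file's contents with open(path).read() and pass the text to the same helpers.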

Step 10. Run the Bulk loader script

   python3 bulk_insert.py prodrec-bulk -n person.csv -n product.csv -n order.csv -r view.csv -r addtocart.csv -r transact.csv -r contain.csv
  person  [####################################]  100%
  1000 nodes created with label 'person'
  product  [####################################]  100%
  1000 nodes created with label 'product'
  order  [####################################]  100%
  811 nodes created with label 'order'
  view  [####################################]  100%
  24370 relations created for type 'view'
  addtocart  [####################################]  100%
  6458 relations created for type 'addtocart'
  transact  [####################################]  100%
  811 relations created for type 'transact'
  contain  [####################################]  100%
  1047 relations created for type 'contain'
  Construction of graph 'prodrec-bulk' complete: 2811 nodes created, 32686 relations created in 1.021761 seconds
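When the list of CSV files grows, it can be easier to assemble the command from Python than to type it by hand. This sketch only uses the -n and -r flags shown in the command above; build_loader_command is a hypothetical helper, and the script path is an assumption that depends on where you run it.

```python
import shlex


def build_loader_command(graph, node_files, relation_files, script="bulk_insert.py"):
    """Assemble the argument list for the loader invocation shown above."""
    args = ["python3", script, graph]
    for path in node_files:
        args += ["-n", path]  # one -n flag per node (label) file
    for path in relation_files:
        args += ["-r", path]  # one -r flag per relation file
    return args


command = build_loader_command(
    "prodrec-bulk",
    ["person.csv", "product.csv", "order.csv"],
    ["view.csv", "addtocart.csv", "transact.csv", "contain.csv"],
)
print(shlex.join(command))  # shlex.join requires Python 3.8 or newer
```

The resulting list can be passed to subprocess.run(command, check=True) to launch the loader.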
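You can then query the graph from redis-cli. Note that this query targets a graph named prodrec, not the prodrec-bulk graph created above, which likely explains the empty result; use the graph name you passed to the loader: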
  graph.query prodrec "match (p:person) where p.id=200 return p.name"
  1) 1) "p.name"
  2) (empty array)
  3) 1) "Cached execution: 0"
     2) "Query internal execution time: 0.518300 milliseconds"

Step 11. Install RedisInsight

To use RedisInsight on a local Mac, download it from the RedisInsight page on the RedisLabs website:

Click this link to access a form that allows you to select the operating system of your choice.

Image4

Alternatively, if you have Docker Engine installed on your system, the quickest way is to run the following command:

  docker run -d -v redisinsight:/db -p 8001:8001 redislabs/redisinsight:latest

Step 12. Access RedisInsight

Next, point your browser to http://localhost:8001. Once the dashboard loads, supply your database endpoint, port and password to connect to the remote Redis Enterprise Cloud database.

Step 13. Run the graph query

  GRAPH.QUERY "prodrec-bulk" "match (p:person) where p.id=199 return p"

Image111
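The same query can also be sent from Python. The sketch below assumes the redis package installed earlier and uses its generic execute_command call. The reply shape (header, rows, statistics) mirrors the redis-cli output shown earlier and is treated here as an assumption; rows_as_dicts and run_query are illustrative names.

```python
def rows_as_dicts(reply):
    """Pair each result row with the column names from the header."""
    header, rows = reply[0], reply[1]
    return [dict(zip(header, row)) for row in rows]


def run_query(url, graph, query):
    """Send GRAPH.QUERY over a redis-py connection (needs `pip3 install redis`)."""
    import redis  # imported lazily so rows_as_dicts works without the package

    client = redis.Redis.from_url(url, decode_responses=True)
    return client.execute_command("GRAPH.QUERY", graph, query)


# A reply shaped like the redis-cli output shown earlier, with one made-up row.
sample_reply = [
    ["p.name"],
    [["Cherlyn Corkery"]],
    ["Cached execution: 0", "Query internal execution time: 0.5 milliseconds"],
]
print(rows_as_dicts(sample_reply))
```

This pairing works for scalar columns such as p.name. Returning a whole node, as in "return p", produces a nested structure that needs extra unpacking.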

Further Reading

Click Here to access 85+ Sample Redis Apps

Image10
