CSV stands for "comma-separated values". A CSV file is a plain-text file in which each line is a data record, and the fields within a record are separated by commas or another delimiter. The format is commonly used by spreadsheet applications. Each record consists of one or more fields.
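As a small illustration of the record-and-field structure, the sketch below parses a two-line CSV sample with Python's standard csv module. The sample reuses the header and first row of the product.csv file shown in Step 9; the function name read_records is only illustrative.

```python
import csv
import io

# Sample data: a header row plus one record, copied from product.csv in Step 9.
SAMPLE = (
    "_internalid,id,name,manufacturer,msrp\n"
    "1000,0,Sleek Plastic Car,Thiel Hills and Leannon,385.62\n"
)


def read_records(text, delimiter=","):
    """Return each CSV record as a dict keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text), delimiter=delimiter))


records = read_records(SAMPLE)
for record in records:
    # Every field is read as a string; convert types yourself if needed.
    print(record["name"], record["msrp"])
```

Note that csv.DictReader treats the first line as the header, so the sample yields a single record.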
Sometimes Redis instances need to be loaded with a large amount of preexisting or user-generated data in a short amount of time, so that millions of keys are created as fast as possible. This is called a "mass insertion".
For example, the command below uses the Linux utility "awk" to import data into Redis:
cat data.csv | awk -F',' '{print " SET \""$1"\" \""$2"\" \n"}' | redis-cli --pipe
Using redis-cli and generic Linux tools like awk works for simple key-value data, but it is a poor fit for graph data. Imagine you have a large amount of graph data that you want to insert into a Redis database. With a utility like "awk" you have to emit a separate command for every node and relationship, and if those commands are sent one after the other rather than pipelined, you pay the round-trip time for every single command, which makes the load slow.
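For plain key-value data like the awk example above, the same transformation can be sketched in Python by encoding each SET command in the Redis protocol, which is the input format that redis-cli --pipe consumes. This is a minimal sketch that assumes a two-column CSV (key, value) with no header row; the function names are illustrative.

```python
import csv
import io


def gen_redis_proto(*args):
    """Encode one command in the Redis protocol (RESP) as bytes."""
    parts = [b"*%d\r\n" % len(args)]
    for arg in args:
        data = str(arg).encode("utf-8")
        # Lengths are counted in bytes, not characters.
        parts.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(parts)


def csv_to_set_commands(lines):
    """Yield one encoded SET command per two-column CSV row."""
    for row in csv.reader(lines):
        if len(row) >= 2:
            yield gen_redis_proto("SET", row[0], row[1])


def write_commands(lines, out):
    """Write all encoded commands to a binary stream such as sys.stdout.buffer."""
    for command in csv_to_set_commands(lines):
        out.write(command)


# In-memory demonstration with a single row.
demo = b"".join(csv_to_set_commands(io.StringIO("key1,value1\n")))
print(demo)
```

To use it in a pipeline, you could call write_commands(sys.stdin, sys.stdout.buffer) at the end of the script (after importing sys) and pipe its output into redis-cli --pipe.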
Introducing RedisGraph Bulk Loader
If you have a set of CSV files that you want to load into a RedisGraph database, try the RedisGraph Bulk Loader. This tool is written in Python and builds RedisGraph databases from CSV inputs. It requires a Python 3 interpreter.
Follow the steps below to load CSV data into a graph database hosted on Redis Enterprise Cloud.
Step 1. Create a free Redis Enterprise Cloud account
Create your free Redis Enterprise Cloud account by visiting this link. Use the coupon code MATRIX200 to get $200 in free credit for Redis Enterprise Cloud.
Follow this link to create a Redis Enterprise Cloud subscription and database.
The database endpoint URL is unique to each database, so yours will differ from anyone else's. Save it for future reference.
Step 3. Clone the Bulk Loader Utility
$ git clone https://github.com/RedisGraph/redisgraph-bulk-loader
Step 4. Install the RedisGraph Bulk Loader tool
The bulk loader can be installed using pip:
pip3 install redisgraph-bulk-loader
Or
pip3 install git+https://github.com/RedisGraph/redisgraph-bulk-loader.git@master
Step 5. Create a Python virtual env for this work
python3 -m venv redisgraphloader
Step 6. Activate the virtual environment
source redisgraphloader/bin/activate
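If you want to confirm that the environment is active before installing dependencies, the following sketch compares the interpreter's prefix with its base prefix. It assumes the environment was created with python3 -m venv as in Step 5; the function name is illustrative.

```python
import sys


def in_virtualenv():
    """Return True when the interpreter runs inside a venv-created environment."""
    # Inside a venv, sys.prefix points at the environment directory,
    # while sys.base_prefix still points at the base Python installation.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


print("virtual environment active:", in_virtualenv())
print("interpreter prefix:", sys.prefix)
```

After running the activate command, the reported prefix should end in the redisgraphloader directory.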
Step 7. Install the dependencies for the bulk loader:
pip3 install -r requirements.txt
If the above command doesn’t work, install the modules below individually:
pip3 install pathos
pip3 install redis
pip3 install click
Step 8. Install Groovy and generate the CSV files
Ensure that the app points to your Redis Enterprise Cloud endpoint URL and password, then run the Groovy script:
groovy generateCommerceGraphCSVForImport.groovy
Step 9. Verify the generated .csv files
head -n2 *.csv
==> addtocart.csv <==
src_person,dst_product,timestamp
0,1156,2010-07-20T16:11:20.551748
==> contain.csv <==
src_person,dst_order
2000,1215
==> order.csv <==
_internalid,id,subTotal,tax,shipping,total
2000,0,904.71,86.40,81.90,1073.01
==> person.csv <==
_internalid,id,name,address,age,memberSince
0,0,Cherlyn Corkery,146 Kuphal Isle South Jarvis MS 74838-0662,16,2010-03-18T16:25:20.551748
==> product.csv <==
_internalid,id,name,manufacturer,msrp
1000,0,Sleek Plastic Car,Thiel Hills and Leannon,385.62
==> transact.csv <==
src_person,dst_order
2,2000
==> view.csv <==
src_person,dst_product,timestamp
0,1152,2012-04-14T11:23:20.551748
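In the listing above, the node files (person, product, order) start with an _internalid column, while the relationship files start with src_ and dst_ columns. The sketch below uses that naming pattern to sort files into -n and -r arguments for bulk_insert.py. The pattern is an assumption about this particular dataset, not a rule enforced by the bulk loader, and the function names are illustrative.

```python
import csv
import io


def classify_header(header_line):
    """Return 'relation' if the first two columns look like src_/dst_ endpoints, else 'node'."""
    columns = next(csv.reader(io.StringIO(header_line)), [])
    if len(columns) >= 2 and columns[0].startswith("src_") and columns[1].startswith("dst_"):
        return "relation"
    return "node"


def build_loader_args(graph_name, headers):
    """Build an argv list; headers maps each CSV file name to its header line."""
    node_args, relation_args = [], []
    for filename, header in headers.items():
        if classify_header(header) == "node":
            node_args += ["-n", filename]
        else:
            relation_args += ["-r", filename]
    # Node files are listed first, then relationship files.
    return ["python3", "bulk_insert.py", graph_name] + node_args + relation_args


# Header lines copied from the head -n2 output above.
HEADERS = {
    "person.csv": "_internalid,id,name,address,age,memberSince",
    "product.csv": "_internalid,id,name,manufacturer,msrp",
    "order.csv": "_internalid,id,subTotal,tax,shipping,total",
    "view.csv": "src_person,dst_product,timestamp",
    "addtocart.csv": "src_person,dst_product,timestamp",
    "transact.csv": "src_person,dst_order",
    "contain.csv": "src_person,dst_order",
}

command = build_loader_args("prodrec-bulk", HEADERS)
print(" ".join(command))
```

The resulting list can be passed to subprocess.run if you prefer to launch the loader from Python rather than from the shell.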
Step 10. Run the Bulk loader script
python3 bulk_insert.py prodrec-bulk -n person.csv -n product.csv -n order.csv -r view.csv -r addtocart.csv -r transact.csv -r contain.csv
person [####################################] 100%
1000 nodes created with label 'person'
product [####################################] 100%
1000 nodes created with label 'product'
order [####################################] 100%
811 nodes created with label 'order'
view [####################################] 100%
24370 relations created for type 'view'
addtocart [####################################] 100%
6458 relations created for type 'addtocart'
transact [####################################] 100%
811 relations created for type 'transact'
contain [####################################] 100%
1047 relations created for type 'contain'
Construction of graph 'prodrec-bulk' complete: 2811 nodes created, 32686 relations created in 1.021761 seconds
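As a quick sanity check, the per-label and per-type counts should add up to the totals in the final summary line. The sketch below parses the count lines from the output above and sums them; the regular expression is written against this exact output format and is an assumption, not part of the bulk loader.

```python
import re

# Count lines copied from the loader output above (progress bars omitted).
LOADER_OUTPUT = """\
1000 nodes created with label 'person'
1000 nodes created with label 'product'
811 nodes created with label 'order'
24370 relations created for type 'view'
6458 relations created for type 'addtocart'
811 relations created for type 'transact'
1047 relations created for type 'contain'
"""

COUNT_LINE = re.compile(
    r"^(\d+) (nodes|relations) created (?:with label|for type) '([^']+)'$"
)


def summarize(output):
    """Sum node and relation counts across all matching lines."""
    totals = {"nodes": 0, "relations": 0}
    for line in output.splitlines():
        match = COUNT_LINE.match(line.strip())
        if match:
            totals[match.group(2)] += int(match.group(1))
    return totals


totals = summarize(LOADER_OUTPUT)
print(totals)
```

Here the sums come to 2811 nodes and 32686 relations, matching the summary line reported by the loader.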
graph.query prodrec "match (p:person) where p.id=200 return p.name"
1) 1) "p.name"
2) (empty array)
3) 1) "Cached execution: 0"
2) "Query internal execution time: 0.518300 milliseconds"
Step 11. Install RedisInsight
To use RedisInsight on a local Mac, download it from the RedisInsight page on the Redis Labs website.
Click this link to access a form that allows you to select the operating system of your choice.
Alternatively, if you have Docker Engine installed on your system, the quickest way is to run the following command:
docker run -d -v redisinsight:/db -p 8001:8001 redislabs/redisinsight:latest
Step 12. Access RedisInsight
Next, point your browser to http://localhost:8001. Once you can access the dashboard, supply your database endpoint, port, and password to connect to the remote Redis Enterprise Cloud database.
Step 13. Run the graph query
GRAPH.QUERY "prodrec-bulk" "match (p:person) where p.id=199 return p"
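The same query can also be issued from Python rather than from RedisInsight. The sketch below sends GRAPH.QUERY through the generic execute_command method of the redis package installed earlier. The helper names are illustrative, and the host, port, and password in the commented usage lines are placeholders for your own endpoint details.

```python
def run_graph_query(client, graph_name, cypher):
    """Send GRAPH.QUERY through any client that exposes execute_command()."""
    return client.execute_command("GRAPH.QUERY", graph_name, cypher)


def connect(host, port, password):
    """Create a redis-py client for the given endpoint (placeholder values)."""
    import redis  # installed earlier with: pip3 install redis

    return redis.Redis(
        host=host, port=port, password=password, decode_responses=True
    )


# Example usage (placeholders, replace with your own endpoint details):
# client = connect("<your-endpoint-host>", 12345, "<your-password>")
# reply = run_graph_query(
#     client, "prodrec-bulk", "match (p:person) where p.id=199 return p"
# )
# print(reply)
```

Keeping the query helper separate from the connection code makes it easy to reuse with any client object, including a stub in tests.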
Further Reading
- Learn about Redis and its use cases
- Building Movies database app using Redis
- I used Redis as a primary database, and here is what happened
- Import Data into a Redis Database