Problem
Relational databases are great. Normalization of schema allows to store user's data in your application cheaper and faster than in document oriented NoSQL solutions. But complex, deep schemas can be real pain if you need to determine rows ownership and extract them.
There are few scenarios where you may want to extract some user and all records that belong to him (like addresses, orders, comments, etc) from database:
- For backup. To move inactive user data to cheap, long-term storage and free database space.
- For legal purposes. If you need to secure current state of user.
- For better load balancing. To migrate user between database shards.
- For sales. To prepare some role model demo user in your application, save his data and load it into demo instances of your application.
- For debugging. If your application is misbehaving and you want to analyze user data in isolated manner and clone it from production to local machine.
Solution
I work a lot with big schemas. Hundreds of databases, hundreds of tables in each database, deep relation chains, up to several schema changes a day. So I created a tool automate relational data extraction: UpRooted.
Goals were simple:
-
Fully automated. I do not want to write a single
JOIN
. - Optimal. Data in related tables should be reached in most efficient way from database standpoint.
-
Smart. Data should be processed in order that allows future loading into another database without breaking
FOREIGN KEY
constraints.
Example
Let's assume you have MySQL database and you need to extract user from users
table having id
= 1 with all his data from related tables.
The easiest way to start using UpRooted is to take Docker container.
Create working directory for scripts and output files, for example:
mkdir /Users/me/uprooted
Run Docker and mount this directory:
docker run --interactive --tty --volume /Users/me/uprooted:/home/uprooted bbkr2/uprooted:latest
UpRooted is written in Raku so create example.raku
file in this directory with following content (remember to adjust connection credentials):
use DBIish;
use UpRooted::Schema::MySQL;
use UpRooted::Tree;
use UpRooted::Reader::MySQL;
use UpRooted::Writer::MySQLFile;
my $connection = DBIish.connect(
'mysql',
host => '***', port => 3306, user => '***', password => '***', database => '***'
);
my $schema = UpRooted::Schema::MySQL.new( :$connection );
my $tree = UpRooted::Tree.new( root-table => $schema.table( 'users' ) );
my $reader = UpRooted::Reader::MySQL.new( :$connection, :$tree );
my $writer = UpRooted::Writer::MySQLFile.new;
$writer.write( $reader, id => 1 );
Run your script from within container:
raku example.raku
Your user will be saved to out.sql
file in /Users/me/uprooted
directory.
More info
I strongly recommend reading UpRooted README for explanation of each module in UpRooted::
namespace, more advanced use cases and potential schema design issues.
If you have found any bugs or have some suggestions feel free to create issue on GitHub.
Happy data extraction :)
Top comments (1)
Looks useful