DEV Community

Paweł bbkr Pabian
Paweł bbkr Pabian

Posted on

Extracting user data tree from relational databases

Problem

Relational databases are great. Normalization of schema allows to store user's data in your application cheaper and faster than in document oriented NoSQL solutions. But complex, deep schemas can be real pain if you need to determine rows ownership and extract them.

There are few scenarios where you may want to extract some user and all records that belong to him (like addresses, orders, comments, etc) from database:

  • For backup. To move inactive user data to cheap, long-term storage and free database space.
  • For legal purposes. If you need to secure current state of user.
  • For better load balancing. To migrate user between database shards.
  • For sales. To prepare some role model demo user in your application, save his data and load it into demo instances of your application.
  • For debugging. If your application is misbehaving and you want to analyze user data in isolated manner and clone it from production to local machine.

Solution

I work a lot with big schemas. Hundreds of databases, hundreds of tables in each database, deep relation chains, up to several schema changes a day. So I created a tool automate relational data extraction: UpRooted.

Goals were simple:

  • Fully automated. I do not want to write a single JOIN.
  • Optimal. Data in related tables should be reached in most efficient way from database standpoint.
  • Smart. Data should be processed in order that allows future loading into another database without breaking FOREIGN KEY constraints.

Example

Let's assume you have MySQL database and you need to extract user from users table having id = 1 with all his data from related tables.

The easiest way to start using UpRooted is to take Docker container.

Create working directory for scripts and output files, for example:

mkdir /Users/me/uprooted
Enter fullscreen mode Exit fullscreen mode

Run Docker and mount this directory:

docker run --interactive --tty --volume /Users/me/uprooted:/home/uprooted bbkr2/uprooted:latest
Enter fullscreen mode Exit fullscreen mode

UpRooted is written in Raku so create example.raku file in this directory with following content (remember to adjust connection credentials):

use DBIish;
use UpRooted::Schema::MySQL;
use UpRooted::Tree;
use UpRooted::Reader::MySQL;
use UpRooted::Writer::MySQLFile;

my $connection = DBIish.connect(
    'mysql',
    host => '***', port => 3306, user => '***', password => '***', database => '***'
);

my $schema = UpRooted::Schema::MySQL.new( :$connection );
my $tree = UpRooted::Tree.new( root-table => $schema.table( 'users' ) );
my $reader = UpRooted::Reader::MySQL.new( :$connection, :$tree );
my $writer = UpRooted::Writer::MySQLFile.new;

$writer.write( $reader, id => 1 );
Enter fullscreen mode Exit fullscreen mode

Run your script from within container:

raku example.raku
Enter fullscreen mode Exit fullscreen mode

Your user will be saved to out.sql file in /Users/me/uprooted directory.

More info

I strongly recommend reading UpRooted README for explanation of each module in UpRooted:: namespace, more advanced use cases and potential schema design issues.

If you have found any bugs or have some suggestions feel free to create issue on GitHub.

Happy data extraction :)

Top comments (1)

Collapse
 
handyman profile image
Nik Handyman

Looks useful