loading...
Cover image for Managing fixtures in raw PHP

Managing fixtures in raw PHP

manuelmolina97 profile image Manuel ・7 min read

This post complements to these ones:

We have developed this solution for a MariaDB database, it may vary if you use other database.

We started managing fixtures in a way after each test we restart the whole database, but that just does work in real environments as databases get massive. So we needed a way to reload tables just when they have been modified. For this we used Pixie, which is a wrapper of PDO, but whichever you use in your projects should be enough.

Let's have a look at FixtureLoader class step by step:

class FixtureLoader
{

    public static DatabaseService $databaseService;

    public static array $modelsModified;

    public static array $modelsCharged;

    private static array $fixtures;

    private static array $modulesLoaded = [
        'Trainer' => false,
        'Gym' => false,
        'GymTrainer' => false,
    ];
}

Everything is static! So we need it to be due to how PhpUnit works by default, synchronously, meaning each test will run one after another and the same instance of FixtureLoader will be used. This does not work like that in NodeJS testing tools as Jest.

We store our DatabaseService instance, which if you come from the previous post you know where we use it and why, which models have been modified, charged, the whole fixtures array and which modules have been loaded already.

We understand as module an endpoint in our test, we have Trainers, Gyms and GymTrainers. This will be used in order to know if we need to clean the whole database again.

modelsModified and modelsCharted are positional arrays because just one test suite is going to be running at a time, so we do not need to track several suites. If we wanted to do so, we would need them to be associative arrays, so something like this:

public static array $modelsModified = [
    'Trainer' => [],
    'Gym' => [],
    'GymTrainer' => [],
];

public static array $modelsCharged = [
    'Trainer' => [],
    'Gym' => [],
    'GymTrainer' => [],
];

We tried running them with --process-isolation flag but we did not get any performace at all, hence we stuck to running them sequentially.

public static function preSuite(array $fixtures, string $module)
{
    self::$fixtures = $fixtures;

    $config = Config::getConfig();

    self::$databaseService = new DatabaseService($config['databaseConfig']);

    self::initDB($config['databaseConfig'], $module);

    self::$databaseService->connectToDb();

    self::load($fixtures);
}

This function is called from setUpBeforeClass. As you can see it just saves the fixtures we have sent, set databaseService, inits the database if needed, we will see afterwards how this is done, connects to database and load its fixtures.

Apart from all these things, we are taking a config out from apparently nowhere, right? No worries, this is not magic whatsoever, it will be explained here.

private static function initDB(array $databaseConfig, string $module)
{
    if (!self::$modulesLoaded[$module]) {
        shell_exec("mysql -u {$databaseConfig['username']} -p{$databaseConfig['password']} -h {$databaseConfig['host']} -P {$databaseConfig['port']} -e 'DROP DATABASE IF EXISTS {$databaseConfig['database']}; CREATE DATABASE {$databaseConfig['database']};'");

        shell_exec("mysql -u {$databaseConfig['username']} -p{$databaseConfig['password']} -h {$databaseConfig['host']} -P {$databaseConfig['port']} {$databaseConfig['database']} < " .
                __DIR__ .
                '/dump.sql');
        self::$modelsModified = self::$modelsCharged = [];
        self::$modulesLoaded[$module] = true;
    }
}

This is how initDB wors, if the module has not been charged yet, it reloads completely the database, apart from restarting modelsModified and modelsCharged trackers this last step is really important not to corrupt your fixtures.

And as we want the database to be restarted as less as possible, we have to set the module to loaded, so it wont be restarted again in our tests. For example, if we have multiple different test suites for the same module, the database will be created just once.

Before moving into how do we actually load the fixtures, I think it is better if we have a look at how fixtures look like:

use Modules\Gym\Factories\GymFactory;
use Modules\Trainer\Factories\TrainerFactory;

$gyms = [
    [
        'id' => 1,
        'endpoint' => 'https://forgev.com',
    ],
    [
        'id' => 2,
        'endpoint' => 'https://repeated-endpoint.com',
    ],
];

$trainers = [
    [
        'id' => 1,
        'email' => 'contact@forgdev.com',
        'gym_id' => $gyms[0]['id'],
    ],
    [
        'id' => 2,
        'email' => 'rmontoya@forgdev.com',
        'gym_id' => $gyms[1]['id'],
    ],
    'depends' => [
        GymFactory::TABLE_NAME,
    ],
];

return [
    GymFactory::TABLE_NAME => $gyms,
    TrainerFactory::TABLE_NAME => $trainers,
];

This is what will be inserted in our database. As you may have noticed these is a key in trainers called depends, this is in order to create fixtures in order, so gyms will be created before trainers.

Take your staffs sorcerers, because this is going to be rough:

public static function load(?array $fixtures)
    {
        foreach ($fixtures ?? self::$fixtures as $table => $rows) {
            if (!empty($rows['depends'])) {
                $dependencies = array_flip($rows['depends']);
                $dependencyModels = array_intersect_key($fixtures ?? self::$fixtures, $dependencies);
                $fixturesToLoad = array_diff_key($dependencyModels, self::$modelsCharged);
                self::load($fixturesToLoad);
                // Or, if you prefer an one line solution:
                // self::load(array_diff_key(array_intersect_key($fixtures ?? self::$fixtures, array_flip($rows['depends'])), self::$modelsCharged));
                unset($rows['depends']);
            }

            if (!isset(self::$modelsCharged[$table])) {
                foreach ($rows as $row) {
                    self::$databaseService->getQueryBuilder()->pdo()->query('SET FOREIGN_KEY_CHECKS = 0;');
                    self::$databaseService->getTable($table)->insert($row);
                    self::$databaseService->getQueryBuilder()->pdo()->query('SET FOREIGN_KEY_CHECKS = 1;');
                }
                self::$modelsCharged[$table] = true;
            }
        }

    }

Let's explain how the recursivity works here:

In case we depend on something, meaning we need to load other table before loading the current one, we are going to execute load again with just the models which haven't been charged yet.

depends array looks like this:

[
    GymFactory::TABLE_NAME,
]

First we need the Gym table name to be a key in the array, that's why we call array_flip.

Then, we intersect the keys from what we have in fixtures to what we need to load, so we just take the fixtures needed. Having in mind the example above, we would get from this call:

[
    'id' => 1,
    'endpoint' => 'https://forgev.com',
],
[
    'id' => 2,
    'endpoint' => 'https://repeated-endpoint.com',
],

And then we execute array_diff_key comparing it to the models already charged. That's in case gyms are already loaded, we do not need to load them again, besides if we try we will get an error due to duplicated keys. And then we are done with that tables dependencies, so we unset them.

Afterwards it comes the real loading, which is much simpler, we disable FOREIGN_KEY_CHECKS first just to avoid some warnings, though this is not completely necessary, it depends on your schema, we needed it. And then just insert the row into the table. And of course saving the table as charged not to be loaded again.

Realoding fixtures:

    public static function reload()
    {
        if (empty(self::$modelsModified)) {
            return;
        }

        $truncates = '';

        foreach (array_keys(self::$modelsModified) as $table) {
            $truncates .= "TRUNCATE TABLE {$table}; ";
            unset(self::$modelsCharged[$table]);
        }

        self::$databaseService->getQueryBuilder()->pdo()->query("SET FOREIGN_KEY_CHECKS = 0; {$truncates} SET FOREIGN_KEY_CHECKS = 1;")->closeCursor();

        self::$databaseService->getQueryBuilder()->removeEvent('after-insert');
    }

We always call this method from tearDown, so after finishing a test.

So, imagine we added a trainer in our first test, and we do not want that new trainer to be in the second test, what's why we need to reload Trainer table.

In the next method we explain how we keep track of modified tables.

We truncate all tables in one query and we set that table as not charged, since we are about to clean it.

Since such query: SET FOREIGN_KEY_CHECKS = 0; {$truncates} SET FOREIGN_KEY_CHECKS = 1; does not make any result, we need to close PDO's cursor, so we can continue making queries. And then remove the event for after insert queries, because setUp method is going to be executed and it is going to load some tables, and we do not need to keep track of those.

public static function postLoad()
    {
        self::$databaseService->getQueryBuilder()->registerEvent('after-insert',
            ':any',
            fn(QueryBuilderHandler $queryBuilder) => self::$modelsModified[$queryBuilder->getStatements()['tables'][0]] = true);

        self::$databaseService->getQueryBuilder()->registerEvent('after-update',
            ':any',
            fn(QueryBuilderHandler $queryBuilder) => self::$modelsModified[$queryBuilder->getStatements()['tables'][0]] = true);

        self::$databaseService->getQueryBuilder()->registerEvent('after-delete',
            ':any',
            fn(QueryBuilderHandler $queryBuilder) => self::$modelsModified[$queryBuilder->getStatements()['tables'][0]] = true);
    }

Pixie has some query events, so we use these in order to keep track of queries. All we need are inserts, updates and deletes, so we just save it in our modelsModified array for reloading them afterwards.

Your ORM or query builder for sure has something like that, so it should not be a big deal to implement this using your tool.

Tracking raw queries

Does not really matter how much we want to avoid writing raw SQL in our code, we will end up doing it either due to the ORM does not support your query or you do not find a proper way to do what you need. In case of Pixie, in the very last notes of the docs:
Of course Query Events don't work with raw queries.

Come on!! Give me a break, then, cannot we keep track of raw queries? Of course we can, we just have to find a way.

Remember the DatabaseService we used in our FixtureLoader? Well, that's one we just use for our e2e tests:


class DatabaseService extends \Shared\Database\DatabaseService
{

    public function connectToDb()
    {
        $connection = new PixieConnection($this->databaseConfig['driver'], [
            'host' => $this->databaseConfig['host'],
            'database' => $this->databaseConfig['database'],
            'username' => $this->databaseConfig['username'],
            'password' => $this->databaseConfig['password'],
            'port' => $this->databaseConfig['port'],
            'options' => [
                PDO::ATTR_TIMEOUT => 1,
            ],
        ]);

        $this->queryBuilder = new QueryBuilderHandler($connection, PDO::FETCH_ASSOC);
    }
}

And this QueryBuilderHandler is the real deal here, we have monkey patched query method (this is what we use to run raw queries)


class QueryBuilderHandler extends \Pixie\QueryBuilder\QueryBuilderHandler
{

    public function query($sql, $bindings = [])
    {
        ['action' => $action, 'table' => $table] = $this->getSqlInfo($sql);

        if ($action == 'UPDATE') {
            $this->statements = [
                'tables' => [
                    $table,
                ],
            ];
            $this->connection->getEventHandler()->fireEvents($this, 'after-update');
        }

        if ($action == 'INSERT' || $action == 'INSERT IGNORE') {
            $this->statements = [
                'tables' => [
                    $table,
                ],
            ];

            $this->connection->getEventHandler()->fireEvents($this, 'after-insert');
        }

        return parent::query($sql, $bindings);
    }

    private function getSqlInfo(string $sql)
    {
        $update = '/^(?<action>.*) (?<table>.*) SET/';
        $insert = '/^(?<action>.*)( IGNORE)? INTO (?<table>.*) \(/';

        preg_match($update, $sql, $matchesUpdate);
        preg_match($insert, $sql, $matchesInsert);

        return [
            'table' => $matchesUpdate['table'] ?? $matchesInsert['table'] ?? null,
            'action' => $matchesUpdate['action'] ?? $matchesInsert['action'] ?? null,
        ];
    }

}

So actually when we execute a raw query we can read it and know either if it's an update or insert. Since we do not use raw deletes in our code we do not need to keep track of them here, but if you do you just need to add the delete regex.

Hence we just need to behave the same way query does, and by that I mean fire the events we need:

$this->connection->getEventHandler()->fireEvents($this, 'after-update');

// OR

$this->connection->getEventHandler()->fireEvents($this, 'after-insert');

So any of these will get executed:

self::$databaseService->getQueryBuilder()->registerEvent('after-insert',
            ':any',
            fn(QueryBuilderHandler $queryBuilder) => self::$modelsModified[$queryBuilder->getStatements()['tables'][0]] = true);

self::$databaseService->getQueryBuilder()->registerEvent('after-update',
            ':any',
            fn(QueryBuilderHandler $queryBuilder) => self::$modelsModified[$queryBuilder->getStatements()['tables'][0]] = true);

And we can finally reload those as well! Just like if we did not use raw queries. By doing $queryBuilder->getStatements() we get information about the queries, and one of them is an array called tables, this way we can keep track of which tables have been modified in any way.

Discussion

pic
Editor guide