Ching Cheng Kang

How I use chunking to improve import speed

Intro

Yesterday, I talked about chunking records before inserting them into the database in Improve Speed Of Importing Data from Third Party with HTTP Request. Today, I want to show how it works and what the difference in speed is.

TL;DR

Always chunk your records before inserting them into the database; it gives roughly a 3x speedup.

Import Console Command

In this console command, we iterate over each model, fetch the latest collection from GitHub, and inject it.

public function handle(): int
{
    //...
    foreach ($this->models as $model) {
        $record = $service->factory($model);
        $injectStation->inject($record, $model);
    }
    //...
}

Inject Station

In the inject station, we chunk the records into batches of 500 and insert each batch into the database.

public function inject(?Collection $records, string $modelName): void
{
    // Guard against the nullable parameter before dereferencing it.
    if ($records === null || $records->isEmpty()) {
        return;
    }

    $tableName = (new $modelName)->getTable();

    // Start from a clean table before re-importing.
    DB::table($tableName)->truncate();

    // Insert in batches of 500 rows: one query per batch.
    foreach ($records->chunk(500) as $chunk) {
        DB::table($tableName)->insert($chunk->toArray());
    }
}

The import takes around 8 seconds, a nice speed for 70k records.

[Screenshot: import speed using chunks]
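Conceptually, chunking turns 70k single-row INSERTs into about 140 multi-row INSERTs, so the database does far fewer round trips. A minimal stand-alone sketch (plain PHP, using array_chunk as a stand-in for Laravel's Collection::chunk; the record shape here is made up for illustration):

```php
<?php
// Fake dataset roughly the size of the import (the fields are hypothetical).
$records = array_fill(0, 70000, ['name' => 'station', 'lat' => 0.0, 'lng' => 0.0]);

// array_chunk() splits the list into batches of 500, like Collection::chunk(500).
$batches = array_chunk($records, 500);

// 70,000 records / 500 per batch = 140 INSERT statements instead of 70,000.
echo count($batches); // 140
```

Each batch then becomes one multi-row INSERT, which is where the speedup comes from.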

What if ...

So, for the sake of science ... we remove the chunking and insert the records one by one ... (notice how few lines of code have changed).

public function inject(?Collection $records, string $modelName): void
{
    // Guard against the nullable parameter before dereferencing it.
    if ($records === null || $records->isEmpty()) {
        return;
    }

    $tableName = (new $modelName)->getTable();

    DB::table($tableName)->truncate();

    // One INSERT query per record: one database round trip each.
    foreach ($records as $record) {
        DB::table($tableName)->insert($record);
    }
}

The import speed dropped to ~28 seconds, about 3x slower than the chunked version!

[Screenshot: import speed without chunks]

Moral of the Story

If you want to insert a large amount of data into the database, remember to chunk it!
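If you want to reproduce the numbers above, a small timing wrapper is enough. This is a hypothetical helper (the name timeIt is mine, not from the post) that you can wrap around either version of inject():

```php
<?php
// Hypothetical timing helper: runs the callable and returns elapsed seconds.
function timeIt(callable $import): float
{
    $start = microtime(true);
    $import();

    return microtime(true) - $start;
}

// Usage (assuming $injectStation, $records, and $model from the command above):
// $seconds = timeIt(fn () => $injectStation->inject($records, $model));
// printf("Imported in %.1f seconds\n", $seconds);
```

Running it once against the chunked version and once against the row-by-row version gives the two timings compared in this post.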
