
ORDER BY x LIMIT y Gotcha

Introduction

In a production environment, I watched a query I had just attempted to optimize take longer and longer as the days went by. I would usually look at WHERE clauses to figure out which indexes to create or which queries to restructure, but I often overlooked ORDER BY clauses. It turns out both need equal attention.

Reproduction

I will be using Laravel for this because barryvdh/laravel-debugbar makes EXPLAIN output easy to inspect, and FakerPHP will help us seed the database with lots of records.

  1. Create a Laravel Project
   $ composer create-project laravel/laravel order-by-gotcha
  2. Update .env and set DB_DATABASE, DB_USERNAME and DB_PASSWORD, as sketched below. I will be using MySQL. You might need to create the database first.
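   A minimal sketch of the relevant .env entries, assuming a local MySQL server and a database named order_by_gotcha; swap in your own credentials:

    DB_CONNECTION=mysql
    DB_HOST=127.0.0.1
    DB_PORT=3306
    # Assumed values, replace with your own database name and credentials
    DB_DATABASE=order_by_gotcha
    DB_USERNAME=root
    DB_PASSWORD=secret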
  3. Create a Client model together with a corresponding migration and controller.
   $ php artisan make:model Client -mc
  4. Update the migration. Only the email column is indexed, for demonstration purposes later on.
   // ...
   $table->string('name');
   $table->string('email')->index();
   $table->string('phone');
   $table->timestamps();
   // ...
  5. Update the Client model to make the columns fillable.
   // ...
   protected $fillable = [
       'name',
       'email',
       'phone'
   ];
   // ...
  6. Update the DatabaseSeeder to seed 1,000,000 clients. This might take a while, about 6 minutes on my machine (SSD, 8 cores):
    public function run(): void
    {
        // Requires: use Illuminate\Support\Facades\DB; at the top of the seeder
        $records = [];
        $start_time = microtime(true);
        $iter_start_time = microtime(true);
        for ($i = 0; $i < 1_000_000; $i++) {
            // For logging, just to see stuff works
            if ($i > 0 && ($i % 10_000) == 0) {
                $end_time = microtime(true);
                $cumulative = $end_time - $start_time;
                $iter = $end_time - $iter_start_time;

                $this->command->info("$i - Cumulative: $cumulative, Iter: $iter");
                $iter_start_time = microtime(true);
            }

            $records[] = [
                'name' => fake()->name(),
                'email' => fake()->email(),
                'phone' => fake()->phoneNumber(),
                'created_at' => fake()->dateTimeBetween('-10 Years', 'now'),
                'updated_at' => fake()->dateTimeBetween('-10 Years', 'now'),
            ];

            // Bulk insert every 1,000 records
            if (count($records) == 1_000) {
                DB::table('clients')->insert($records);
                $records = [];
            }
        }

        // Flush whatever is left so all 1,000,000 rows make it in
        if (!empty($records)) {
            DB::table('clients')->insert($records);
        }
    }
  7. Now on to the ClientController class, with the following methods:
    public function index()
    {
        // latest() with no argument orders by created_at descending
        $client = Client::query()->latest()
            ->first();

        return view('welcome', compact('client'));
    }

    public function index_with_where_clause()
    {
        $filter = 'ca';
        $client = Client::query()
            ->where('email', 'like', "%{$filter}%")
            ->latest()
            ->first();

        return view('welcome', compact('client'));
    }

    public function index_order_by_email()
    {
        // Orders by the indexed email column, descending
        $client = Client::query()
            ->latest('email')
            ->first();

        return view('welcome', compact('client'));
    }

    public function index_order_by_id()
    {
        // Orders by the primary key, descending
        $client = Client::query()
            ->latest('id')
            ->first();

        return view('welcome', compact('client'));
    }
  8. Clean up welcome.blade.php so it just shows the client name:
   <!DOCTYPE html>
   <html>
   <head>
       <title>Laravel</title>
   </head>
   <body>
       {{ $client->name }}
   </body>
   </html>
  9. Update routes/web.php to add the routes:
    use App\Http\Controllers\ClientController;

    Route::get('/', [ClientController::class, 'index']);
    Route::get('/with_where', [ClientController::class, 'index_with_where_clause']);
    Route::get('/order_by_email', [ClientController::class, 'index_order_by_email']);
    Route::get('/order_by_id', [ClientController::class, 'index_order_by_id']);
  10. Add laravel-debugbar and publish the config:

    $ composer require barryvdh/laravel-debugbar --dev
    $ php artisan vendor:publish --provider="Barryvdh\Debugbar\ServiceProvider"
    
  11. Enable EXPLAIN output by editing config/debugbar.php. Scroll until you find the 'explain' setting, then update it as follows:

    'explain' => [                 // Show EXPLAIN output on queries
        'enabled' => true,
        'types' => ['SELECT'],     // Deprecated setting, is always only SELECT
    ],
    
  12. Run migrations and seed the database:

    $ php artisan migrate --seed
    
  13. Start the server:

    $ php artisan serve
    
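Before profiling, it's worth sanity-checking that the seed data is in place. One quick way (not part of the original write-up) is counting the rows from tinker:

    $ php artisan tinker
    > App\Models\Client::count()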

Exploring Performance

With everything running, open each route and check the debugbar's queries tab. Visiting / (the index method) produces:

  explain select * from `clients` order by `created_at` desc limit 1
  | table   | type | possible_keys | key | key_len | ref | rows   | Extra          |
  | ------- | ---- | ------------- | --- | ------- | --- | ------ | -------------- |
  | clients | ALL  |               |     |         |     | 999001 | Using filesort |

MySQL estimates it has to examine 999,001 rows, sorting them all with a filesort just to return a single record, and the query takes 904ms. Not good. In production, a comparable query was taking 1.7 seconds.
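If you are on MySQL 8.0.18 or newer, EXPLAIN ANALYZE (not used in the original debugbar setup) goes a step further and reports actual execution timings alongside the plan:

  explain analyze select * from `clients` order by `created_at` desc limit 1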

Next, /with_where:

  explain select * from `clients` where `email` like '%ca%' order by `created_at` desc limit 1
  | table   | type | possible_keys | key | key_len | ref | rows   | Extra                       |
  | ------- | ---- | ------------- | --- | ------- | --- | ------ | --------------------------- |
  | clients | ALL  |               |     |         |     | 999001 | Using where; Using filesort |

Still far too many rows examined, with the operation taking 767ms. Note that the leading wildcard in LIKE '%ca%' prevents the email index from helping with the filter, too.

Now /order_by_email, which sorts on the indexed column:

  explain select * from `clients` order by `email` desc limit 1
  | table   | type  | possible_keys | key                 | key_len | ref | rows | Extra |
  | ------- | ----- | ------------- | ------------------- | ------- | --- | ---- | ----- |
  | clients | index |               | clients_email_index | 1022    |     | 1    |       |

Only one row is examined and clients_email_index is used, with the select taking 2.46ms. Just like that we get a significant bump in performance; in production the query went from 1.7 seconds to ~70ms. From here you can either add an index to the created_at column or lean on the primary key, which is indexed by default. The email index in this example was only there to show that indexes help ORDER BY statements.
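If you want to keep ordering by created_at, the index is a one-liner in a fresh migration. A minimal sketch, assuming a migration generated with php artisan make:migration (the migration itself is illustrative, not from the original post):

    // in the new migration's up() method
    Schema::table('clients', function (Blueprint $table) {
        $table->index('created_at'); // default index name: clients_created_at_index
    });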

Finally, /order_by_id, which orders by the primary key:

  explain select * from `clients` order by `id` desc limit 1
  | table   | type  | possible_keys | key     | key_len | ref | rows | Extra |
  | ------- | ----- | ------------- | ------- | ------- | --- | ---- | ----- |
  | clients | index |               | PRIMARY | 8       |     | 1    |       |

This one takes 1.74ms, examines a single row via the primary key, and never drags close to a million records through a sort. Looks alright.

TL;DR

Index the columns you ORDER BY. Sort clauses deserve the same attention as WHERE clauses when you are hunting for missing indexes.
