Let's say you have a blog with posts (or an ad board with listings, or travelers with trips... you get the idea) and you want to retrieve all the posts between two dates. Sounds familiar ? So how exactly do we do that ?
The candid approach
It really seems like a trivial question with an easy (but completely wrong) answer : Just use BETWEEN
(or in Laravel whereBetween
) like so :
$startDate = '2021-06-01';
$endDate = '2021-06-30';
$posts = Post::whereBetween('created_at', [$startDate, $endDate])->get();
We've all done that (at least I know I have) to retrieve all posts created in June. The issue here is that our created_at
column is usually a Datetime, so it's not a simple date but it also has a time. Which means that in practice any post created on the 30th won't be retrieved because their creation date will always be greater than 2021-06-30 (which SQL will assume means '2021-06-30 00:00:00').
Or maybe we're using Carbon and have something like that :
$startDate = Carbon::createFromFormat('Y-m-d', '2021-06-01');
$endDate = Carbon::createFromFormat('Y-m-d', '2021-06-30');
$posts = Post::whereBetween('created_at', [$startDate, $endDate])->get();
That's actually even worse because it becomes totally unpredictable. A Carbon instance represents an instant, and it has a time too, except if you don't specify a time it will default to the current time when the script runs. So if you run this script at 9AM and the post was created at 8AM on the 30th, you'll retrieve it... But run the exact same script at 7AM, and you won't retrieve that post anymore because $endDate will actually be '2021-06-30 07:00:00'.
We could use $endDate->toDateString() to get rid of the time, but we'd end up in the situation above.
A better way with carbon
One solution would be to make sure that we specify a time in our query, and that this time is at the very start of the day for our start date (00:00:00) and at the very end of the day for our end date (23:59:59.999999).
Fortunately, Carbon provides the startOfDay()
and endOfDay()
methods that do just that :
$startDate = Carbon::createFromFormat('Y-m-d', '2021-06-01')->startOfDay();
$endDate = Carbon::createFromFormat('Y-m-d', '2021-06-30')->endOfDay();
$posts = Post::whereBetween('created_at', [$startDate, $endDate])->get();
Now that's much better, we can be pretty sure that everything created on the 1st or the 30th will be retrieved no matter what time they were created at or what time it is.
It's a solid solution and you can definitely use it, but adding a time when really we only care about the date feels a tiny bit like a hack to me, so let's see another solution
Another way with MySQL
We could also explicitly tell MySQL that we only care about the date by using DATE()
. The query we want is this :
SELECT * FROM posts
WHERE DATE(created_at) BETWEEN '2021-06-01' AND '2021-06-30'
That way we'll compare dates with dates, and not with a Datetime. We'll need to resort to DB:raw() to replicate this with Eloquent, which would look like this :
$startDate = '2021-06-01';
$endDate = '2021-06-30';
Post::whereBetween(DB::raw('DATE(created_at)'), [$startDate, $endDate])->get();
Ideally we should make sure that $startDate
and $endDate
are properly formatted as dates, but it seems to work even if we pass a full Carbon object (which is automatically converted to a string) as MySQL will ignore the time portion.
So that's another way to do it, but I'm not a fan of using DB::raw() either, plus it could be slower (more about that in the last paragraph). So let's see a final solution that leverages Eloquent to handle that.
Yet another way with Eloquent
Eloquent provides a very helpful whereDate()
method that will do two things
- Build an SQL query that uses the
DATE()
SQL function to format the content of the column as Y-m-d. - Properly cast a Carbon or Datetime object to the Y-m-d format before comparing it.
Using this, we can confidently pass Carbon instances and know that any time that happens to be a part of it will be discarded and we'll actually be searching between two dates :
$startDate = Carbon::createFromFormat('Y-m-d', '2021-06-01');
$endDate = Carbon::createFromFormat('Y-m-d', '2021-06-30');
$posts = Post::query()
->whereDate('created_at', '>=', $startDate)
->whereDate('created_at', '<=', $endDate)
->get();
This will generate this SQL query :
SELECT * from "posts"
WHERE DATE("created_at") >= '2021-06-01'
AND DATE("created_at") <= '2021-06-30';
And it works flawlessly. The only downside is that we can't use between so it's a little bit longer to write, but if we're going to use it in several places we could easily write a scope (and maybe even make it generic so that it could be imported as a Trait in every model that needs it ?), something like that :
public function scopeCreatedBetweenDates($query, array $dates)
{
return $query->whereDate('created_at', '>=', $dates[0])
->whereDate('created_at', '<=', $dates[1])
}
And use it instead :
$startDate = Carbon::createFromFormat('Y-m-d', '2021-06-01');
$endDate = Carbon::createFromFormat('Y-m-d', '2021-06-30');
$posts = Post::createdBetweenDates([$startDate, $endDate])->get();
Now that looks pretty good to me ! Unfortunately it might not be the fastest solution...
What about performance ?
Note : This was not part of the original article, and someone on Linkedin asked me "what about performance"... Which opened a whole new can of worms I hadn't really anticipated (which Oliver also pointed out in the comments).
All of the approaches above have about the same performance if you don't have any index on the created_at column, which is probably fine if you need to filter a few dozens or hundreds of models by date, but if you start working with a lot more than that you'll need to create an index and... Things get messy.
If you create a simple index on the created_at
column, the Carbon approach that uses ->startOfDay()
, ->endOfDay()
and ->between()
will be a lot faster because it can leverage the index.
Unfortunately anything that involves the DATE()
function, either by using DB::raw()
or Laravel's WhereDate()
function won't be able to use that index because it contains the timestamp.
The theoretical solution to this is to create a functional index on DATE(created_at)
in MySQL. There are two caveats though :
- Functional indexes are only available on MySQL 8.0.13 and above
- For some reason that I can't figure out, the functional index is not used when using prepared statements. So in order for it to work you'd have to also use PDO's emulated prepared statements which is not the recommended way.
So all in all, my final recommendation is to stick with Carbon to get the start and end of day. And we can still use a scope for that to make it easier to use (we'll also make it accept either a Carbon instance or a string) :
public function scopeCreatedBetweenDates($query, array $dates)
{
$start = ($dates[0] instanceof Carbon) ? $dates[0] : Carbon::parse($dates[0]);
$end = ($dates[1] instanceof Carbon) ? $dates[1] : Carbon::parse($dates[1]);
return $query->whereBetween('created_at', [
$start->startOfDay(),
$end->endOfDay()
]);
}
And we can use it like so :
$posts = Post::createdBetweenDates(['2021-06-01', '2021-06-30'])->get();
Conclusion
Dealing with time is hard. I find that when dealing with dates and time, even if it looks simple, it pays to take an extra minute to wonder if it really is that simple and if you're not missing something (and we haven't even touched time zones, Daylight Saving Time, and leap seconds...)
Anything I missed ? Questions ? Horror stories to share about dates ? hit the comments !
Top comments (3)
Sorry, but this is a very bad advise! I suppose you never worked with larger amount of data.
If you use WHERE DATE(created_at) in your query, the database can not use the index for created_at (which hopefully is there), because you are applying a function to the column.
Your approach will work for tables with 100 or 1000 rows because every data is in memory anyway, but if there are like 100 million rows, the performance of the query will be very poor and the load for the database server will be very high!
As a software developer it is important to know how database systems are working, especially what indexes are and how they can make queries faster.
Hi !
Thanks I was actually of investigating this after someone asked me about performance and that's completely true. In my tests I forgot to add an index so the there was no difference.
The thing is, if you know you'll only fetch between two date and not two times you could add a functional index on
DATE(created_at)
and I would have exepected it to be a little bit faster as it will be smaller and have a lower cardinality.So far I have found two things :
PDO::ATTR_EMULATE_PREPARES
to emulate prepared statements, but as soon as I use native prepared statements I'm back to the same performance as without the index.The next step (but I ran out of time for now) is to run the prepared statements in the SQL console to determine if it's linked to something Laravel does or not, and if not check what's happening with MySQL. I didn't find anything indicating that functional indexes could not be used with prepared statements so it might be a bug ?
Thanks! I use Laravel Purity for filtering and implemented this solution in a trait so it could be used by several model classes and created a custom filter so it could be used as an additional filter :)
Some comments have been hidden by the post's author - find out more