DEV Community

Akash Kava
Akash Kava

Posted on

Improving SQL Query by Adding conditions in Joins

Recently I was trying to optimize query, this was the query,

SELECT ... 
FROM A
INNER JOIN B
   ON B.ID = A.ID
WHERE B.Status = 'Done' 
   AND B.DateCreated BETWEEN @Start and @End
   AND NOT EXISTS (...)
   AND A. ... other conditions
Enter fullscreen mode Exit fullscreen mode

Table B is huge table, and query took couple of seconds. Even though there was an index of (ID, Status, DateCreated).

When I changed query to,

SELECT ... 
FROM A
INNER JOIN B
   ON B.ID = A.ID AND B.Status = 'Done'
   AND B.DateCreated BETWEEN @Start and @End
   AND NOT EXISTS (...)
WHERE A. ... other conditions
Enter fullscreen mode Exit fullscreen mode

Surprisingly query only took few milliseconds. Upon further investigation I found that moving NOT EXISTS inside JOIN improved the speed.

Discussion (10)

Collapse
buinauskas profile image
Evaldas Buinauskas

You were just lucky. Query optimizer has chosen a non cached version of plan to execute this query.

Collapse
akashkava profile image
Akash Kava Author

I don't think that was the case, Execution plans for both queries are different, Also these queries were executing frequently and I saw the difference even executing them in parallel (half queries in old way and half in new way simultaneously).

Collapse
buinauskas profile image
Evaldas Buinauskas

Plans have obviously have to be different.

I see this is SQL Server and it will cache plan for an exact query you run. By moving date time clause to join condition you force a new plan generation.

You should be able to get same results by forcing plan recompilation

Thread Thread
akashkava profile image
Akash Kava Author

I looked at my query again and I found out that I had an extra clause NOT EXISTS which was causing difference in speed and plans, omitting not EXISTS didn't make difference in speed in both cases.

Collapse
sqlknowitall profile image
Jared Karney

That's not luck, it would absolutely produce a new plan because the text changed. Even adding a space or making a letter upper case will generate a new plan.

Collapse
buinauskas profile image
Evaldas Buinauskas • Edited on

Yes, this is correct.

I should've been more clear that moving clause from where to join does not add any performance on its own, new forcefully generated plan did. Thanks for bringing this up!

Collapse
caderoux profile image
Cade Roux

In any decent optimizer, INNER JOIN and WHERE conditions are basically interchangeable.

Also, I'm suspicious about any index that starts on ID because they are highly selective from the start.

Would like to see the execution plans of the two different queries.

Collapse
sqlknowitall profile image
Jared Karney • Edited on

You also have to look at the possible 'reason for early termination' in the plan. If the plan is not able to be fully optimized and has a reason for early termination, it may have executed serially and in these cases you may get a better plan with the filter in the join. However, your problem lies in the fact that the optimizer isn't able to complete. Potential causes there are complex queries or auto updating statistics, to name a couple.

Collapse
rhymes profile image
rhymes

Can you share the execution plans?

Collapse
alexandersuprun profile image
alexander-suprun

That's just a very useless observation and doesn't prove anything.
Have you figured out why exactly did this happen? What's in NOT EXISTS condition? What are the execution plans?