Careful with table partitioning in SQL

#programming #database #picture

Original post at: https://siderite.dev/blog/careful-with-table-partitioning-in-sql

I had this situation where I was trying to optimize a query. And after some investigation I've stumbled upon something strange: querying on the primary key was generating a lot of reads. I was joining my table with a temporary table of 10 ids and there were 630 reads! How come?

At first I thought it was because the way indexes work. The primary key was comprised of RowId and RowDate and, even if I knew theoretically searching by RowId should use the primary key, the evidence was against me: when querying by RowId and RowDate I would get the expected 10 reads.

I created two queries, one with and one without RowDate. I then compared their execution plans. They were identical! Only one took a lot longer, specifically in the Index Seek (which used correctly the primary key). When I looked at the properties for that plan element, I saw something strange:

Actual Partitions Accessed 1..63!

I then realized that the table was partitioned on the RowDate column. In this case, RowDate takes precedence to any indexed column! You might think of partitioning a table like forcefully adding the partition columns to every index in the table, including the primary key. In fact, a partitioned table acts like a number of separate tables with the same definition (columns, indexes, etc.), just different data. The indexes work on each separate partition. When you partition a table, you also partition its indexes.

In truth, I would have expected the query execution plan to show the partition split as a separate step. I understand it's hard to conceptualize it without creating as many execution paths as there are partitions, but still, there should be an indication in the shape of the plan that makes it clear you are querying on multiple partitions.

Once RowDate was used, the SQL engine would choose the one partition of my row, then use the primary key index to seek it. Instead of 63*10 reads, just 10 reads, the number of the rows in the id table.

So be careful when you use table partitioning to ALWAYS use the partition columns in the queries for the table, else you will get as many parallel searches as there are partitions, regardless of the indexes you created, as they are also partitioned.

Hope that helps!

DEV Community

Careful with table partitioning in SQL

Top comments (0)

Read next

Decoding Numerical Representation: Floating-Point vs. Fixed-Point Arithmetic in Computing

Learning Cyber Security Day 4 - Integrating Encryption in Your Code🔐

React JS Fragments

Build a spreadsheet app with an AI-copilot (Next.js, gpt4, LangChain, & CopilotKit)