Data technologies have gone through great advancements over the past decades enabling businesses to easily own and operate databases on the cloud and scale up their resources in a few clicks. Faded by technological improvement, people sometimes neglect the basic but essential techniques that can make their database fast and reliable. In this article, we will learn the top five query-tuning techniques for Microsoft SQL Server.
Detect slow queries
To tune slow queries, you first need to find them. You will need to examine them one by one and prioritize tuning. Before selecting slow queries, prepare a speed threshold to only include tuning candidates. To query slow queries that are slower than your threshold, check the query below.
SELECT
req.session_id
, req.total_elapsed_time AS duration_ms
, req.cpu_time AS cpu_time_ms
, req.total_elapsed_time - req.cpu_time AS wait_time
, req.logical_reads
, SUBSTRING (REPLACE (REPLACE (SUBSTRING (ST.text, (req.statement_start_offset/2) + 1,
((CASE statement_end_offset
WHEN -1
THEN DATALENGTH(ST.text)
ELSE req.statement_end_offset
END - req.statement_start_offset)/2) + 1) , CHAR(10), ' '), CHAR(13), ' '),
1, 512) AS statement_text
FROM sys.dm_exec_requests AS req
CROSS APPLY sys.dm_exec_sql_text(req.sql_handle) AS ST
WHERE total_elapsed_time > {YOUR_THRESHOLD}
ORDER BY total_elapsed_time DESC;
This query gives you a list of elapsed time of each query. The slowest query will appear at the top. You can add your threshold figure in the where clause to, for example, select the top five slowest queries.
Start with the basics
Once you find the queries to be tuned, check if the queries are following the basic rules for performance.
- Use the where condition to limit scanning scope
- Don’t select everything
- Use inner join instead of correlated subqueries
- Try to avoid HAVING and use WHERE
- Use inner join instead of two tables in the where clause
Use the where condition to limit scanning scope
The goal of running a query is to get the information you need. When you run a query without a condition, the database needs to scan the whole area of a table, which leads to a slower query response. If possible, use the where clause to precisely aim at the data you need.
Don't select everything
People often use the star symbol (*) for convenience. However, if a table consists of many columns and holds a large number of records, selecting all of the columns and rows will consume more resources. Instead of using select all, specify the column names that you want.
Use inner join instead of correlated subqueries
If you use a correlated subquery (or a repeating subquery), the subquery you use gets executed repeatedly. The sample below shows you what a correlated subquery looks like.
SELECT column1, column2, ....
FROM table1 outer
WHERE column1 operator
(SELECT column1, column2
FROM table2
WHERE expr1 = outer.expr2);**
The subquery after the operator is run repeatedly until it satisfies the WHERE
condition. Instead of using the correlated subquery, consider using the inner join.
Try to avoid the HAVINGclause and use the WHEREstatement
When you run an aggregated query using GROUP BY
, you can add a condition using HAVING
. It is recommended to use HAVING
only on an aggregated field and not to use it when you can replace it with a where condition. A HAVING
clause is calculated after a where clause, so it is recommended to limit data scanning prior to the HAVING
clause.
Use inner join instead of two tables in the where clause
You can put two tables in the where clause to use like JOIN
. Although it is grammatically accepted, it creates inefficient query execution. Cartesian Join or CROSS JOIN
refers to the SQL computation that requires a combination of all possible variables. When you use two tables in the where clause, Cartesian Join kicks in, which consumes a lot of resources. Instead, try to use INNER JOIN
.
Use EXPLAIN to find pain points
You can use the EXPLAIN
command to diagnose your queries. The command explains your query and shows you how a query will be executed step-by-step. You can use the result to find inefficient steps. The syntax of EXPLAIN
is simple.
EXPLAIN
{YOUR_QUERY}
Put the command, EXPLAIN
, above your query and then execute it. In the {YOUR_QUERY} section, you can put not only a SELECT
query but also an UPDATE
, INSERT
, or DELETE
statement.
When you execute an SQL in DbVisualizer, you can see its explain plan at the bottom section as the image above. When you select the Graph View option highlighted in a red box, you can see graph view.
In this graph view, each node in the visual presentation contains more detailed information such as different types of costs, arguments, and more.
Index your tables
Indexing your tables can speed up your query performance. It is like how the index at the end of a book works. By referring to the index, you can go to the page and find the information you need. To efficiently index your tables, there are several points you need to consider.
- Prioritize tables by frequency and importance: before setting indexes, you need a plan. It is a good practice to prioritize your queries by frequency and importance and then start examining the tables to decide their indexes. For example, if you have some queries that are scheduled to be run every hour and the result of those queries is used for generating invoices for customers (which is directly linked to your revenue), they can become top candidates.
- Choose columns that are often used in the where clause or join keys: when you index columns, you put those columns in an SQL index table so that when those columns are searched, the database can quickly retrieve the records you are looking for. Check which columns you often use in where or join conditions. Adding them to the index table can speed up a search or a join query.
- Consider column data types: after researching the columns that are frequently used in join and where conditions, check their data type. The most suitable data type for the index is the integer type and the worst candidate is a string type. Also, a column that always has a unique value and a NOT NULL constraint can be a good index candidate.
Use visualization tools like DbVisualizer
When you perform query tuning on your own, it can be pretty challenging and time-consuming. DbVisualizer is a universal tool that can meet all your database needs from running queries to database management and query tuning. It beautifully displays database system data for users to interpret information more easily. Its optimization feature can help you to achieve your tuning goals and make your database operate efficiently.
The ERD generation feature visually displays the relations of your tables. Using the auto-generated diagram, you can efficiently document your table designs and share your ideas with team members.
Conclusion
In this article, we learned the top five query tuning techniques for Microsoft SQL Server. By optimizing your queries, you can more reliably and efficiently serve data requests that come from your online products, dashboards, ad-hoc queries, and other sources. Also, you can reduce costs for upgrading database resources. DbVisualizer can help you to achieve these objectives by providing a single point for users to perform various tasks of the database. Find out more features of DbVisualizer.
About the author
Igor Bobriakov is an entrepreneur and educator with a background in marketing and data science. His passion is to help small businesses and startups to utilize modern technology for growth and success. Igor is Data Scientist and ML Engineer. He founded Data Science School in 2016 and wrote numerous award-winning articles covering advanced tech and marketing topics. You can connect with him on LinkedIn or Twitter.
Top comments (0)