DEV Community

Soma
Soma

Posted on

How SQL Query works? SQL Query Execution Order for Tech Interview

Disclosure: This post includes affiliate links; I may receive compensation if you purchase products or services from the different links provided in this article.

SQL query execution order

Hello guys, one of the common question on technical interviews about SQL is how exactly SQL query work? While this may seems simple, many programmers including experienced one fail to answer this with confidence.

Many developer don't even know how the SQL commands are executed and in which order?

For them the SQL query is executed as they are written but that's not true, you can see from the above diagram that FROM and JOIN is executed before you can SELECT anything, which is again very rational if you think through.

Earlier, I have shared 20 SQL queries from interviews and 50 System design questions and in this article, I am going to answer how exactly SQL query works under the hood, so stay tuned and continue reading.

And, if are preparing for tech interviews and you need more questions not just queries but also database and SQL related questions from other topics like indexes, joins, group by, aggregation, and window functions then you can also checkout these 200+ SQL Interview Questions .

This course is one of the specially designed course to prepare you for SQL interviews by answering popular questions. You can also get this for big discount now.

How exactly SQL Query is executed?

Structured Query Language or SQL is the standard language for managing and manipulating relational databases.

It provides a powerful and efficient way to interact with data, enabling developers, analysts, and data scientists to retrieve, insert, update, and delete information from databases.

While SQL queries are written in a declarative, human-readable format, there is a complex process that occurs behind the scenes to execute these queries and retrieve the desired results.

In this article, we'll delve into the inner workings of SQL queries, breaking down the process step by step.

1. Query Parsing and Tokenization

The journey of an SQL query begins with parsing and tokenization. When a user submits an SQL query, the database management system (DBMS) must first break down the query into individual tokens.

Tokens are the smallest units of the query and can include keywords (SELECT, FROM, WHERE, etc.), table and column names, operators (=, >, <, etc.), and values.

This process involves identifying the syntax and structure of the query to ensure it follows the rules of the SQL language.

how SQL query are executed


2. Query Optimization

Once the query is parsed and tokenized, the DBMS performs query optimization. This is a crucial step that aims to improve the efficiency of query execution.

The DBMS analyzes the query and explores various execution plans to determine the most efficient way to retrieve the requested data.

It considers factors such as indexes, table relationships, and available resources to create an execution plan that minimizes the time and resources needed to complete the query.

how to do query optimization


3. Execution Plan Generation

The chosen execution plan outlines the sequence of steps required to fulfill the query.

It determines the order in which tables are accessed, the types of joins performed, and the filtering conditions applied.

The DBMS generates this plan based on statistical information about the data distribution and the database schema.

The goal is to reduce the amount of data that needs to be processed and to optimize disk and memory usage.

On Microsoft SQL Server, a Query Execution plan looks like below:

how Execution plan looks like


4. Data Retrieval and Joins

With the execution plan in place, the DBMS begins the process of data retrieval. If the query involves multiple tables, the DBMS performs join operations to combine the relevant data.

Joining tables efficiently requires comparing and matching rows based on specified conditions. Depending on the type of join (inner join, outer join, etc.), the DBMS determines which rows from each table should be included in the result set.

How SQL join works


5. Filtering and Sorting

After joining the necessary tables, the DBMS applies filtering conditions specified in the WHERE clause. This involves evaluating each row to determine whether it meets the criteria set by the user.

Rows that do not satisfy the conditions are discarded, while those that pass the filter are retained for further processing.

Additionally, if the query includes an ORDER BY clause, the DBMS will sort the resulting rows based on the specified column(s).

Sorting involves arranging the data in a specific order, such as ascending or descending, to produce the final ordered result set.

When does filtering and sorting happens in SQL Query Execution


6. Aggregation and Grouping

Aggregation functions such as SUM, COUNT, AVG, MIN, and MAX are commonly used in SQL queries to perform calculations on groups of data.

If the query includes a GROUP BY clause, the DBMS groups the rows based on the specified columns. It then applies the aggregation functions to each group separately, producing summary statistics or calculations for the grouped data.

Aggregation and Grouping in sQL query


7. Result Set Generation

With all the necessary operations performed, the DBMS generates the final result set. This set of rows and columns represents the data that satisfies the user's query. T

he result set is then returned to the user or the application that initiated the query.

when is result generated from SQL query


8. Index Utilization

Indexes play a vital role in optimizing the performance of SQL queries. An index is a data structure that provides a quick way to look up data based on specific columns.

When executing a query, the DBMS may utilize indexes to efficiently locate the relevant rows, reducing the need for full-table scans and improving query response times.

Index Utilization in SQL query


9. Transaction Management

Transactional operations in SQL, such as INSERT, UPDATE, and DELETE, involve modifying data in the database. These operations are grouped into transactions, which ensure data consistency and integrity.

When a transaction is initiated, the DBMS may lock the affected rows or tables to prevent other transactions from accessing or modifying them concurrently.

Once the transaction is completed, the changes are either committed to the database or rolled back, depending on the success or failure of the transaction.

Transaction Management in SQL


10. Caching and Memory Management

Modern database systems employ various caching and memory management techniques to optimize query performance.

Caching involves storing frequently accessed data in memory to reduce the need for disk reads, which are slower in comparison.

The DBMS may also use buffer pools to manage memory allocation for query execution and result set generation, further enhancing efficiency.

Caching and Memory Management in SQL


SQL Query Order? How SQL Query are executed under the hood?

It's also important to know and remember in which order various SQL commands like SELECT, FROM, COUNT, WHERE, HAVING, ORDER BY, JOIN etc are applied

SQL queries are processed in a specific order, and understanding this order is crucial for writing and optimizing queries effectively. The typical order of SQL query processing involves the following steps:

  1. FROM: The query begins by specifying the source tables or views from which the data will be retrieved. This clause defines the primary data source for the query.

  2. JOIN: If the query involves multiple tables, the JOIN clause is used to combine data from different tables based on specified conditions. Different types of joins (INNER JOIN, LEFT JOIN, RIGHT JOIN, etc.) determine how rows from each table are matched and included in the result set.

  3. WHERE: The WHERE clause is used to filter rows based on specific conditions. It restricts the data to only those rows that meet the specified criteria. Rows that do not satisfy the conditions are excluded from further processing.

  4. GROUP BY: If aggregation is required, the GROUP BY clause is used to group rows with similar values in specified columns. This step is often used in conjunction with aggregation functions like COUNT, SUM, AVG, etc. to perform calculations on grouped data.

  5. HAVING: The HAVING clause is used to filter the result set after the GROUP BY operation has been performed. It specifies conditions for filtering aggregated data. Similar to the WHERE clause, rows that do not meet the criteria are excluded from the final result.

  6. SELECT: The SELECT clause is used to specify the columns that should appear in the final result set. It determines which data will be retrieved and displayed in the query output.

  7. DISTINCT: The DISTINCT keyword, if used, removes duplicate rows from the result set, ensuring that only unique values are displayed.

  8. ORDER BY: The ORDER BY clause is used to sort the result set based on specified columns. It arranges the rows in ascending or descending order, as specified.

  9. LIMIT/OFFSET or FETCH/FIRST: Depending on the database system, you might use LIMIT (or FETCH or FIRST) and OFFSET clauses to control the number of rows returned and to implement pagination.

  10. UNION/INTERSECT/EXCEPT: If needed, these set operations can be used to combine the results of multiple queries.

Here is a nice diagram from Medium which clearly explains how the SQL query looks like and how its executed by Query engine:

SQL Query execution order

It's important to note that the actual order of execution may vary based on the specific database management system being used.** However, the logical processing order remains consistent across most SQL databases.

Additionally, modern query optimizer may rearrange some of these steps for performance reasons while ensuring that the final result remains accurate and consistent.

Understanding the order of SQL query processing not only help in technical interviews but also allows you to write efficient and effective queries, and it provides insights into query optimization and performance tuning.

By structuring your queries with this order in mind, you can better control the flow of data and achieve the desired results.

Conclusion

That's all about how SQL query are executed under the hood. SQL queries might seem like simple statements, but there is a complex process that unfolds behind the scenes to retrieve, manipulate, and manage data.

From parsing and optimization to execution plan generation and result set generation, every step is meticulously orchestrated to ensure efficient and accurate query processing.

Understanding how SQL queries work under the hood provides developers and database administrators with valuable insights into performance optimization and query tuning, ultimately leading to better utilization of database resources and improved application responsiveness.

And, if are preparing for tech interviews and you need more questions not just queries but also database and SQL related questions from other topics like indexes, joins, group by, aggregation, and window functions then you can also read Grokking the SQL Interview book or join 200+ SQL Interview Questions .

Both are great resources to prepare you for SQL interviews by answering popular questions.

All the best !!

Top comments (4)

Collapse
 
mahmoudessam profile image
Mahmoud EL-kariouny

Great article thank you so much.

Collapse
 
somadevtoo profile image
Soma

Thanks for your encouraging comment, appreciate it.

Collapse
 
tedsteinmann profile image
Ted Steinmann

What sort of implications does this have on writing efficient queries? Do you have examples of what to do and what not to do? How do views and stored procs come into play?

Collapse
 
suresh_radhakrishnan profile image
Suresh Radhakrishnan

Thanks for giving more clarity about the SQL Query Execution steps. Great Article and Insights