Clustered and Nonclustered Indexes

#sql #beginners #sqlserver #mssql

I used to learn about SQL, especially MySQL. Then now my job requires me to understand Microsoft SQL Server (mssql). So I make these notes to guide me and others who want to transition to mssql. I usually summarized what I found and understood so far from browsing the internet.

Here is my first note of my first series, Re-learning SQL Server 🤓.

My takeaways:

An index is an on-disk structure.
An index contains keys built from one or more columns in the table or view.
Indexes can be helpful for a variety of queries that contain SELECT, UPDATE, DELETE, or MERGE statements. When one of this query is executed, the query optimizer evaluates each available method for retrieving the data and selects the most efficient method. The method may be a table scan, or may be scanning one or more indexes if they exist.
When performing a table scan, the query optimizer reads all the rows in the table, and extracts the rows that meet the criteria of the query. It could be the most efficient method if the result set of the query is a high percentage of rows from the table.
When the query optimizer uses an index, it searches the index key columns, finds the storage location of the rows needed by the query and extracts the matching rows from that location. Generally, searching the index is much faster than searching the table because unlike a table, an index frequently contains very few columns per row and the rows are in sorted order.
The query optimizer typically selects the most efficient method when executing queries. However, if no indexes are available, the query optimizer must use a table scan. Our task is to design and create indexes that are best suited to our environment so that the query optimizer has a selection of efficient indexes from which to select.
Clustered index sorts and stores the data rows of the table or view in order based on the clustered index key. The clustered index is implemented as a B-tree index structure that supports fast retrieval of the rows, based on their clustered index key values.
Nonclustered index can be defined on a table or view with a clustered index or on a heap (structure without a clustered index). Each index row in the nonclustered index contains the nonclustered key value and a row locator. For a heap, a row locator is a pointer to the row. For a clustered table, the row locator is the clustered index key. The rows in the index are stored in the order of the index key values, but the data rows are not guaranteed to be in any particular order unless a clustered index is created on the table.
Clustered index is like a table of contents on the book.
Nonclustered index is like an index on the book.
Primary key could be a clustered index but it doesn't have to. Both primary key and clustered index could coincide on the same table.
Both clustered and nonclustered indexes can be unique.
When we create a table with a UNIQUE constraint, Database Engine automatically creates a nonclustered index. If we configure a PRIMARY KEY, Database Engine automatically creates a clustered index, unless a clustered index already exists.