DEV Community

Dmitry Pursanov

How Indexing Works in MariaDB and Why It Helps with Optimization

Picture this: you're at a giant library with thousands of books, and you need to find a specific one. Would you prefer flipping through every single page, or heading straight to the index to locate exactly what you need? That's the magic of indexing! In the world of databases, indexing is your best friend, helping you zip through mountains of data faster than you can say "query optimization." Let's dive into how indexing works in MariaDB and why it's the ultimate game-changer for your database performance.

Indexing is one of the most fundamental and powerful techniques used to optimize database performance in MariaDB (actually in any relational database). It plays a critical role in speeding up data retrieval, making your queries more efficient, and reducing the load on your database.

What is an Index?

An index in MariaDB is a data structure that improves the speed of data retrieval operations on a database table. It acts like a "lookup" mechanism that allows the database to find rows faster than it would if it had to scan the entire table sequentially.

Indexes are created on one or more columns of a table. When a query is executed, the database engine can use the index to quickly locate the data, rather than searching every row in the table.

Put simply: an index is like a small, sorted lookup table that lives alongside your data. It gives the database engine a shortcut, so it doesn't have to slog through every single row.

Basic Example:

Consider a simple table users with columns id, name, and email.

CREATE TABLE users (
    id INT PRIMARY KEY,
    name VARCHAR(100),
    email VARCHAR(100)
);

If you frequently search for users by their email, you could create an index on the email column:

CREATE INDEX idx_email ON users(email);

Boom! You've just given your queries a rocket-powered engine. Now, when you run something like SELECT * FROM users WHERE email = 'example@example.com';, MariaDB will zoom in on the right row in no time!

How Indexing Works

So, what's happening under the hood? When you create an index, MariaDB builds a separate data structure (think of it as a super-organized filing cabinet) that stores the indexed column's values along with pointers to the actual rows in the table. That structure is typically a B-tree or, in some cases, a hash table. A B-tree keeps its entries sorted, so the database can home in on a value in a handful of steps, much like a binary search, instead of comparing against every row.

  • B-tree Indexes: Most indexes in MariaDB are implemented as B-trees, a type of self-balancing tree structure. B-trees allow for efficient searches, insertions, deletions, and range queries. The tree structure is designed to maintain balance, ensuring that operations remain efficient even as the database grows.
  • Hash Indexes: These are like the secret weapon for exact matches. They’re quick and efficient but don’t do well with range queries (they’re a bit like that friend who’s great at finding the nearest coffee shop but gets lost on a road trip). In MariaDB they are mainly available with the MEMORY storage engine; InnoDB builds B-tree indexes.

When you run a query, MariaDB’s optimizer steps in like a savvy travel guide, figuring out whether an index can make the journey quicker. If an index is the best route, off we go, speeding to the right data in no time!

What is the MariaDB Optimizer?

The MariaDB Optimizer is the query planner within the MariaDB server. Its main job is to analyze SQL queries and decide the best strategy to execute them, considering factors such as the structure of the database, the indexes available, and the specific conditions in the query.

When you run a query, MariaDB doesn’t just blindly execute it. Instead, the optimizer considers various possible ways to fulfill the query and chooses the one that it estimates will be the most efficient. This process is known as “query optimization.”

Analyzing the Optimizer with EXPLAIN

To understand how the optimizer is handling your queries, you can use the EXPLAIN statement in MariaDB. EXPLAIN provides a breakdown of the execution plan chosen by the optimizer, including which indexes are used, the order of table access, and more.

EXPLAIN SELECT * FROM users WHERE email = 'example@example.com';

The output of EXPLAIN shows the steps the optimizer has decided to take, allowing you to spot potential inefficiencies and make improvements. For example, the key column tells you which index, if any, the optimizer chose for each table.
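
As a rough guide to reading that output, the sketch below reruns the same EXPLAIN and lists, as comments, the columns that usually matter most. It assumes the users table and the idx_email index created earlier; the values you actually see depend on your data and MariaDB version.

```sql
-- Assumes the users table and idx_email index from the earlier example.
EXPLAIN SELECT * FROM users WHERE email = 'example@example.com';

-- Columns worth checking in the result:
--   type          : access method; ALL means a full table scan,
--                   while const / eq_ref / ref / range mean an index lookup
--   possible_keys : indexes the optimizer considered
--   key           : the index it actually chose (NULL if none)
--   rows          : estimated number of rows it expects to examine
--   Extra         : additional notes such as "Using index" or "Using filesort"
```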

How the B-tree Index Works

  1. Insertion: When a new row is added to the table, the index is automatically updated. The new entry is placed in the appropriate location within the B-tree structure.
  2. Search: When a query searches for a value in an indexed column, the database engine uses the B-tree to quickly narrow down the possible locations of the data, often finding it with only a few comparisons.
  3. Range Queries: B-tree indexes are particularly useful for range queries (e.g., BETWEEN, >, <), as the ordered structure allows the database to efficiently retrieve all rows within the specified range.
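
The three steps above can be sketched in SQL. This is a minimal illustration, assuming the users table and idx_email index from the earlier example; the sample rows and values are made up.

```sql
-- Assumes the users table and idx_email index from the earlier example.

-- 1. Insertion: the new email is added to idx_email automatically.
INSERT INTO users (id, name, email)
VALUES (1, 'Alice', 'alice@example.com');

-- 2. Search: an equality lookup can walk the B-tree straight to the entry.
SELECT id, name FROM users WHERE email = 'alice@example.com';

-- 3. Range query: entries are kept in sorted order, so one contiguous
--    slice of the index satisfies the condition.
SELECT id, email FROM users
WHERE email >= 'a' AND email < 'b';

-- A LIKE pattern with a fixed prefix is also treated as a range...
SELECT id, email FROM users WHERE email LIKE 'ali%';

-- ...but a leading wildcard cannot use the sorted order to narrow the search.
SELECT id, email FROM users WHERE email LIKE '%@example.com';
```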

Why Indexing is Your Database’s Best Friend

Indexing is like giving your database a turbo boost. It makes data retrieval not just faster, but super fast. Here’s why indexing is the secret ingredient to a well-oiled, high-performance database:

  • Less Disk I/O: Without an index, your database is like someone searching for a lost sock in a pile of laundry — it has to look at every item. With an index, it’s like having that sock magically appear right on top, saving tons of time and effort.
  • Speedier Queries: Indexes let your queries go from “let me think about that” to “here’s your answer!” in a flash, especially with large tables where every second counts.
  • Smarter Resource Use: By cutting down on unnecessary work, indexes free up your database’s brainpower (CPU and memory), so it can handle more tasks at once.
  • Better Joins: When combining data from multiple tables, indexes make the process smooth and efficient, like a well-choreographed dance. Instead of clumsily bumping into each other, the tables glide together in perfect harmony.
  • Unique Constraints? No Problem!: Need to make sure there are no duplicate emails in your users table? Indexes help enforce those rules while keeping lookups fast.
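
For the unique-constraint case, a minimal sketch might look like this. It assumes the users table from the earlier example; idx_email_unique and the sample rows are illustrative. In practice a unique index would usually replace a plain index on the same column rather than sit next to it.

```sql
-- Assumes the users table from the earlier example.
-- idx_email_unique is an illustrative name.
CREATE UNIQUE INDEX idx_email_unique ON users (email);

INSERT INTO users (id, name, email) VALUES (10, 'Bob', 'bob@example.com');

-- Rejected with a duplicate-entry error because the email already exists:
INSERT INTO users (id, name, email) VALUES (11, 'Robert', 'bob@example.com');
```

Note that creating the unique index fails if the column already contains duplicate values, so clean those up first.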

A Few Words of Caution

But wait — before you go index-crazy, remember that even the best tools come with a few trade-offs:

  • Storage Space: Indexes aren’t free — they take up space. Each index adds a bit more weight to your database’s storage needs.
  • Update Overhead: Every time you add, update, or delete rows, your indexes need a little TLC. That’s more work for your database, so if you have tons of indexes, writes can slow down.
  • Choose Wisely: Not every query needs an index. Like a chef carefully selecting the right spice, you need to pick your indexes based on which queries will benefit the most.
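
To see what these trade-offs look like on your own database, one option is to list a table's indexes and compare how much space indexes take relative to row data. The schema name below is a placeholder, and the sizes reported by information_schema are estimates.

```sql
-- List the indexes defined on a table.
SHOW INDEX FROM users;

-- Compare the space used by row data and by indexes (in MiB).
-- 'your_database' is a placeholder for your schema name.
SELECT table_name,
       ROUND(data_length  / 1024 / 1024, 2) AS data_mib,
       ROUND(index_length / 1024 / 1024, 2) AS index_mib
FROM information_schema.TABLES
WHERE table_schema = 'your_database'
ORDER BY index_length DESC;
```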

— — — — — Practice Part! — — — — —

Smart Strategies for Selecting What to Index in MariaDB

Choosing the right indexes for your MariaDB database is like crafting the perfect recipe: you want just the right ingredients in the right amounts to make everything work together smoothly. Here are some strategies to help you plan and select what to index, ensuring that your database runs efficiently without overburdening it.

1. Analyze Your Query Patterns

The first step in deciding what to index is understanding how your database is used. Start by analyzing the most common queries in your application. Pay attention to:

  • Frequently Queried Columns: Which columns are most often used in WHERE clauses? If a column is frequently filtered on, it’s a strong candidate for indexing.
  • Join Conditions: Identify columns that are commonly used in JOIN operations. Indexing these columns can significantly speed up join performance.
  • Sorting and Grouping: Columns used in ORDER BY and GROUP BY clauses often benefit from indexing, especially if sorting or grouping is done on large datasets.

Example:

If you have a query like the following, you might want to index both customer_id and order_date:

SELECT * FROM orders WHERE customer_id = 123 ORDER BY order_date DESC;
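
One way to index both columns is a single composite index, sketched below. The orders table definition is hypothetical (the article never spells out its columns), and idx_orders_customer_date is an illustrative name.

```sql
-- Hypothetical orders table; the column list is an assumption.
CREATE TABLE orders (
    id INT PRIMARY KEY,
    customer_id INT NOT NULL,
    order_date DATE NOT NULL,
    total DECIMAL(10, 2)
);

-- One composite index can serve both the filter and the sort:
-- within each customer_id, entries are stored in order_date order.
CREATE INDEX idx_orders_customer_date ON orders (customer_id, order_date);

SELECT * FROM orders WHERE customer_id = 123 ORDER BY order_date DESC;
```

Two separate single-column indexes would also work for the filter, but the optimizer would then typically still have to sort the matching rows.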

2. Use the EXPLAIN Command

Before creating an index, use the EXPLAIN command to see how MariaDB is currently executing your queries. This command shows you the execution plan, including whether the database is using a full table scan, an index scan, or another method.

  • Full Table Scans: If EXPLAIN shows that a query is performing a full table scan on a large table, it’s a signal that an index might be needed.
  • Index Usage: If EXPLAIN reveals that an existing index is being used but isn’t performing well, it might be time to refine or add additional indexes.

EXPLAIN SELECT * FROM orders WHERE customer_id = 123 ORDER BY order_date DESC;
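
A before-and-after comparison might look like the sketch below. It assumes an orders table with customer_id and order_date columns; the index name is illustrative, and on a very small table the optimizer may still prefer a scan.

```sql
-- Before: with no suitable index, expect type = ALL (full table scan)
-- and possibly "Using filesort" in the Extra column.
EXPLAIN SELECT * FROM orders WHERE customer_id = 123 ORDER BY order_date DESC;

-- IF NOT EXISTS is MariaDB syntax; it skips creation if the name is taken.
CREATE INDEX IF NOT EXISTS idx_orders_customer_date
    ON orders (customer_id, order_date);

-- After: the key column should name the new index,
-- and the rows estimate should drop.
EXPLAIN SELECT * FROM orders WHERE customer_id = 123 ORDER BY order_date DESC;
```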

3. Prioritize Primary Keys and Foreign Keys

Primary and foreign keys are natural candidates for indexing because they uniquely identify rows and establish relationships between tables. In fact, primary keys are automatically indexed by MariaDB. With InnoDB, declaring a FOREIGN KEY constraint also creates an index on the referencing column if none exists, but a column that merely acts as a foreign key, without a declared constraint, needs an explicit index.

  • Primary Keys: These are automatically indexed, but make sure they’re appropriately designed (e.g., using an integer or another small, unique value).
  • Foreign Keys: Indexing foreign key columns can drastically improve join performance, and it keeps the integrity checks performed by the constraint fast.

CREATE INDEX idx_customer_id ON orders(customer_id);
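
If you prefer to declare the relationship as a real constraint, here is a minimal sketch. The customers table and the constraint name are hypothetical, and it assumes orders.customer_id is an INT matching customers.id on InnoDB tables.

```sql
-- Hypothetical customers table that orders.customer_id refers to.
CREATE TABLE customers (
    id INT PRIMARY KEY,
    name VARCHAR(100)
);

-- Declaring the constraint on an InnoDB table makes MariaDB create an
-- index on orders.customer_id if no usable one exists yet.
ALTER TABLE orders
    ADD CONSTRAINT fk_orders_customer
    FOREIGN KEY (customer_id) REFERENCES customers (id);

-- A join on the foreign key can then use that index.
SELECT c.name, o.id AS order_id
FROM customers AS c
JOIN orders AS o ON o.customer_id = c.id
WHERE c.id = 123;
```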

4. Composite Indexes: Grouping Columns Wisely

Sometimes, a single-column index isn’t enough. Composite indexes, which involve multiple columns, can be extremely powerful when used correctly.

  • Order Matters: In a composite index, the order of the columns matters. MariaDB can use the index only for conditions on a leftmost prefix of its columns, so put columns compared with = before columns used in ranges or sorting. Among equality columns, the most selective one (the one that filters out the most rows) usually goes first.
  • Covering Indexes: A composite index that includes all the columns needed by a query is known as a covering index. Using a covering index can eliminate the need for the database to access the table data at all, further speeding up queries.

Example:

For a query like SELECT * FROM orders WHERE customer_id = 123 AND order_date >= '2023-01-01';, a composite index on (customer_id, order_date) could be very effective.
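
To make the column-order point concrete, the sketch below shows which queries a composite index on (customer_id, order_date) can and cannot help. It assumes such an index already exists on a hypothetical orders table; running EXPLAIN on each statement is the way to confirm the behavior on your data.

```sql
-- Assumes a composite index on orders (customer_id, order_date), e.g.:
--   CREATE INDEX idx_customer_date ON orders (customer_id, order_date);

-- Can use the index: equality on the first column, range on the second.
SELECT * FROM orders
WHERE customer_id = 123 AND order_date >= '2023-01-01';

-- Can use the index: the condition is on the leftmost column.
SELECT * FROM orders WHERE customer_id = 123;

-- Cannot use the index for a lookup: order_date alone is not a
-- leftmost prefix of (customer_id, order_date).
SELECT * FROM orders WHERE order_date >= '2023-01-01';

-- Covering: every referenced column is in the index, so EXPLAIN
-- typically shows "Using index" in the Extra column.
SELECT customer_id, order_date FROM orders
WHERE customer_id = 123 AND order_date >= '2023-01-01';
```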

5. Consider the Cardinality

Cardinality refers to the uniqueness of data in a column. Columns with high cardinality (many unique values, like email addresses) often benefit more from indexing than columns with low cardinality (few unique values, like gender).

  • High Cardinality: Index columns with high cardinality, as the index will provide significant performance gains.
  • Low Cardinality: Be cautious with low cardinality columns; indexing them might not yield noticeable benefits and could add unnecessary overhead.

Example:

Indexing a country column in a global user database might be less useful than indexing an email column.
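
You can measure this rather than guess. The query below is a rough sketch that computes selectivity (distinct values divided by total rows) for two columns; it assumes a users table that has both email and country columns, and country is taken from the example above rather than from the earlier schema.

```sql
-- Selectivity = distinct values / total rows; closer to 1 is more selective.
SELECT COUNT(DISTINCT email)   / COUNT(*) AS email_selectivity,
       COUNT(DISTINCT country) / COUNT(*) AS country_selectivity
FROM users;

-- For indexes that already exist, the Cardinality column shows the
-- optimizer's estimate of the number of distinct values.
SHOW INDEX FROM users;
```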

6. Avoid Over-Indexing

More isn’t always better. Each index adds overhead to your database, both in terms of storage and maintenance. Too many indexes can slow down insert, update, and delete operations, as the indexes need to be updated along with the data.

  • Monitor Index Performance: Regularly review and monitor the performance of your indexes. If an index isn’t being used or is providing minimal benefits, consider dropping it.
  • Focus on High-Impact Queries: Prioritize indexing for queries that run frequently or that impact user experience, rather than trying to optimize every single query.

Example:

If you find that an index isn’t being used, or if it’s only benefiting a query that runs infrequently, it might be worth removing it.
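
One possible way to spot unused indexes in MariaDB is its user statistics feature, sketched below. This assumes you can enable the userstat variable on your server; the schema name is a placeholder and idx_unused is a hypothetical index name. Counters only cover the period since statistics were switched on, so let a representative workload run first.

```sql
-- Enable MariaDB's user statistics; counters start from this point.
SET GLOBAL userstat = 1;

-- Later: list secondary indexes with no recorded reads.
-- 'your_database' is a placeholder for your schema name.
SELECT s.table_name, s.index_name
FROM information_schema.STATISTICS AS s
LEFT JOIN information_schema.INDEX_STATISTICS AS i
       ON  i.table_schema = s.table_schema
       AND i.table_name   = s.table_name
       AND i.index_name   = s.index_name
WHERE s.table_schema = 'your_database'
  AND s.index_name <> 'PRIMARY'
  AND i.index_name IS NULL
GROUP BY s.table_name, s.index_name;

-- Drop an index once you are confident nothing depends on it.
-- idx_unused is a hypothetical name.
DROP INDEX idx_unused ON users;
```

Be careful with unique indexes and indexes backing foreign keys: they may show no reads yet still enforce constraints.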

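7. Leverage Prefix Indexes (and Know the Limits of Partial Indexes)

Sometimes, indexing the entire column might be overkill. In these cases, indexing only part of the data can be useful.

  • Partial Indexes: Some databases, such as PostgreSQL, let you index a subset of rows by adding a WHERE clause to the index definition. MariaDB does not currently support that syntax, so CREATE INDEX ... WHERE will fail there. If you only query a specific range of data, a composite index that leads with the filtering column is the usual alternative.
  • Prefix Indexes: For long string columns (like a long VARCHAR or TEXT), you can index just the first few characters to save space while still improving performance. The trade-off is that a prefix that is too short matches many rows, so pick a length that keeps values reasonably distinct.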

CREATE INDEX idx_email_prefix ON users(email(10));
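
How long should the prefix be? A simple way to check, assuming the users table from earlier, is to compare the selectivity of a few candidate lengths against the full column. The lengths 5, 10, and 20 are arbitrary picks for illustration.

```sql
-- Compare how well different prefix lengths distinguish rows.
-- The closer a prefix's ratio is to full_column, the less you lose.
SELECT COUNT(DISTINCT LEFT(email, 5))  / COUNT(*) AS prefix_5,
       COUNT(DISTINCT LEFT(email, 10)) / COUNT(*) AS prefix_10,
       COUNT(DISTINCT LEFT(email, 20)) / COUNT(*) AS prefix_20,
       COUNT(DISTINCT email)           / COUNT(*) AS full_column
FROM users;
```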

8. Test and Iterate

Database indexing is an ongoing process, not a one-time setup. As your data grows and your application evolves, regularly revisit your indexing strategy.

  • Benchmarking: Test your queries before and after adding indexes to see the actual performance impact.
  • Iterate: Based on your findings, tweak your indexes, drop unnecessary ones, and add new ones where needed.
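
For benchmarking, MariaDB's ANALYZE statement is one handy option: unlike EXPLAIN, it actually runs the query and reports real row counts and timings. The sketch below assumes an orders table with customer_id and order_date columns; the index name is illustrative.

```sql
-- ANALYZE runs the statement and reports actual row counts and timings
-- next to the optimizer's estimates (MariaDB 10.1 and later).
ANALYZE FORMAT=JSON
SELECT * FROM orders WHERE customer_id = 123 ORDER BY order_date DESC;

-- Add or change an index, then run the same statement again and
-- compare fields such as r_total_time_ms and r_rows.
CREATE INDEX IF NOT EXISTS idx_orders_customer_date
    ON orders (customer_id, order_date);

ANALYZE FORMAT=JSON
SELECT * FROM orders WHERE customer_id = 123 ORDER BY order_date DESC;
```

Run each measurement a few times, since the first execution may include the cost of loading data into memory.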

So there you have it! Indexing in MariaDB — and yes, in MySQL too — isn’t just about throwing in a few shortcuts here and there. It’s about mastering the art of database efficiency. Think of it as being a savvy librarian who knows exactly where every book is stored, ensuring that anyone who walks in finds what they need in record time.

Go ahead, unleash the power of indexing in MariaDB and MySQL, and watch your queries fly through data like a hot knife through butter. And when your users are wowed by the blazing speed of your app, you can sit back, smile, and maybe even treat yourself to a well-earned cup of coffee. After all, you’re now the master chef of database optimization!

