DEV Community

Jamiebones

Understanding MongoDB Indexes

Introduction

Indexes are used in databases to improve the performance of queries. A database index is like the index at the back of a book, which lets you navigate quickly to any part of the book. Without an index, MongoDB must scan every document in the collection to select the documents that match a query.

Indexes are special data structures that store a small portion of a collection's data in an easy-to-traverse form. Indexes are defined at the collection level and can be created on any field of the documents. MongoDB creates a default index on the _id field.
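To make the difference between a collection scan and an index lookup concrete, here is a minimal sketch in plain JavaScript. It does not use MongoDB at all: the sample documents and the helper names (collectionScan, buildIndex, indexLookup) are hypothetical, and a sorted array with binary search stands in for the B-tree that MongoDB actually uses.

```javascript
// Toy in-memory "collection"; the documents are illustrative only.
const employees = [
  { empId: 3, empName: "Gongo Aso" },
  { empId: 1, empName: "Jamie Jamo" },
  { empId: 4, empName: "Andrew Young" },
  { empId: 2, empName: "Fletcher Parker" },
];

// Collection scan: documents are examined one by one until a match is found.
function collectionScan(docs, empId) {
  let examined = 0;
  for (const doc of docs) {
    examined += 1;
    if (doc.empId === empId) return { doc, examined };
  }
  return { doc: null, examined };
}

// Hypothetical index: (key, position) entries kept sorted by key.
function buildIndex(docs, field) {
  return docs
    .map((doc, position) => ({ key: doc[field], position }))
    .sort((a, b) => a.key - b.key);
}

// Index lookup: binary search over the sorted keys, then fetch one document.
function indexLookup(docs, index, key) {
  let lo = 0;
  let hi = index.length - 1;
  let examined = 0;
  while (lo <= hi) {
    const mid = (lo + hi) >> 1;
    examined += 1;
    if (index[mid].key === key) {
      return { doc: docs[index[mid].position], examined };
    }
    if (index[mid].key < key) lo = mid + 1;
    else hi = mid - 1;
  }
  return { doc: null, examined };
}

const empIdIndex = buildIndex(employees, "empId");
const scanResult = collectionScan(employees, 2);
const indexResult = indexLookup(employees, empIdIndex, 2);
console.log(scanResult.examined, indexResult.examined); // 4 1
```

With only four documents the saving is small, but the scan grows linearly with the collection while the sorted lookup grows logarithmically.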

Creating an Index on a Collection

An index is created with the createIndex() method of a collection. In the mongo shell the general form is:

   //command for creating an index in the mongo shell
   db.collection.createIndex(keys, options)

Assume we have an employee collection in our database, populated with the following sample data:

  db.employee.insertMany([
    { empId: 1, empName: "Jamie Jamo", state: "Edo", country: "Nigeria" },
    { empId: 2, empName: "Fletcher Parker", state: "California", country: "USA" },
    { empId: 3, empName: "Gongo Aso", state: "Lagos", country: "Nigeria" },
    { empId: 4, empName: "Andrew Young", state: "Edo", country: "Nigeria" }
  ])

An index can be created on the empId field by running the command below:

  db.employee.createIndex({empId: 1})

The value 1 means the index stores its keys in ascending order; -1 stores them in descending order. An index can also be created on multiple fields, which is known as a compound index.

  //example of a compound index that contains multiple fields
  db.employee.createIndex({empId: 1, empName: 1})

The list of indexes on a collection can be retrieved by running the following command:

  //employee is the collection whose indexes you want to list
  db.employee.getIndexes()

An index can be dropped from a collection with the dropIndex() method. The index created on our employee collection can be dropped by running:

  db.employee.dropIndex({empId: 1})

  //or we can drop all the indexes in a collection by running
  db.employee.dropIndexes("*")
  //this drops all the indexes in the collection except the default _id index. This feature is available from MongoDB v4 and upwards

Note that the default index on the _id field cannot be dropped.

Types of Indexes

  1. Single Field Indexes
  2. Compound Indexes
  3. Multikey Indexes
  4. Text Indexes
  5. Hashed Indexes
  6. 2dsphere Indexes

  //single field index
  db.employee.createIndex({empId: 1})

  //a compound index combines more than one field in a single index
  db.employee.createIndex({empId: 1, empName: 1})

Multikey Indexes

A multikey index is an index created on a field that holds an array. MongoDB creates an index key for each value in the array.

   //insert this sample document into our fictitious employee collection
   db.employee.insertOne({
     empId: 1, empName: "Jamie Jamo", state: "Edo", country: "Nigeria",
     skills: ["Javascript", "React", "Meteorjs"]
   })
   //we can now create a multikey index on the skills field
   db.employee.createIndex({skills: 1})

MongoDB automatically makes the index multikey because skills holds an array. A compound index can include a multikey field, but for any single document at most one of the indexed fields can hold an array.
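The "one key per array value" idea can be sketched in plain JavaScript without a database. The staff array and the buildMultikeyIndex helper below are hypothetical and only imitate the shape of a multikey index; they are not how MongoDB stores it internally.

```javascript
// Illustrative documents; only the skills field matters here.
const staff = [
  { empId: 1, skills: ["Javascript", "React", "Meteorjs"] },
  { empId: 2, skills: ["Javascript", "Python"] },
  { empId: 3, skills: "Go" }, // a scalar value yields a single key
];

// Build a toy multikey index: one entry per distinct array element,
// each pointing back at the empId of the document that contains it.
function buildMultikeyIndex(docs, field) {
  const index = new Map();
  for (const doc of docs) {
    const value = doc[field];
    const keys = Array.isArray(value) ? new Set(value) : [value];
    for (const key of keys) {
      if (!index.has(key)) index.set(key, []);
      index.get(key).push(doc.empId);
    }
  }
  return index;
}

const skillsIndex = buildMultikeyIndex(staff, "skills");
console.log(skillsIndex.get("Javascript")); // [ 1, 2 ]
console.log(skillsIndex.size); // 5
```

A query such as db.employee.find({skills: "Javascript"}) can then be answered from the entry for that single key instead of inspecting every array.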

Text Indexes

A text index enables and optimizes text search queries on string content. It can be created on a field that contains a string value or an array of string values.

   //sample data
   db.post.insertOne({
       "post_text" : "This is the new dawn",
       "tags" : ["volleyball", "dawn"]
   })
   //a text index can be created on the post_text field by running this command
   db.post.createIndex({"post_text": "text"})
   //this creates a text index on the post_text field and makes it searchable.
   //The command below performs a search query using the text index,
   //where post represents your collection
   db.post.find({$text: {$search: "your search term"}})

Hashed Indexes

Indexes require extra storage space alongside the collection data. A hashed index stores a hash of the value of the indexed field rather than the value itself, which can reduce the space the index occupies when the indexed values are large. Hashed indexes support equality matches but not range queries, and a hashed index cannot be created on a field that holds an array (hashed indexes cannot be multikey).

   //this creates a hashed index
   db.employee.createIndex({empId: "hashed"})

2dsphere Index

A 2dsphere index supports queries on geospatial data, such as finding the documents closest to a given point.

  //sample geospatial data (GeoJSON coordinates are listed as [longitude, latitude])
  db.schools.insertMany([
    {
      name: "University of Uyo",
      location: { type: "Point", coordinates: [-73.97, 40.77] }
    },
    {
      name: "University of Ilorin",
      location: { type: "Point", coordinates: [-73.9375, 40.8303] }
    },
    {
      name: "University of Lagos",
      location: { type: "Point", coordinates: [-73.9928, 40.7193] }
    }
  ])
  //to create a 2dsphere index use the following command
  db.schools.createIndex({location: "2dsphere"})
  //you can query GeoJSON points to find places close to each other. The following query uses the $near operator
  //to return documents that are at least 500 meters and at most 1500 meters from the specified GeoJSON point.
  db.schools.find({
    location: {
      $near: {
        $geometry: { type: "Point", coordinates: [-73.9667, 40.78] },
        $minDistance: 500,
        $maxDistance: 1500
      }
    }
  })

Properties of Indexes

Index properties define certain behaviors of an index at runtime. Index properties include:
... TTL indexes
... Unique indexes
... Sparse indexes
... Partial indexes

TTL indexes

Time to live (TTL) indexes are special single field indexes that MongoDB uses to remove documents from a collection after a period of time. This is useful for data that expires, such as event logs, inactive users and API keys. Instead of writing a cron job to delete this data, you can delegate the work to MongoDB by declaring a TTL index.

  //this example creates an index on the createdAt field in the eventLogs collection
  //and expires each document 3600 seconds after its createdAt value
  db.eventLogs.createIndex({createdAt: 1}, {expireAfterSeconds: 3600})

A TTL index only expires documents whose indexed field holds a BSON Date or an array that contains BSON Date values. If the indexed field holds any other type, or the document does not contain the indexed field at all, the document will not expire and MongoDB will not delete it.
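The expiry rule described above can be sketched as a small plain JavaScript function. The name ttlExpiry is hypothetical and this is only an approximation of the rule, not MongoDB's implementation; it assumes that when the field is an array of dates the earliest date is the one that counts.

```javascript
// Returns the time at which a document becomes eligible for deletion,
// or null if the document never expires under the given TTL settings.
function ttlExpiry(doc, field, expireAfterSeconds) {
  const value = doc[field];
  const candidates = Array.isArray(value) ? value : [value];
  const dates = candidates.filter((v) => v instanceof Date);
  if (dates.length === 0) return null; // missing field or no date values
  // With several dates, the earliest one decides when the document expires.
  const earliest = Math.min(...dates.map((d) => d.getTime()));
  return new Date(earliest + expireAfterSeconds * 1000);
}

const logEntry = { createdAt: new Date("2024-01-01T00:00:00Z") };
console.log(ttlExpiry(logEntry, "createdAt", 3600).toISOString());
// 2024-01-01T01:00:00.000Z
console.log(ttlExpiry({ createdAt: "2024-01-01" }, "createdAt", 3600)); // null
console.log(ttlExpiry({ message: "no date" }, "createdAt", 3600)); // null
```

In a real deployment the deletion is carried out by a background task that runs periodically, so a document may remain in the collection for a short while after its expiry time has passed.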

Unique Indexes

A unique index is a constraint that ensures no two documents hold the same value for the indexed field. MongoDB creates a unique index on the _id field by default.

  //this example creates a unique index on the empName field in the employee collection,
  //so no two employees can have the same empName value
  db.employee.createIndex({empName: 1}, {unique: true})

Partial indexes

A partial index only indexes the documents that match a given filter condition. The filter condition can be specified with a limited set of operators, such as equality expressions, $exists: true, $gt, $gte, $lt, $lte, $type and $and. Partial indexes reduce storage requirements and the performance cost of maintaining the index because they store only a subset of the documents.

  //partial indexes can be created by the following command
  db.employee.createIndex({"age": 1}, {partialFilterExpression: {age: {$gt: 34}}})
  //the partialFilterExpression option accepts a document that specifies the filter condition.

We can combine the TTL property with a partial index to remove only the documents in a collection that match a certain filter condition.

  //with this index MongoDB deletes any document whose state is "Debug"
  //once it was created more than an hour ago
  db.eventLogs.createIndex({createdAt: 1}, {expireAfterSeconds: 3600, partialFilterExpression: {state: "Debug"}})

Sparse index

A sparse index only stores entries for documents that have the indexed field, even if the field holds a null value. It skips any document that does not have the indexed field.

    db.employee.createIndex({age: 1}, {sparse: true})

Consider the following person collection:

  db.person.insertOne({personName: "john", age: 34 })
  db.person.insertOne({personName: "Abu", age: 67 })
  db.person.insertOne({personName: "favour", hobbies: ["running", "dancing"] })

Creating a sparse index on the age field only adds the first two documents, which contain the age field, to the index. The documents held in the sparse index can be retrieved by using hint() to force the query to use that index.

  //create the sparse index, then retrieve documents through it
  db.person.createIndex({age: 1}, {sparse: true})
  db.person.find().hint({age: 1})
  //hint() forces the sparse index to be used, so only the two documents that have an age field are returned
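The difference between a regular and a sparse index on the person collection can be imitated in plain JavaScript. The buildAgeIndex helper is hypothetical and only models which documents receive an index entry; a regular index is assumed to store a null key for documents that lack the field.

```javascript
const people = [
  { personName: "john", age: 34 },
  { personName: "Abu", age: 67 },
  { personName: "favour", hobbies: ["running", "dancing"] },
];

// Toy index builder: a regular index stores null for a missing field,
// while a sparse index leaves the document out entirely.
function buildAgeIndex(docs, { sparse }) {
  const entries = [];
  for (const doc of docs) {
    if (!("age" in doc)) {
      if (sparse) continue;
      entries.push({ key: null, personName: doc.personName });
    } else {
      entries.push({ key: doc.age, personName: doc.personName });
    }
  }
  return entries;
}

const regularIndex = buildAgeIndex(people, { sparse: false });
const sparseIndex = buildAgeIndex(people, { sparse: true });
console.log(regularIndex.length, sparseIndex.length); // 3 2
```

This is why a query forced through the sparse index with hint() sees only two of the three documents.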

Strategy For Choosing Indexes

Choosing the right indexes for a collection improves the speed of database queries. The best indexing strategy depends on several factors, including:
... the type of queries being executed
... the number of read and write operations
... the available memory

Different indexing strategies

The main indexing strategies are:
... create indexes to support your queries
... use indexes to sort query results
... ensure indexes fit in RAM, so that recently used values are served from memory
... create queries that ensure selectivity, i.e. queries that limit the number of possible documents

Create an index to support your queries

Create a single field index if all the queries use the same single key to retrieve documents.

  db.student.createIndex({studentId: 1})
  //create a compound index if the queries use more than one field to retrieve documents
  db.student.createIndex({studentId: 1, studentName: 1})

Use an index to sort query results

Sort operations can use an index for better performance. MongoDB returns the documents in sorted order by reading them in the order of the index, instead of sorting them in memory.
Sorting can be done in the following manner:
... sorting with a single field index
... sorting on multiple fields
To support sorting on a single field, create the index on the key used to sort the documents.

  //this index can support both ascending and descending sorts on studentId
  db.student.createIndex({studentId: 1})
  //if queries sort on more than one field, a compound index
  //can be created on those fields to support the query
  db.student.createIndex({studentId: 1, studentName: 1})
  //the index above efficiently supports the following query
  db.student.find().sort({studentId: 1, studentName: 1})

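Which sort orders a compound index can serve follows a simple rule of thumb: the sort keys must be a prefix of the index keys, and the directions must either all match the index or all be reversed. The sketch below encodes that rule in plain JavaScript. The function name indexSupportsSort is hypothetical, and the check is a simplification that ignores cases where equality filters on leading fields allow other sorts.

```javascript
// indexSpec and sortSpec are plain objects such as {studentId: 1, studentName: 1}.
function indexSupportsSort(indexSpec, sortSpec) {
  const indexKeys = Object.keys(indexSpec);
  const sortKeys = Object.keys(sortSpec);
  if (sortKeys.length === 0 || sortKeys.length > indexKeys.length) return false;
  // The sort keys must be a prefix of the index keys, in the same order.
  if (!sortKeys.every((key, i) => key === indexKeys[i])) return false;
  // Directions must all match the index, or all be reversed.
  const sameDirection = sortKeys.every((key) => sortSpec[key] === indexSpec[key]);
  const reversed = sortKeys.every((key) => sortSpec[key] === -indexSpec[key]);
  return sameDirection || reversed;
}

const studentIndex = { studentId: 1, studentName: 1 };
console.log(indexSupportsSort(studentIndex, { studentId: 1, studentName: 1 })); // true
console.log(indexSupportsSort(studentIndex, { studentId: -1, studentName: -1 })); // true
console.log(indexSupportsSort(studentIndex, { studentId: 1, studentName: -1 })); // false
console.log(indexSupportsSort(studentIndex, { studentName: 1 })); // false
```

If a mixed-direction sort such as {studentId: 1, studentName: -1} is common, the index itself would need to be created with those directions.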
Index to hold recent values in RAM (memory)

Indexes perform best when they fit completely in memory. We can use the totalIndexSize() method to find out the total size of the indexes on a collection.

  db.student.totalIndexSize()
  //this returns the total size, in bytes, of all the indexes on the student collection. When an index fits in RAM, MongoDB can read it from memory, which is faster than reading it from disk

Create queries that ensure selectivity

The ability of any query to narrow the results using the created index is called selectivity. Writing queries that limit the number of possible documents with the indexed field greatly helps to improve the performance of your queries.
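As a rough illustration of selectivity, the plain JavaScript sketch below measures what fraction of a small sample collection two different filters match. The matchFraction helper and the sample documents are hypothetical; the smaller the fraction, the more selective the filter.

```javascript
const sampleEmployees = [
  { empId: 1, country: "Nigeria" },
  { empId: 2, country: "USA" },
  { empId: 3, country: "Nigeria" },
  { empId: 4, country: "Nigeria" },
];

// Fraction of the collection a filter matches: smaller means more selective.
function matchFraction(docs, predicate) {
  return docs.filter(predicate).length / docs.length;
}

const byCountry = matchFraction(sampleEmployees, (d) => d.country === "Nigeria");
const byEmpId = matchFraction(sampleEmployees, (d) => d.empId === 2);
console.log(byCountry, byEmpId); // 0.75 0.25
```

Here a filter on empId narrows the result far more than a filter on country, so an index on empId does more to limit the documents MongoDB has to examine.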

In conclusion, indexes are an important part of MongoDB. Selecting the right indexing strategy will greatly improve the performance of your MongoDB read operations.

Top comments (1)

Victor Hazbun

Since MongoDB 4.4, hashed indexes can be compound: mongodb.com/docs/manual/core/index...