Introduction
Indexes are used in databases to improve query performance. A database index is like the index at the back of a book, which lets you jump quickly to any part of the book. Without indexes, MongoDB must scan the entire collection to select the documents that match a query.
An index is a special data structure that stores a small portion of a collection's data in an easy-to-traverse form. Indexes are defined at the collection level and can be created on any field of a document. MongoDB creates a default index on the _id field.
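To make the book analogy concrete, here is a minimal sketch in plain JavaScript that contrasts scanning every document with looking a key up in a sorted list. It is not how MongoDB is implemented internally; the function names (collectionScan, buildIndex, indexLookup) and the sample data are made up for illustration.

```javascript
// Hypothetical in-memory "collection"; this is not MongoDB code.
const docs = [
  { empId: 4, empName: "Andrew Young" },
  { empId: 1, empName: "Jamie Jamo" },
  { empId: 3, empName: "Gongo Aso" },
  { empId: 2, empName: "Fletcher Parker" },
];

// Without an index: inspect documents one by one until a match is found.
function collectionScan(collection, empId) {
  let examined = 0;
  for (const doc of collection) {
    examined++;
    if (doc.empId === empId) return { doc, examined };
  }
  return { doc: null, examined };
}

// A toy index: [key, position] pairs sorted by key.
function buildIndex(collection, field) {
  return collection
    .map((doc, pos) => [doc[field], pos])
    .sort((a, b) => a[0] - b[0]);
}

// With an index: binary search over the sorted keys.
function indexLookup(collection, index, key) {
  let lo = 0;
  let hi = index.length - 1;
  let examined = 0;
  while (lo <= hi) {
    const mid = (lo + hi) >> 1;
    examined++;
    if (index[mid][0] === key) {
      return { doc: collection[index[mid][1]], examined };
    }
    if (index[mid][0] < key) lo = mid + 1;
    else hi = mid - 1;
  }
  return { doc: null, examined };
}

const empIdIndex = buildIndex(docs, "empId");
console.log(collectionScan(docs, 2).examined); // 4
console.log(indexLookup(docs, empIdIndex, 2).examined); // 1
```

The gap between the two counts grows with the size of the collection, which is why an index pays off most on large collections.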
Creating an Index on a Collection
An index is created with the createIndex() method of a collection. The general form of the command is:
db.collection.createIndex()
//command for creating an index in the mongo shell
Assume we have an employee collection in our database, filled with the following sample data from a fictional employee database:
db.employee.insertMany([{
empId: 1, empName: "Jamie Jamo", state: "Edo", country: "Nigeria"
}, {
empId: 2, empName: "Fletcher Parker", state: "California", country: "USA"
}, { empId: 3, empName: "Gongo Aso", state: "Lagos", country: "Nigeria"}, {
empId: 4, empName: "Andrew Young", state: "Edo", country: "Nigeria"
}])
An index can be created on the empId field by running the command below
db.employee.createIndex({empId: 1})
The value 1 means the index stores its keys in ascending order; -1 stores them in descending order. An index can also be created on multiple fields, which is known as a compound index.
db.employee.createIndex({empId: 1, empName: 1})
//example of a compound index that contains multiple fields.
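As a rough illustration of how a key pattern such as {field: 1, otherField: -1} defines an ordering, the sketch below sorts plain JavaScript objects by each field in turn, honoring the direction given for it. The helper name compareByKeyPattern and the sample rows are assumptions made for this example, not part of MongoDB.

```javascript
// Build a comparator from an index-style key pattern, e.g. { state: 1, empName: -1 }.
// 1 means ascending, -1 means descending; fields are compared in the order listed.
function compareByKeyPattern(keyPattern) {
  const fields = Object.entries(keyPattern);
  return (a, b) => {
    for (const [field, direction] of fields) {
      if (a[field] < b[field]) return -direction;
      if (a[field] > b[field]) return direction;
    }
    return 0; // equal on every indexed field
  };
}

const rows = [
  { state: "Lagos", empName: "Gongo Aso" },
  { state: "Edo", empName: "Jamie Jamo" },
  { state: "Edo", empName: "Andrew Young" },
];

// Ascending by state, then descending by empName within each state.
const ordered = [...rows].sort(compareByKeyPattern({ state: 1, empName: -1 }));
console.log(ordered.map((r) => r.empName));
// [ 'Jamie Jamo', 'Andrew Young', 'Gongo Aso' ]
```

The order of the fields in the key pattern matters: the second field only breaks ties left by the first.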
The list of indexes in a collection can be retrieved by running the following command
db.employee.getIndexes()
//where employee is the collection whose indexes you want to list
An index can be dropped from a collection by using the dropIndex() method. The index created on our employee collection can be dropped by:
db.employee.dropIndex({empId: 1})
//or, to drop every index in the collection except the default _id index, call dropIndexes() with no arguments
db.employee.dropIndexes()
//in MongoDB 4.2 and later, dropIndex("*") can no longer be used for this; use dropIndexes() instead
Note that the default index on the _id field cannot be dropped.
Types of Indexes
- Single Field Indexes
- Compound Indexes
- Multikey Indexes
- Text Indexes
- Hashed Indexes
- 2dSphere Indexes
//single field indexes
db.employee.createIndex({empId: 1})
//a compound index covers more than one field
db.employee.createIndex({empId: 1, empName: 1})
Multikey Indexes
A multikey index is an index created on a field that holds an array. MongoDB creates an index key for each element of the array.
db.employee.insertOne({
empId: 1, empName: "Jamie Jamo", state: "Edo", country: "Nigeria", skills: ["Javascript", "React", "Meteorjs"]
})
//after inserting this sample document into our fictitious employee collection, we can create a multikey index on the skills field
db.employee.createIndex({skills: 1})
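Because skills holds an array, MongoDB automatically makes this a multikey index. A compound index can also be multikey, but at most one of its indexed fields may hold an array in any given document.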
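The idea of one key per array element can be sketched in plain JavaScript as follows. The function multikeyEntries is a hypothetical helper written for this post, and the sketch ignores details of the real implementation.

```javascript
// For each document, emit one index entry per array element
// (or a single entry when the field is not an array).
function multikeyEntries(docs, field) {
  const entries = [];
  for (const doc of docs) {
    const value = doc[field];
    const keys = Array.isArray(value) ? value : [value];
    for (const key of keys) {
      entries.push({ key, empId: doc.empId });
    }
  }
  // Keep the entries sorted by key, as an ascending index would.
  return entries.sort((a, b) => (a.key < b.key ? -1 : a.key > b.key ? 1 : 0));
}

const staff = [
  { empId: 1, skills: ["Javascript", "React", "Meteorjs"] },
];

console.log(multikeyEntries(staff, "skills").map((e) => e.key));
// [ 'Javascript', 'Meteorjs', 'React' ]
```

A single document therefore produces three entries here, which is why a query on any one skill can find it through the index.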
Text Indexes
A text index enables and optimizes text search queries on string content. It can be created on a field that holds a string or an array of strings.
//sample data
db.post.insertOne({
"post_text" : "This is the new dawn",
"tags" : ["volleyball", "dawn"]
})
//text index can be created on the post_text field by running this command
db.post.createIndex({"post_text": "text"})
//this command creates a text index on the post_text field and makes it searchable.
//The command below performs a search query using the text index, where post is your collection
db.post.find({$text: {$search: "your search term"}})
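For intuition, a text search can be pictured as a lookup in a map from words to the documents that contain them. The sketch below is a simplified model in plain JavaScript: it only lowercases and splits on non-word characters, whereas a real text index also handles things like stemming and stop words. The names buildTextIndex and textSearch are assumptions for this example.

```javascript
// Map each lowercased term to the set of document positions containing it.
function buildTextIndex(docs, field) {
  const index = new Map();
  docs.forEach((doc, pos) => {
    const terms = doc[field].toLowerCase().split(/\W+/).filter(Boolean);
    for (const term of terms) {
      if (!index.has(term)) index.set(term, new Set());
      index.get(term).add(pos);
    }
  });
  return index;
}

// Return documents that contain at least one of the search terms.
function textSearch(docs, index, search) {
  const hits = new Set();
  for (const term of search.toLowerCase().split(/\W+/).filter(Boolean)) {
    for (const pos of index.get(term) || []) hits.add(pos);
  }
  return [...hits].sort((a, b) => a - b).map((pos) => docs[pos]);
}

const posts = [
  { post_text: "This is the new dawn" },
  { post_text: "Volleyball season starts today" },
];
const postIndex = buildTextIndex(posts, "post_text");

console.log(textSearch(posts, postIndex, "dawn volleyball").length); // 2
console.log(textSearch(posts, postIndex, "rain").length); // 0
```

As in this sketch, a search string with several space-separated terms matches documents that contain any of the terms.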
Hashed Indexes
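Indexes normally require extra storage space alongside the collection data. A hashed index stores hashes of the values of the indexed field rather than the values themselves, which can reduce the space an index occupies when the indexed values are large. Hashed indexes support equality matches but not range queries, and they are mainly used for hashed sharding. A hashed index cannot be created on a field that holds an array, so hashed indexes cannot be multikey. Since MongoDB 4.4, a compound index can include a single hashed field.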
db.employee.createIndex({empId: "hashed"})
//this creates a hashed index
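The sketch below shows, in plain JavaScript, why an index keyed by hashes is a good fit for equality lookups: the lookup hashes the search value and jumps straight to the matching bucket. The hash function here is a simple FNV-1a-style toy chosen for the example, not the hash MongoDB uses, and buildHashedIndex and findEqual are hypothetical helpers.

```javascript
// Toy 32-bit hash over the string form of a value (illustration only).
function toyHash(value) {
  let h = 0x811c9dc5;
  for (const ch of String(value)) {
    h ^= ch.codePointAt(0);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h;
}

// Map each hash to the positions of the documents that produced it.
function buildHashedIndex(docs, field) {
  const index = new Map();
  docs.forEach((doc, pos) => {
    const h = toyHash(doc[field]);
    if (!index.has(h)) index.set(h, []);
    index.get(h).push(pos);
  });
  return index;
}

// Equality lookup: hash the value, fetch the bucket, re-check the real value
// to guard against hash collisions.
function findEqual(docs, index, field, value) {
  return (index.get(toyHash(value)) || [])
    .map((pos) => docs[pos])
    .filter((doc) => doc[field] === value);
}

const hashedDocs = [{ empId: 1 }, { empId: 2 }, { empId: 3 }, { empId: 4 }];
const hashedIndex = buildHashedIndex(hashedDocs, "empId");
console.log(findEqual(hashedDocs, hashedIndex, "empId", 3)); // [ { empId: 3 } ]
```

Because hash values are not ordered like the original values, a structure like this cannot answer a range condition such as empId greater than 2 without visiting every entry.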
2dSphere Index
A 2dsphere index supports queries on geospatial data, calculating geometries on an earth-like sphere.
//sample geospatial data
db.schools.insertMany(
[{
name: "University of Uyo",
location: { type: "Point", coordinates: [-73.97, 40.77 ]}
},
{
name: "University of Ilorin",
location: { type: "Point", coordinates: [-73.9375, 40.8303 ]}
},
{
name: "University of Lagos",
location: { type: "Point", coordinates: [-73.9928, 40.7193 ]}
}
])
//to create a 2dsphere index use the following command
db.schools.createIndex({location: "2dsphere"})
//you can query around a GeoJSON point to find places close to it. The following code uses the $near operator to return documents that are at least 500 meters and at most 1500 meters from the specified GeoJSON point, sorted from nearest to farthest.
db.schools.find({location: {$near: {$geometry: {type: "Point", coordinates: [-73.9667, 40.78]}, $minDistance: 500, $maxDistance: 1500 }}})
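To see what a distance-bounded proximity query computes, here is a rough plain JavaScript sketch that filters the sample schools by great-circle distance using the haversine formula. The Earth radius constant and the helper names haversineMeters and near are assumptions of this sketch; MongoDB's own distance calculation may differ slightly.

```javascript
const EARTH_RADIUS_M = 6371008.8; // mean Earth radius in meters (assumed for this sketch)

// Great-circle distance between two [longitude, latitude] pairs, in meters.
function haversineMeters([lon1, lat1], [lon2, lat2]) {
  const toRad = (deg) => (deg * Math.PI) / 180;
  const dLat = toRad(lat2 - lat1);
  const dLon = toRad(lon2 - lon1);
  const a =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(toRad(lat1)) * Math.cos(toRad(lat2)) * Math.sin(dLon / 2) ** 2;
  return 2 * EARTH_RADIUS_M * Math.asin(Math.sqrt(a));
}

// Keep documents whose distance from center lies within [minDistance, maxDistance],
// ordered from nearest to farthest.
function near(docs, center, minDistance, maxDistance) {
  return docs
    .map((doc) => ({ doc, dist: haversineMeters(center, doc.location.coordinates) }))
    .filter(({ dist }) => dist >= minDistance && dist <= maxDistance)
    .sort((a, b) => a.dist - b.dist)
    .map(({ doc }) => doc);
}

const schools = [
  { name: "University of Uyo", location: { type: "Point", coordinates: [-73.97, 40.77] } },
  { name: "University of Ilorin", location: { type: "Point", coordinates: [-73.9375, 40.8303] } },
  { name: "University of Lagos", location: { type: "Point", coordinates: [-73.9928, 40.7193] } },
];

console.log(near(schools, [-73.9667, 40.78], 500, 1500).map((s) => s.name));
// [ 'University of Uyo' ]
```

With the sample coordinates, only the first school falls inside the 500 to 1500 meter band; the other two are several kilometers away.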
Properties of Indexes
Index properties define certain behaviors of an index at runtime. Index properties include:
- TTL indexes
- Unique indexes
- Sparse indexes
- Partial indexes
TTL Indexes
Time to live (TTL) indexes are special single-field indexes that MongoDB uses to remove documents from a collection automatically after a period of time. This is useful for data that expires, such as event logs, inactive users and API keys. Instead of writing a cron job to delete this data, you can delegate the work to MongoDB by declaring a TTL index.
db.eventLogs.createIndex({createdAt: 1}, {expireAfterSeconds: 3600})
//this example creates an index on the createdAt field in the eventLogs collection and makes each document expire 3600 seconds after its createdAt value.
A TTL index can only expire documents whose indexed field holds a BSON Date or an array of BSON Dates. If the indexed field in a document is neither a date nor an array of dates, or if the document does not contain the indexed field at all, the document will not expire and will not be deleted by MongoDB.
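The expiry rule can be sketched as a small predicate in plain JavaScript. The function isExpired is a hypothetical helper written for this post; it treats an array of dates by taking the earliest one, and it returns false when the field is missing or holds no dates.

```javascript
// Decide whether a document would be considered expired at time `now`.
function isExpired(doc, field, expireAfterSeconds, now) {
  const value = doc[field];
  const dates = (Array.isArray(value) ? value : [value]).filter(
    (v) => v instanceof Date
  );
  if (dates.length === 0) return false; // missing field or non-date value: never expires
  const earliest = Math.min(...dates.map((d) => d.getTime()));
  return earliest + expireAfterSeconds * 1000 <= now.getTime();
}

const ttlNow = new Date("2024-01-01T02:00:00Z");

console.log(isExpired({ createdAt: new Date("2024-01-01T00:30:00Z") }, "createdAt", 3600, ttlNow)); // true
console.log(isExpired({ createdAt: new Date("2024-01-01T01:30:00Z") }, "createdAt", 3600, ttlNow)); // false
console.log(isExpired({ createdAt: "2024-01-01" }, "createdAt", 3600, ttlNow)); // false
console.log(isExpired({}, "createdAt", 3600, ttlNow)); // false
```

Keep in mind that MongoDB removes expired documents with a background task that runs periodically, so a document may remain in the collection for a short while after it becomes eligible.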
Unique Indexes
A unique index is a constraint that ensures no two documents hold the same value for the indexed field. MongoDB creates a unique index on the _id field by default.
db.employee.createIndex({empName: 1}, {unique: true})
//this example creates a unique index on the empName field in the employee collection, so no two documents can have the same empName value.
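The effect of the constraint can be modeled with a set of already-seen keys, as in this plain JavaScript sketch. ToyUniqueIndex is a hypothetical class for illustration; in MongoDB itself, an insert that violates a unique index fails with a duplicate key error.

```javascript
// A toy unique index: remembers every key and rejects repeats.
class ToyUniqueIndex {
  constructor(field) {
    this.field = field;
    this.keys = new Set();
  }

  insert(doc) {
    const key = doc[this.field];
    if (this.keys.has(key)) {
      throw new Error(`duplicate key: ${this.field} = ${JSON.stringify(key)}`);
    }
    this.keys.add(key);
  }
}

const nameIndex = new ToyUniqueIndex("empName");
nameIndex.insert({ empName: "Jamie Jamo" });

try {
  nameIndex.insert({ empName: "Jamie Jamo" }); // same value again
} catch (err) {
  console.log(err.message); // duplicate key: empName = "Jamie Jamo"
}
```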
Partial indexes
A partial index only indexes the documents that match a given filter condition. The filter can use a limited set of operators, including equality expressions, $exists: true, $gt, $gte, $lt, $lte, $type and $and. Partial indexes reduce storage requirements and the performance cost of creating and maintaining the index, because they store only a subset of the documents.
//partial indexes can be created by the following command
db.employee.createIndex({"age": 1}, {partialFilterExpression: {age: {$gt: 34}}})
//the partialFilterExpression option accepts a document that specifies the filter condition.
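As a simplified model of what ends up inside a partial index, the plain JavaScript sketch below keeps only the documents that pass a filter before building sorted entries. The helper buildPartialIndex and the sample people are assumptions for this example.

```javascript
// Index only the documents that satisfy the filter.
function buildPartialIndex(docs, field, filter) {
  return docs
    .filter(filter)
    .map((doc) => ({ key: doc[field], empId: doc.empId }))
    .sort((a, b) => a.key - b.key);
}

const people = [
  { empId: 1, age: 28 },
  { empId: 2, age: 41 },
  { empId: 3, age: 35 },
  { empId: 4 }, // no age field, so it cannot match age > 34
];

// Mirrors a filter of age greater than 34.
const ageIndex = buildPartialIndex(people, "age", (doc) => doc.age > 34);
console.log(ageIndex.map((e) => e.empId)); // [ 3, 2 ]
```

One consequence is that a query can only rely on such an index when its own condition is guaranteed to fall inside the filter. A query for age greater than 40 fits, while a query for age greater than 30 does not, because matching documents with ages from 31 to 34 are not in the index.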
We can combine the TTL and partial index properties to remove only the documents that match a certain filter.
db.eventLogs.createIndex({createdAt: 1}, {expireAfterSeconds: 3600, partialFilterExpression: {state: "Debug"}})
//with this index, MongoDB will delete any document whose state equals 'Debug' once its createdAt value is more than an hour old
Sparse index
A sparse index stores entries only for documents that have the indexed field, even if the field holds a null value. It skips any document that does not have the indexed field.
db.employee.createIndex({age: 1}, {sparse: true})
Consider the following person collection:
db.person.insert({personName: "john", age: 34 })
db.person.insert({personName: "Abu", age: 67 })
db.person.insert({personName: "favour", hobbies: ["running", "dancing"] })
Creating a sparse index on the age field of this collection indexes only the first two documents, which contain the age field. Documents stored in the sparse index can be retrieved by using hint() to specify the sparse index.
//creating the sparse index on the person collection and retrieving documents through it
db.person.createIndex({age: 1}, {sparse: true})
db.person.find().hint({age: 1})
//hint() forces the query to use the sparse index, so only the two documents that contain the age field are returned
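The same behavior can be mimicked in plain JavaScript by indexing only the documents in which the field is present. The helper buildSparseIndex is hypothetical and exists only to illustrate the idea.

```javascript
// Index a document only when the field exists on it (a null value still counts).
function buildSparseIndex(docs, field) {
  return docs
    .filter((doc) => field in doc)
    .map((doc) => ({ key: doc[field], personName: doc.personName }));
}

const persons = [
  { personName: "john", age: 34 },
  { personName: "Abu", age: 67 },
  { personName: "favour", hobbies: ["running", "dancing"] },
];

const sparseAge = buildSparseIndex(persons, "age");
console.log(sparseAge.map((e) => e.personName)); // [ 'john', 'Abu' ]
```

Reading the collection purely through such an index can never return the third document, which is why forcing the index changes the number of results.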
Strategy For Choosing Indexes
Choosing the right strategy matters when creating indexes for collections in MongoDB. An efficient index improves the speed of database queries. The best indexing strategy depends on several factors, including:
- the type of queries being executed
- the number of read and write operations
- the amount of available memory
Different indexing strategies
- create indexes to support your queries
- use indexes to sort query results
- ensure indexes fit in RAM, so that recently used values are served from memory
- create queries that ensure selectivity, i.e. queries that limit the number of possible documents
Create an index to support your queries
Create a single field index if all your queries use that single key to retrieve documents.
db.student.createIndex({studentId: 1})
//create a compound index if your queries use more than one field to retrieve documents.
db.student.createIndex({studentId: 1, studentName: 1})
Use an index to sort query result
Sort operations can use indexes for better performance. When an index matches the sort, MongoDB returns the documents in the order of the index instead of sorting them in memory.
Sorting can be done in the following ways:
- sorting with a single field index
- sorting on multiple fields
To support sorting on a single field, create the index on the key used to sort the documents:
db.student.createIndex({studentId: 1})
//this index can support both ascending and descending sorts on studentId
//if queries sort on more than one field, a compound index can be created on those fields to support the sort
db.student.createIndex({studentId: 1, studentName: 1})
//the index above supports the following sort efficiently
db.student.find().sort({studentId: 1, studentName: 1})
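Whether a given sort can be served by a compound index depends on the order and direction of its keys. The plain JavaScript sketch below encodes a simplified version of that rule: the sort fields must form a prefix of the index key pattern, and the directions must either all match the index or all be reversed. The function canUseIndexForSort is a hypothetical helper, and it ignores the case where earlier index fields are fixed by equality conditions in the query.

```javascript
// Simplified check: can a sort specification be satisfied by walking the index
// forwards or backwards?
function canUseIndexForSort(indexPattern, sortSpec) {
  const indexKeys = Object.entries(indexPattern);
  const sortKeys = Object.entries(sortSpec);
  if (sortKeys.length === 0 || sortKeys.length > indexKeys.length) return false;

  // The sort fields must be a prefix of the index fields, in the same order.
  const sameFields = sortKeys.every(([field], i) => indexKeys[i][0] === field);
  if (!sameFields) return false;

  // Directions must all match the index, or all be the exact opposite.
  const forward = sortKeys.every(([, dir], i) => dir === indexKeys[i][1]);
  const backward = sortKeys.every(([, dir], i) => dir === -indexKeys[i][1]);
  return forward || backward;
}

const studentIndex = { studentId: 1, studentName: 1 };

console.log(canUseIndexForSort(studentIndex, { studentId: 1, studentName: 1 })); // true
console.log(canUseIndexForSort(studentIndex, { studentId: -1, studentName: -1 })); // true
console.log(canUseIndexForSort(studentIndex, { studentId: 1, studentName: -1 })); // false
console.log(canUseIndexForSort(studentIndex, { studentName: 1 })); // false
```

The mixed-direction case fails because neither a forward nor a backward walk of the index yields that order.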
Index to hold recent value in RAM (memory)
An effective index is one that fits completely into memory. We can use the totalIndexSize() method to find the total size of a collection's indexes:
db.student.totalIndexSize()
//this returns the total size, in bytes, of all the indexes on the student collection. When an index fits in RAM, MongoDB can read it from memory, which is faster than reading from disk
Create queries that ensure selectivity
Selectivity is the ability of a query to narrow down the results using an index. Writing queries that limit the number of possible documents through the indexed field greatly improves query performance.
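One rough way to compare the selectivity of two conditions is to measure the fraction of documents each one matches; the smaller the fraction, the more selective the condition. The plain JavaScript sketch below does this on made-up data. The helper name matchedFraction and the status field are assumptions for the example.

```javascript
// Fraction of documents that satisfy a predicate (lower means more selective).
function matchedFraction(docs, predicate) {
  return docs.filter(predicate).length / docs.length;
}

const students = [
  { studentId: 1, status: "active" },
  { studentId: 2, status: "active" },
  { studentId: 3, status: "active" },
  { studentId: 4, status: "inactive" },
];

console.log(matchedFraction(students, (s) => s.status === "active")); // 0.75
console.log(matchedFraction(students, (s) => s.studentId === 2)); // 0.25
```

In this sample, a condition on studentId narrows the candidates far more than a condition on status, so an index on studentId does more work for the query.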
In conclusion, indexes are an important part of MongoDB. Selecting the right indexing strategy will greatly improve your MongoDB read operations.