DEV Community

Lucas Rivelles

Managing indexes in Elasticsearch

Figure: Documents being stored

In Elasticsearch, an index is roughly the equivalent of a table in a relational database: it stores related documents in JSON format. We'll see how to create an index from scratch and perform some basic operations on it in order to explore its features.

For this, we are going to use the trial version of Elastic Cloud and its Kibana interface. By default, it uses a cluster with two nodes.

Creating an index

In order to create a simple index, we just need to send a PUT request, for example:


PUT /users

It will create our index named users and return the following confirmation:


{
"acknowledged" : true,
"shards_acknowledged" : true,
"index" : "users"
}
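
Outside of Kibana, each console line is just an HTTP request. As a rough sketch, the Python snippet below builds the same PUT /users request with the standard library. The helper name build_console_request and the base URL http://localhost:9200 are assumptions made for illustration; an Elastic Cloud deployment has its own endpoint and requires credentials, which are not shown here.

```python
import json
import urllib.request


def build_console_request(base_url, method, path, body=None):
    """Build (but do not send) the HTTP request behind a Kibana console line."""
    data = None
    headers = {}
    if body is not None:
        # Console bodies are JSON, so serialize and label them accordingly.
        data = json.dumps(body).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(
        base_url.rstrip("/") + path, data=data, headers=headers, method=method
    )


# Hypothetical local endpoint; replace with your own URL and add authentication.
request = build_console_request("http://localhost:9200", "PUT", "/users")
print(request.get_method(), request.full_url)
# To actually send it: urllib.request.urlopen(request)
```

Building and sending are kept separate so the request can be inspected before it touches a cluster.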

By default, Elasticsearch will create an index with 1 shard and 1 replica. We can see this by making the following request:


GET /_cat/shards/users?v

The response gives us the details: we have an index with one shard, whose primary is running at 10.42.11.185 and whose replica is running at 10.42.1.147.


index shard prirep state   docs store ip           node
users 0     r      STARTED 0    225b  10.42.1.147  instance-0000000001
users 0     p      STARTED 0    225b  10.42.11.185 instance-0000000000
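
If we fetch this output from a script rather than reading it in Kibana, it helps to turn it into structured data. The sketch below is one simple way to do that, assuming whitespace-separated columns with no blank values in the middle of a row; parse_cat_shards is a name made up for this example. The cat APIs also accept format=json, which avoids text parsing altogether.

```python
def parse_cat_shards(text):
    """Turn `_cat/shards?v` text output into a list of dicts, one per shard copy."""
    lines = [line for line in text.splitlines() if line.strip()]
    header = lines[0].split()
    # zip() stops at the shorter sequence, so rows with missing trailing
    # columns (for example unassigned copies) simply get fewer keys.
    return [dict(zip(header, line.split())) for line in lines[1:]]


sample = """\
index shard prirep state   docs store ip           node
users 0     r      STARTED 0    225b  10.42.1.147  instance-0000000001
users 0     p      STARTED 0    225b  10.42.11.185 instance-0000000000
"""

for row in parse_cat_shards(sample):
    role = "primary" if row["prirep"] == "p" else "replica"
    print(row["index"], row["shard"], role, row["node"])
```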

If we want to specify the number of shards and replicas when creating our index, we can do so by passing the settings in the request body:


PUT /products
{
"settings": {
"number_of_shards": 2,
"number_of_replicas": 2
}
}

It will create an index called products with 2 shards, and each shard will have 2 replicas. Since we are running in a cluster with two nodes, it will result in this architecture:
Figure: Architecture of an index with 2 shards and 2 replicas
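Since Elasticsearch never places two copies of the same shard on the same node, each node can hold only one copy (the primary or one replica) of each shard. With only two nodes, we end up with 2 unassigned replica shards. We can see it with our own eyes by calling GET /_cat/shards/products?v.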


index    shard prirep state      docs store ip           node
products 0     r      STARTED    0    225b  10.42.1.147  instance-0000000001
products 0     p      STARTED    0    225b  10.42.11.185 instance-0000000000
products 0     r      UNASSIGNED
products 1     p      STARTED    0    225b  10.42.1.147  instance-0000000001
products 1     r      STARTED    0    225b  10.42.11.185 instance-0000000000
products 1     r      UNASSIGNED
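
The arithmetic behind those two UNASSIGNED rows can be sketched in a few lines of Python. This is a deliberately simplified model, not how Elasticsearch actually allocates shards: it applies only the rule that two copies of the same shard never share a node, and the function name unassigned_copies is made up for this example.

```python
def unassigned_copies(primaries, replicas, data_nodes):
    """Count shard copies that cannot be placed on a cluster of `data_nodes` nodes.

    Simplified model: the only rule applied is that two copies of the same
    shard never share a node. Disk watermarks, allocation filters, and
    other allocation rules are ignored.
    """
    copies_per_shard = 1 + replicas  # the primary plus its replicas
    placeable_per_shard = min(copies_per_shard, data_nodes)
    return primaries * (copies_per_shard - placeable_per_shard)


print(unassigned_copies(primaries=2, replicas=2, data_nodes=2))  # 2
print(unassigned_copies(primaries=2, replicas=1, data_nodes=2))  # 0
```

With 2 replicas each shard needs 3 nodes, so one copy per shard is left over; with 1 replica everything fits on two nodes.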

Updating properties

As we saw in the previous example, we ended up with unassigned replicas in our index. Let's fix this by updating the number of replicas.


PUT /products/_settings
{
"settings": {
"number_of_replicas": 1
}
}

Now, let's see if it worked by calling GET /_cat/shards/products?v again.


index    shard prirep state   docs store ip           node
products 0     r      STARTED 0    225b  10.42.1.147  instance-0000000001
products 0     p      STARTED 0    225b  10.42.11.185 instance-0000000000
products 1     p      STARTED 0    225b  10.42.1.147  instance-0000000001
products 1     r      STARTED 0    225b  10.42.11.185 instance-0000000000

Yes! It worked! :)

Now, let's take another look at our first example. We created a 'users' index with default values, i.e., 1 shard and 1 replica. Let's try to increase its number of shards to match the 'products' index.


PUT /users/_settings
{
"settings": {
"number_of_shards": 2
}
}

Uh oh! It returns the following error message:

"Can't update non dynamic settings [[index.number_of_shards]] for open indices [[users/78H88glwR0WkN0Z-92naSA]]"

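Updating the number of shards is a special operation: as the error says, it is not a dynamic setting, so it can't simply be changed on an existing index. It wasn't possible to change it at all before version 5.0 (to decrease) and version 6.1 (to increase). Now we can do it using the shrink and split operations, which create a new index with the desired number of shards.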

The split operation

This operation is used to increase the number of shards of an index. Before splitting an index, we need to:

  • Make sure that the index is read-only
  • Make sure that the cluster is in green status (healthy)

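It's also important to remember that the number of primary shards in the new index must be a multiple of the number in the original index, and it is further limited by the index.number_of_routing_shards setting. So, if we have 2 primary shards, we might split to 4 or 8 shards, and so on. With the default routing setting, splits go by factors of 2, so a target such as 6 would be rejected.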
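
As a quick sanity check before calling the split API, the arithmetic constraints can be sketched in Python. This is only an illustration of the documented rules as I understand them: the target shard count must be a larger multiple of the source count, and index.number_of_routing_shards must in turn be a multiple of the target. The function name can_split and the value 1024 used below are assumptions for this example; check the real setting on your index.

```python
def can_split(source_shards, target_shards, routing_shards):
    """Check the arithmetic constraints on a split target shard count."""
    if target_shards <= source_shards:
        return False  # a split must increase the number of primary shards
    if target_shards % source_shards != 0:
        return False  # the target must be a multiple of the source count
    # The routing-shard count must be a multiple of the target count.
    return routing_shards % target_shards == 0


# 1024 is an assumed routing-shard count used for illustration only.
for target in (3, 4, 6, 8):
    print(target, can_split(2, target, routing_shards=1024))
```

Under these assumptions, 4 and 8 pass while 3 and 6 do not. A different routing-shard count changes the answer: with 30 routing shards, a 5-shard index could go to 10, 15, or 30.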

First, let's make the index read-only:


PUT /users/_settings
{
"settings": {
"index.blocks.write": true
}
}

Then, we can split the index into a new one with more shards:


POST /users/_split/users_v2
{
"settings": {
"index.number_of_shards": 2
}
}

After this operation, Elasticsearch will create a new index called users_v2 with 2 shards. If we don't want our clients to depend on the name of the newly created index, we can use the aliases feature and have them query an alias instead.
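
To give an idea of what using an alias involves, here is a hedged Python sketch that builds the JSON body for a POST /_aliases request. The alias name users_current and the helper alias_swap_body are hypothetical, chosen only for this example; the add and remove actions are the ones the aliases API accepts.

```python
import json


def alias_swap_body(alias, new_index, old_index=None):
    """Build a body for POST /_aliases that points `alias` at `new_index`."""
    actions = []
    if old_index is not None:
        # Detach the alias from the index it used to point to.
        actions.append({"remove": {"index": old_index, "alias": alias}})
    actions.append({"add": {"index": new_index, "alias": alias}})
    return {"actions": actions}


# "users_current" is a hypothetical alias name chosen for this example.
print(json.dumps(alias_swap_body("users_current", "users_v2"), indent=2))
```

Clients that query users_current keep working when the alias is later moved to another index, because all actions in a single request are applied together.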

Deleting an index

Let's remove the old users index. It's as simple as it looks:


DELETE /users
