9.7. Aliases

Now let’s talk about one of the easiest and potentially most useful features of Elasticsearch: aliases. Aliases are exactly what they sound like; they’re a pointer or a name you can use that corresponds to one or more concrete indices. This turns out to be quite useful because of the flexibility it provides when scaling your cluster and managing how data is laid out across your indices. Even when using an Elasticsearch cluster with only a single index, use an alias. You’ll thank us later for the flexibility it will give you.

9.7.1. What is an alias, really?

You may be wondering what an alias is exactly and what kind of overhead Elasticsearch incurs in creating one. An alias spends its life inside the cluster state, managed by the master node. If you have an alias called idaho that points to an index named potatoes, the overhead is an extra key in the cluster state map that maps the name idaho to the concrete index potatoes. Compared to additional indices, aliases are therefore much lighter; thousands of them can be maintained without negatively impacting your cluster. That said, we would caution against creating hundreds of thousands or millions of aliases because at that point, even the minimal overhead of a single entry in a map can cause the cluster state to grow to a large size. Operations that create a new cluster state will then take longer because the entire cluster state is sent to each node every time it changes.

Why are aliases useful?

We recommend that everyone use an alias for their Elasticsearch indices because it gives you a lot more flexibility in the future when it comes to re-indexing. Let’s say that you start off by creating an index with a single primary shard and then later decide that you need more capacity. If your applications have been searching through an alias from the beginning, you can create a new index, re-index into it, and then change the alias to point to the new index without changing the name your applications search against.

Another use is creating windows into different indices. For example, if you create daily indices for your data, you may want a sliding window of the last week’s data. You can get one by creating an alias called last-7-days; then every day when you create a new daily index, you add it to the alias while simultaneously removing the eight-day-old index.
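The daily rotation described above can be sketched as a single request to the aliases endpoint covered later in this section. The index names (logs-2015.06.08 and logs-2015.06.01) and the localhost:9200 address are assumptions made for illustration only, and the new daily index is assumed to exist already.

```shell
# Assumptions: Elasticsearch listens on localhost:9200 and daily indices are
# named logs-YYYY.MM.DD (hypothetical naming scheme).
ES_URL="http://localhost:9200"
NEW_INDEX="logs-2015.06.08"   # today's index, assumed to be created already
OLD_INDEX="logs-2015.06.01"   # the index that has just left the seven-day window

# Build one actions body so the add and the remove are sent together.
ACTIONS=$(printf '{"actions":[{"add":{"index":"%s","alias":"last-7-days"}},{"remove":{"index":"%s","alias":"last-7-days"}}]}' "$NEW_INDEX" "$OLD_INDEX")

# The Content-Type header is required by newer Elasticsearch versions.
curl -s -XPOST "$ES_URL/_aliases" -H 'Content-Type: application/json' -d "$ACTIONS" \
  || echo "request failed (is Elasticsearch running at $ES_URL?)" >&2
```

Run from a daily scheduled job, a call like this keeps last-7-days pointing at exactly seven indices, provided the two index names are computed from the current date.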

Managing aliases

Aliases are created using the dedicated aliases API endpoint and a list of actions. Each action is a map keyed by either add or remove, whose value names the index and the alias to which the operation applies. This will be much clearer with the example shown in the next listing.

Listing 9.7. Adding and removing aliases
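A minimal sketch of such a compound call follows. It assumes Elasticsearch is reachable at localhost:9200, as in the other curl calls in this section, and that both indices exist.

```shell
# Assumption: Elasticsearch listens on localhost:9200.
ES_URL="http://localhost:9200"

# One add action and one remove action, sent in a single request.
ACTIONS='{
  "actions": [
    { "add":    { "index": "get-together",     "alias": "gt-alias" } },
    { "remove": { "index": "old-get-together", "alias": "gt-alias" } }
  ]
}'

# The Content-Type header is required by newer Elasticsearch versions.
curl -s -XPOST "$ES_URL/_aliases" -H 'Content-Type: application/json' -d "$ACTIONS" \
  || echo "request failed (is Elasticsearch running at $ES_URL?)" >&2
```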

In this listing the get-together index is being added to an alias named gt-alias, and the made-up index old-get-together is being removed from the alias gt-alias. The act of adding an index to an alias creates the alias, and removing all indices that an alias points to removes it; there’s no manual alias creation and deletion. The alias operations will fail if the index doesn’t exist, so keep that in mind. You can specify as many add and remove actions as you like. These actions all occur atomically, which means that in the previous example there’ll be no moment in which the gt-alias alias points to both the get-together and old-get-together indices.

Although the compound Alias API call we just discussed may suit your needs, individual actions can also be performed on the Alias API, using the common HTTP methods that Elasticsearch has standardized on. For instance, the following series of calls would make the same changes as the compound actions call shown previously, although as two separate requests they are not applied atomically:

curl -XPUT 'http://localhost:9200/get-together/_alias/gt-alias'
curl -XDELETE 'http://localhost:9200/old-get-together/_alias/gt-alias'

While we’re exploring single-call API methods, this section wouldn’t be complete without covering the API in more detail, specifically the endpoints that come in handy for creating and listing aliases.

9.7.2. Alias creation

When creating aliases, there are many options available via the API endpoint. For instance, you can create aliases on a specific index, many indices, or a pattern that matches index names:
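The three cases can be sketched as follows, assuming the {index}/_alias/{alias} path format that this section also uses for deletion. The alias_url helper and the get-* pattern are hypothetical names introduced only for this example.

```shell
# Assumption: Elasticsearch listens on localhost:9200.
ES_URL="http://localhost:9200"

# Hypothetical helper: builds the {index}/_alias/{alias} URL.
alias_url() {
  printf '%s/%s/_alias/%s' "$ES_URL" "$1" "$2"
}

# A single index, a comma-separated list of indices, and a wildcard pattern.
for INDEX in 'get-together' 'get-together,old-get-together' 'get-*'; do
  curl -s -XPUT "$(alias_url "$INDEX" gt-alias)" \
    || echo "request for $INDEX failed (is Elasticsearch running?)" >&2
done
```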

Alias deletion accepts the same path parameter format:

curl -XDELETE 'localhost:9200/{index}/_alias/{alias}'

You can retrieve all of the aliases that point to a concrete index by issuing a GET request on the index with _alias, or you can retrieve all indices and the aliases that point to them by leaving out the index name. Retrieving the aliases for an index is shown in the next listing.

Listing 9.8. Retrieving the aliases pointing to a specific index
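A minimal sketch of both requests, assuming Elasticsearch at localhost:9200 and the get-together index used throughout this section:

```shell
# Assumption: Elasticsearch listens on localhost:9200.
ES_URL="http://localhost:9200"
INDEX="get-together"

# Aliases that point to one concrete index.
ONE_INDEX_URL="$ES_URL/$INDEX/_alias?pretty"
curl -s "$ONE_INDEX_URL" \
  || echo "request failed (is Elasticsearch running at $ES_URL?)" >&2

# Leaving out the index name returns every index with its aliases.
ALL_INDICES_URL="$ES_URL/_alias?pretty"
curl -s "$ALL_INDICES_URL" \
  || echo "request failed (is Elasticsearch running at $ES_URL?)" >&2
```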

In addition to the _alias endpoint on an index, you have a number of different ways to get the alias information from an index:
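A few such lookups can be sketched as follows. The address localhost:9200 and the gt-* pattern are assumptions for illustration; the variable names are hypothetical.

```shell
# Assumption: Elasticsearch listens on localhost:9200.
ES_URL="http://localhost:9200"

# A named alias on a specific index.
BY_INDEX_AND_NAME="$ES_URL/get-together/_alias/gt-alias?pretty"
curl -s "$BY_INDEX_AND_NAME" || echo "request failed" >&2

# Every alias whose name matches a wildcard pattern, across all indices.
BY_PATTERN="$ES_URL/_alias/gt-*?pretty"
curl -s "$BY_PATTERN" || echo "request failed" >&2

# Existence check with a HEAD request: prints only the HTTP status code
# (200 if the alias exists, 404 if it does not).
EXISTS_URL="$ES_URL/_alias/gt-alias"
curl -s -o /dev/null -w '%{http_code}\n' --head "$EXISTS_URL" \
  || echo "request failed" >&2
```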

Masking documents with alias filters

Aliases have some other neat features as well; they can be used to automatically apply a filter to queries that are executed. For example, with your get-together data it could be useful to have an alias that points only to the groups that contain the elasticsearch tag, so you can create an alias that does this filtering automatically, as shown in the following listing.

Listing 9.9. Creating a filtered alias
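A minimal sketch of creating such an alias and comparing it with the underlying index follows. The field name tags is an assumption about how the get-together documents store their tags, and the address localhost:9200 is assumed as elsewhere in this section; the results you see depend on your data.

```shell
# Assumption: Elasticsearch listens on localhost:9200.
ES_URL="http://localhost:9200"

# Add get-together to the es-groups alias with a term filter attached.
# Assumption: tags are stored in a field named "tags".
FILTERED='{
  "actions": [
    { "add": {
        "index":  "get-together",
        "alias":  "es-groups",
        "filter": { "term": { "tags": "elasticsearch" } }
    } }
  ]
}'

# The Content-Type header is required by newer Elasticsearch versions.
curl -s -XPOST "$ES_URL/_aliases" -H 'Content-Type: application/json' -d "$FILTERED" \
  || echo "request failed (is Elasticsearch running at $ES_URL?)" >&2

# Search the index directly, then through the filtered alias, and compare the hits.
curl -s "$ES_URL/get-together/_search?pretty" || echo "request failed" >&2
curl -s "$ES_URL/es-groups/_search?pretty" || echo "request failed" >&2
```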

Here you can see that the es-groups alias contains only two groups instead of five. This is because it’s automatically applying the term filter for groups that contain the tag elasticsearch. This has a lot of applications; if you’re indexing sensitive data, for instance, you can create a filtered alias to ensure that anyone using that alias can’t see data they’re not meant to see.

There’s one more feature that aliases can provide, routing, but before we talk about using it with an alias, we’ll talk about using it in general.