Chapter 9. Scaling out

This chapter covers

Adding nodes to your Elasticsearch cluster

Master election in your Elasticsearch cluster

Removing and decommissioning nodes

Using the _cat API to understand your cluster

Planning and scaling strategies

Aliases and custom routing

Now that you have a good understanding of what Elasticsearch is capable of, you’re ready to hear about its next killer feature: the ability to scale, that is, to handle more indexing and searching, or to handle the same indexing and searching faster. These days, scaling is an important factor when dealing with millions or billions of documents. A single running instance of Elasticsearch, or node, won’t always be able to support the amount of traffic you’d like it to, so at some point you’ll need to scale in some form.

Fortunately, Elasticsearch is easy to scale. In this chapter we’ll look at Elasticsearch’s scaling capabilities and how you can use them to get more performance and, at the same time, more reliability.

Having already seen how Elasticsearch handles the get-together data we introduced in chapters 2 and 3, we’re now ready to talk about how to scale your search system to handle all the traffic you can throw at it. Imagine you’re sitting in your office, and in comes your boss to announce that your site has been featured in Wired magazine as the hot new site everyone should use for booking social get-togethers. Your job: make sure Elasticsearch can handle the influx of new groups and events, as well as all the new searches expected to hit the site once that Wired article gets published! You have 24 hours. How are you going to scale your Elasticsearch server to handle this traffic in that time frame? Thankfully, Elasticsearch makes scaling a breeze: you add nodes to your existing Elasticsearch cluster.