Roadmap

Elasticsearch in Action is divided into two parts: “Core functionality” and “Advanced functionality.” We recommend reading the chapters in order, because the functionality discussed in one chapter often depends on concepts presented in previous chapters. Each chapter contains code listings and snippets you can follow if you prefer a hands-on approach, but you don’t need a laptop at hand to learn the concepts and how Elasticsearch works.

The first part explains the core features—how to model and index data so you can search and analyze it as your use case requires. By the end of it, you’ll understand the building blocks of Elasticsearch functionality:

Chapter 1 gives an overview of what a search engine does in general and Elasticsearch’s features in particular. By the end of it you should know what kinds of problems you can solve with Elasticsearch.

Chapter 2 gives you a first hands-on look at the major functionality: indexing documents, searching them, analyzing data via aggregations, and scaling out to multiple nodes.

Chapter 3 covers the options you have while indexing, updating, and deleting your data. You’ll learn what kinds of fields you can have in your documents, as well as what happens when you write them.

In chapter 4 you’ll dive deeper into the realm of full-text search. You’ll discover the important types of queries and filters and learn how they work and when to use which.

Chapter 5 explains how analysis breaks down the text from both documents and queries into the tokens used for searching. You’ll learn how to use different kinds of analyzers, as well as how to build your own, in order to make full use of Elasticsearch’s full-text search potential.

Chapter 6 helps you complete your full-text search skills by focusing on relevancy. You’ll learn about the factors affecting a document’s score and how to manipulate them using different scoring algorithms, boosting a particular query or field, or using values from the document itself (such as the number of likes or retweets) to boost the score.

Chapter 7 shows how to use aggregations to perform real-time analytics. You’ll learn how to couple aggregations with queries and how to nest them in order to find the number of needles in the haystack . . . dropped by someone from Poland . . . two years ago.

Chapter 8 deals with relational data, like bands and their albums. You’ll learn how to use Elasticsearch features—such as nested documents and parent-child relationships—as well as general NoSQL techniques (such as denormalizing or application-side joins) to index and search data that isn’t flat.

The second part helps you get the core functionality out to production. In doing so, you’ll learn more about how each feature works, as well as its impact on performance and scalability:

Chapter 9 deals with scaling out to multiple nodes. You’ll learn how to shard and replicate your indices—for example, by oversharding or using time-based indices—so that today’s design can cope with next year’s data.

In chapter 10 you’ll find tricks that will help you squeeze more performance out of your cluster. Along the way, you’ll learn how Elasticsearch uses caches and writes data to disk, as well as various trade-offs you can make to tweak Elasticsearch for your use case.

Chapter 11 shows how to monitor and administer your cluster in production. We’ll cover the important metrics you should watch, how to back up and restore your data, and how to use shortcuts such as index templates and aliases.

The book’s six appendixes cover features you should know about, though they may not be relevant to every use case. We hope that the term “appendix” doesn’t mislead you into thinking we cover these features superficially. As with the rest of the book, we’ll dive into the details of how each feature works under the hood:

Appendix A is about geospatial search and aggregations.

Appendix B shows how to manage Elasticsearch plugins.

In appendix C you’ll learn about highlighting query terms in your search results.

Appendix D introduces third-party monitoring tools that you may want to use in production to help you manage Elasticsearch.

Appendix E explains how to use the Percolator in order to match a few documents against many queries.

Finally, appendix F explains how to use different suggesters in order to implement did-you-mean and autocomplete functionality.