4.3. Combining queries or compound queries

After learning about and using different types of queries, you’ll likely find yourself needing to combine query types; this is where Elasticsearch’s bool query comes in.

4.3.1. bool query

The bool query allows you to combine any number of queries into a single query by grouping them into clauses that state which parts must, should, or must_not match the data in your Elasticsearch index:

If you specify that part of a bool query must match, only results matching that query (or queries) are returned.

Specifying that a part of a query should match means that a specified number of the should clauses (set with minimum_should_match) must match for a document to be returned.

If no must clauses are specified, at least one should clause has to match for the document to be returned.

Finally, the must_not clause causes matching documents to be excluded from the result set.
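The rules above can be sketched in a few lines of Python. This is only a toy model, not Elasticsearch code: a document is reduced to a set of terms, each clause is a list of terms, and the function name bool_matches is hypothetical. The default for minimum_should_match follows the rule stated in this section (1 when there are should clauses but no must clause, 0 otherwise).

```python
def bool_matches(doc_terms, must=(), should=(), must_not=(),
                 minimum_should_match=None):
    """Toy model of bool semantics; a document is just a set of terms."""
    doc_terms = set(doc_terms)

    if minimum_should_match is None:
        # Assumed default, as described in the text: with no must clause,
        # at least one should clause has to match; otherwise none has to.
        minimum_should_match = 1 if (should and not must) else 0

    # Every must clause has to match.
    if any(term not in doc_terms for term in must):
        return False

    # Any matching must_not clause excludes the document.
    if any(term in doc_terms for term in must_not):
        return False

    # Count matching should clauses and compare with the required minimum.
    matched_should = sum(1 for term in should if term in doc_terms)
    return matched_should >= minimum_should_match


event_attendees = {"david", "clint"}
print(bool_matches(event_attendees, must=["david"],
                   should=["clint", "andy"], minimum_should_match=1))
print(bool_matches(event_attendees, must=["david"], must_not=["clint"]))
print(bool_matches({"andy"}, should=["clint", "andy"]))
```

The first and third calls return True; the second returns False because the must_not term is present in the document.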

Table 4.1 lists the three clauses and their binary counterparts.

Table 4.1. bool query clause types

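bool query clause | Binary equivalent | Meaning
must | To combine multiple clauses, use a binary AND (query1 AND query2 AND query3). | Any searches in the must clause must match the document.
must_not | To combine multiple clauses, use a binary NOT (NOT query1). | Any searches in the must_not clause must not be part of the document; matching documents are excluded from the result set.
should | To combine multiple clauses, use a binary OR (query1 OR query2 OR query3). | Searches in the should clause may or may not match a document, but at least the number set by minimum_should_match have to match for the document to be returned.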

Understanding the difference between must, should, and must_not may be easier through an example. In the following listing, you search for events that were attended by David, were also attended by either Clint or Andy, and are not older than June 30, 2013.

Listing 4.20. Combining queries with a bool query
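As a hedged sketch of what such a request body could look like, the query described above is written here as a Python dictionary so that it can be built and inspected locally. The field names (attendees, date), the terms, and the date value are taken from listing 4.21, and minimum_should_match is set to 1 as the text describes; the variable name bool_query and the use of Python are assumptions of this example.

```python
import json

# Sketch of a bool query body: David must have attended, at least one of
# Clint or Andy should have attended, and the event must not be dated
# before June 30, 2013. Values mirror listing 4.21.
bool_query = {
    "query": {
        "bool": {
            "must": [
                {"term": {"attendees": "david"}}
            ],
            "should": [
                {"term": {"attendees": "clint"}},
                {"term": {"attendees": "andy"}},
            ],
            "must_not": [
                {"range": {"date": {"lt": "2013-06-30T00:00"}}}
            ],
            # At least one of the should clauses has to match.
            "minimum_should_match": 1,
        }
    }
}

# Serialize to JSON, the form a search request body takes.
print(json.dumps(bool_query, indent=2))
```

The printed JSON could then be sent as the body of a search request against the get-together index, in the same way as the curl command in listing 4.21.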

4.3.2. bool filter

The filter version of the bool query acts almost exactly like the query version, but instead of combining queries, it combines filters. The filter equivalent of the previous example is shown in the following listing.

Listing 4.21. Combining filters with the bool filter

% curl 'localhost:9200/get-together/_search' -d'
{
  "query": {
    "filtered": {
      "query": {
        "match_all": {}
      },
      "filter": {
        "bool": {
          "must": [
            {
              "term": {
                "attendees": "david"
              }
            }
          ],
          "should": [
            {
              "term": {
                "attendees": "clint"
              }
            },
            {
              "term": {
                "attendees": "andy"
              }
            }
          ],
          "must_not": [
            {
              "range": {
                "date": {
                  "lt": "2013-06-30T00:00"
                }
              }
            }
          ]
        }
      }
    }
  }
}'

As you saw in the bool query (listing 4.20), the minimum_should_match setting of the query version lets you specify the minimum number of should clauses that have to match for a result to be returned. The bool filter does not support this property; in listing 4.21, the default value of 1 is used.

Improving the bool query

The provided bool query is slightly contrived, but it includes all three of the bool query options: must, should, and must_not. You could rewrite this bool query in a better form like this:
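A minimal sketch of such a rewrite follows, again with the request body expressed as a Python dictionary so it can be checked locally. The field names and values come from listing 4.21; the variable name improved_query and the use of Python are assumptions of this example.

```python
import json

# Sketch of the simplified query: everything lives in the must clause.
improved_query = {
    "query": {
        "bool": {
            "must": [
                # David has to be among the attendees.
                {"term": {"attendees": "david"}},
                # Inverted range: "not before June 30, 2013" becomes
                # "on or after June 30, 2013".
                {"range": {"date": {"gte": "2013-06-30T00:00"}}},
                # One terms query replaces the two should term queries;
                # it matches if either Clint or Andy attended.
                {"terms": {"attendees": ["clint", "andy"]}},
            ]
        }
    }
}

print(json.dumps(improved_query, indent=2))
```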

Note that this query is smaller than the previous query. By inverting the range query from lt (less than) to gte (greater than or equal to), you can move it from the must_not section to the must section. You can also collapse the two separate should queries into a single terms query instead of two term queries. Now you can replace the minimum_should_match of 1 and the should clause by moving the terms query into the must clause as well. Elasticsearch has a flexible query language, so don’t be afraid to experiment with how queries are formed as you’re sending them to Elasticsearch!

With the bool query and filter under your belt, you can combine any number of queries and filters. We can now return to the other types of queries that Elasticsearch supports. You already know about the term query, but what if you want Elasticsearch to analyze the data you’re sending it? The match query is exactly what you need.

Note

The default value of the minimum_should_match option depends on the rest of the query. If you specify a must clause, minimum_should_match has a default value of 0. If there’s no must clause, the default value is 1.