4.5. Querying for field existence with filters

Sometimes when querying Elasticsearch, it can be helpful to search for all the documents that don’t have a field or are missing a value in the field. In the get-together index, for example, you might want to search for all groups that don’t have a review. On the other hand, you may also want to search for all the documents that have a field, regardless of what the content of the field is. This is where the exists and missing filters come in, both of which act only as filters, not as regular queries.

4.5.1. Exists filter

As the name suggests, the exists filter allows you to restrict any query to documents that have a value in a particular field, whatever that value may be. Here’s what the exists filter looks like:

% curl 'localhost:9200/get-together/_search' -d '
{
  "query": {
    "filtered": {
      "query": {
        "match_all": {}
      },
      "filter": {
        "exists": { "field": "location.geolocation" }
      }
    }
  }
}'

... only documents with the location.geolocation field are returned ...

For the opposite case, you can use the missing filter.

4.5.2. Missing filter

The missing filter allows you to search for documents that have no value in a field, or whose value is the default value (also called the null value, or null_value in the mapping) specified in the mapping. To search for documents that are missing the reviews field, you’d use a filter like this:
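A request along these lines might look like the following sketch, which mirrors the exists example above. It assumes an Elasticsearch 1.x node on localhost:9200, where the filtered query and the missing filter are available. The variable and function names are illustrative only.

```shell
# Request body: match every document, then keep only those with no value in "reviews".
# The field name comes from the text above; adjust it to your own mapping.
missing_reviews_body='{
  "query": {
    "filtered": {
      "query": { "match_all": {} },
      "filter": {
        "missing": { "field": "reviews" }
      }
    }
  }
}'

# Send the search to a local node (assumed to listen on localhost:9200).
search_missing_reviews() {
  curl -s 'localhost:9200/get-together/_search?pretty' -d "$missing_reviews_body"
}
```

Calling search_missing_reviews against a running node prints the groups that have no reviews.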

If you want to expand that filter to match both documents that are missing the field entirely and documents whose field holds the null_value, you can specify Boolean values for the existence and null_value options. The response then includes documents that have null_value set in the field, as shown in the next listing.

Listing 4.23. Specify existence and null_value fields as Boolean values
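A sketch of such a request, assuming an Elasticsearch 1.x node on localhost:9200 and the reviews field used earlier in this section. The variable and function names are hypothetical, chosen only for this example.

```shell
# Request body: the missing filter with both options set explicitly.
#   existence  = true -> match documents where the field has no value at all
#   null_value = true -> also match documents whose field holds the mapping's null_value
missing_with_null_body='{
  "query": {
    "filtered": {
      "query": { "match_all": {} },
      "filter": {
        "missing": {
          "field": "reviews",
          "existence": true,
          "null_value": true
        }
      }
    }
  }
}'

# Send the search to a local node (assumed to listen on localhost:9200).
search_missing_or_null_reviews() {
  curl -s 'localhost:9200/get-together/_search?pretty' -d "$missing_with_null_body"
}
```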

Both the missing and exists filters are cached by default.

4.5.3. Transforming any query into a filter

So far, we’ve talked about the different types of queries and filters that Elasticsearch supports, but we’ve been limited to the filters that are already provided. Sometimes you may want to take a query such as query_string, which has no filter equivalent, and turn it into a filter. Elasticsearch allows you to do this with the query filter, which takes any query and turns it into a filter. You’ll rarely need it, but it’s the way to get full-text search within the filter context.

To transform a query_string query that searches for a name matching “denver clojure” to a filter, you’d use a search like this:
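One way to write that search is sketched below. It assumes an Elasticsearch 1.x node on localhost:9200, and the exact query string (a phrase search on the name field) is an assumption based on the description above. The variable and function names are illustrative.

```shell
# Request body: a query_string query wrapped in the query filter.
# Inside the filter, the query only decides which documents match; no score is computed for it.
query_filter_body='{
  "query": {
    "filtered": {
      "query": { "match_all": {} },
      "filter": {
        "query": {
          "query_string": {
            "query": "name:\"denver clojure\""
          }
        }
      }
    }
  }
}'

# Send the search to a local node (assumed to listen on localhost:9200).
search_with_query_filter() {
  curl -s 'localhost:9200/get-together/_search?pretty' -d "$query_filter_body"
}
```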

Using the query filter, you can get some of the benefits of a filter (such as not having to calculate a score for that part of the query). You can also choose to cache this filter if it turns out to be used many times; the syntax for caching is slightly different from simply adding the _cache key, as shown in the next listing.

Listing 4.24. Caching query filter
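A sketch of the cached form, assuming an Elasticsearch 1.x node on localhost:9200 and the same hypothetical "denver clojure" query string as before. The variable and function names are illustrative.

```shell
# Request body: the query is nested under "fquery" so that "_cache" can sit beside it.
cached_query_filter_body='{
  "query": {
    "filtered": {
      "query": { "match_all": {} },
      "filter": {
        "fquery": {
          "query": {
            "query_string": {
              "query": "name:\"denver clojure\""
            }
          },
          "_cache": true
        }
      }
    }
  }
}'

# Send the search to a local node (assumed to listen on localhost:9200).
search_with_cached_query_filter() {
  curl -s 'localhost:9200/get-together/_search?pretty' -d "$cached_query_filter_body"
}
```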

The query itself has moved inside a new key named fquery, which is also where the _cache key now resides. If you find yourself often using a particular query that doesn’t have a filter equivalent (like one of the match queries or a query_string query), you may want to cache it, assuming the score for that particular part of the query isn’t important.