In the same way that the match query is the go-to query for standard full-text search, the match_phrase query is the one you should reach for when you want to find words that are near each other:
GET /my_index/my_type/_search
{
    "query": {
        "match_phrase": {
            "title": "quick brown fox"
        }
    }
}
Like the match query, the match_phrase query first analyzes the query string to produce a list of terms. It then searches for all the terms, but keeps only documents that contain all of the search terms, in the same positions relative to each other. A query for the phrase "quick fox" would not match any of our documents, because no document contains the word quick immediately followed by fox.
Tip: The match_phrase query can also be written as a match query with type phrase:

"match": {
    "title": {
        "query": "quick brown fox",
        "type": "phrase"
    }
}
When a string is analyzed, the analyzer returns not only a list of terms, but also the position, or order, of each term in the original string:
GET /_analyze?analyzer=standard
Quick brown fox
This returns the following:
{
    "tokens": [
        {
            "token": "quick",
            "start_offset": 0,
            "end_offset": 5,
            "type": "<ALPHANUM>",
            "position": 1 (1)
        },
        {
            "token": "brown",
            "start_offset": 6,
            "end_offset": 11,
            "type": "<ALPHANUM>",
            "position": 2 (1)
        },
        {
            "token": "fox",
            "start_offset": 12,
            "end_offset": 15,
            "type": "<ALPHANUM>",
            "position": 3 (1)
        }
    ]
}
(1) The position of each term in the original string.
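To make the token output above concrete, here is a minimal Python sketch of an analyzer that records offsets and positions. It is only a crude stand-in for the standard analyzer: the function name analyze and the regular expression are assumptions made for illustration, and the real analyzer handles Unicode and many edge cases that this sketch ignores.

```python
import re

def analyze(text):
    """Lowercase the text and split it on runs of ASCII letters and digits.

    Hypothetical helper, not an Elasticsearch API. Positions start at 1
    to match the output shown above.
    """
    tokens = []
    for position, match in enumerate(re.finditer(r"[A-Za-z0-9]+", text), start=1):
        tokens.append({
            "token": match.group().lower(),
            "start_offset": match.start(),  # index of the first character
            "end_offset": match.end(),      # index just past the last character
            "position": position,           # order of the term in the string
        })
    return tokens

for token in analyze("Quick brown fox"):
    print(token)
```

The offsets describe where each term sits in the original string, while the position only records its order among the terms.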
Positions can be stored in the inverted index, and position-aware queries like the match_phrase query can use them to match only documents that contain all the words in exactly the order specified, with no words in between.
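As a rough illustration of what storing positions in an inverted index could look like, the following Python sketch maps each term to the documents that contain it and the positions at which it occurs. The function build_positional_index, the whitespace tokenization, and the two sample documents are assumptions for the example. This is not how Lucene lays out its index on disk.

```python
from collections import defaultdict

def build_positional_index(docs):
    """Map each term to {doc_id: [positions]}; positions start at 1.

    Hypothetical helper: tokenization is a plain lowercase whitespace split.
    """
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for position, term in enumerate(text.lower().split(), start=1):
            index[term].setdefault(doc_id, []).append(position)
    return dict(index)

# Sample documents invented for this sketch.
docs = {
    1: "The quick brown fox",
    2: "The brown fox was quick",
}
index = build_positional_index(docs)
print(index["quick"])  # {1: [2], 2: [5]}
print(index["fox"])    # {1: [4], 2: [3]}
```

Both documents contain every term of the phrase, so only the stored positions can tell them apart.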
For a document to be considered a match for the phrase "quick brown fox", the following must be true:

- quick, brown, and fox must all appear in the field.
- The position of brown must be 1 greater than the position of quick.
- The position of fox must be 2 greater than the position of quick.
If any of these conditions is not met, the document is not considered a match.
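The conditions above can be sketched in Python as a check over the term positions of a single field. The function phrase_matches and the sample document are hypothetical. The sketch assumes the field has already been analyzed into a list of terms, and it is not the algorithm Elasticsearch actually runs.

```python
def phrase_matches(doc_terms, phrase_terms):
    """Return True if phrase_terms occur in doc_terms at consecutive positions."""
    # Record every position (starting at 1) at which each term occurs.
    positions = {}
    for position, term in enumerate(doc_terms, start=1):
        positions.setdefault(term, []).append(position)

    # Every term of the phrase must appear in the field.
    if not phrase_terms or any(term not in positions for term in phrase_terms):
        return False

    # The term at offset i in the phrase must sit i positions after the first term.
    for start in positions[phrase_terms[0]]:
        if all(start + offset in positions[term]
               for offset, term in enumerate(phrase_terms)):
            return True
    return False

# Sample field content invented for this sketch.
doc = "the quick brown fox jumped".split()
print(phrase_matches(doc, ["quick", "brown", "fox"]))  # True
print(phrase_matches(doc, ["quick", "fox"]))           # False
print(phrase_matches(doc, ["fox", "brown", "quick"]))  # False
```

The second call fails because fox is two positions after quick, not one. The third fails because the terms appear in the wrong order.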
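Tip: Internally, the match_phrase query uses the low-level span query family to do position-aware matching. Thankfully, most people never need to use the span queries directly, as the match_phrase query is usually good enough.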