How to search nested objects with Elasticsearch

Tags: Json, Elasticsearch, Nested

Json Problem Overview


OK, I've not been able to figure this out thus far. Hoping someone can offer some insight.

Given the documents below, how would I search for all documents with a video that has "test" in the video title? I'm using the HTTP API. (Basically, how do you search nested objects with Elasticsearch? I know there have to be docs out there, but I haven't really been able to find any.)

[{
    id:4635,
    description:"This is a test description",
    author:"John",
    author_id:51421,
    video: {
        title:"This is a test title for a video",
        description:"This is my video description",
        url:"/url_of_video"
    }
},
{
    id:4636,
    description:"This is a test description 2",
    author:"John",
    author_id:51421,
    video: {
        title:"This is an example title for a video",
        description:"This is my video description2",
        url:"/url_of_video2"
    }
},
{
    id:4637,
    description:"This is a test description3",
    author:"John",
    author_id:51421,
    video: {
        title:"This is a test title for a video3",
        description:"This is my video description3",
        url:"/url_of_video3"
    }
}]

Json Solutions


Solution 1 - Json

You don't necessarily need to nest video; you can map it as a normal object field. In that case its sub-fields are flattened and stored as

'video.title': "This is a test title for a video3",
'video.description': "This is my video description3",
'video.url': "/url_of_video3"

And you can search for video.title:'test'.
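
To make the flattening concrete, here is a minimal sketch that mimics it in plain TypeScript. It is only an illustration of the idea, not Elasticsearch code: flattenObject, sampleDoc and titleQuery are names made up for this example, and the last constant is the request body you could send to _search for the video.title lookup.

```typescript
type Leaf = string | number | boolean | null;
interface Obj { [key: string]: Leaf | Obj | Array<Leaf | Obj>; }

// Hypothetical helper: turns sub-objects into dotted paths, roughly the way
// a plain (non-nested) object field ends up in the index.
function flattenObject(doc: Obj, prefix: string = ""): { [path: string]: Leaf[] } {
  const out: { [path: string]: Leaf[] } = {};
  for (const key of Object.keys(doc)) {
    const path = prefix ? prefix + "." + key : key;
    const value = doc[key];
    const items: Array<Leaf | Obj> = Array.isArray(value) ? value : [value];
    for (const item of items) {
      if (item !== null && typeof item === "object") {
        // Recurse into the sub-object and merge its dotted paths.
        const inner = flattenObject(item, path);
        for (const innerPath of Object.keys(inner)) {
          out[innerPath] = (out[innerPath] || []).concat(inner[innerPath]);
        }
      } else {
        out[path] = (out[path] || []).concat([item]);
      }
    }
  }
  return out;
}

const sampleDoc: Obj = {
  id: 4637,
  video: {
    title: "This is a test title for a video3",
    description: "This is my video description3",
    url: "/url_of_video3",
  },
};

// Keys: id, video.title, video.description, video.url
const flatDoc = flattenObject(sampleDoc);

// Request body for the search on the flattened field.
const titleQuery = { query: { match: { "video.title": "test" } } };
```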

As far as I understand it, nested fields are useful when a document holds multiple nested items and you want all conditions of a query to apply to the same item. For example, given this data:

[{
    id:4635,
    description:"This is a test description",
    author:"John",
    author_id:51421,
    video: [
      {
        title:"This is a test title for a video",
        description:"This is my video description",
        url:"/url_of_video"
      },
      {
        title:"This is an example title for a video",
        description:"This is my video description2",
        url:"/url_of_video2"
      }
    ]
},
{
    id:4637,
    description:"This is a test description3",
    author:"John",
    author_id:51421,
    video: [
      {
        title:"This is a test title for a video3",
        description:"This is my video description3",
        url:"/url_of_video3"
      }
    ]
}]

If you search for video.title: 'test' AND video.description: 'description2' while video is not nested, you get a false positive: 'test' is in the first video and 'description2' in the second, but the flattened video field contains both.

If you map video as nested instead, each video is indexed as a separate entity and the query is evaluated against individual videos. So video.title: 'test' AND video.description: 'description2' returns nothing, while video.title: 'example' AND video.description: 'description2' returns one result.
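
The difference can be simulated in a few lines of TypeScript. This is a toy model, not how Elasticsearch is implemented: the whitespace tokenizer is a rough stand-in for the standard analyzer, and hasTerm, matchesAsObject, matchesAsNested and firstDocVideos are hypothetical names used only here.

```typescript
interface VideoItem { title: string; description: string; }

// Lowercase and split on whitespace, then look for an exact token.
function hasTerm(text: string, term: string): boolean {
  const tokenList = text.toLowerCase().split(/\s+/);
  return tokenList.indexOf(term.toLowerCase()) !== -1;
}

// Object mapping: values of all videos are pooled per field, so each
// condition may be satisfied by a different video.
function matchesAsObject(items: VideoItem[], titleTerm: string, descTerm: string): boolean {
  const anyTitle = items.some((v) => hasTerm(v.title, titleTerm));
  const anyDescription = items.some((v) => hasTerm(v.description, descTerm));
  return anyTitle && anyDescription;
}

// Nested mapping: both conditions must hold inside the same video.
function matchesAsNested(items: VideoItem[], titleTerm: string, descTerm: string): boolean {
  return items.some((v) => hasTerm(v.title, titleTerm) && hasTerm(v.description, descTerm));
}

// The two videos of the first document above.
const firstDocVideos: VideoItem[] = [
  { title: "This is a test title for a video", description: "This is my video description" },
  { title: "This is an example title for a video", description: "This is my video description2" },
];

const objectHit = matchesAsObject(firstDocVideos, "test", "description2");    // true (false positive)
const nestedMiss = matchesAsNested(firstDocVideos, "test", "description2");   // false
const nestedHit = matchesAsNested(firstDocVideos, "example", "description2"); // true
```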

Solution 2 - Json

OK, I finally found these pages (should have taken more time with the docs beforehand) and it seems we set the property that holds the video to type:nested, then use nested queries.

http://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-nested-query.html

http://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-nested-filter.html

Hope this helps someone down the road.
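
For anyone landing here, a rough sketch of what those two steps could look like as request bodies, written as TypeScript objects. The assumptions are mine, not from the linked pages: the typeless mapping layout of Elasticsearch 7 and later, text/keyword as sub-field types, and the names videoMapping and nestedTitleQuery.

```typescript
// Body for creating the index: "video" is declared as a nested field.
const videoMapping = {
  mappings: {
    properties: {
      video: {
        type: "nested",
        properties: {
          title: { type: "text" },
          description: { type: "text" },
          url: { type: "keyword" },
        },
      },
    },
  },
};

// Body for _search: the nested query names the path of the nested field and
// wraps an ordinary query that uses the full field name.
function nestedTitleQuery(term: string) {
  return {
    query: {
      nested: {
        path: "video",
        query: { match: { "video.title": term } },
      },
    },
  };
}

const testTitleBody = nestedTitleQuery("test");
```

With the sample documents from the question, testTitleBody should select ids 4635 and 4637, since only those titles contain "test".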

Solution 3 - Json

If video is mapped as a plain object (not nested), you can also put it in REST API URL format:

/_search?pretty&q=video.title:*test*
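
If you build that URL in code, the query string should be percent-encoded. A small sketch, where the host, the index name "videos" and the helper buildSearchUrl are placeholders of mine:

```typescript
// Hypothetical helper: assembles a URI search request with an encoded q parameter.
function buildSearchUrl(baseUrl: string, index: string, field: string, term: string): string {
  // encodeURIComponent escapes ":" as %3A and leaves "." and "*" untouched.
  const q = encodeURIComponent(field + ":*" + term + "*");
  return baseUrl + "/" + index + "/_search?pretty&q=" + q;
}

const searchUrl = buildSearchUrl("http://localhost:9200", "videos", "video.title", "test");
// searchUrl is "http://localhost:9200/videos/_search?pretty&q=video.title%3A*test*"
```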

Solution 4 - Json

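You can use the .keyword sub-field for an exact match on the complete title. This assumes video is a plain object (not nested) and title got the default dynamic mapping, which adds a keyword sub-field:

{
    "query": {
        "term": {
            "video.title.keyword": "This is a test title for a video"
        }
    }
}

Which should match your first example entry. Note that the field has to be referenced by its full path (video.title.keyword); a short name such as title.keyword does not resolve to fields inside the video object.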

Solution 5 - Json

The schema is:

  private schema = {
    id: {
      type: 'integer',
    },
    name: {
      type: 'text',
    },
    tags: {
      type: 'nested',
      properties: {
        id: {
          type: 'integer',
        },
        name: {
          type: 'keyword',
          normalizer: 'useLowercase',
        },
      },
    },
  }

The document structure is:

id: 38938,
name: "summer fruits",
tags: [
  {
    id: 73,
    name: "Grapes"
  },
  {
    id: 74,
    name: "Pineapple"
  }
]

The search query (returns documents that contain every requested tag):

    const { tags } = req.body;

    const { body } = await elasticWrapper.client.search({
      index: ElasticIndexs.Fruits,
      pretty: true,
      filter_path: 'hits.hits._source*',
      body: {
        query: {
          bool: {
            // One nested clause per requested tag; must requires all of them.
            must: tags.map((ele: { name: string }) => {
              return {
                nested: {
                  path: 'tags',
                  query: {
                    match: {
                      'tags.name': ele.name,
                    },
                  },
                },
              };
            }),
          },
        },
      },
    });
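
The query-building part can be pulled out into a pure function, which makes it easy to inspect without a running cluster. This is a sketch under the same schema; TagFilter and buildAllTagsQuery are names I made up, not part of any client library.

```typescript
interface TagFilter { name: string; }

// Builds the value of the "query" key: one nested clause per requested tag.
function buildAllTagsQuery(tagFilters: TagFilter[]) {
  return {
    bool: {
      must: tagFilters.map((tag) => {
        return {
          nested: {
            path: "tags",
            query: { match: { "tags.name": tag.name } },
          },
        };
      }),
    },
  };
}

const fruitQuery = buildAllTagsQuery([{ name: "Grapes" }, { name: "Pineapple" }]);
```

Each tag gets its own nested clause on purpose: putting both match conditions inside a single nested query would require one tag object to carry both names, which never happens. Switching must to should would return documents that have any of the tags instead of all of them.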

Solution 6 - Json

To give a more generic answer:

  • use objects (not nested) if you search one field at a time. That's because internally, fields are flattened, like so:

[Figure: objects in Elasticsearch]

  • use nested if you need to match multiple fields within the same sub-object (e.g. title:test AND description:my), because plain objects don't preserve the boundaries between array elements. Meanwhile, nested fields create separate Lucene documents under the hood that are quickly joined via Lucene's BlockJoin:

[Figure: nested fields in Elasticsearch]

  • use parent-child relationships (between different Elasticsearch documents) if you search in multiple fields and update child documents often (because updating a nested document reindexes the whole ensemble). Basically, you trade query performance for update performance, since the query runs in two steps under the hood:

[Figure: Elasticsearch has-child query]

Note: the drawings above are from Sematext's Elasticsearch training classes (disclosure: I deliver these classes).
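
For the parent-child option, here is a hedged sketch of the request bodies using the join field type. The field name "relation" and the relation names "post" and "video" are placeholders I picked for illustration; only the join type and the has_child query come from Elasticsearch itself.

```typescript
// Index mapping: a join field declares that "post" documents can have "video" children.
const joinMapping = {
  mappings: {
    properties: {
      title: { type: "text" },
      relation: { type: "join", relations: { post: "video" } },
    },
  },
};

// Body for _search: returns parent documents that have at least one child
// video whose title matches "test".
const hasChildBody = {
  query: {
    has_child: {
      type: "video",
      query: { match: { title: "test" } },
    },
  },
};
```

Child documents have to be indexed with routing set to the parent's id so that both land on the same shard.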

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | swatkins | View Question on Stackoverflow
Solution 1 - Json | dira | View Answer on Stackoverflow
Solution 2 - Json | swatkins | View Answer on Stackoverflow
Solution 3 - Json | Dinesh Kumar Sahu | View Answer on Stackoverflow
Solution 4 - Json | tsorn | View Answer on Stackoverflow
Solution 5 - Json | Rafiq | View Answer on Stackoverflow
Solution 6 - Json | Radu Gheorghe | View Answer on Stackoverflow