Notes on being unable to filter on nested-type data in ES - blayn

 11 months ago
source link: https://www.cnblogs.com/blayn/p/17736195.html

ES contains an index named task_data_1 whose field mapping is as follows:
{
    "task_data_1" : {
        "mappings" : {
            "dynamic_templates" : [
                {
                    "dates" : {
                        "match_mapping_type" : "date",
                        "mapping" : {
                            "type" : "date"
                        }
                    }
                },
                {
                    "doubles" : {
                        "match_mapping_type" : "double",
                        "mapping" : {
                            "type" : "double"
                        }
                    }
                },
                {
                    "objects" : {
                        "match_mapping_type" : "object",
                        "mapping" : {
                            "type" : "object"
                        }
                    }
                },
                {
                    "longs" : {
                        "match_mapping_type" : "long",
                        "mapping" : {
                            "type" : "integer"
                        }
                    }
                },
                {
                    "strings" : {
                        "match" : "*",
                        "match_mapping_type" : "string",
                        "mapping" : {
                            "type" : "keyword"
                        }
                    }
                }
            ],
            "properties" : {
                "createUsername" : {
                    "type" : "keyword"
                },
                "data" : {
                    "type" : "nested",
                    "dynamic" : "true",
                    "properties" : {
                        "daterange102110" : {
                            "type" : "date"
                        },
                        "input18779" : {
                            "type" : "keyword"
                        },
                        "rate48025" : {
                            "type" : "integer"
                        },
                        "textarea24212" : {
                            "type" : "keyword"
                        },
                        "textarea38172" : {
                            "type" : "keyword"
                        },
                        "timerange47544" : {
                            "type" : "keyword"
                        },
                        "url" : {
                            "type" : "keyword"
                        }
                    }
                },
                "formId" : {
                    "type" : "long",
                    "store" : true
                },
                "updateUsername" : {
                    "type" : "keyword"
                }
            }
        }
    }
}
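Filtering by fields such as createUsername, updateUsername, and formId works as expected, but none of the fields inside the data map can be filtered on.
At first I filtered directly by the field name as it appears inside the map. The DSL generated by my code looked like this: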
POST task_data_1/_search
{
  "from": 0,
  "size": 10,
  "query": {
    "bool": {
      "must": [
        {
          "term": {
            "input18779": {
              "value": "3213",
              "boost": 1
            }
          }
        }
      ],
      "adjust_pure_negative": true,
      "boost": 1
    }
  },
  "sort": [
    {
      "createTime": {
        "order": "desc"
      }
    }
  ],
  "track_total_hits": 2147483647
}
This query has an obvious problem: the input18779 field is nested inside the data map, so querying by the bare field name input18779 cannot find the corresponding data.
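To make the naming rule concrete, here is a minimal sketch in plain Java (no Elasticsearch dependency) that walks a trimmed copy of the mapping's properties and lists the full dotted path of every leaf field. The names MappingPaths, fieldPaths, and sampleProperties are made up for this illustration and are not part of any Elasticsearch API.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class MappingPaths {

    /** Recursively collects the full dotted path of every leaf field under a "properties" map. */
    @SuppressWarnings("unchecked")
    static List<String> fieldPaths(Map<String, Object> properties, String prefix) {
        List<String> paths = new ArrayList<>();
        for (Map.Entry<String, Object> entry : properties.entrySet()) {
            String path = prefix.isEmpty() ? entry.getKey() : prefix + "." + entry.getKey();
            Map<String, Object> definition = (Map<String, Object>) entry.getValue();
            Object children = definition.get("properties");
            if (children instanceof Map) {
                // Object or nested field: descend and keep extending the path.
                paths.addAll(fieldPaths((Map<String, Object>) children, path));
            } else {
                paths.add(path);
            }
        }
        return paths;
    }

    /** A trimmed, hand-built copy of the "properties" section of the task_data_1 mapping. */
    static Map<String, Object> sampleProperties() {
        Map<String, Object> dataProps = new LinkedHashMap<>();
        dataProps.put("input18779", typeOf("keyword"));
        dataProps.put("rate48025", typeOf("integer"));

        Map<String, Object> data = typeOf("nested");
        data.put("properties", dataProps);

        Map<String, Object> props = new LinkedHashMap<>();
        props.put("createUsername", typeOf("keyword"));
        props.put("data", data);
        props.put("formId", typeOf("long"));
        return props;
    }

    private static Map<String, Object> typeOf(String type) {
        Map<String, Object> definition = new LinkedHashMap<>();
        definition.put("type", type);
        return definition;
    }
}
```

Calling MappingPaths.fieldPaths(MappingPaths.sampleProperties(), "") yields createUsername, data.input18779, data.rate48025, and formId. There is no field called plain input18779, so a term query on that name has nothing to match.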
After spotting the problem, I changed the code logic, and the generated DSL became:
POST task_data_1/_search
{
  "from": 0,
  "size": 10,
  "query": {
    "bool": {
      "must": [
        {
          "term": {
            "data.input18779": {
              "value": "3213",
              "boost": 1
            }
          }
        }
      ],
      "adjust_pure_negative": true,
      "boost": 1
    }
  },
  "sort": [
    {
      "createTime": {
        "order": "desc"
      }
    }
  ],
  "track_total_hits": 2147483647
}
On the surface this query looks correct, but it still returned no data.
Later I used Kibana to look up the field mapping of the index, which is the same JSON shown at the beginning of this post.
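This JSON shows that the type of the data map is nested.
After some research I learned that in Elasticsearch, nested is a special data type used to handle nested documents.
For data of this type, nested fields must be searched with a Nested Query that wraps a Match Query, Term Query, or similar query.
So I changed the code roughly as follows: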
queryBuilder.must(QueryBuilders.nestedQuery("data", QueryBuilders.termQuery(queryFieldName, item.getFilterValue()), ScoreMode.None));
The key change is the use of a Nested Query. The DSL generated afterwards looks like this:
POST task_data_1/_search
{
  "from": 0,
  "size": 10,
  "query": {
    "bool": {
      "must": [
        {
          "nested": {
            "query": {
              "term": {
                "data.input18779": {
                  "value": "3213",
                  "boost": 1
                }
              }
            },
            "path": "data",
            "ignore_unmapped": false,
            "score_mode": "none",
            "boost": 1
          }
        }
      ],
      "adjust_pure_negative": true,
      "boost": 1
    }
  },
  "sort": [
    {
      "createTime": {
        "order": "desc"
      }
    }
  ],
  "track_total_hits": 2147483647
}
With this query, the expected data is finally returned.
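The wrapping decision can be sketched without any Elasticsearch dependency by building the query body as plain maps. This is only an illustration under assumptions: the class NestedQueryDsl and its methods are hypothetical names, the rule "wrap when the field starts with the nested path plus a dot" is my simplification, and real code would use the client's QueryBuilders as in the line of Java shown earlier.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class NestedQueryDsl {

    /** Builds {"term": {field: {"value": value}}}. */
    static Map<String, Object> term(String field, Object value) {
        Map<String, Object> body = new LinkedHashMap<>();
        body.put("value", value);
        Map<String, Object> byField = new LinkedHashMap<>();
        byField.put(field, body);
        Map<String, Object> term = new LinkedHashMap<>();
        term.put("term", byField);
        return term;
    }

    /** Wraps an inner query in {"nested": {"path": ..., "query": ..., "score_mode": "none"}}. */
    static Map<String, Object> nested(String path, Map<String, Object> inner) {
        Map<String, Object> body = new LinkedHashMap<>();
        body.put("path", path);
        body.put("query", inner);
        body.put("score_mode", "none");
        Map<String, Object> nested = new LinkedHashMap<>();
        nested.put("nested", body);
        return nested;
    }

    /**
     * Returns a term filter, wrapped in a nested query only when the field
     * lives under the given nested path (for example "data.input18779" under "data").
     */
    static Map<String, Object> filterFor(String nestedPath, String field, Object value) {
        Map<String, Object> term = term(field, value);
        return field.startsWith(nestedPath + ".") ? nested(nestedPath, term) : term;
    }
}
```

For example, NestedQueryDsl.filterFor("data", "data.input18779", "3213") produces the nested clause used in the DSL above, while a top-level field such as createUsername stays a plain term clause.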
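To summarize: Elasticsearch indexes each object in a nested field as a separate hidden document. The subfields of a nested field can therefore only be searched through the dedicated nested query, and an ordinary query cannot reach their contents.
Taking the mapping above as an example, none of the fields in the data map (such as daterange102110 and input18779) can be matched by a plain top-level query. The values are indexed, but they live in the hidden nested documents rather than in the root document.
To search nested fields, use a Nested Query that wraps a Match Query, Term Query, or similar query.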
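The behaviour can be pictured with a deliberately simplified toy model in plain Java. It is not how Lucene is implemented; it only mimics the visible effect, assuming that a top-level term query sees the root document's own fields and that a nested query checks each hidden child document and reports the root when any child matches. All names here (NestedModel, Doc, topLevelTerm, nestedTerm, sample) and the value "someone" are hypothetical.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class NestedModel {

    /** A root document plus the objects stored under its nested path. */
    static final class Doc {
        final Map<String, Object> rootFields;
        final List<Map<String, Object>> nestedObjects; // each one modelled as a hidden child document

        Doc(Map<String, Object> rootFields, List<Map<String, Object>> nestedObjects) {
            this.rootFields = rootFields;
            this.nestedObjects = nestedObjects;
        }
    }

    /** A top-level term query: only the root document's own fields are visible. */
    static boolean topLevelTerm(Doc doc, String field, Object value) {
        return value.equals(doc.rootFields.get(field));
    }

    /** A nested term query: the root matches if any hidden child under the path matches. */
    static boolean nestedTerm(Doc doc, String path, String field, Object value) {
        String prefix = path + ".";
        if (!field.startsWith(prefix)) {
            return false;
        }
        String leaf = field.substring(prefix.length());
        for (Map<String, Object> child : doc.nestedObjects) {
            if (value.equals(child.get(leaf))) {
                return true;
            }
        }
        return false;
    }

    /** One document shaped like the task_data_1 example; "someone" is a placeholder value. */
    static Doc sample() {
        Map<String, Object> root = new HashMap<>();
        root.put("createUsername", "someone");
        Map<String, Object> child = new HashMap<>();
        child.put("input18779", "3213");
        return new Doc(root, Collections.singletonList(child));
    }
}
```

In this model, topLevelTerm fails for both "input18779" and "data.input18779" because neither is a field of the root document, while nestedTerm with path "data" finds the value. That mirrors the three queries tried in this post.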
