ElasticSearch7.3 学习之生产环境实时重建索引 - |旧市拾荒|
source link: https://www.cnblogs.com/xiaoyh/p/16028045.html
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.
1、实时重建索引
在实际的生产环境中,一个field
的设置是不能被修改的,如果要修改一个Field
,那么应该重新按照新的mapping
,建立一个index
,然后将数据批量查询出来,重新用bulk api
写入index
中。
批量查询的时候,建议采用scroll api
,并且采用多线程并发的方式来reindex
数据。例如说每次scoll
就查询指定日期的一段数据,交给一个线程即可。
(1) 一开始,依靠dynamic mapping
,插入数据,但是不小心有些数据是2019-09-10
这种日期格式的,所以title
这种field
被自动映射为了date
类型,实际上它应该是string
类型的。
首先插入以下数据
PUT /my_index/_doc/1
{
"title": "2019-09-10"
}
PUT /my_index/_doc/2
{
"title": "2019-09-11"
}
(2)当后期向索引中加入string
类型的title
值的时候,就会报错
PUT /my_index/_doc/3
{
"title": "my first article"
}
{
"error": {
"root_cause": [
{
"type": "mapper_parsing_exception",
"reason": "failed to parse field [title] of type [date] in document with id '3'. Preview of field's value: 'my first article'"
}
],
"type": "mapper_parsing_exception",
"reason": "failed to parse field [title] of type [date] in document with id '3'. Preview of field's value: 'my first article'",
"caused_by": {
"type": "illegal_argument_exception",
"reason": "failed to parse date field [my first article] with format [strict_date_optional_time||epoch_millis]",
"caused_by": {
"type": "date_time_parse_exception",
"reason": "Failed to parse with all enclosed parsers"
}
}
},
"status": 400
}
(3)如果此时想修改title
的类型,是不可能的
PUT /my_index/_mapping
{
"properties": {
"title": {
"type": "text"
}
}
}
{
"error": {
"root_cause": [
{
"type": "illegal_argument_exception",
"reason": "mapper [title] of different type, current_type [date], merged_type [text]"
}
],
"type": "illegal_argument_exception",
"reason": "mapper [title] of different type, current_type [date], merged_type [text]"
},
"status": 400
}
(4)此时,唯一的办法,就是进行reindex
,也就是说,重新建立一个索引,将旧索引的数据查询出来,再导入新索引。
(5)如果说旧索引的名字,是old_index
,新索引的名字是new_index
,终端java
应用,已经在使用old_index
在操作了,难道还要去停止java
应用,修改使用的index
为new_index
,才重新启动java
应用吗?这个过程中,就会导致java
应用停机,可用性降低。
(6)所以说,给java
应用一个别名,这个别名是指向旧索引的,java
应用先用着,java
应用先用prod_index
来操作,此时实际指向的是旧的my_index
PUT /my_index/_alias/prod_index
(7)查看别名,会发现my_index
已经存在一个别名prod_index
了。
GET my_index/_alias
(8)新建一个index
,调整其title
的类型为string
PUT /my_index_new
{
"mappings": {
"properties": {
"title": {
"type": "text"
}
}
}
}
(9)使用scroll api
将数据批量查询出来
GET /my_index/_search?scroll=1m
{
"query": {
"match_all": {}
},
"size": 1
}
{
"_scroll_id" : "DXF1ZXJ5QW5kRmV0Y2gBAAAAAAAARUMWQWx5bzRmTW9TeUNpNmVvN0E2dF9YQQ==",
"took" : 4,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 2,
"relation" : "eq"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "my_index",
"_type" : "_doc",
"_id" : "1",
"_score" : 1.0,
"_source" : {
"title" : "2019-09-10"
}
}
]
}
}
(9)采用bulk api
将scoll
查出来的一批数据,批量写入新索引
POST /_bulk
{"index":{"_index":"my_index_new","_id":"1"}}
{"title":"2019-09-10"}
(10)反复循环8~9,查询一批又一批的数据出来,采取bulk api
将每一批数据批量写入新索引
(11)将my_index
索引的别名prod_index
切换到my_index_new
上去,java应用会直接通过index别名使用新的索引中的数据,java应用程序不需要停机,零提交,高可用
POST /_aliases
{
"actions": [
{
"remove": {
"index": "my_index",
"alias": "prod_index"
}
},
{
"add": {
"index": "my_index_new",
"alias": "prod_index"
}
}
]
}
(12)直接通过prod_index
别名来查询,是否ok
GET prod_index/_search
可以看到能够查询到新索引my_index_new
的数据了
{
"took" : 1117,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "my_index_new",
"_type" : "_doc",
"_id" : "1",
"_score" : 1.0,
"_source" : {
"title" : "2019-09-10"
}
}
]
}
}
2、总结:
基于alias
对client
透明切换index
PUT /my_index_v1/_alias/my_index
client
对my_index
进行操作
reindex
操作,完成之后,切换v1到v2
POST /_aliases
{
"actions": [
{ "remove": { "index": "my_index_v1", "alias": "my_index" }},
{ "add": { "index": "my_index_v2", "alias": "my_index" }}
]
}
如果您觉得阅读本文对您有帮助,请点一下“推荐”按钮,您的“推荐”将是我最大的写作动力!欢迎各位转载,但是未经作者本人同意,转载文章之后必须在文章页面明显位置给出作者和原文连接,否则保留追究法律责任的权利。
Recommend
About Joyk
Aggregate valuable and interesting links.
Joyk means Joy of geeK