Elasticsearch 5.6.0: snapshotting (backing up) data to HDFS with repository-hdfs and restoring it

Background

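Elasticsearch's replica mechanism provides reliability: the cluster can lose individual nodes without interrupting the service it offers. Replicas do not protect against catastrophic failure, however, so a complete backup of the cluster data is needed in order to restore the data quickly when such a failure happens. Elasticsearch officially provides Snapshot/Restore for this purpose. The supported repository plugins include the Azure Repository Plugin, the S3 Repository Plugin, the Hadoop HDFS Repository Plugin and the Google Cloud Storage Repository Plugin. Here I use the Hadoop HDFS Repository plugin to back up the data in ES to HDFS.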

  • Notes

This article is based on Elasticsearch-5.6.0 and hadoop-2.6.0-cdh5.7.0. The plugin and version used is repository-hdfs-5.6.0.zip. Official documentation:
https://www.elastic.co/guide/en/elasticsearch/reference/5.6/modules-snapshots.html
https://www.elastic.co/guide/en/elasticsearch/plugins/5.6/repository-hdfs.html
Snapshots are subject to version compatibility constraints between ES clusters. Please note:
A snapshot of an index created in 5.x can be restored to 6.x.
A snapshot of an index created in 2.x can be restored to 5.x.
A snapshot of an index created in 1.x can be restored to 2.x.
In my case the data was backed up from 5.6.0 and restored to 6.3.2, which is within the supported range, so this compatibility problem did not arise.
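
The compatibility rule quoted above can be turned into a quick pre-flight check. This is only a minimal sketch and not part of any Elasticsearch tooling: `can_restore` and `major_of` are hypothetical helpers, and the sketch assumes that a restore within the same major version is also acceptable, in addition to the N.x to (N+1).x rule.

```shell
# Hypothetical helper: succeed (exit 0) when an index created in major
# version $1 can be restored into a cluster running major version $2.
# Assumes the rule quoted above (N.x -> (N+1).x) plus same-major restores.
can_restore() {
  [ "$2" -eq "$1" ] || [ "$2" -eq $(($1 + 1)) ]
}

# Extract the major version from a full version string such as 5.6.0.
major_of() {
  echo "${1%%.*}"
}

if can_restore "$(major_of 5.6.0)" "$(major_of 6.3.2)"; then
  echo "5.6.0 -> 6.3.2: restore is supported"
else
  echo "5.6.0 -> 6.3.2: restore is NOT supported"
fi
```

The check works on the version in which the index was created, so an index that was first created in 2.x and later upgraded to 5.x would need `can_restore 2 6`, which fails.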

  • Steps
1. Install the plugin

Install the repository-hdfs plugin on every node of the cluster, and restart each node afterwards so that the plugin is loaded.
Online install: sudo bin/elasticsearch-plugin install repository-hdfs
Offline install:
First: wget https://artifacts.elastic.co/downloads/elasticsearch-plugins/repository-hdfs/repository-hdfs-5.6.0.zip
Then: bin/elasticsearch-plugin install file:///data/elastic/repository-hdfs-5.6.0.zip
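
Because the plugin has to be present on every node, it helps to confirm that before registering a repository. On a single node, `bin/elasticsearch-plugin list` shows the locally installed plugins. For the whole cluster, the sketch below filters the output of the `_cat/plugins` API. `nodes_with_hdfs_plugin` is a hypothetical helper, the node names in the offline illustration are invented, and the filter assumes node names contain no spaces.

```shell
# Print the names of nodes that report the repository-hdfs plugin.
# Reads "_cat/plugins" style rows (node name, component, version) on stdin.
nodes_with_hdfs_plugin() {
  awk '$2 == "repository-hdfs" { print $1 }'
}

# Against a live cluster (host and port as used in this article):
#   curl -s "172.16.221.105:9400/_cat/plugins?h=name,component,version" | nodes_with_hdfs_plugin

# Offline illustration with made-up node names:
printf '%s\n' \
  'node-1 repository-hdfs 5.6.0' \
  'node-1 analysis-ik 5.6.0' \
  'node-2 repository-hdfs 5.6.0' | nodes_with_hdfs_plugin
```

Every node of the cluster should appear in the output; a missing node means the plugin still has to be installed there, or the node has not been restarted yet.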

2. Create the repository and register it in ES
curl -X PUT "172.16.221.105:9400/_snapshot/es_hdfs_repository" -H 'Content-Type: application/json' -d'
{
"type": "hdfs",
"settings": {
    "uri": "hdfs://golive-master:8020/",
    "path": "elasticsearch/respositories/es_hdfs_repository",
    "conf.dfs.client.read.shortcircuit": "true",
    "conf.dfs.domain.socket.path": "/var/lib/hadoop-hdfs/dn_socket"
     }
}
'

While creating the repository I ran into a Permission denied error. As a temporary workaround I disabled HDFS permission checking by adding the following configuration to hdfs-site.xml on every Hadoop node:

<property> 
    <name>dfs.permissions</name> 
    <value>false</value>
</property>

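Then restart HDFS and run the create-repository command above again; this time the repository is created successfully. The resulting HDFS directory looks like this:

[Image: hdfs_dir]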
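
Disabling `dfs.permissions` removes permission checks for the whole HDFS cluster. A narrower alternative is to create the repository directory up front and hand it to the OS user that runs Elasticsearch. The sketch below only prints the commands so they can be reviewed before running them. `print_hdfs_prepare_commands` is a hypothetical helper, and both the user name `elasticsearch` and the absolute path under `/user/elasticsearch` are assumptions (a relative `path` is normally resolved against the HDFS home directory of the connecting user), so adjust them to your environment.

```shell
# Hypothetical helper: print (do not run) the HDFS commands that prepare a
# repository directory owned by the given user.
# $1 = absolute HDFS directory, $2 = owning user
print_hdfs_prepare_commands() {
  echo "hdfs dfs -mkdir -p $1"
  echo "hdfs dfs -chown -R $2 $1"
}

# Assumed values: the "elasticsearch" OS user and its HDFS home directory.
print_hdfs_prepare_commands \
  /user/elasticsearch/elasticsearch/respositories/es_hdfs_repository \
  elasticsearch
```

The printed commands would be run as the HDFS superuser, after which the repository can be registered without switching off permission checking.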

The repository can be inspected with the following command:
curl -X GET "172.16.221.104:9400/_snapshot/es_hdfs_repository"
The response looks like this:

{
  "es_hdfs_repository": {
    "type": "hdfs",
    "settings": {
      "path": "elasticsearch/respositories/es_hdfs_repository",
      "uri": "hdfs://golive-master:8020/",
      "conf": {
        "dfs": {
          "client": {
            "read": {
              "shortcircuit": "true"
            }
          },
          "domain": {
            "socket": {
              "path": "/var/lib/hadoop-hdfs/dn_socket"
            }
          }
        }
      }
    }
  }
}

3. Create a snapshot

Create a snapshot of all indices:

curl -X PUT "172.16.221.105:9400/_snapshot/es_hdfs_repository/snapshot_1?wait_for_completion=true" -H 'Content-Type: application/json' -d'
{
"indices": "*"
}
'

Usually you will want the snapshot to run as a background process, but sometimes a script needs to wait until it has finished. Adding the wait_for_completion flag (wait_for_completion=true) does this: the call blocks until the snapshot is complete. Note that for large snapshots the call can take a long time to return.
https://www.elastic.co/guide/cn/elasticsearch/guide/current/backing-up-your-cluster.html
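
When the snapshot is started without `wait_for_completion`, a script can poll the snapshot until it leaves the `IN_PROGRESS` state instead. The following is a rough sketch under a few assumptions: `snapshot_state` and `wait_for_snapshot` are hypothetical helpers, the state is pulled out of the JSON with `grep`/`sed` rather than a real JSON parser, and the response is expected to describe a single snapshot.

```shell
# Extract the first "state" value (IN_PROGRESS, SUCCESS, PARTIAL, FAILED, ...)
# from a GET _snapshot/<repo>/<snapshot> JSON response read on stdin.
snapshot_state() {
  grep -o '"state"[[:space:]]*:[[:space:]]*"[A-Z_]*"' | head -n 1 | sed 's/.*"\([A-Z_]*\)"$/\1/'
}

# Poll until the snapshot leaves IN_PROGRESS; returns 0 only on SUCCESS.
# $1 = host:port, $2 = repository, $3 = snapshot, $4 = seconds between polls.
wait_for_snapshot() {
  while :; do
    state="$(curl -s "$1/_snapshot/$2/$3" | snapshot_state)"
    case "$state" in
      SUCCESS) return 0 ;;
      IN_PROGRESS) sleep "${4:-10}" ;;
      *) echo "snapshot ended in state: ${state:-unknown}" >&2; return 1 ;;
    esac
  done
}

# Example (not executed here):
#   wait_for_snapshot 172.16.221.105:9400 es_hdfs_repository snapshot_1 30
```

Polling keeps the HTTP calls short, which can be more robust than one long blocking request when a proxy or client timeout sits between the script and the cluster.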

4. Restore a snapshot

curl -X POST "172.16.221.105:9400/_snapshot/es_hdfs_repository/snapshot_1/_restore"
As with snapshots, the restore command returns immediately and the restore itself runs in the background. If you would rather have the HTTP call block until the restore has finished, add the wait_for_completion flag:

curl -X POST "172.16.221.105:9400/_snapshot/es_hdfs_repository/snapshot_1/_restore?wait_for_completion=true"

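I restored into a new cluster (a 6.3.2 cluster, with the matching repository-hdfs-6.3.2 plugin installed). Because the HDFS repository had not yet been registered in that cluster, the restore failed with an error saying the repository could not be found. After registering the repository there with the same create-repository command, the restore command worked. The official documentation puts it this way:

All that is required is registering the repository containing the snapshot in the new cluster and starting the restore process.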

English documentation: https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-snapshots.html
Chinese documentation: https://www.elastic.co/guide/cn/elasticsearch/guide/current/_restoring_from_a_snapshot.html
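
A restore does not have to cover the whole snapshot. The appendix of this article restores only a subset of indices by passing `indices` in the request body, and closes the existing indices first, since an index that already exists in the target cluster has to be closed before it can be restored over. As an alternative to closing, the restore API can rename indices on the way in. The sketch below builds such a request body; `restore_body` is a hypothetical helper and the `restored_` prefix is an arbitrary choice for illustration.

```shell
# Hypothetical helper: print a restore request body that restores only the
# given indices and gives each restored index a "restored_" prefix.
# $1 = comma-separated index patterns, e.g. "a*,l*"
restore_body() {
  cat <<EOF
{
  "indices": "$1",
  "include_global_state": false,
  "rename_pattern": "(.+)",
  "rename_replacement": "restored_\$1"
}
EOF
}

# Usage against the cluster from this article (not executed here):
#   curl -X POST "172.16.221.105:9400/_snapshot/es_hdfs_repository/snapshot_2/_restore" \
#     -H 'Content-Type: application/json' -d "$(restore_body 'a*,l*,m*,u*,i*')"

# Show the body that would be sent:
restore_body 'a*,l*,m*,u*,i*'
```

Restoring under a new name leaves the live indices untouched, so the restored data can be checked before anything is swapped over.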

5. Get snapshot information and status

To get a complete list of all snapshots in a repository, use the _all placeholder in place of a specific snapshot name:
curl -X GET "172.16.221.105:9400/_snapshot/es_hdfs_repository/_all"
Get the details of a single snapshot:
curl -X GET "172.16.221.105:9400/_snapshot/es_hdfs_repository/snapshot_2"
Get more detailed, per-shard status information for a snapshot:
curl -X GET "172.16.221.105:9400/_snapshot/es_hdfs_repository/snapshot_2/_status"
Official documentation:
https://www.elastic.co/guide/cn/elasticsearch/guide/current/backing-up-your-cluster.html
https://www.elastic.co/guide/en/elasticsearch/reference/5.6/modules-snapshots.html
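
For a quick health check of a repository, the plain-text `_cat/snapshots` API is easier to filter than the JSON responses above. This is a small sketch that lists every snapshot whose status is not `SUCCESS`; `unsuccessful_snapshots` is a hypothetical helper, and the rows in the offline illustration are invented.

```shell
# Print the ids of snapshots whose status is not SUCCESS.
# Reads "id status" rows on stdin, as produced by
#   curl -s "172.16.221.105:9400/_cat/snapshots/es_hdfs_repository?h=id,status"
unsuccessful_snapshots() {
  awk 'NF >= 2 && $2 != "SUCCESS" { print $1 }'
}

# Offline illustration; the rows are invented for the example.
printf '%s\n' 'snapshot_1 SUCCESS' 'snapshot_2 PARTIAL' | unsuccessful_snapshots
```

An empty result means every snapshot in the repository completed successfully; anything listed is worth inspecting with the `_status` call shown above.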

  • Appendix:

Below are the commands I used while backing up and restoring the data:

wget https://artifacts.elastic.co/downloads/elasticsearch-plugins/repository-hdfs/repository-hdfs-5.6.0.zip
elasticsearch-5.6.0/bin/elasticsearch-plugin install file:///data/elastic/repository-hdfs-5.6.0.zip

curl "172.16.221.104:9400/_cat/indices?v"
curl "172.16.221.104:9400/_cat/master?v"
curl "172.16.221.104:9400/_cat/master?help"
curl -X PUT "172.16.221.105:9400/_snapshot/es_hdfs_repository" -H 'Content-Type: application/json' -d'
{
"type": "hdfs",
"settings": {
"uri": "hdfs://golive-master:8020/",
"path": "elasticsearch/respositories/es_hdfs_repository",
"conf.dfs.client.read.shortcircuit": "true",
"conf.dfs.domain.socket.path": "/var/lib/hadoop-hdfs/dn_socket"
}
}
'
curl -X PUT "172.16.221.105:9400/_snapshot/es_hdfs_repository/snapshot_1?wait_for_completion=true" -H 'Content-Type: application/json' -d'
{
"indices": "*"
}
'
curl -X GET "172.16.221.105:9400/_snapshot/es_hdfs_repository"
curl -X GET "172.16.221.105:9400/_snapshot/es_hdfs_repository/snapshot_1/_status"

./bin/elasticsearch-plugin install https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v6.3.2/elasticsearch-analysis-ik-6.3.2.zip

https://artifacts.elastic.co/downloads/elasticsearch-plugins/repository-hdfs/repository-hdfs-6.3.2.zip

bin/elasticsearch-plugin install file:///data/elastic/repository-hdfs-6.3.2.zip
curl -X GET "172.16.221.105:9400/_snapshot/es_hdfs_repository/snapshot_2/_status"
curl 172.16.221.105:9400/_cat/master
curl 172.16.221.12:9400/_cat/nodes
curl -X POST "172.16.221.105:9400/_snapshot/es_hdfs_repository/snapshot_1/_restore"
curl -X POST "172.16.221.105:9400/_snapshot/es_hdfs_repository/snapshot_2/_restore" -H 'Content-Type: application/json' -d'
{
"indices": "a*,l*,m*,u*,i*"
}
'

https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-6.3.2.tar.gz
curl -X DELETE "172.16.221.105:9400/.kibana-6"
curl -X GET "172.16.221.105:9400/_cat/indices"
curl -X GET "172.16.221.105:9400/_snapshot/es_hdfs_repository/snapshot_2/_status"
curl -X POST "172.16.221.105:9400/a*,l*,m*,u*,i*/_close"
curl -X POST "172.16.221.105:9400/a*,l*,m*,u*,i*/_open"
curl -X GET "http://172.16.221.105:9400/ad_base?pretty"
curl -X GET "http://172.16.221.105:9400/_cluster/health?pretty"