TiDB Startup Troubleshooting Notes

Case 1

[Problem Description]

The following error is reported while the TiDB cluster is starting:

[FATAL] [main.go:111] ["run server failed"] [error="listen tcp 192.xxx.73.101:2380: bind: cannot assign requested address"]


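[Root Cause Analysis]

A network problem: the address in the error cannot be bound on this host, which typically means it is not configured on any local network interface.

[Solution]

1. Use the ping command to check whether the IP is reachable.

2. Use the telnet command to test whether the port is reachable.

3. Avoid mixing internal and external IPs in a TiDB cluster wherever possible.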
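
The two checks above can also be scripted. The following is a minimal sketch, not part of any TiDB tooling: the function names can_bind and port_reachable are hypothetical, and the host and port you pass in are whatever your own topology uses.

```python
import socket


def can_bind(host: str, port: int) -> bool:
    """Return True if a TCP socket can be bound to host:port on this machine.

    A False result covers several failures, including "cannot assign
    requested address" (IP not on a local interface) and "address in use".
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        try:
            s.bind((host, port))
            return True
        except OSError:
            return False


def port_reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds (telnet-style check)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Run can_bind on the node that fails to start, using the address from the error message, and run port_reachable from another node against a service that is already up.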

[Reference Cases]

PD port fails to start

https://asktug.com/t/pd/638

[Further Reading]

The ping command

https://blog.csdn.net/hebbely/article/details/54965989

The telnet command

https://blog.csdn.net/swazer_z/article/details/64442730

Case 2

[Problem Description]

The following error is reported while PD is starting:

[PANIC] [server.go:446] ["failed to recover v3 backend from snapshot"] [error="failed to find database snapshot file (snap: snapshot file doesn't exist)"]

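[Root Cause Analysis]

The server lost power, which caused data loss at the operating system level.

[Solution]

1. After a power loss, directories may become read-only. Ask the operations team to restore the read-only files at the operating system level.

2. If a PD node in the TiDB cluster cannot start, recovering it with the pd-recover tool is recommended:

https://pingcap.com/docs-cn/stable/reference/tools/pd-recover/#pd-recover-%25E4%25BD%25BF%25E7%2594%25A8%25E6%2596%2587%25E6%25A1%25A3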

[Reference Cases]

After a power outage, restarting the TiDB cluster fails when starting PD: two of the three PD nodes report errors

https://asktug.com/t/tidb-pd-3-pd/1369

[Further Reading]

EXT4 file system study (5): repairing a mount failure after power-loss data corruption. For reference only; these are not standard steps, and a failed fsck can itself corrupt data.

https://blog.csdn.net/TSZ0000/article/details/84664865

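Case 3

[Problem Description]

The following error is reported while TiKV is starting:

ERRO tikv-server.rs:155: failed to create kv engine: "Corruption: Sst file size mismatch: /data/tidb/deploy/data/db/67704904.sst. Size recorded in manifest 325143, actual size 0"

[Root Cause Analysis]

The server was restarted before the data had been synced to disk, so the SST file on disk (actual size 0) no longer matches the size recorded in the MANIFEST (325143).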
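
To make the mismatch concrete, here is a small sketch that extracts the file path and the two sizes from an error message of this shape and compares the recorded size with what is actually on disk. The regular expression is modelled only on the message quoted above, and the helper names are hypothetical; other versions may word the message differently.

```python
import os
import re

# Pattern modelled on the single message quoted in this case (an assumption).
MISMATCH_RE = re.compile(
    r"Sst file size mismatch: (?P<path>\S+\.sst)\. "
    r"Size recorded in manifest (?P<recorded>\d+), actual size (?P<actual>\d+)"
)


def parse_size_mismatch(message: str):
    """Return (path, recorded_size, actual_size), or None if the message does not match."""
    m = MISMATCH_RE.search(message)
    if m is None:
        return None
    return m.group("path"), int(m.group("recorded")), int(m.group("actual"))


def size_on_disk_matches(path: str, recorded: int) -> bool:
    """Return True if path is a regular file whose size equals the recorded size."""
    return os.path.isfile(path) and os.path.getsize(path) == recorded
```

This only helps confirm which file is affected; it does not repair anything.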

[Solution]

Take the node offline and then scale it back out; see the scale-in and scale-out procedure:

https://pingcap.com/docs-cn/stable/how-to/scale/with-ansible/

This issue is to be fixed in a later release:

https://github.com/tikv/tikv/pull/4807

[Reference Cases]

TiKV node cannot start

https://asktug.com/t/tikv/1375

[Further Reading]

RocksDB - MANIFEST

http://www.itdecent.cn/p/d1b38ce0d966

Case 4

[Problem Description]

The following error is reported while TiKV is starting:

2019/04/30 18:11:27.625 ERRO panic_hook.rs:104: thread 'raftstore-11' panicked '[region 125868] 323807 to_commit 181879 is out of range [last_index 181878]' at "/home/jenkins/.cargo/git/checkouts/raft-rs-841f8a6db665c5c0/b10d74c/src/raft_log.rs:248" stack backtrace:

[Root Cause Analysis]

"to_commit out of range" means that this peer is trying to commit a log entry that does not exist. It indicates that the most recent raft log entries were lost, either through a manual operation or through an abnormal event.
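
The check behind this message can be illustrated with a toy model. The class below is a hypothetical, heavily simplified stand-in for a Raft log and is not the raft-rs implementation; it only shows why a commit index beyond the last stored entry cannot be accepted.

```python
class ToyRaftLog:
    """Simplified stand-in for a Raft log (illustration only, not raft-rs)."""

    def __init__(self, last_index: int, committed: int = 0):
        self.last_index = last_index  # index of the last entry actually stored
        self.committed = committed    # highest index known to be committed

    def commit_to(self, to_commit: int) -> None:
        if to_commit <= self.committed:
            return  # the commit index never moves backwards
        if to_commit > self.last_index:
            # The entry to commit is not in the local log, so it was lost.
            raise RuntimeError(
                f"to_commit {to_commit} is out of range [last_index {self.last_index}]"
            )
        self.committed = to_commit
```

With the numbers from the log above, a log whose last_index is 181878 accepts commit_to(181878) but raises on commit_to(181879), because entry 181879 is missing locally.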

[Solution]

1. Use the tikv-ctl tool to locate the damaged Region, specifying the db directory (the directory of the damaged TiKV node).

2. Use tikv-ctl to repair the data.

2.1 The repair may fail with an error such as:

set_region_tombstone: StringError("The peer is still in target peers")

Setting a Region to tombstone with tikv-ctl requires checking the Region peer on the damaged node and cleaning it up manually: remove the abnormal peer first.

2.2 Then run the tikv-ctl repair again.

[Reference Cases]

What causes the TiKV error "ERRO panic_hook.rs:104"?

https://asktug.com/t/tikv-erro-panic-hook-rs-104/165

After a TiKV node went down, startup fails with "[region 32] 33 to_commit 405937 is out of range [last_index 405933]"

https://asktug.com/t/tikv-region-32-33-to-commit-405937-is-out-of-range-last-index-405933/1922

[Further Reading]

Raft log replication

http://www.itdecent.cn/p/b28e73eefa88

Case 5

[Problem Description]

The following error is reported while PD is starting:

FAILED - RETRYING: wait until the PD health page is available (12 retries left).

[Root Cause Analysis]

Abnormal IP address configuration.

[Solution]

1. Check whether a mix of internal and external IPs is making the nodes unreachable.

2. Check whether the PD IP address was changed. If so, PD can be handled by scaling in and scaling out.
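
The "wait until ... is available" message comes from a task that polls a health page a fixed number of times. The sketch below reproduces that pattern so the same probe can be run by hand from the control machine. The names wait_until and http_ok are hypothetical, and the URL you pass in is an assumption about your deployment, not something taken from the playbook.

```python
import time
import urllib.request
from typing import Callable


def http_ok(url: str, timeout: float = 2.0) -> bool:
    """Return True if an HTTP GET on url answers with status 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:  # covers URLError, HTTPError, refused connections, timeouts
        return False


def wait_until(check: Callable[[], bool], retries: int = 12, delay: float = 1.0) -> bool:
    """Call check() up to `retries` times, sleeping `delay` seconds between attempts."""
    for attempt in range(retries):
        if check():
            return True
        if attempt < retries - 1:
            time.sleep(delay)
    return False
```

For example, wait_until(lambda: http_ok(health_url)) with health_url set to the PD address you expect to be serving. If the probe fails against the configured IP but works against another address of the same host, the IP configuration is the likely culprit.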

[Reference Cases]

How to update the cluster after a node's IP changes

https://asktug.com/t/ip/1106

TiDB cluster fails to start

https://asktug.com/t/tidb/1563

[Further Reading]

TiDB Best Practices Series (2): Best Practices for PD Scheduling

https://pingcap.com/blog-cn/best-practice-pd/

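Case 6

[Problem Description]

TiDB cannot start, and tidb_stderr.log reports:

fatal error: runtime: out of memory

[Root Cause Analysis]

The kernel overcommit policy had been set to mode 2 (strict, no overcommit) with: echo 2 > /proc/sys/vm/overcommit_memory

[Solution]

Set it back to mode 0 (the default heuristic policy): echo 0 > /proc/sys/vm/overcommit_memory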
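
Before changing the setting, it can help to confirm the current mode. The sketch below reads the same /proc file and maps the value to the meaning documented for the Linux kernel; the function names are hypothetical, and the path parameter exists only so the reader can be pointed at a different file for testing.

```python
from pathlib import Path

# Meanings of vm.overcommit_memory as documented for the Linux kernel.
OVERCOMMIT_MODES = {
    0: "heuristic overcommit (kernel default)",
    1: "always overcommit",
    2: "never overcommit (strict accounting)",
}


def read_overcommit_mode(path: str = "/proc/sys/vm/overcommit_memory") -> int:
    """Read the current overcommit mode as an integer."""
    return int(Path(path).read_text().strip())


def describe_overcommit(mode: int) -> str:
    """Return a short description of an overcommit mode."""
    return OVERCOMMIT_MODES.get(mode, f"unknown mode {mode}")
```

On the affected host, describe_overcommit(read_overcommit_mode()) reporting strict accounting would match the cause described in this case.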

[Reference Cases]

TiDB went offline automatically after the memory usage policy was changed and then could not start

https://asktug.com/t/tidb/1716

[Further Reading]

The overcommit_memory issue on Linux

https://blog.csdn.net/houjixin/article/details/46412557

Case 7

[Problem Description]

The following error is reported while the TiDB cluster is starting:

Ansible FAILED! => playbook: start.yml; TASK: Check grafana API Key list; message: {"changed": false, "connection": "close", "content": "{\"message\":\"Invalid username or password\"}", "content_length": "42", "content_type": "application/json; charset=UTF-8", "date": "Wed, 25 Dec 2019 02:22:44 GMT", "json": {"message": "Invalid username or password"}, "msg": "Status code was 401 and not [200]: HTTP Error 401: Unauthorized", "redirected": false, "status": 401, "url": "http://192.168.179.112:3000/api/auth/keys"}

[Root Cause Analysis]

The Grafana password had been changed.

[Solution]

The Grafana username and password configured in inventory.ini must also be updated to match the new password.

[Reference Cases]

Error when starting the TiDB cluster

https://asktug.com/t/topic/2253

[Further Reading]

A complete breakdown of Grafana

http://www.itdecent.cn/p/7e7e0d06709b

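Case 8

[Problem Description]

The TiDB log reports the following error while the TiDB cluster is starting:

[error="[global:3]critical error write binlog failed, the last error no avaliable pump to write binlog"]

[Root Cause Analysis]

Caused by the state of Pump and Drainer: no Pump was available for TiDB to write binlog to.

[Solution]

Pump reported the error: fail to notify all living drainer: notify drainer. After the Drainer was started and then taken offline successfully, start.yml ran successfully.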

[Reference Cases]

The tidb service has started, but "wait until the TiDB port is up" fails

https://asktug.com/t/topic/2606

[Further Reading]

TiDB Binlog Overview

https://pingcap.com/docs-cn/stable/reference/tidb-binlog/overview/
