MySQL relaylog切換導(dǎo)致業(yè)務(wù)響應(yīng)身高

背景

MySQL從5.7.17升級(jí)到MySQL5.7.32后,業(yè)務(wù)反饋隔一段時(shí)間就有幾百筆交易調(diào)用數(shù)據(jù)庫(kù)響應(yīng)超過(guò)200ms觸發(fā)報(bào)警

環(huán)境配置

主從復(fù)制類型:半同步復(fù)制
mysql_version:oracle mysql-5.7.32
max_binlog_size = 1G
max_relay_log_size = 1G

原因

mysql-5.7.25-relnotes:
The value returned by a [`SHOW SLAVE STATUS`](https://docs.oracle.com/cd/E17952_01/mysql-5.7-en/show-slave-status.html) statement for the total combined size of all existing relay log files (`Relay_Log_Space`) could become much larger than the actual disk space used by the relay log files. 
The I/O thread did not lock the variable while it updated the value, so the SQL thread could automatically delete a relay log file and write a reduced value before the I/O thread finished updating the value. 
The I/O thread then wrote its original size calculation, ignoring the SQL thread's update and so adding back the space for the deleted file. The `Relay_Log_Space` value is now locked during updates to prevent concurrent updates and ensure an accurate calculation. (Bug #26997096, Bug #87832)

MySQL為了解決 [Bug #87832] show slave status不準(zhǔn)的問(wèn)題,在io線程寫relaylog的時(shí)候增加了一把log_space_lock日志空間鎖,當(dāng)relaylog切換的時(shí)候SQL線程也會(huì)持有該鎖,當(dāng)relaylog切換的時(shí)候整個(gè)流程如下:
1.sql thread get the log_space_lock
......
2.刪除relaylog物理文件并更改當(dāng)前relaylog占用總空間
mysql_file_delete(key_file_binlog, log_info.log_file_name, MYF(0)) function delete the relay_log file
if (decrease_log_space)
    *decrease_log_space-= s.st_size;
......
3.sql thread release the log_space_lock

io thread 在1-3期間無(wú)法寫入 event(寫event需要獲取日志空間鎖)無(wú)法給主庫(kù)返回ack導(dǎo)致業(yè)務(wù)響應(yīng)升高
該問(wèn)題相當(dāng)于解決一個(gè)bug又引入了另外一個(gè)bug,對(duì)應(yīng)bug已經(jīng)提交給官方,官方已經(jīng)確認(rèn)(https://bugs.mysql.com/bug.php?id=103943)

為什么刪除1G文件需要200ms左右

文件系統(tǒng)存在緩存,當(dāng)刪除此盤文件的時(shí)候需要將cache中的數(shù)據(jù)先清理出去,然后刪除磁盤文件,類似于數(shù)據(jù)庫(kù)drop 表時(shí)清理bufferpool,經(jīng)測(cè)試1G文件不清空緩存刪除文件150ms,刪除緩存后再刪除文件6ms

如何解決這個(gè)問(wèn)題

方案一
給relaylog做硬連接,讓mysql清理relaylog時(shí)只刪除inode,剩下的垃圾文件寫腳本刪除
方案2
更改mysql源碼,將鎖范圍縮小
1.刪除relaylog物理文件
mysql_file_delete(key_file_binlog, log_info.log_file_name, MYF(0)) function delete the relay_log file
2.sql thread get the log_space_lock
3.更改當(dāng)前relaylog占用總空間
if (decrease_log_space)
    *decrease_log_space-= s.st_size;
4.sql thread release the log_space_lock
最后編輯于
?著作權(quán)歸作者所有,轉(zhuǎn)載或內(nèi)容合作請(qǐng)聯(lián)系作者
【社區(qū)內(nèi)容提示】社區(qū)部分內(nèi)容疑似由AI輔助生成,瀏覽時(shí)請(qǐng)結(jié)合常識(shí)與多方信息審慎甄別。
平臺(tái)聲明:文章內(nèi)容(如有圖片或視頻亦包括在內(nèi))由作者上傳并發(fā)布,文章內(nèi)容僅代表作者本人觀點(diǎn),簡(jiǎn)書系信息發(fā)布平臺(tái),僅提供信息存儲(chǔ)服務(wù)。

友情鏈接更多精彩內(nèi)容