1. Key parameters
create_if_missing: creates the database if it does not already exist.
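num_levels: number of levels; the default is 7. If L0 is 512MB, L1 is the same size, and each further level is 10 times larger than the previous one, six levels can hold 512MB + 512MB + 5GB + 50GB + 500GB + 5TB. With num_levels set to 7, the last level stays unused until the data volume exceeds the roughly 5TB computed above. With num_levels set to 6, the bottom level grows beyond 5TB once that volume is exceeded.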
max_background_flushes: number of concurrent threads that flush memtables to SST files. The default is 1. With too few threads, a heavy write load can cause writes to be blocked.
max_background_compactions: number of concurrent threads that compact SST files from one level into the next. Running compactions concurrently speeds compaction up. If compaction falls behind, writes are slowed down once the pending bytes reach soft_pending_compaction_bytes_limit and stopped once they reach hard_pending_compaction_bytes_limit.
max_write_buffer_number: maximum total number of memtables, active plus immutable. If writes arrive too fast or the flush threads are too slow, the memtable count reaches this limit and writes are blocked.
write_buffer_size: size of a single memtable. When a memtable reaches this size it is converted into an immutable memtable and a new memtable is created.
max_bytes_for_level_base: total size of L1. Setting L1 to the same size as L0 is recommended, because it makes L0->L1 compaction more efficient.
min_write_buffer_number_to_merge: minimum number of immutable memtables to merge before flushing. For example, with a value of 2, RocksDB does not flush when one memtable becomes immutable; it waits until at least 2 immutable memtables exist and then flushes them together. Raising this value can reduce the amount written to disk, because several memtables may contain the same key and merging before the flush avoids writing stale versions. The cost is on reads: when a key is not in the active memtable, RocksDB may have to search every immutable memtable, which hurts read performance.
level0_file_num_compaction_trigger: once L0 holds this many SST files, an L0->L1 compaction is triggered. The steady-state size of L0 is therefore write_buffer_size * min_write_buffer_number_to_merge * level0_file_num_compaction_trigger.
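statistics: collects performance and throughput statistics. Enabling statistics adds 5% to 10% overhead.
stats_dump_period_sec: interval, in seconds, at which statistics are dumped to the log.
compression_type: compression type.
bloom_filter_bits: number of bits used by the Bloom filter, which avoids unnecessary disk access.
lru_cache_size: size of the LRU cache.
max_open_files: maximum number of open file handles.
skip_stats_update_on_db_open: whether to skip the stats update when opening the DB. Setting it to false is recommended.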
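The capacity arithmetic behind num_levels and the steady-state L0 size can be checked with a few lines of code. The sketch below is a minimal illustration, not RocksDB code: LevelTargetSizes and SteadyStateL0Bytes are hypothetical helper names, and a fixed growth factor between levels is assumed (10 in the num_levels example, matching the 512MB, 5GB, 50GB progression).

```cpp
#include <cstdint>
#include <vector>

// Target size of each level L1..L(num_levels-1), assuming every level is
// `multiplier` times larger than the previous one. L0 is not included,
// because it is bounded by file count rather than by bytes.
std::vector<std::uint64_t> LevelTargetSizes(std::uint64_t level_base_bytes,
                                            std::uint64_t multiplier,
                                            int num_levels) {
  std::vector<std::uint64_t> sizes;
  std::uint64_t size = level_base_bytes;
  for (int level = 1; level < num_levels; ++level) {
    sizes.push_back(size);
    size *= multiplier;
  }
  return sizes;
}

// Approximate steady-state L0 size: the product of the three options
// named in the level0_file_num_compaction_trigger entry.
std::uint64_t SteadyStateL0Bytes(
    std::uint64_t write_buffer_size,
    std::uint64_t min_write_buffer_number_to_merge,
    std::uint64_t level0_file_num_compaction_trigger) {
  return write_buffer_size * min_write_buffer_number_to_merge *
         level0_file_num_compaction_trigger;
}
```

With a 512MB L1, a multiplier of 10 and num_levels = 6, the last entry is 5000GB, which is the roughly 5TB mentioned in the num_levels entry.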
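2. Write stall: common cases and solutions
(1) When flush or compaction cannot keep up with incoming writes, RocksDB activates a self-protection mechanism that delays or stops writes. The main cases are:
Write slowdown: if max_write_buffer_number is greater than 3 and the number of memtables waiting to be flushed is greater than or equal to max_write_buffer_number - 1, writes are throttled.
Write stop: if the number of memtables is greater than or equal to max_write_buffer_number, writes are stopped until a flush completes.
Write slowdown: if the number of L0 files reaches level0_slowdown_writes_trigger, writes are throttled.
Write stop: if the number of L0 files reaches level0_stop_writes_trigger, writes are stopped.
Write slowdown: if the amount of data waiting for compaction reaches soft_pending_compaction_bytes_limit, writes are throttled.
Write stop: if the amount of data waiting for compaction reaches hard_pending_compaction_bytes_limit, writes are stopped.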
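The six conditions above amount to a small decision function. The following is a simplified model of those rules, not RocksDB's actual implementation: StallLimits, EngineState and Classify are hypothetical names, and a single memtable count is used for both the slowdown and the stop rule.

```cpp
#include <cstdint>

enum class WriteState { kNormal, kDelayed, kStopped };

// Thresholds, named after the RocksDB options listed above.
struct StallLimits {
  int max_write_buffer_number;
  int level0_slowdown_writes_trigger;
  int level0_stop_writes_trigger;
  std::uint64_t soft_pending_compaction_bytes_limit;
  std::uint64_t hard_pending_compaction_bytes_limit;
};

// Hypothetical snapshot of the engine state; not a RocksDB type.
struct EngineState {
  int memtables;  // memtables not yet flushed (simplification)
  int level0_files;
  std::uint64_t pending_compaction_bytes;
};

WriteState Classify(const StallLimits& limits, const EngineState& state) {
  // Stop conditions take priority over slowdown conditions.
  if (state.memtables >= limits.max_write_buffer_number ||
      state.level0_files >= limits.level0_stop_writes_trigger ||
      state.pending_compaction_bytes >=
          limits.hard_pending_compaction_bytes_limit) {
    return WriteState::kStopped;
  }
  // The memtable slowdown rule only applies when
  // max_write_buffer_number is greater than 3.
  if ((limits.max_write_buffer_number > 3 &&
       state.memtables >= limits.max_write_buffer_number - 1) ||
      state.level0_files >= limits.level0_slowdown_writes_trigger ||
      state.pending_compaction_bytes >=
          limits.soft_pending_compaction_bytes_limit) {
    return WriteState::kDelayed;
  }
  return WriteState::kNormal;
}
```

In this model, a configuration with max_write_buffer_number = 3 never enters the memtable slowdown state; it goes straight from normal to stopped when the third memtable fills up.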
(2) When a write stall occurs, the following parameters can be adjusted, depending on the state of the system:
Increase max_background_flushes
Increase max_write_buffer_number
Increase max_background_compactions
Increase write_buffer_size
Increase min_write_buffer_number_to_merge
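Raising write_buffer_size or max_write_buffer_number also raises the memory that memtables can occupy, so it helps to estimate the footprint before changing them. The sketch below is a rough upper bound under the assumption that every column family fills all of its write buffers; MaxMemtableBytes is a hypothetical helper, not a RocksDB function.

```cpp
#include <cstdint>

// Rough upper bound on memtable memory: each column family can hold up to
// max_write_buffer_number memtables of write_buffer_size bytes each.
std::uint64_t MaxMemtableBytes(std::uint64_t write_buffer_size,
                               std::uint64_t max_write_buffer_number,
                               std::uint64_t column_families) {
  return write_buffer_size * max_write_buffer_number * column_families;
}
```

For example, 64MB buffers with max_write_buffer_number = 3 and a single column family give an upper bound of 192MB; doubling both values raises it to 768MB.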
3. Recommended configuration examples
Storage medium: flash
options.compaction_style = kCompactionStyleLevel;
options.write_buffer_size = 67108864; // 64MB
options.max_write_buffer_number = 3;
options.target_file_size_base = 67108864; // 64MB
options.max_background_compactions = 4;
options.level0_file_num_compaction_trigger = 8;
options.level0_slowdown_writes_trigger = 17;
options.level0_stop_writes_trigger = 24;
options.num_levels = 4;
options.max_bytes_for_level_base = 536870912; // 512MB
options.max_bytes_for_level_multiplier = 8;
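Two properties of the flash configuration are worth checking: the three L0 triggers should be strictly increasing, and the L0 size at the compaction trigger should match max_bytes_for_level_base, as recommended in the parameter list. The sketch below uses a plain stand-in struct, FlashConfig, rather than rocksdb::Options, and assumes each L0 file is one flushed memtable (min_write_buffer_number_to_merge = 1); all names in it are hypothetical.

```cpp
#include <cstdint>

// Stand-in for the fields used above; this is not rocksdb::Options.
struct FlashConfig {
  std::uint64_t write_buffer_size = 67108864;           // 64MB
  int level0_file_num_compaction_trigger = 8;
  int level0_slowdown_writes_trigger = 17;
  int level0_stop_writes_trigger = 24;
  std::uint64_t max_bytes_for_level_base = 536870912;   // 512MB
};

// Compaction should start before writes slow down, and writes should slow
// down before they stop.
bool TriggersAreOrdered(const FlashConfig& c) {
  return c.level0_file_num_compaction_trigger <
             c.level0_slowdown_writes_trigger &&
         c.level0_slowdown_writes_trigger < c.level0_stop_writes_trigger;
}

// L0 size when compaction is triggered, assuming one memtable per L0 file.
std::uint64_t L0BytesAtTrigger(const FlashConfig& c) {
  return c.write_buffer_size *
         static_cast<std::uint64_t>(c.level0_file_num_compaction_trigger);
}
```

With the values above, 8 files of 64MB give 512MB, which equals max_bytes_for_level_base, so L0 and L1 are sized alike.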
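Fully in-memory
options.allow_mmap_reads = true; // read SST files through mmap
BlockBasedTableOptions table_options;
table_options.filter_policy.reset(NewBloomFilterPolicy(10, true)); // Bloom filter, 10 bits per key
table_options.no_block_cache = true; // data is already in memory, so skip the block cache
table_options.block_restart_interval = 4;
options.table_factory.reset(NewBlockBasedTableFactory(table_options));
options.level0_file_num_compaction_trigger = 1; // compact as soon as one L0 file exists
options.max_background_flushes = 8;
options.max_background_compactions = 8;
options.max_subcompactions = 4;
options.max_open_files = -1; // no limit; keep all files open
ReadOptions read_options;
read_options.verify_checksums = false; // skip checksum verification on reads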