學習資料:
https://shashankbaravani.medium.com/database-storage-engines-under-the-hood-705418dc0e35
https://blog.csdn.net/qq_31807385/article/details/113662819
核心問題:
了解存儲引擎使用的不同的索引結構的特點和場景
存儲引擎分類 日志結構存儲引擎 面向頁面的存儲引擎
1.Hash

無法進行范圍查詢&規(guī)模小 稠密索引 需要很大的內存存放稠密索引
比如Redis
2.SSTables和LSM Tree

壓縮和合并更快 支持分塊 天生更適合分布式
更好的利用Most Recent原則,多級結構
可以整塊加載到內存進行范圍查詢
更適合寫數(shù)量級高于讀的場景
寫 O(1) 讀 k*O(n) 塊數(shù)*每塊的大小
內存中啥memtable,需要WAL來恢復數(shù)據(jù)
SSTable的合并依賴于塊的大小以及數(shù)據(jù)的age
HBase and Cassandra是經(jīng)典的使用該索引的數(shù)據(jù)結構
3.B-Trees

B-Trees also maintain a sorted in memory map but the underlying file sizes are much smallerblocks of 4KB in line with underlying hardware instead of much larger segments of many MBs.
Thenon leaflevel nodes in the BST contain information around the range of of keys beneath each node. To arrive at a record, we start at therootnode and traverse downwards, breaking down the range at each step ad moving in the direction of the nodes containing the sub-range to which the key belongs.
Theleaflevel nodes either contain the data itself inline or contain references to page blocks which hold the record we are interested in. The number of references to child pages in one page of the B-tree is called thebranching factor(usually in 100s).
B樹的寫需要把整個Block加載到內存,然后整體更新;這導致了讀與寫放大
4.Inverted Index

In search indexes such as ElasticSearch, an index is composed of shards
and each shard is an further broken down into segments. Each segment is
an inverted index in itself and this is how it looks like: a collection
of document Ids and the term frequency.