猜想
????????MySQL哪種索引快?聚簇索引 or 非聚簇索引?
試驗(yàn)用表
# 這是一個(gè)普通的表小t
CREATE TABLE `t` (
`id` int(10) NOT NULL COMMENT '大家好,我是主鍵',
`a` int(11) DEFAULT NULL COMMENT '大家好,我是字段a,我是一個(gè)普通索引',
`b` int(11) DEFAULT NULL COMMENT '我是一個(gè)字段,我沒有索引',
PRIMARY KEY (`id`),
KEY `a` (`a`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COMMENT='大家好,我是表`t`'
起因
????????在某個(gè)風(fēng)和日麗的下午,大家伙正在溝通業(yè)務(wù)的時(shí)候,引發(fā)了一場(chǎng)“辯論”:主鍵查詢會(huì)比普通索引查詢快嗎?
- 正方:以primary key為條件查詢快
- 反方:效率一樣
淺談索引
????????MySQL的索引主要兩類,聚簇索引(也叫主鍵索引、聚集索引)和非聚簇索引(也叫非聚集索引、二級(jí)索引、普通索引)。目前MySQL innoDB引擎的索引,主要使用B+樹結(jié)構(gòu),B+樹的深度低,葉子節(jié)點(diǎn)多,可以有效減少磁盤IO,提升性能。
MySQL的索引結(jié)構(gòu)
????????MySQL中每張表的數(shù)據(jù),也以B+樹的結(jié)構(gòu)保存。同為B+樹結(jié)構(gòu),它與索引的區(qū)別是:葉子節(jié)點(diǎn)保存的是主鍵 + 數(shù)據(jù),而普通索引保存的是索引值 + 主鍵,如下圖所示。
主鍵索引數(shù)據(jù)結(jié)構(gòu)(下圖)

普通索引(下圖)

查詢與回表
????????那么MySQL查詢是如何工作的呢?我想聰明的你已經(jīng)知道了答案。
????????如果我們通過主鍵查詢(比如下面SQL語句),那么innoDB引擎會(huì)在保存聚簇索引的這顆B+樹中,尋找id = 4的節(jié)點(diǎn),然后通過指針直接獲取行數(shù)據(jù)。
# 根據(jù)聚集索引查詢數(shù)據(jù),一槍頭
select * from t where id = 4;
????????如果我們通過普通查詢(比如下面SQL語句),那么innoDB引擎會(huì)在保存普通索引的這顆B+樹中,尋找a = 'd'的節(jié)點(diǎn),然后在葉子節(jié)點(diǎn)中獲取到主鍵值,再通過主鍵去查詢真實(shí)數(shù)據(jù),這個(gè)過程就叫做回表,顯然它比通過聚簇索引去查詢,多一步。
# 根據(jù)普通索引查詢數(shù)據(jù),回表一次
select * from t where a = 'd';
????????這里有個(gè)小知識(shí)點(diǎn),如果我們只需要查詢a字段的值,那么普通索引的葉子節(jié)點(diǎn)已經(jīng)包含了結(jié)果,就不需要回表了。學(xué)名叫做——索引覆蓋。
實(shí)操
????????鑒于前排的理論知識(shí),我們來進(jìn)行一番實(shí)操,步驟如下:
- 環(huán)境:本機(jī)docker mysql:5.6.47
- 版本:MySQL 5.6.47
- init table
- init data
- have a try
# step 1 init table
CREATE TABLE `t` (
`id` int(10) NOT NULL COMMENT '你好,我是主鍵',
`a` int(11) DEFAULT NULL COMMENT '大家好,我是字段a,我是一個(gè)普通索引',
`b` int(11) DEFAULT NULL COMMENT '我是一個(gè)字段,我沒有索引',
PRIMARY KEY (`id`),
KEY `a` (`a`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COMMENT='大家好,我是表`t`'
# step 2 init data
# 插入100w數(shù)據(jù)
DROP PROCEDURE if exists idata;
delimiter ;;
create procedure idata()
begin
declare i int;
set i=1;
while(i<=100000)do
insert into t values(i, i, i);
set i=i+1;
end while;
end;;
delimiter ;
call idata();
# step 3 test
語句1:select * from t where id < 100000; -- 0.10s
語句2:select * from t where a < 100000; -- 0.30s
????????結(jié)果體現(xiàn)出,主鍵索引更快,而且執(zhí)行速度差距有點(diǎn)大,我們來看看執(zhí)行計(jì)劃。
mysql> explain select * from t where id < 100000;
+----+-------------+-------+-------+---------------+---------+---------+------+--------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+---------------+---------+---------+------+--------+-------------+
| 1 | SIMPLE | t | range | PRIMARY | PRIMARY | 4 | NULL | 200664 | Using where |
+----+-------------+-------+-------+---------------+---------+---------+------+--------+-------------+
1 rows in set (0.02 sec)
mysql> explain select * from t where a < 100000;
+----+-------------+-------+------+---------------+------+---------+------+--------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------+------+---------+------+--------+-------------+
| 1 | SIMPLE | t | ALL | a | NULL | NULL | NULL | 998222 | Using where |
+----+-------------+-------+------+---------------+------+---------+------+--------+-------------+
1 rows in set (0.02 sec)
????????通過執(zhí)行計(jì)劃發(fā)現(xiàn),字段a竟然沒有命中索引,執(zhí)行了全表掃描!
????????==出現(xiàn)這種情況,簡(jiǎn)單來說,是由于優(yōu)化器判定掃描行數(shù)過多,而且通過索引a查詢我們想要的數(shù)據(jù),每次都需要通過再次查詢主鍵索引,代價(jià)過高,所以放棄使用索引a。==
????????那么我們可以強(qiáng)制語句走索引a,再試試查詢效果。顯然這次使用到了索引。
mysql> explain select * from t force index(a) where a < 100000;
+----+-------------+-------+-------+---------------+------+---------+------+--------+-----------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+---------------+------+---------+------+--------+-----------------------+
| 1 | SIMPLE | t | range | a | a | 5 | NULL | 205002 | Using index condition |
+----+-------------+-------+-------+---------------+------+---------+------+--------+-----------------------+
1 rows in set (0.09 sec)
????????再來執(zhí)行一遍試試看,果然,執(zhí)行效率上來說:主鍵索引 > 普通索引 > 全表掃描,附上分析細(xì)節(jié),重點(diǎn)關(guān)注Sending data。
select * from t where id < 100000; -- 0.10s
select * from t where a < 100000; -- 0.25s
select * from t force index(a) where a < 100000; -- 0.13s
mysql> show profiles;
+----------+------------+--------------------------------------------------+
| Query_ID | Duration | Query |
+----------+------------+--------------------------------------------------+
| 1 | 0.00049450 | set profiling = 1 |
| 2 | 0.09794425 | select * from t where id < 100000 |
| 3 | 0.25010650 | select * from t where a < 100000 |
| 4 | 0.12352975 | select * from t force index(a) where a < 100000 |
+----------+------------+--------------------------------------------------+
4 rows in set (0.02 sec)
mysql> show profile for query 2;
+----------------------+----------+
| Status | Duration |
+----------------------+----------+
| starting | 0.000086 |
| checking permissions | 0.000013 |
| Opening tables | 0.000077 |
| init | 0.000042 |
| System lock | 0.000042 |
| optimizing | 0.000028 |
| statistics | 0.000109 |
| preparing | 0.000035 |
| executing | 0.000017 |
| Sending data | 0.097321 |
| end | 0.000079 |
| query end | 0.000015 |
| closing tables | 0.000018 |
| freeing items | 0.000042 |
| cleaning up | 0.000021 |
+----------------------+----------+
15 rows in set (0.05 sec)
mysql> show profile for query 3;
+----------------------+----------+
| Status | Duration |
+----------------------+----------+
| starting | 0.000121 |
| checking permissions | 0.000019 |
| Opening tables | 0.000037 |
| init | 0.000030 |
| System lock | 0.000027 |
| optimizing | 0.000019 |
| statistics | 0.000056 |
| preparing | 0.000022 |
| executing | 0.000014 |
| Sending data | 0.249550 |
| end | 0.000027 |
| query end | 0.000015 |
| closing tables | 0.000023 |
| freeing items | 0.000115 |
| cleaning up | 0.000034 |
+----------------------+----------+
15 rows in set (0.05 sec)
mysql> show profile for query 4;
+----------------------+----------+
| Status | Duration |
+----------------------+----------+
| starting | 0.000072 |
| checking permissions | 0.000017 |
| Opening tables | 0.000037 |
| init | 0.000029 |
| System lock | 0.000028 |
| optimizing | 0.000019 |
| statistics | 0.000075 |
| preparing | 0.000023 |
| executing | 0.000014 |
| Sending data | 0.123024 |
| end | 0.000027 |
| query end | 0.000012 |
| closing tables | 0.000017 |
| freeing items | 0.000052 |
| cleaning up | 0.000085 |
+----------------------+----------+
15 rows in set (0.01 sec)
????????經(jīng)過以上分析,開始時(shí)的“辯論”,我想你心中已有答案。
完
?
?
?
翻車
????????其實(shí)到這里,故事應(yīng)該已經(jīng)結(jié)束了,本著熟能生巧的原則,我去測(cè)試服務(wù)器上再實(shí)操了一遍,卻出現(xiàn)了意想不到的結(jié)果。
環(huán)境
1. 環(huán)境:阿里云RDS
2. 版本:MySQL 5.6.16-log
現(xiàn)象
語句1:select * from t where id < 100000; -- 0.50s
語句2:select * from t where a < 100000; -- 0.51s
語句3:select * from t force index (a) where a < 100000; -- 0.42s
- 全表掃描和主鍵查詢時(shí)間差不多??
- 普通索引查詢比主鍵查詢還要快??

????????同樣的MySQL,同樣的配方,結(jié)果讓人大跌眼鏡!
為什么呢?
????????經(jīng)過一頓分析,并沒有發(fā)現(xiàn)云rds與本機(jī)MySQL有什么區(qū)別。目前我正在尋找這個(gè)問題的答案,并且正在與阿里云官方積極溝通中,如果你知道其中原委,還請(qǐng)留言告知,不吝賜教!