創(chuàng)建高性能索引

Indexing Basics

索引類型

B-TREE 索引

InnoDB使用的即是B-TREE索引。
存儲引擎以不同的方式使用B-TREE索引,性能也各有不同。例如,MyISAM使用前綴壓縮技術(shù)使得索引更小,但InnoDB則按照原數(shù)據(jù)格式進行存儲。再如MyISAM索引通過數(shù)據(jù)的物理位置引用被索引的行,而InnoDB則根據(jù)主鍵引用被索引的行。

B-TREE對索引列是順序組織存儲的,所以很適合查找范圍數(shù)據(jù)。下圖為B-TREE索引的抽象表示:


B-TREE Index.png

B-TREE索引能夠加快數(shù)據(jù)訪問的速度,因為存儲引擎不再需要進行全表掃描獲取需要的數(shù)據(jù)。

假設(shè)有下面這個數(shù)據(jù)表:

CREATE TABLE People ( 
last_name varchar(50) ,
first_name varchar(50) dob date,
not null, not null, not null,
gender enum('m', 'f')not null,
key(last_name, first_name, dob) );

The index will contain the values from the last_name, first_name, and dob columns for every row in the table.
!!!索引對多個值進行排序的依據(jù)是CREATE TABLE語句中定義索引時的順序。

Types of queries that can use a B-Tree index. B-Tree indexes work well for lookups by the full key value, a key range, or a key prefix. They are useful only if the lookup uses a leftmost prefix of the index.

Here are some limitations of B-Tree indexes:

  • They are not useful if the lookup does not start from the leftmost side of the indexed columns. For example, this index won’t help you find all people named Bill or all people born on a certain date, because those columns are not leftmost in the index. Likewise, you can’t use the index to find people whose last name ends with a par- ticular letter.
  • You can’t skip columns in the index. That is, you won’t be able to find all people whose last name is Smith and who were born on a particular date. If you don’t specify a value for the first_name column, MySQL can use only the first column of the index.
  • The storage engine can’t optimize accesses with any columns to the right of the first range condition. For example, if your query is WHERE last_name="Smith" AND first_name LIKE 'J%' AND dob='1976-12-23', the index access will use only the first two columns in the index, because the LIKE is a range condition (the server can use the rest of the columns for other purposes, though). For a column that has a limited number of values, you can often work around this by specifying equality conditions instead of range conditions. We show detailed examples of this in the indexing case study later in this chapter.

所以索引列的順序是非常的重要,這些限制都和索引列的順序有關(guān)。在優(yōu)化的時候可以使用相同的列但順序不同的索引來滿足快速查詢的需求。

哈希索引

A hash index is built on a hash table and is useful only for exact lookups that use every column in the index.
In MySQL, only the Memory storage engine supports explicit hash indexes. They are the default index type for Memory tables, though Memory tables can have B-Tree indexes, too.

Building your own hash indexes. If your storage engine doesn’t support hash indexes, you can emulate them yourself in a manner similar to that InnoDB uses. This will give you access to some of the desirable properties of hash indexes, such as a very small index size for very long keys.

An example of when this approach works well is for URL lookups. URLs generally cause B-Tree indexes to become huge, because they’re very long. You’d normally query a table of URLs like this:

mysql> SELECT id FROM url WHERE url="http://www.mysql.com";

But if you remove the index on the url column and add an indexed url_crc column to the table, you can use a query like this:

mysql> SELECT id FROM url WHERE url="http://www.mysql.com" -> AND url_crc=CRC32("http://www.mysql.com");

CRC32 is one of hash functions based on on the "polynomial" division idea. The CRC is acronym for Cyclic Redundancy Code (other variants instead "Code" is "Check" and "Checksum") algorithm. The number 32 is specifying the size of resulting hash value (checksum) - 32 bits. The checksum is used to detect errors after transmission or storage of any piece of information.

This works well because the MySQL query optimizer notices there’s a small, highly selective index on the url_crc column and does an index lookup for entries with that value (1560514994, in this case). Even if several rows have the same url_crc value, it’s very easy to find these rows with a fast integer comparison and then examine them to find the one that matches the full URL exactly. The alternative is to index the full URL as a string, which is much slower.

這樣做的一個缺陷是需要維護hash值,可以通過使用觸發(fā)器來較好的解決這個問題。

If you use this approach, you should not use SHA1() or MD5() hash functions. These return very long strings, which waste a lot of space and result in slower comparisons. They are cryptographically strong functions designed to virtually eliminate collisions, which is not your goal here. Simple hash functions can offer acceptable collision rates with better performance.

If your table has many rows and CRC32() gives too many collisions, implement your own 64-bit hash function. Make sure you use a function that returns an integer, not a string. One way to implement a 64-bit hash function is to use just part of the value returned by MD5().

空間數(shù)據(jù)索引(R-Tree)

全文索引

索引的優(yōu)點

Three main benefits proceed from these properties:

  1. Indexes reduce the amount of data the server has to examine.
  2. Indexes help the server avoid sorting and temporary tables.
  3. Indexes turn random I/O into sequential I/O.

高性能索引策略

獨立的列

“Isolating” the column means it should not be part of an expression or be inside a function in the query.如果查詢中得列不是獨立的,MySQL就不會使用索引。

前綴索引和索引選擇性

Index selectivity is the ratio of the number of distinct indexed values (the cardinality) to the total number of rows in the table (#T), and ranges from 1/#T to 1.

A prefix of the column is often selective enough to give good performance. If you’re indexing BLOB or TEXT columns, or very long VARCHAR columns, you must define prefix indexes, because MySQL disallows indexing their full length.

創(chuàng)建前綴索引

mysql> ALTER TABLE tableName ADD KEY (colName(len));

前綴索引的一個缺點是MySQL無法用其做orderBy和groupBy。

選擇合適的索引列順序

選擇索引列順序的經(jīng)驗法則: 將選擇性最高的列放在前面。

聚族索引

聚族索引并不是一種單獨的索引類型,而是一種數(shù)據(jù)存儲方式。
InnoDB將通過主鍵聚集數(shù)據(jù)。


cluster_index.png

覆蓋索引

An index that contains (or “covers”) all the data needed to satisfy a query is called a covering index.

使用索引掃描來做排序

壓縮(前綴壓縮)索引

MySQL需要單獨維護重復(fù)的索引,所以有性能上的考慮。重復(fù)索引是指在相同的列上按照相同的順序創(chuàng)建的相同類型的索引。

最后編輯于
?著作權(quán)歸作者所有,轉(zhuǎn)載或內(nèi)容合作請聯(lián)系作者
【社區(qū)內(nèi)容提示】社區(qū)部分內(nèi)容疑似由AI輔助生成,瀏覽時請結(jié)合常識與多方信息審慎甄別。
平臺聲明:文章內(nèi)容(如有圖片或視頻亦包括在內(nèi))由作者上傳并發(fā)布,文章內(nèi)容僅代表作者本人觀點,簡書系信息發(fā)布平臺,僅提供信息存儲服務(wù)。

相關(guān)閱讀更多精彩內(nèi)容

友情鏈接更多精彩內(nèi)容