寫(xiě)在前面
? ? ? ?網(wǎng)上很多的文章都是教科書(shū)式的說(shuō)教,缺乏實(shí)用價(jià)值。這也是筆者想寫(xiě)此系列文章的初衷,希望把實(shí)際工作的實(shí)戰(zhàn)經(jīng)驗(yàn)分享給大家,幫助大家解決實(shí)際問(wèn)題。后續(xù)的一系列文章都是筆者在實(shí)際工作遇到的問(wèn)題,比較具有代表性,從實(shí)戰(zhàn)的角度進(jìn)行分析總結(jié),希望能夠給大家?guī)?lái)幫助。
關(guān)鍵字:數(shù)據(jù)庫(kù)、Mysql、Mybatis
一、問(wèn)題背景
? ? ? ?用過(guò)Mysql的同學(xué),可能都遇到過(guò)下面這種情況。比如一張訂單表order里面定義了一個(gè)varchar的字段,叫做order_id,當(dāng)我們?cè)诒碇胁樵?xún)某條記錄時(shí),我們使用了如下的方式:
select * from order where order_id = 123456;
? ? ? ?顯然我們忘記了給123456加上單引號(hào),但是這個(gè)查詢(xún)是有效的,而且查到我們想要的數(shù)據(jù)。明明order_id是個(gè)varchar類(lèi)型,應(yīng)該使用order_id = '123456'這種方式明確指定為字符串類(lèi)型,為什么使用123456這樣的數(shù)字類(lèi)型也是ok的呢?這就涉及到了Mysql隱式類(lèi)型轉(zhuǎn)換的問(wèn)題,在MySQL中,當(dāng)操作符與不同類(lèi)型的操作數(shù)一起使用時(shí),會(huì)發(fā)生類(lèi)型轉(zhuǎn)換以使操作數(shù)相互兼容。
? ? ? ?哦!原來(lái)這樣也可以?。∵€省去了寫(xiě)單引號(hào)的麻煩,不錯(cuò)不錯(cuò)!如果你平時(shí)偷懶在Navicat或者M(jìn)ysql Console里面用這種方式寫(xiě)幾個(gè)查詢(xún)語(yǔ)句也就算了,一旦這樣的方式進(jìn)入到生產(chǎn)環(huán)境的代碼里面,那你可給自己挖了個(gè)大坑哦!
二、問(wèn)題分析
? ? ? ?看問(wèn)題不能只停留在表面,Mysql這樣做確實(shí)給平時(shí)的操作帶來(lái)了一些方便之處,但是如果不了解其背后的機(jī)理,也給程序帶來(lái)了不小的隱患。
? ? ? ?當(dāng)然,我們對(duì)待知識(shí)的態(tài)度是要知其然,更要知其所以然。這種問(wèn)題很容易了解清楚,Mysql參考手冊(cè)翻出來(lái)查一查,立馬搞定。以下內(nèi)容摘自MySQL 5.7 Reference Manual:
The following rules describe how conversion occurs for comparison operations:
If one or both arguments are NULL, the result of the comparison is NULL, except for the NULL-safe <=> equality comparison operator. For NULL <=> NULL, the result is true. No conversion is needed.
If both arguments in a comparison operation are strings, they are compared as strings.
If both arguments are integers, they are compared as integers.
Hexadecimal values are treated as binary strings if not compared to a number.
If one of the arguments is a TIMESTAMP or DATETIME column and the other argument is a constant, the constant is converted to a timestamp before the comparison is performed. This is done to be more ODBC-friendly. This is not done for the arguments to IN(). To be safe, always use complete datetime, date, or time strings when doing comparisons. For example, to achieve best results when using BETWEEN with date or time values, use CAST() to explicitly convert the values to the desired data type.
A single-row subquery from a table or tables is not considered a constant. For example, if a subquery returns an integer to be compared to a DATETIME value, the comparison is done as two integers. The integer is not converted to a temporal value. To compare the operands as DATETIME values, use CAST() to explicitly convert the subquery value to DATETIME.
If one of the arguments is a decimal value, comparison depends on the other argument. The arguments are compared as decimal values if the other argument is a decimal or integer value, or as floating-point values if the other argument is a floating-point value.
In all other cases, the arguments are compared as floating-point (real) numbers.
? ? ? ?以上是官方文檔中關(guān)于隱式轉(zhuǎn)化規(guī)則的描述,不翻譯,英文功底過(guò)硬是對(duì)一個(gè)碼農(nóng)最基本的要求。主要是筆者翻譯水平有限,一翻譯就走樣。就煩國(guó)內(nèi)有些人翻譯的文章,本來(lái)人家英文寫(xiě)的通俗易懂、言簡(jiǎn)意賅,被他們一翻譯,狗屁不通、晦澀難懂,還在那里沾沾自喜、自以為是。
? ? ? ?關(guān)于Mysql類(lèi)型轉(zhuǎn)換的規(guī)則與例子,請(qǐng)查看參考文獻(xiàn)1,這里不做過(guò)多解釋。這里只對(duì)最后一條進(jìn)行一下說(shuō)明,“所有其他情況下,兩個(gè)參數(shù)都會(huì)被轉(zhuǎn)換為浮點(diǎn)數(shù)再進(jìn)行比較”,這條很重要,如果不能深入理解,將會(huì)帶來(lái)很多意想不到的問(wèn)題發(fā)生,切記切記!
? ? ? ?有的同學(xué)說(shuō)了,類(lèi)型轉(zhuǎn)換就轉(zhuǎn)換唄!這不挺方便的嗎!其實(shí)不然,這種隱式類(lèi)型轉(zhuǎn)換一來(lái)會(huì)帶來(lái)性能問(wèn)題,二來(lái)也存在安全問(wèn)題。實(shí)際上官方文檔中已經(jīng)說(shuō)的很明白了:
For comparisons of a string column with a number, MySQL cannot use an index on the column to look up the value quickly. If str_col is an indexed string column, the index cannot be used when performing the lookup in the following statement:
SELECT * FROM tbl_name WHERE str_col=1;
The reason for this is that there are many different strings that may convert to the value 1, such as '1', ' 1', or '1a'.
? ? ? ?看到了吧!如果你用一個(gè)數(shù)字與string(varchar)字段進(jìn)行比較的話(huà),那就無(wú)法使用索引了,這可是極大的性能隱患。下面用幾個(gè)例子直觀地說(shuō)明一下。
? ? ? ?還是訂單表order里面定義了一個(gè)varchar的字段,叫做order_id,并且order_id字段上創(chuàng)建索引idx_order_id,order 表里面大約有14萬(wàn)條數(shù)據(jù)。
select count(*) from order;
count(*)
142662
規(guī)范的使用方式如下,看看它的執(zhí)行時(shí)間。
select * from order where order_id = '219052918283139700160';
0.037s elapsed
那不規(guī)范的使用方式呢!
select * from order where order_id = 219052918283139700160;
0.345s elapsed
? ? ? ?大約有十倍左右的性能差異。看看執(zhí)行計(jì)劃有什么不同,上EXPLAIN神器!
Explain
select * from order where order_id = '219052918283139700160';
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE order ref idx_order_id idx_order_id 202 const 1 Using index condition
? ? ? ?這是規(guī)范的使用方式,type為ref,使用了索引idx_order_id,沒(méi)問(wèn)題!再看看不規(guī)范用法的執(zhí)行計(jì)劃。
Explain
select * from order where order_id = 219052918283139700160;
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE order ALL idx_order_id 135860 Using where
? ? ? ?我去!完蛋了!type為All,全表掃描,災(zāi)難啊!再看看更詳細(xì)的內(nèi)容,使用Explain EXTENDED和SHOW WARNINGS。
Explain EXTENDED
select * from order where order_id = 219052918283139700160;
SHOW WARNINGS;
id select_type table type possible_keys key key_len ref rows filtered Extra
1 SIMPLE order ALL idx_order_id 135864 100.00 Using where
Level Code Message
Warning 1739 Cannot use ref access on index 'idx_order_id' due to type or collation conversion on field 'order_id'
Warning 1739 Cannot use range access on index 'idx_order_id' due to type or collation conversion on field 'order_id'
Note 1003 /* select#1 */ select `xx`.`order`.`id` AS `id`,`xx`.`order`.`is_delete` AS `is_delete`,`xx`.`order`.`create_time` AS `create_time`,`xx`.`order`.`update_time` AS `update_time`,`xx`.`order`.`order_id` AS `order_id`,`xx`.`order`.`req_param` AS `req_param`,`xx`.`order`.`source` AS `source` from `xx`.`order` where (`xx`.`order`.`order_id` = 219052918283139700160)
進(jìn)一步證明無(wú)法使用索引。
? ? ? ?另外,對(duì)于隱式類(lèi)型轉(zhuǎn)換存在的安全性問(wèn)題,主要有兩方面:一是容易被sql注入攻擊、二是查詢(xún)/刪除/更新時(shí)會(huì)錯(cuò)誤命中多余數(shù)據(jù)。
? ? ? ?先說(shuō)sql注入的情況,例如下面這條語(yǔ)句:
SELECT * FROM users WHERE username = '?' AND password = '?';
? ? ? ?如果password輸入的是a' OR 1='1,那么username隨便輸入,這樣就生成了下面的查詢(xún):
SELECT * FROM users WHERE username = 'xxx' AND password = 'a' OR 1='1';
? ? ? ?由于or的優(yōu)先級(jí)最低,and次之,所以這條語(yǔ)句其實(shí)等同于:
SELECT * FROM users WHERE (username = 'xxx' AND password = 'a') OR 1='1';
結(jié)果1='1'為true。
? ? ? ?接著說(shuō)第二種情況,繼續(xù)看下面的例子:
mysql> select * from test;
+----+-------+-----------+
| id | name | password |
+----+-------+-----------+
| 1 | test1 | password1 |
| 2 | test2 | password2 |
| 3 | 12 | ddd |
| 4 | 12a | bbb |
+----+-------+-----------+
6 rows in set (0.00 sec)
mysql> select * from test where name = 12;
+----+-------+----------+
| id | name | password |
+----+-------+----------+
| 3 | 12 | ddd |
| 4 | 12a | bbb |
+----+-------+----------+
2 rows in set, 5 warnings (0.00 sec)
mysql> select * from test where name = '12';
+----+------+----------+
| id | name | password |
+----+------+----------+
| 3 | 12 | ddd |
+----+------+----------+
1 row in set (0.00 sec)
? ? ? ?本意是查詢(xún)id為3的那一條記錄,結(jié)果把id為4的那一條也查詢(xún)出來(lái)了,刪除和更新操作也存在同樣的問(wèn)題。
三、使用建議
? ? ? ?其實(shí)使用建議也很簡(jiǎn)單樸實(shí),避免隱式類(lèi)型轉(zhuǎn)換,是什么類(lèi)型就用什么類(lèi)型的條件。字段是數(shù)值類(lèi)型,就用同樣的數(shù)值類(lèi)型比較;字段是varchar,那就別圖省事,一定要叫上單引號(hào)。
? ? ? ?實(shí)際上在我之前的一篇文章(參考文獻(xiàn)3)中提到過(guò)隱式自動(dòng)轉(zhuǎn)換可能存在的問(wèn)題,但沒(méi)有展開(kāi)討論。再回想到上篇文章中提到的My batis的兩種傳參方式${}與#{},#{}這種方式能夠很大程度的防止sql注入,而${}則無(wú)法防止sql注入。想當(dāng)黑客,還是得了解很多技術(shù)細(xì)節(jié)的。
四、參考文獻(xiàn)
- MySQL 5.7 Reference Manual - https://dev.mysql.com/doc/refman/5.7/en/type-conversion.html
- MySQL隱式轉(zhuǎn)化整理 - https://www.cnblogs.com/rollenholt/p/5442825.html
- 實(shí)戰(zhàn)系列:(二)一條被Mybatis誤導(dǎo)的sql語(yǔ)句 - http://www.itdecent.cn/p/21802f7118b6
? ? ? ?? ? ? ?? ? ? ?? ? ? ?? ? ? ?? ? ? ?? ? ? ?? ? ? ?? ? ? ?? ? ? ?? ? ? ?2019年7月10日 星期三 于北京至唐山途中