之前写过关于mysql取随机数的记录,当时没有去分析表里数据量的分布情况,所以当执行计划不同时,实际的效果有非常大的差别。
[cc lang='sql']dbaeyes 06:47:52>select count(*) ,status from act_info group by status;
+----------+--------+
| count(*) | status |
+----------+--------+
|   989978 |      0 |
|       20 |      1 |
|        1 |      2 |
|        1 |      3 |
+----------+--------+[/cc]
表中的status的大部分数据都是0(初始化状态)
用之前文章中提到的方式 索引 idx_info_status(status,id)
[cc lang='sql']dbaeyes 06:53:30>explain SELECT t1.*
    -> FROM
    -> (SELECT ROUND(RAND() * (
    -> (SELECT MAX(id) FROM act_info WHERE status = 0)-
    -> (SELECT MIN(id) FROM act_info WHERE status = 0))
    -> +(SELECT MIN(id) FROM act_info WHERE status = 0)) AS id) AS t2,act_info AS t1
    -> WHERE t1.id >= t2.id AND t1.status = 0
    -> ORDER BY t1.id LIMIT 3 ;
+----+-------------+------------+--------+--------------------------------+------------------------+---------+-------+--------+------------------------------+
| id | select_type | table      | type   | possible_keys                  | key                    | key_len | ref   | rows   | Extra                        |
+----+-------------+------------+--------+--------------------------------+------------------------+---------+-------+--------+------------------------------+
|  1 | PRIMARY     | 
|  1 | PRIMARY     | t1         | ref    | PRIMARY,idx_info_stat | idx_info_stat| 4       | const | 257438 | Using where                  |
|  2 | DERIVED     | NULL       | NULL   | NULL                           | NULL                   | NULL    | NULL  |   NULL | No tables used               |
|  5 | SUBQUERY    | NULL       | NULL   | NULL                           | NULL                   | NULL    | NULL  |   NULL | Select tables optimized away |
|  4 | SUBQUERY    | NULL       | NULL   | NULL                           | NULL                   | NULL    | NULL  |   NULL | Select tables optimized away |
|  3 | SUBQUERY    | NULL       | NULL   | NULL                           | NULL                   | NULL    | NULL  |   NULL | Select tables optimized away |
+----+-------------+------------+--------+--------------------------------+------------------------+---------+-------+--------+------------------------------+
6 rows in set (0.00 sec)[/cc]
通过explain可以看到可选主键primary或者索引方式,mysql选择了索引扫描,扫描记录是有25w
直接执行时间:
3 rows in set (3.81 sec)
通过改用强制索引(主键)的方式
[cc lang='sql']dbaeyes 06:53:48>explain SELECT t1.*
    -> FROM
    -> (SELECT ROUND(RAND() * (
    -> (SELECT MAX(id) FROM act_info WHERE status = 0)-
    -> (SELECT MIN(id) FROM act_info WHERE status = 0))
    -> +(SELECT MIN(id) FROM act_info WHERE status = 0)) AS id) AS t2,act_info AS t1 force index (primary)
    -> WHERE t1.id >= t2.id AND t1.status = 0
    -> ORDER BY t1.id LIMIT 3 ;
+----+-------------+------------+--------+---------------+---------+---------+------+--------+------------------------------+
| id | select_type | table      | type   | possible_keys | key     | key_len | ref  | rows   | Extra                        |
+----+-------------+------------+--------+---------------+---------+---------+------+--------+------------------------------+
|  1 | PRIMARY     | 
|  1 | PRIMARY     | t1         | range  | PRIMARY       | PRIMARY | 8       | NULL | 494509 | Using where                  |
|  2 | DERIVED     | NULL       | NULL   | NULL          | NULL    | NULL    | NULL |   NULL | No tables used               |
|  5 | SUBQUERY    | NULL       | NULL   | NULL          | NULL    | NULL    | NULL |   NULL | Select tables optimized away |
|  4 | SUBQUERY    | NULL       | NULL   | NULL          | NULL    | NULL    | NULL |   NULL | Select tables optimized away |
|  3 | SUBQUERY    | NULL       | NULL   | NULL          | NULL    | NULL    | NULL |   NULL | Select tables optimized away |
+----+-------------+------------+--------+---------------+---------+---------+------+--------+------------------------------+
6 rows in set (0.00 sec)[/cc]
通过主键扫描的记录数有49w之多
sql时间执行时间
3 rows in set (0.00 sec)
explain看到是通过索引扫描的记录数比主键要少,但是实际消耗的时间远远超过通过主键扫描。
理解:上面两条sql的性能差距只是在id > ? and status = 0 
当表中大量数据都是status=0时,通过主键快速定位到行,然后删选少量数据,就可以得到满足的记录。
而通过索引扫描,再回表,效率就比较低。
--如果获取的status=1,数据量较少时
[cc lang='sql']dbaeyes 06:55:05>explain SELECT t1.*
    -> FROM
    -> (SELECT ROUND(RAND() * (
    -> (SELECT MAX(id) FROM act_info WHERE status = 1)-
    -> (SELECT MIN(id) FROM act_info WHERE status = 1))
    -> +(SELECT MIN(id) FROM act_info WHERE status = 1)) AS id) AS t2,act_info AS t1
    -> WHERE t1.id >= t2.id AND t1.status =1
    -> ORDER BY t1.id LIMIT 3 ;[/cc]
那么走status效率更高。因为索引status=1可以快速的获取所需的记录。而通过主键需要扫描非常大量的记录才能得到3条满足需求的记录
[cc lang='sql']dbaeyes 06:55:05>explain SELECT t1.*
    -> FROM
    -> (SELECT ROUND(RAND() * (
    -> (SELECT MAX(id) FROM act_info WHERE status = 1)-
    -> (SELECT MIN(id) FROM act_info WHERE status = 1))
    -> +(SELECT MIN(id) FROM act_info WHERE status = 1)) AS id) AS t2,act_info AS t1
    -> WHERE t1.id >= t2.id AND t1.status =1
    -> ORDER BY t1.id LIMIT 3 ;
+----+-------------+------------+--------+--------------------------------+------------------------+---------+------+------+------------------------------+
| id | select_type | table      | type   | possible_keys                  | key                    | key_len | ref  | rows | Extra                        |
+----+-------------+------------+--------+--------------------------------+------------------------+---------+------+------+------------------------------+
|  1 | PRIMARY     | 
|  1 | PRIMARY     | t1         | range  | PRIMARY,idx_info_stat | idx_info_stat | 12      | NULL |    2 | Using where                  |
|  2 | DERIVED     | NULL       | NULL   | NULL                           | NULL                   | NULL    | NULL | NULL | No tables used               |
|  5 | SUBQUERY    | NULL       | NULL   | NULL                           | NULL                   | NULL    | NULL | NULL | Select tables optimized away |
|  4 | SUBQUERY    | NULL       | NULL   | NULL                           | NULL                   | NULL    | NULL | NULL | Select tables optimized away |
|  3 | SUBQUERY    | NULL       | NULL   | NULL                           | NULL                   | NULL    | NULL | NULL | Select tables optimized away |
+----+-------------+------------+--------+--------------------------------+------------------------+---------+------+------+------------------------------+
6 rows in set (0.01 sec)
dbaeyes 06:55:09>SELECT t1.*
    -> FROM
    -> (SELECT ROUND(RAND() * (
    -> (SELECT MAX(id) FROM act_info WHERE status = 1)-
    -> (SELECT MIN(id) FROM act_info WHERE status = 1))
    -> +(SELECT MIN(id) FROM act_info WHERE status = 1)) AS id) AS t2,act_info AS t1
    -> WHERE t1.id >= t2.id AND t1.status =1
    -> ORDER BY t1.id LIMIT 3 ;
+--------+-------------------+--------+-------------+---------------------+---------------------+
| id     | account           | status | activity_id | gmt_create          | gmt_modified        |
+--------+-------------------+--------+-------------+---------------------+---------------------+
| 287970 | 297969@dbaeye.com |      1 |           1 | 2011-05-24 15:39:39 | 2011-05-27 14:28:17 |
| 290249 | 300248@dbaeye.com |      1 |           1 | 2011-05-24 15:39:39 | 2011-05-25 15:18:13 |
| 295588 | 305587@dbaeye.com |      1 |           1 | 2011-05-24 15:39:39 | 2011-05-26 13:47:08 |
+--------+-------------------+--------+-------------+---------------------+---------------------+
3 rows in set (0.00 sec)[/cc]
 [cc lang='sql']
dbaeyes 06:55:38>explain SELECT t1.*
    -> FROM
    -> (SELECT ROUND(RAND() * (
    -> (SELECT MAX(id) FROM act_info WHERE status = 1)-
    -> (SELECT MIN(id) FROM act_info WHERE status = 1))
    -> +(SELECT MIN(id) FROM act_info WHERE status = 1)) AS id) AS t2,act_info AS t1 force index (primary)
    -> WHERE t1.id >= t2.id AND t1.status =1
    -> ORDER BY t1.id LIMIT 3 ;
+----+-------------+------------+--------+---------------+---------+---------+------+--------+------------------------------+
| id | select_type | table      | type   | possible_keys | key     | key_len | ref  | rows   | Extra                        |
+----+-------------+------------+--------+---------------+---------+---------+------+--------+------------------------------+
|  1 | PRIMARY     | 
|  1 | PRIMARY     | t1         | range  | PRIMARY       | PRIMARY | 8       | NULL |153278 | Using where                  |
|  2 | DERIVED     | NULL       | NULL   | NULL          | NULL    | NULL    | NULL |   NULL | No tables used               |
|  5 | SUBQUERY    | NULL       | NULL   | NULL          | NULL    | NULL    | NULL |   NULL | Select tables optimized away |
|  4 | SUBQUERY    | NULL       | NULL   | NULL          | NULL    | NULL    | NULL |   NULL | Select tables optimized away |
|  3 | SUBQUERY    | NULL       | NULL   | NULL          | NULL    | NULL    | NULL |   NULL | Select tables optimized away |
+----+-------------+------------+--------+---------------+---------+---------+------+--------+------------------------------+
6 rows in set (0.00 sec)
dbaeyes 06:55:53>
dbaeyes 06:55:54>SELECT t1.*
    -> FROM
    -> (SELECT ROUND(RAND() * (
    -> (SELECT MAX(id) FROM act_info WHERE status = 1)-
    -> (SELECT MIN(id) FROM act_info WHERE status = 1))
    -> +(SELECT MIN(id) FROM act_info WHERE status = 1)) AS id) AS t2,act_info AS t1 force index (primary)
    -> WHERE t1.id >= t2.id AND t1.status =1
    -> ORDER BY t1.id LIMIT 3 ;
+--------+-------------------+--------+-------------+---------------------+---------------------+
| id     | account           | status | activity_id | gmt_create          | gmt_modified        |
+--------+-------------------+--------+-------------+---------------------+---------------------+
| 664048 | 674047@dbaeye.com |      1 |           1 | 2011-05-24 15:40:08 | 2011-05-25 16:01:27 |
| 675669 | 685668@dbaeye.com |      1 |           1 | 2011-05-24 15:40:09 | 2011-05-25 17:47:41 |
| 732710 | 742709@dbaeye.com |      1 |           1 | 2011-05-24 15:40:13 | 2011-05-25 16:26:18 |
+--------+-------------------+--------+-------------+---------------------+---------------------+
3 rows in set (0.16 sec)
 [/cc]
--EOF--
This work is licensed under a CC A-S 4.0 International License.