首页 > Php, Sql-Mysql > SQL优化实例,看我如何把一个耗时1分钟的请求优化到9秒。

SQL优化实例,看我如何把一个耗时1分钟的请求优化到9秒。

这主要是一个数据统计的web页面,展现统计结果。由于当时需求很紧,要求尽快能出来查询结果就行,而且是说以后会重新对这部分功能进行规划再开发。所以也没有考虑任何性能相关,劈哩啪啦的一天把很多数据统计的功能实现了。
现在,数据量已然有点规模了,但是还没有再开发的计划,目前还使用这个统计程序,但是由于有些数据量了,并且没有考虑过优化,所以目前查询速度很慢。尤其是其中有一个查询,居然耗时1分多,问题很大啊,今天抽点时间,找找问题,优化一下,主要做的工作的是改进sql语句,添加合适的索引。
存储的数据是根据用户的ID末尾的值来简单分表的,所以共有0到9一共10个表,需要遍历这10个表,然后把最后的数据相加,再处理后得到最后结果。由于需求还算比较简单,所以没有采用定时脚本去处理前几日的数据后入库保存的做法,如果采用以上做法,速度会快更多。

以下是其中的一个处理的一条sql,运行看起来耗时11.76秒。

mysql> select sql_no_cache count(distinct(a.mpnumber)) as mpnumber,date_format(a.bindtime,'%Y-%m-%d') as gDate from new_bind_mp_weibo_user_info_7 a left join weibo_user_visit_info_7 b on a.mpnumber=b.mpnumber and a.apmac=b.apmac  where b.id is NULL   and a.bindtime between '2012-03-01' and '2012-03-27 23:59:59' group by gDate order by gDate;
+----------+------------+
| mpnumber | gDate      |
+----------+------------+
|       22 | 2012-03-01 | 
|       17 | 2012-03-02 | 
|       16 | 2012-03-03 | 
|       14 | 2012-03-04 | 
|       12 | 2012-03-05 | 
|        5 | 2012-03-06 | 
|        8 | 2012-03-07 | 
|       26 | 2012-03-08 | 
|       10 | 2012-03-09 | 
|        3 | 2012-03-10 | 
|        2 | 2012-03-11 | 
|       42 | 2012-03-12 | 
|       48 | 2012-03-13 | 
|       40 | 2012-03-14 | 
|       61 | 2012-03-15 | 
|       59 | 2012-03-16 | 
|       70 | 2012-03-17 | 
|       87 | 2012-03-18 | 
|       44 | 2012-03-19 | 
|       47 | 2012-03-20 | 
|       66 | 2012-03-21 | 
|       48 | 2012-03-22 | 
|       12 | 2012-03-26 | 
|       39 | 2012-03-27 | 
+----------+------------+
24 rows in set (11.76 sec)

mysql> explain select count(distinct(a.mpnumber)) as mpnumber,date_format(a.bindtime,'%Y-%m-%d') as gDate from new_bind_mp_weibo_user_info_7 a left join weibo_user_visit_info_7 b on a.mpnumber=b.mpnumber and a.apmac=b.apmac  where b.id is NULL   and a.bindtime between '2012-03-01' and '2012-03-27 23:59:59' group by gDate order by gDate G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: a
         type: ALL
possible_keys: search_key
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 3855
        Extra: Using where; Using filesort
*************************** 2. row ***************************
           id: 1
  select_type: SIMPLE
        table: b
         type: ref
possible_keys: search_index4
          key: search_index4
      key_len: 45
          ref: newwifi_visit_info.a.apmac
         rows: 223
        Extra: Using where; Not exists
2 rows in set (0.00 sec)

首先做的第一步,把连接的两个表做处理,分别只提取出需要的字段,并把对a表的时间限制放到连接前。

mysql> select sql_no_cache count(distinct(a.mpnumber)) as mpnumber, date(a.bindtime) as gDate from (select mpnumber,apmac,bindtime from new_bind_mp_weibo_user_info_7 where bindtime between '2012-03-01' and '2012-03-28' ) as a left join (select id,mpnumber,apmac from weibo_user_visit_info_7) as b on a.mpnumber=b.mpnumber and a.apmac=b.apmac  where b.id is NULL    group by gDate order by gDate;
+----------+------------+
| mpnumber | gDate      |
+----------+------------+
|       22 | 2012-03-01 | 
|       17 | 2012-03-02 | 
|       16 | 2012-03-03 | 
|       14 | 2012-03-04 | 
|       12 | 2012-03-05 | 
|        5 | 2012-03-06 | 
|        8 | 2012-03-07 | 
|       26 | 2012-03-08 | 
|       10 | 2012-03-09 | 
|        3 | 2012-03-10 | 
|        2 | 2012-03-11 | 
|       42 | 2012-03-12 | 
|       48 | 2012-03-13 | 
|       40 | 2012-03-14 | 
|       61 | 2012-03-15 | 
|       59 | 2012-03-16 | 
|       70 | 2012-03-17 | 
|       87 | 2012-03-18 | 
|       44 | 2012-03-19 | 
|       47 | 2012-03-20 | 
|       66 | 2012-03-21 | 
|       48 | 2012-03-22 | 
|       12 | 2012-03-26 | 
|       39 | 2012-03-27 | 
+----------+------------+
24 rows in set (5.96 sec)

mysql> explain select sql_no_cache count(distinct(a.mpnumber)) as mpnumber, date(a.bindtime) as gDate from (select mpnumber,apmac,bindtime from new_bind_mp_weibo_user_info_7 where bindtime between '2012-03-01' and '2012-03-28' ) as a left join (select id,mpnumber,apmac from weibo_user_visit_info_7) as b on a.mpnumber=b.mpnumber and a.apmac=b.apmac  where b.id is NULL    group by gDate order by gDateG
*************************** 1. row ***************************
           id: 1
  select_type: PRIMARY
        table: <derived2>
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 1196
        Extra: Using temporary; Using filesort
*************************** 2. row ***************************
           id: 1
  select_type: PRIMARY
        table: <derived3>
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 28099
        Extra: Using where; Not exists
*************************** 3. row ***************************
           id: 3
  select_type: DERIVED
        table: weibo_user_visit_info_7
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 28099
        Extra: 
*************************** 4. row ***************************
           id: 2
  select_type: DERIVED
        table: new_bind_mp_weibo_user_info_7
         type: ALL
possible_keys: search_key
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 3856
        Extra: Using where
4 rows in set (0.03 sec)

然后,可以看到,遍历b表的行数还是很多的,所以根据需求,添加合适的对b表的限定条件,缩减b表的行数。然后再运行的时候,就能看到,只需要1秒多了。

mysql>  select sql_no_cache count(distinct(a.mpnumber)) as mpnumber, date(a.bindtime) as gDate from (select mpnumber,apmac,bindtime from new_bind_mp_weibo_user_info_7 where bindtime between '2012-03-01' and '2012-03-28' ) as a left join (select id,mpnumber,apmac from weibo_user_visit_info_7 where visittime between '2012-03-01' and '2012-03-28') as b on a.mpnumber=b.mpnumber and a.apmac=b.apmac  where b.id is NULL    group by gDate order by gDate;
+----------+------------+
| mpnumber | gDate      |
+----------+------------+
|       22 | 2012-03-01 | 
|       17 | 2012-03-02 | 
|       16 | 2012-03-03 | 
|       14 | 2012-03-04 | 
|       12 | 2012-03-05 | 
|        5 | 2012-03-06 | 
|        8 | 2012-03-07 | 
|       26 | 2012-03-08 | 
|       10 | 2012-03-09 | 
|        3 | 2012-03-10 | 
|        2 | 2012-03-11 | 
|       42 | 2012-03-12 | 
|       48 | 2012-03-13 | 
|       40 | 2012-03-14 | 
|       61 | 2012-03-15 | 
|       59 | 2012-03-16 | 
|       70 | 2012-03-17 | 
|       87 | 2012-03-18 | 
|       44 | 2012-03-19 | 
|       47 | 2012-03-20 | 
|       66 | 2012-03-21 | 
|       48 | 2012-03-22 | 
|       12 | 2012-03-26 | 
|       39 | 2012-03-27 | 
+----------+------------+
24 rows in set (1.82 sec)

mysql> explain select sql_no_cache count(distinct(a.mpnumber)) as mpnumber, date(a.bindtime) as gDate from (select mpnumber,apmac,bindtime from new_bind_mp_weibo_user_info_7 where bindtime between '2012-03-01' and '2012-03-28' ) as a left join (select id,mpnumber,apmac from weibo_user_visit_info_7 where visittime between '2012-03-01' and '2012-03-28') as b on a.mpnumber=b.mpnumber and a.apmac=b.apmac  where b.id is NULL    group by gDate order by gDateG
*************************** 1. row ***************************
           id: 1
  select_type: PRIMARY
        table: <derived2>
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 1196
        Extra: Using temporary; Using filesort
*************************** 2. row ***************************
           id: 1
  select_type: PRIMARY
        table: <derived3>
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 9129
        Extra: Using where; Not exists
*************************** 3. row ***************************
           id: 3
  select_type: DERIVED
        table: weibo_user_visit_info_7
         type: ALL
possible_keys: search_index3
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 28101
        Extra: Using where
*************************** 4. row ***************************
           id: 2
  select_type: DERIVED
        table: new_bind_mp_weibo_user_info_7
         type: ALL
possible_keys: search_key
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 3856
        Extra: Using where
4 rows in set (0.02 sec)

没有写出来如何添加索引的,也就不写了,自己根据explain的结果看,需要哪些索引吧。
有一点要说的是,再对某datetime类型的列添加索引后,发现查询时没有使用索引,explain了一下,会遍历27K多条数据,使用force index再来explain下这条sql,类似: explain SELECT * from table force index(index_name) where datetime between ‘2011-03-01’ and ‘2012-03-28’ ;得到的结果是会遍历9K条数据,占上个结果的三分之一,所用时间同不用索引没啥差别,所以,可能是当数据量小到一定的比例时,索引才会被调用。
—–
2012年3月28日 PS:查询了下手册,发现了关于是否使用索引的说明

每个表的索引被查询,并且使用最好的索引,除非优化器认为使用表扫描更有效。是否使用扫描取决于是否最好的索引跨越超过30%的表。优化器更加复杂,其估计基于其它因素,例如表大小、行数和I/O块大小,因此固定比例不再决定选择使用索引还是扫描。

官方手册:http://dev.mysql.com/doc/refman/5.1/zh/optimization.html#where-optimizations

  1. 还没有评论
评论提交中, 请稍候...

留言

可以使用的标签: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <s> <strike> <strong>
Trackbacks & Pingbacks ( 0 )
  1. 还没有 trackbacks