评论

收藏

[MySQL] 巧用临时表将大结果集转换为小结果集驱动查询

数据库 数据库 发布于:2021-07-04 10:10 | 阅读数:266 | 评论:0

  sql如下
SELECT DISTINCT o.orders_id, o.oa_order_id,os.orders_status_name, o.order_type,
  o.date_purchased AS add_date,dop.resource, dop.country_code
FROM dm_order_products AS dop
LEFT JOIN orders AS o ON o.orders_id=dop.orders_id
LEFT JOIN orders_total AS ot ON ot.orders_id=o.orders_id AND ot.class='ot_total'
LEFT JOIN orders_status AS os ON os.orders_status_id=o.orders_status
WHERE  o.date_purchased >= '2014-01-31 10:00:00' AND o.date_purchased <= '2014-02-24 09:59:59'
ORDER BY o.orders_id DESC LIMIT 0, 20;
  因为需要在大结果集中order by 去重,再显示20条.
  表特性是orders(o)表对dm_order_products(dop)表为一对多关系,而取出来的dop.country_code为一个订单号对应唯一值,由于表结构设计问题,每次查询该country_code都需要去dop查询。所以,每次查询都放大结果集,然后再去重,得到所要的结果集合。
  explain
+----+-------------+-------+-------+----------------------------------+----------------------------+---------+-------------------------------+-------+----------------------------------------------+
| id | select_type | table | type  | possible_keys          | key            | key_len | ref               | rows  | Extra                    |
+----+-------------+-------+-------+----------------------------------+----------------------------+---------+-------------------------------+-------+----------------------------------------------+
|  1 | SIMPLE    | o   | range | PRIMARY,date_purchased       | date_purchased       | 9     | NULL              | 952922 | Using where; Using temporary; Using filesort |
|  1 | SIMPLE    | ot  | ref   | idx_orders_total_orders_id,class | idx_orders_total_orders_id | 4     | banggood_work.o.orders_id   |   3 |                        |
|  1 | SIMPLE    | os  | ref   | PRIMARY              | PRIMARY          | 4     | banggood_work.o.orders_status |   1 |                        |
|  1 | SIMPLE    | dop   | ref   | orders_id            | orders_id          | 4     | banggood_work.o.orders_id   |   2 |                        |
+----+-------------+-------+-------+----------------------------------+----------------------------+---------+-------------------------------+-------+----------------------------------------------+
  索引情况使用正常,但是发现需要扫描一个大结果集.
  profiling,执行时间为将近20s
mysql> show profile cpu,block io for query 1;
+--------------------------------+-----------+----------+------------+--------------+---------------+
| Status             | Duration  | CPU_user | CPU_system | Block_ops_in | Block_ops_out |
+--------------------------------+-----------+----------+------------+--------------+---------------+
| starting             |  0.000025 | 0.000000 |   0.000000 |      0 |       0 |
| Waiting for query cache lock   |  0.000004 | 0.000000 |   0.000000 |      0 |       0 |
| checking query cache for query |  0.000080 | 0.000000 |   0.000000 |      0 |       0 |
| checking permissions       |  0.000005 | 0.000000 |   0.000000 |      0 |       0 |
| checking permissions       |  0.000003 | 0.000000 |   0.000000 |      0 |       0 |
| checking permissions       |  0.000003 | 0.000000 |   0.000000 |      0 |       0 |
| checking permissions       |  0.000006 | 0.000000 |   0.000000 |      0 |       0 |
| Opening tables         |  0.000034 | 0.000000 |   0.000000 |      0 |       0 |
| System lock          |  0.000012 | 0.000000 |   0.000000 |      0 |       0 |
| Waiting for query cache lock   |  0.000024 | 0.000000 |   0.000000 |      0 |       0 |
| init               |  0.000046 | 0.000000 |   0.000000 |      0 |       0 |
| optimizing           |  0.000018 | 0.000000 |   0.000000 |      0 |       0 |
| statistics           |  0.000193 | 0.000000 |   0.000000 |      0 |       0 |
| preparing            |  0.000054 | 0.000000 |   0.000000 |      0 |       0 |
| Creating tmp table       |  0.000031 | 0.000000 |   0.000000 |      0 |       0 |
| executing            |  0.000004 | 0.000000 |   0.000000 |      0 |       0 |
| Copying to tmp table       | 12.491533 | 3.039538 |   3.107527 |    11896 |       824 |
| Sorting result         |  0.030709 | 0.034995 |   0.004000 |       16 |       496 |
| Sending data           |  0.000048 | 0.000000 |   0.000000 |      0 |       0 |
| end              |  0.000004 | 0.000000 |   0.000000 |      0 |       0 |
| removing tmp table       |  0.010108 | 0.000000 |   0.010998 |      8 |      32 |
| end              |  0.000013 | 0.000000 |   0.000000 |      0 |       0 |
| query end            |  0.000004 | 0.000000 |   0.000000 |      0 |       0 |
| closing tables         |  0.000012 | 0.000000 |   0.000000 |      0 |       0 |
| freeing items          |  0.000338 | 0.000000 |   0.000000 |      0 |       0 |
| logging slow query       |  0.000006 | 0.000000 |   0.000000 |      0 |       0 |
| logging slow query       |  0.000033 | 0.000000 |   0.000000 |      0 |       8 |
| cleaning up          |  0.000006 | 0.000000 |   0.000000 |      0 |       0 |
  可以看到Copying to tmp table 占了大部分的cpu时间和io,最后sorting result占比重不大。
  我们可以上面描述的结合特性,是否能够去掉Copying to tmp table 选项!因为是根据orders_id排序,取出最新的20条数据,如果我们在orders表中先把20条数据取出来,再和对应的表连接,这样一来,就将整个大结果Copying to tmp table 再排序这一步去掉!
  看sql语句如下
SELECT DISTINCT  o.orders_id, o.oa_order_id,os.orders_status_name, o.order_type,
  o.date_purchased AS add_date,dop.resource, dop.country_code
FROM
(
SELECT *
FROM   orders AS o
WHERE  o.date_purchased >= '2014-01-31 10:00:00' AND o.date_purchased <= '2014-02-24 09:59:59'
ORDER BY o.orders_id DESC LIMIT 0, 20
) o
LEFT JOIN  dm_order_products AS dop ON o.orders_id=dop.orders_id
LEFT JOIN orders_total AS ot ON ot.orders_id=o.orders_id AND ot.class='ot_total'
LEFT JOIN orders_status AS os ON os.orders_status_id=o.orders_status
ORDER BY o.orders_id DESC LIMIT 0, 20;
+----+-------------+------------+-------+----------------------------------+----------------------------+---------+-----------------+------+---------------------------------+
| id | select_type | table    | type  | possible_keys          | key            | key_len | ref       | rows | Extra               |
+----+-------------+------------+-------+----------------------------------+----------------------------+---------+-----------------+------+---------------------------------+
|  1 | PRIMARY   | <derived2> | ALL   | NULL               | NULL             | NULL  | NULL      |   20 | Using temporary; Using filesort |
|  1 | PRIMARY   | dop    | ref   | orders_id            | orders_id          | 4     | o.orders_id   |  2 |                 |
|  1 | PRIMARY   | ot     | ref   | idx_orders_total_orders_id,class | idx_orders_total_orders_id | 4     | o.orders_id   |  3 |                 |
|  1 | PRIMARY   | os     | ref   | PRIMARY              | PRIMARY          | 4     | o.orders_status |  1 |                 |
|  2 | DERIVED   | o      | index | date_purchased           | PRIMARY          | 4     | NULL      |  330 | Using where           |
+----+-------------+------------+-------+----------------------------------+----------------------------+---------+-----------------+------+---------------------------------+
  可以发现,结果集瞬间缩小N倍,再查看profiling,执行时间变成0.01s
+--------------------------------+----------+----------+------------+--------------+---------------+
| Status             | Duration | CPU_user | CPU_system | Block_ops_in | Block_ops_out |
+--------------------------------+----------+----------+------------+--------------+---------------+
| starting             | 0.000028 | 0.000000 |   0.000000 |      0 |       0 |
| Waiting for query cache lock   | 0.000008 | 0.000000 |   0.000000 |      0 |       0 |
| checking query cache for query | 0.000148 | 0.000000 |   0.000000 |      0 |       0 |
| checking permissions       | 0.000010 | 0.000000 |   0.000000 |      0 |       0 |
| checking permissions       | 0.000006 | 0.000000 |   0.000000 |      0 |       0 |
| checking permissions       | 0.000006 | 0.000000 |   0.000000 |      0 |       0 |
| checking permissions       | 0.000010 | 0.000000 |   0.000000 |      0 |       0 |
| Opening tables         | 0.000052 | 0.000000 |   0.000000 |      0 |       0 |
| System lock          | 0.001299 | 0.000000 |   0.001000 |      0 |       8 |
| optimizing           | 0.000021 | 0.000000 |   0.000000 |      0 |       0 |
| statistics           | 0.000127 | 0.000000 |   0.000000 |      0 |       0 |
| preparing            | 0.000127 | 0.001000 |   0.000000 |      0 |       0 |
| executing            | 0.000009 | 0.000000 |   0.000000 |      0 |       0 |
| Sorting result         | 0.000012 | 0.000000 |   0.000000 |      0 |       0 |
| Sending data           | 0.002182 | 0.001000 |   0.001000 |      0 |       784 |
| Waiting for query cache lock   | 0.000010 | 0.000000 |   0.000000 |      0 |       0 |
| Sending data           | 0.000053 | 0.000000 |   0.000000 |      0 |       0 |
| init               | 0.000048 | 0.000000 |   0.000000 |      0 |       0 |
| optimizing           | 0.000010 | 0.000000 |   0.000000 |      0 |       0 |
| statistics           | 0.000315 | 0.000000 |   0.000000 |      0 |       0 |
| preparing            | 0.000025 | 0.000000 |   0.000000 |      0 |       0 |
| Creating tmp table       | 0.000043 | 0.000000 |   0.000000 |      0 |       0 |
| executing            | 0.000008 | 0.000000 |   0.000000 |      0 |       0 |
| Copying to tmp table       | 0.003390 | 0.002000 |   0.002000 |      0 |       0 |
| Sorting result         | 0.000028 | 0.000000 |   0.000000 |      0 |       0 |
| Sending data           | 0.000058 | 0.000000 |   0.000000 |      0 |       0 |
| end              | 0.000008 | 0.000000 |   0.000000 |      0 |       0 |
| removing tmp table       | 0.000013 | 0.000000 |   0.000000 |      0 |       0 |
| end              | 0.000008 | 0.000000 |   0.000000 |      0 |       0 |
| query end            | 0.000008 | 0.000000 |   0.000000 |      0 |       0 |
| closing tables         | 0.000007 | 0.000000 |   0.000000 |      0 |       0 |
| removing tmp table       | 0.000591 | 0.000000 |   0.000000 |      0 |       0 |
| closing tables         | 0.000021 | 0.000000 |   0.000000 |      0 |       0 |
| freeing items          | 0.000024 | 0.000000 |   0.000000 |      0 |       0 |
| Waiting for query cache lock   | 0.000007 | 0.000000 |   0.000000 |      0 |       0 |
| freeing items          | 0.000630 | 0.001000 |   0.000000 |      0 |       0 |
| Waiting for query cache lock   | 0.000010 | 0.000000 |   0.000000 |      0 |       0 |
| freeing items          | 0.000006 | 0.000000 |   0.000000 |      0 |       0 |
| storing result in query cache  | 0.000010 | 0.000000 |   0.000000 |      0 |       0 |
| logging slow query       | 0.000006 | 0.000000 |   0.000000 |      0 |       0 |
| logging slow query       | 0.000056 | 0.000000 |   0.000000 |      0 |       8 |
| cleaning up          | 0.000011 | 0.000000 |   0.000000 |      0 |       0 |
+--------------------------------+----------+----------+------------+--------------+---------------+
  可以看到Copying to tmp table已经被优化!执行时间提高了几百倍!
  

  注意:
  1. 如果是limit m,n这种语句类型,第一步要想到,是不是可以抽取一部分使用临时表优化后,再和外面的其他表进行优化
  2. distinct和orders by 是天敌。在大结果集的sql语句中,尽量避免直接使用distinct.否则,mysql会将整个结果集copy导临时表,重新排序后,再返回!


  
关注下面的标签,发现更多相似文章