如何优化SQL查询中的ORDER BY COUNT(column)以避免Using temporary和Using filesort?

我想要统计某个id和对应的数量并排序,sql如下:

SELECT
    t.label_name AS topic,
    t.label_id,
    COUNT( work_order_id) AS requestWorkOrderNumber
FROM
    biz_work_order_label t
INNER JOIN biz_work_order t1 ON
    (t1.id = t.work_order_id)
WHERE
    t.deleted = 0
    AND t1.deleted = 0
    AND (t.label_type = 8
        AND t1.problem_location_code LIKE '%440000%'
        AND t1.reporting_time BETWEEN '2024-01-01 00:00:00' AND '2024-12-31 23:59:59'
        AND t1.nature IN ('ts', 'jb'))
GROUP BY
    t.label_name,
    t.label_id
ORDER BY
    requestWorkOrderNumber DESC
LIMIT 10

explain的结果:
image.png
数据量大约400万,查询这个sql需要5s多,label_name、label_id、work_order_id都加过索引,没有效果,应该是count函数导致索引失效了。
应该如何优化呢?

阅读 2.1k
avatarAI BotBETA

要优化这个SQL查询,尤其是避免Using temporaryUsing filesort,可以考虑以下几种方法:

  1. 索引优化

    • 确保在连接和过滤条件上使用了合适的索引。
    • 对于ORDER BYGROUP BY字段,考虑创建覆盖索引。
  2. 查询重写

    • 使用子查询或CTE(公用表表达式)来预先计算数量,从而避免在GROUP BYORDER BY中直接使用聚合函数。

以下是优化后的SQL查询示例:

-- 首先,创建一个覆盖索引来加速查询
CREATE INDEX idx_biz_work_order_label_optimized
ON biz_work_order_label (deleted, label_type, work_order_id)
WHERE deleted = 0;

CREATE INDEX idx_biz_work_order_optimized
ON biz_work_order (deleted, problem_location_code, reporting_time, nature, id)
WHERE deleted = 0;

-- 然后,使用子查询来计算数量,并排序
WITH RequestWorkOrderCounts AS (
    SELECT
        t.label_name AS topic,
        t.label_id,
        COUNT(t1.id) AS requestWorkOrderNumber
    FROM
        biz_work_order_label t
    INNER JOIN biz_work_order t1 ON t.work_order_id = t1.id
    WHERE
        t.deleted = 0
        AND t1.deleted = 0
        AND t.label_type = 8
        AND t1.problem_location_code LIKE '%440000%'
        AND t1.reporting_time BETWEEN '2024-01-01 00:00:00' AND '2024-12-31 23:59:59'
        AND t1.nature IN ('ts', 'jb')
    GROUP BY
        t.label_name,
        t.label_id
)
SELECT
    topic,
    label_id,
    requestWorkOrderNumber
FROM
    RequestWorkOrderCounts
ORDER BY
    requestWorkOrderNumber DESC
LIMIT 10;

解释:

  1. 索引创建

    • biz_work_order_label表的索引idx_biz_work_order_label_optimized包括了deleted, label_type, 和work_order_id,并过滤deleted = 0
    • biz_work_order表的索引idx_biz_work_order_optimized包括了deleted, problem_location_code, reporting_time, nature, 和id,并过滤deleted = 0
  2. 查询重写

    • 使用CTE(WITH子句)RequestWorkOrderCounts预先计算每个label_namelabel_id的请求工作单数量。
    • 在外层查询中,仅对预先计算的结果进行排序和限制,从而避免在GROUP BYORDER BY中直接使用聚合函数。

这种方法通过减少查询中的临时表和文件排序操作,可以显著提高查询性能。确保在生产环境中测试这些更改,并根据实际情况调整索引和查询结构。

2 个回答

在biz_work_order_label 表上面的work_order_id增加索引,biz_work_order_label 自动会变成被驱动表,会提升效率,现在只需200ms就可以查出来

t1主表索引

CREATE INDEX idx_t1_query 
ON biz_work_order (
    deleted,
    problem_location_code,
    reporting_time,
    nature,
    id
);

t(label表)

CREATE INDEX idx_t_query
ON biz_work_order_label (
    deleted,
    label_type,
    work_order_id,
    label_id,
    label_name
);

优化后的SQL语句

先过滤 t1,再 JOIN,work_order_id 是 JOIN 字段 ,label_type 是过滤字段

SELECT
    t.label_name AS topic,
    t.label_id,
    COUNT(*) AS requestWorkOrderNumber
FROM
(
    SELECT id
    FROM biz_work_order
    WHERE
        deleted = 0
        AND problem_location_code LIKE '440000%' 
        AND reporting_time BETWEEN '2024-01-01' AND '2024-12-31'
        AND nature IN ('ts', 'jb')
) t1
JOIN biz_work_order_label t
    ON t.work_order_id = t1.id
WHERE
    t.deleted = 0
    AND t.label_type = 8
GROUP BY
    t.label_name,
    t.label_id
ORDER BY
    requestWorkOrderNumber DESC
LIMIT 10;
撰写回答
你尚未登录,登录后可以
  • 和开发者交流问题的细节
  • 关注并接收问题和回答的更新提醒
  • 参与内容的编辑和改进,让解决方法与时俱进
推荐问题