Skip to content
清晨的一缕阳光
返回

MySQL 子查询与临时表优化

核心概念

子查询和临时表是复杂查询的常用技术,但使用不当会导致严重的性能问题。

子查询分类

类型说明性能
标量子查询返回单个值较好
列子查询返回单列多行一般
行子查询返回单行多列较好
表子查询返回多行多列需注意
相关子查询依赖外层查询最差

临时表分类

类型创建方式特点
隐式临时表MySQL 自动创建USING temporary
显式临时表CREATE TEMPORARY TABLE手动控制
内存临时表小结果集速度快
磁盘临时表大结果集受 tmp_table_size 限制

子查询优化

1. 相关子查询优化

问题: 相关子查询每行执行一次,性能极差。

-- 错误:相关子查询(每行执行一次)
SELECT u.*, 
    (SELECT COUNT(*) FROM orders WHERE user_id = u.id) AS order_count,
    (SELECT SUM(amount) FROM orders WHERE user_id = u.id) AS total_amount,
    (SELECT MAX(create_time) FROM orders WHERE user_id = u.id) AS last_order_time
FROM users u;
-- 10 万用户 × 3 次子查询 = 30 万次查询

-- 优化 1:JOIN 聚合
SELECT u.*, 
    COALESCE(t.order_count, 0) AS order_count,
    COALESCE(t.total_amount, 0) AS total_amount,
    t.last_order_time
FROM users u
LEFT JOIN (
    SELECT user_id, 
           COUNT(*) AS order_count,
           SUM(amount) AS total_amount,
           MAX(create_time) AS last_order_time
    FROM orders
    GROUP BY user_id
) t ON u.id = t.user_id;
-- 1 次聚合查询

-- 优化 2:窗口函数(MySQL 8.0+)
SELECT u.*, 
    COUNT(o.id) OVER (PARTITION BY o.user_id) AS order_count,
    SUM(o.amount) OVER (PARTITION BY o.user_id) AS total_amount,
    MAX(o.create_time) OVER (PARTITION BY o.user_id) AS last_order_time
FROM users u
LEFT JOIN orders o ON u.id = o.user_id;

性能对比:

2. IN 子查询优化

-- IN 子查询(MySQL 5.6+ 已优化)
SELECT * FROM users 
WHERE id IN (SELECT user_id FROM orders WHERE amount > 100);

-- 优化 1:EXISTS(某些场景更优)
SELECT * FROM users u 
WHERE EXISTS (
    SELECT 1 FROM orders o WHERE o.user_id = u.id AND o.amount > 100
);

-- 优化 2:JOIN(通常最优)
SELECT DISTINCT u.* FROM users u
INNER JOIN orders o ON u.id = o.user_id
WHERE o.amount > 100;

-- 优化 3:半连接(MySQL 自动优化)
-- MySQL 优化器会自动将 IN 子查询转为半连接

执行计划对比:

EXPLAIN SELECT * FROM users 
WHERE id IN (SELECT user_id FROM orders WHERE amount > 100);
-- type: index_subquery(MySQL 优化)

EXPLAIN SELECT DISTINCT u.* FROM users u
INNER JOIN orders o ON u.id = o.user_id
WHERE o.amount > 100;
-- type: ref(通常更快)

3. NOT IN 子查询优化

-- NOT IN(性能差,且有 NULL 陷阱)
SELECT * FROM users 
WHERE id NOT IN (SELECT user_id FROM orders);
-- 问题:如果 orders.user_id 有 NULL,结果为空

-- 优化 1:NOT EXISTS(推荐)
SELECT * FROM users u 
WHERE NOT EXISTS (
    SELECT 1 FROM orders o WHERE o.user_id = u.id
);

-- 优化 2:LEFT JOIN(推荐)
SELECT u.* FROM users u
LEFT JOIN orders o ON u.id = o.user_id
WHERE o.user_id IS NULL;

-- 优化 3:NOT IN + NOT NULL
SELECT * FROM users 
WHERE id NOT IN (
    SELECT user_id FROM orders WHERE user_id IS NOT NULL
);

性能对比:

4. ANY/SOME/ALL 子查询优化

-- ANY / SOME
SELECT * FROM products 
WHERE price > ANY (SELECT price FROM products WHERE category_id = 1);

-- 优化:使用 MIN/MAX
SELECT * FROM products 
WHERE price > (SELECT MIN(price) FROM products WHERE category_id = 1);

-- ALL
SELECT * FROM products 
WHERE price > ALL (SELECT price FROM products WHERE category_id = 1);

-- 优化:使用 MAX
SELECT * FROM products 
WHERE price > (SELECT MAX(price) FROM products WHERE category_id = 1);

5. 派生表优化

-- 派生表(FROM 子句中的子查询)
SELECT t.user_id, t.order_count 
FROM (
    SELECT user_id, COUNT(*) AS order_count 
    FROM orders 
    GROUP BY user_id
) t 
WHERE t.order_count > 5;

-- MySQL 5.7+:派生表物化
-- 子查询结果物化为临时表,只执行一次

-- 优化:直接写查询
SELECT user_id, COUNT(*) AS order_count 
FROM orders 
GROUP BY user_id 
HAVING COUNT(*) > 5;

临时表优化

1. 隐式临时表场景

-- 以下场景 MySQL 会自动创建临时表

-- 1. GROUP BY 无法使用索引
SELECT user_id, COUNT(*) FROM orders GROUP BY user_id;
-- Extra: Using temporary

-- 2. ORDER BY 与 GROUP BY 字段不同
SELECT user_id, COUNT(*) AS cnt 
FROM orders 
GROUP BY user_id 
ORDER BY cnt DESC;
-- Extra: Using temporary; Using filesort

-- 3. DISTINCT
SELECT DISTINCT user_id FROM orders;
-- Extra: Using temporary

-- 4. UNION(非 UNION ALL)
SELECT user_id FROM orders
UNION
SELECT user_id FROM users;
-- Extra: Using temporary

-- 5. 子查询在某些版本
SELECT * FROM users 
WHERE id IN (SELECT user_id FROM orders);
-- 可能创建临时表

2. 临时表性能影响

-- 查看临时表使用情况
SHOW STATUS LIKE 'Created_tmp%';

-- Created_tmp_tables: 创建的临时表总数
-- Created_tmp_disk_tables: 磁盘临时表数
-- Created_tmp_files: 创建的临时文件数

-- 理想状态:
-- Created_tmp_disk_tables / Created_tmp_tables < 0.1
-- 即磁盘临时表占比 < 10%

优化临时表配置:

-- 增大内存临时表限制
SET GLOBAL tmp_table_size = 256M;
SET GLOBAL max_heap_table_size = 256M;

-- 临时表存储引擎(MySQL 5.7+)
SET GLOBAL default_tmp_storage_engine = 'MEMORY';

-- 临时文件目录(使用 SSD)
-- my.cnf: tmpdir = /mnt/ssd/tmp

3. 显式临时表使用

场景: 复杂多步查询,中间结果复用。

-- 创建临时表
CREATE TEMPORARY TABLE tmp_user_stats (
    user_id BIGINT PRIMARY KEY,
    order_count INT,
    total_amount DECIMAL(10, 2),
    last_order_time DATETIME,
    INDEX idx_count (order_count DESC)
) ENGINE=MEMORY;

-- 填充数据
INSERT INTO tmp_user_stats
SELECT user_id, COUNT(*), SUM(amount), MAX(create_time)
FROM orders
GROUP BY user_id;

-- 多次使用临时表
SELECT u.*, t.order_count, t.total_amount
FROM users u
INNER JOIN tmp_user_stats t ON u.id = t.user_id
WHERE t.order_count > 5;

SELECT u.*, t.order_count
FROM users u
INNER JOIN tmp_user_stats t ON u.id = t.user_id
WHERE t.total_amount > 1000;

-- 临时表自动销毁(会话结束)
-- 或手动销毁
DROP TEMPORARY TABLE tmp_user_stats;

适用场景:

4. 临时表 vs 派生表

-- 派生表(子查询)
SELECT t.user_id, t.order_count 
FROM (
    SELECT user_id, COUNT(*) AS order_count 
    FROM orders 
    GROUP BY user_id
) t 
WHERE t.order_count > 5;
-- 每次执行都重新计算

-- 临时表
CREATE TEMPORARY TABLE tmp_stats AS
SELECT user_id, COUNT(*) AS order_count 
FROM orders 
GROUP BY user_id;

SELECT user_id, order_count FROM tmp_stats WHERE order_count > 5;
SELECT user_id, order_count FROM tmp_stats WHERE order_count > 10;
-- 物化一次,多次使用

选择原则:

物化视图优化

MySQL 8.0+ 物化视图

-- 创建物化视图
CREATE TABLE user_order_stats (
    user_id BIGINT PRIMARY KEY,
    order_count INT,
    total_amount DECIMAL(10, 2),
    last_order_time DATETIME,
    INDEX idx_count (order_count DESC),
    INDEX idx_amount (total_amount DESC)
) ENGINE=InnoDB;

-- 初始数据
INSERT INTO user_order_stats
SELECT user_id, COUNT(*), SUM(amount), MAX(create_time)
FROM orders
GROUP BY user_id;

-- 增量更新(使用 ON DUPLICATE KEY UPDATE)
INSERT INTO user_order_stats
SELECT user_id, 1 AS order_count, amount, create_time
FROM new_orders
ON DUPLICATE KEY UPDATE 
    order_count = order_count + 1,
    total_amount = total_amount + VALUES(amount),
    last_order_time = GREATEST(last_order_time, VALUES(create_time));

-- 查询物化视图
SELECT u.*, t.order_count, t.total_amount
FROM users u
INNER JOIN user_order_stats t ON u.id = t.user_id
WHERE t.order_count > 5
ORDER BY t.total_amount DESC
LIMIT 0, 20;

物化视图刷新策略

-- 1. 定时全量刷新
CREATE EVENT refresh_user_stats
ON SCHEDULE EVERY 1 HOUR
DO
    TRUNCATE TABLE user_order_stats;
    INSERT INTO user_order_stats
    SELECT user_id, COUNT(*), SUM(amount), MAX(create_time)
    FROM orders
    GROUP BY user_id;

-- 2. 触发器增量刷新
DELIMITER $$
CREATE TRIGGER trg_order_after_insert
AFTER INSERT ON orders
FOR EACH ROW
BEGIN
    INSERT INTO user_order_stats (user_id, order_count, total_amount, last_order_time)
    VALUES (NEW.user_id, 1, NEW.amount, NEW.create_time)
    ON DUPLICATE KEY UPDATE 
        order_count = order_count + 1,
        total_amount = total_amount + NEW.amount,
        last_order_time = GREATEST(last_order_time, NEW.create_time);
END$$
DELIMITER ;

-- 3. 应用层维护
-- 在订单创建/更新时同步更新物化视图

实战案例

案例 1:用户行为分析优化

原始 SQL:

SELECT u.id, u.name,
    (SELECT COUNT(*) FROM orders WHERE user_id = u.id) AS order_count,
    (SELECT SUM(amount) FROM orders WHERE user_id = u.id) AS total_amount,
    (SELECT COUNT(DISTINCT product_id) FROM orders WHERE user_id = u.id) AS product_count,
    (SELECT MAX(create_time) FROM orders WHERE user_id = u.id) AS last_order_time,
    (SELECT COUNT(*) FROM comments WHERE user_id = u.id) AS comment_count
FROM users u
WHERE u.status = 1;
-- 5 个相关子查询,性能极差

优化方案:

-- 方案 1:JOIN 聚合
SELECT u.id, u.name,
    COALESCE(o.order_count, 0) AS order_count,
    COALESCE(o.total_amount, 0) AS total_amount,
    COALESCE(o.product_count, 0) AS product_count,
    o.last_order_time,
    COALESCE(c.comment_count, 0) AS comment_count
FROM users u
LEFT JOIN (
    SELECT user_id, 
           COUNT(*) AS order_count,
           SUM(amount) AS total_amount,
           COUNT(DISTINCT product_id) AS product_count,
           MAX(create_time) AS last_order_time
    FROM orders
    GROUP BY user_id
) o ON u.id = o.user_id
LEFT JOIN (
    SELECT user_id, COUNT(*) AS comment_count
    FROM comments
    GROUP BY user_id
) c ON u.id = c.user_id
WHERE u.status = 1;

-- 方案 2:物化视图
-- 创建 user_stats 表存储聚合数据
-- 定期刷新或增量更新

案例 2:复杂报表查询优化

原始 SQL:

-- 多步骤复杂查询
SELECT 
    DATE(o.create_time) AS date,
    u.city,
    COUNT(*) AS order_count,
    SUM(o.amount) AS total_amount,
    AVG(o.amount) AS avg_amount
FROM orders o
INNER JOIN users u ON o.user_id = u.id
WHERE o.create_time BETWEEN '2024-01-01' AND '2024-01-31'
GROUP BY DATE(o.create_time), u.city
ORDER BY date, total_amount DESC;
-- 问题:GROUP BY 表达式,无法使用索引

优化方案:

-- 方案 1:临时表
CREATE TEMPORARY TABLE tmp_orders (
    order_date DATE,
    city VARCHAR(50),
    amount DECIMAL(10, 2),
    INDEX idx_date_city (order_date, city)
);

INSERT INTO tmp_orders
SELECT DATE(o.create_time), u.city, o.amount
FROM orders o
INNER JOIN users u ON o.user_id = u.id
WHERE o.create_time BETWEEN '2024-01-01' AND '2024-01-31';

SELECT order_date AS date, city,
    COUNT(*) AS order_count,
    SUM(amount) AS total_amount,
    AVG(amount) AS avg_amount
FROM tmp_orders
GROUP BY order_date, city
ORDER BY date, total_amount DESC;

-- 方案 2:添加计算列
ALTER TABLE orders ADD COLUMN order_date DATE;
UPDATE orders SET order_date = DATE(create_time);
CREATE INDEX idx_order_date ON orders(order_date);

SELECT order_date, u.city,
    COUNT(*) AS order_count,
    SUM(o.amount) AS total_amount,
    AVG(o.amount) AS avg_amount
FROM orders o
INNER JOIN users u ON o.id = o.user_id
WHERE o.order_date BETWEEN '2024-01-01' AND '2024-01-31'
GROUP BY order_date, u.city
ORDER BY order_date, total_amount DESC;

案例 3:批量数据处理优化

问题: 批量更新 100 万条数据,逐条更新太慢。

-- 错误:逐条更新(100 万次)
UPDATE orders SET status = 1 WHERE id = 1;
UPDATE orders SET status = 1 WHERE id = 2;
...

-- 优化 1:临时表 + JOIN
CREATE TEMPORARY TABLE tmp_order_ids (
    id BIGINT PRIMARY KEY
);

-- 批量插入 ID
INSERT INTO tmp_order_ids VALUES (1), (2), (3), ...;

-- JOIN 更新
UPDATE orders o
INNER JOIN tmp_order_ids t ON o.id = t.id
SET o.status = 1;

-- 优化 2:CASE 语句
UPDATE orders SET status = CASE id
    WHEN 1 THEN 1
    WHEN 2 THEN 1
    WHEN 3 THEN 1
    ...
END
WHERE id IN (1, 2, 3, ...);

案例 4:去重查询优化

-- 错误:DISTINCT 性能差
SELECT DISTINCT user_id, product_id FROM orders;
-- Extra: Using temporary

-- 优化 1:GROUP BY
SELECT user_id, product_id FROM orders GROUP BY user_id, product_id;
-- 仍然使用临时表,但语义更清晰

-- 优化 2:索引
CREATE INDEX idx_user_product ON orders(user_id, product_id);
SELECT DISTINCT user_id, product_id FROM orders;
-- Extra: Using index(覆盖索引,无需临时表)

最佳实践总结

子查询优化原则

  1. 避免相关子查询

    • 改为 JOIN 聚合
    • 使用窗口函数(MySQL 8.0+)
  2. IN 子查询

    • MySQL 5.6+ 已优化
    • 可考虑 EXISTS 或 JOIN
  3. NOT IN 子查询

    • 使用 NOT EXISTS
    • 或 LEFT JOIN … IS NULL
  4. 派生表

    • 确保有合适的索引
    • 考虑物化为临时表

临时表使用指南

使用临时表场景:
1. 中间结果复用多次
2. 复杂多步查询
3. 需要索引优化中间结果

避免临时表场景:
1. 简单查询
2. 结果集很小
3. 只使用一次

配置优化

[mysqld]
# 临时表配置
tmp_table_size = 256M
max_heap_table_size = 256M

# 临时文件目录(使用 SSD)
tmpdir = /mnt/ssd/tmp

# 监控临时表使用
# SHOW STATUS LIKE 'Created_tmp%';

参考资料


分享这篇文章到:

上一篇文章
Spring Boot JWT 认证实战
下一篇文章
Spring Boot Spring Data JPA 实战