Derived Tables in MySQL 5.7
In MySQL 5.7, we have enhanced the optimizer so that derived tables and views are handled more consistently.
Just to recap: A derived table is a subquery that can take the place of a table in the FROM clause of an SQL statement. A derived table that is embedded in the query is sometimes called an unnamed derived table. It is also referred to as simply a “subquery in FROM clause”.
It is also possible to replace a derived table with a view. This is why views sometimes are called named derived tables.
These two query sets are thus equivalent:
CREATE VIEW v1 AS SELECT * FROM t1;SELECT * FROM v1 JOIN t2 USING (a); CREATE VIEWv1ASSELECT *FROMt1;SELECT *FROMv1JOINt2USING(a); SELECT * FROM (SELECT * FROM t1) AS dt1 JOIN t2 USING (a); SELECT *FROM(SELECT *FROMt1)ASdt1JOINt2USING(a);
For a long time, there has been a difference in the handling of views and derived tables: While it has been possible to merge views into the outer query, a derived table would always be materialized and accessed as a temporary table in the outer query.
This is costly for two reasons: creating and reading the materialized temporary table takes time, and it prohibits pushing down conditions from the outer query to the derived table.
In MySQL 5.7, we have consolidated how views and derived tables are handled: Both are resolved when the query they are used in is resolved. (Resolving means the process of looking up table and column names in the dictionary, checking that queries are semantically correct, etc.) We also decide whether to merge or materialize the derived table/view at the same place. In fact, this is almost the same place as where subqueries are transformed into semi-join operations, so the decision of whether we should do a semi-join transform, merge a derived table or view, or apply both transforms, is now taken when the containing query is resolved.
The criteria for when to merge a derived table is the same as for views, merging is not supportedif the subquery contains:
UNION clause GROUP BY DISTINCT Aggregation LIMIT or OFFSET an assignment to user variables
We can show this with EXPLAIN. The query
SELECT * FROM (SELECT * FROM t1) AS dt1 JOIN t2 USING (a); SELECT *FROM(SELECT *FROMt1)ASdt1JOINt2USING(a);
is explained like this in 5.6:
| 1 | PRIMARY | t2 | ALL | NULL | NULL | NULL | NULL | 1 | Using where || 1 | PRIMARY | <derived2>| ref | <auto_key0> | <auto_key0> | 5 | test.t2.a | 2 | Using index || 2 | DERIVED | t1 | ALL | NULL | NULL | NULL | NULL | 1 | NULL | |1|PRIMARY|t2|ALL|NULL|NULL|NULL|NULL|1|Usingwhere||1|PRIMARY|<derived2>|ref|<auto_key0>|<auto_key0>|5|test.t2.a|2|Usingindex||2|DERIVED|t1|ALL|NULL|NULL|NULL|NULL|1|NULL|
and like this in 5.7:
| 1 | SIMPLE | t1 | NULL | ALL | NULL | NULL | NULL | NULL | 1 | 100.00 | NULL || 1 | SIMPLE | t2 | NULL | ALL | NULL | NULL | NULL | NULL | 1 | 100.00 | Using where | |1|SIMPLE|t1|NULL|ALL|NULL|NULL|NULL|NULL|1|100.00|NULL||1|SIMPLE|t2|NULL|ALL|NULL|NULL|NULL|NULL|1|100.00|Usingwhere| How to control whether to merge or materialize
A user has always been ableto determine whether to merge or materialize a view by specifying the ALGORITHM clause when creating the view: ALGORITHM=MERGE will merge the view into the outer query if possible, whereas ALGORITHM=TEMPTABLE will always materialize the view. There is also a third option: If no ALGORITHM value is specified, the view is handled similar to ALGORITHM=MERGE.
Since derived tables are specified directly in the query, we cannot specify an algorithm when creating them. We have therefore added a new optimizer switch named derived_merge that can be used to control whether to merge or materialize.optimizer_switch=’derived_merge=on’ will merge the derived table if possible, optimizer_switch=’derived_merge=off’ will materialize the derived table always.
The optimizer_switch is also applicable to views: If no ALGORITHM is specified for the view, the decision is taken according to the value of derived_merge.
ORDER BY in derived table or view
ORDER BY in the subquery/view is handled differently in 5.7 compared to 5.6: It is ignored unless the outer query is non-aggregated and contains only one table reference (i.e the reference to this derived table or view). It is also ignored if the outer query itself contains an ORDER BY clause.
In this set of statements:
CREATE VIEW v1 AS SELECT * FROM t1 WHERE x > 100 ORDER BY y;SELECT * FROM v1; CREATE VIEWv1ASSELECT *FROMt1WHEREx>100ORDERBYy;SELECT *FROMv1;
the ORDER BY specification is propagated to the outer query, since v1 is the only table reference in the SELECT statement, the query is not aggregated and does not contain an ORDER BY clause.
The old behavior was doubtful: Propagating ORDER BY to an aggregated outer query is rarely possible. If the outer query is a join query, it is also hard to imagine that it is useful to propagate the ORDER BY to the outer query (if the query contains two view references, both with ORDER BY, which of them should be used?). But we kept the behavior for the simple case of a single non-aggregated table, since this is a quite common practice in existing applications.
Derived tables inside views:
Previously,it wasnot possible to create a view that containeda derived table in its FROM clause, but this restriction has been lifted in 5.7.
CREATE VIEW v ASSELECT t1.a, t1.b, dt.cFROM t1 JOIN (SELECT * FROM t2) AS dt ON t1.a=dt.a;SELECT * FROM v WHERE c > 10; CREATE VIEWvASSELECT t1.a,t1.b,dt.cFROM t1JOIN(SELECT *FROMt2)ASdtONt1.a=dt.a; SELECT *FROMvWHEREc>10;
Here is the explain for this query in 5.7:
| 1 | SIMPLE | t1 | NULL | ALL | NULL | NULL | NULL | NULL | 1 | 100.00 | NULL || 1 | SIMPLE | t2 | NULL | ALL | NULL | NULL | NULL | NULL | 1 | 100.00 | Using where | |1|SIMPLE|t1|NULL|ALL|NULL|NULL|NULL|NULL|1|100.00|NULL||1|SIMPLE|t2|NULL|ALL|NULL|NULL|NULL|NULL|1|100.00|Usingwhere|
For use in data change statements, we have chosen to make derived tables readonly when used as part of a join view. This has a few consequences:
A join view (a view over two or more tables that are joined) cannot be used in a DELETE statement, so there is no change here. A view containinga derived table cannot be used as a target tablein an INSERT statement. UPDATE generally allows to update any updatable tables of a join view. We have copied this behavior for UPDATE of views containing derived tables.
The view created above cannot be inserted into, since it contains a derived table:
INSERT INTO v(a, b, c) VALUES (1, 10, 100);ERROR 1471 (HY000): The target table v of the INSERT is not insertable-into INSERT INTOv(a,b,c)VALUES(1,10,100);ERROR1471(HY000):ThetargettablevoftheINSERTisnotinsertable-into
The view cannot be deleted from, since it is a join view:
DELETE FROM v WHERE a > 0;ERROR 1395 (HY000): Can not delete from join view ‘test.v’ DELETE FROMvWHEREa>0;ERROR1395(HY000):Cannotdeletefromjoinview’test.v’
The view can be updated, as long as the non-derived part of it is updated. This UPDATE statement is accepted:
UPDATE v SET b= b + 1 WHERE a > 0; UPDATEvSETb=b+1WHEREa>0;
But this statement is invalid, since the column “c” to be updated is from the derived table:
UPDATE v SET c= c + 1 WHERE a > 0;ERROR 1288 (HY000): The target table t2 of the UPDATE is not updatable UPDATEvSETc=c+1WHEREa>0;ERROR1288(HY000):Thetargettablet2oftheUPDATEisnotupdatable
This was feature request #12755 (Subquery in FROM clause of views).
Derived tables in select list of views
In earlier versions of MySQL, having a subquery in the select list of a view meant that the view had to be materialized. This restriction has been lifted in 5.7.
CREATE VIEW v ASSELECT a, (SELECT COUNT(*) FROM t2) AS cFROM t1; CREATE VIEWvASSELECTa,(SELECTCOUNT(*)FROMt2)AScFROM t1;
This was explained in 5.6 as:
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 2 | NULL || 2 | DERIVED | t1 | ALL | NULL | NULL | NULL | NULL | 1 | NULL || 3 | SUBQUERY | t2 | ALL | NULL | NULL | NULL | NULL | 1 | NULL | |1|PRIMARY|<derived2>|ALL|NULL|NULL|NULL|NULL|2|NULL||2|DERIVED|t1|ALL|NULL|NULL|NULL|NULL|1|NULL||3|SUBQUERY|t2|ALL|NULL|NULL|NULL|NULL|1|NULL|
In 5.7 we can see the view is merged into the outer query:
| 1 | PRIMARY | t1 | NULL | ALL | NULL | NULL | NULL | NULL | 1 | 100.00 | NULL || 3 | SUBQUERY | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | Select tables optimized away | |1|PRIMARY|t1|NULL|ALL|NULL|NULL|NULL|NULL|1|100.00|NULL||3|SUBQUERY|NULL|NULL|NULL|NULL|NULL|NULL|NULL|NULL|NULL|Selecttablesoptimizedaway|
This was feature request #60417 (Slow viewsby using subquery as column)
Refactored resolver order
Earlier versions of MySQL analyzed derived tables and views when tables were opened. We continue to refactor MySQL, and an important part of that is to analyze, optimize and execute queries in strict phases. As a natural part of that we now resolve derived tables and view references in “natural” order, by analyzing them when looking at table references for the outer query.
Refactored column privilege checking
Analysis of how we do column privilege check revealed a complex algorithm and a few corner case bugs. We have therefore tightened the column privilege checks:
Column privileges are not checked when resolving the derived table/view, but rather when the outer query is resolved and we know the context of the derived table. Previously, the required privileges for columns were assignedto a GRANT_INFO object connected to a table. But different columns require different privileges (a column in a WHERE clause requires SELECT privilege, whereas a column that is target for insertion requires INSERT privilege). In 5.7, the column privilege is always passed as argument to the privilege checking function, giving better control to the caller. Due to the above refactoring, we could eliminate the field GRANT_INFO::orig_want_privilege, and GRANT_INFO::want_privilege is only used in debug build for consistency checking. Refactored read_set/write_set processing
Each table involved in a query has registered one set of columns that is read and one set of columns that are written, called read_set and write_set respectively.
We have consolidated the setting of these values into a function TABLE::mark_column_used() and a walker Item::mark_field_in_map(). Like for privilege checking, we delay setting of columns into read_set and write_set until when the outer query is resolved. Thus, we know the columns from the derived table that are actually used in the outer query and need not copy unused columns from the storage engine.
CREATE TABLE t1(a INTEGER, b BLOB);CREATE VIEW v1 AS SELECT * FROM t1;SELECT a FROM v1; CREATE TABLEt1(aINTEGER,bBLOB);CREATE VIEWv1ASSELECT *FROMt1;SELECTaFROMv1;
With the current way of setting read_set, it is simple to make sure we only retrieve the “a” column and not the “b” blob.
Preparing for future extensions
Since derived tables now are resolved within proper context, it will become simpler to implement outer references in derived tables. More importantly, implementing LATERALderived tables may also become simpler, since these are depending heavily on references to tables from the embedding outer query.
Axure汉化版已经发布，版本号Axure 22.214.171.1249，下面是截图效果 Axure汉化版文件下载地址：Axure汉化补丁 Axure RP pro 126.96.36.1999 下载地址注册用户名：Axure 序列...
BI中文站 6月7日报道 艾默生·斯帕茨(Emerson Spartz)今年28岁，已婚，是Spartz Inc公司的首席执行官。Spartz Inc是一个网站媒体帝国，旗下的30多家网站专门发布有趣、励志和让人感到不可思议的文章和帖子，其分享量非常...
- MySQL惊喜之Prepared statement contains too many placeholders
- Spark SQL连接数据库找不到Mysql驱动解决方法
- mysql jdbc 实现读写分离
- 使用procedure analyse()分析mysql
- AMIS Technology blog: Database security for MySQL – MySQL Database Firewall