Manually adjust the join order
This topic describes how to manually adjust the join order using hints.
Function introduction
AnalyticDB for MySQL supports complex join queries and automatically adjusts the join order by default. However, query statements and table filter conditions can change. If data characteristics are complex, the automatic adjustment may not select the optimal join order. A suboptimal join order can lead to large intermediate result sets and high memory usage, which degrades query performance.
To address these issues, AnalyticDB for MySQL lets you use the /*+ reorder_joins*/ hint to enable or disable the join order adjustment feature:
-
/*+ reorder_joins=true*/: Enables automatic join order adjustment. AnalyticDB for MySQL enables this feature by default, so you do not need to specify this hint when you execute SQL queries. -
/*+ reorder_joins=false*/: Disables automatic join order adjustment. This lets you manually adjust the join order based on the data characteristics of the query. The query is then executed using the join order specified in the SQL statement.
/*+ reorder_joins*/ is a session-level hint. It affects only the SQL query statement in which it is specified.
Adjustment method
-
Before adjustment
-
Query statement
The original Query10 statement is as follows:
Note-
This topic uses Query10 from TPC-H as an example to demonstrate the method and effect of manually adjusting the join order. For more information about TPC-H, see TPC-H.
-
The automatic join order adjustment feature of AnalyticDB for MySQL is enabled by default. The
/*+ reorder_joins=false*/hint is used in the following query statement to simulate a scenario with a suboptimal join order.
SELECT c_custkey, c_name, Sum(l_extendedprice * (1 - l_discount)) AS revenue, c_acctbal, n_name, c_address, c_phone, c_comment FROM customer c, orders o, lineitem l, nation n WHERE c_custkey = o_custkey AND l_orderkey = o_orderkey AND o_orderdate >= date '1993-10-01' AND o_orderdate < date '1993-10-01' + INTERVAL '3' month AND l_returnflag = 'R' AND c_nationkey = n_nationkey GROUP BY c_custkey, c_name, c_acctbal, c_phone, n_name, c_address, c_comment ORDER BY revenue DESC LIMIT 20; -
-
Join order
Based on the preceding SQL statement, the join order is:
customer JOIN orders JOIN lineitem JOIN nation; -
Query results
The intermediate results of each join in the execution plan are as follows:
NoteFor more information about how to view an execution plan, see Analyze a query using an execution plan.
-
The
customertable is joined with theorderstable. The join outputs 57,069 rows to the intermediate result settmp1. -
The intermediate result set
tmp1is joined with thelineitemtable using an INNER (Hash) Join on the conditiono_orderkey = l_orderkey. This operation takes 4.88 ms (21.67%) and outputs 114,705 rows to the new intermediate result settmp2. -
The intermediate result set
tmp2is joined with thenationtable. The join outputs 114,705 rows as the final result. The join node is of the INNER (Hash) type. The join condition isleft.c_nationkey = right.n_nationkey. The operation takes 5.71 ms (12.07%).
The cumulative number of rows output by the three joins is: 57,069 + 114,705 + 114,705 = 286,479.
-
-
-
After adjustment
-
Query statement
You can add the
/*+ reorder_joins=false*/hint to the SQL statement to disable the automatic join order adjustment feature of AnalyticDB for MySQL and then manually adjust the join order. The SQL statement with the adjusted order is as follows:/*reorder_joins=false*/ SELECT c_custkey, c_name, Sum(l_extendedprice * (1 - l_discount)) AS revenue, c_acctbal, n_name, c_address, c_phone, c_comment FROM customer c, orders o, nation n, lineitem l WHERE c_custkey = o_custkey AND c_nationkey = n_nationkey AND l_orderkey = o_orderkey AND o_orderdate >= date '1993-10-01' AND o_orderdate < date '1993-10-01' + INTERVAL '3' month AND l_returnflag = 'R' GROUP BY c_custkey, c_name, c_acctbal, c_phone, n_name, c_address, c_comment ORDER BY revenue DESC LIMIT 20; -
Join order
Based on the preceding SQL statement, the join order is:
customer JOIN orders JOIN nation JOIN lineitem -
Query results
The intermediate results of each join in the execution plan are as follows:
NoteFor more information about how to view an execution plan, see Analyze a query using an execution plan.
-
The
customertable is joined with theorderstable. The join outputs 57,069 rows to the intermediate result settmp1. The join node is of the INNER (Hash) type. The join condition isc_custkey = o_custkey. The operation takes 4.75 ms (36.7%). -
The intermediate result set
tmp1is joined with thenationtable. The join outputs 57,069 rows to the intermediate result settmp2. The join node is of the INNER (Hash) type. The join condition isleft.c_nationkey = right.n_nationkey. The operation takes 5.63 ms (25.61%). -
The intermediate result set
tmp2is joined with thelineitemtable using an INNER (Hash) join on the conditionleft.o_orderkey = right.l_orderkey. The join outputs 114,705 rows as the final result and takes 5.29 ms (9.81% of the total time).
The cumulative number of rows output by the three joins is: 57,069 + 57,069 + 114,705 = 228,843.
This is a 20% reduction in the number of output rows compared with the 286,479 rows from before the adjustment. This comparison shows that different join orders can affect the size of intermediate result sets. If you find that the join order in AnalyticDB for MySQL is not optimal, you can manually adjust it to improve query performance.
-
-