

# Query plan in Amazon Redshift
<a name="query-plan"></a>

A *query plan* is a list of instructions that the execution engine must follow to execute a query on the data. You can create a query plan by running the [EXPLAIN](https://docs.aws.amazon.com/redshift/latest/dg/r_EXPLAIN.html) command followed by the actual query text, as the following example query shows:

```
EXPLAIN
select s.s_name, sum(li.l_quantity) as quantity
from tpch.lineitem li
join tpch.orders o
    on o.o_orderkey = li.l_orderkey
    and o.o_orderdate > '1992-05-01'
join tpch.supplier s
    on s.s_suppkey = li.l_suppkey
group by s.s_name
order by quantity desc
limit 10;
```

If you run the `EXPLAIN` command for the query example above, you get the following output:

```
XN Limit (cost=1004400430902.15..1004400430902.17 rows=10 width=39)
  -> XN Merge (cost=1004400430902.15..1004400430926.97 rows=9928 width=39)
        Merge Key: sum(li.l_quantity)
        -> XN Network (cost=1004400430902.15..1004400430926.97 rows=9928 width=39)
              Send to leader
                  -> XN Sort (cost=1004400430902.15..1004400430926.97 rows=9928 width=39)
                     Sort Key: sum(li.l_quantity)
                      -> XN HashAggregate (cost=4400430218.24..4400430243.06 rows=9928 width=39)
                           -> XN Hash Join DS_BCAST_INNER (cost=21489.58..4400401726.35 rows=5698378 width=39)
                                 Hash Cond: ("outer".l_suppkey = "inner".s_suppkey)
                                     -> XN Hash Join DS_DIST_NONE (cost=21364.58..273387.85 rows=5698378 width=14)
                                            Hash Cond: ("outer".l_orderkey = "inner".o_orderkey)
                                                   -> XN Seq Scan on lineitem li (cost=0.00..60012.15 rows=6001215 width=22)
                                                   -> XN Hash (cost=17803.81..17803.81 rows=1424306 width=8)
                                                           -> XN Seq Scan on orders o (cost=0.00..17803.81 rows=1424306 width=8)
                                                                    Filter: (o_orderdate > '1992-05-01'::date)
                                     -> XN Hash (cost=100.00..100.00 rows=10000 width=33)
                                             -> XN Seq Scan on supplier s (cost=0.00..100.00 rows=10000 width=33)
```

**Note**  
The example query plan output is a simplified, high-level view of query execution. The example plan doesn't illustrate the details of parallel query processing. For detailed information, run the query and then use the [SVL\_QUERY\_SUMMARY](https://docs.aws.amazon.com/redshift/latest/dg/r_SVL_QUERY_SUMMARY.html) or [SVL\_QUERY\_REPORT](https://docs.aws.amazon.com/redshift/latest/dg/r_SVL_QUERY_REPORT.html) views to get query summary information.

## Amazon Redshift query editor v2
<a name="query-editor-v2"></a>

You can also view the query plans in Amazon Redshift by using the **Explain** option in the query editor v2. For instructions, see [Working with query editor v2](https://docs.aws.amazon.com/redshift/latest/mgmt/query-editor-v2-using.html) in the Amazon Redshift documentation.

The query plan generated by the query editor v2 includes the following information:
+ What operations the execution engine performs, reading the results from bottom to top
+ What type of step each operation performs
+ Which tables and columns are used in each operation
+ How much data is processed in each operation, in terms of number of rows and data width in bytes
+ The relative cost of the operation (Cost is a measure that compares the relative execution times of the steps within a plan. Cost doesn't provide precise information about actual execution times or memory consumption, nor does it provide a meaningful comparison between execution plans. Cost does, however, give you an indication of which operations in a query are consuming the most resources.)

## EXPLAIN plans
<a name="explain-plans"></a>

You can use the [STL\_EXPLAIN](https://docs.aws.amazon.com/redshift/latest/dg/r_STL_EXPLAIN.html) system table to display the `EXPLAIN` plan for a query that has been submitted for execution. Overall, using `STL_EXPLAIN` can help improve the performance, efficiency, and cost-effectiveness of your Amazon Redshift queries.

The benefits of using `STL_EXPLAIN` are:
+ **Performance optimization** – `STL_EXPLAIN` can help identify areas of a query that can be optimized for better performance.
+ **Query planning** – `STL_EXPLAIN` can provide information on how Amazon Redshift is executing the query and can help identify potential bottlenecks in the query.
+ **Debugging **– `STL_EXPLAIN` can help diagnose issues with a query by showing the steps that Amazon Redshift takes to execute that query.
+ **Understanding Amazon Redshift behavior** – `STL_EXPLAIN` can provide insights into how Amazon Redshift processes queries. This can help improve your understanding of Amazon Redshift behavior.
+ **Cost optimization** – `STL_EXPLAIN` can provide information on the estimated cost of a query. This can help you identify areas where you can optimize costs.

The following query is an example that returns the plan nodes for a given query:

```
select nodeid as id,
       plannode,
       info
from stl_explain
where query=1042904 
order by nodeid;
```

The preceding query returns the following output.



![Output of a query that returns the query plan nodes.](http://docs.aws.amazon.com/prescriptive-guidance/latest/query-lifecycle-redshift/images/query-output.png)


The `EXPLAIN` plan returns useful metrics for each operation, including metrics on cost, rows, and width. For example, line 7 from the preceding query returns the following:

```
->  XN Hash Join DS_DIST_NONE  (cost=21364.58..273387.85 rows=5698378 width=14)
```

### Cost
<a name="query-plan-cost"></a>

Cost is a relative value that's useful for comparing operations within a plan. Cost consists of two decimal values separated by two periods. In this example, cost is equal to `21364.58..273387.85`. Consider the following:
+ The first value (in this case, `21364.58`) provides the relative cost of returning the first row for this operation.
+ The second value (in this case, `273387.85`) provides the relative cost of completing the operation. 

The costs in the query plan are cumulative and roll up from lower to higher rows. In the example output above, row 7 includes the cost of the other operations in the rows below itself (that is, rows 8–12 and beyond).

### Rows
<a name="query-plan-rows"></a>

Rows is the estimated number of rows to return. In this example, the scan is expected to return 5,698,378 rows. The rows estimate is based on the available statistics generated by the `ANALYZE` command. If `ANALYZE` has not been run recently, the estimate is less reliable.

### Width
<a name="query-plan-width"></a>

Width is the estimated width of the average row, in bytes. In this example, the average row is expected to be 14 bytes wide.