

 Amazon Redshift will no longer support the creation of new Python UDFs starting Patch 198. Existing Python UDFs will continue to function until June 30, 2026. For more information, see the [blog post](https://aws.amazon.com/blogs/big-data/amazon-redshift-python-user-defined-functions-will-reach-end-of-support-after-june-30-2026/). 

# Automatic table optimization
<a name="t_Creating_tables"></a>

Automatic table optimization is a self-tuning capability that automatically optimizes the design of tables by applying sort and distribution keys without the need for administrator intervention. By using automation to tune the design of tables, you can get started and get the fastest performance without investing time to manually tune and implement table optimizations. 

Automatic table optimization continuously observes how queries interact with tables. It uses advanced artificial intelligence methods to choose sort and distribution keys to optimize performance for the cluster's workload. If Amazon Redshift determines that applying a key improves cluster performance, tables are automatically altered within hours from the time the cluster was created, with minimal impact to queries. 

To take advantage of this automation, an Amazon Redshift administrator creates a new table, or alters an existing table to enable it to use automatic optimization. Existing tables with a distribution style or sort key of `AUTO` are already enabled for automation. When you run queries against those tables, Amazon Redshift determines if a sort key or distribution key will improve performance. If so, then Amazon Redshift automatically modifies the table without requiring administrator intervention. If a minimum number of queries are run, optimizations are applied within hours of the cluster being launched. 

 If Amazon Redshift determines that a distribution key improves the performance of queries, tables where distribution style is `AUTO` can have their distribution style changed to `KEY`.

**Topics**
+ [Enabling, disabling, and monitoring automatic table optimization](c_ato-enabling-disabling-monitoring.md)
+ [Managing workload exclusions from Autonomics](t_Manage_workload_exclusion.md)
+ [Column compression to reduce the size of stored data](t_Compressing_data_on_disk.md)
+ [Data distribution for query optimization](t_Distributing_data.md)
+ [Sort keys](t_Sorting_data.md)
+ [Table constraints](t_Defining_constraints.md)

# Enabling, disabling, and monitoring automatic table optimization
<a name="c_ato-enabling-disabling-monitoring"></a>

By default, tables created without explicitly defined sort keys or distribution keys are set to `AUTO`. At the time of table creation, you can also explicitly set a sort key or a distribution key manually. If you set the sort or distribution key, then the table is not automatically managed. 

## Enabling automatic table optimization
<a name="ato-enabling"></a>

To enable an existing table to be automatically optimized, use the ALTER statement options to change the table to `AUTO`. You might choose to define automation for sort keys, but not for distribution keys (and vice versa). If you run an ALTER statement to convert a table to be an automated table, existing sort keys and distribution styles are preserved. 

```
ALTER TABLE table_name ALTER SORTKEY AUTO;
```

```
ALTER TABLE table_name ALTER DISTSTYLE AUTO;
```

For more information, see [ALTER TABLE](r_ALTER_TABLE.md).

Initially, a table has no distribution key or sort key. The distribution style is set to either `EVEN` or `ALL` depending on table size. As the table grows in size, Amazon Redshift applies the optimal distribution keys and sort keys. Optimizations are applied within hours after a minimum number of queries are run. When determining sort key optimizations, Amazon Redshift attempts to optimize the data blocks read from disk during a table scan. When determining distribution style optimizations, Amazon Redshift tries to optimize the number of bytes transferred between cluster nodes. 

## Removing automatic table optimization from a table
<a name="ato-disabling"></a>

You can remove a table from automatic optimization. Removing a table from automation involves explicitly selecting a sort key or distribution style. To change the distribution style, specify one of the available distribution styles. 

```
ALTER TABLE table_name ALTER DISTSTYLE EVEN;
```

```
ALTER TABLE table_name ALTER DISTSTYLE ALL;
```

```
ALTER TABLE table_name ALTER DISTSTYLE KEY DISTKEY c1;
```

To change a sort key, you can define a sort key or choose none. 

```
ALTER TABLE table_name ALTER SORTKEY(c1, c2);
```

```
ALTER TABLE table_name ALTER SORTKEY NONE;
```

## Monitoring automatic table optimization
<a name="ato-monitoring-actions"></a>

The system view `SVV_ALTER_TABLE_RECOMMENDATIONS` records the current Amazon Redshift Advisor recommendations for tables. This view shows recommendations for all tables, whether or not they are defined for automatic optimization. 

To check whether a table is defined for automatic optimization, query the system view `SVV_TABLE_INFO`. Entries appear only for tables visible in the current session's database. Recommendations are inserted into the view twice per day, starting within hours of the time the cluster was created. After a recommendation is available, the corresponding optimization starts within an hour. After a recommendation has been applied (either by Amazon Redshift or by you), it no longer appears in the view. 

The system view `SVL_AUTO_WORKER_ACTION` shows an audit log of all actions taken by Amazon Redshift, and the previous state of the table.

The system view `SVV_TABLE_INFO` lists all of the tables in the system, along with a column to indicate whether the sort key and distribution style of the table is set to `AUTO`. 
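
For example, a query along the following lines shows which tables currently have their distribution style or sort key set to `AUTO`. This is a sketch; the exact values reported in the `diststyle` and `sortkey1` columns, such as `AUTO(EVEN)` or `AUTO(SORTKEY)`, depend on the current state of each table.

```
select "table", diststyle, sortkey1
from svv_table_info
where diststyle like 'AUTO%'
   or sortkey1 like 'AUTO%'
order by "table";
```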

For more information about these system views, see [System monitoring (provisioned only)](c_intro_system_views.md).

# Managing workload exclusions from Autonomics
<a name="t_Manage_workload_exclusion"></a>

 You can exclude specific provisioned endpoints or serverless workgroups from influencing autonomics decisions, such as distribution key and sort key selection, by using a denylist feature. This allows you to control which workloads Amazon Redshift considers when making optimization decisions for your Redshift Managed Storage (RMS) data.

 **Using the denylist** 

 You can manage the denylist through the Autonomics section in the Amazon Redshift console: 

1.  **Add or remove items** 

    Add specific provisioned endpoints or serverless workgroups to the denylist, and remove them when needed. 

1.  **View and search** 

   View all denylisted items and search for specific endpoints or workgroups in the denylist.

 The denylist is particularly useful when you run data marketplaces, share data with external users, or have multiple business units, and you want to prevent certain usage patterns from influencing optimization decisions for your workload. For example, if workgroup A runs workload A and workgroup B runs workload B on a shared table T, the sort key of T is determined by both workloads. If you want only workload A to influence the sort key decision, add workgroup B to the denylist of the endpoint or workgroup that owns table T. By default, Amazon Redshift Autonomics considers query patterns from the producer and all consumer clusters or workgroups unless they are specifically excluded through the denylist. 

**Note**  
 You can denylist resources across different AWS Regions within the same account. Cross-account denylisting is not yet supported. 

## Managing denylist resources in the Amazon Redshift console
<a name="manage-denylist-console"></a>

On the Amazon Redshift console, complete the following steps: 

1. Choose **Clusters** or **Serverless workgroups**.

1. Navigate to a specific cluster or workgroup details page.

1. Choose **Autonomics** in the tabs section.

1. Under the **Autonomics** tab, you can view and manage your denylist.

1. To manage a cross-Region denylist, choose the corresponding AWS Region.

## Adding denylist resources
<a name="add-denylist"></a>

1. Navigate to the **Autonomics** tab of the selected cluster or workgroup, select the AWS Region, and choose **Add resources**.

1. Select one or more provisioned clusters or serverless workgroups to add to the denylist, and choose **Add**.

1. The table shows the list of denylisted resources.

## Removing denylist resources
<a name="remove-denylist"></a>

1. Navigate to the **Autonomics** tab of the selected cluster or workgroup, and select the AWS Region.

1. Choose the cluster or workgroup that you want to delete from the list, and select **Remove**.

1. A confirmation dialog box opens. Select **Remove** to confirm.

# Column compression to reduce the size of stored data
<a name="t_Compressing_data_on_disk"></a>

*Compression* is a column-level operation that reduces the size of data when it is stored. Compression conserves storage space and reduces the size of data that is read from storage, which reduces the amount of disk I/O and therefore improves query performance.

ENCODE AUTO is the default for tables. When a table is set to ENCODE AUTO, Amazon Redshift automatically manages compression encoding for all columns in the table. For more information, see [CREATE TABLE](r_CREATE_TABLE_NEW.md) and [ALTER TABLE](r_ALTER_TABLE.md).

However, if you specify compression encoding for any column in the table, the table is no longer set to ENCODE AUTO. Amazon Redshift no longer automatically manages compression encoding for all columns in the table. 

You can apply a compression type, or *encoding*, to the columns in a table manually when you create the table. Or you can use the COPY command to analyze and apply compression automatically. For more information, see [Let COPY choose compression encodings](c_best-practices-use-auto-compression.md). For details about applying automatic compression, see [Loading tables with automatic compression](c_Loading_tables_auto_compress.md).

**Note**  
We strongly recommend using the COPY command to apply automatic compression.

You might choose to apply compression encodings manually if the new table shares the same data characteristics as another table. Or you might do so if you discover in testing that the compression encodings applied during automatic compression are not the best fit for your data. If you choose to apply compression encodings manually, you can run the [ANALYZE COMPRESSION](r_ANALYZE_COMPRESSION.md) command against an already populated table and use the results to choose compression encodings.

To apply compression manually, you specify compression encodings for individual columns as part of the CREATE TABLE statement. The syntax is as follows.

```
CREATE TABLE table_name (column_name
data_type ENCODE encoding-type [, ...])
```

Here, *encoding-type* is taken from the keyword table in the following section.

For example, the following statement creates a two-column table, PRODUCT. When data is loaded into the table, the PRODUCT_ID column is not compressed, but the PRODUCT_NAME column is compressed, using the byte-dictionary encoding (BYTEDICT).

```
create table product(
product_id int encode raw,
product_name char(20) encode bytedict);
```

You can specify the encoding for a column when it is added to a table using the ALTER TABLE command.

```
ALTER TABLE table-name ADD [ COLUMN ] column_name column_type ENCODE encoding-type
```
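
For example, the following statement adds a compressed column to the PRODUCT table from the earlier example. The new column name is illustrative.

```
alter table product add column product_desc varchar(200) encode lzo;
```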

**Topics**
+ [Compression encodings](c_Compression_encodings.md)
+ [Testing compression encodings](t_Verifying_data_compression.md)

# Compression encodings
<a name="c_Compression_encodings"></a>

<a name="compression-encoding-list"></a>A *compression encoding* specifies the type of compression that is applied to a column of data values as rows are added to a table.

ENCODE AUTO is the default for tables. When a table is set to ENCODE AUTO, Amazon Redshift automatically manages compression encoding for all columns in the table. For more information, see [CREATE TABLE](r_CREATE_TABLE_NEW.md) and [ALTER TABLE](r_ALTER_TABLE.md).

However, if you specify compression encoding for any column in the table, the table is no longer set to ENCODE AUTO. Amazon Redshift no longer automatically manages compression encoding for all columns in the table.

When you use CREATE TABLE, ENCODE AUTO is disabled when you specify compression encoding for any column in the table. If ENCODE AUTO is disabled, Amazon Redshift automatically assigns compression encoding to columns for which you don't specify an ENCODE type as follows:
+ Columns that are defined as sort keys are assigned RAW compression.
+ Columns that are defined as BOOLEAN, REAL, or DOUBLE PRECISION data types are assigned RAW compression.
+ Columns that are defined as SMALLINT, INTEGER, BIGINT, DECIMAL, DATE, TIMESTAMP, or TIMESTAMPTZ data types are assigned AZ64 compression.
+ Columns that are defined as CHAR or VARCHAR data types are assigned LZO compression.

You can change a table's encoding after creating it by using ALTER TABLE. If you disable ENCODE AUTO using ALTER TABLE, Amazon Redshift no longer automatically manages compression encodings for your columns. All columns will keep the compression encoding types that they had when you disabled ENCODE AUTO until you change them or you enable ENCODE AUTO again.
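
As a sketch, using the two-column PRODUCT table shown later in this topic: changing the encoding of a single column disables ENCODE AUTO for the table, and a separate ALTER TABLE statement re-enables it.

```
-- Changing one column's encoding disables ENCODE AUTO for the table
alter table product alter column product_name encode zstd;

-- Re-enable automatic encoding management for all columns
alter table product alter encode auto;
```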

Amazon Redshift supports the following compression encodings:

------
#### [ Raw ]

Raw encoding is the default encoding for columns that are designated as sort keys and columns that are defined as BOOLEAN, REAL, or DOUBLE PRECISION data types. With raw encoding, data is stored in raw, uncompressed form.

------
#### [ AZ64 ]

AZ64 is a proprietary compression encoding algorithm designed by Amazon to achieve a high compression ratio and improved query processing. At its core, the AZ64 algorithm compresses smaller groups of data values and uses single instruction, multiple data (SIMD) instructions for parallel processing. Use AZ64 to achieve significant storage savings and high performance for numeric, date, and time data types. 

You can use AZ64 as the compression encoding when defining columns using CREATE TABLE and ALTER TABLE statements with the following data types:
+ SMALLINT
+ INTEGER
+ BIGINT
+ DECIMAL
+ DATE
+ TIMESTAMP
+ TIMESTAMPTZ
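
For example, a table storing numeric and temporal measurements might apply AZ64 explicitly. This is a sketch; the table and column names are illustrative.

```
create table sensor_readings(
sensor_id integer encode az64,
reading_ts timestamp encode az64,
reading_value decimal(10,2) encode az64);
```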

------
#### [ Byte-dictionary ]

In byte dictionary encoding, a separate dictionary of unique values is created for each block of column values on disk. (An Amazon Redshift disk block occupies 1 MB.) The dictionary contains up to 256 one-byte values that are stored as indexes to the original data values. If more than 256 values are stored in a single block, the extra values are written into the block in raw, uncompressed form. The process repeats for each disk block.

This encoding is very effective on low cardinality string columns. This encoding is optimal when the data domain of a column is fewer than 256 unique values.

For columns with the string data type (CHAR and VARCHAR) encoded with BYTEDICT, Amazon Redshift performs vectorized scans and predicate evaluations that operate over compressed data directly. These scans use hardware-specific single instruction and multiple data (SIMD) instructions for parallel processing. This significantly speeds up the scanning of string columns. Byte-dictionary encoding is especially space-efficient if a CHAR/VARCHAR column holds long character strings.

Suppose that a table has a COUNTRY column with a CHAR(30) data type. As data is loaded, Amazon Redshift creates the dictionary and populates the COUNTRY column with the index value. The dictionary contains the indexed unique values, and the table itself contains only the one-byte subscripts of the corresponding values.

**Note**  
Trailing blanks are stored for fixed-length character columns. Therefore, in a CHAR(30) column, every compressed value saves 29 bytes of storage when you use the byte-dictionary encoding.

The following table represents the dictionary for the COUNTRY column.

[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/redshift/latest/dg/c_Compression_encodings.html)

The following table represents the values in the COUNTRY column.

[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/redshift/latest/dg/c_Compression_encodings.html)

The total compressed size in this example is calculated as follows: 6 different entries are stored in the dictionary (6 × 30 = 180 bytes), and the table contains 10 one-byte compressed values, for a total of 190 bytes.

------
#### [ Delta ]

Delta encodings are very useful for date time columns.

Delta encoding compresses data by recording the difference between values that follow each other in the column. This difference is recorded in a separate dictionary for each block of column values on disk. (An Amazon Redshift disk block occupies 1 MB.) For example, suppose that the column contains 10 integers in sequence from 1 to 10. The first value is stored as a 4-byte integer (plus a 1-byte flag). The next nine values are each stored as a byte with the value 1, indicating that each is one greater than the previous value.

Delta encoding comes in two variations: 
+ DELTA records the differences as 1-byte values (8-bit integers)
+ DELTA32K records differences as 2-byte values (16-bit integers)

If most of the values in the column could be compressed by using a single byte, the 1-byte variation is very effective. However, if the deltas are larger, this encoding, in the worst case, is somewhat less effective than storing the uncompressed data. Similar logic applies to the 16-bit version.

If the difference between two values exceeds the 1-byte range (DELTA) or 2-byte range (DELTA32K), the full original value is stored, with a leading 1-byte flag. The 1-byte range is from -127 to 127, and the 2-byte range is from -32K to 32K.
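
For example, a fact table whose rows are loaded in date order might apply delta encoding to its date column. This is a sketch; the table and column names are illustrative.

```
create table daily_orders(
order_id bigint,
order_date date encode delta32k);
```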

The following table shows how a delta encoding works for a numeric column.

[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/redshift/latest/dg/c_Compression_encodings.html)

------
#### [ LZO ]

LZO encoding provides a very high compression ratio with good performance. LZO encoding works especially well for CHAR and VARCHAR columns that store very long character strings, particularly free-form text such as product descriptions, user comments, or JSON strings. 

------
#### [ Mostly ]

Mostly encodings are useful when the data type for a column is larger than most of the stored values require. By specifying a mostly encoding for this type of column, you can compress the majority of the values in the column to a smaller standard storage size. The remaining values that cannot be compressed are stored in their raw form. For example, you can compress a 16-bit column, such as an INT2 column, to 8-bit storage.

In general, the mostly encodings work with the following data types:
+ SMALLINT/INT2 (16-bit)
+ INTEGER/INT (32-bit)
+ BIGINT/INT8 (64-bit)
+ DECIMAL/NUMERIC (64-bit)

Choose the appropriate variation of the mostly encoding to suit the size of the data type for the column. For example, apply MOSTLY8 to a column that is defined as a 16-bit integer column. Applying MOSTLY16 to a column with a 16-bit data type or MOSTLY32 to a column with a 32-bit data type is disallowed.
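
For example, the following sketch applies MOSTLY8 to a 32-bit quantity column whose values rarely exceed the 1-byte range, and MOSTLY32 to a 64-bit identifier column. The names are illustrative.

```
create table line_items(
line_item_id bigint encode mostly32,
quantity integer encode mostly8);
```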

Mostly encodings might be less effective than no compression when a relatively high number of the values in the column can't be compressed. Before applying one of these encodings to a column, perform a check. *Most* of the values that you are going to load now (and are likely to load in the future) should fit into the ranges shown in the following table.

[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/redshift/latest/dg/c_Compression_encodings.html)

**Note**  
For decimal values, ignore the decimal point to determine whether the value fits into the range. For example, 1,234.56 is treated as 123,456 and can be compressed in a MOSTLY32 column.

For example, the VENUEID column in the VENUE table is defined as a raw integer column, which means that its values consume 4 bytes of storage. However, the current range of values in the column is **0** to **309**. Therefore, recreating and reloading this table with MOSTLY16 encoding for VENUEID would reduce the storage of every value in that column to 2 bytes.

If the VENUEID values referenced in another table were mostly in the range of 0 to 127, it might make sense to encode that foreign-key column as MOSTLY8. Before making the choice, run several queries against the referencing table data to find out whether the values mostly fall into the 8-bit, 16-bit, or 32-bit range.
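
Such a check can be sketched as follows, assuming a LISTING table with a VENUEID column. The query reports the overall range of values and how many of them fit into the 8-bit range.

```
select min(venueid) as min_id,
       max(venueid) as max_id,
       sum(case when venueid between -127 and 127 then 1 else 0 end) as fits_8_bit,
       count(*) as total_rows
from listing;
```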

The following table shows compressed sizes for specific numeric values when the MOSTLY8, MOSTLY16, and MOSTLY32 encodings are used:

[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/redshift/latest/dg/c_Compression_encodings.html)

------
#### [ Run length ]

Run length encoding replaces a value that is repeated consecutively with a token that consists of the value and a count of the number of consecutive occurrences (the length of the run). A separate dictionary of unique values is created for each block of column values on disk. (An Amazon Redshift disk block occupies 1 MB.) This encoding is best suited to a table in which data values are often repeated consecutively, for example, when the table is sorted by those values.

For example, suppose that a column in a large dimension table has a predictably small domain, such as a COLOR column with fewer than 10 possible values. These values are likely to fall in long sequences throughout the table, even if the data is not sorted.

We don't recommend applying run length encoding on any column that is designated as a sort key. Range-restricted scans perform better when blocks contain similar numbers of rows. If sort key columns are compressed much more highly than other columns in the same query, range-restricted scans might perform poorly.
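
For example, a low-cardinality COLOR column like the one described above might be encoded as follows. This is a sketch; the table name is illustrative, and the COLOR column is deliberately not part of the sort key.

```
create table product_colors(
product_id integer sortkey,
color varchar(10) encode runlength);
```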

The following table uses the COLOR column example to show how the run length encoding works.

[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/redshift/latest/dg/c_Compression_encodings.html)

------
#### [ Text255 and Text32k ]

Text255 and text32k encodings are useful for compressing VARCHAR columns in which the same words recur often. A separate dictionary of unique words is created for each block of column values on disk. (An Amazon Redshift disk block occupies 1 MB.) The dictionary contains the first 245 unique words in the column. Those words are replaced on disk by a one-byte index value representing one of the 245 values, and any words that are not represented in the dictionary are stored uncompressed. The process repeats for each 1-MB disk block. If the indexed words occur frequently in the column, the column yields a high compression ratio.

For the text32k encoding, the principle is the same, but the dictionary for each block does not capture a specific number of words. Instead, the dictionary indexes each unique word it finds until the combined entries reach a length of 32K, minus some overhead. The index values are stored in two bytes.

For example, consider the VENUENAME column in the VENUE table. Words such as **Arena**, **Center**, and **Theatre** recur in this column and are likely to be among the first 245 words encountered in each block if text255 compression is applied. If so, this column benefits from compression. This is because every time those words appear, they occupy only 1 byte of storage (instead of 5, 6, or 7 bytes, respectively).

------
#### [ ZSTD ]

Zstandard (ZSTD) encoding provides a high compression ratio with very good performance across diverse datasets. ZSTD works especially well with CHAR and VARCHAR columns that store a wide range of long and short strings, such as product descriptions, user comments, logs, and JSON strings. Where some algorithms, such as Delta encoding or Mostly encoding, can potentially use more storage space than no compression, ZSTD is unlikely to increase disk usage. 

ZSTD supports SMALLINT, INTEGER, BIGINT, DECIMAL, REAL, DOUBLE PRECISION, BOOLEAN, CHAR, VARCHAR, DATE, TIMESTAMP, and TIMESTAMPTZ data types.
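
For example, a log table with a mix of short and long strings might apply ZSTD across its columns. This is a sketch; the table and column names are illustrative.

```
create table app_logs(
log_ts timestamp encode zstd,
log_level char(5) encode zstd,
message varchar(4000) encode zstd);
```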

------

The following table identifies the supported compression encodings and the data types that support the encoding.

[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/redshift/latest/dg/c_Compression_encodings.html)

# Testing compression encodings
<a name="t_Verifying_data_compression"></a>

If you decide to manually specify column encodings, you might want to test different encodings with your data.

**Note**  
We recommend that you use the COPY command to load data whenever possible, and allow the COPY command to choose the optimal encodings based on your data. Or you can use the [ANALYZE COMPRESSION](r_ANALYZE_COMPRESSION.md) command to view the suggested encodings for existing data. For details about applying automatic compression, see [Loading tables with automatic compression](c_Loading_tables_auto_compress.md).

To perform a meaningful test of data compression, you must have a large number of rows. For this example, we create a table and insert rows by using a statement that selects from two tables: VENUE and LISTING. We leave out the WHERE clause that would normally join the two tables. The result is that *each* row in the VENUE table is joined to *all* of the rows in the LISTING table, for a total of over 32 million rows. This is known as a Cartesian join and normally is not recommended. However, for this purpose, it's a convenient method of creating many rows. If you have an existing table with data that you want to test, you can skip this step.

After we have a table with sample data, we create a table with seven columns. Each has a different compression encoding: raw, bytedict, lzo, run length, text255, text32k, and zstd. We populate each column with exactly the same data by running an INSERT command that selects the data from the first table.

To test compression encodings, do the following:

1.  (Optional) First, use a Cartesian join to create a table with a large number of rows. Skip this step if you want to test an existing table. 

   ```
   create table cartesian_venue(
   venueid smallint not null distkey sortkey,
   venuename varchar(100),
   venuecity varchar(30),
   venuestate char(2),
   venueseats integer);
   
   insert into cartesian_venue
   select venueid, venuename, venuecity, venuestate, venueseats
   from venue, listing;
   ```

1.  Next, create a table with the encodings that you want to compare.  

   ```
   create table encodingvenue (
   venueraw varchar(100) encode raw,
   venuebytedict varchar(100) encode bytedict,
   venuelzo varchar(100) encode lzo,
   venuerunlength varchar(100) encode runlength,
   venuetext255 varchar(100) encode text255,
   venuetext32k varchar(100) encode text32k,
   venuezstd varchar(100) encode zstd);
   ```

1.  Insert the same data into all of the columns using an INSERT statement with a SELECT clause. 

   ```
   insert into encodingvenue
   select venuename as venueraw, venuename as venuebytedict, venuename as venuelzo, venuename as venuerunlength, venuename as venuetext255, venuename as venuetext32k, venuename as venuezstd
   from cartesian_venue;
   ```

1.  Verify the number of rows in the new table. 

   ```
   select count(*) from encodingvenue;
   
     count
   ----------
    38884394
   (1 row)
   ```

1.  Query the [STV_BLOCKLIST](r_STV_BLOCKLIST.md) system table to compare the number of 1 MB disk blocks used by each column.  

   The MAX aggregate function returns the highest block number for each column. The STV_BLOCKLIST table includes details for three system-generated columns. This example uses `col < 7` in the WHERE clause to exclude the system-generated columns. 

   ```
   select col, max(blocknum)
   from stv_blocklist b, stv_tbl_perm p
   where (b.tbl=p.id) and name ='encodingvenue'
   and col < 7
   group by name, col
   order by col;
   ```

   The query returns the following results. The columns are numbered beginning with zero. Depending on how your cluster is configured, your result might have different numbers, but the relative sizes should be similar. You can see that BYTEDICT encoding on the second column produced the best results for this dataset. This approach has a compression ratio of better than 20:1. LZO and ZSTD encoding also produced excellent results. Different datasets produce different results, of course. When a column contains longer text strings, LZO often produces the best compression results.

   ```
    col | max
   -----+-----
      0 | 203
      1 |  10
      2 |  22
      3 | 204
      4 |  56
      5 |  72
      6 |  20
   (7 rows)
   ```

If you have data in an existing table, you can use the [ANALYZE COMPRESSION](r_ANALYZE_COMPRESSION.md) command to view the suggested encodings for the table. For example, the following shows the recommended encodings for a copy of the VENUE table, CARTESIAN_VENUE, that contains 38 million rows. Notice that ANALYZE COMPRESSION recommends LZO encoding for the VENUENAME column. ANALYZE COMPRESSION chooses optimal compression based on multiple factors, which include the percent of reduction. In this specific case, BYTEDICT provides better compression, but LZO also produces greater than 90 percent compression. 

```
analyze compression cartesian_venue;

Table           | Column     | Encoding | Est_reduction_pct
----------------+------------+----------+------------------
cartesian_venue | venueid    | lzo      | 97.54            
cartesian_venue | venuename  | lzo      | 91.71            
cartesian_venue | venuecity  | lzo      | 96.01            
cartesian_venue | venuestate | lzo      | 97.68            
cartesian_venue | venueseats | lzo      | 98.21
```

## Example
<a name="Examples__compression_encodings_in_CREATE_TABLE_statements"></a>

The following example creates a CUSTOMER table that has columns with various data types. This CREATE TABLE statement shows one of many possible combinations of compression encodings for these columns. 

```
create table customer(
custkey int encode delta,
custname varchar(30) encode raw,
gender varchar(7) encode text255,
address varchar(200) encode text255,
city varchar(30) encode text255,
state char(2) encode raw,
zipcode char(5) encode bytedict,
start_date date encode delta32k);
```

The following table shows the column encodings that were chosen for the CUSTOMER table and gives an explanation for the choices:


| Column | Data type | Encoding | Explanation | 
| --- | --- | --- | --- | 
| CUSTKEY | int | delta | CUSTKEY consists of unique, consecutive integer values. Because the differences are one byte, DELTA is a good choice. | 
| CUSTNAME | varchar(30) | raw | CUSTNAME has a large domain with few repeated values. Any compression encoding would probably be ineffective. | 
| GENDER | varchar(7) | text255 | GENDER is a very small domain with many repeated values. Text255 works well with VARCHAR columns in which the same words recur. | 
| ADDRESS | varchar(200) | text255 | ADDRESS is a large domain, but contains many repeated words, such as Street, Avenue, North, South, and so on. Text255 and text32k are useful for compressing VARCHAR columns in which the same words recur. The column length is short, so text255 is a good choice. | 
| CITY | varchar(30) | text255 | CITY is a large domain, with some repeated values. Certain city names are used much more commonly than others. Text255 is a good choice for the same reasons as ADDRESS. | 
| STATE | char(2) | raw | In the United States, STATE is a precise domain of 50 two-character values. Bytedict encoding would yield some compression, but because the column size is only two characters, compression might not be worth the overhead of uncompressing the data. | 
| ZIPCODE | char(5) | bytedict | ZIPCODE is a known domain of fewer than 50,000 unique values. Certain zip codes occur much more commonly than others. Bytedict encoding is very effective when a column contains a limited number of unique values.  | 
| START\_DATE | date | delta32k | Delta encodings are very useful for date and time columns, especially if the rows are loaded in date order. | 

# Data distribution for query optimization
<a name="t_Distributing_data"></a>

When you load data into a table, Amazon Redshift distributes the rows of the table to each of the compute nodes according to the table's distribution style. When you run a query, the query optimizer redistributes the rows to the compute nodes as needed to perform any joins and aggregations. The goal in choosing a table distribution style is to minimize the impact of the redistribution step by locating the data where it must be before the query is run.

**Note**  
This section will introduce you to the principles of data distribution in an Amazon Redshift database. We recommend that you create your tables with `DISTSTYLE AUTO`. If you do so, then Amazon Redshift uses automatic table optimization to choose the data distribution style. For more information, see [Automatic table optimization](t_Creating_tables.md). The rest of this section provides details about distribution styles. 

**Topics**
+ [Data distribution concepts](#t_data_distribution_concepts)
+ [Distribution styles](c_choosing_dist_sort.md)
+ [Viewing distribution styles](viewing-distribution-styles.md)
+ [Evaluating query patterns](t_evaluating_query_patterns.md)
+ [Designating distribution styles](t_designating_distribution_styles.md)
+ [Evaluating the query plan](c_data_redistribution.md)
+ [Query plan example](t_explain_plan_example.md)
+ [Distribution examples](c_Distribution_examples.md)

## Data distribution concepts
<a name="t_data_distribution_concepts"></a>

Some data distribution concepts for Amazon Redshift follow.

 **Nodes and slices** 

 An Amazon Redshift cluster is a set of nodes. Each node in the cluster has its own operating system, dedicated memory, and dedicated disk storage. One node is the *leader node*, which manages the distribution of data and query processing tasks to the compute nodes. The *compute nodes* provide resources to do those tasks. 

 The disk storage for a compute node is divided into a number of *slices*. The number of slices per node depends on the node size of the cluster. The nodes all participate in running parallel queries, working on data that is distributed as evenly as possible across the slices. For more information about the number of slices that each node size has, see [About clusters and nodes](https://docs.aws.amazon.com/redshift/latest/mgmt/working-with-clusters.html#rs-about-clusters-and-nodes) in the *Amazon Redshift Management Guide*.

 **Data redistribution** 

 When you load data into a table, Amazon Redshift distributes the rows of the table to each of the node slices according to the table's distribution style. As part of a query plan, the optimizer determines where blocks of data must be located to best run the query. The data is then physically moved, or redistributed, while the query runs. Redistribution might involve either sending specific rows to nodes for joining or broadcasting an entire table to all of the nodes. 

 Data redistribution can account for a substantial portion of the cost of a query plan, and the network traffic it generates can affect other database operations and slow overall system performance. To the extent that you anticipate where best to locate data initially, you can minimize the impact of data redistribution. 

 **Data distribution goals** 

 When you load data into a table, Amazon Redshift distributes the table's rows to the compute nodes and slices according to the distribution style that you chose when you created the table. Data distribution has two primary goals: 
+ To distribute the workload uniformly among the nodes in the cluster. Uneven distribution, or data distribution skew, forces some nodes to do more work than others, which impairs query performance.
+ To minimize data movement as a query runs. If the rows that participate in joins or aggregates are already collocated on the nodes with their joining rows in other tables, the optimizer doesn't need to redistribute as much data when queries run.

The distribution strategy that you choose for your database has important consequences for query performance, storage requirements, data loading, and maintenance. By choosing the best distribution style for each table, you can balance your data distribution and significantly improve overall system performance.

# Distribution styles
<a name="c_choosing_dist_sort"></a>

When you create a table, you can designate one of the following distribution styles: AUTO, EVEN, KEY, or ALL. 

If you don't specify a distribution style, Amazon Redshift uses AUTO distribution.

 **AUTO distribution** 

With AUTO distribution, Amazon Redshift assigns an optimal distribution style based on the size of the table data. For example, if AUTO distribution style is specified, Amazon Redshift initially assigns the ALL distribution style to a small table. When the table grows larger, Amazon Redshift might change the distribution style to KEY, choosing the primary key (or a column of the composite primary key) as the distribution key. If the table grows larger and none of the columns are suitable to be the distribution key, Amazon Redshift changes the distribution style to EVEN. The change in distribution style occurs in the background with minimal impact to user queries. 
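
For example, both of the following statements create a table with AUTO distribution; the table and column names are illustrative. Amazon Redshift might later change the effective style to AUTO (KEY) or AUTO (EVEN) as the table grows.

```
-- Omitting the distribution style defaults to AUTO.
create table public.orders_auto (orderid int, custid int);

-- Specifying AUTO explicitly has the same effect.
create table public.orders_auto2 (orderid int, custid int)
diststyle auto;
```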

To view actions that Amazon Redshift automatically performed to alter a table distribution key, see [SVL\_AUTO\_WORKER\_ACTION](r_SVL_AUTO_WORKER_ACTION.md). To view current recommendations regarding altering a table distribution key, see [SVV\_ALTER\_TABLE\_RECOMMENDATIONS](r_SVV_ALTER_TABLE_RECOMMENDATIONS.md). 

To view the distribution style applied to a table, query the PG\_CLASS\_INFO system catalog view. For more information, see [Viewing distribution styles](viewing-distribution-styles.md). If you don't specify a distribution style with the CREATE TABLE statement, Amazon Redshift applies AUTO distribution. 

 **EVEN distribution** 

 The leader node distributes the rows across the slices in a round-robin fashion, regardless of the values in any particular column. EVEN distribution is appropriate when a table doesn't participate in joins. It's also appropriate when there isn't a clear choice between KEY distribution and ALL distribution.

 **KEY distribution** 

 The rows are distributed according to the values in one column. The leader node places matching values on the same node slice. If you distribute a pair of tables on the joining keys, the leader node collocates the rows on the slices according to the values in the joining columns. This way, matching values from the common columns are physically stored together. 
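
As a sketch, the following hypothetical tables are distributed on their joining column, so matching rows are stored on the same slices and the join can run without redistribution (table and column names are assumptions for illustration):

```
-- Dimension table distributed on its primary key.
create table customer_dim (
    custid int distkey,
    custname varchar(30));

-- Fact table distributed on the matching foreign key, so rows
-- with the same custid land on the same slice as the dimension rows.
create table orders_fact (
    orderid int,
    custid int distkey,
    amount decimal(8,2));

-- This join is collocated: no rows need to move between nodes.
select c.custname, sum(o.amount)
from orders_fact o
join customer_dim c on o.custid = c.custid
group by c.custname;
```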

 **ALL distribution** 

 A copy of the entire table is distributed to every node. Where EVEN distribution or KEY distribution places only a portion of a table's rows on each node, ALL distribution ensures that every row is collocated for every join that the table participates in. 

 ALL distribution multiplies the storage required by the number of nodes in the cluster, and so it takes much longer to load, update, or insert data into multiple tables. ALL distribution is appropriate only for relatively slow moving tables; that is, tables that are not updated frequently or extensively. Because the cost of redistributing small tables during a query is low, there isn't a significant benefit to defining small dimension tables as DISTSTYLE ALL. 

**Note**  
 After you have specified a distribution style for a table, Amazon Redshift handles data distribution at the cluster level. Amazon Redshift does not require or support the concept of partitioning data within database objects. You don't need to create table spaces or define partitioning schemes for tables. 

In certain scenarios, you can change the distribution style of a table after it is created. For more information, see [ALTER TABLE](r_ALTER_TABLE.md). In scenarios where you can't change the distribution style of a table after it's created, you can recreate the table and populate the new table with a deep copy. For more information, see [Performing a deep copy](performing-a-deep-copy.md).
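
A minimal sketch of both approaches, assuming a hypothetical table named `mytable` with a column `col1`:

```
-- Change the distribution style in place where ALTER TABLE supports it.
alter table mytable alter diststyle even;

-- Otherwise, perform a deep copy: create a new table with the desired
-- style, copy the rows, then swap the names.
create table mytable_new distkey(col1) as select * from mytable;
alter table mytable rename to mytable_old;
alter table mytable_new rename to mytable;
drop table mytable_old;
```
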

# Viewing distribution styles
<a name="viewing-distribution-styles"></a>

To view the distribution style of a table, query the PG\_CLASS\_INFO view or the SVV\_TABLE\_INFO view.

The RELEFFECTIVEDISTSTYLE column in PG\_CLASS\_INFO indicates the current distribution style for the table. If the table uses automatic distribution, RELEFFECTIVEDISTSTYLE is 10, 11, or 12, which indicates whether the effective distribution style is AUTO (ALL), AUTO (EVEN), or AUTO (KEY). If the table uses automatic distribution, the distribution style might initially show AUTO (ALL), then change to AUTO (EVEN) or AUTO (KEY) when the table grows. 
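
For example, to check the effective style of a hypothetical table named `mytable`, a query along these lines returns the numeric code (a minimal sketch):

```
-- Returns the RELEFFECTIVEDISTSTYLE code described above,
-- for example 8 for ALL or 12 for AUTO (KEY).
select relname, releffectivediststyle
from pg_class_info
where relname = 'mytable';
```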

The following table gives the distribution style for each value in RELEFFECTIVEDISTSTYLE column: 


| RELEFFECTIVEDISTSTYLE | Current distribution style | 
| --- | --- | 
| 0 | EVEN | 
| 1 | KEY | 
| 8 | ALL | 
| 10 | AUTO (ALL) | 
| 11 | AUTO (EVEN) | 
| 12 | AUTO (KEY) | 

The DISTSTYLE column in SVV\_TABLE\_INFO indicates the current distribution style for the table. If the table uses automatic distribution, DISTSTYLE is AUTO (ALL), AUTO (EVEN), or AUTO (KEY).

The following example creates four tables using the three distribution styles and automatic distribution, then queries SVV\_TABLE\_INFO to view the distribution styles. 

```
create table public.dist_key (col1 int)
diststyle key distkey (col1);

insert into public.dist_key values (1);

create table public.dist_even (col1 int)
diststyle even;

insert into public.dist_even values (1);

create table public.dist_all (col1 int)
diststyle all;

insert into public.dist_all values (1);

create table public.dist_auto (col1 int);

insert into public.dist_auto values (1);

select "schema", "table", diststyle from SVV_TABLE_INFO
where "table" like 'dist%';

        schema   |    table        | diststyle
     ------------+-----------------+------------
      public     | dist_key        | KEY(col1)
      public     | dist_even       | EVEN
      public     | dist_all        | ALL
      public     | dist_auto       | AUTO(ALL)
```

# Evaluating query patterns
<a name="t_evaluating_query_patterns"></a>

 Choosing distribution styles is only one aspect of database design. Consider distribution styles within the context of the entire system, balancing distribution with other important factors such as cluster size, compression encoding methods, sort keys, and table constraints. 

 Test your system with data that is as close to real data as possible. 

To make good choices for distribution styles, you must understand the query patterns for your Amazon Redshift application. Identify the most costly queries in your system and base your initial database design on the demands of those queries. Factors that determine the total cost of a query include how long the query takes to run and how many computing resources it consumes. Other factors that determine query cost are how often it is run, and how disruptive it is to other queries and database operations. 

 Identify the tables that are used by the most costly queries, and evaluate their role in query runtime. Consider how the tables are joined and aggregated. 

 Use the guidelines in this section to choose a distribution style for each table. When you have done so, create the tables and load them with data that is as close as possible to real data. Then test the tables for the types of queries that you expect to use. You can evaluate the query explain plans to identify tuning opportunities. Compare load times, storage space, and query runtimes to balance your system's overall requirements. 

# Designating distribution styles
<a name="t_designating_distribution_styles"></a>

 The considerations and recommendations for designating distribution styles in this section use a star schema as an example. Your database design might be based on a star schema, some variant of a star schema, or an entirely different schema. Amazon Redshift is designed to work effectively with whatever schema design you choose. The principles in this section can be applied to any design schema. 

1.  **Specify the primary key and foreign keys for all your tables.** 

   Amazon Redshift does not enforce primary key and foreign key constraints, but the query optimizer uses them when it generates query plans. If you set primary keys and foreign keys, your application must maintain the validity of the keys. 

1.  **Distribute the fact table and its largest dimension table on their common columns.** 

   Choose the largest dimension based on the size of the dataset that participates in the most common join, not only the size of the table. If a table is commonly filtered using a WHERE clause, only a portion of its rows participates in the join. Such a table has less impact on redistribution than a smaller table that contributes more data. Designate both the dimension table's primary key and the fact table's corresponding foreign key as DISTKEY. If multiple tables use the same distribution key, they are also collocated with the fact table. Your fact table can have only one distribution key. Any tables that join on another key aren't collocated with the fact table. 

1.  **Designate distribution keys for the other dimension tables.** 

   Distribute the tables on their primary keys or their foreign keys, depending on how they most commonly join with other tables. 

1.  **Evaluate whether to change some of the dimension tables to use ALL distribution.** 

   If a dimension table cannot be collocated with the fact table or other important joining tables, you can improve query performance significantly by distributing the entire table to all of the nodes. Using ALL distribution multiplies storage space requirements and increases load times and maintenance operations, so you should weigh all factors before choosing ALL distribution. The following section explains how to identify candidates for ALL distribution by evaluating the EXPLAIN plan. 

1.  **Use AUTO distribution for the remaining tables.** 

   If a table is largely denormalized and does not participate in joins, or if you don't have a clear choice for another distribution style, use AUTO distribution. 

To let Amazon Redshift choose the appropriate distribution style, don't explicitly specify a distribution style.
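
As a sketch of steps 1 and 2 above, the following hypothetical star-schema tables declare primary and foreign keys and distribute the fact table and its largest dimension on their common column (all table and column names are illustrative):

```
-- Largest dimension: distributed on its primary key.
create table dim_product (
    productid int not null distkey,
    productname varchar(50),
    primary key (productid));

-- Fact table: distributed on the corresponding foreign key, so
-- joining rows are collocated on the same slices. The key
-- constraints aren't enforced, but the optimizer uses them.
create table fact_sales (
    saleid int not null,
    productid int not null distkey,
    amount decimal(8,2),
    primary key (saleid),
    foreign key (productid) references dim_product(productid));
```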

# Evaluating the query plan
<a name="c_data_redistribution"></a>

You can use query plans to identify candidates for optimizing the distribution style. 

After making your initial design decisions, create your tables, load them with data, and test them. Use a test dataset that is as close as possible to the real data. Measure load times to use as a baseline for comparisons. 

Evaluate queries that are representative of the most costly queries you expect to run, specifically queries that use joins and aggregations. Compare runtimes for various design options. When you compare runtimes, don't count the first time the query is run, because the first runtime includes the compilation time. 

When you examine the plan, look for the following redistribution labels:

**DS\_DIST\_NONE**  
No redistribution is required, because corresponding slices are collocated on the compute nodes. You typically have only one DS\_DIST\_NONE step, the join between the fact table and one dimension table. 

**DS\_DIST\_ALL\_NONE**  
No redistribution is required, because the inner join table used DISTSTYLE ALL. The entire table is located on every node. 

**DS\_DIST\_INNER**  
The inner table is redistributed. 

**DS\_DIST\_OUTER**  
The outer table is redistributed. 

**DS\_BCAST\_INNER**  
A copy of the entire inner table is broadcast to all the compute nodes. 

**DS\_DIST\_ALL\_INNER**  
The entire inner table is redistributed to a single slice because the outer table uses DISTSTYLE ALL.

**DS\_DIST\_BOTH**  
Both tables are redistributed. 

**DS\_DIST\_ERR**  
The tables don't have a distribution style selected.

DS\_DIST\_NONE and DS\_DIST\_ALL\_NONE are good. They indicate that no distribution was required for that step because all of the joins are collocated. 

DS\_DIST\_INNER means that the step probably has a relatively high cost because the inner table is being redistributed to the nodes. DS\_DIST\_INNER indicates that the outer table is already properly distributed on the join key. Set the inner table's distribution key to the join key to convert this to DS\_DIST\_NONE. In some cases, distributing the inner table on the join key isn't possible because the outer table isn't distributed on the join key. If this is the case, evaluate whether to use ALL distribution for the inner table. If the table isn't updated frequently or extensively, and it's large enough to carry a high redistribution cost, change the distribution style to ALL and test again. ALL distribution causes increased load times, so when you retest, include the load time in your evaluation factors. 
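
For example, if the plan shows DS\_DIST\_INNER for a join on a hypothetical `custid` column of a table named `inner_dim`, you might try either remedy and retest (names are assumptions for illustration):

```
-- Redistribute the inner table on the join column, aiming for
-- DS_DIST_NONE on the next EXPLAIN.
alter table inner_dim alter distkey custid;

-- If the outer table isn't distributed on the join key, consider
-- ALL distribution for the inner table instead (weigh the extra
-- storage and load time).
alter table inner_dim alter diststyle all;
```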

DS\_DIST\_ALL\_INNER is not good. It means that the entire inner table is redistributed to a single slice because the outer table uses DISTSTYLE ALL, so that a copy of the entire outer table is located on each node. This results in inefficient serial runtime of the join on a single node, instead of taking advantage of parallel runtime using all of the nodes. DISTSTYLE ALL is meant to be used only for the inner join table. Instead, specify a distribution key or use EVEN distribution for the outer table.

DS\_BCAST\_INNER and DS\_DIST\_BOTH are not good. Usually these redistributions occur because the tables are not joined on their distribution keys. If the fact table does not already have a distribution key, specify the joining column as the distribution key for both tables. If the fact table already has a distribution key on another column, evaluate whether changing the distribution key to collocate this join improves overall performance. If changing the distribution key of the outer table isn't an optimal choice, you can achieve collocation by specifying DISTSTYLE ALL for the inner table. 

 The following example shows a portion of a query plan with DS\_BCAST\_INNER and DS\_DIST\_NONE labels.

```
->  XN Hash Join DS_BCAST_INNER  (cost=112.50..3272334142.59 rows=170771 width=84)
        Hash Cond: ("outer".venueid = "inner".venueid)
        ->  XN Hash Join DS_BCAST_INNER  (cost=109.98..3167290276.71 rows=172456 width=47)
              Hash Cond: ("outer".eventid = "inner".eventid)
              ->  XN Merge Join DS_DIST_NONE  (cost=0.00..6286.47 rows=172456 width=30)
                    Merge Cond: ("outer".listid = "inner".listid)
                    ->  XN Seq Scan on listing  (cost=0.00..1924.97 rows=192497 width=14)
                    ->  XN Seq Scan on sales  (cost=0.00..1724.56 rows=172456 width=24)
```

After changing the dimension tables to use DISTSTYLE ALL, the query plan for the same query shows DS\_DIST\_ALL\_NONE in place of DS\_BCAST\_INNER. Also, there is a dramatic change in the relative cost for the join steps. The total cost is `14142.59` compared to `3272334142.59` in the previous query.

```
->  XN Hash Join DS_DIST_ALL_NONE  (cost=112.50..14142.59 rows=170771 width=84)
        Hash Cond: ("outer".venueid = "inner".venueid)
        ->  XN Hash Join DS_DIST_ALL_NONE  (cost=109.98..10276.71 rows=172456 width=47)
              Hash Cond: ("outer".eventid = "inner".eventid)
              ->  XN Merge Join DS_DIST_NONE  (cost=0.00..6286.47 rows=172456 width=30)
                    Merge Cond: ("outer".listid = "inner".listid)
                    ->  XN Seq Scan on listing  (cost=0.00..1924.97 rows=192497 width=14)
                    ->  XN Seq Scan on sales  (cost=0.00..1724.56 rows=172456 width=24)
```

# Query plan example
<a name="t_explain_plan_example"></a>

This example shows how to evaluate a query plan to find opportunities to optimize the distribution.

Run the following query with an EXPLAIN command to produce a query plan.

```
explain
select lastname, catname, venuename, venuecity, venuestate, eventname, 
month, sum(pricepaid) as buyercost, max(totalprice) as maxtotalprice
from category join event on category.catid = event.catid
join venue on venue.venueid = event.venueid
join sales on sales.eventid = event.eventid
join listing on sales.listid = listing.listid
join date on sales.dateid = date.dateid
join users on users.userid = sales.buyerid
group by lastname, catname, venuename, venuecity, venuestate, eventname, month
having sum(pricepaid)>9999
order by catname, buyercost desc;
```

In the TICKIT database, SALES is a fact table and LISTING is its largest dimension. To collocate the tables, SALES is distributed on LISTID, which is the foreign key to LISTING, and LISTING is distributed on its primary key, LISTID. The following example shows the CREATE TABLE commands for SALES and LISTING.

```
create table sales(
	salesid integer not null,
	listid integer not null distkey,
	sellerid integer not null,
	buyerid integer not null,
	eventid integer not null encode mostly16,
	dateid smallint not null,
	qtysold smallint not null encode mostly8,
	pricepaid decimal(8,2) encode delta32k,
	commission decimal(8,2) encode delta32k,
	saletime timestamp,
	primary key(salesid),
	foreign key(listid) references listing(listid),
	foreign key(sellerid) references users(userid),
	foreign key(buyerid) references users(userid),
	foreign key(dateid) references date(dateid))
        sortkey(listid,sellerid);

create table listing(
	listid integer not null distkey sortkey,
	sellerid integer not null,
	eventid integer not null encode mostly16,
	dateid smallint not null,
	numtickets smallint not null encode mostly8,
	priceperticket decimal(8,2) encode bytedict,
	totalprice decimal(8,2) encode mostly32,
	listtime timestamp,
	primary key(listid),
	foreign key(sellerid) references users(userid),
	foreign key(eventid) references event(eventid),
	foreign key(dateid) references date(dateid));
```

In the following query plan, the Merge Join step for the join on SALES and LISTING shows DS\_DIST\_NONE, which indicates that no redistribution is required for the step. However, moving up the query plan, the other inner joins show DS\_BCAST\_INNER, which indicates that the inner table is broadcast as part of the query execution. Because only one pair of tables can be collocated using key distribution, five tables must be broadcast.

```
QUERY PLAN
XN Merge  (cost=1015345167117.54..1015345167544.46 rows=1000 width=103)
  Merge Key: category.catname, sum(sales.pricepaid)
  ->  XN Network  (cost=1015345167117.54..1015345167544.46 rows=170771 width=103)
        Send to leader
        ->  XN Sort  (cost=1015345167117.54..1015345167544.46 rows=170771 width=103)
              Sort Key: category.catname, sum(sales.pricepaid)
              ->  XN HashAggregate  (cost=15345150568.37..15345152276.08 rows=170771 width=103)
                    Filter: (sum(pricepaid) > 9999.00)
	                    ->  XN Hash Join DS_BCAST_INNER  (cost=742.08..15345146299.10 rows=170771 width=103)
	                          Hash Cond: ("outer".catid = "inner".catid)
	                          ->  XN Hash Join DS_BCAST_INNER  (cost=741.94..15342942456.61 rows=170771 width=97)
	                                Hash Cond: ("outer".dateid = "inner".dateid)
	                                ->  XN Hash Join DS_BCAST_INNER  (cost=737.38..15269938609.81 rows=170766 width=90)
	                                      Hash Cond: ("outer".buyerid = "inner".userid)
	                                      ->  XN Hash Join DS_BCAST_INNER  (cost=112.50..3272334142.59 rows=170771 width=84)
	                                            Hash Cond: ("outer".venueid = "inner".venueid)
	                                            ->  XN Hash Join DS_BCAST_INNER  (cost=109.98..3167290276.71 rows=172456 width=47)
	                                                  Hash Cond: ("outer".eventid = "inner".eventid)
	                                                  ->  XN Merge Join DS_DIST_NONE  (cost=0.00..6286.47 rows=172456 width=30)
	                                                        Merge Cond: ("outer".listid = "inner".listid)
	                                                        ->  XN Seq Scan on listing  (cost=0.00..1924.97 rows=192497 width=14)
	                                                        ->  XN Seq Scan on sales  (cost=0.00..1724.56 rows=172456 width=24)
	                                                  ->  XN Hash  (cost=87.98..87.98 rows=8798 width=25)
	                                                        ->  XN Seq Scan on event  (cost=0.00..87.98 rows=8798 width=25)
	                                            ->  XN Hash  (cost=2.02..2.02 rows=202 width=41)
	                                                  ->  XN Seq Scan on venue  (cost=0.00..2.02 rows=202 width=41)
	                                      ->  XN Hash  (cost=499.90..499.90 rows=49990 width=14)
	                                            ->  XN Seq Scan on users  (cost=0.00..499.90 rows=49990 width=14)
	                                ->  XN Hash  (cost=3.65..3.65 rows=365 width=11)
	                                      ->  XN Seq Scan on date  (cost=0.00..3.65 rows=365 width=11)
	                          ->  XN Hash  (cost=0.11..0.11 rows=11 width=10)
	                                ->  XN Seq Scan on category  (cost=0.00..0.11 rows=11 width=10)
```

One solution is to alter the tables to have DISTSTYLE ALL.

```
ALTER TABLE users ALTER DISTSTYLE ALL;
ALTER TABLE venue ALTER DISTSTYLE ALL;
ALTER TABLE category ALTER DISTSTYLE ALL;
ALTER TABLE date ALTER DISTSTYLE ALL;
ALTER TABLE event ALTER DISTSTYLE ALL;
```

Run the same query with EXPLAIN again, and examine the new query plan. The joins now show DS\_DIST\_ALL\_NONE, indicating that no redistribution is required because the data was distributed to every node using DISTSTYLE ALL.

```
QUERY PLAN
XN Merge  (cost=1000000047117.54..1000000047544.46 rows=1000 width=103)
  Merge Key: category.catname, sum(sales.pricepaid)
  ->  XN Network  (cost=1000000047117.54..1000000047544.46 rows=170771 width=103)
        Send to leader
        ->  XN Sort  (cost=1000000047117.54..1000000047544.46 rows=170771 width=103)
              Sort Key: category.catname, sum(sales.pricepaid)
              ->  XN HashAggregate  (cost=30568.37..32276.08 rows=170771 width=103)
                    Filter: (sum(pricepaid) > 9999.00)
                    ->  XN Hash Join DS_DIST_ALL_NONE  (cost=742.08..26299.10 rows=170771 width=103)
                          Hash Cond: ("outer".buyerid = "inner".userid)
                          ->  XN Hash Join DS_DIST_ALL_NONE  (cost=117.20..21831.99 rows=170766 width=97)
                                Hash Cond: ("outer".dateid = "inner".dateid)
                                ->  XN Hash Join DS_DIST_ALL_NONE  (cost=112.64..17985.08 rows=170771 width=90)
                                      Hash Cond: ("outer".catid = "inner".catid)
                                      ->  XN Hash Join DS_DIST_ALL_NONE  (cost=112.50..14142.59 rows=170771 width=84)
                                            Hash Cond: ("outer".venueid = "inner".venueid)
                                            ->  XN Hash Join DS_DIST_ALL_NONE  (cost=109.98..10276.71 rows=172456 width=47)
                                                  Hash Cond: ("outer".eventid = "inner".eventid)
                                                  ->  XN Merge Join DS_DIST_NONE  (cost=0.00..6286.47 rows=172456 width=30)
                                                        Merge Cond: ("outer".listid = "inner".listid)
                                                        ->  XN Seq Scan on listing  (cost=0.00..1924.97 rows=192497 width=14)
                                                        ->  XN Seq Scan on sales  (cost=0.00..1724.56 rows=172456 width=24)
                                                  ->  XN Hash  (cost=87.98..87.98 rows=8798 width=25)
                                                        ->  XN Seq Scan on event  (cost=0.00..87.98 rows=8798 width=25)
                                            ->  XN Hash  (cost=2.02..2.02 rows=202 width=41)
                                                  ->  XN Seq Scan on venue  (cost=0.00..2.02 rows=202 width=41)
                                      ->  XN Hash  (cost=0.11..0.11 rows=11 width=10)
                                            ->  XN Seq Scan on category  (cost=0.00..0.11 rows=11 width=10)
                                ->  XN Hash  (cost=3.65..3.65 rows=365 width=11)
                                      ->  XN Seq Scan on date  (cost=0.00..3.65 rows=365 width=11)
                          ->  XN Hash  (cost=499.90..499.90 rows=49990 width=14)
                                ->  XN Seq Scan on users  (cost=0.00..499.90 rows=49990 width=14)
```

# Distribution examples
<a name="c_Distribution_examples"></a>

The following examples show how data is distributed according to the options that you define in the CREATE TABLE statement.

## DISTKEY examples
<a name="c_Distribution_examples-distkey-examples"></a>

Look at the schema of the USERS table in the TICKIT database. USERID is defined as the SORTKEY column and the DISTKEY column: 

```
select "column", type, encoding, distkey, sortkey 
from pg_table_def where tablename = 'users';
    
    column     |          type          | encoding | distkey | sortkey
---------------+------------------------+----------+---------+---------
 userid        | integer                | none     | t       |       1
 username      | character(8)           | none     | f       |       0
 firstname     | character varying(30)  | text32k  | f       |       0

...
```

USERID is a good choice for the distribution column on this table. If you query the SVV\_DISKUSAGE system view, you can see that the table is very evenly distributed. Column numbers are zero-based, so USERID is column 0.

```
select slice, col, num_values as rows, minvalue, maxvalue
from svv_diskusage
where name='users' and col=0 and rows>0
order by slice, col;

slice| col | rows  | minvalue | maxvalue
-----+-----+-------+----------+----------
0    | 0   | 12496 | 4        | 49987
1    | 0   | 12498 | 1        | 49988
2    | 0   | 12497 | 2        | 49989
3    | 0   | 12499 | 3        | 49990
(4 rows)
```

The table contains 49,990 rows. The rows (num\_values) column shows that each slice contains about the same number of rows. The minvalue and maxvalue columns show the range of values on each slice. Each slice includes nearly the entire range of values, so there's a good chance that every slice participates in running a query that filters for a range of user IDs.

This example demonstrates distribution on a small test system. The total number of slices is typically much higher.

If you commonly join or group using the STATE column, you might choose to distribute on the STATE column. The following example shows a case where you create a new table with the same data as the USERS table but set the DISTKEY to the STATE column. In this case, the distribution isn't as even. Slice 0 (13,587 rows) holds approximately 30 percent more rows than slice 3 (10,150 rows). In a much larger table, this amount of distribution skew can have an adverse impact on query processing.

```
create table userskey distkey(state) as select * from users;

select slice, col, num_values as rows, minvalue, maxvalue from svv_diskusage
where name = 'userskey' and col=0 and rows>0
order by slice, col;

slice | col | rows  | minvalue | maxvalue
------+-----+-------+----------+----------
    0 |   0 | 13587 |        5 |    49989
    1 |   0 | 11245 |        2 |    49990
    2 |   0 | 15008 |        1 |    49976
    3 |   0 | 10150 |        4 |    49986
(4 rows)
```

## DISTSTYLE EVEN example
<a name="c_Distribution_examples-diststyle-even-example"></a>

If you create a new table with the same data as the USERS table but set the DISTSTYLE to EVEN, rows are always evenly distributed across slices. 

```
create table userseven diststyle even as 
select * from users;

select slice, col, num_values as rows, minvalue, maxvalue from svv_diskusage
where name = 'userseven' and col=0 and rows>0
order by slice, col;

slice | col | rows  | minvalue | maxvalue
------+-----+-------+----------+----------
    0 |   0 | 12497 |        4 |    49990
    1 |   0 | 12498 |        8 |    49984
    2 |   0 | 12498 |        2 |    49988
    3 |   0 | 12497 |        1 |    49989  
(4 rows)
```

However, because distribution is not based on a specific column, query processing can be degraded, especially if the table is joined to other tables. The lack of distribution on a joining column often influences the type of join operation that can be performed efficiently. Joins, aggregations, and grouping operations are optimized when both tables are distributed and sorted on their respective joining columns.
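To get the colocated, presorted joins described above, you can distribute and sort both tables on the joining column. A minimal sketch, assuming a hypothetical `user_sales` table that is frequently joined to USERS on `userid` (table and column names here are illustrative, not from the sample schema):

```
-- Distribute and sort both sides of the join on userid so that
-- matching rows land on the same slice and arrive presorted.
create table users_dist
  distkey(userid) sortkey(userid)
as select * from users;

create table user_sales
  distkey(userid) sortkey(userid)
as select * from sales;

-- This join can now be computed locally on each slice,
-- often as a merge join instead of a hash join.
select u.username, count(*)
from users_dist u
join user_sales s on u.userid = s.userid
group by u.username;
```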

## DISTSTYLE ALL example
<a name="c_Distribution_examples-diststyle-all-example"></a>

If you create a new table with the same data as the USERS table but set the DISTSTYLE to ALL, all the rows are distributed to the first slice of each node. 

```
create table usersall diststyle all as select * from users;

select slice, col, num_values as rows, minvalue, maxvalue from svv_diskusage
where name = 'usersall' and col=0 and rows > 0
order by slice, col;

slice | col | rows  | minvalue | maxvalue
------+-----+-------+----------+----------
    0 |   0 | 49990 |        4 |    49990
    2 |   0 | 49990 |        2 |    49990
(2 rows)
```

# Sort keys
<a name="t_Sorting_data"></a>

**Note**  
We recommend that you create your tables with `SORTKEY AUTO`. If you do so, then Amazon Redshift uses automatic table optimization to choose the sort key. For more information, see [Automatic table optimization](t_Creating_tables.md). The rest of this section provides details about the sort order. 

When you create a table, you can alternatively define one or more of its columns as *sort keys*. When data is initially loaded into the empty table, the rows are stored on disk in sorted order. Information about sort key columns is passed to the query planner, and the planner uses this information to construct plans that exploit the way that the data is sorted. For more information, see [CREATE TABLE](r_CREATE_TABLE_NEW.md). For information on best practices when creating a sort key, see [Choose the best sort key](c_best-practices-sort-key.md).

Sorting enables efficient handling of range-restricted predicates. Amazon Redshift stores columnar data in 1 MB disk blocks. The min and max values for each block are stored as part of the metadata. If a query uses a range-restricted predicate, the query processor can use the min and max values to rapidly skip over large numbers of blocks during table scans. For example, suppose that a table stores five years of data sorted by date and a query specifies a date range of one month. In this case, you can remove up to 98 percent of the disk blocks from the scan. If the data is not sorted, more of the disk blocks (possibly all of them) have to be scanned. 
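A sketch of a range-restricted predicate that can benefit from this block skipping, using the LISTING table from the sample TICKIT schema (assuming the table is sorted by `listtime`):

```
-- If LISTING is sorted by listtime, the scan reads only the 1 MB blocks
-- whose min/max zone-map metadata overlaps this one-month range and
-- skips the rest of the table.
select count(*)
from listing
where listtime between '2008-06-01' and '2008-06-30';
```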

You can specify either a compound or interleaved sort key. A compound sort key is more efficient when query predicates use a *prefix*, which is a subset of the sort key columns in order. An interleaved sort key gives equal weight to each column in the sort key, so query predicates can use any subset of the columns that make up the sort key, in any order. 

To understand the impact of the chosen sort key on query performance, use the [EXPLAIN](r_EXPLAIN.md) command. For more information, see [Query planning and execution workflow](c-query-planning.md). 

To define a sort type, use either the INTERLEAVED or COMPOUND keyword with your CREATE TABLE or CREATE TABLE AS statement. The default is COMPOUND. COMPOUND is recommended when you update your tables regularly with INSERT, UPDATE, or DELETE operations. An INTERLEAVED sort key can use a maximum of eight columns. Depending on your data and cluster size, VACUUM REINDEX takes significantly longer than VACUUM FULL because it makes an additional pass to analyze the interleaved sort keys. The sort and merge operation can take longer for interleaved tables because the interleaved sort might have to rearrange more rows than a compound sort.

To view the sort keys for a table, query the [SVV\_TABLE\_INFO](r_SVV_TABLE_INFO.md) system view.
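For example, a query along these lines returns the sort key and distribution style for a single table:

```
-- "table" is a reserved word, so it must be double-quoted.
select "table", diststyle, sortkey1, sortkey_num
from svv_table_info
where "table" = 'users';
```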

**Topics**
+ [Multidimensional data layout sorting](t_Sorting_mutidimensional-sort-key.md)
+ [Compound sort key](t_Sorting_data-compound.md)
+ [Interleaved sort key](t_Sorting_data-interleaved.md)

# Multidimensional data layout sorting
<a name="t_Sorting_mutidimensional-sort-key"></a>

A multidimensional data layout sort key is a type of AUTO sort key that is based on repetitive predicates found in a workload. If your workload has repetitive predicates, then Amazon Redshift can improve table scan performance by colocating data rows that satisfy the repetitive predicates. Instead of storing data of a table in strict column order, a multidimensional data layout sort key stores data by analyzing repetitive predicates that appear in a workload. More than one repetitive predicate can be found in a workload. Depending on your workload, this kind of sort key can improve performance of many predicates. Amazon Redshift automatically determines if this sort key method should be used for tables that are defined with an `AUTO` sort key.

For example, suppose you have a table that has data sorted in column order. Many data blocks might need to be examined to determine if they satisfy the predicates in your workload. But, if the data is stored on disk in a predicate order, then fewer blocks need to be scanned to satisfy the query. Using a multidimensional data layout sort key is beneficial in this case.

To view whether a query is using a multidimensional data layout key, see the `step_attribute` column of the [SYS\_QUERY\_DETAIL](SYS_QUERY_DETAIL.md) view. When the value is `multi-dimensional`, the multidimensional data layout was used for the query.
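A sketch of such a check for a specific query (substitute your own query ID for the placeholder value):

```
-- Returns the scan steps that used a multidimensional data layout.
select query_id, step_name, step_attribute
from sys_query_detail
where query_id = 1234567
  and step_attribute = 'multi-dimensional';
```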

To prevent Amazon Redshift from using a multidimensional data layout sort key, choose a table sort key option other than `SORTKEY AUTO`. For more information on SORTKEY options, see [CREATE TABLE](r_CREATE_TABLE_NEW.md).

# Compound sort key
<a name="t_Sorting_data-compound"></a>

A compound key is made up of all of the columns listed in the sort key definition, in the order they are listed. A compound sort key is most useful when a query's conditions, such as filters and joins, use a prefix of the sort key columns. The performance benefits of compound sorting decrease when queries depend only on secondary sort columns without referencing the primary columns. COMPOUND is the default sort type.

Compound sort keys might speed up joins, GROUP BY and ORDER BY operations, and window functions that use PARTITION BY and ORDER BY. For example, a merge join, which is often faster than a hash join, is feasible when the data is distributed and presorted on the joining columns. Compound sort keys also help improve compression. 

As you add rows to a sorted table that already contains data, the unsorted region grows, which has a significant effect on performance. The effect is greater when the table uses interleaved sorting, especially when the sort columns include data that increases monotonically, such as date or timestamp columns. Run a VACUUM operation regularly, especially after large data loads, to re-sort and re-analyze the data. For more information, see [Reduce the size of the unsorted region](vacuum-managing-vacuum-times.md#r_vacuum_diskspacereqs). After vacuuming to resort the data, it's a good practice to run an ANALYZE command to update the statistical metadata for the query planner. For more information, see [Analyzing tables](t_Analyzing_tables.md).
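The maintenance sequence described above looks like this for the USERS table (VACUUM FULL is the default vacuum type):

```
-- Re-sort rows and reclaim space after large data loads.
vacuum full users;

-- Refresh the statistical metadata the query planner relies on.
analyze users;
```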

# Interleaved sort key
<a name="t_Sorting_data-interleaved"></a>

An interleaved sort gives equal weight to each column, or subset of columns, in the sort key. If multiple queries use different columns for filters, then you can often improve performance for those queries by using an interleaved sort style. When a query uses restrictive predicates on secondary sort columns, interleaved sorting significantly improves query performance as compared to compound sorting. 

**Important**  
Don't use an interleaved sort key on columns with monotonically increasing attributes, such as identity columns, dates, or timestamps.

The performance improvements you gain by implementing an interleaved sort key should be weighed against increased load and vacuum times. 

Interleaved sorts are most effective with highly selective queries that filter on one or more of the sort key columns in the WHERE clause, for example `select c_name from customer where c_region = 'ASIA'`. The benefits of interleaved sorting increase with the number of sorted columns that are restricted. 

An interleaved sort is more effective with large tables. Sorting is applied on each slice. Thus, an interleaved sort is most effective when a table is large enough to require multiple 1 MB blocks per slice. Here, the query processor can skip a significant proportion of the blocks using restrictive predicates. To view the number of blocks a table uses, query the [STV\_BLOCKLIST](r_STV_BLOCKLIST.md) system view.
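For example, a query along these lines counts the blocks that a table occupies:

```
-- stv_blocklist identifies tables by numeric ID, so look the ID up
-- in svv_table_info first.
select count(*) as blocks
from stv_blocklist
where tbl = (select table_id from svv_table_info where "table" = 'users');
```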

 When sorting on a single column, an interleaved sort might give better performance than a compound sort if the column values have a long common prefix. For example, URLs commonly begin with "http://www". Compound sort keys use a limited number of characters from the prefix, which results in a lot of duplication of keys. Interleaved sorts use an internal compression scheme for zone map values that enables them to better discriminate among column values that have a long common prefix.

 When migrating Amazon Redshift provisioned clusters to Amazon Redshift Serverless, Redshift converts tables with both interleaved sort keys and DISTSTYLE KEY to compound sort keys. However, tables with only interleaved sort keys remain unchanged. For more information on distribution styles, see [Working with data distribution styles](https://docs.aws.amazon.com//redshift/latest/dg/t_Distributing_data.html). 
<a name="t_Sorting_data-interleaved-reindex"></a>
**VACUUM REINDEX**  
As you add rows to a sorted table that already contains data, performance might deteriorate over time. This deterioration occurs for both compound and interleaved sorts, but it has a greater effect on interleaved tables. A VACUUM restores the sort order, but the operation can take longer for interleaved tables because merging new interleaved data might involve modifying every data block.

When tables are initially loaded, Amazon Redshift analyzes the distribution of the values in the sort key columns and uses that information for optimal interleaving of the sort key columns. As a table grows, the distribution of the values in the sort key columns can change, or skew, especially with date or timestamp columns. If the skew becomes too large, performance might be affected. To re-analyze the sort keys and restore performance, run the VACUUM command with the REINDEX keyword. Because it must take an extra analysis pass over the data, VACUUM REINDEX can take longer than a standard VACUUM for interleaved tables. To view information about key distribution skew and last reindex time, query the [SVV\_INTERLEAVED\_COLUMNS](r_SVV_INTERLEAVED_COLUMNS.md) system view.
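A sketch of that check, followed by the reindex itself (the 1.4 skew threshold is the rule of thumb from the reindex guidance linked below; tune it for your workload):

```
-- Find interleaved sort key columns whose value distribution has skewed.
select tbl, col, interleaved_skew, last_reindex
from svv_interleaved_columns
where interleaved_skew > 1.4;

-- Re-analyze the sort key distribution and re-sort the affected table.
vacuum reindex sales_interleaved;
```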

For more information about how to determine how often to run VACUUM and when to run a VACUUM REINDEX, see [Decide whether to reindex](vacuum-managing-vacuum-times.md#r_vacuum-decide-whether-to-reindex). 

# Table constraints
<a name="t_Defining_constraints"></a>

Uniqueness, primary key, and foreign key constraints are informational only; *they are not enforced by Amazon Redshift* when you populate a table. For example, if you insert data into a table with dependencies, the insert can succeed even if it violates the constraint. Nonetheless, primary keys and foreign keys are used as planning hints and they should be declared if your ETL process or some other process in your application enforces their integrity.

For example, the query planner uses primary and foreign keys in certain statistical computations to infer uniqueness and referential relationships that affect subquery decorrelation techniques. Using them, it can order large numbers of joins and remove redundant joins.

The planner leverages these key relationships, but it assumes that all keys in Amazon Redshift tables are valid as loaded. If your application allows invalid foreign keys or primary keys, some queries could return incorrect results. For example, a SELECT DISTINCT query might return duplicate rows if the primary key is not unique. Do not define key constraints for your tables if you doubt their validity. However, always declare primary and foreign keys and uniqueness constraints when you know that they are valid.
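A sketch that demonstrates the informational nature of these constraints (the table name is illustrative):

```
create table category_demo (
  catid   smallint primary key,
  catname varchar(10)
);

-- Both inserts succeed: Amazon Redshift does not enforce the primary key.
insert into category_demo values (1, 'MLB');
insert into category_demo values (1, 'MLB');

-- Because the planner assumes the declared key is valid, queries that
-- rely on its uniqueness can now return unexpected results.
select catid from category_demo;
```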

Amazon Redshift *does* enforce NOT NULL column constraints.

For more information about table constraints, see [CREATE TABLE](r_CREATE_TABLE_NEW.md). For information about how to drop a table with dependencies, see [DROP TABLE](r_DROP_TABLE.md). 