Troubleshooting Amazon RDS zero-ETL integrations with Amazon Redshift - Amazon Relational Database Service

Troubleshooting Amazon RDS zero-ETL integrations with Amazon Redshift

This is prerelease documentation for Amazon RDS zero-ETL integrations with Amazon Redshift, which is in preview release. The documentation and the feature are both subject to change. We recommend that you use this feature only in test environments, and not in production environments. For preview terms and conditions, see Betas and Previews in AWS Service Terms.

You can check the state of a zero-ETL integration by querying the SVV_INTEGRATION system table in Amazon Redshift. If the state column has a value of ErrorState, it means something's wrong. For more information, see Monitoring integrations using system tables.

Use the following information to troubleshoot common issues with Amazon RDS zero-ETL integrations with Amazon Redshift.

I can't create a zero-ETL integration

If you can't create a zero-ETL integration, make sure that the following are correct for your source DB instance:

  • Your source database is running RDS for MySQL version 8.0.32 or higher. To validate the engine version, choose the Configuration tab for the database and check the Engine version.

  • You correctly configured DB parameters. If the required parameters are set incorrectly or not associated with the DB instance, creation fails. See Step 1: Create a custom DB parameter group.

In addition, make sure the following are correct for your target data warehouse:

My integration is stuck in a state of Syncing

Your integration might consistently show a status of Syncing if you change the value of one of the required DB parameters.

To fix this issue, check the values of the parameters in the parameter group associated with the source database, and make sure that they match the required values. For more information, see Step 1: Create a custom DB parameter group.

If you modify any parameters, make sure to reboot the database to apply the changes.

My tables aren't replicating to Amazon Redshift

Your data might not be replicating because one or more of your source tables doesn't have a primary key. The monitoring dashboard in Amazon Redshift displays the status of these tables as Failed, and the status of the overall zero-ETL integration changes to Needs attention.

To resolve this issue, you can identify an existing key in your table that can become a primary key, or you can add a synthetic primary key. For detailed solutions, see Handle tables without primary keys while creating Amazon Aurora MySQL or Amazon RDS for MySQL zero-ETL integrations with Amazon Redshift.

One or more of my Amazon Redshift tables requires a resync

Running certain commands on your source DB instance might require your tables to be resynchronized. In these cases, the SVV_INTEGRATION_TABLE_STATE system view shows a table_state of ResyncRequired, which means that the integration must completely reload data for that specific table from MySQL to Amazon Redshift.

When the table starts to resynchronize, it enters a state of Syncing. You don't need to take any manual action to resynchronize a table. While table data is resynchronizing, you can't access it in Amazon Redshift.

The following are some example operations that can put a table into a ResyncRequired state, and possible alternatives to consider.

Operation Example Alternative
Adding a column into a specific position
ALTER TABLE table_name ADD COLUMN column_name INTEGER NOT NULL first;
Amazon Redshift doesn't support adding columns into specific positions using first or after keywords. If the order of columns in the target table isn't critical, add the column to the end of the table using a simpler command:
ALTER TABLE table_name ADD COLUMN column_name column_type;
Adding a timestamp column with the default CURRENT_TIMESTAMP
ALTER TABLE table_name ADD COLUMN column_name TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP;
The CURRENT_TIMESTAMP value for existing table rows is calculated by RDS for MySQL and can't be simulated in Amazon Redshift without full table data resynchronization.

If possible, switch the default value to a literal constant like 2023-01-01 00:00:15 to avoid latency in table availability.

Performing multiple column operations within a single command
ALTER TABLE table_name ADD COLUMN column_1, RENAME COLUMN column_2 TO column_3;
Consider splitting the command into two separate operations, ADD and RENAME, which won't require resynchronization.