

# Amazon RDS zero-ETL integrations
<a name="zero-etl"></a>

An Amazon RDS zero-ETL integration with Amazon Redshift and Amazon SageMaker AI enables near real-time analytics and machine learning (ML) using data from RDS. It's a fully managed solution for making transactional data available in your analytics destination after it is written to an RDS database. *Extract, transform,* and *load* (ETL) is the process of combining data from multiple sources into a large, central data warehouse.

A zero-ETL integration makes the data in your RDS database available in Amazon Redshift or an Amazon SageMaker AI lakehouse in near real time. After the data is in the target data warehouse or data lake, you can power your analytics, ML, and AI workloads using built-in capabilities such as machine learning, materialized views, data sharing, federated access to multiple data stores and data lakes, and integrations with Amazon SageMaker AI, Amazon QuickSight, and other AWS services.

To create a zero-ETL integration, you specify an RDS database as the *source*, and a supported data warehouse or lakehouse as the *target*. The integration replicates data from the source database into the target data warehouse or lakehouse.

The following diagram illustrates this functionality for zero-ETL integration with Amazon Redshift:

![\[A zero-ETL integration\]](http://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/images/zero-etl-integrations.png)


The following diagram illustrates this functionality for zero-ETL integration with an Amazon SageMaker AI lakehouse:

![\[A zero-ETL integration with an Amazon SageMaker AI lakehouse\]](http://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/images/zero-etl-rds-lakehouse.png)


The integration monitors the health of the data pipeline and recovers from issues when possible. You can create integrations from multiple RDS databases into a single target data warehouse or lakehouse, enabling you to derive insights across multiple applications.

**Topics**
+ [Benefits](#zero-etl.benefits)
+ [Key concepts](#zero-etl.concepts)
+ [Limitations](#zero-etl.reqs-lims)
+ [Quotas](#zero-etl.quotas)
+ [Supported Regions](#zero-etl.regions)
+ [Getting started with Amazon RDS zero-ETL integrations](zero-etl.setting-up.md)
+ [Creating Amazon RDS zero-ETL integrations with Amazon Redshift](zero-etl.creating.md)
+ [Creating Amazon RDS zero-ETL integrations with an Amazon SageMaker lakehouse](zero-etl.creating-smlh.md)
+ [Data filtering for Amazon RDS zero-ETL integrations](zero-etl.filtering.md)
+ [Adding data to a source RDS database and querying it](zero-etl.querying.md)
+ [Viewing and monitoring Amazon RDS zero-ETL integrations](zero-etl.describingmonitoring.md)
+ [Modifying Amazon RDS zero-ETL integrations](zero-etl.modifying.md)
+ [Deleting Amazon RDS zero-ETL integrations](zero-etl.deleting.md)
+ [Troubleshooting Amazon RDS zero-ETL integrations](zero-etl.troubleshooting.md)

## Benefits
<a name="zero-etl.benefits"></a>

RDS zero-ETL integrations have the following benefits:
+ Help you derive holistic insights from multiple data sources.
+ Eliminate the need to build and maintain complex data pipelines that perform extract, transform, and load (ETL) operations. Zero-ETL integrations remove the challenges that come with building and managing pipelines by provisioning and managing them for you.
+ Reduce operational burden and cost, and let you focus on improving your applications.
+ Let you leverage the target destination's analytics and ML capabilities to derive insights from transactional and other data, to respond effectively to critical, time-sensitive events.

## Key concepts
<a name="zero-etl.concepts"></a>

As you get started with zero-ETL integrations, consider the following concepts:

**Integration**  
A fully managed data pipeline that automatically replicates transactional data and schemas from an RDS database to a data warehouse or catalog.

**Source database**  
The RDS database where data is replicated from. You can specify a Single-AZ or Multi-AZ DB instance, or a Multi-AZ DB cluster (for RDS for MySQL only).

**Target**  
The data warehouse or lakehouse where the data is replicated to. There are two types of data warehouse: a [provisioned cluster](https://docs.aws.amazon.com/redshift/latest/mgmt/working-with-clusters.html) data warehouse and a [serverless](https://docs.aws.amazon.com/redshift/latest/mgmt/serverless-workgroup-namespace.html) data warehouse. A provisioned cluster data warehouse is a collection of computing resources called nodes, which are organized into a group called a *cluster*. A serverless data warehouse consists of a workgroup that stores compute resources, and a namespace that houses the database objects and users. Both data warehouses run an analytics engine and contain one or more databases.  
A target lakehouse consists of catalogs, databases, tables, and views. For more information about lakehouse architecture, see [Lakehouse architecture](https://docs.aws.amazon.com/sagemaker-unified-studio/latest/userguide/lakehouse-components.html) in the *Amazon SageMaker AI Unified Studio User Guide*.  
Multiple source databases can write to the same target.  
For more information, see [Data warehouse system architecture](https://docs.aws.amazon.com/redshift/latest/dg/c_high_level_system_architecture.html) in the *Amazon Redshift Developer Guide*.

## Limitations
<a name="zero-etl.reqs-lims"></a>

The following limitations apply to RDS zero-ETL integrations.

**Topics**
+ [General limitations](#zero-etl.reqs-lims-general)
+ [RDS for MySQL limitations](#zero-etl.reqs-lims-mysql)
+ [RDS for PostgreSQL limitations](#zero-etl.reqs-lims-rds-postgres)
+ [RDS for Oracle limitations](#zero-etl.reqs-lims-oracle)
+ [Amazon Redshift limitations](#zero-etl.reqs-lims-redshift)
+ [Amazon SageMaker AI lakehouse limitations](#zero-etl.reqs-lims-sagemaker-)

### General limitations
<a name="zero-etl.reqs-lims-general"></a>
+ The source database must be in the same Region as the target.
+ You can't rename a database if it has existing integrations.
+ You can't create multiple integrations between the same source and target databases.
+ You can't delete a database that has existing integrations. You must delete all associated integrations first.
+ If you stop the source database, the last few transactions might not be replicated to the target until you resume the database.
+ You can't delete an integration if the source database is stopped.
+ If your database is the source of a blue/green deployment, the blue and green environments can't have existing zero-ETL integrations during switchover. You must delete the integration first and switch over, then recreate it.
+ You can't create an integration for a source database that has another integration being actively created.
+ When you initially create an integration, or when a table is being resynchronized, data seeding from the source to the target can take 20-25 minutes or more depending on the size of the source database. This delay can lead to increased replica lag.
+ Some data types aren't supported. For more information, see [Data type differences between RDS and Amazon Redshift databases](zero-etl.querying.md#zero-etl.data-type-mapping).
+ System tables, temporary tables, and views aren't replicated to target warehouses.
+ Performing DDL commands (for example `ALTER TABLE`) on a source table can trigger a table resynchronization, making the table unavailable for querying while it's resynchronizing. For more information, see [One or more of my Amazon Redshift tables requires a resync](zero-etl.troubleshooting.md#zero-etl.troubleshooting.resync).

### RDS for MySQL limitations
<a name="zero-etl.reqs-lims-mysql"></a>
+ Your source database must be running a supported version of RDS for MySQL. For a list of supported versions, see [Supported Regions and DB engines for Amazon RDS zero-ETL integrations](Concepts.RDS_Fea_Regions_DB-eng.Feature.ZeroETL.md).
+ Zero-ETL integrations are not supported on both the primary instance and a read replica instance in the same AWS Region.
+ Zero-ETL integrations rely on MySQL binary logging (binlog) to capture ongoing data changes. Don't use binlog-based data filtering, as it can cause data inconsistencies between the source and target databases.
+ Zero-ETL integrations are supported only for databases configured to use the InnoDB storage engine.
+ Foreign key references with predefined table updates aren't supported. Specifically, `ON DELETE` and `ON UPDATE` rules aren't supported with `CASCADE`, `SET NULL`, and `SET DEFAULT` actions. Attempting to create or update a table with such references to another table will put the table into a failed state.
+ You can't create an integration for a source database that uses magnetic storage.

### RDS for PostgreSQL limitations
<a name="zero-etl.reqs-lims-rds-postgres"></a>
+ The source database must be an RDS for PostgreSQL instance running version 15.7 or higher, 16.3 or higher, or 17.1 or higher. Earlier versions are not supported.
+ RDS for PostgreSQL zero-ETL integrations don't support Multi-AZ DB clusters as source databases.
+ You can't create a zero-ETL integration from an RDS for PostgreSQL read replica instance.
+ PostgreSQL unlogged tables and materialized views are not replicated to Amazon Redshift.
+ Replication of certain PostgreSQL data types, such as [geometry data types](https://www.postgresql.org/docs/current/datatype-geometric.html) and data greater than 64KB, is not supported due to limitations in Amazon Redshift. For more information about data type differences between RDS for PostgreSQL and Amazon Redshift, see [RDS for PostgreSQL](zero-etl.querying.md#zero-etl.data-type-mapping-postgres) in the **Data type differences** section.
+ You can't perform a [major version upgrade](USER_UpgradeDBInstance.PostgreSQL.md) on the source RDS for PostgreSQL instance while it has an active zero-ETL integration. To upgrade the source instance, you must first delete all existing zero-ETL integrations. After the major version upgrade is complete, you can recreate the zero-ETL integrations.
+ If you perform [declarative partitioning](https://www.postgresql.org/docs/current/ddl-partitioning.html#DDL-PARTITIONING-DECLARATIVE) transactions on the source DB instance, all affected tables enter a failed state and are no longer accessible.

### RDS for Oracle limitations
<a name="zero-etl.reqs-lims-oracle"></a>
+ The source database must be an RDS for Oracle instance running version 19c Enterprise Edition or Standard Edition 2, July 2019 Release Update or later. Earlier versions are not supported.
+ You can't create a zero-ETL integration from an RDS for Oracle read replica instance.
+ You can't rename a [tenant database](Oracle.Concepts.CDBs.md) when there is a zero-ETL integration on that tenant database.
+ A tenant database can have only one zero-ETL integration.
+ RDS for Oracle and Amazon Redshift have some data type differences. For more information, see [RDS for Oracle](zero-etl.querying.md#zero-etl.data-type-mapping-oracle) in the **Data type differences** section.

### Amazon Redshift limitations
<a name="zero-etl.reqs-lims-redshift"></a>

For a list of Amazon Redshift limitations related to zero-ETL integrations, see [Considerations when using zero-ETL integrations with Amazon Redshift](https://docs.aws.amazon.com/redshift/latest/mgmt/zero-etl.reqs-lims.html) in the *Amazon Redshift Management Guide*.

### Amazon SageMaker AI lakehouse limitations
<a name="zero-etl.reqs-lims-sagemaker-"></a>

The following limitation applies to Amazon SageMaker AI lakehouse zero-ETL integrations.
+ Catalog names are limited to 19 characters in length.
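As a quick illustration of this limit, a client-side check like the following can catch overlong catalog names before you attempt to create the integration. The helper name is hypothetical; AWS enforces the limit server-side.

```python
# Hypothetical client-side validation based on the 19-character
# catalog name limit described above.
MAX_CATALOG_NAME_LENGTH = 19

def is_valid_catalog_name(name: str) -> bool:
    """Return True if the catalog name fits the documented length limit."""
    return 0 < len(name) <= MAX_CATALOG_NAME_LENGTH

print(is_valid_catalog_name("sales-catalog"))              # True
print(is_valid_catalog_name("a-very-long-catalog-name"))   # False (24 characters)
```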

## Quotas
<a name="zero-etl.quotas"></a>

Your account has the following quotas related to RDS zero-ETL integrations. Each quota is per-Region unless otherwise specified.


| Name | Default | Description | 
| --- | --- | --- | 
| Integrations | 100 | The total number of integrations within an AWS account. | 
| Integrations per target | 50 | The number of integrations sending data to a single target data warehouse or lakehouse. | 
| Integrations per source instance | 5 | The number of integrations sending data from a single source DB instance. | 

In addition, the target warehouse places certain limits on the number of tables allowed in each DB instance or cluster node. For more information about Amazon Redshift quotas and limits, see [Quotas and limits in Amazon Redshift](https://docs.aws.amazon.com/redshift/latest/mgmt/amazon-redshift-limits.html) in the *Amazon Redshift Management Guide*.
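If you script integration creation, a rough client-side sanity check against these defaults might look like the following sketch. The helper and its names are hypothetical, and the authoritative limits are enforced by AWS and may differ for your account.

```python
# Default per-Region quota values from the table above (illustrative only).
DEFAULT_QUOTAS = {
    "integrations_per_account": 100,
    "integrations_per_target": 50,
    "integrations_per_source": 5,
}

def fits_default_quotas(total: int, per_target: int, per_source: int) -> bool:
    """Check a planned integration count against the default quotas."""
    return (
        total <= DEFAULT_QUOTAS["integrations_per_account"]
        and per_target <= DEFAULT_QUOTAS["integrations_per_target"]
        and per_source <= DEFAULT_QUOTAS["integrations_per_source"]
    )

print(fits_default_quotas(total=10, per_target=10, per_source=2))  # True
print(fits_default_quotas(total=10, per_target=10, per_source=6))  # False
```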

## Supported Regions
<a name="zero-etl.regions"></a>

RDS zero-ETL integrations are available in a subset of AWS Regions. For a list of supported Regions, see [Supported Regions and DB engines for Amazon RDS zero-ETL integrations](Concepts.RDS_Fea_Regions_DB-eng.Feature.ZeroETL.md).

# Getting started with Amazon RDS zero-ETL integrations
<a name="zero-etl.setting-up"></a>

Before you create a zero-ETL integration, configure your RDS database and your data warehouse with the required parameters and permissions. During setup, you'll complete the following steps:

1. [Create a custom DB parameter group](#zero-etl.parameters).

1. [Create a source database](#zero-etl.create-cluster).

1. [Create a target data warehouse for Amazon Redshift](#zero-etl-setting-up.data-warehouse) or [Create a target Amazon SageMaker Lakehouse](#zero-etl-setting-up.sagemaker).

After you complete these tasks, continue to [Creating Amazon RDS zero-ETL integrations with Amazon Redshift](zero-etl.creating.md) or [Creating Amazon RDS zero-ETL integrations with an Amazon SageMaker lakehouse](zero-etl.creating-smlh.md).

**Tip**  
You can have RDS complete these setup steps for you while you're creating the integration, rather than performing them manually. To immediately start creating an integration, see [Creating Amazon RDS zero-ETL integrations with Amazon Redshift](zero-etl.creating.md).

For Step 3, you can choose to create either a target data warehouse (Step 3a) or a target lakehouse (Step 3b) depending on your needs:
+ Choose a data warehouse if you need traditional data warehousing capabilities with SQL-based analytics.
+ Choose an Amazon SageMaker Lakehouse if you need machine learning capabilities and want to use lakehouse features for data science and ML workflows.

## Step 1: Create a custom DB parameter group
<a name="zero-etl.parameters"></a>

Amazon RDS zero-ETL integrations require specific values for the DB parameters that control data replication. The specific parameters depend on your source DB engine. To configure these parameters, you must first create a custom DB parameter group, and then associate it with the source database. Configure the following parameter values depending on your source DB engine. For instructions to create a parameter group, see [DB parameter groups for Amazon RDS DB instances](USER_WorkingWithDBInstanceParamGroups.md). We recommend that you configure all parameter values within the same request to avoid dependency issues.

**RDS for MySQL**:
+ `binlog_format=ROW`
+ `binlog_row_image=full`

In addition, make sure that the `binlog_row_value_options` parameter is *not* set to `PARTIAL_JSON`. If the source database is a Multi-AZ DB cluster, make sure that the `binlog_transaction_compression` parameter is *not* set to `ON`.

Some of these parameters (such as `binlog_format`) are dynamic, meaning that you can change them without triggering a reboot. As a result, some existing sessions might continue using the old parameter value. To prevent this from causing problems when you create a zero-ETL integration, enable the [Performance Schema](USER_PerfInsights.EnableMySQL.md). Performance Schema ensures that zero-ETL pre-checks run, which helps detect missing parameters early in the process.
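As an illustration of the RDS for MySQL requirements above, the following sketch checks a dictionary of parameter values client-side. The function name is hypothetical; in practice you would read the values from your custom DB parameter group before creating the integration.

```python
def check_mysql_zero_etl_params(params: dict) -> list:
    """Return a list of problems with the binlog settings required for an
    RDS for MySQL zero-ETL integration (illustrative helper, not an AWS API)."""
    problems = []
    if params.get("binlog_format") != "ROW":
        problems.append("binlog_format must be ROW")
    if params.get("binlog_row_image") != "full":
        problems.append("binlog_row_image must be full")
    if params.get("binlog_row_value_options") == "PARTIAL_JSON":
        problems.append("binlog_row_value_options must not be PARTIAL_JSON")
    # Only relevant when the source is a Multi-AZ DB cluster:
    if params.get("binlog_transaction_compression") == "ON":
        problems.append("binlog_transaction_compression must not be ON")
    return problems

print(check_mysql_zero_etl_params(
    {"binlog_format": "ROW", "binlog_row_image": "full"}))  # []
```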

**RDS for PostgreSQL**:
+ `rds.logical_replication = 1`
+ `rds.replica_identity_full = 1`
+ `session_replication_role = origin`
+ `wal_sender_timeout ≥ 20000 or = 0`
+ `max_wal_senders ≥ 20`
+ `max_replication_slots ≥ 20`

For multiple PostgreSQL integrations, one logical replication slot will be used per integration. Review the `max_replication_slots` and `max_wal_senders` parameters based on your usage.

For efficient data synchronization in zero-ETL integrations, set `rds.replica_identity_full` in your source DB instance. This setting instructs the database to [log complete row data](https://www.postgresql.org/docs/current/sql-altertable.html#SQL-ALTERTABLE-REPLICA-IDENTITY) in the write-ahead log (WAL) during `UPDATE` and `DELETE` operations, rather than just primary key information. Zero-ETL integrations require complete row data even though all replicated tables must have primary keys. To determine which data is visible during queries, Amazon Redshift uses a specialized anti-join strategy to compare your data against an internal delete tracking table. Logging full-row images helps Amazon Redshift perform these anti-joins efficiently. Without full row data, Amazon Redshift would need to perform additional lookups, which could slow performance during high-throughput operations in its columnar engine.

**Important**  
Setting replica identity to log full rows [increases your WAL volume](https://www.postgresql.org/docs/current/runtime-config-wal.html#GUC-WAL-LEVEL), which can lead to higher write amplification and I/O usage, especially for wide tables or frequent updates. To prepare for these impacts, plan your storage capacity and I/O requirements, monitor your WAL growth, and track replication lag in write-heavy workloads.
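The RDS for PostgreSQL requirements above can also be expressed as a client-side pre-check. This sketch (the function name is hypothetical) validates a dictionary of parameter values against the rules listed on this page:

```python
def check_postgres_zero_etl_params(params: dict) -> list:
    """Return a list of problems with the PostgreSQL parameter values
    required for a zero-ETL integration (illustrative helper)."""
    problems = []
    if params.get("rds.logical_replication") != 1:
        problems.append("rds.logical_replication must be 1")
    if params.get("rds.replica_identity_full") != 1:
        problems.append("rds.replica_identity_full must be 1")
    if params.get("session_replication_role") != "origin":
        problems.append("session_replication_role must be origin")
    timeout = params.get("wal_sender_timeout", 0)
    if timeout != 0 and timeout < 20000:
        problems.append("wal_sender_timeout must be 0 or at least 20000")
    if params.get("max_wal_senders", 0) < 20:
        problems.append("max_wal_senders must be at least 20")
    if params.get("max_replication_slots", 0) < 20:
        problems.append("max_replication_slots must be at least 20")
    return problems

print(check_postgres_zero_etl_params({
    "rds.logical_replication": 1,
    "rds.replica_identity_full": 1,
    "session_replication_role": "origin",
    "wal_sender_timeout": 0,
    "max_wal_senders": 20,
    "max_replication_slots": 20,
}))  # []
```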

**RDS for Oracle**:

No parameter changes are required for RDS for Oracle.

## Step 2: Select or create a source database
<a name="zero-etl.create-cluster"></a>

After you create a custom DB parameter group, choose or create an RDS DB instance. This database will be the source of data replication to the target data warehouse. For instructions to create a Single-AZ or Multi-AZ DB instance, see [Creating an Amazon RDS DB instance](USER_CreateDBInstance.md). For instructions to create a Multi-AZ DB cluster (RDS for MySQL only), see [Creating a Multi-AZ DB cluster for Amazon RDS](create-multi-az-db-cluster.md).

The database must be running a supported DB engine version. For a list of supported versions, see [Supported Regions and DB engines for Amazon RDS zero-ETL integrations](Concepts.RDS_Fea_Regions_DB-eng.Feature.ZeroETL.md).

When you create the database, under **Additional configuration**, change the default **DB parameter group** to the custom parameter group that you created in the previous step.

**Note**  
If you associate the parameter group with the database *after* the database is already created, you must reboot the database to apply the changes before you can create a zero-ETL integration. For instructions, see [Rebooting a DB instance](USER_RebootInstance.md) or [Rebooting a Multi-AZ DB cluster and reader DB instances for Amazon RDS](multi-az-db-clusters-concepts-rebooting.md).

In addition, make sure that automated backups are enabled on the database. For more information, see [Enabling automated backups](USER_WorkingWithAutomatedBackups.Enabling.md).

## Step 3a: Create a target data warehouse
<a name="zero-etl-setting-up.data-warehouse"></a>

After you create your source database, you must create and configure a target data warehouse. The data warehouse must meet the following requirements:
+ It must use an RA3 node type with at least two nodes, or Redshift Serverless.
+ It must be encrypted (if using a provisioned cluster). For more information, see [Amazon Redshift database encryption](https://docs.aws.amazon.com/redshift/latest/mgmt/working-with-db-encryption.html).

For instructions to create a data warehouse, see [Creating a cluster](https://docs.aws.amazon.com/redshift/latest/mgmt/create-cluster) for provisioned clusters, or [Creating a workgroup with a namespace](https://docs.aws.amazon.com/redshift/latest/mgmt/serverless-console-workgroups-create-workgroup-wizard.html) for Redshift Serverless.

### Enable case sensitivity on the data warehouse
<a name="zero-etl-setting-up.case-sensitivity"></a>

For the integration to be successful, the case sensitivity parameter ([https://docs.aws.amazon.com/redshift/latest/dg/r_enable_case_sensitive_identifier.html](https://docs.aws.amazon.com/redshift/latest/dg/r_enable_case_sensitive_identifier.html)) must be enabled for the data warehouse. By default, case sensitivity is disabled on all provisioned clusters and Redshift Serverless workgroups.

To enable case sensitivity, perform the following steps depending on your data warehouse type:
+ **Provisioned cluster** – To enable case sensitivity on a provisioned cluster, create a custom parameter group with the `enable_case_sensitive_identifier` parameter enabled. Then, associate the parameter group with the cluster. For instructions, see [Managing parameter groups using the console](https://docs.aws.amazon.com/redshift/latest/mgmt/managing-parameter-groups-console.html) or [Configuring parameter values using the AWS CLI](https://docs.aws.amazon.com/redshift/latest/mgmt/working-with-parameter-groups.html#configure-parameters-using-the-clil). 
**Note**  
Remember to reboot the cluster after you associate the custom parameter group with it.
+ **Serverless workgroup** – To enable case sensitivity on a Redshift Serverless workgroup, you must use the AWS CLI. The Amazon Redshift console doesn't currently support modifying Redshift Serverless parameter values. Send the following [update-workgroup](https://docs.aws.amazon.com/cli/latest/reference/redshift-serverless/update-workgroup.html) request:

  ```
  aws redshift-serverless update-workgroup \
    --workgroup-name target-workgroup \
    --config-parameters parameterKey=enable_case_sensitive_identifier,parameterValue=true
  ```

  You don't need to reboot a workgroup after you modify its parameter values.

### Configure authorization for the data warehouse
<a name="zero-etl.setup-auth"></a>

After you create a data warehouse, you must configure the source RDS database as an authorized integration source. For instructions, see [Configure authorization for your Amazon Redshift data warehouse](https://docs.aws.amazon.com/redshift/latest/mgmt/zero-etl-using.setting-up.html#zero-etl-using.redshift-iam).

## Set up an integration using the AWS SDKs
<a name="zero-etl.setup-sdk"></a>

Rather than setting up each resource manually, you can run the following Python script to automatically set up the required resources for you. The code example uses the [AWS SDK for Python (Boto3)](https://boto3.amazonaws.com/v1/documentation/api/latest/index.html) to create a source RDS for MySQL DB instance and target data warehouse, each with the required parameter values. It then waits for the databases to be available before creating a zero-ETL integration between them. You can comment out different functions depending on which resources you need to set up. 

To install the required dependency, run the following command:

```
pip install boto3
```

The `time` module is part of the Python standard library, so you don't need to install it separately.

Within the script, optionally modify the names of the source, target, and parameter groups. The final function creates an integration named `my-integration` after the resources are set up.

### Python code example
<a name="zero-etl.setup-sdk-python"></a>

```
import boto3
import time

# Build the client using the default credential configuration.
# You can use the CLI and run 'aws configure' to set access key, secret
# key, and default Region.

rds = boto3.client('rds')
redshift = boto3.client('redshift')
sts = boto3.client('sts')

source_db_name = 'my-source-db' # A name for the source database
source_param_group_name = 'my-source-param-group' # A name for the source parameter group
target_cluster_name = 'my-target-cluster' # A name for the target cluster
target_param_group_name = 'my-target-param-group' # A name for the target parameter group

def create_source_db(*args):
    """Creates a source RDS for MySQL DB instance"""

    response = rds.create_db_parameter_group(
        DBParameterGroupName=source_param_group_name,
        DBParameterGroupFamily='mysql8.0',
        Description='RDS for MySQL zero-ETL integrations'
    )
    print('Created source parameter group: ' + response['DBParameterGroup']['DBParameterGroupName'])

    response = rds.modify_db_parameter_group(
        DBParameterGroupName=source_param_group_name,
        Parameters=[
            {
                'ParameterName': 'binlog_format',
                'ParameterValue': 'ROW',
                'ApplyMethod': 'pending-reboot'
            },
            {
                'ParameterName': 'binlog_row_image',
                'ParameterValue': 'full',
                'ApplyMethod': 'pending-reboot'
            }
        ]
    )
    print('Modified source parameter group: ' + response['DBParameterGroupName'])

    response = rds.create_db_instance(
        DBInstanceIdentifier=source_db_name,
        DBParameterGroupName=source_param_group_name,
        Engine='mysql',
        EngineVersion='8.0.32',
        DBName='mydb',
        DBInstanceClass='db.m5.large',
        AllocatedStorage=15,
        MasterUsername='username',
        MasterUserPassword='Password01**'
    )
    print('Creating source database: ' + response['DBInstance']['DBInstanceIdentifier'])
    source_arn = (response['DBInstance']['DBInstanceArn'])
    create_target_cluster(target_cluster_name, source_arn, target_param_group_name)
    return(response)

def create_target_cluster(target_cluster_name, source_arn, target_param_group_name):
    """Creates a target Redshift cluster"""

    response = redshift.create_cluster_parameter_group(
        ParameterGroupName=target_param_group_name,
        ParameterGroupFamily='redshift-1.0',
        Description='RDS for MySQL zero-ETL integrations'
    )
    print('Created target parameter group: ' + response['ClusterParameterGroup']['ParameterGroupName'])

    response = redshift.modify_cluster_parameter_group(
        ParameterGroupName=target_param_group_name,
        Parameters=[
            {
                'ParameterName': 'enable_case_sensitive_identifier',
                'ParameterValue': 'true'
            }
        ]
    )
    print('Modified target parameter group: ' + response['ParameterGroupName'])

    response = redshift.create_cluster(
        ClusterIdentifier=target_cluster_name,
        NodeType='ra3.4xlarge',
        NumberOfNodes=2,
        Encrypted=True,
        MasterUsername='username',
        MasterUserPassword='Password01**',
        ClusterParameterGroupName=target_param_group_name
    )
    print('Creating target cluster: ' + response['Cluster']['ClusterIdentifier'])
    
    # Retrieve the target cluster ARN
    response = redshift.describe_clusters(
        ClusterIdentifier=target_cluster_name
    )
    target_arn = response['Clusters'][0]['ClusterNamespaceArn']

    # Retrieve the current user's account ID
    response = sts.get_caller_identity()
    account_id = response['Account']

    # Create a resource policy granting access to source database and account ID
    response = redshift.put_resource_policy(
        ResourceArn=target_arn,
        Policy='''
        {
            \"Version\":\"2012-10-17\",
            \"Statement\":[
                {\"Effect\":\"Allow\",
                \"Principal\":{
                    \"Service\":\"redshift.amazonaws.com\"
                },
                \"Action\":[\"redshift:AuthorizeInboundIntegration\"],
                \"Condition\":{
                    \"StringEquals\":{
                        \"aws:SourceArn\":\"%s\"}
                    }
                },
                {\"Effect\":\"Allow\",
                \"Principal\":{
                    \"AWS\":\"arn:aws:iam::%s:root\"},
                \"Action\":\"redshift:CreateInboundIntegration\"}
            ]
        }
        ''' % (source_arn, account_id)
    )
    return(response)

def wait_for_db_availability(*args):
    """Waits for both databases to be available"""

    print('Waiting for source and target to be available...')

    response = rds.describe_db_instances(
        DBInstanceIdentifier=source_db_name
    )
    source_status = response['DBInstances'][0]['DBInstanceStatus']
    source_arn = response['DBInstances'][0]['DBInstanceArn']

    response = redshift.describe_clusters(
        ClusterIdentifier=target_cluster_name
    )
    target_status = response['Clusters'][0]['ClusterStatus']
    target_arn = response['Clusters'][0]['ClusterNamespaceArn']

    # Every 60 seconds, check whether the databases are available
    if source_status != 'available' or target_status != 'available':
        time.sleep(60)
        response = wait_for_db_availability(
            source_db_name, target_cluster_name)
    else:
        print('Databases available. Ready to create zero-ETL integration.')
        create_integration(source_arn, target_arn)
        return

def create_integration(source_arn, target_arn):
    """Creates a zero-ETL integration using the source and target databases"""

    response = rds.create_integration(
        SourceArn=source_arn,
        TargetArn=target_arn,
        IntegrationName='my-integration'
    )
    print('Creating integration: ' + response['IntegrationName'])
    
def main():
    """main function"""
    create_source_db(source_db_name, source_param_group_name)
    wait_for_db_availability(source_db_name, target_cluster_name)

if __name__ == "__main__":
    main()
```
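As a variation on the script above, you can build the inline resource policy with `json.dumps` instead of a %-formatted string with escaped quotes, which avoids producing malformed JSON. This sketch (the helper name is hypothetical) returns an equivalent policy document:

```python
import json

def build_redshift_resource_policy(source_arn: str, account_id: str) -> str:
    """Build the same resource policy the script above embeds as a string,
    using json.dumps rather than escaped quotes and %-formatting."""
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Allow Amazon Redshift to authorize the inbound integration
                # from the source database.
                "Effect": "Allow",
                "Principal": {"Service": "redshift.amazonaws.com"},
                "Action": ["redshift:AuthorizeInboundIntegration"],
                "Condition": {"StringEquals": {"aws:SourceArn": source_arn}},
            },
            {
                # Allow the account to create the inbound integration.
                "Effect": "Allow",
                "Principal": {"AWS": f"arn:aws:iam::{account_id}:root"},
                "Action": "redshift:CreateInboundIntegration",
            },
        ],
    }
    return json.dumps(policy)
```

You could then pass the returned string as the `Policy` argument to `put_resource_policy` in place of the inline string.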

## Step 3b: Create an AWS Glue catalog for Amazon SageMaker Lakehouse zero-ETL integration
<a name="zero-etl-setting-up.sagemaker"></a>

When creating a zero-ETL integration with an Amazon SageMaker Lakehouse, you must create an AWS Glue managed catalog in AWS Lake Formation. The target catalog must be an Amazon Redshift managed catalog. To create an Amazon Redshift managed catalog, first create the `AWSServiceRoleForRedshift` service-linked role. Then, in the Lake Formation console, add the `AWSServiceRoleForRedshift` role as a read-only administrator.

For more information about the previous tasks, see the following topics.
+ For information about creating an Amazon Redshift managed catalog, see [Creating an Amazon Redshift managed catalog in the AWS Glue Data Catalog](https://docs.aws.amazon.com/lake-formation/latest/dg/create-rms-catalog.html) in the *AWS Lake Formation Developer Guide*.
+ For information about the service-linked role for Amazon Redshift, see [Using service-linked roles for Amazon Redshift](https://docs.aws.amazon.com/redshift/latest/mgmt/using-service-linked-roles.html) in the *Amazon Redshift Management Guide*.
+ For information about read-only administrator permissions for Lake Formation, see [Lake Formation personas and IAM permissions reference](https://docs.aws.amazon.com/lake-formation/latest/dg/permissions-reference.html) in the *AWS Lake Formation Developer Guide*.

### Configure permissions for the target AWS Glue catalog
<a name="zero-etl-setting-up.sagemaker-permissions"></a>

Before creating a target catalog for zero-ETL integration, you must create the Lake Formation target creation role and the AWS Glue data transfer role. Use the Lake Formation target creation role to create the target catalog. When creating the target catalog, enter the AWS Glue data transfer role in the **IAM role** field in the **Access from engines** section.

#### Lake Formation target creation role
<a name="zero-etl-setting-up.target-creation-role"></a>

The target creation role must be a Lake Formation administrator and requires the following permissions.

------
#### [ JSON ]

****  

```
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": "lakeformation:RegisterResource",
            "Resource": "*"
        },
        {
            "Sid": "VisualEditor1",
            "Effect": "Allow",
            "Action": [
                "s3:PutEncryptionConfiguration",
                "iam:PassRole",
                "glue:CreateCatalog",
                "glue:GetCatalog",
                "s3:PutBucketTagging",
                "s3:PutLifecycleConfiguration",
                "s3:PutBucketPolicy",
                "s3:CreateBucket",
                "redshift-serverless:CreateNamespace",
                "s3:DeleteBucket",
                "s3:PutBucketVersioning",
                "redshift-serverless:CreateWorkgroup"
            ],
            "Resource": [
                "arn:aws:glue:*:111122223333:catalog",
                "arn:aws:glue:*:111122223333:catalog/*",
                "arn:aws:s3:::*",
                "arn:aws:redshift-serverless:*:111122223333:workgroup/*",
                "arn:aws:redshift-serverless:*:111122223333:namespace/*",
                "arn:aws:iam::111122223333:role/GlueDataCatalogDataTransferRole"
            ]
        }
    ]
}
```

------

The target creation role must have the following trust relationship.

------
#### [ JSON ]

****  

```
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "Service": "glue.amazonaws.com"
            },
            "Action": "sts:AssumeRole"
        },
        {
          "Effect": "Allow",
          "Principal": {
            "AWS": "arn:aws:iam::111122223333:user/Username"
          },
          "Action": "sts:AssumeRole"
        }
    ]
}
```

------

#### Glue data transfer role
<a name="zero-etl-setting-up.glue-data-transfer-role"></a>

The Glue data transfer role is required for MySQL catalog operations and must have the following permissions.

------
#### [ JSON ]

****  

```
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "DataTransferRolePolicy",
            "Effect": "Allow",
            "Action": [
                "kms:GenerateDataKey",
                "kms:Decrypt",
                "glue:GetCatalog",
                "glue:GetDatabase"
            ],
            "Resource": [
                "*"
            ]
        }
    ]
}
```

------

The Glue data transfer role must have the following trust relationship.

------
#### [ JSON ]

****  

```
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "Service": [
                    "glue.amazonaws.com",
                    "redshift.amazonaws.com"
                ]
            },
            "Action": "sts:AssumeRole"
        }
    ]
}
```

------

## Next steps
<a name="zero-etl.setup-next"></a>

With a source RDS database and either an Amazon Redshift target data warehouse or Amazon SageMaker Lakehouse, you can create a zero-ETL integration and replicate data. For instructions, see [Creating Amazon RDS zero-ETL integrations with Amazon Redshift](zero-etl.creating.md).

# Creating Amazon RDS zero-ETL integrations with Amazon Redshift
<a name="zero-etl.creating"></a>

When you create an Amazon RDS zero-ETL integration, you specify the source RDS database and the target Amazon Redshift data warehouse. You can also customize encryption settings and add tags. Amazon RDS creates an integration between the source database and its target. Once the integration is active, any data that you insert into the source database will be replicated into the configured Amazon Redshift target.

## Prerequisites
<a name="zero-etl.create-prereqs"></a>

Before you create a zero-ETL integration, you must create a source database and a target Amazon Redshift data warehouse. You also must allow replication into the data warehouse by adding the database as an authorized integration source.

For instructions to complete each of these steps, see [Getting started with Amazon RDS zero-ETL integrations](zero-etl.setting-up.md).

## Required permissions
<a name="zero-etl.create-permissions"></a>

Certain IAM permissions are required to create a zero-ETL integration. At minimum, you need permissions to perform the following actions:
+ Create zero-ETL integrations for the source RDS database.
+ View and delete all zero-ETL integrations.
+ Create inbound integrations into the target data warehouse.

The following sample policies demonstrate the [least privilege permissions](https://docs.aws.amazon.com/IAM/latest/UserGuide/best-practices.html#grant-least-privilege) required to create and manage integrations. You might not need these exact permissions if your user or role has broader permissions, such as an `AdministratorAccess` managed policy.

**Note**  
Redshift Amazon Resource Names (ARNs) have the following formats. Note the use of a forward slash (`/`) rather than a colon (`:`) before the serverless namespace UUID.  
Provisioned cluster – `arn:aws:redshift:{region}:{account-id}:namespace:namespace-uuid`
Serverless – `arn:aws:redshift-serverless:{region}:{account-id}:namespace/namespace-uuid`
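
The slash-versus-colon distinction matters when you assemble these ARNs programmatically. The following sketch builds both variants; the helper name is ours, not part of any AWS SDK:

```python
def redshift_namespace_arn(region, account_id, namespace_uuid, serverless=False):
    """Build a Redshift namespace ARN.

    Serverless namespaces take a forward slash before the UUID;
    provisioned clusters take a colon.
    """
    if serverless:
        return (f"arn:aws:redshift-serverless:{region}:{account_id}"
                f":namespace/{namespace_uuid}")
    return f"arn:aws:redshift:{region}:{account_id}:namespace:{namespace_uuid}"
```

For example, `redshift_namespace_arn("us-east-1", "123456789012", "abc", serverless=True)` ends in `namespace/abc`, while the provisioned form ends in `namespace:abc`.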

### Sample policy for Redshift target
<a name="zero-etl.create-sample-policy"></a>

------
#### [ JSON ]

****  

```
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "CreateIntegration",
      "Effect": "Allow",
      "Action": [
        "rds:CreateIntegration"
      ],
      "Resource": [
        "arn:aws:rds:us-east-1:123456789012:db:source-db",
        "arn:aws:rds:us-east-1:123456789012:integration:*"
      ]
    },
    {
      "Sid": "DescribeIntegrationDetails",
      "Effect": "Allow",
      "Action": [
        "rds:DescribeIntegrations"
      ],
      "Resource": [
        "arn:aws:rds:us-east-1:123456789012:integration:*"
      ]
    },
    {
      "Sid": "ChangeIntegrationDetails",
      "Effect": "Allow",
      "Action": [
        "rds:DeleteIntegration",
        "rds:ModifyIntegration"
      ],
      "Resource": [
        "arn:aws:rds:us-east-1:123456789012:integration:*"
      ]
    },
    {
      "Sid": "AllowRedShiftIntegration",
      "Effect": "Allow",
      "Action": [
        "redshift:CreateInboundIntegration"
      ],
      "Resource": [
        "arn:aws:redshift:us-east-1:123456789012:namespace:namespace-uuid"
      ]
    }
  ]
}
```

------

### Choosing a target data warehouse in a different account
<a name="zero-etl.create-permissions-cross-account"></a>

If you plan to specify a target Amazon Redshift data warehouse that's in another AWS account, you must create a role that allows users in the current account to access resources in the target account. For more information, see [Providing access to an IAM user in another AWS account that you own](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_common-scenarios_aws-accounts.html).

The role must have the following permissions, which allow the user to view available Amazon Redshift provisioned clusters and Redshift Serverless namespaces in the target account.

#### Required permissions and trust policy
<a name="zero-etl.cross-account-sample-policy"></a>

------
#### [ JSON ]

****  

```
{
   "Version": "2012-10-17",
   "Statement":[
      {
         "Effect":"Allow",
         "Action":[
            "redshift:DescribeClusters",
            "redshift-serverless:ListNamespaces"
         ],
         "Resource":[
            "*"
         ]
      }
   ]
}
```

------

The role must have the following trust policy, which specifies the target account ID.

------
#### [ JSON ]

****  

```
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "AWS": "arn:aws:iam::111122223333:root"
            },
            "Action": "sts:AssumeRole"
        }
    ]
}
```

------

For instructions to create the role, see [Creating a role using custom trust policies](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_create_for-custom.html).

## Creating zero-ETL integrations
<a name="zero-etl.create"></a>

You can create a zero-ETL integration using the AWS Management Console, the AWS CLI, or the RDS API.

**Important**  
Zero-ETL integrations do not support refresh or resync operations. If you encounter issues with an integration after creation, you must delete the integration and create a new one.

By default, RDS for MySQL immediately purges binary log files. Because zero-ETL integrations rely on binary logs to replicate data from the source to the target, the retention period for the source database must be at least one hour. As soon as you create an integration, Amazon RDS checks the binary log file retention period for the selected source database. If the current value is 0 hours, Amazon RDS automatically changes it to 1 hour. Otherwise, the value remains the same.
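
The retention check described above can be mirrored client-side before you create the integration. This is an illustrative sketch (the function name is ours, not RDS's actual implementation); the `mysql.rds_set_configuration` stored procedure shown in the comment is the documented way to set binary log retention on RDS for MySQL:

```python
def adjusted_binlog_retention_hours(current_hours):
    """Mirror the check RDS performs at integration creation:
    an unset (None) or 0-hour retention is raised to 1 hour;
    any other value is left unchanged."""
    if not current_hours:
        return 1
    return current_hours

# To raise retention yourself ahead of time, run the RDS-provided stored
# procedure on the source database, for example:
#   CALL mysql.rds_set_configuration('binlog retention hours', 24);
```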

### RDS console
<a name="zero-etl.create-console"></a>

**To create a zero-ETL integration**

1. Sign in to the AWS Management Console and open the Amazon RDS console at [https://console.aws.amazon.com/rds/](https://console.aws.amazon.com/rds/).

1. In the left navigation pane, choose **Zero-ETL integrations**.

1. Choose **Create zero-ETL integration**.

1. For **Integration identifier**, enter a name for the integration. The name can have up to 63 alphanumeric characters and can include hyphens.
**Important**  
Catalog names are limited to 19 characters in length. Ensure your integration identifier meets this requirement if it will be used as a catalog name.

1. Choose **Next**.

1. For **Source**, select the RDS database where the data will originate from.
**Note**  
RDS notifies you if the DB parameters aren't configured correctly. If you receive this message, you can either choose **Fix it for me**, or configure them manually. For instructions to fix them manually, see [Step 1: Create a custom DB parameter group](zero-etl.setting-up.md#zero-etl.parameters).  
Modifying DB parameters requires a reboot. Before you can create the integration, the reboot must be complete and the new parameter values must be successfully applied to the database.

1. Once your source database is successfully configured, choose **Next**.

1. For **Target**, do the following:

   1. (Optional) To use a different AWS account for the Amazon Redshift target, choose **Specify a different account**. Then, enter the ARN of an IAM role with permissions to display your data warehouses. For instructions to create the IAM role, see [Choosing a target data warehouse in a different account](#zero-etl.create-permissions-cross-account).

   1. For **Amazon Redshift data warehouse**, select the target for replicated data from the source database. You can choose a provisioned Amazon Redshift *cluster* or a Redshift Serverless *namespace* as the target.
**Note**  
RDS notifies you if the resource policy or case sensitivity settings for the specified data warehouse aren't configured correctly. If you receive this message, you can either choose **Fix it for me**, or configure them manually. For instructions to fix them manually, see [Turn on case sensitivity for your data warehouse](https://docs.aws.amazon.com/redshift/latest/mgmt/zero-etl-using.setting-up.html#zero-etl-setting-up.case-sensitivity) and [Configure authorization for your data warehouse](https://docs.aws.amazon.com/redshift/latest/mgmt/zero-etl-using.setting-up.html#zero-etl-using.redshift-iam) in the *Amazon Redshift Management Guide*.  
Modifying case sensitivity for a *provisioned* Redshift cluster requires a reboot. Before you can create the integration, the reboot must be complete and the new parameter value must be successfully applied to the cluster.  
If your selected source and target are in different AWS accounts, then Amazon RDS cannot fix these settings for you. You must navigate to the other account and fix them manually in Amazon Redshift.

1. Once your target data warehouse is configured correctly, choose **Next**.

1. (Optional) For **Tags**, add one or more tags to the integration. For more information, see [Tagging Amazon RDS resources](USER_Tagging.md).

1. For **Encryption**, specify how you want your integration to be encrypted. By default, RDS encrypts all integrations with an AWS owned key. To choose a customer managed key instead, enable **Customize encryption settings** and choose a KMS key to use for encryption. For more information, see [Encrypting Amazon RDS resources](Overview.Encryption.md).

   Optionally, add an encryption context. For more information, see [Encryption context](https://docs.aws.amazon.com/kms/latest/developerguide/concepts.html#encrypt_context) in the *AWS Key Management Service Developer Guide*.
**Note**  
Amazon RDS adds the following encryption context pairs in addition to any that you add:  
`aws:redshift:integration:arn` - `IntegrationArn`
`aws:servicename:id` - `Redshift`
This reduces the overall number of pairs that you can add from 8 to 6, and contributes to the overall character limit of the grant constraint. For more information, see [Using grant constraints](https://docs.aws.amazon.com/kms/latest/developerguide/create-grant-overview.html#grant-constraints) in the *AWS Key Management Service Developer Guide*.

1. Choose **Next**.

1. Review your integration settings and choose **Create zero-ETL integration**.

   If creation fails, see [I can't create a zero-ETL integration](zero-etl.troubleshooting.md#zero-etl.troubleshooting.creation) for troubleshooting steps.

The integration has a status of `Creating` while it's being created, and the target Amazon Redshift data warehouse has a status of `Modifying`. During this time, you can't query the data warehouse or make any configuration changes on it.

When the integration is successfully created, the status of the integration and the target Amazon Redshift data warehouse both change to `Active`.
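
Because RDS adds two encryption context pairs of its own during the encryption step above, an integration encrypted with a customer managed key can carry at most six custom pairs. A small pre-check, with constant and function names of our choosing:

```python
MAX_TOTAL_PAIRS = 8     # overall limit contributed to the KMS grant constraint
RDS_RESERVED_PAIRS = 2  # aws:redshift:integration:arn and aws:servicename:id

def check_encryption_context(custom_pairs):
    """Raise if the custom encryption context would exceed the limit
    once RDS adds its two reserved pairs."""
    allowed = MAX_TOTAL_PAIRS - RDS_RESERVED_PAIRS
    if len(custom_pairs) > allowed:
        raise ValueError(f"at most {allowed} custom encryption context pairs")
    return custom_pairs
```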

### AWS CLI
<a name="zero-etl.create-cli"></a>

To create a zero-ETL integration using the AWS CLI, use the [create-integration](https://docs.aws.amazon.com/cli/latest/reference/rds/create-integration.html) command with the following options:

**Note**  
Remember that catalog names are limited to 19 characters. Choose your integration name accordingly if it will be used as a catalog name.
+ `--integration-name` – Specify a name for the integration.
+ `--source-arn` – Specify the ARN of the RDS database that will be the source for the integration.
+ `--target-arn` – Specify the ARN of the Amazon Redshift data warehouse that will be the target for the integration.

**Example**  
For Linux, macOS, or Unix:  

```
aws rds create-integration \
    --integration-name my-integration \
    --source-arn arn:aws:rds:{region}:{account-id}:db:my-db \
    --target-arn arn:aws:redshift:{region}:{account-id}:namespace:namespace-uuid
```
For Windows:  

```
aws rds create-integration ^
    --integration-name my-integration ^
    --source-arn arn:aws:rds:{region}:{account-id}:db:my-db ^
    --target-arn arn:aws:redshift:{region}:{account-id}:namespace:namespace-uuid
```

### RDS API
<a name="zero-etl.create-api"></a>

To create a zero-ETL integration by using the Amazon RDS API, use the [https://docs.aws.amazon.com/AmazonRDS/latest/APIReference/API_CreateIntegration.html](https://docs.aws.amazon.com/AmazonRDS/latest/APIReference/API_CreateIntegration.html) operation with the following parameters:

**Note**  
Catalog names are limited to 19 characters. Ensure your `IntegrationName` parameter meets this requirement if it will be used as a catalog name.
+ `IntegrationName` – Specify a name for the integration.
+ `SourceArn` – Specify the ARN of the RDS database that will be the source for the integration.
+ `TargetArn` – Specify the ARN of the Amazon Redshift data warehouse that will be the target for the integration.
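
With boto3, the equivalent call is `create_integration` on the RDS client. The following hedged sketch validates the identifier before assembling the parameters; the helper name is ours and the ARNs are placeholders:

```python
def build_create_integration_params(name, source_arn, target_arn):
    """Validate the integration identifier and assemble
    CreateIntegration parameters."""
    if not 1 <= len(name) <= 63:
        raise ValueError("integration name must be 1-63 characters")
    if len(name) > 19:
        # See the note above: catalog names are limited to 19 characters.
        print("warning: name exceeds the 19-character catalog-name limit")
    return {
        "IntegrationName": name,
        "SourceArn": source_arn,
        "TargetArn": target_arn,
    }

# Requires AWS credentials and real ARNs:
# import boto3
# rds = boto3.client("rds")
# rds.create_integration(**build_create_integration_params(
#     "my-integration",
#     "arn:aws:rds:us-east-1:123456789012:db:source-db",
#     "arn:aws:redshift-serverless:us-east-1:123456789012:namespace/namespace-uuid",
# ))
```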

## Encrypting integrations with a customer managed key
<a name="zero-etl.create-encrypt"></a>

If you specify a customer managed key rather than an AWS owned key when you create an integration, the key policy must provide the Amazon Redshift service principal access to the `CreateGrant` action. In addition, it must allow the current user to perform the `DescribeKey` and `CreateGrant` actions.

The following sample policy demonstrates how to provide the required permissions in the key policy. It includes context keys to further reduce the scope of permissions.

### Sample key policy
<a name="zero-etl.kms-sample-policy"></a>

------
#### [ JSON ]

****  

```
{
    "Version": "2012-10-17",
    "Id": "Key policy",
    "Statement": [
        {
            "Sid": "Enables IAM user permissions",
            "Effect": "Allow",
            "Principal": {
                "AWS": "arn:aws:iam::111122223333:root"
            },
            "Action": "kms:*",
            "Resource": "*"
        },
        {
            "Sid": "Allows the Redshift service principal to add a grant to a KMS key",
            "Effect": "Allow",
            "Principal": {
                "Service": "redshift.amazonaws.com"
            },
            "Action": "kms:CreateGrant",
            "Resource": "*",
            "Condition": {
                "StringEquals": {
                    "kms:EncryptionContext:{context-key}": "{context-value}"
                },
                "ForAllValues:StringEquals": {
                    "kms:GrantOperations": [
                        "Decrypt",
                        "GenerateDataKey",
                        "CreateGrant"
                    ]
                }
            }
        },
        {
            "Sid": "Allows the current user or role to add a grant to a KMS key",
            "Effect": "Allow",
            "Principal": {
                "AWS": "arn:aws:iam::111122223333:role/{role-name}"
            },
            "Action": "kms:CreateGrant",
            "Resource": "*",
            "Condition": {
                "StringEquals": {
                    "kms:EncryptionContext:{context-key}": "{context-value}",
                    "kms:ViaService": "rds.us-east-1.amazonaws.com"
                },
                "ForAllValues:StringEquals": {
                    "kms:GrantOperations": [
                        "Decrypt",
                        "GenerateDataKey",
                        "CreateGrant"
                    ]
                }
            }
        },
        {
            "Sid": "Allows the current user or role to retrieve information about a KMS key",
            "Effect": "Allow",
            "Principal": {
                "AWS": "arn:aws:iam::111122223333:role/{role-name}"
            },
            "Action": "kms:DescribeKey",
            "Resource": "*"
        }
    ]
}
```

------

For more information, see [Creating a key policy](https://docs.aws.amazon.com/kms/latest/developerguide/key-policy-overview.html) in the *AWS Key Management Service Developer Guide*.

## Next steps
<a name="zero-etl.create-next"></a>

After you successfully create a zero-ETL integration, you must create a destination database within your target Amazon Redshift cluster or workgroup. Then, you can start adding data to the source RDS database and querying it in Amazon Redshift. For instructions, see [Creating destination databases in Amazon Redshift](https://docs.aws.amazon.com/redshift/latest/mgmt/zero-etl-using.creating-db.html).

# Creating Amazon RDS zero-ETL integrations with an Amazon SageMaker lakehouse
<a name="zero-etl.creating-smlh"></a>

When you create an Amazon RDS zero-ETL integration with an Amazon SageMaker lakehouse, you specify the source RDS database and the target AWS Glue managed catalog. You can also customize encryption settings and add tags. Amazon RDS creates an integration between the source database and its target. Once the integration is active, any data that you insert into the source database will be replicated into the configured target.

## Prerequisites
<a name="zero-etl.create-prereqs-smlh"></a>

Before you create a zero-ETL integration with an Amazon SageMaker lakehouse, you must create a source database and a target AWS Glue managed catalog. You also must allow replication into the catalog by adding the database as an authorized integration source.

For instructions to complete each of these steps, see [Getting started with Amazon RDS zero-ETL integrations](zero-etl.setting-up.md).

## Required permissions
<a name="zero-etl.create-permissions-smlh"></a>

Certain IAM permissions are required to create a zero-ETL integration with an Amazon SageMaker lakehouse. At minimum, you need permissions to perform the following actions:
+ Create zero-ETL integrations for the source RDS database.
+ View and delete all zero-ETL integrations.
+ Create inbound integrations into the target AWS Glue managed catalog.
+ Access Amazon S3 buckets used by the AWS Glue managed catalog.
+ Use AWS KMS keys for encryption if custom encryption is configured.
+ Register resources with Lake Formation.
+ Put a resource policy on the AWS Glue managed catalog to authorize inbound integrations.

The following sample policy demonstrates the [least privilege permissions](https://docs.aws.amazon.com/IAM/latest/UserGuide/best-practices.html#grant-least-privilege) required to create and manage integrations with an Amazon SageMaker lakehouse. You might not need these exact permissions if your user or role has broader permissions, such as an `AdministratorAccess` managed policy.

Additionally, you must configure a resource policy on the target AWS Glue managed catalog to authorize inbound integrations. Use the following AWS CLI command to apply the resource policy.

### Sample AWS CLI command to authorize inbound integrations on the target catalog
<a name="zero-etl.create-sample-policy-smlh"></a>

```
aws glue put-resource-policy \
      --policy-in-json '{
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {
            "Service": "glue.amazonaws.com"
        },
        "Action": [
            "glue:AuthorizeInboundIntegration"
        ],
        "Resource": ["arn:aws:glue:region:account_id:catalog/catalog_name"],
        "Condition": {
            "StringEquals": {
                "aws:SourceArn": "arn:aws:rds:region:account_id:db:source_name"
            }
        }
    },
    {
        "Effect": "Allow",
        "Principal": {
            "AWS": "account_id"
        },
        "Action": ["glue:CreateInboundIntegration"],
        "Resource": ["arn:aws:glue:region:account_id:catalog/catalog_name"]
    }
    ]
}' \
      --region region
```

**Note**  
Glue catalog Amazon Resource Names (ARNs) have the following format:  
Glue catalog – `arn:aws:glue:{region}:{account-id}:catalog/catalog-name`
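
Rather than hand-editing the resource policy shown above, you can generate it from the region, account, catalog, and source database names. A sketch with placeholder names (the function is ours, not an AWS API):

```python
import json

def catalog_resource_policy(region, account_id, catalog_name, source_db):
    """Build the resource policy that authorizes inbound integrations
    into an AWS Glue managed catalog from a specific RDS source."""
    catalog_arn = f"arn:aws:glue:{region}:{account_id}:catalog/{catalog_name}"
    source_arn = f"arn:aws:rds:{region}:{account_id}:db:{source_db}"
    return json.dumps({
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "glue.amazonaws.com"},
                "Action": ["glue:AuthorizeInboundIntegration"],
                "Resource": [catalog_arn],
                "Condition": {"StringEquals": {"aws:SourceArn": source_arn}},
            },
            {
                "Effect": "Allow",
                "Principal": {"AWS": account_id},
                "Action": ["glue:CreateInboundIntegration"],
                "Resource": [catalog_arn],
            },
        ],
    })

# The resulting string can be passed to:
#   aws glue put-resource-policy --policy-in-json '<output>'
```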

### Choosing a target AWS Glue managed catalog in a different account
<a name="zero-etl.create-permissions-cross-account-smlh"></a>

If you plan to specify a target AWS Glue managed catalog that's in another AWS account, you must create a role that allows users in the current account to access resources in the target account. For more information, see [Providing access to an IAM user in another AWS account that you own](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_common-scenarios_aws-accounts.html).

The role must have the following permissions, which allow the user to view available AWS Glue catalogs in the target account.

#### Required permissions and trust policy
<a name="zero-etl.cross-account-sample-policy-smlh"></a>

------
#### [ JSON ]

****  

```
{
   "Version": "2012-10-17",
   "Statement":[
      {
         "Effect":"Allow",
         "Action":[
            "glue:GetCatalog"
         ],
         "Resource":[
            "*"
         ]
      }
   ]
}
```

------

The role must have the following trust policy, which specifies the target account ID.

------
#### [ JSON ]

****  

```
{
   "Version": "2012-10-17",
   "Statement":[
      {
         "Effect":"Allow",
         "Principal":{
            "AWS": "arn:aws:iam::111122223333:root"
         },
         "Action":"sts:AssumeRole"
      }
   ]
}
```

------

For instructions to create the role, see [Creating a role using custom trust policies](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_create_for-custom.html).

## Creating zero-ETL integrations with an Amazon SageMaker lakehouse
<a name="zero-etl.create-smlh"></a>

You can create a zero-ETL integration with an Amazon SageMaker lakehouse using the AWS Management Console, the AWS CLI, or the RDS API.

**Important**  
Zero-ETL integrations with an Amazon SageMaker lakehouse do not support refresh or resync operations. If you encounter issues with an integration after creation, you must delete the integration and create a new one.

By default, RDS for MySQL immediately purges binary log files. Because zero-ETL integrations rely on binary logs to replicate data from the source to the target, the retention period for the source database must be at least one hour. As soon as you create an integration, Amazon RDS checks the binary log file retention period for the selected source database. If the current value is 0 hours, Amazon RDS automatically changes it to 1 hour. Otherwise, the value remains the same.

### RDS console
<a name="zero-etl.create-console-smlh"></a>

**To create a zero-ETL integration with an Amazon SageMaker lakehouse**

1. Sign in to the AWS Management Console and open the Amazon RDS console at [https://console.aws.amazon.com/rds/](https://console.aws.amazon.com/rds/).

1. In the left navigation pane, choose **Zero-ETL integrations**.

1. Choose **Create zero-ETL integration**.

1. For **Integration identifier**, enter a name for the integration. The name can have up to 63 alphanumeric characters and can include hyphens.

1. Choose **Next**.

1. For **Source**, select the RDS database where the data will originate from.
**Note**  
RDS notifies you if the DB parameters aren't configured correctly. If you receive this message, you can either choose **Fix it for me**, or configure them manually. For instructions to fix them manually, see [Step 1: Create a custom DB parameter group](zero-etl.setting-up.md#zero-etl.parameters).  
Modifying DB parameters requires a reboot. Before you can create the integration, the reboot must be complete and the new parameter values must be successfully applied to the database.

1. Once your source database is successfully configured, choose **Next**.

1. For **Target**, do the following:

   1. (Optional) To use a different AWS account for the Amazon SageMaker lakehouse target, choose **Specify a different account**. Then, enter the ARN of an IAM role with permissions to display your AWS Glue catalogs. For instructions to create the IAM role, see [Choosing a target AWS Glue managed catalog in a different account](#zero-etl.create-permissions-cross-account-smlh).

   1. For **AWS Glue catalog**, select the target for replicated data from the source database. You can choose an existing AWS Glue managed catalog as the target.

   1. The target IAM role needs describe permissions on the target catalog. It must have the following permissions:

------
#### [ JSON ]

****  

      ```
      {
          "Version": "2012-10-17",
          "Statement": [
              {
                  "Sid": "VisualEditor0",
                  "Effect": "Allow",
                  "Action": "glue:GetCatalog",
                  "Resource": [
                      "arn:aws:glue:us-east-1:111122223333:catalog/*",
                      "arn:aws:glue:us-east-1:111122223333:catalog"
                  ]
              }
          ]
      }
      ```

------

      The target IAM role must have the following trust relationship:

------
#### [ JSON ]

****  

      ```
      {
          "Version": "2012-10-17",
          "Statement": [
              {
                  "Effect": "Allow",
                  "Principal": {
                      "Service": "glue.amazonaws.com"
                  },
                  "Action": "sts:AssumeRole"
              }
          ]
      }
      ```

------

   1. You must grant the target IAM role describe permissions for the target AWS Glue managed catalog with the Lake Formation administrator role created in [Step 3b: Create an AWS Glue catalog for Amazon SageMaker Lakehouse zero-ETL integration](zero-etl.setting-up.md#zero-etl-setting-up.sagemaker).
**Note**  
RDS notifies you if the resource policy or configuration settings for the specified AWS Glue managed catalog aren't configured correctly. If you receive this message, you can either choose **Fix it for me**, or configure them manually.  
If your selected source and target are in different AWS accounts, then Amazon RDS cannot fix these settings for you. You must navigate to the other account and fix them manually in SageMaker Unified Studio.

1. Once your target AWS Glue managed catalog is configured correctly, choose **Next**.

1. (Optional) For **Tags**, add one or more tags to the integration. For more information, see [Tagging Amazon RDS resources](USER_Tagging.md).

1. For **Encryption**, specify how you want your integration to be encrypted. By default, RDS encrypts all integrations with an AWS owned key. To choose a customer managed key instead, enable **Customize encryption settings** and choose a KMS key to use for encryption. For more information, see [Encrypting Amazon RDS resources](Overview.Encryption.md).

   Optionally, add an encryption context. For more information, see [Encryption context](https://docs.aws.amazon.com/kms/latest/developerguide/concepts.html#encrypt_context) in the *AWS Key Management Service Developer Guide*.
**Note**  
Amazon RDS adds the following encryption context pairs in addition to any that you add:  
`aws:glue:integration:arn` - `IntegrationArn`
`aws:servicename:id` - `glue`
This reduces the overall number of pairs that you can add from 8 to 6, and contributes to the overall character limit of the grant constraint. For more information, see [Using grant constraints](https://docs.aws.amazon.com/kms/latest/developerguide/create-grant-overview.html#grant-constraints) in the *AWS Key Management Service Developer Guide*.

1. Choose **Next**.

1. Review your integration settings and choose **Create zero-ETL integration**.

   If creation fails, see [Troubleshooting Amazon RDS zero-ETL integrations](zero-etl.troubleshooting.md) for troubleshooting steps.

The integration has a status of `Creating` while it's being created, and the target Amazon SageMaker lakehouse has a status of `Modifying`. During this time, you can't query the catalog or make any configuration changes on it.

When the integration is successfully created, the status of the integration and the target Amazon SageMaker lakehouse both change to `Active`.

### AWS CLI
<a name="zero-etl.create-cli-smlh"></a>

To prepare a target AWS Glue managed catalog for zero-ETL integration using the AWS CLI, you must first use the [create-integration-resource-property](https://docs.aws.amazon.com/cli/latest/reference/rds/create-integration.html) command with the following options:
+ `--resource-arn` – Specify the ARN of the AWS Glue managed catalog that will be the target for the integration.
+ `--target-processing-properties` – Specify the ARN of the IAM role used to access the target AWS Glue managed catalog.

```
aws glue create-integration-resource-property --region us-east-1 \
 --resource-arn arn:aws:glue:region:account_id:catalog/catalog_name \
 --target-processing-properties '{"RoleArn" : "arn:aws:iam::account_id:role/TargetIamRole"}'
```

To create a zero-ETL integration with an Amazon SageMaker lakehouse using the AWS CLI, use the [create-integration](https://docs.aws.amazon.com/cli/latest/reference/rds/create-integration.html) command with the following options:
+ `--integration-name` – Specify a name for the integration.
+ `--source-arn` – Specify the ARN of the RDS database that will be the source for the integration.
+ `--target-arn` – Specify the ARN of the AWS Glue managed catalog that will be the target for the integration.

**Example**  
For Linux, macOS, or Unix:  

```
aws rds create-integration \
    --integration-name my-sagemaker-integration \
    --source-arn arn:aws:rds:{region}:{account-id}:cluster:my-db \
    --target-arn arn:aws:glue:{region}:{account-id}:catalog/catalog-name
```
For Windows:  

```
aws rds create-integration ^
    --integration-name my-sagemaker-integration ^
    --source-arn arn:aws:rds:{region}:{account-id}:cluster:my-db ^
    --target-arn arn:aws:glue:{region}:{account-id}:catalog/catalog-name
```

### RDS API
<a name="zero-etl.create-api-smlh"></a>

To create a zero-ETL integration with Amazon SageMaker by using the Amazon RDS API, use the [CreateIntegration](https://docs.aws.amazon.com/AmazonRDS/latest/APIReference/API_CreateIntegration.html) operation with the following parameters:

**Note**  
Catalog names are limited to 19 characters. Make sure that your `IntegrationName` parameter meets this requirement if it will be used as a catalog name.
+ `IntegrationName` – Specify a name for the integration.
+ `SourceArn` – Specify the ARN of the RDS database that will be the source for the integration.
+ `TargetArn` – Specify the ARN of the AWS Glue managed catalog that will be the target for the integration.
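The 19-character catalog name limit in the preceding note can be checked before you call the operation. The following Python sketch is illustrative only; the constant and helper names are not part of the RDS API.

```python
# Illustrative check for the 19-character catalog name limit noted above.
MAX_CATALOG_NAME_LENGTH = 19

def fits_catalog_name_limit(integration_name: str) -> bool:
    """Return True if the integration name can also serve as a catalog name."""
    return len(integration_name) <= MAX_CATALOG_NAME_LENGTH
```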

## Encrypting integrations with a customer managed key
<a name="zero-etl.create-encrypt-smlh"></a>

If you specify a customer managed key rather than an AWS owned key when you create an integration with Amazon SageMaker, the key policy must give the SageMaker Unified Studio service principal access to the `CreateGrant` action. In addition, it must allow the current user to perform the `DescribeKey` and `CreateGrant` actions.

The following sample policy demonstrates how to provide the required permissions in the key policy. It includes context keys to further reduce the scope of permissions.

### Sample key policy
<a name="zero-etl.kms-sample-policy"></a>

------
#### [ JSON ]

****  

```
{
    "Version":"2012-10-17",		 	 	 
    "Id": "Key policy",
    "Statement": [
        {
            "Sid": "EnablesIAMUserPermissions",
            "Effect": "Allow",
            "Principal": {
            "AWS": "arn:aws:iam::111122223333:root"
            },
            "Action": "kms:*",
            "Resource": "*"
        },
        {
            "Sid": "GlueServicePrincipalAddGrant",
            "Effect": "Allow",
            "Principal": {
                "Service": "glue.amazonaws.com"
            },
            "Action": "kms:CreateGrant",
            "Resource": "*",
            "Condition": {
                "StringEquals": {
                    "kms:EncryptionContext:{context-key}":"{context-value}"
                },
                "ForAllValues:StringEquals": {
                    "kms:GrantOperations": [
                        "Decrypt",
                        "GenerateDataKey",
                        "CreateGrant"
                    ]
                }
            }
        },
        {
            "Sid": "AllowsCurrentUserRoleAddGrantKMSKey",
            "Effect": "Allow",
            "Principal": {
            "AWS": "arn:aws:iam::111122223333:role/{role-name}"
            },
            "Action": "kms:CreateGrant",
            "Resource": "*",
            "Condition": {
                "StringEquals": {
                    "kms:EncryptionContext:{context-key}":"{context-value}",
                    "kms:ViaService": "rds.us-east-1.amazonaws.com"
                },
                "ForAllValues:StringEquals": {
                    "kms:GrantOperations": [
                        "Decrypt",
                        "GenerateDataKey",
                        "CreateGrant"
                    ]
                }
            }
        },
        {
            "Sid": "AllowsCurrentUserRoleRetrieveKMSKeyInformation",
            "Effect": "Allow",
            "Principal": {
            "AWS": "arn:aws:iam::111122223333:role/{role-name}"
            },
            "Action": "kms:DescribeKey",
            "Resource": "*"
        }
    ]
}
```

------

For more information, see [Creating a key policy](https://docs.aws.amazon.com/kms/latest/developerguide/key-policy-overview.html) in the *AWS Key Management Service Developer Guide*.
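The encryption context pairs that Amazon RDS adds (`aws:glue:integration:arn` and `aws:servicename:id`, as noted earlier in this topic) count toward the overall pair limit. The following Python sketch models how the final context is assembled; the helper name and error message are illustrative, not an AWS API.

```python
# Sketch of the grant encryption context, based on the pairs that Amazon RDS
# adds and the note that user-supplied pairs are reduced from 8 to 6.
RDS_ADDED_PAIRS = 2  # pairs added by Amazon RDS
MAX_TOTAL_PAIRS = 8  # overall encryption context pair limit per the note above

def build_encryption_context(integration_arn, user_pairs):
    if len(user_pairs) > MAX_TOTAL_PAIRS - RDS_ADDED_PAIRS:
        raise ValueError("you can add at most 6 encryption context pairs")
    context = {
        "aws:glue:integration:arn": integration_arn,
        "aws:servicename:id": "glue",
    }
    context.update(user_pairs)
    return context
```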

## Next steps
<a name="zero-etl.create-next-smlh"></a>

After you successfully create a zero-ETL integration with Amazon SageMaker, you can start adding data to the source RDS database and querying it in your Amazon SageMaker lakehouse. The data will be automatically replicated and made available for analytics and machine learning workloads.

# Data filtering for Amazon RDS zero-ETL integrations
<a name="zero-etl.filtering"></a>

Amazon RDS zero-ETL integrations support data filtering, which lets you control which data is replicated from your source Amazon RDS database to your target data warehouse. Instead of replicating the entire database, you can apply one or more filters to selectively include or exclude specific tables. This helps you optimize storage and query performance by ensuring that only relevant data is transferred. Currently, filtering is limited to the database and table levels. Column- and row-level filtering are not supported.

Data filtering can be useful when you want to:
+ Join certain tables from two or more different source databases, and you don't need complete data from either database.
+ Save costs by performing analytics using only a subset of tables rather than an entire fleet of databases.
+ Filter out sensitive information—such as phone numbers, addresses, or credit card details—from certain tables.

You can add data filters to a zero-ETL integration using the AWS Management Console, the AWS Command Line Interface (AWS CLI), or the Amazon RDS API.

If the integration has a provisioned cluster as its target, the cluster must be on [patch 180](https://docs.aws.amazon.com/redshift/latest/mgmt/cluster-versions.html#cluster-version-180) or higher to use data filtering.

**Topics**
+ [

## Format of a data filter
](#zero-etl.filtering-format)
+ [

## Filter logic
](#zero-etl.filtering-evaluate)
+ [

## Filter precedence
](#zero-etl.filtering-precedence)
+ [

## RDS for MySQL examples
](#zero-etl.filtering-examples-mysql)
+ [

## RDS for PostgreSQL examples
](#zero-etl.filtering-examples-postgres)
+ [

## RDS for Oracle examples
](#zero-etl.filtering-examples-oracle)
+ [

## Adding data filters to an integration
](#zero-etl.add-filter)
+ [

## Removing data filters from an integration
](#zero-etl.remove-filter)

## Format of a data filter
<a name="zero-etl.filtering-format"></a>

You can define multiple filters for a single integration. Each filter either includes or excludes any existing and future database tables that match one of the patterns in the filter expression. Amazon RDS zero-ETL integrations use [Maxwell filter syntax](https://maxwells-daemon.io/filtering/) for data filtering.

Each filter has the following elements:


| Element | Description | 
| --- | --- | 
| Filter type |  An `Include` filter type *includes* all tables that match one of the patterns in the filter expression. An `Exclude` filter type *excludes* all tables that match one of the patterns.  | 
| Filter expression |  A comma-separated list of patterns. Expressions must use [Maxwell filter syntax](https://maxwells-daemon.io/filtering/).  | 
| Pattern |  A filter pattern in the format `database.table` for RDS for MySQL, or `database.schema.table` for RDS for PostgreSQL. You can specify literal names, or define regular expressions.  For RDS for MySQL, regular expressions are supported in both the database and table name. For RDS for PostgreSQL, regular expressions are supported only in the schema and table name, not in the database name.  You can't include column-level filters or denylists. A single integration can have a maximum of 99 total patterns. In the console, you can enter patterns within a single filter expression, or spread them out among multiple expressions. A single pattern can't exceed 256 characters in length.  | 

**Important**  
If you select an RDS for PostgreSQL source database, you must specify at least one data filter pattern. At minimum, the pattern must include a single database (`database-name.*.*`) for replication to the target data warehouse.
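The limits in the table above (at most 99 patterns per integration, at most 256 characters per pattern) can be validated up front. This Python sketch is illustrative; the function and constant names are examples only.

```python
# Illustrative validation of the documented filter limits.
MAX_PATTERNS = 99
MAX_PATTERN_LENGTH = 256

def validate_filter_patterns(patterns):
    if len(patterns) > MAX_PATTERNS:
        raise ValueError("an integration can have at most 99 total patterns")
    for pattern in patterns:
        if len(pattern) > MAX_PATTERN_LENGTH:
            raise ValueError("a single pattern can't exceed 256 characters")
```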

The following image shows the structure of RDS for MySQL data filters in the console:

![\[Data filters for a zero-ETL integration\]](http://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/images/zero-etl-filter.png)


**Important**  
Do not include personally identifying, confidential, or sensitive information in your filter patterns.

### Data filters in the AWS CLI
<a name="zero-etl.filtering-cli"></a>

When using the AWS CLI to add a data filter, the syntax differs slightly from the console. You must assign a filter type (`Include` or `Exclude`) to each pattern individually, so you can't group multiple patterns under one filter type.

For example, in the console you can group the following comma-separated patterns under a single `Include` statement:

**RDS for MySQL**

```
mydb.mytable, mydb./table_\d+/
```

**RDS for PostgreSQL**

```
mydb.myschema.mytable, mydb.myschema./table_\d+/
```

However, when using the AWS CLI, the same data filter must be in the following format:

**RDS for MySQL**

```
'include: mydb.mytable, include: mydb./table_\d+/'
```

**RDS for PostgreSQL**

```
'include: mydb.myschema.mytable, include: mydb.myschema./table_\d+/'
```
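The console-to-CLI difference above is mechanical: every pattern gets its own filter type. The following hypothetical helper sketches the conversion, assuming that individual patterns contain no commas.

```python
# Hypothetical helper that converts a console-style grouped expression into
# the per-pattern CLI format shown above. Assumes patterns contain no commas.
def console_to_cli(filter_type, expression):
    patterns = [p.strip() for p in expression.split(",")]
    return ", ".join(f"{filter_type.lower()}: {p}" for p in patterns)
```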

## Filter logic
<a name="zero-etl.filtering-evaluate"></a>

If you don't specify any data filters in your integration, Amazon RDS assumes a default filter of `include:*.*`, which replicates all tables to the target data warehouse. However, if you add at least one filter, the default logic switches to `exclude:*.*`, which excludes all tables by default. This lets you explicitly define which databases and tables to include in replication.

For example, if you define the following filter:

```
'include: db.table1, include: db.table2'
```

Amazon RDS evaluates the filter as follows:

```
'exclude:*.*, include: db.table1, include: db.table2'
```

Therefore, Amazon RDS only replicates `table1` and `table2` from the database named `db` to the target data warehouse.

## Filter precedence
<a name="zero-etl.filtering-precedence"></a>

Amazon RDS evaluates data filters in the order that you specify them. In the AWS Management Console, it processes filter expressions from left to right and top to bottom. A filter or individual pattern that appears later can override an earlier one.

For example, if the first filter is `Include books.stephenking`, it includes only the `stephenking` table from the `books` database. However, if you add a second filter, `Exclude books.*`, it overrides the first filter. This prevents any tables from the `books` database from being replicated to the target data warehouse.

When you specify at least one filter, the logic starts by assuming `exclude:*.*` by default, which automatically *excludes* all tables from replication. As a best practice, define filters from broadest to most specific. Start with one or more `Include` statements to specify the data to replicate, then add `Exclude` filters to selectively remove certain tables.

The same principle applies to filters that you define using the AWS CLI. Amazon RDS evaluates these filter patterns in the order that you specify them, so a pattern might override one that you specify before it.
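The filter logic and precedence rules above can be sketched as a small evaluator: with no filters everything replicates; with at least one filter the implicit default is `exclude:*.*`, and the last matching pattern wins. This Python model is illustrative, not the Maxwell implementation, and it assumes `/regex/` patterns match the full name.

```python
import re

def _segment_matches(segment, name):
    """Match one database or table segment: *, /regex/, or a literal name."""
    if segment == "*":
        return True
    if segment.startswith("/") and segment.endswith("/"):
        return re.fullmatch(segment[1:-1], name) is not None
    return segment == name

def table_replicated(filters, database, table):
    """filters is an ordered list of (action, db_pattern, table_pattern)."""
    if not filters:
        return True       # no filters: implicit include:*.*
    decision = False      # at least one filter: implicit exclude:*.*
    for action, db_pattern, table_pattern in filters:
        if _segment_matches(db_pattern, database) and _segment_matches(table_pattern, table):
            decision = (action == "include")  # later matches override earlier ones
    return decision
```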

## RDS for MySQL examples
<a name="zero-etl.filtering-examples-mysql"></a>

The following examples demonstrate how data filtering works for RDS for MySQL zero-ETL integrations:
+  Include all databases and all tables:

  ```
  'include: *.*'
  ```
+  Include all tables within the `books` database:

  ```
  'include: books.*'
  ```
+ Exclude any tables named `mystery`:

  ```
  'include: *.*, exclude: *.mystery'
  ```
+ Include two specific tables within the `books` database:

  ```
  'include: books.stephen_king, include: books.carolyn_keene'
  ```
+ Include all tables in the `books` database, except for those containing the substring `mystery`:

  ```
  'include: books.*, exclude: books./.*mystery.*/'
  ```
+ Include all tables in the `books` database, except those starting with `mystery`:

  ```
  'include: books.*, exclude: books./mystery.*/'
  ```
+ Include all tables in the `books` database, except those ending with `mystery`:

  ```
  'include: books.*, exclude: books./.*mystery/'
  ```
+ Include all tables in the `books` database that start with `table_`, except for the one named `table_stephen_king`. For example, `table_movies` or `table_books` would be replicated, but not `table_stephen_king`.

  ```
  'include: books./table_.*/, exclude: books.table_stephen_king'
  ```
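The `/.../` patterns in the examples above are regular expressions. The following Python sketch checks a few table names against the `mystery` patterns, assuming that a pattern must match the full table name; the helper name is illustrative.

```python
import re

# The three mystery patterns from the examples above, slashes removed.
contains_mystery = re.compile(r".*mystery.*")
starts_with_mystery = re.compile(r"mystery.*")
ends_with_mystery = re.compile(r".*mystery")

def matches(pattern, table_name):
    """Assume a filter pattern must match the whole table name."""
    return pattern.fullmatch(table_name) is not None
```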

## RDS for PostgreSQL examples
<a name="zero-etl.filtering-examples-postgres"></a>

The following examples demonstrate how data filtering works for RDS for PostgreSQL zero-ETL integrations:
+ Include all tables within the `books` database:

  ```
  'include: books.*.*'
  ```
+ Exclude any tables named `mystery` in the `books` database:

  ```
  'include: books.*.*, exclude: books.*.mystery'
  ```
+ Include one table within the `books` database in the `mystery` schema, and one table within the `employee` database in the `finance` schema:

  ```
  'include: books.mystery.stephen_king, include: employee.finance.benefits'
  ```
+ Include all tables in the `books` database and `science_fiction` schema, except for those containing the substring `king`:

  ```
  'include: books.science_fiction.*, exclude: books.*./.*king.*/'
  ```
+ Include all tables in the `books` database, except those with a schema name starting with `sci`:

  ```
  'include: books.*.*, exclude: books./sci.*/.*'
  ```
+ Include all tables in the `books` database, except those in the `mystery` schema ending with `king`:

  ```
  'include: books.*.*, exclude: books.mystery./.*king/'
  ```
+ Include all tables in the `books` database that start with `table_`, except for the one named `table_stephen_king`. For example, `table_movies` in the `fiction` schema and `table_books` in the `mystery` schema are replicated, but not `table_stephen_king` in either schema:

  ```
  'include: books.*./table_.*/, exclude: books.*.table_stephen_king'
  ```
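Parsing three-part PostgreSQL patterns takes some care, because dots inside a `/regex/` segment belong to the regex rather than separating segments. The following Python sketch splits a pattern and enforces the rule above that the database name can't be a regular expression; the helper names are examples only.

```python
# Illustrative parser for database.schema.table patterns.
def split_pattern(pattern):
    """Split on dots, treating /regex/ segments as atomic."""
    parts, buf, in_regex = [], "", False
    for ch in pattern:
        if ch == "/":
            in_regex = not in_regex
            buf += ch
        elif ch == "." and not in_regex:
            parts.append(buf)
            buf = ""
        else:
            buf += ch
    parts.append(buf)
    return parts

def validate_postgres_pattern(pattern):
    database, schema, table = split_pattern(pattern)
    # Regular expressions are allowed in the schema and table, not the database.
    if database.startswith("/"):
        raise ValueError("regex is not supported in the database name")
    return database, schema, table
```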

## RDS for Oracle examples
<a name="zero-etl.filtering-examples-oracle"></a>

The following examples demonstrate how data filtering works for RDS for Oracle zero-ETL integrations:
+ Include all tables within the `books` database:

  ```
  'include: books.*.*'
  ```
+ Exclude any tables named `mystery` in the `books` database:

  ```
  'include: books.*.*, exclude: books.*.mystery'
  ```
+ Include one table within the `books` database in the `mystery` schema, and one table within the `employee` database in the `finance` schema:

  ```
  'include: books.mystery.stephen_king, include: employee.finance.benefits'
  ```
+ Include all tables in the `mystery` schema within the `books` database:

  ```
  'include: books.mystery.*'
  ```

### Case sensitivity considerations
<a name="zero-etl.filtering-examples-oracle-case-sensitivity"></a>

Oracle Database and Amazon Redshift handle object name casing differently, which affects both data filter configuration and target queries. Note the following:
+ Oracle Database stores database, schema, and object names in uppercase unless explicitly quoted in the `CREATE` statement. For example, if you create `mytable` (no quotes), the Oracle data dictionary stores the table name as `MYTABLE`. If you quote the object name, the data dictionary preserves the case.
+ Zero-ETL data filters are case sensitive and must match the exact case of object names as they appear in the Oracle data dictionary.
+ Amazon Redshift queries default to lowercase object names unless explicitly quoted. For example, a query of `MYTABLE` (no quotes) searches for `mytable`.

Be mindful of the case differences when you create the Amazon Redshift filter and query the data.
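The folding rules above can be summarized in a short sketch: Oracle folds unquoted identifiers to uppercase, while Amazon Redshift folds unquoted identifiers to lowercase, and quoting preserves case in both. The Python helpers below are illustrative only.

```python
# A sketch of the identifier folding rules described above.
def oracle_stored_name(identifier):
    if identifier.startswith('"') and identifier.endswith('"'):
        return identifier[1:-1]       # quoted: case preserved
    return identifier.upper()         # unquoted: folded to uppercase

def redshift_lookup_name(identifier):
    if identifier.startswith('"') and identifier.endswith('"'):
        return identifier[1:-1]       # quoted: case preserved
    return identifier.lower()         # unquoted: folded to lowercase
```

A data filter and a target query succeed only when both resolve to the same stored name, which is why the sections that follow quote or uppercase names deliberately.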

#### Creating an uppercase integration
<a name="zero-etl.filtering-examples-oracle-uppercase"></a>

When you create a table without specifying the name in double quotes, the Oracle database stores the name in uppercase in the data dictionary. For example, you can create `MYTABLE` using any of the following SQL statements.

```
CREATE TABLE REINVENT.MYTABLE (id NUMBER PRIMARY KEY, description VARCHAR2(100));
CREATE TABLE reinvent.mytable (id NUMBER PRIMARY KEY, description VARCHAR2(100));
CREATE TABLE REinvent.MyTable (id NUMBER PRIMARY KEY, description VARCHAR2(100));
CREATE TABLE reINVENT.MYtabLE (id NUMBER PRIMARY KEY, description VARCHAR2(100));
```

Because you didn't quote the table name in the preceding statements, the Oracle database stores the object name in uppercase as `MYTABLE`.

To replicate this table to Amazon Redshift, you must specify the uppercase name in the data filter of your `create-integration` command. The zero-ETL filter name must match the name in the Oracle data dictionary.

```
aws rds create-integration \
  --integration-name upperIntegration \
  --data-filter "include: ORCL.REINVENT.MYTABLE" \
...
```

By default, Amazon Redshift folds unquoted object names to lowercase. To query `MYTABLE` in the replicated database in Amazon Redshift, you must quote the uppercase name `MYTABLE` so that it matches the case in the Oracle data dictionary.

```
SELECT * FROM targetdb1."REINVENT"."MYTABLE";
```

The following queries don't use the quoting mechanism. They all return an error because they search for an Amazon Redshift table with the default lowercase name `mytable`, but the replicated table is named `MYTABLE`.

```
SELECT * FROM targetdb1."REINVENT".MYTABLE;
SELECT * FROM targetdb1."REINVENT".MyTable;
SELECT * FROM targetdb1."REINVENT".mytable;
```

The following queries use the quoting mechanism to specify a mixed-case name. They all return an error because they search for an Amazon Redshift table that isn't named `MYTABLE`.

```
SELECT * FROM targetdb1."REINVENT"."MYtablE";
SELECT * FROM targetdb1."REINVENT"."MyTable";
SELECT * FROM targetdb1."REINVENT"."mytable";
```

#### Creating a lowercase integration
<a name="zero-etl.filtering-examples-oracle-lowercase"></a>

In the following alternative example, you use double quotes to store the table name in lowercase in the Oracle data dictionary. You create `mytable` as follows.

```
CREATE TABLE REINVENT."mytable" (id NUMBER PRIMARY KEY, description VARCHAR2(100));
```

The Oracle database stores the table name as `mytable` in lowercase. To replicate this table to Amazon Redshift, you must specify the lowercase name `mytable` in your zero-ETL data filter.

```
aws rds create-integration \
  --integration-name lowerIntegration \
  --data-filter "include: ORCL.REINVENT.mytable" \
...
```

When you query this table in the replicated database in Amazon Redshift, you can specify the lowercase name `mytable`. The query succeeds because it searches for a table named `mytable`, which is the table name in the Oracle data dictionary.

```
SELECT * FROM targetdb1."REINVENT".mytable;
```

Because Amazon Redshift defaults to lowercase object names, the following queries also succeed in finding `mytable`.

```
SELECT * FROM targetdb1."REINVENT".MYtablE;
SELECT * FROM targetdb1."REINVENT".MYTABLE;
SELECT * FROM targetdb1."REINVENT".MyTable;
```

The following queries use the quoting mechanism for the object name. They all return an error because they search for an Amazon Redshift table whose name is different from `mytable`.

```
SELECT * FROM targetdb1."REINVENT"."MYTABLE";
SELECT * FROM targetdb1."REINVENT"."MyTable";
SELECT * FROM targetdb1."REINVENT"."MYtablE";
```

#### Creating a mixed-case integration
<a name="zero-etl.filtering-examples-oracle-mixed-case"></a>

In the following example, you use double quotes to store the table name in mixed case in the Oracle data dictionary. You create `MyTable` as follows.

```
CREATE TABLE REINVENT."MyTable" (id NUMBER PRIMARY KEY, description VARCHAR2(100));
```

The Oracle database stores this table name as `MyTable` with mixed case. To replicate this table to Amazon Redshift, you must specify the mixed case name in the data filter.

```
aws rds create-integration \
  --integration-name mixedIntegration \
  --data-filter "include: ORCL.REINVENT.MyTable" \
...
```

When you query this table in the replicated database in Amazon Redshift, you must specify the mixed case name `MyTable` by quoting the object name.

```
SELECT * FROM targetdb1."REINVENT"."MyTable";
```

Because Amazon Redshift defaults to lowercase object names, the following queries don't find the object because they are searching for the lowercase name `mytable`.

```
SELECT * FROM targetdb1."REINVENT".MYtablE;
SELECT * FROM targetdb1."REINVENT".MYTABLE;
SELECT * FROM targetdb1."REINVENT".mytable;
```

**Note**  
You can't use regular expressions in the filter value for database name, schema, or table name in RDS for Oracle integrations.

## Adding data filters to an integration
<a name="zero-etl.add-filter"></a>

You can configure data filtering using the AWS Management Console, the AWS CLI, or the Amazon RDS API. 

**Important**  
If you add a filter after you create an integration, Amazon RDS treats it as if it always existed. It removes any data in the target data warehouse that doesn’t match the new filtering criteria and resynchronizes all affected tables.

### RDS console
<a name="add-filter-console"></a>

**To add data filters to a zero-ETL integration**

1. Sign in to the AWS Management Console and open the Amazon RDS console at [https://console.aws.amazon.com/rds/](https://console.aws.amazon.com/rds/).

1. In the navigation pane, choose **Zero-ETL integrations**. Select the integration that you want to add data filters to, and then choose **Modify**.

1. Under **Source**, add one or more `Include` and `Exclude` statements.

   The following image shows an example of data filters for a MySQL integration:  
![\[Data filters for a zero-ETL integration in the RDS console\]](http://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/images/zero-etl-filter-data.png)

1. When you're satisfied with the changes, choose **Continue** and **Save changes**.

### AWS CLI
<a name="add-filter-cli"></a>

To add data filters to a zero-ETL integration using the AWS CLI, call the [modify-integration](https://awscli.amazonaws.com/v2/documentation/api/latest/reference/rds/modify-integration.html) command. In addition to the integration identifier, specify the `--data-filter` parameter with a comma-separated list of `Include` and `Exclude` Maxwell filters.

**Example**  
The following example adds filter patterns to `my-integration`.  
For Linux, macOS, or Unix:  

```
aws rds modify-integration \
    --integration-identifier my-integration \
    --data-filter 'include: foodb.*, exclude: foodb.tbl, exclude: foodb./table_\d+/'
```
For Windows:  

```
aws rds modify-integration ^
    --integration-identifier my-integration ^
    --data-filter 'include: foodb.*, exclude: foodb.tbl, exclude: foodb./table_\d+/'
```
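The `--data-filter` value is just a comma-separated list in which every pattern carries its own filter type. The following hypothetical helper sketches how you might assemble it from separate include and exclude lists before calling `modify-integration`.

```python
# Hypothetical helper that assembles a --data-filter value from pattern lists.
def build_data_filter(includes, excludes=()):
    parts = [f"include: {p}" for p in includes]
    parts += [f"exclude: {p}" for p in excludes]
    return ", ".join(parts)
```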

### RDS API
<a name="add-filter-api"></a>

To modify a zero-ETL integration using the RDS API, call the [ModifyIntegration](https://docs.aws.amazon.com/AmazonRDS/latest/APIReference/API_ModifyIntegration.html) operation. Specify the integration identifier and provide a comma-separated list of filter patterns.

## Removing data filters from an integration
<a name="zero-etl.remove-filter"></a>

When you remove a data filter from an integration, Amazon RDS reevaluates the remaining filters as if the removed filter never existed. It then replicates any previously excluded data that now meets the criteria into the target data warehouse. This triggers a resynchronization of all affected tables.

# Adding data to a source RDS database and querying it
<a name="zero-etl.querying"></a>

To finish creating a zero-ETL integration that replicates data from Amazon RDS into Amazon Redshift, you must create a database in the target destination.

For connections with Amazon Redshift, connect to your Amazon Redshift cluster or workgroup and create a database with a reference to your integration identifier. Then, you can add data to your source RDS database and see it replicated in Amazon Redshift or Amazon SageMaker.

**Topics**
+ [

## Creating a target database
](#zero-etl.create-db)
+ [

## Adding data to the source database
](#zero-etl.add-data-rds)
+ [

## Querying your Amazon RDS data in Amazon Redshift
](#zero-etl.query-data-redshift)
+ [

## Data type differences between RDS and Amazon Redshift databases
](#zero-etl.data-type-mapping)
+ [

## DDL operations for RDS for PostgreSQL
](#zero-etl.ddl-postgres)

## Creating a target database
<a name="zero-etl.create-db"></a>

After you create an integration, you must create a database in your target data warehouse before you can start replicating data into Amazon Redshift. This database must include a reference to the integration identifier. You can use the Amazon Redshift console or the Query editor v2 to create the database.

For instructions to create a destination database, see [Create a destination database in Amazon Redshift](https://docs.aws.amazon.com/redshift/latest/mgmt/zero-etl-using.creating-db.html#zero-etl-using.create-db).

## Adding data to the source database
<a name="zero-etl.add-data-rds"></a>

After you configure your integration, you can populate the source RDS database with data that you want to replicate into your data warehouse.

**Note**  
There are differences between data types in Amazon RDS and the target analytics warehouse. For a table of data type mappings, see [Data type differences between RDS and Amazon Redshift databases](#zero-etl.data-type-mapping).

First, connect to the source database using a compatible database client. For example, for instructions for RDS for MySQL, see [Connecting to your MySQL DB instance](USER_ConnectToInstance.md).

Then, create a table and insert a row of sample data.

**Important**  
Make sure that the table has a primary key. Otherwise, it can't be replicated to the target data warehouse.

**RDS for MySQL**

The following example uses the [MySQL Workbench utility](https://dev.mysql.com/downloads/workbench/).

```
CREATE DATABASE my_db;

USE my_db;

CREATE TABLE books_table (ID int NOT NULL, Title VARCHAR(50) NOT NULL, Author VARCHAR(50) NOT NULL,
Copyright INT NOT NULL, Genre VARCHAR(50) NOT NULL, PRIMARY KEY (ID));

INSERT INTO books_table VALUES (1, 'The Shining', 'Stephen King', 1977, 'Supernatural fiction');
```



**RDS for PostgreSQL**

The following example uses the [psql](https://www.postgresql.org/docs/current/app-psql.html) PostgreSQL interactive terminal. When connecting to the database, include the name of the database that you want to replicate.

```
psql -h mydatabase.123456789012.us-east-2.rds.amazonaws.com -p 5432 -U username -d named_db;

named_db=> CREATE TABLE books_table (ID int NOT NULL, Title VARCHAR(50) NOT NULL, Author VARCHAR(50) NOT NULL,
Copyright INT NOT NULL, Genre VARCHAR(50) NOT NULL, PRIMARY KEY (ID));

named_db=> INSERT INTO books_table VALUES (1, 'The Shining', 'Stephen King', 1977, 'Supernatural fiction');
```

**RDS for Oracle**

The following example uses SQL\*Plus to connect to your RDS for Oracle database.

```
sqlplus 'user_name@(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=dns_name)(PORT=port))(CONNECT_DATA=(SID=database_name)))'

SQL> CREATE TABLE books_table (ID int NOT NULL, Title VARCHAR(50) NOT NULL, Author VARCHAR(50) NOT NULL,
Copyright INT NOT NULL, Genre VARCHAR(50) NOT NULL, PRIMARY KEY (ID));

SQL> INSERT INTO books_table VALUES (1, 'The Shining', 'Stephen King', 1977, 'Supernatural fiction');
```

## Querying your Amazon RDS data in Amazon Redshift
<a name="zero-etl.query-data-redshift"></a>

After you add data to the RDS database, it's replicated into the destination database and is ready to be queried.

**To query the replicated data**

1. Navigate to the Amazon Redshift console and choose **Query editor v2** from the left navigation pane.

1. Connect to your cluster or workgroup and choose your destination database (which you created from the integration) from the dropdown menu (**destination_database** in this example). For instructions to create a destination database, see [Create a destination database in Amazon Redshift](https://docs.aws.amazon.com/redshift/latest/mgmt/zero-etl-using.creating-db.html#zero-etl-using.create-db).

1. Use a SELECT statement to query your data. In this example, you can run the following command to select all data from the table that you created in the source RDS database:

   ```
   SELECT * from my_db."books_table";
   ```  
![\[Run a SELECT statement within the query editor. The result is a single row of sample data that was added to the Amazon RDS database.\]](http://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/images/zero-etl-redshift-editor.png)
   + `my_db` is the RDS database schema name. 
   + `books_table` is the RDS table name.

You can also query the data using a command line client. For example:

```
destination_database=# select * from my_db."books_table";

 ID |       Title |        Author |   Copyright |                  Genre |  txn_seq |  txn_id
----+-------------+---------------+-------------+------------------------+----------+--------+
  1 | The Shining |  Stephen King |        1977 |   Supernatural fiction |        2 |   12192
```

**Note**  
To preserve case sensitivity, use double quotation marks (" ") around schema, table, and column names. For more information, see [enable_case_sensitive_identifier](https://docs.aws.amazon.com/redshift/latest/dg/r_enable_case_sensitive_identifier.html).

## Data type differences between RDS and Amazon Redshift databases
<a name="zero-etl.data-type-mapping"></a>

The following tables show the mappings of RDS for MySQL, RDS for PostgreSQL, and RDS for Oracle data types to corresponding destination data types. *Amazon RDS currently supports only these data types for zero-ETL integrations.*

If a table in your source database includes an unsupported data type, the table goes out of sync and isn't consumable in the target. Streaming from the source to the target continues, but the table with the unsupported data type isn't available. To fix the table and make it available again, you must manually revert the breaking change and then refresh the integration by running [ALTER DATABASE...INTEGRATION REFRESH](https://docs.aws.amazon.com/redshift/latest/dg/r_ALTER_DATABASE.html).

**Note**  
You can't refresh zero-ETL integrations with an Amazon SageMaker lakehouse. Instead, delete and try to create the integration again.

**Topics**
+ [

### RDS for MySQL
](#zero-etl.data-type-mapping-mysql)
+ [

### RDS for PostgreSQL
](#zero-etl.data-type-mapping-postgres)
+ [

### RDS for Oracle
](#zero-etl.data-type-mapping-oracle)

### RDS for MySQL
<a name="zero-etl.data-type-mapping-mysql"></a>

[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/zero-etl.querying.html)

### RDS for PostgreSQL
<a name="zero-etl.data-type-mapping-postgres"></a>

Zero-ETL integrations for RDS for PostgreSQL don't support custom data types or data types created by extensions.

[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/zero-etl.querying.html)

### RDS for Oracle
<a name="zero-etl.data-type-mapping-oracle"></a>

**Unsupported data types**

The following RDS for Oracle data types are not supported by Amazon Redshift:
+ `ANYDATA`
+ `BFILE`
+ `REF`
+ `ROWID`
+ `UROWID`
+ `VARRAY`
+ `SDO_GEOMETRY`
+ User-defined data types

**Data type differences**

The following table shows the data type differences that affect a zero-ETL integration when RDS for Oracle is the source and Amazon Redshift is the target.

[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/zero-etl.querying.html)

## DDL operations for RDS for PostgreSQL
<a name="zero-etl.ddl-postgres"></a>

Because Amazon Redshift is derived from PostgreSQL, it shares several features with RDS for PostgreSQL. Zero-ETL integrations take advantage of these similarities to streamline data replication from RDS for PostgreSQL to Amazon Redshift, mapping databases by name and using the shared database, schema, and table structure.

Consider the following points when managing RDS for PostgreSQL zero-ETL integrations:
+ Isolation is managed at the database level.
+ Replication occurs at the database level. 
+ RDS for PostgreSQL databases are mapped to Amazon Redshift databases by name. If you rename a source database, data flows to the correspondingly renamed Amazon Redshift database.

Despite their similarities, Amazon Redshift and RDS for PostgreSQL have important differences. The following sections outline Amazon Redshift system responses for common DDL operations.

**Topics**
+ [

### Database operations
](#zero-etl.ddl-postgres-database)
+ [

### Schema operations
](#zero-etl.ddl-postgres-schema)
+ [

### Table operations
](#zero-etl.ddl-postgres-table)

### Database operations
<a name="zero-etl.ddl-postgres-database"></a>

The following table shows the system responses for database DDL operations.

[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/zero-etl.querying.html)

### Schema operations
<a name="zero-etl.ddl-postgres-schema"></a>

The following table shows the system responses for schema DDL operations.

[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/zero-etl.querying.html)

### Table operations
<a name="zero-etl.ddl-postgres-table"></a>

The following table shows the system responses for table DDL operations.

[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/zero-etl.querying.html)

# Viewing and monitoring Amazon RDS zero-ETL integrations
<a name="zero-etl.describingmonitoring"></a>

You can view the details of an Amazon RDS zero-ETL integration to see its configuration information and current status. You can also monitor the status of your integration by querying specific system views in Amazon Redshift. In addition, Amazon Redshift publishes certain integration-related metrics to Amazon CloudWatch, which you can view within the Amazon Redshift console.

**Topics**
+ [

## Viewing integrations
](#zero-etl.describing)
+ [

## Monitoring integrations using system tables for Amazon Redshift
](#zero-etl.monitoring)
+ [

## Monitoring integrations with Amazon EventBridge for Amazon Redshift
](#zero-etl.eventbridge)

## Viewing integrations
<a name="zero-etl.describing"></a>

You can view Amazon RDS zero-ETL integrations using the AWS Management Console, the AWS CLI, or the RDS API.

### Console
<a name="zero-etl.describing-console"></a>

**To view the details of a zero-ETL integration**

1. Sign in to the AWS Management Console and open the Amazon RDS console at [https://console.aws.amazon.com/rds/](https://console.aws.amazon.com/rds/).

1. From the left navigation pane, choose **Zero-ETL integrations**. 

1. Select an integration to view more details about it, such as its source database and target data warehouse.  
![\[Details about a zero-ETL integration\]](http://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/images/zero-etl-integration-view.png)

An integration can have the following statuses:
+ `Creating` – The integration is being created.
+ `Active` – The integration is sending transactional data to the target data warehouse.
+ `Syncing` – The integration has encountered a recoverable error and is reseeding data. Affected tables aren't available for querying until they finish resyncing.
+ `Needs attention` – The integration encountered an event or error that requires manual intervention to resolve it. To fix the issue, follow the instructions in the error message on the integration details page.
+ `Failed` – The integration encountered an unrecoverable event or error that can't be fixed. You must delete and recreate the integration.
+ `Deleting` – The integration is being deleted.
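The status list above implies a simple decision rule for operators, sketched below. The status strings come from the list; the action strings and function name are hypothetical, for illustration only.

```python
# Illustrative sketch: choosing a follow-up action from an integration
# status string. Statuses are taken from the list above; the returned
# action labels are hypothetical.

NO_ACTION_NEEDED = {"Creating", "Active", "Syncing", "Deleting"}

def next_action(status: str) -> str:
    if status in NO_ACTION_NEEDED:
        return "wait"                      # the integration progresses or recovers on its own
    if status == "Needs attention":
        return "follow error message"      # manual intervention required
    if status == "Failed":
        return "delete and recreate"       # unrecoverable
    raise ValueError(f"unknown status: {status}")

print(next_action("Syncing"))   # wait
print(next_action("Failed"))    # delete and recreate
```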

### AWS CLI
<a name="zero-etl.describing-cli"></a>

To view zero-ETL integrations in the current account using the AWS CLI, use the [describe-integrations](https://docs.aws.amazon.com/cli/latest/reference/rds/describe-integrations.html) command. To view a specific integration, specify the `--integration-identifier` option.

**Example**  
For Linux, macOS, or Unix:  

```
aws rds describe-integrations \
    --integration-identifier ee605691-6c47-48e8-8622-83f99b1af374
```
For Windows:  

```
aws rds describe-integrations ^
    --integration-identifier ee605691-6c47-48e8-8622-83f99b1af374
```

### RDS API
<a name="zero-etl.describing-api"></a>

To view a zero-ETL integration using the Amazon RDS API, use the [DescribeIntegrations](https://docs.aws.amazon.com/AmazonRDS/latest/APIReference/API_DescribeIntegrations.html) operation with the `IntegrationIdentifier` parameter.
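A typical follow-up is to read the integration's status out of the response. The sample response below is a trimmed, hypothetical illustration of the shape; consult the `DescribeIntegrations` API reference for the authoritative field names and values.

```python
# Illustrative sketch: extracting the status of one integration from a
# DescribeIntegrations-style response. The sample data is hypothetical.

sample_response = {
    "Integrations": [
        {"IntegrationName": "my-integration",
         "Status": "active"},
    ]
}

def integration_status(response, name):
    """Return the Status of the named integration, or None if absent."""
    for integ in response.get("Integrations", []):
        if integ.get("IntegrationName") == name:
            return integ.get("Status")
    return None

print(integration_status(sample_response, "my-integration"))  # active
```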

## Monitoring integrations using system tables for Amazon Redshift
<a name="zero-etl.monitoring"></a>

Amazon Redshift has system tables and views that contain information about how the system is functioning. You can query these system tables and views the same way that you would query any other database table. For more information about system tables and views in Amazon Redshift, see [System tables and views reference](https://docs.aws.amazon.com//redshift/latest/dg/cm_chap_system-tables.html) in the *Amazon Redshift Database Developer Guide*.

You can query the following system views and tables to get information about your zero-ETL integrations:
+ [SVV_INTEGRATION](https://docs.aws.amazon.com/redshift/latest/dg/r_SVV_INTEGRATION.html) – Provides configuration details for your integrations.
+ [SVV_INTEGRATION_TABLE_STATE](https://docs.aws.amazon.com/redshift/latest/dg/r_SVV_INTEGRATION_TABLE_STATE.html) – Describes the state of each table within an integration.
+ [SYS_INTEGRATION_TABLE_STATE_CHANGE](https://docs.aws.amazon.com/redshift/latest/dg/r_SYS_INTEGRATION_TABLE_STATE_CHANGE.html) – Displays table state change logs for an integration.
+ [SYS_INTEGRATION_ACTIVITY](https://docs.aws.amazon.com/redshift/latest/dg/r_SYS_INTEGRATION_ACTIVITY.html) – Provides information about completed integration runs.
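For example, after querying per-table state (for instance, `SELECT table_name, table_state FROM SVV_INTEGRATION_TABLE_STATE;`), you might summarize the results. The rows below are hypothetical sample data; the `ResyncRequired` state is described later in this guide, and the full set of state values is documented in the Amazon Redshift view reference.

```python
# Illustrative sketch: summarizing per-table integration state from
# (table_name, table_state) rows returned by a query against the
# SVV_INTEGRATION_TABLE_STATE view. The sample rows are hypothetical.
from collections import Counter

rows = [
    ("books_table", "Synced"),
    ("orders", "Synced"),
    ("audit_log", "ResyncRequired"),
]

def state_counts(rows):
    """Count how many tables are in each state."""
    return Counter(state for _, state in rows)

print(state_counts(rows))
```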

All integration-related Amazon CloudWatch metrics originate from Amazon Redshift. For more information, see [Metrics for zero-ETL integrations](https://docs.aws.amazon.com/redshift/latest/mgmt/zero-etl-using.monitoring.html) in the *Amazon Redshift Management Guide*. Currently, Amazon RDS doesn't publish any integration metrics to CloudWatch.

## Monitoring integrations with Amazon EventBridge for Amazon Redshift
<a name="zero-etl.eventbridge"></a>

Amazon Redshift sends integration-related events to Amazon EventBridge. For a list of events and their corresponding event IDs, see [Zero-ETL integration event notifications with Amazon EventBridge](https://docs.aws.amazon.com/redshift/latest/mgmt/integration-event-notifications) in the *Amazon Redshift Management Guide*.

# Modifying Amazon RDS zero-ETL integrations
<a name="zero-etl.modifying"></a>

You can modify only the name, description, and data filtering options for a zero-ETL integration in a supported data warehouse. You can't modify the AWS KMS key used to encrypt the integration, or the source or target databases.

If you add a data filter to an existing integration, Amazon RDS reevaluates the filter as if it had always existed, and removes any data currently in the target data warehouse that doesn't match the new filtering criteria. If you *remove* a data filter, Amazon RDS replicates into the target any data that previously didn't match the filtering criteria but now does. For more information, see [Data filtering for Amazon RDS zero-ETL integrations](zero-etl.filtering.md).
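The reevaluation behavior above can be sketched as follows: after any filter change, every table is checked against the new criteria from scratch. The wildcard matching below (fnmatch-style patterns over `database.table` names) is a simplified stand-in for the real filter grammar; see the data filtering documentation for the exact syntax.

```python
# Illustrative sketch of filter reevaluation: a table survives in the
# target only if it matches an include pattern and no exclude pattern.
# The pattern style is a simplified approximation of the real grammar.
import fnmatch

def reevaluate(tables, includes, excludes):
    """Return the tables that should exist in the target after a filter change."""
    kept = []
    for table in tables:
        included = any(fnmatch.fnmatch(table, p) for p in includes)
        excluded = any(fnmatch.fnmatch(table, p) for p in excludes)
        if included and not excluded:
            kept.append(table)
    return kept

tables = ["my_db.books_table", "my_db.audit_log", "other_db.orders"]
print(reevaluate(tables, includes=["my_db.*"], excludes=["my_db.audit_log"]))
# ['my_db.books_table']
```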

You can modify a zero-ETL integration using the AWS Management Console, the AWS CLI, or the Amazon RDS API.

## RDS console
<a name="modify-integration-console"></a>

**To modify a zero-ETL integration**

1. Sign in to the AWS Management Console and open the Amazon RDS console at [https://console.aws.amazon.com/rds/](https://console.aws.amazon.com/rds/).

1. In the navigation pane, choose **Zero-ETL integrations**, and then choose the integration that you want to modify. 

1. Choose **Modify** and make modifications to any available settings.

1. When all the changes are as you want them, choose **Modify**.

## AWS CLI
<a name="modify-integration-cli"></a>

To modify a zero-ETL integration using the AWS CLI, call the [modify-integration](https://docs.aws.amazon.com/cli/latest/reference/rds/modify-integration.html) command. Along with the `--integration-identifier`, specify any of the following options:
+ `--integration-name` – Specify a new name for the integration.
+ `--description` – Specify a new description for the integration.
+ `--data-filter` – Specify data filtering options for the integration. For more information, see [Data filtering for Amazon RDS zero-ETL integrations](zero-etl.filtering.md).

**Example**  
The following request modifies an existing integration.  
For Linux, macOS, or Unix:  

```
aws rds modify-integration \
    --integration-identifier ee605691-6c47-48e8-8622-83f99b1af374 \
    --integration-name my-renamed-integration
```
For Windows:  

```
aws rds modify-integration ^
    --integration-identifier ee605691-6c47-48e8-8622-83f99b1af374 ^
    --integration-name my-renamed-integration
```

## RDS API
<a name="modify-integration-api"></a>

To modify a zero-ETL integration using the RDS API, call the [ModifyIntegration](https://docs.aws.amazon.com/AmazonRDS/latest/APIReference/API_ModifyIntegration.html) operation. Specify the integration identifier, and the parameters that you want to modify.

# Deleting Amazon RDS zero-ETL integrations
<a name="zero-etl.deleting"></a>

When you delete a zero-ETL integration, Amazon RDS removes it from the source database. Your transactional data isn't deleted from Amazon RDS or the analytics destination, but Amazon RDS doesn't send new data to Amazon Redshift or Amazon SageMaker.

You can delete an integration only when it has a status of `Active`, `Failed`, `Syncing`, or `Needs attention`.

You can delete zero-ETL integrations using the AWS Management Console, the AWS CLI, or the RDS API.

## Console
<a name="zero-etl.deleting-console"></a>

**To delete a zero-ETL integration**

1. Sign in to the AWS Management Console and open the Amazon RDS console at [https://console.aws.amazon.com/rds/](https://console.aws.amazon.com/rds/).

1. From the left navigation pane, choose **Zero-ETL integrations**. 

1. Select the zero-ETL integration that you want to delete. 

1. Choose **Actions**, **Delete**, and confirm deletion.

## AWS CLI
<a name="zero-etl.deleting-cli"></a>

To delete a zero-ETL integration, use the [delete-integration](https://docs.aws.amazon.com/cli/latest/reference/rds/delete-integration.html) command and specify the `--integration-identifier` option.

**Example**  
For Linux, macOS, or Unix:  

```
aws rds delete-integration \
    --integration-identifier ee605691-6c47-48e8-8622-83f99b1af374
```
For Windows:  

```
aws rds delete-integration ^
    --integration-identifier ee605691-6c47-48e8-8622-83f99b1af374
```

## RDS API
<a name="zero-etl.deleting-api"></a>

To delete a zero-ETL integration using the Amazon RDS API, use the [DeleteIntegration](https://docs.aws.amazon.com/AmazonRDS/latest/APIReference/API_DeleteIntegration.html) operation with the `IntegrationIdentifier` parameter.

# Troubleshooting Amazon RDS zero-ETL integrations
<a name="zero-etl.troubleshooting"></a>

You can check the state of a zero-ETL integration by querying the [SVV_INTEGRATION](https://docs.aws.amazon.com/redshift/latest/dg/r_SVV_INTEGRATION.html) system table in the analytics destination. If the `state` column has a value of `ErrorState`, the integration has encountered an error. For more information, see [Monitoring integrations using system tables for Amazon Redshift](zero-etl.describingmonitoring.md#zero-etl.monitoring).

Use the following information to troubleshoot common issues with Amazon RDS zero-ETL integrations.

**Important**  
Refresh and resync operations aren't available for zero-ETL integrations with an Amazon SageMaker AI lakehouse. If an integration has issues, you must delete it and create a new one.

**Topics**
+ [

## I can't create a zero-ETL integration
](#zero-etl.troubleshooting.creation)
+ [

## My integration is stuck in a state of `Syncing`
](#zero-etl.troubleshooting.syncing)
+ [

## My tables aren't replicating to Amazon Redshift
](#zero-etl.troubleshooting.primarykey)
+ [

## One or more of my Amazon Redshift tables requires a resync
](#zero-etl.troubleshooting.resync)
+ [

## Integration failed issues for Amazon SageMaker AI lakehouse zero-ETL integrations
](#zero-etl.troubleshooting.integration-issues)

## I can't create a zero-ETL integration
<a name="zero-etl.troubleshooting.creation"></a>

If you can't create a zero-ETL integration, make sure that the following are correct for your source database:
+ Your source database must be running a supported DB engine version. For a list of supported versions, see [Supported Regions and DB engines for Amazon RDS zero-ETL integrations](Concepts.RDS_Fea_Regions_DB-eng.Feature.ZeroETL.md).
+ You correctly configured the required DB parameters. If the required parameters are set incorrectly or not associated with the database, creation fails. See [Step 1: Create a custom DB parameter group](zero-etl.setting-up.md#zero-etl.parameters).

In addition, make sure the following are correct for your target data warehouse:
+ Case sensitivity is enabled. See [Turn on case sensitivity for your data warehouse](https://docs.aws.amazon.com/redshift/latest/mgmt/zero-etl-using.setting-up.html#zero-etl-setting-up.case-sensitivity).
+ You added the correct authorized principal and integration source. See [Configure authorization for your Amazon Redshift data warehouse](https://docs.aws.amazon.com/redshift/latest/mgmt/zero-etl-using.setting-up.html#zero-etl-using.redshift-iam).
+ The data warehouse is encrypted (if it's a provisioned cluster). See [Amazon Redshift database encryption](https://docs.aws.amazon.com/redshift/latest/mgmt/working-with-db-encryption.html).

## My integration is stuck in a state of `Syncing`
<a name="zero-etl.troubleshooting.syncing"></a>

Your integration might consistently show a status of `Syncing` if you change the value of one of the required DB parameters.

To fix this issue, check the values of the parameters in the parameter group associated with the source database, and make sure that they match the required values. For more information, see [Step 1: Create a custom DB parameter group](zero-etl.setting-up.md#zero-etl.parameters).

If you modify any parameters, make sure to reboot the database to apply the changes.
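The check described above amounts to comparing the current parameter values against the required ones. The sketch below illustrates this; the parameter names and required values shown are examples for RDS for MySQL, and the authoritative list is in "Step 1: Create a custom DB parameter group."

```python
# Illustrative sketch: finding DB parameter values that don't match the
# required values, which can leave an integration stuck in `Syncing`.
# The required values below are examples; see the setup documentation
# for the authoritative list for your engine.

REQUIRED = {"binlog_format": "ROW", "binlog_row_image": "full"}

def find_mismatches(current):
    """Return {parameter: (current_value, required_value)} for every wrong value."""
    return {name: (current.get(name), want)
            for name, want in REQUIRED.items()
            if current.get(name) != want}

current = {"binlog_format": "MIXED", "binlog_row_image": "full"}
print(find_mismatches(current))
# {'binlog_format': ('MIXED', 'ROW')}
```

If this check reports mismatches, correct the parameter group and reboot the source database so the changes take effect.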

## My tables aren't replicating to Amazon Redshift
<a name="zero-etl.troubleshooting.primarykey"></a>

If you don't see one or more tables reflected in Amazon Redshift, you can run the following command to resynchronize them:

```
ALTER DATABASE dbname INTEGRATION REFRESH TABLES table1, table2;
```

For more information, see [ALTER DATABASE](https://docs.aws.amazon.com/redshift/latest/dg/r_ALTER_DATABASE.html) in the Amazon Redshift SQL reference.

Your data might not be replicating because one or more of your source tables doesn't have a primary key. The monitoring dashboard in Amazon Redshift displays the status of these tables as `Failed`, and the status of the overall zero-ETL integration changes to `Needs attention`. To resolve this issue, you can identify an existing key in your table that can become a primary key, or you can add a synthetic primary key. For detailed solutions, see [Handle tables without primary keys while creating Amazon Aurora MySQL or Amazon RDS for MySQL zero-ETL integrations with Amazon Redshift](https://aws.amazon.com/blogs/database/handle-tables-without-primary-keys-while-creating-amazon-aurora-mysql-or-amazon-rds-for-mysql-zero-etl-integrations-with-amazon-redshift/). 

## One or more of my Amazon Redshift tables requires a resync
<a name="zero-etl.troubleshooting.resync"></a>

Running certain commands on your source database might require your tables to be resynchronized. In these cases, the [SVV_INTEGRATION_TABLE_STATE](https://docs.aws.amazon.com/redshift/latest/dg/r_SVV_INTEGRATION_TABLE_STATE.html) system view shows a `table_state` of `ResyncRequired`, which means that the integration must completely reload data for that specific table from MySQL to Amazon Redshift.

When the table starts to resynchronize, it enters a state of `Syncing`. You don't need to take any manual action to resynchronize a table. While table data is resynchronizing, you can't access it in Amazon Redshift.

The following are some example operations that can put a table into a `ResyncRequired` state, and possible alternatives to consider.


| Operation | Example | Alternative | 
| --- | --- | --- | 
| Adding a column into a specific position  |  <pre>ALTER TABLE table_name<br />  ADD COLUMN column_name INTEGER<br />  NOT NULL first;</pre>  | Amazon Redshift doesn't support adding columns into specific positions using the `first` or `after` keywords. If the order of columns in the target table isn't critical, add the column to the end of the table using a simpler command:<pre>ALTER TABLE table_name<br />  ADD COLUMN column_name column_type;</pre> | 
| Adding a timestamp column with the default CURRENT_TIMESTAMP |  <pre>ALTER TABLE table_name<br />  ADD COLUMN column_name TIMESTAMP<br />  NOT NULL DEFAULT CURRENT_TIMESTAMP;</pre>  | The CURRENT_TIMESTAMP value for existing table rows is calculated by RDS for MySQL and can't be simulated in Amazon Redshift without a full table data resynchronization. If possible, switch the default value to a literal constant like `2023-01-01 00:00:15` to avoid latency in table availability. | 
| Performing multiple column operations within a single command |  <pre>ALTER TABLE table_name<br />  ADD COLUMN column_1,<br />  RENAME COLUMN column_2 TO column_3;</pre>  | Consider splitting the command into two separate operations, ADD and RENAME, which won't require resynchronization. | 

## Integration failed issues for Amazon SageMaker AI lakehouse zero-ETL integrations
<a name="zero-etl.troubleshooting.integration-issues"></a>

If you encounter issues with an existing zero-ETL integration with an Amazon SageMaker AI lakehouse, the only resolution is to delete the integration and create a new one. Unlike zero-ETL integrations with Amazon Redshift, these integrations don't support refresh or resync operations.

To resolve integration issues:

1. Delete the problematic zero-ETL integration using the console, CLI, or API.

1. Verify that the source database and target data warehouse configurations are correct.

1. Create a new zero-ETL integration with the same or updated configuration.

This process completely reinitializes the data pipeline, which can take time depending on the size of your source database.