

# Working with AWS DMS endpoints
<a name="CHAP_Endpoints"></a>

An endpoint provides connection, data store type, and location information about your data store. AWS Database Migration Service uses this information to connect to a data store and migrate data from a source endpoint to a target endpoint. You can specify additional connection attributes for an endpoint by using endpoint settings. These settings can control logging, file size, and other parameters; for more information about endpoint settings, see the documentation section for your data store. 

Following, you can find out more details about endpoints.

**Topics**
+ [Creating source and target endpoints](CHAP_Endpoints.Creating.md)
+ [Sources for data migration](CHAP_Source.md)
+ [Targets for data migration](CHAP_Target.md)
+ [Configuring VPC endpoints for AWS DMS](CHAP_VPC_Endpoints.md)
+ [DDL statements supported by AWS DMS](CHAP_Introduction.SupportedDDL.md)
+ [Advanced endpoint configuration](CHAP_Advanced.Endpoints.md)

# Creating source and target endpoints
<a name="CHAP_Endpoints.Creating"></a>

You can create source and target endpoints when you create your replication instance or you can create endpoints after your replication instance is created. The source and target data stores can be on an Amazon Elastic Compute Cloud (Amazon EC2) instance, an Amazon Relational Database Service (Amazon RDS) DB instance, or an on-premises database. (Note that one of your endpoints must be on an AWS service. You can't use AWS DMS to migrate from an on-premises database to another on-premises database.)

The following procedure assumes that you have chosen the AWS DMS console wizard. Note that you can also do this step by selecting **Endpoints** from the AWS DMS console's navigation pane and then selecting **Create endpoint**. When using the console wizard, you create both the source and target endpoints on the same page. When not using the console wizard, you create each endpoint separately.

**To specify source or target database endpoints using the AWS console**

1. On the **Connect source and target database endpoints** page, specify your connection information for the source or target database. The following table describes the settings.    
[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/dms/latest/userguide/CHAP_Endpoints.Creating.html)

   The following table lists the unsupported characters in endpoint passwords and Secrets Manager secrets for the listed database engines. If you want to use commas (,) in your endpoint passwords, use the Secrets Manager support in AWS DMS to authenticate access to your AWS DMS instances. For more information, see [Using secrets to access AWS Database Migration Service endpoints](security_iam_secretsmanager.md).    
[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/dms/latest/userguide/CHAP_Endpoints.Creating.html)

1. Choose **Endpoint settings** and **AWS KMS key** if you need them. You can test the endpoint connection by choosing **Run test**. The following table describes the settings.    
[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/dms/latest/userguide/CHAP_Endpoints.Creating.html)

# Using IAM authentication for Amazon RDS endpoint in AWS DMS
<a name="CHAP_Endpoints.Creating.IAMRDS"></a>

AWS Identity and Access Management (IAM) database authentication provides enhanced security for your Amazon RDS databases by managing database access through AWS IAM credentials. Instead of using traditional database passwords, IAM authentication generates short-lived authentication tokens, valid for 15 minutes, using AWS credentials. This approach significantly improves security by eliminating the need to store database passwords in application code, reducing the risk of credential exposure, and providing centralized access management through IAM. It also simplifies access management by leveraging existing AWS IAM roles and policies, enabling you to control database access using the same IAM framework you use for other AWS services.

AWS DMS now supports IAM authentication for replication instances running DMS version 3.6.1 or later when connecting to MySQL, PostgreSQL, Aurora PostgreSQL, Aurora MySQL, or MariaDB endpoints on Amazon RDS. When creating a new endpoint for these engines, you can select IAM authentication and specify an IAM role instead of providing database credentials. This integration enhances security by eliminating the need to manage and store database passwords for your migration tasks.

## Configuring IAM authentication for Amazon RDS endpoint in AWS DMS
<a name="CHAP_Endpoints.Creating.IAMRDS.config"></a>

When creating an endpoint, you can configure IAM authentication for your Amazon RDS database. To configure IAM authentication, do the following:

### DMS Console
<a name="CHAP_Endpoints.Creating.IAMRDS.console"></a>

1. Ensure that IAM authentication is enabled for the Amazon RDS instance and the database user. For more information, see [Enabling and disabling IAM database authentication](https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/UsingWithRDS.IAMDBAuth.Enabling.html) in the *Amazon Relational Database Service User Guide*. 

1. In the IAM console, create an IAM role with the following permissions policy and trust policy:

   Policy:

   ```
   {
       "Version": "2012-10-17",
       "Statement": [
           {
               "Sid": "VisualEditor0",
               "Effect": "Allow",
               "Action": [
                   "rds-db:connect"
               ],
               "Resource": "*"
           }
       ]
   }
   ```

   Trust policy:

   ```
   {
       "Version": "2012-10-17",
       "Statement": [
           {
               "Sid": "",
               "Effect": "Allow",
               "Principal": {
                   "Service": [
                       "dms.amazonaws.com"
                   ]
               },
               "Action": "sts:AssumeRole"
           }
       ]
   }
   ```

1. During endpoint configuration in the [AWS DMS console](https://console.aws.amazon.com/dms/v2), navigate to the **Access to endpoint database** section and select **IAM authentication**.

1. In the **IAM role for RDS database authentication** dropdown menu, select the IAM role with appropriate permissions to access the database.

    For more information, see [Creating source and target endpoints](https://docs.aws.amazon.com/dms/latest/userguide/CHAP_Endpoints.Creating.html).

### AWS CLI
<a name="CHAP_Endpoints.Creating.IAMRDS.awscli"></a>

1. Ensure that IAM authentication is enabled for the Amazon RDS instance and the database user. For more information, see [Enabling and disabling IAM database authentication](https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/UsingWithRDS.IAMDBAuth.Enabling.html) in the *Amazon Relational Database Service User Guide*. 

1. Using the AWS CLI, create an IAM role and allow AWS DMS to assume the role:

   Policy:

   ```
   {
       "Version": "2012-10-17",
       "Statement": [
           {
               "Sid": "VisualEditor0",
               "Effect": "Allow",
               "Action": [
                   "rds-db:connect"
               ],
               "Resource": "*"
           }
       ]
   }
   ```

   Trust policy:

   ```
   {
       "Version": "2012-10-17",
       "Statement": [
           {
               "Sid": "",
               "Effect": "Allow",
               "Principal": {
                   "Service": [
                       "dms.amazonaws.com"
                   ]
               },
               "Action": "sts:AssumeRole"
           }
       ]
   }
   ```

1. Download the certificate bundle (PEM file) and run the following command to import it into AWS DMS. For more information, see [Download certificate bundles for Amazon RDS](https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/UsingWithRDS.SSL.html#UsingWithRDS.SSL.CertificatesDownload) in the *Amazon Relational Database Service User Guide*. 

   ```
   aws dms import-certificate --certificate-identifier rdsglobal --certificate-pem file://~/global-bundle.pem
   ```

1. Run the following commands to create an IAM endpoint:
   + For PostgreSQL or Aurora PostgreSQL endpoints (when `sslmode` is set to `required`, the `--certificate-arn` parameter is not required): 

     ```
      aws dms create-endpoint --endpoint-identifier <endpoint-name> --endpoint-type <source/target> --engine-name <postgres/aurora-postgres> --username <db username with iam auth privileges> --server-name <db server name> --port <port number> --ssl-mode <required/verify-ca/verify-full> --postgre-sql-settings "{\"ServiceAccessRoleArn\": \"role arn created from step 2 providing permissions for iam authentication\", \"AuthenticationMethod\": \"iam\", \"DatabaseName\": \"database name\"}" --certificate-arn <if sslmode is verify-ca/verify-full, use cert arn generated in step 3; otherwise this parameter is not required>
     ```
   + For MySQL, MariaDB, or Aurora MySQL endpoints: 

     ```
     aws dms create-endpoint --endpoint-identifier <endpoint-name> --endpoint-type <source/target> --engine-name <mysql/mariadb/aurora> --username <db username with iam auth privileges> --server-name <db server name> --port <port number> --ssl-mode <verify-ca/verify-full> --my-sql-settings "{\"ServiceAccessRoleArn\": \"role arn created from step 2 providing permissions for iam authentication\", \"AuthenticationMethod\": \"iam\", \"DatabaseName\": \"database name\"}" --certificate-arn <cert arn from previously imported cert in step 3>
     ```

1. Run a test connection against your desired replication instance to create the instance endpoint association and verify everything is set up correctly: 

   ```
   aws dms test-connection --replication-instance-arn <replication instance arn> --endpoint-arn <endpoint arn from previously created endpoint in step 4>
   ```
**Note**  
When using IAM authentication, the replication instance provided in `test-connection` must be on AWS DMS version 3.6.1 or later.
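The escaped JSON strings passed to `--postgre-sql-settings` and `--my-sql-settings` in the commands above are easy to get wrong by hand. As a sketch, you can build them with a JSON serializer and pass the result to the CLI or SDK; the role ARN and database name below are placeholder values.

```python
import json

# Placeholder values for illustration; substitute your own role ARN and database.
settings = {
    "ServiceAccessRoleArn": "arn:aws:iam::123456789012:role/dms-iam-auth-role",
    "AuthenticationMethod": "iam",
    "DatabaseName": "mydb",
}

# Serialize once; the CLI expects this object as a single JSON string
# (the value of --postgre-sql-settings or --my-sql-settings).
settings_json = json.dumps(settings)
print(settings_json)
```

Serializing programmatically avoids hand-placed backslash escapes, which differ between shells and are a common source of `create-endpoint` failures.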

## Limitations
<a name="CHAP_Endpoints.Creating.IAMRDS.Limitations"></a>

AWS DMS has the following limitations when using IAM authentication with an Amazon RDS endpoint:
+ Currently, Amazon RDS PostgreSQL and Amazon Aurora PostgreSQL instances do not support CDC connections with IAM authentication. For more information, see [Limitations for IAM database authentication](https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/UsingWithRDS.IAMDBAuth.html#UsingWithRDS.IAMDBAuth.Limitations) in the *Amazon Relational Database Service User Guide*.

# Sources for data migration
<a name="CHAP_Source"></a>

AWS Database Migration Service (AWS DMS) can use many of the most popular data engines as a source for data replication. The database source can be a self-managed engine running on an Amazon EC2 instance or an on-premises database. Or it can be a data source on an AWS service such as Amazon RDS or Amazon S3.

For a comprehensive list of valid sources, see [Sources for AWS DMS](CHAP_Introduction.Sources.md#CHAP_Introduction.Sources.title).

**Topics**
+ [Using an Oracle database as a source for AWS DMS](CHAP_Source.Oracle.md)
+ [Using a Microsoft SQL Server database as a source for AWS DMS](CHAP_Source.SQLServer.md)
+ [Using Microsoft Azure SQL database as a source for AWS DMS](CHAP_Source.AzureSQL.md)
+ [Using Microsoft Azure SQL Managed Instance as a source for AWS DMS](CHAP_Source.AzureMgd.md)
+ [Using Microsoft Azure Database for PostgreSQL flexible server as a source for AWS DMS](CHAP_Source.AzureDBPostgreSQL.md)
+ [Using Microsoft Azure Database for MySQL flexible server as a source for AWS DMS](CHAP_Source.AzureDBMySQL.md)
+ [Using OCI MySQL Heatwave as a source for AWS DMS](CHAP_Source.heatwave.md)
+ [Using Google Cloud for MySQL as a source for AWS DMS](CHAP_Source.GC.md)
+ [Using Google Cloud for PostgreSQL as a source for AWS DMS](CHAP_Source.GCPostgres.md)
+ [Using a PostgreSQL database as an AWS DMS source](CHAP_Source.PostgreSQL.md)
+ [Using a MySQL-compatible database as a source for AWS DMS](CHAP_Source.MySQL.md)
+ [Using an SAP ASE database as a source for AWS DMS](CHAP_Source.SAP.md)
+ [Using MongoDB as a source for AWS DMS](CHAP_Source.MongoDB.md)
+ [Using Amazon DocumentDB (with MongoDB compatibility) as a source for AWS DMS](CHAP_Source.DocumentDB.md)
+ [Using Amazon S3 as a source for AWS DMS](CHAP_Source.S3.md)
+ [Using IBM Db2 for Linux, Unix, Windows, and Amazon RDS database (Db2 LUW) as a source for AWS DMS](CHAP_Source.DB2.md)
+ [Using IBM Db2 for z/OS databases as a source for AWS DMS](CHAP_Source.DB2zOS.md)

# Using an Oracle database as a source for AWS DMS
<a name="CHAP_Source.Oracle"></a>

You can migrate data from one or many Oracle databases using AWS DMS. With an Oracle database as a source, you can migrate data to any of the targets supported by AWS DMS.

AWS DMS supports the following Oracle database editions:
+ Oracle Enterprise Edition
+ Oracle Standard Edition
+ Oracle Express Edition
+ Oracle Personal Edition

For information about versions of Oracle databases that AWS DMS supports as a source, see [Sources for AWS DMS](CHAP_Introduction.Sources.md).

You can use Secure Sockets Layer (SSL) to encrypt connections between your Oracle endpoint and your replication instance. For more information on using SSL with an Oracle endpoint, see [SSL support for an Oracle endpoint](#CHAP_Security.SSL.Oracle).

AWS DMS supports the use of Oracle transparent data encryption (TDE) to encrypt data at rest in the source database. For more information on using Oracle TDE with an Oracle source endpoint, see [Supported encryption methods for using Oracle as a source for AWS DMS](#CHAP_Source.Oracle.Encryption).

AWS DMS supports the use of TLS version 1.2 and later with Oracle endpoints (and all other endpoint types), and recommends using TLS version 1.3 or later.

Follow these steps to configure an Oracle database as an AWS DMS source endpoint:

1. Create an Oracle user with the appropriate permissions for AWS DMS to access your Oracle source database.

1. Create an Oracle source endpoint that conforms with your chosen Oracle database configuration. To create a full-load-only task, no further configuration is needed.

1. To create a task that handles change data capture (a CDC-only or full-load and CDC task), choose Oracle LogMiner or AWS DMS Binary Reader to capture data changes. Choosing LogMiner or Binary Reader determines some of the later permissions and configuration options. For a comparison of LogMiner and Binary Reader, see the following section.

**Note**  
For more information on full-load tasks, CDC-only tasks, and full-load and CDC tasks, see [Creating a task](CHAP_Tasks.Creating.md).

For additional details on working with Oracle source databases and AWS DMS, see the following sections. 

**Topics**
+ [Using Oracle LogMiner or AWS DMS Binary Reader for CDC](#CHAP_Source.Oracle.CDC)
+ [Workflows for configuring a self-managed or AWS-managed Oracle source database for AWS DMS](#CHAP_Source.Oracle.Workflows)
+ [Working with a self-managed Oracle database as a source for AWS DMS](#CHAP_Source.Oracle.Self-Managed)
+ [Working with an AWS-managed Oracle database as a source for AWS DMS](#CHAP_Source.Oracle.Amazon-Managed)
+ [Limitations on using Oracle as a source for AWS DMS](#CHAP_Source.Oracle.Limitations)
+ [SSL support for an Oracle endpoint](#CHAP_Security.SSL.Oracle)
+ [Supported encryption methods for using Oracle as a source for AWS DMS](#CHAP_Source.Oracle.Encryption)
+ [Supported compression methods for using Oracle as a source for AWS DMS](#CHAP_Source.Oracle.Compression)
+ [Replicating nested tables using Oracle as a source for AWS DMS](#CHAP_Source.Oracle.NestedTables)
+ [Storing REDO on Oracle ASM when using Oracle as a source for AWS DMS](#CHAP_Source.Oracle.REDOonASM)
+ [Endpoint settings when using Oracle as a source for AWS DMS](#CHAP_Source.Oracle.ConnectionAttrib)
+ [Source data types for Oracle](#CHAP_Source.Oracle.DataTypes)

## Using Oracle LogMiner or AWS DMS Binary Reader for CDC
<a name="CHAP_Source.Oracle.CDC"></a>

In AWS DMS, there are two methods for reading the redo logs when doing change data capture (CDC) for Oracle as a source: Oracle LogMiner and AWS DMS Binary Reader. LogMiner is an Oracle API to read the online redo logs and archived redo log files. Binary Reader is an AWS DMS method that reads and parses the raw redo log files directly. These methods have the following features.


| Feature | LogMiner | Binary Reader | 
| --- | --- | --- | 
| Easy to configure | Yes | No | 
| Lower impact on source system I/O and CPU | No | Yes | 
| Better CDC performance | No | Yes | 
| Supports Oracle table clusters | Yes | No | 
| Supports all types of Oracle Hybrid Columnar Compression (HCC) | Yes | Partially. Binary Reader does not support QUERY LOW for tasks with CDC. All other HCC types are fully supported. | 
| LOB column support in Oracle 12c only | No (LOB Support is not available with LogMiner in Oracle 12c.) | Yes | 
| Supports UPDATE statements that affect only LOB columns | No | Yes | 
| Supports Oracle transparent data encryption (TDE) | Partially. When using Oracle LogMiner, AWS DMS does not support column-level TDE encryption for Amazon RDS for Oracle. | Partially. Binary Reader supports TDE only for self-managed Oracle databases. | 
| Supports all Oracle compression methods | Yes | No | 
| Supports XA transactions | No | Yes | 
| Supports Oracle Real Application Clusters (RAC) | Yes. Not recommended, due to performance reasons and some internal DMS limitations. | Yes. Highly recommended. | 

**Note**  
By default, AWS DMS uses Oracle LogMiner for change data capture (CDC).   
AWS DMS supports transparent data encryption (TDE) methods when working with an Oracle source database. If the TDE credentials you specify are incorrect, the AWS DMS migration task does not fail, which can impact ongoing replication of encrypted tables. For more information about specifying TDE credentials, see [Supported encryption methods for using Oracle as a source for AWS DMS](#CHAP_Source.Oracle.Encryption).

The main advantages of using LogMiner with AWS DMS include the following:
+ LogMiner supports most Oracle options, such as encryption options and compression options. Binary Reader does not support all Oracle options, particularly compression and most options for encryption.
+ LogMiner offers a simpler configuration, especially compared to Binary Reader direct-access setup or when the redo logs are managed using Oracle Automatic Storage Management (ASM).
+ LogMiner supports table clusters for use by AWS DMS. Binary Reader does not.

The main advantages of using Binary Reader with AWS DMS include the following:
+ For migrations with a high volume of changes, LogMiner might have some I/O or CPU impact on the computer hosting the Oracle source database. Binary Reader has less chance of having I/O or CPU impact because logs are mined directly rather than making multiple database queries.
+ For migrations with a high volume of changes, CDC performance is usually much better when using Binary Reader compared with using Oracle LogMiner.
+ Binary Reader supports CDC for LOBs in Oracle version 12c. LogMiner does not.

In general, use Oracle LogMiner for migrating your Oracle database unless you have one of the following situations:
+ You need to run several migration tasks on the source Oracle database.
+ The volume of changes or the redo log volume on the source Oracle database is high, or you need to capture changes while the redo logs are managed with Oracle ASM.

**Note**  
If you change between using Oracle LogMiner and AWS DMS Binary Reader, make sure to restart the CDC task. 

### Configuration for CDC on an Oracle source database
<a name="CHAP_Source.Oracle.CDC.Configuration"></a>

For an Oracle source endpoint to connect to the database for a change data capture (CDC) task, you might need to specify extra connection attributes. This can be true for either a full-load and CDC task or for a CDC-only task. The extra connection attributes that you specify depend on the method you use to access the redo logs: Oracle LogMiner or AWS DMS Binary Reader. 

You specify extra connection attributes when you create a source endpoint. If you have multiple connection attribute settings, separate them from each other by semicolons with no additional white space (for example, `oneSetting;thenAnother`).

AWS DMS uses LogMiner by default. You don't have to specify additional extra connection attributes to use it. 

To use Binary Reader to access the redo logs, add the following extra connection attributes.

```
useLogMinerReader=N;useBfile=Y;
```

Use the following format for the extra connection attributes to access a server that uses ASM with Binary Reader.

```
useLogMinerReader=N;useBfile=Y;asm_user=asm_username;asm_server=RAC_server_ip_address:port_number/+ASM;
```

Set the source endpoint `Password` request parameter to both the Oracle user password and the ASM password, separated by a comma as follows.

```
oracle_user_password,asm_user_password
```
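Because these attributes are joined with semicolons and no extra whitespace, and the ASM password is appended to the Oracle password with a comma, it can help to assemble both strings programmatically before pasting them into the endpoint definition. A minimal sketch follows; all user names, server addresses, and passwords are placeholders.

```python
# Placeholder attribute values for illustration.
attrs = {
    "useLogMinerReader": "N",
    "useBfile": "Y",
    "asm_user": "asm_username",
    "asm_server": "10.0.0.10:1521/+ASM",
}

# Join as key=value pairs separated by semicolons, with a trailing
# semicolon and no whitespace, matching the documented format.
extra_connection_attributes = ";".join(f"{k}={v}" for k, v in attrs.items()) + ";"

# The endpoint Password field carries both passwords, comma-separated:
# the Oracle user password first, then the ASM password.
password = ",".join(["oracle_user_password", "asm_user_password"])

print(extra_connection_attributes)
print(password)
```

Building the strings this way makes it harder to introduce stray spaces, which would otherwise be sent to the server as part of an attribute value.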

Where the Oracle source uses ASM, you can work with high-performance options in Binary Reader for transaction processing at scale. These options include extra connection attributes to specify the number of parallel threads (`parallelASMReadThreads`) and the number of read-ahead buffers (`readAheadBlocks`). Setting these attributes together can significantly improve the performance of the CDC task. The following settings provide good results for most ASM configurations.

```
useLogMinerReader=N;useBfile=Y;asm_user=asm_username;asm_server=RAC_server_ip_address:port_number/+ASM;parallelASMReadThreads=6;readAheadBlocks=150000;
```

For more information on values that extra connection attributes support, see [Endpoint settings when using Oracle as a source for AWS DMS](#CHAP_Source.Oracle.ConnectionAttrib).

In addition, the performance of a CDC task with an Oracle source that uses ASM depends on other settings that you choose. These settings include your AWS DMS extra connection attributes and the SQL settings to configure the Oracle source. For more information on extra connection attributes for an Oracle source using ASM, see [Endpoint settings when using Oracle as a source for AWS DMS](#CHAP_Source.Oracle.ConnectionAttrib).

You also need to choose an appropriate CDC start point. Typically when you do this, you want to identify the point of transaction processing that captures the earliest open transaction to begin CDC from. Otherwise, the CDC task can miss earlier open transactions. For an Oracle source database, you can choose a CDC native start point based on the Oracle system change number (SCN) to identify this earliest open transaction. For more information, see [Performing replication starting from a CDC start point](CHAP_Task.CDC.md#CHAP_Task.CDC.StartPoint).

For more information on configuring CDC for a self-managed Oracle database as a source, see [Account privileges required when using Oracle LogMiner to access the redo logs](#CHAP_Source.Oracle.Self-Managed.LogMinerPrivileges), [Account privileges required when using AWS DMS Binary Reader to access the redo logs](#CHAP_Source.Oracle.Self-Managed.BinaryReaderPrivileges), and [Additional account privileges required when using Binary Reader with Oracle ASM](#CHAP_Source.Oracle.Self-Managed.ASMBinaryPrivileges).

For more information on configuring CDC for an AWS-managed Oracle database as a source, see [Configuring a CDC task to use Binary Reader with an RDS for Oracle source for AWS DMS](#CHAP_Source.Oracle.Amazon-Managed.CDC) and [Using an Amazon RDS Oracle Standby (read replica) as a source with Binary Reader for CDC in AWS DMS](#CHAP_Source.Oracle.Amazon-Managed.StandBy).

## Workflows for configuring a self-managed or AWS-managed Oracle source database for AWS DMS
<a name="CHAP_Source.Oracle.Workflows"></a>

To configure a self-managed Oracle source database instance, use the following workflow steps, depending on how you perform CDC. 


| For this workflow step | If you perform CDC using LogMiner, do this | If you perform CDC using Binary Reader, do this | 
| --- | --- | --- | 
| Grant Oracle account privileges. | See [User account privileges required on a self-managed Oracle source for AWS DMS](#CHAP_Source.Oracle.Self-Managed.Privileges). | See [User account privileges required on a self-managed Oracle source for AWS DMS](#CHAP_Source.Oracle.Self-Managed.Privileges). | 
| Prepare the source database for replication using CDC. | See [Preparing an Oracle self-managed source database for CDC using AWS DMS](#CHAP_Source.Oracle.Self-Managed.Configuration). | See [Preparing an Oracle self-managed source database for CDC using AWS DMS](#CHAP_Source.Oracle.Self-Managed.Configuration). | 
| Grant additional Oracle user privileges required for CDC. | See [Account privileges required when using Oracle LogMiner to access the redo logs](#CHAP_Source.Oracle.Self-Managed.LogMinerPrivileges). | See [Account privileges required when using AWS DMS Binary Reader to access the redo logs](#CHAP_Source.Oracle.Self-Managed.BinaryReaderPrivileges). | 
| For an Oracle instance with ASM, grant additional user account privileges required to access ASM for CDC. | No additional action. AWS DMS supports Oracle ASM without additional account privileges. | See [Additional account privileges required when using Binary Reader with Oracle ASM](#CHAP_Source.Oracle.Self-Managed.ASMBinaryPrivileges). | 
| If you haven't already done so, configure the task to use LogMiner or Binary Reader for CDC. | See [Using Oracle LogMiner or AWS DMS Binary Reader for CDC](#CHAP_Source.Oracle.CDC). | See [Using Oracle LogMiner or AWS DMS Binary Reader for CDC](#CHAP_Source.Oracle.CDC). | 
| Configure Oracle Standby as a source for CDC. | AWS DMS does not support Oracle Standby as a source. | See [Using a self-managed Oracle Standby as a source with Binary Reader for CDC in AWS DMS](#CHAP_Source.Oracle.Self-Managed.BinaryStandby). | 

Use the following workflow steps to configure an AWS-managed Oracle source database instance.


| For this workflow step | If you perform CDC using LogMiner, do this | If you perform CDC using Binary Reader, do this | 
| --- | --- | --- | 
| Grant Oracle account privileges. | For more information, see [User account privileges required on an AWS-managed Oracle source for AWS DMS](#CHAP_Source.Oracle.Amazon-Managed.Privileges). | For more information, see [User account privileges required on an AWS-managed Oracle source for AWS DMS](#CHAP_Source.Oracle.Amazon-Managed.Privileges). | 
| Prepare the source database for replication using CDC. | For more information, see [Configuring an AWS-managed Oracle source for AWS DMS](#CHAP_Source.Oracle.Amazon-Managed.Configuration). | For more information, see [Configuring an AWS-managed Oracle source for AWS DMS](#CHAP_Source.Oracle.Amazon-Managed.Configuration). | 
| Grant additional Oracle user privileges required for CDC. | No additional account privileges are required. | For more information, see [Configuring a CDC task to use Binary Reader with an RDS for Oracle source for AWS DMS](#CHAP_Source.Oracle.Amazon-Managed.CDC). | 
| If you haven't already done so, configure the task to use LogMiner or Binary Reader for CDC. | For more information, see [Using Oracle LogMiner or AWS DMS Binary Reader for CDC](#CHAP_Source.Oracle.CDC). | For more information, see [Using Oracle LogMiner or AWS DMS Binary Reader for CDC](#CHAP_Source.Oracle.CDC). | 
| Configure Oracle Standby as a source for CDC. | AWS DMS does not support Oracle Standby as a source. | For more information, see [Using an Amazon RDS Oracle Standby (read replica) as a source with Binary Reader for CDC in AWS DMS](#CHAP_Source.Oracle.Amazon-Managed.StandBy). | 

## Working with a self-managed Oracle database as a source for AWS DMS
<a name="CHAP_Source.Oracle.Self-Managed"></a>

A *self-managed database* is a database that you configure and control, either a local on-premises database instance or a database on Amazon EC2. Following, you can find out about the privileges and configurations you need when using a self-managed Oracle database with AWS DMS.

### User account privileges required on a self-managed Oracle source for AWS DMS
<a name="CHAP_Source.Oracle.Self-Managed.Privileges"></a>

To use an Oracle database as a source in AWS DMS, grant the following privileges to the Oracle user specified in the Oracle endpoint connection settings.

**Note**  
When granting privileges, use the actual name of objects, not the synonym for each object. For example, use `V_$OBJECT` including the underscore, not `V$OBJECT` without the underscore.

```
GRANT CREATE SESSION TO dms_user;
GRANT SELECT ANY TRANSACTION TO dms_user;
GRANT SELECT ON V_$ARCHIVED_LOG TO dms_user;
GRANT SELECT ON V_$LOG TO dms_user;
GRANT SELECT ON V_$LOGFILE TO dms_user;
GRANT SELECT ON V_$LOGMNR_LOGS TO dms_user;
GRANT SELECT ON V_$LOGMNR_CONTENTS TO dms_user;
GRANT SELECT ON V_$DATABASE TO dms_user;
GRANT SELECT ON V_$THREAD TO dms_user;
GRANT SELECT ON V_$PARAMETER TO dms_user;
GRANT SELECT ON V_$NLS_PARAMETERS TO dms_user;
GRANT SELECT ON V_$TIMEZONE_NAMES TO dms_user;
GRANT SELECT ON V_$TRANSACTION TO dms_user;
GRANT SELECT ON V_$CONTAINERS TO dms_user;                   
GRANT SELECT ON ALL_INDEXES TO dms_user;
GRANT SELECT ON ALL_OBJECTS TO dms_user;
GRANT SELECT ON ALL_TABLES TO dms_user;
GRANT SELECT ON ALL_USERS TO dms_user;
GRANT SELECT ON ALL_CATALOG TO dms_user;
GRANT SELECT ON ALL_CONSTRAINTS TO dms_user;
GRANT SELECT ON ALL_CONS_COLUMNS TO dms_user;
GRANT SELECT ON ALL_TAB_COLS TO dms_user;
GRANT SELECT ON ALL_IND_COLUMNS TO dms_user;
GRANT SELECT ON ALL_ENCRYPTED_COLUMNS TO dms_user;
GRANT SELECT ON ALL_LOG_GROUPS TO dms_user;
GRANT SELECT ON ALL_TAB_PARTITIONS TO dms_user;
GRANT SELECT ON SYS.DBA_REGISTRY TO dms_user;
GRANT SELECT ON SYS.OBJ$ TO dms_user;
GRANT SELECT ON DBA_TABLESPACES TO dms_user;
GRANT SELECT ON DBA_OBJECTS TO dms_user; -- Required if the Oracle version is earlier than 11.2.0.3.
GRANT SELECT ON SYS.ENC$ TO dms_user; -- Required if transparent data encryption (TDE) is enabled. For more information on using Oracle TDE with AWS DMS, see Supported encryption methods for using Oracle as a source for AWS DMS.
GRANT SELECT ON GV_$TRANSACTION TO dms_user; -- Required if the source database is Oracle RAC in AWS DMS versions 3.4.6 and higher.
GRANT SELECT ON V_$DATAGUARD_STATS TO dms_user; -- Required if the source database is Oracle Data Guard and Oracle Standby is used in the latest release of DMS version 3.4.6, version 3.4.7, and higher.
GRANT SELECT ON V_$DATABASE_INCARNATION TO dms_user;
```

Grant the following additional privilege for each replicated table when you use a specific table list.

```
GRANT SELECT on any-replicated-table to dms_user;
```
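If your table list is long, you can generate the per-table grants rather than typing each statement by hand. A minimal sketch follows; the schema-qualified table names are placeholders for your own table list.

```python
# Placeholder schema-qualified table names for illustration.
tables = ["HR.EMPLOYEES", "HR.DEPARTMENTS", "SALES.ORDERS"]

# Emit one GRANT SELECT statement per replicated table
# for the user named in the Oracle endpoint settings.
grants = [f"GRANT SELECT ON {t} TO dms_user;" for t in tables]
for stmt in grants:
    print(stmt)
```

Run the generated statements as a privileged Oracle user before starting the migration task.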

Grant the following additional privilege to use the validation feature.

```
GRANT EXECUTE ON SYS.DBMS_CRYPTO TO dms_user;
```

Grant the following additional privilege if you use Binary Reader instead of LogMiner.

```
GRANT SELECT ON SYS.DBA_DIRECTORIES TO dms_user;
```

Grant the following additional privilege to expose views.

```
GRANT SELECT on ALL_VIEWS to dms_user;
```

To expose views, you must also add the `exposeViews=true` extra connection attribute to your source endpoint.

Grant the following additional privileges when using serverless replications.

```
GRANT SELECT on dba_segments to dms_user;
GRANT SELECT on v_$tablespace to dms_user;
GRANT SELECT on dba_tab_subpartitions to dms_user;
GRANT SELECT on dba_extents to dms_user;
```

For information about serverless replications, see [Working with AWS DMS Serverless](CHAP_Serverless.md).

Grant the following additional privileges when using Oracle-specific premigration assessments.

```
GRANT SELECT on gv_$parameter  to dms_user;
GRANT SELECT on v_$instance to dms_user;
GRANT SELECT on v_$version to dms_user;
GRANT SELECT on gv_$ASM_DISKGROUP to dms_user;
GRANT SELECT on gv_$database to dms_user;
GRANT SELECT on dba_db_links to dms_user;
GRANT SELECT on gv_$log_History to dms_user;
GRANT SELECT on gv_$log to dms_user;
GRANT SELECT ON DBA_TYPES TO dms_user;
GRANT SELECT ON DBA_USERS to dms_user;
GRANT SELECT ON DBA_DIRECTORIES to dms_user;
GRANT EXECUTE ON SYS.DBMS_XMLGEN TO dms_user;
```

For information about Oracle-specific premigration assessments, see [Oracle assessments](CHAP_Tasks.AssessmentReport.Oracle.md).

#### Prerequisites for handling open transactions for Oracle Standby
<a name="CHAP_Source.Oracle.Self-Managed.Privileges.Standby"></a>

When using AWS DMS versions 3.4.6 and higher, perform the following steps to handle open transactions for Oracle Standby. 

1. Create a database link named `AWSDMS_DBLINK` on the primary database. `DMS_USER` uses the database link to connect to the primary database. Note that the database link is executed from the standby instance to query the open transactions running on the primary database. See the following example. 

   ```
   CREATE PUBLIC DATABASE LINK AWSDMS_DBLINK 
      CONNECT TO DMS_USER IDENTIFIED BY DMS_USER_PASSWORD
      USING '(DESCRIPTION=
               (ADDRESS=(PROTOCOL=TCP)(HOST=PRIMARY_HOST_NAME_OR_IP)(PORT=PORT))
               (CONNECT_DATA=(SERVICE_NAME=SID))
             )';
   ```

1. Verify that the connection over the database link is established as `DMS_USER`, as shown in the following example.

   ```
   select 1 from dual@AWSDMS_DBLINK
   ```

### Preparing an Oracle self-managed source database for CDC using AWS DMS
<a name="CHAP_Source.Oracle.Self-Managed.Configuration"></a>

Prepare your self-managed Oracle database as a source to run a CDC task by doing the following: 
+ [Verifying that AWS DMS supports the source database version](#CHAP_Source.Oracle.Self-Managed.Configuration.DbVersion).
+ [Making sure that ARCHIVELOG mode is on](#CHAP_Source.Oracle.Self-Managed.Configuration.ArchiveLogMode).
+ [Setting up supplemental logging](#CHAP_Source.Oracle.Self-Managed.Configuration.SupplementalLogging).

#### Verifying that AWS DMS supports the source database version
<a name="CHAP_Source.Oracle.Self-Managed.Configuration.DbVersion"></a>

Run a query like the following to verify that the current version of the Oracle source database is supported by AWS DMS.

```
SELECT name, value, description FROM v$parameter WHERE name = 'compatible';
```

Here, `name`, `value`, and `description` are columns of the `v$parameter` view, filtered on the `compatible` parameter name. If this query runs without error, AWS DMS supports the current version of the database and you can continue with the migration. If the query raises an error, AWS DMS doesn't support the current version of the database. To proceed with the migration, first convert the Oracle database to a version supported by AWS DMS.

#### Making sure that ARCHIVELOG mode is on
<a name="CHAP_Source.Oracle.Self-Managed.Configuration.ArchiveLogMode"></a>

You can run Oracle in two different modes: `ARCHIVELOG` mode and `NOARCHIVELOG` mode. To run a CDC task, run the database in `ARCHIVELOG` mode. To determine which mode the database is in, run the following query.

```
SQL> SELECT log_mode FROM v$database;
```

If `NOARCHIVELOG` mode is returned, set the database to `ARCHIVELOG` per Oracle instructions. 
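On a self-managed instance, switching to `ARCHIVELOG` mode typically follows the standard Oracle sequence shown in the following sketch. This requires a brief outage and `SYSDBA` privileges; consult the Oracle documentation for your version, and take a backup first.

```
-- Run as a user with SYSDBA privileges; the database is briefly unavailable.
SHUTDOWN IMMEDIATE;
STARTUP MOUNT;
ALTER DATABASE ARCHIVELOG;
ALTER DATABASE OPEN;
```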

#### Setting up supplemental logging
<a name="CHAP_Source.Oracle.Self-Managed.Configuration.SupplementalLogging"></a>

To capture ongoing changes, AWS DMS requires that you enable minimal supplemental logging on your Oracle source database. In addition, you need to enable supplemental logging on each replicated table in the database.

By default, AWS DMS adds `PRIMARY KEY` supplemental logging on all replicated tables. To allow AWS DMS to add `PRIMARY KEY` supplemental logging, grant the following privilege for each replicated table.

```
GRANT ALTER ON any-replicated-table TO dms_user;
```

You can disable the default `PRIMARY KEY` supplemental logging added by AWS DMS using the extra connection attribute `addSupplementalLogging`. For more information, see [Endpoint settings when using Oracle as a source for AWS DMS](#CHAP_Source.Oracle.ConnectionAttrib).

Make sure to turn on supplemental logging if your replication task updates a table using a `WHERE` clause that does not reference a primary key column.

**To manually set up supplemental logging**

1. Run the following query to verify if supplemental logging is already enabled for the database.

   ```
   SELECT supplemental_log_data_min FROM v$database;
   ```

   If the result returned is `YES` or `IMPLICIT`, supplemental logging is enabled for the database.

   If not, enable supplemental logging for the database by running the following command.

   ```
   ALTER DATABASE ADD SUPPLEMENTAL LOG DATA;
   ```

1. Make sure that the required supplemental logging is added for each replicated table.

   Consider the following:
   + If `ALL COLUMNS` supplemental logging is added to the table, you don't need to add more logging.
   + If a primary key exists, add supplemental logging for the primary key. You can do this either by adding supplemental logging on the primary key of the table itself, or by adding primary key supplemental logging at the database level.

     ```
     ALTER TABLE Tablename ADD SUPPLEMENTAL LOG DATA (PRIMARY KEY) COLUMNS;
     ALTER DATABASE ADD SUPPLEMENTAL LOG DATA (PRIMARY KEY) COLUMNS;
     ```
   + If no primary key exists and the table has a single unique index, add all of the unique index's columns to the supplemental log.

     ```
     ALTER TABLE TableName ADD SUPPLEMENTAL LOG GROUP LogGroupName (UniqueIndexColumn1[, UniqueIndexColumn2] ...) ALWAYS;
     ```

     Using `SUPPLEMENTAL LOG DATA (UNIQUE INDEX) COLUMNS` does not add the unique index columns to the log.
   + If no primary key exists and the table has multiple unique indexes, AWS DMS selects the first unique index in an alphabetically ordered ascending list. You need to add supplemental logging on the selected index’s columns as in the previous item.

     Using `SUPPLEMENTAL LOG DATA (UNIQUE INDEX) COLUMNS` does not add the unique index columns to the log.
   + If no primary key exists and there is no unique index, add supplemental logging on all columns.

     ```
     ALTER TABLE TableName ADD SUPPLEMENTAL LOG DATA (ALL) COLUMNS;
     ```

     In some cases, the target table primary key or unique index is different than the source table primary key or unique index. In such cases, add supplemental logging manually on the source table columns that make up the target table primary key or unique index.

     Also, if you change the target table primary key, add supplemental logging on the target unique index's columns instead of the columns of the source primary key or unique index.

If a filter or transformation is defined for a table, you might need to enable additional logging.

Consider the following:
+ If `ALL COLUMNS` supplemental logging is added to the table, you don't need to add more logging.
+ If the table has a unique index or a primary key, add supplemental logging on each column that is involved in a filter or transformation. However, do so only if those columns are different from the primary key or unique index columns.
+ If a transformation includes only one column, don't add this column to a supplemental logging group. For example, for a transformation `A+B`, add supplemental logging on both columns `A` and `B`. However, for a transformation `substring(A,10)` don't add supplemental logging on column `A`.
+ To set up supplemental logging on primary key or unique index columns and other columns that are filtered or transformed, you can set up `USER_LOG_GROUP` supplemental logging. Add this logging on both the primary key or unique index columns and any other specific columns that are filtered or transformed.

  For example, to replicate a table named `TEST.LOGGING` with primary key `ID` and a filter by the column `NAME`, you can run a command similar to the following to create the log group supplemental logging.

  ```
  ALTER TABLE TEST.LOGGING ADD SUPPLEMENTAL LOG GROUP TEST_LOG_GROUP (ID, NAME) ALWAYS;
  ```

### Account privileges required when using Oracle LogMiner to access the redo logs
<a name="CHAP_Source.Oracle.Self-Managed.LogMinerPrivileges"></a>

To access the redo logs using the Oracle LogMiner, grant the following privileges to the Oracle user specified in the Oracle endpoint connection settings.

```
GRANT EXECUTE on DBMS_LOGMNR to dms_user;
GRANT SELECT on V_$LOGMNR_LOGS to dms_user;
GRANT SELECT on V_$LOGMNR_CONTENTS to dms_user;
GRANT LOGMINING to dms_user; -- Required only if the Oracle version is 12c or higher.
```

### Account privileges required when using AWS DMS Binary Reader to access the redo logs
<a name="CHAP_Source.Oracle.Self-Managed.BinaryReaderPrivileges"></a>

To access the redo logs using the AWS DMS Binary Reader, grant the following privileges to the Oracle user specified in the Oracle endpoint connection settings.

```
GRANT SELECT on v_$transportable_platform to dms_user;   -- Grant this privilege if the redo logs are stored in Oracle Automatic Storage Management (ASM) and AWS DMS accesses them from ASM.
GRANT CREATE ANY DIRECTORY to dms_user;                  -- Grant this privilege to allow AWS DMS to use Oracle BFILE read file access in certain cases. This access is required when the replication instance does not have file-level access to the redo logs and the redo logs are on non-ASM storage.
GRANT EXECUTE on DBMS_FILE_TRANSFER to dms_user;         -- Grant this privilege to copy the redo log files to a temporary folder using the CopyToTempFolder method.
GRANT EXECUTE on DBMS_FILE_GROUP to dms_user;
```

Binary Reader works with Oracle file features that include Oracle directories. Each Oracle directory object includes the name of the folder containing the redo log files to process. These Oracle directories are not represented at the file system level. Instead, they are logical directories that are created at the Oracle database level. You can view them in the Oracle `ALL_DIRECTORIES` view.

If you want AWS DMS to create these Oracle directories, grant the `CREATE ANY DIRECTORY` privilege specified preceding. AWS DMS creates the directory names with the `DMS_` prefix. If you don't grant the `CREATE ANY DIRECTORY` privilege, create the corresponding directories manually. In some cases when you create the Oracle directories manually, the Oracle user specified in the Oracle source endpoint isn't the user that created these directories. In these cases, also grant the `READ on DIRECTORY` privilege.
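For example, if a different user created a directory object named `DMS_LOGS_DIR` (a hypothetical name used here for illustration), you can grant the read access as follows.

```
-- DMS_LOGS_DIR is a placeholder; substitute the name of your manually created directory object.
GRANT READ ON DIRECTORY DMS_LOGS_DIR TO dms_user;
```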

**Note**  
AWS DMS CDC does not support an Active Data Guard standby that is not configured to use the automatic redo transport service.

In some cases, you might use Oracle Managed Files (OMF) for storing the logs. Or your source endpoint might be an Active Data Guard (ADG) standby, where you can't grant the `CREATE ANY DIRECTORY` privilege. In these cases, manually create the directories with all the possible log locations before starting the AWS DMS replication task. If AWS DMS doesn't find a precreated directory that it expects, the task stops. Also, AWS DMS doesn't delete the entries it has created in the `ALL_DIRECTORIES` view, so delete them manually.

### Additional account privileges required when using Binary Reader with Oracle ASM
<a name="CHAP_Source.Oracle.Self-Managed.ASMBinaryPrivileges"></a>

To access the redo logs in Automatic Storage Management (ASM) using Binary Reader, grant the following privileges to the Oracle user specified in the Oracle endpoint connection settings.

```
SELECT ON v_$transportable_platform
SYSASM -- To access the ASM account with Oracle 11g Release 2 (version 11.2.0.2) and higher, grant the Oracle endpoint user the SYSASM privilege. For older supported Oracle versions, it's typically sufficient to grant the Oracle endpoint user the SYSDBA privilege.
```

You can validate ASM account access by opening a command prompt and invoking one of the following statements, depending on your Oracle version as specified preceding.

If you need the `SYSDBA` privilege, use the following.

```
sqlplus asmuser/asmpassword@+asmserver as sysdba
```

If you need the `SYSASM` privilege, use the following. 

```
sqlplus asmuser/asmpassword@+asmserver as sysasm
```

### Using a self-managed Oracle Standby as a source with Binary Reader for CDC in AWS DMS
<a name="CHAP_Source.Oracle.Self-Managed.BinaryStandby"></a>

To configure an Oracle Standby instance as a source when using Binary Reader for CDC, start with the following prerequisites:
+ AWS DMS currently supports only Oracle Active Data Guard Standby.
+ Make sure that the Oracle Data Guard configuration uses:
  + Redo transport services for automated transfers of redo data.
  + Apply services to automatically apply redo to the standby database.

To confirm those requirements are met, execute the following query.

```
SQL> select open_mode, database_role from v$database;
```

From the output of that query, confirm that the standby database is opened in READ ONLY mode and redo is being applied automatically. For example:

```
OPEN_MODE             DATABASE_ROLE
--------------------  ----------------
READ ONLY WITH APPLY  PHYSICAL STANDBY
```

**To configure an Oracle Standby instance as a source when using Binary Reader for CDC**

1. Grant additional privileges required to access standby log files.

   ```
   GRANT SELECT ON v_$standby_log TO dms_user;
   ```

1. Create a source endpoint for the Oracle Standby by using the AWS Management Console or AWS CLI. When creating the endpoint, specify the following extra connection attributes.

   ```
   useLogminerReader=N;useBfile=Y;
   ```
**Note**  
In AWS DMS, you can use extra connection attributes to specify if you want to migrate from the archive logs instead of the redo logs. For more information, see [Endpoint settings when using Oracle as a source for AWS DMS](#CHAP_Source.Oracle.ConnectionAttrib).

1. Configure archived log destination.

   The DMS Binary Reader for an Oracle source without ASM uses Oracle directories to access archived redo logs. If your database is configured to use the Fast Recovery Area (FRA) as an archive log destination, the location of the archived redo files isn't constant. Each day that archived redo logs are generated, a new directory is created in the FRA using the directory name format YYYY_MM_DD. For example: 

   ```
   DB_RECOVERY_FILE_DEST/SID/archivelog/YYYY_MM_DD
   ```

   When DMS needs access to archived redo files in the newly created FRA directory and the primary read-write database is being used as a source, DMS creates a new Oracle directory or replaces an existing one, as follows. 

   ```
   CREATE OR REPLACE DIRECTORY dmsrep_taskid AS 'DB_RECOVERY_FILE_DEST/SID/archivelog/YYYY_MM_DD';
   ```

   When the standby database is being used as a source, DMS can't create or replace the Oracle directory because the database is in read-only mode. Instead, you can perform one of the following additional steps: 

   1. Modify `log_archive_dest_1` to use an actual path instead of the FRA, so that Oracle doesn't create daily subdirectories:

      ```
      ALTER SYSTEM SET log_archive_dest_1='LOCATION=full directory path';
      ```

      Then, create an Oracle directory object to be used by DMS:

      ```
      CREATE OR REPLACE DIRECTORY dms_archived_logs AS 'full directory path';
      ```

   1. Create an additional archive log destination and an Oracle directory object pointing to that destination. For example:

      ```
      ALTER SYSTEM SET log_archive_dest_3='LOCATION=full directory path';
      CREATE DIRECTORY dms_archived_log AS 'full directory path';
      ```

      Then add an extra connection attribute to the task source endpoint:

      ```
      archivedLogDestId=3
      ```

   1. Manually pre-create Oracle directory objects to be used by DMS.

      ```
      CREATE DIRECTORY dms_archived_log_20210301 AS 'DB_RECOVERY_FILE_DEST/SID/archivelog/2021_03_01';
      CREATE DIRECTORY dms_archived_log_20210302 AS 'DB_RECOVERY_FILE_DEST/SID/archivelog/2021_03_02';
      ...
      ...
      ```

   1. Create an Oracle scheduler job that runs daily and creates the required directory.
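      A `DBMS_SCHEDULER` job for this purpose might look like the following sketch. Run it on the primary database under a user with the `CREATE ANY DIRECTORY` privilege; the job name, schedule, and path format shown here are illustrative assumptions, not fixed values.

      ```
      BEGIN
        DBMS_SCHEDULER.CREATE_JOB (
          job_name        => 'DMS_CREATE_ARCHIVELOG_DIR',  -- hypothetical job name
          job_type        => 'PLSQL_BLOCK',
          job_action      => q'[
            DECLARE
              v_path VARCHAR2(512);
            BEGIN
              -- Build the FRA path for today's archived redo logs.
              v_path := 'DB_RECOVERY_FILE_DEST/SID/archivelog/'
                        || TO_CHAR(SYSDATE, 'YYYY_MM_DD');
              -- DDL requires dynamic SQL inside PL/SQL.
              EXECUTE IMMEDIATE 'CREATE DIRECTORY dms_archived_log_'
                        || TO_CHAR(SYSDATE, 'YYYYMMDD')
                        || ' AS ''' || v_path || '''';
            END;]',
          start_date      => TRUNC(SYSDATE) + 1,
          repeat_interval => 'FREQ=DAILY;BYHOUR=0;BYMINUTE=5',
          enabled         => TRUE);
      END;
      /
      ```

      Because the standby is read-only, the directory objects created on the primary are carried to the standby through the redo stream.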

1. Configure the online log destination.

   Create an Oracle directory that points to the OS directory containing the standby redo logs:

   ```
   CREATE OR REPLACE DIRECTORY STANDBY_REDO_DIR AS 'full directory path';
   GRANT READ ON DIRECTORY STANDBY_REDO_DIR TO dms_user;
   ```

### Using a user-managed database on Oracle Cloud Infrastructure (OCI) as a source for CDC in AWS DMS
<a name="CHAP_Source.Oracle.Self-Managed.OCI"></a>

A user-managed database is a database that you configure and control, such as an Oracle database created on a virtual machine (VM), bare metal, or Exadata server, or a database that runs on dedicated infrastructure such as Oracle Cloud Infrastructure (OCI). The following information describes the privileges and configurations that you need when using an Oracle user-managed database on OCI as a source for change data capture (CDC) in AWS DMS.

**To configure an OCI hosted user-managed Oracle database as a source for change data capture**

1. Grant required user account privileges for a user-managed Oracle source database on OCI. For more information, see [Account privileges for a self-managed Oracle source endpoint](#CHAP_Source.Oracle.Self-Managed.Privileges).

1. Grant account privileges required when using Binary Reader to access the redo logs. For more information, see [Account privileges required when using Binary Reader](#CHAP_Source.Oracle.Self-Managed.BinaryReaderPrivileges).

1. Add account privileges that are required when using Binary Reader with Oracle Automatic Storage Management (ASM). For more information, see [Additional account privileges required when using Binary Reader with Oracle ASM](#CHAP_Source.Oracle.Self-Managed.ASMBinaryPrivileges).

1. Set up supplemental logging. For more information, see [Setting up supplemental logging](#CHAP_Source.Oracle.Self-Managed.Configuration.SupplementalLogging).

1. Set up TDE encryption. For more information, see [Encryption methods when using an Oracle database as a source endpoint](#CHAP_Source.Oracle.Encryption).

The following limitations apply when replicating data from an Oracle source database on Oracle Cloud Infrastructure (OCI).

**Limitations**
+ DMS does not support using Oracle LogMiner to access the redo logs.
+ DMS does not support Autonomous DB.

## Working with an AWS-managed Oracle database as a source for AWS DMS
<a name="CHAP_Source.Oracle.Amazon-Managed"></a>

An AWS-managed database is a database that is on an Amazon service such as Amazon RDS, Amazon Aurora, or Amazon S3. Following, you can find the privileges and configurations that you need to set up when using an AWS-managed Oracle database with AWS DMS.

### User account privileges required on an AWS-managed Oracle source for AWS DMS
<a name="CHAP_Source.Oracle.Amazon-Managed.Privileges"></a>

Grant the following privileges to the Oracle user account specified in the Oracle source endpoint definition.

**Important**  
For all parameter values such as `dms_user` and `any-replicated-table`, Oracle assumes the value is all uppercase unless you specify the value with a case-sensitive identifier. For example, suppose that you create a `dms_user` value without using quotation marks, as in `CREATE USER myuser` or `CREATE USER MYUSER`. In this case, Oracle identifies and stores the value as all uppercase (`MYUSER`). If you use double quotation marks, as in `CREATE USER "MyUser"`, Oracle identifies and stores the case-sensitive value that you specify (`MyUser`).

```
GRANT CREATE SESSION to dms_user;
GRANT SELECT ANY TRANSACTION to dms_user;
GRANT SELECT on DBA_TABLESPACES to dms_user;
GRANT SELECT ON any-replicated-table to dms_user;
GRANT EXECUTE on rdsadmin.rdsadmin_util to dms_user;
GRANT LOGMINING to dms_user; -- Required only if the Oracle version is 12c or higher.
```

In addition, grant `SELECT` and `EXECUTE` permissions on `SYS` objects using the Amazon RDS procedure `rdsadmin.rdsadmin_util.grant_sys_object` as shown. For more information, see [Granting SELECT or EXECUTE privileges to SYS objects](https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/Appendix.Oracle.CommonDBATasks.html#Appendix.Oracle.CommonDBATasks.TransferPrivileges).

```
exec rdsadmin.rdsadmin_util.grant_sys_object('ALL_VIEWS', 'dms_user', 'SELECT');
exec rdsadmin.rdsadmin_util.grant_sys_object('ALL_TAB_PARTITIONS', 'dms_user', 'SELECT');
exec rdsadmin.rdsadmin_util.grant_sys_object('ALL_INDEXES', 'dms_user', 'SELECT');
exec rdsadmin.rdsadmin_util.grant_sys_object('ALL_OBJECTS', 'dms_user', 'SELECT');
exec rdsadmin.rdsadmin_util.grant_sys_object('ALL_TABLES', 'dms_user', 'SELECT');
exec rdsadmin.rdsadmin_util.grant_sys_object('ALL_USERS', 'dms_user', 'SELECT');
exec rdsadmin.rdsadmin_util.grant_sys_object('ALL_CATALOG', 'dms_user', 'SELECT');
exec rdsadmin.rdsadmin_util.grant_sys_object('ALL_CONSTRAINTS', 'dms_user', 'SELECT');
exec rdsadmin.rdsadmin_util.grant_sys_object('ALL_CONS_COLUMNS', 'dms_user', 'SELECT');
exec rdsadmin.rdsadmin_util.grant_sys_object('ALL_TAB_COLS', 'dms_user', 'SELECT');
exec rdsadmin.rdsadmin_util.grant_sys_object('ALL_IND_COLUMNS', 'dms_user', 'SELECT');
exec rdsadmin.rdsadmin_util.grant_sys_object('ALL_LOG_GROUPS', 'dms_user', 'SELECT');
exec rdsadmin.rdsadmin_util.grant_sys_object('V_$ARCHIVED_LOG', 'dms_user', 'SELECT');
exec rdsadmin.rdsadmin_util.grant_sys_object('V_$LOG', 'dms_user', 'SELECT');
exec rdsadmin.rdsadmin_util.grant_sys_object('V_$LOGFILE', 'dms_user', 'SELECT');
exec rdsadmin.rdsadmin_util.grant_sys_object('V_$DATABASE', 'dms_user', 'SELECT');
exec rdsadmin.rdsadmin_util.grant_sys_object('V_$THREAD', 'dms_user', 'SELECT');
exec rdsadmin.rdsadmin_util.grant_sys_object('V_$PARAMETER', 'dms_user', 'SELECT');
exec rdsadmin.rdsadmin_util.grant_sys_object('V_$NLS_PARAMETERS', 'dms_user', 'SELECT');
exec rdsadmin.rdsadmin_util.grant_sys_object('V_$TIMEZONE_NAMES', 'dms_user', 'SELECT');
exec rdsadmin.rdsadmin_util.grant_sys_object('V_$TRANSACTION', 'dms_user', 'SELECT');
exec rdsadmin.rdsadmin_util.grant_sys_object('V_$CONTAINERS', 'dms_user', 'SELECT');
exec rdsadmin.rdsadmin_util.grant_sys_object('DBA_REGISTRY', 'dms_user', 'SELECT');
exec rdsadmin.rdsadmin_util.grant_sys_object('OBJ$', 'dms_user', 'SELECT');
exec rdsadmin.rdsadmin_util.grant_sys_object('ALL_ENCRYPTED_COLUMNS', 'dms_user', 'SELECT');
exec rdsadmin.rdsadmin_util.grant_sys_object('V_$LOGMNR_LOGS', 'dms_user', 'SELECT');
exec rdsadmin.rdsadmin_util.grant_sys_object('V_$LOGMNR_CONTENTS','dms_user','SELECT');
exec rdsadmin.rdsadmin_util.grant_sys_object('DBMS_LOGMNR', 'dms_user', 'EXECUTE');

-- (as of Oracle versions 12.1 and higher)
exec rdsadmin.rdsadmin_util.grant_sys_object('REGISTRY$SQLPATCH', 'dms_user', 'SELECT');

-- (for Amazon RDS Active Dataguard Standby (ADG))
exec rdsadmin.rdsadmin_util.grant_sys_object('V_$STANDBY_LOG', 'dms_user', 'SELECT'); 

-- (for transparent data encryption (TDE))

exec rdsadmin.rdsadmin_util.grant_sys_object('ENC$', 'dms_user', 'SELECT'); 
               
-- (for validation with LOB columns)
exec rdsadmin.rdsadmin_util.grant_sys_object('DBMS_CRYPTO', 'dms_user', 'EXECUTE');
                    
-- (for binary reader)
exec rdsadmin.rdsadmin_util.grant_sys_object('DBA_DIRECTORIES','dms_user','SELECT'); 
                    
-- Required when the source database is Oracle Data Guard, and Oracle Standby is used in the latest release of DMS version 3.4.6, version 3.4.7, and higher.

exec rdsadmin.rdsadmin_util.grant_sys_object('V_$DATAGUARD_STATS', 'dms_user', 'SELECT');
```

For more information on using Amazon RDS Active Data Guard Standby (ADG) with AWS DMS, see [Using an Amazon RDS Oracle Standby (read replica) as a source with Binary Reader for CDC in AWS DMS](#CHAP_Source.Oracle.Amazon-Managed.StandBy).

For more information on using Oracle TDE with AWS DMS, see [Supported encryption methods for using Oracle as a source for AWS DMS](#CHAP_Source.Oracle.Encryption).

#### Prerequisites for handling open transactions for Oracle Standby
<a name="CHAP_Source.Oracle.Amazon-Managed.Privileges.Standby"></a>

When using AWS DMS versions 3.4.6 and higher, perform the following steps to handle open transactions for Oracle Standby. 

1. Create a database link named `AWSDMS_DBLINK` on the primary database. `DMS_USER` uses the database link to connect to the primary database. Note that the database link is executed from the standby instance to query the open transactions running on the primary database. See the following example. 

   ```
   CREATE PUBLIC DATABASE LINK AWSDMS_DBLINK 
      CONNECT TO DMS_USER IDENTIFIED BY DMS_USER_PASSWORD
      USING '(DESCRIPTION=
               (ADDRESS=(PROTOCOL=TCP)(HOST=PRIMARY_HOST_NAME_OR_IP)(PORT=PORT))
               (CONNECT_DATA=(SERVICE_NAME=SID))
             )';
   ```

1. Verify that the connection over the database link is established as `DMS_USER`, as shown in the following example.

   ```
   select 1 from dual@AWSDMS_DBLINK
   ```

### Configuring an AWS-managed Oracle source for AWS DMS
<a name="CHAP_Source.Oracle.Amazon-Managed.Configuration"></a>

Before using an AWS-managed Oracle database as a source for AWS DMS, perform the following tasks for the Oracle database:
+ Enable automatic backups. For more information about enabling automatic backups, see [Enabling automated backups](https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/USER_WorkingWithAutomatedBackups.html#USER_WorkingWithAutomatedBackups.Enabling) in the *Amazon RDS User Guide*.
+ Set up supplemental logging.
+ Set up archiving. Archiving the redo logs for your Amazon RDS for Oracle DB instance allows AWS DMS to retrieve the log information using Oracle LogMiner or Binary Reader. 

**To set up archiving**

1. Run the `rdsadmin.rdsadmin_util.set_configuration` command to set up archiving.

   For example, to retain the archived redo logs for 24 hours, run the following command.

   ```
   exec rdsadmin.rdsadmin_util.set_configuration('archivelog retention hours',24);
   commit;
   ```
**Note**  
The commit is required for a change to take effect.

1. Make sure that your storage has enough space for the archived redo logs during the specified retention period. For example, if your retention period is 24 hours, calculate the total size of your accumulated archived redo logs over a typical hour of transaction processing and multiply that total by 24. Compare this calculated 24-hour total with your available storage space and decide if you have enough storage space to handle a full 24 hours of transaction processing.
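   One way to estimate your hourly archived redo volume is to query `V$ARCHIVED_LOG` over a representative period, as in the following sketch. The seven-day window here is an assumption; use a window that covers your peak workload.

   ```
   -- Average archived redo generated per hour over the last 7 days, in GB.
   SELECT ROUND(SUM(blocks * block_size) / (7 * 24) / 1024 / 1024 / 1024, 2)
          AS avg_gb_per_hour
   FROM   v$archived_log
   WHERE  first_time > SYSDATE - 7;
   ```

   Multiply the result by your retention period in hours, adding headroom for peak activity, to size the required storage.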

**To set up supplemental logging**

1. Run the following command to enable supplemental logging at the database level.

   ```
   exec rdsadmin.rdsadmin_util.alter_supplemental_logging('ADD');
   ```

1. Run the following command to enable primary key supplemental logging.

   ```
   exec rdsadmin.rdsadmin_util.alter_supplemental_logging('ADD','PRIMARY KEY');
   ```

1. (Optional) Enable key-level supplemental logging at the table level.

   Your source database incurs a small amount of overhead when key-level supplemental logging is enabled. Therefore, if you are migrating only a subset of your tables, you might want to enable key-level supplemental logging at the table level. To enable key-level supplemental logging at the table level, run the following command.

   ```
   alter table table_name add supplemental log data (PRIMARY KEY) columns;
   ```

### Configuring a CDC task to use Binary Reader with an RDS for Oracle source for AWS DMS
<a name="CHAP_Source.Oracle.Amazon-Managed.CDC"></a>

You can configure AWS DMS to access the source Amazon RDS for Oracle instance redo logs using Binary Reader for CDC. 

**Note**  
To use Oracle LogMiner, the minimum required user account privileges are sufficient. For more information, see [User account privileges required on an AWS-managed Oracle source for AWS DMS](#CHAP_Source.Oracle.Amazon-Managed.Privileges).

To use AWS DMS Binary Reader, specify additional settings and extra connection attributes for the Oracle source endpoint, depending on your AWS DMS version.

Binary Reader support is available in the following versions of Amazon RDS for Oracle:
+ Oracle 11.2 – Versions 11.2.0.4V11 and higher
+ Oracle 12.1 – Versions 12.1.0.2.V7 and higher
+ Oracle 12.2 – All versions
+ Oracle 18.0 – All versions
+ Oracle 19.0 – All versions

**To configure CDC using Binary Reader**

1. Log in to your Amazon RDS for Oracle source database as the master user and run the following stored procedures to create the server-level directories.

   ```
   exec rdsadmin.rdsadmin_master_util.create_archivelog_dir;
   exec rdsadmin.rdsadmin_master_util.create_onlinelog_dir;
   ```

1. Grant the following privileges to the Oracle user account that is used to access the Oracle source endpoint.

   ```
   GRANT READ ON DIRECTORY ONLINELOG_DIR TO dms_user;
   GRANT READ ON DIRECTORY ARCHIVELOG_DIR TO dms_user;
   ```

1. Set the following extra connection attributes on the Amazon RDS Oracle source endpoint:
   + For RDS Oracle versions 11.2 and 12.1, set the following.

     ```
     useLogminerReader=N;useBfile=Y;accessAlternateDirectly=false;useAlternateFolderForOnline=true;
     oraclePathPrefix=/rdsdbdata/db/{$DATABASE_NAME}_A/;usePathPrefix=/rdsdbdata/log/;replacePathPrefix=true;
     ```
   + For RDS Oracle versions 12.2, 18.0, and 19.0, set the following.

     ```
     useLogminerReader=N;useBfile=Y;
     ```

**Note**  
Make sure there's no whitespace following the semicolon separator (;) between multiple attribute settings, for example `oneSetting;thenAnother`.

For more information about configuring a CDC task, see [Configuration for CDC on an Oracle source database](#CHAP_Source.Oracle.CDC.Configuration).

### Using an Amazon RDS Oracle Standby (read replica) as a source with Binary Reader for CDC in AWS DMS
<a name="CHAP_Source.Oracle.Amazon-Managed.StandBy"></a>

Verify the following prerequisites for using Amazon RDS for Oracle Standby as a source when using Binary Reader for CDC in AWS DMS:
+ Use the Oracle master user to set up Binary Reader.
+ Be aware that AWS DMS currently supports only Oracle Active Data Guard Standby.

After you verify these prerequisites, use the following procedure to configure RDS for Oracle Standby as a source when using Binary Reader for CDC.

**To configure an RDS for Oracle Standby as a source when using Binary Reader for CDC**

1. Sign in to RDS for Oracle primary instance as the master user.

1. Run the following stored procedures, as documented in the *Amazon RDS User Guide*, to create the server-level directories.

   ```
   exec rdsadmin.rdsadmin_master_util.create_archivelog_dir;
   exec rdsadmin.rdsadmin_master_util.create_onlinelog_dir;
   ```

1. Identify the directories created in step 2.

   ```
   SELECT directory_name, directory_path FROM all_directories
   WHERE directory_name LIKE ( 'ARCHIVELOG_DIR_%' )
           OR directory_name LIKE ( 'ONLINELOG_DIR_%' )
   ```

   For example, the preceding code displays a list of directories like the following.  
![\[Table showing directory names and their corresponding paths for archive and online logs.\]](http://docs.aws.amazon.com/dms/latest/userguide/images/datarep-rds-server-level-directories.png)

1. Grant the `READ` privilege on the preceding directories to the Oracle user account that is used to access the Oracle standby.

   ```
   GRANT READ ON DIRECTORY ARCHIVELOG_DIR_A TO dms_user;
   GRANT READ ON DIRECTORY ARCHIVELOG_DIR_B TO dms_user;
   GRANT READ ON DIRECTORY ONLINELOG_DIR_A TO dms_user;
   GRANT READ ON DIRECTORY ONLINELOG_DIR_B TO dms_user;
   ```
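
   Optionally, you can confirm that the grants took effect by querying Oracle's privilege view. This query is a sketch; `DMS_USER` matches the account used in the preceding grants.

   ```
   SELECT grantee, table_name, privilege
   FROM dba_tab_privs
   WHERE grantee = 'DMS_USER'
     AND (table_name LIKE 'ARCHIVELOG_DIR_%' OR table_name LIKE 'ONLINELOG_DIR_%');
   ```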

1. Perform an archive log switch on the primary instance. Doing this makes sure that the changes to `ALL_DIRECTORIES` are also ported to the Oracle Standby.

1. Run an `ALL_DIRECTORIES` query on the Oracle Standby to confirm that the changes were applied.

1. Create a source endpoint for the Oracle Standby by using the AWS DMS Management Console or AWS Command Line Interface (AWS CLI). While creating the endpoint, specify the following extra connection attributes.

   ```
   useLogminerReader=N;useBfile=Y;archivedLogDestId=1;additionalArchivedLogDestId=2
   ```

1. After creating the endpoint, use **Test endpoint connection** on the **Create endpoint** page of the console or the AWS CLI `test-connection` command to verify that connectivity is established.
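
For example, you can create such a source endpoint with the AWS CLI. The identifier, server name, and credentials below are placeholders; the extra connection attributes are the ones shown in the preceding step.

```
aws dms create-endpoint \
    --endpoint-identifier oracle-standby-source \
    --endpoint-type source \
    --engine-name oracle \
    --server-name my-standby.example.com \
    --port 1521 \
    --database-name ORCL \
    --username dms_user \
    --password my_password \
    --extra-connection-attributes "useLogminerReader=N;useBfile=Y;archivedLogDestId=1;additionalArchivedLogDestId=2"
```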

## Limitations on using Oracle as a source for AWS DMS
<a name="CHAP_Source.Oracle.Limitations"></a>

The following limitations apply when using an Oracle database as a source for AWS DMS:
+ AWS DMS supports Oracle Extended data types in AWS DMS version 3.5.0 and higher.
+ AWS DMS does not support Oracle identifiers longer than 30 bytes during CDC replication. This limitation applies regardless of whether the Oracle source version supports extended identifiers.

  During full load, AWS DMS migrates objects with long identifiers successfully. However, during CDC, events for objects with identifiers exceeding 30 bytes are silently skipped without stopping the task. This can result in missing data on the target.

  This limitation applies to all identifier types in the migration scope, such as schemas, tables, views, columns, constraints, and primary keys.
**Note**  
The 30-byte limit is measured in bytes, not characters. Identifiers that use multibyte character sets, such as UTF-8 encoded CJK characters, can exceed 30 bytes with fewer than 30 characters. For example, a 15-character Japanese identifier encoded in UTF-8 can require up to 45 bytes.

  To avoid data loss during CDC replication, verify that all identifiers in the migration scope are within the 30-byte limit. To check the byte length of an object name, use the Oracle built-in function `LENGTHB('object-name')`. Alternatively, run a premigration assessment to validate all objects in the migration scope. For more information, see [Validate the length of the object name included in the task scope](https://docs.aws.amazon.com/dms/latest/userguide/CHAP_Tasks.AssessmentReport.Oracle.html#CHAP_Tasks.AssessmentReport.Oracle.30ByteLimit).
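
  As a quick local check, you can also count an identifier's UTF-8 bytes from a shell, because `wc -c` counts bytes rather than characters. The identifier below is a hypothetical 15-character Japanese name.

  ```
  # 15 Japanese characters, 3 bytes each in UTF-8: 45 bytes total
  printf %s 'データベース移行サービス拠点名' | wc -c    # prints 45
  ```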
+ AWS DMS does not support function-based indexes.
+ If you manage supplemental logging and carry out transformations on any of the columns, make sure that supplemental logging is activated for all fields and columns. For more information on setting up supplemental logging, see the following topics:
  + For a self-managed Oracle source database, see [Setting up supplemental logging](#CHAP_Source.Oracle.Self-Managed.Configuration.SupplementalLogging).
  + For an AWS-managed Oracle source database, see [Configuring an AWS-managed Oracle source for AWS DMS](#CHAP_Source.Oracle.Amazon-Managed.Configuration).
+ AWS DMS does not support the multitenant container root database (CDB\$ROOT). It does support a PDB when you use Binary Reader.
+ AWS DMS does not support deferred constraints.
+ AWS DMS version 3.5.3 and higher fully supports secure LOBs.
+ AWS DMS supports the `rename table table-name to new-table-name` syntax for all supported Oracle versions 11 and higher. This syntax isn't supported for any Oracle version 10 source databases.
+ AWS DMS does not replicate results of the DDL statement `ALTER TABLE ADD column data_type DEFAULT default_value`. Instead of replicating `default_value` to the target, it sets the new column to `NULL`.
+ When using AWS DMS version 3.4.7 or higher, to replicate changes that result from partition or subpartition operations, do the following before starting a DMS task.
  + Manually create the partitioned table structure (DDL).
  + Make sure that the DDL is the same on both the Oracle source and the Oracle target.
  + Set the extra connection attribute `enableHomogenousPartitionOps=true`.

  For more information about `enableHomogenousPartitionOps`, see [Endpoint settings when using Oracle as a source for AWS DMS](#CHAP_Source.Oracle.ConnectionAttrib). Also, note that on full load and CDC tasks, DMS does not replicate data changes captured as part of the cached changes. In that case, recreate the table structure on the Oracle target and reload the tables in question.

  Prior to AWS DMS version 3.4.7:

  DMS does not replicate data changes that result from partition or subpartition operations (`ADD`, `DROP`, `EXCHANGE`, and `TRUNCATE`). Such updates might cause the following errors during replication:
  + For `ADD` operations, updates and deletes on the added data might raise a "0 rows affected" warning.
  + For `DROP` and `TRUNCATE` operations, new inserts might raise "duplicates" errors.
  + `EXCHANGE` operations might raise both a "0 rows affected" warning and "duplicates" errors.

  To replicate changes that result from partition or subpartition operations, reload the tables in question. After adding a new empty partition, operations on the newly added partition are replicated to the target as normal.
+ AWS DMS versions prior to 3.4 don't support data changes on the target that result from running the `CREATE TABLE AS` statement on the source. However, the new table is created on the target.
+ AWS DMS does not capture changes made by the Oracle `DBMS_REDEFINITION` package, for example the table metadata and the `OBJECT_ID` field.
+ When Limited-size LOB mode is enabled, empty BLOB/CLOB columns on the Oracle source are replicated as NULL values. When Full LOB mode is enabled, they are replicated as an empty string (' ').
+ When capturing changes with Oracle 11 LogMiner, an update on a CLOB column with a string length greater than 1982 is lost, and the target is not updated.
+ During change data capture (CDC), AWS DMS does not support batch updates to numeric columns defined as a primary key.
+ AWS DMS does not support certain `UPDATE` commands. The following example is an unsupported `UPDATE` command.

  ```
  UPDATE TEST_TABLE SET KEY=KEY+1;
  ```

  Here, `TEST_TABLE` is the table name and `KEY` is a numeric column defined as a primary key.
+ AWS DMS does not support full LOB mode for loading LONG and LONG RAW columns. Instead, you can use limited LOB mode to migrate these data types to an Oracle target. In limited LOB mode, AWS DMS truncates data in LONG or LONG RAW columns to 64 KB.
+ AWS DMS does not support full LOB mode for loading XMLTYPE columns. Instead, you can use limited LOB mode to migrate XMLTYPE columns to an Oracle target. In limited LOB mode, DMS truncates any data larger than the user-defined **Maximum LOB size** setting. The maximum recommended value for **Maximum LOB size** is 100 MB.
+ AWS DMS does not replicate tables whose names contain apostrophes.
+ AWS DMS supports CDC from materialized views, but not from any other views.
+ AWS DMS does not support CDC for index-organized tables with an overflow segment.
+ AWS DMS does not support the `Drop Partition` operation for tables partitioned by reference with `enableHomogenousPartitionOps` set to `true`.
+ When you use Oracle LogMiner to access the redo logs, AWS DMS has the following limitations:
  + For Oracle 12 only, AWS DMS does not replicate any changes to LOB columns.
  + AWS DMS does not support XA transactions in replication while using Oracle LogMiner.
  + Oracle LogMiner does not support connections to a pluggable database (PDB). To connect to a PDB, access the redo logs using Binary Reader.
  + AWS DMS does not support `SHRINK SPACE` operations.
+ When you use Binary Reader, AWS DMS has these limitations:
  + It does not support table clusters.
  + It supports only table-level `SHRINK SPACE` operations. This level includes the full table, partitions, and subpartitions.
  + It does not support changes to index-organized tables with key compression.
  + It does not support implementing online redo logs on raw devices.
  + Binary Reader supports TDE only for self-managed Oracle databases, because RDS for Oracle does not support wallet password retrieval for TDE encryption keys.
+ AWS DMS does not support connections to an Amazon RDS Oracle source using an Oracle Automatic Storage Management (ASM) proxy.
+ AWS DMS does not support virtual columns. 
+ AWS DMS does not support the `ROWID` data type or materialized views based on a ROWID column.

  AWS DMS has partial support for Oracle materialized views. During full load, DMS can do a full-load copy of an Oracle materialized view. DMS copies the materialized view as a base table to the target and ignores any `ROWID` columns in it. For ongoing replication (CDC), DMS tries to replicate changes to the materialized view data, but the results might not be ideal. Specifically, if the materialized view is completely refreshed, DMS replicates individual deletes for all of the rows, followed by individual inserts for all of the rows. This is very resource-intensive and might perform poorly for materialized views with large numbers of rows. For ongoing replication where the materialized views do a fast refresh, DMS tries to process and replicate the fast-refresh data changes. In either case, DMS skips any `ROWID` columns in the materialized view.
+ AWS DMS does not load or capture global temporary tables.
+ For S3 targets using replication, enable supplemental logging on every column so source row updates can capture every column value. An example follows: `alter table yourtablename add supplemental log data (all) columns;`.
+ An update for a row with a composite unique key that contains `null` can't be replicated at the target.
+ AWS DMS does not support use of multiple Oracle TDE encryption keys on the same source endpoint. Each endpoint can have only one `securityDbEncryptionName` attribute for the TDE encryption key name, and one TDE password for that key.
+ When replicating from Amazon RDS for Oracle, TDE is supported only with encrypted tablespace and using Oracle LogMiner.
+ AWS DMS does not support multiple table rename operations in quick succession.
+ When using Oracle 19.0 as source, AWS DMS does not support the following features:
  + Data-guard DML redirect
  + Partitioned hybrid tables
  + Schema-only Oracle accounts
+ AWS DMS does not support migration of tables or views of type `BIN$` or `DR$`.
+ Beginning with Oracle 18.x, AWS DMS does not support change data capture (CDC) from Oracle Express Edition (Oracle Database XE).
+ When migrating data from a CHAR column, DMS truncates any trailing spaces. 
+ AWS DMS does not support replication from application containers.
+ AWS DMS does not support performing Oracle Flashback Database and restore points, as these operations affect the consistency of Oracle Redo Log files.
+ Prior to AWS DMS version 3.5.3, the direct-load `INSERT` procedure with the parallel execution option is not supported in the following cases:
  + Uncompressed tables with more than 255 columns
  + Tables with a row size that exceeds 8 KB
  + Exadata HCC tables
  + Databases running on a big-endian platform
+ A source table with neither a primary key nor a unique key requires ALL COLUMN supplemental logging to be enabled. This creates more redo log activity and can increase DMS CDC latency.
+ AWS DMS does not migrate data from invisible columns in your source database. To include these columns in your migration scope, use the `ALTER TABLE` statement to make these columns visible.
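
  For example, the following statement makes a hypothetical invisible column visible (the table and column names are illustrative).

  ```
  ALTER TABLE hr.employees MODIFY (internal_notes VISIBLE);
  ```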
+ For all Oracle versions, AWS DMS does not replicate the result of `UPDATE` operations on `XMLTYPE` and LOB columns.
+ AWS DMS does not support replication from tables with temporal validity constraints.
+ If the Oracle source becomes unavailable during a full load task, AWS DMS might mark the task as completed after multiple reconnection attempts, even though the data migration remains incomplete. In this scenario, the target tables contain only the records migrated before the connection loss, potentially creating data inconsistencies between the source and target systems. To ensure data completeness, you must either restart the full load task entirely or reload the specific tables affected by the connection interruption.

## SSL support for an Oracle endpoint
<a name="CHAP_Security.SSL.Oracle"></a>

AWS DMS Oracle endpoints support SSL V3 for the `none` and `verify-ca` SSL modes. To use SSL with an Oracle endpoint, upload the Oracle wallet for the endpoint instead of .pem certificate files. 

**Topics**
+ [

### Using an existing certificate for Oracle SSL
](#CHAP_Security.SSL.Oracle.Existing)
+ [

### Using a self-signed certificate for Oracle SSL
](#CHAP_Security.SSL.Oracle.SelfSigned)

### Using an existing certificate for Oracle SSL
<a name="CHAP_Security.SSL.Oracle.Existing"></a>

To use an existing Oracle client installation to create the Oracle wallet file from the CA certificate file, do the following steps.

**To use an existing Oracle client installation for Oracle SSL with AWS DMS**

1. Set the `ORACLE_HOME` system variable to the location of your `dbhome_1` directory by running the following command.

   ```
   prompt>export ORACLE_HOME=/home/user/app/user/product/12.1.0/dbhome_1                        
   ```

1. Append `$ORACLE_HOME/lib` to the `LD_LIBRARY_PATH` system variable.

   ```
   prompt>export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$ORACLE_HOME/lib                        
   ```

1. Create a directory for the Oracle wallet at `$ORACLE_HOME/ssl_wallet`.

   ```
   prompt>mkdir $ORACLE_HOME/ssl_wallet
   ```

1. Put the CA certificate `.pem` file in the `ssl_wallet` directory. If you use Amazon RDS, you can download the `rds-ca-2015-root.pem` root CA certificate file hosted by Amazon RDS. For more information about downloading this file, see [Using SSL/TLS to encrypt a connection to a DB instance](https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/UsingWithRDS.SSL.html) in the *Amazon RDS User Guide*.

1. If your CA certificate bundle contains more than one PEM certificate (like the Amazon RDS global or regional bundles), split it into separate certificates and add them to the Oracle wallet by using the following bash script. The script takes two parameters: the path to the CA certificate bundle and the path to the previously created Oracle wallet folder.

   ```
   #!/usr/bin/env bash
   # Usage: ./import_certs.sh <ca-bundle.pem> <wallet-directory>

   # Count the certificates in the bundle.
   certnum=$(grep -c "BEGIN CERTIFICATE" "$1")

   cnt=0
   temp_cert=""
   while read -r line
   do
       # Each time a new BEGIN marker appears, import the certificate accumulated so far.
       if [ -n "$temp_cert" ] && [ "$line" == "-----BEGIN CERTIFICATE-----" ]
       then
           cnt=$((cnt + 1))
           printf "\rImporting certificate # %s of %s" "$cnt" "$certnum"
           orapki wallet add -wallet "$2" -trusted_cert -cert <(echo -n "${temp_cert}") -auto_login_only 1>/dev/null 2>/dev/null
           temp_cert=""
       fi
       temp_cert+="$line"$'\n'
   done < "$1"

   # Import the last certificate in the bundle.
   cnt=$((cnt + 1))
   printf "\rImporting certificate # %s of %s" "$cnt" "$certnum"
   orapki wallet add -wallet "$2" -trusted_cert -cert <(echo -n "${temp_cert}") -auto_login_only 1>/dev/null 2>/dev/null
   echo ""
   ```

When you have completed the preceding steps, you can import the wallet file with the `ImportCertificate` API call by specifying the `certificate-wallet` parameter. You can then use the imported wallet certificate when you select `verify-ca` as the SSL mode when creating or modifying your Oracle endpoint.

**Note**  
 Oracle wallets are binary files. AWS DMS accepts these files as-is. 
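
For example, you can import the wallet file with the AWS CLI; the certificate identifier and file name here are placeholders.

```
aws dms import-certificate \
    --certificate-identifier oracle-wallet-cert \
    --certificate-wallet fileb://cwallet.sso
```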

### Using a self-signed certificate for Oracle SSL
<a name="CHAP_Security.SSL.Oracle.SelfSigned"></a>

To use a self-signed certificate for Oracle SSL, do the following steps, assuming an Oracle wallet password of `oracle123`.

**To use a self-signed certificate for Oracle SSL with AWS DMS**

1. Create a directory you will use to work with the self-signed certificate.

   ```
   mkdir -p /u01/app/oracle/self_signed_cert
   ```

1. Change into the directory you created in the previous step.

   ```
   cd /u01/app/oracle/self_signed_cert
   ```

1. Create a root key.

   ```
   openssl genrsa -out self-rootCA.key 2048
   ```

1. Self-sign a root certificate using the root key you created in the previous step.

   ```
   openssl req -x509 -new -nodes -key self-rootCA.key 
           -sha256 -days 3650 -out self-rootCA.pem
   ```

   Use input parameters like the following.
   + `Country Name (2 letter code) [XX]`, for example: `AU`
   + `State or Province Name (full name) []`, for example: `NSW`
   + `Locality Name (e.g., city) [Default City]`, for example: `Sydney`
   + `Organization Name (e.g., company) [Default Company Ltd]`, for example: `AmazonWebService`
   + `Organizational Unit Name (e.g., section) []`, for example: `DBeng`
   + `Common Name (e.g., your name or your server's hostname) []`, for example: `aws`
   + `Email Address []`, for example: `abcd.efgh@amazonwebservice.com`

1. Create an Oracle wallet directory for the Oracle database.

   ```
   mkdir -p /u01/app/oracle/wallet
   ```

1. Create a new Oracle wallet.

   ```
   orapki wallet create -wallet "/u01/app/oracle/wallet" -pwd oracle123 -auto_login_local
   ```

1. Add the root certificate to the Oracle wallet.

   ```
   orapki wallet add -wallet "/u01/app/oracle/wallet" -pwd oracle123 -trusted_cert 
   -cert /u01/app/oracle/self_signed_cert/self-rootCA.pem
   ```

1. List the contents of the Oracle wallet. The list should include the root certificate. 

   ```
   orapki wallet display -wallet /u01/app/oracle/wallet -pwd oracle123
   ```

   For example, the output might look similar to the following.

   ```
   Requested Certificates:
   User Certificates:
   Trusted Certificates:
   Subject:        CN=aws,OU=DBeng,O= AmazonWebService,L=Sydney,ST=NSW,C=AU
   ```

1. Generate the Certificate Signing Request (CSR) using the ORAPKI utility.

   ```
   orapki wallet add -wallet "/u01/app/oracle/wallet" -pwd oracle123 
   -dn "CN=aws" -keysize 2048 -sign_alg sha256
   ```

1. Run the following command.

   ```
   openssl pkcs12 -in /u01/app/oracle/wallet/ewallet.p12 -nodes -out /u01/app/oracle/wallet/nonoracle_wallet.pem
   ```

   This has output like the following.

   ```
   Enter Import Password:
   MAC verified OK
   Warning unsupported bag type: secretBag
   ```

1. Create a certificate signing request, putting `dms` as the common name.

   ```
   openssl req -new -key /u01/app/oracle/wallet/nonoracle_wallet.pem -out certdms.csr
   ```

   Use input parameters like the following.
   + `Country Name (2 letter code) [XX]`, for example: `AU`
   + `State or Province Name (full name) []`, for example: `NSW`
   + `Locality Name (e.g., city) [Default City]`, for example: `Sydney`
   + `Organization Name (e.g., company) [Default Company Ltd]`, for example: `AmazonWebService`
   + `Organizational Unit Name (e.g., section) []`, for example: `aws`
   + `Common Name (e.g., your name or your server's hostname) []`, for example: `aws`
   + `Email Address []`, for example: `abcd.efgh@amazonwebservice.com`

   Make sure that this is not the same as in step 4. You can do this, for example, by changing **Organizational Unit Name** to a different value, as shown.

   Enter the following additional attributes to be sent with your certificate request.
   + `A challenge password []`, for example: `oracle123`
   + `An optional company name []`, for example: `aws`

1. Get the certificate signature.

   ```
   openssl req -noout -text -in certdms.csr | grep -i signature
   ```

   The signature algorithm shown here is `sha256WithRSAEncryption`.

1. Run the following command to generate the certificate (`.crt`) file.

   ```
   openssl x509 -req -in certdms.csr -CA self-rootCA.pem -CAkey self-rootCA.key 
   -CAcreateserial -out certdms.crt -days 365 -sha256
   ```

   This displays output like the following.

   ```
   Signature ok
   subject=/C=AU/ST=NSW/L=Sydney/O=awsweb/OU=DBeng/CN=aws
   Getting CA Private Key
   ```

1. Add the certificate to the wallet.

   ```
   orapki wallet add -wallet /u01/app/oracle/wallet -pwd oracle123 -user_cert -cert certdms.crt
   ```

1. View the wallet. It should have two entries, as the following command shows.

   ```
   orapki wallet display -wallet /u01/app/oracle/wallet -pwd oracle123
   ```

1. Configure the `sqlnet.ora` file (`$ORACLE_HOME/network/admin/sqlnet.ora`).

   ```
   WALLET_LOCATION =
      (SOURCE =
        (METHOD = FILE)
        (METHOD_DATA =
          (DIRECTORY = /u01/app/oracle/wallet/)
        )
      ) 
   
   SQLNET.AUTHENTICATION_SERVICES = (NONE)
   SSL_VERSION = 1.0
   SSL_CLIENT_AUTHENTICATION = FALSE
   SSL_CIPHER_SUITES = (SSL_RSA_WITH_AES_256_CBC_SHA)
   ```

1. Stop the Oracle listener.

   ```
   lsnrctl stop
   ```

1. Add entries for SSL in the `listener.ora` file (`$ORACLE_HOME/network/admin/listener.ora`).

   ```
   SSL_CLIENT_AUTHENTICATION = FALSE
   WALLET_LOCATION =
     (SOURCE =
       (METHOD = FILE)
       (METHOD_DATA =
         (DIRECTORY = /u01/app/oracle/wallet/)
       )
     )
   
   SID_LIST_LISTENER =
    (SID_LIST =
     (SID_DESC =
      (GLOBAL_DBNAME = SID)
      (ORACLE_HOME = ORACLE_HOME)
      (SID_NAME = SID)
     )
    )
   
   LISTENER =
     (DESCRIPTION_LIST =
       (DESCRIPTION =
         (ADDRESS = (PROTOCOL = TCP)(HOST = localhost.localdomain)(PORT = 1521))
         (ADDRESS = (PROTOCOL = TCPS)(HOST = localhost.localdomain)(PORT = 1522))
         (ADDRESS = (PROTOCOL = IPC)(KEY = EXTPROC1521))
       )
     )
   ```

1. Configure the `tnsnames.ora` file (`$ORACLE_HOME/network/admin/tnsnames.ora`).

   ```
   <SID>=
   (DESCRIPTION=
           (ADDRESS_LIST = 
                   (ADDRESS=(PROTOCOL = TCP)(HOST = localhost.localdomain)(PORT = 1521))
           )
           (CONNECT_DATA =
                   (SERVER = DEDICATED)
                   (SERVICE_NAME = <SID>)
           )
   )
   
   <SID>_ssl=
   (DESCRIPTION=
           (ADDRESS_LIST = 
                   (ADDRESS=(PROTOCOL = TCPS)(HOST = localhost.localdomain)(PORT = 1522))
           )
           (CONNECT_DATA =
                   (SERVER = DEDICATED)
                   (SERVICE_NAME = <SID>)
           )
   )
   ```

1. Restart the Oracle listener.

   ```
   lsnrctl start
   ```

1. Show the Oracle listener status.

   ```
   lsnrctl status
   ```

1. Test the SSL connection to the database from localhost using sqlplus and the SSL tnsnames entry.

   ```
   sqlplus -L ORACLE_USER@SID_ssl
   ```

1. Verify that you successfully connected using SSL.

   ```
   SELECT SYS_CONTEXT('USERENV', 'network_protocol') FROM DUAL;
   
   SYS_CONTEXT('USERENV','NETWORK_PROTOCOL')
   --------------------------------------------------------------------------------
   tcps
   ```

1. Change directory to the directory with the self-signed certificate.

   ```
   cd /u01/app/oracle/self_signed_cert
   ```

1. Create a new client Oracle wallet for AWS DMS to use.

   ```
   orapki wallet create -wallet ./ -auto_login_only
   ```

1. Add the self-signed root certificate to the Oracle wallet.

   ```
   orapki wallet add -wallet ./ -trusted_cert -cert self-rootCA.pem -auto_login_only
   ```

1. List the contents of the Oracle wallet for AWS DMS to use. The list should include the self-signed root certificate.

   ```
   orapki wallet display -wallet ./
   ```

   This has output like the following.

   ```
   Trusted Certificates:
   Subject:        CN=aws,OU=DBeng,O=AmazonWebService,L=Sydney,ST=NSW,C=AU
   ```

1. Upload the Oracle wallet that you just created to AWS DMS.

## Supported encryption methods for using Oracle as a source for AWS DMS
<a name="CHAP_Source.Oracle.Encryption"></a>

In the following table, you can find the transparent data encryption (TDE) methods that AWS DMS supports when working with an Oracle source database. 


| Redo logs access method | TDE tablespace | TDE column | 
| --- | --- | --- | 
| Oracle LogMiner | Yes | Yes | 
| Binary Reader | Yes | Yes | 

AWS DMS supports Oracle TDE when using Binary Reader, on both the column level and the tablespace level. To use TDE encryption with AWS DMS, first identify the Oracle wallet location where the TDE encryption key and TDE password are stored. Then identify the correct TDE encryption key and password for your Oracle source endpoint.

**To identify and specify encryption key and password for TDE encryption**

1. Run the following query to find the Oracle encryption wallet on the Oracle database host.

   ```
   SQL> SELECT WRL_PARAMETER FROM V$ENCRYPTION_WALLET;
   
   WRL_PARAMETER
   --------------------------------------------------------------------------------
   /u01/oracle/product/12.2.0/dbhome_1/data/wallet/
   ```

   Here, `/u01/oracle/product/12.2.0/dbhome_1/data/wallet/` is the wallet location.

1. Get the master key ID for either a non-CDB or a CDB source, as follows:

   1. For a non-CDB source, run the following query to retrieve the master encryption key ID:

      ```
      SQL>  select rownum, key_id, activation_time from v$encryption_keys;
      
      ROWNUM KEY_ID                                                 ACTIVATION_TIME
      ------ ------------------------------------------------------ ---------------
           1 AeKask0XZU+NvysflCYBEVwAAAAAAAAAAAAAAAAAAAAAAAAAAAAA   04-SEP-24 10.20.56.605200 PM +00:00
           2 AV7WU9uhoU8rv8daE/HNnSwAAAAAAAAAAAAAAAAAAAAAAAAAAAAA   10-AUG-21 07.52.03.966362 PM +00:00
           3 AckpoJ/f+k8xvzJ+gSuoVH4AAAAAAAAAAAAAAAAAAAAAAAAAAAAA   14-SEP-20 09.26.29.048870 PM +00:00
      ```

      The activation time is useful if you plan to start CDC from a point in the past. For example, using the preceding results, you can start CDC from a point between 10-AUG-21 07.52.03 PM and 04-SEP-24 10.20.56 PM using the master key ID in ROWNUM 2. When the task reaches redo generated on or after 04-SEP-24 10.20.56 PM, it fails. You must then modify the source endpoint, provide the master key ID in ROWNUM 1, and resume the task.

   1. For a CDB source, DMS requires the CDB\$ROOT master encryption key. Connect to CDB\$ROOT and run the following query:

      ```
      SQL> select rownum, key_id, activation_time from v$encryption_keys where con_id = 1;
      
      ROWNUM KEY_ID                                               ACTIVATION_TIME
      ------ ---------------------------------------------------- -----------------------------------
           1 Aa2E/Vwb5U+zv5hCncS5ErMAAAAAAAAAAAAAAAAAAAAAAAAAAAAA 29-AUG-24 12.51.19.699060 AM +00:00
      ```

1. From the command line, list the encryption wallet entries on the source Oracle database host.

   ```
   $ mkstore -wrl /u01/oracle/product/12.2.0/dbhome_1/data/wallet/ -list
   Oracle Secret Store entries:
   ORACLE.SECURITY.DB.ENCRYPTION.AWGDC9glSk8Xv+3bVveiVSgAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
   ORACLE.SECURITY.DB.ENCRYPTION.AY1mRA8OXU9Qvzo3idU4OH4AAAAAAAAAAAAAAAAAAAAAAAAAAAAA
   ORACLE.SECURITY.DB.ENCRYPTION.MASTERKEY
   ORACLE.SECURITY.ID.ENCRYPTION.
   ORACLE.SECURITY.KB.ENCRYPTION.
   ORACLE.SECURITY.KM.ENCRYPTION.AY1mRA8OXU9Qvzo3idU4OH4AAAAAAAAAAAAAAAAAAAAAAAAAAAAA
   ```

   Find the entry containing the master key ID that you found in step 2 (`AWGDC9glSk8Xv+3bVveiVSg`). This entry is the TDE encryption key name.

1. View the details of the entry that you found in the previous step.

   ```
   $ mkstore -wrl /u01/oracle/product/12.2.0/dbhome_1/data/wallet/ -viewEntry ORACLE.SECURITY.DB.ENCRYPTION.AWGDC9glSk8Xv+3bVveiVSgAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
   Oracle Secret Store Tool : Version 12.2.0.1.0
   Copyright (c) 2004, 2016, Oracle and/or its affiliates. All rights reserved.
   Enter wallet password:
   ORACLE.SECURITY.DB.ENCRYPTION.AWGDC9glSk8Xv+3bVveiVSgAAAAAAAAAAAAAAAAAAAAAAAAAAAAA = AEMAASAASGYs0phWHfNt9J5mEMkkegGFiD4LLfQszDojgDzbfoYDEACv0x3pJC+UGD/PdtE2jLIcBQcAeHgJChQGLA==
   ```

   Enter the wallet password to see the result.

   Here, the value to the right of `'='` is the TDE password.

1. Specify the TDE encryption key name for the Oracle source endpoint by setting the `securityDbEncryptionName` extra connection attribute.

   ```
   securityDbEncryptionName=ORACLE.SECURITY.DB.ENCRYPTION.AWGDC9glSk8Xv+3bVveiVSgAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
   ```

1. Provide the associated TDE password for this key on the console as part of the Oracle source's **Password** value. Use the following order to format the comma-separated password values, ending with the TDE password value.

   ```
   Oracle_db_password,ASM_Password,AEMAASAASGYs0phWHfNt9J5mEMkkegGFiD4LLfQszDojgDzbfoYDEACv0x3pJC+UGD/PdtE2jLIcBQcAeHgJChQGLA==
   ```

   Specify the password values in this order regardless of your Oracle database configuration. For example, if you're using TDE but your Oracle database isn't using ASM, specify password values in the following comma-separated order.

   ```
   Oracle_db_password,,AEMAASAASGYs0phWHfNt9J5mEMkkegGFiD4LLfQszDojgDzbfoYDEACv0x3pJC+UGD/PdtE2jLIcBQcAeHgJChQGLA==
   ```
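
   If you use the AWS CLI instead of the console, you can set both values when modifying the endpoint, as the following sketch shows. The endpoint ARN is a placeholder; the attribute and password values are the ones from the preceding steps.

   ```
   aws dms modify-endpoint \
       --endpoint-arn arn:aws:dms:us-east-1:123456789012:endpoint:EXAMPLE \
       --extra-connection-attributes "securityDbEncryptionName=ORACLE.SECURITY.DB.ENCRYPTION.AWGDC9glSk8Xv+3bVveiVSgAAAAAAAAAAAAAAAAAAAAAAAAAAAAA" \
       --password "Oracle_db_password,,AEMAASAASGYs0phWHfNt9J5mEMkkegGFiD4LLfQszDojgDzbfoYDEACv0x3pJC+UGD/PdtE2jLIcBQcAeHgJChQGLA=="
   ```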

If the TDE credentials you specify are incorrect, the AWS DMS migration task does not fail. However, the task also does not read or apply ongoing replication changes to the target database. After starting the task, monitor **Table statistics** on the console migration task page to make sure changes are replicated.

If a DBA changes the TDE credential values for the Oracle database while the task is running, the task fails. The error message contains the new TDE encryption key name. To specify new values and restart the task, use the preceding procedure.

**Important**  
You can’t manipulate a TDE wallet created in an Oracle Automatic Storage Management (ASM) location because OS level commands like `cp`, `mv`, `orapki`, and `mkstore` corrupt the wallet files stored in an ASM location. This restriction is specific to TDE wallet files stored in an ASM location only, but not for TDE wallet files stored in a local OS directory.  
To manipulate a TDE wallet stored in ASM with OS level commands, create a local keystore and merge the ASM keystore into the local keystore as follows:   
Create a local keystore.  

   ```
   ADMINISTER KEY MANAGEMENT CREATE KEYSTORE 'file_system_wallet_location' IDENTIFIED BY wallet_password;
   ```
Merge the ASM keystore into the local keystore.  

   ```
   ADMINISTER KEY MANAGEMENT MERGE KEYSTORE 'ASM_wallet_location' IDENTIFIED BY ASM_wallet_password INTO EXISTING KEYSTORE 'file_system_wallet_location' IDENTIFIED BY wallet_password WITH BACKUP;
   ```
Then, to list the encryption wallet entries and TDE password, run steps 3 and 4 against the local keystore.

## Supported compression methods for using Oracle as a source for AWS DMS
<a name="CHAP_Source.Oracle.Compression"></a>

In the following table, you can find which compression methods AWS DMS supports when working with an Oracle source database. As the table shows, compression support depends both on your Oracle database version and whether DMS is configured to use Oracle LogMiner to access the redo logs.


| Version | Basic | OLTP |  HCC (from Oracle 11g R2 or newer)  | Others | 
| --- | --- | --- | --- | --- | 
| Oracle 10 | No | N/A | N/A | No | 
| Oracle 11 or newer – Oracle LogMiner | Yes | Yes | Yes  | Yes – Any compression method supported by Oracle LogMiner. | 
| Oracle 11 or newer – Binary Reader | Yes | Yes | Yes – For more information, see the following note. | Yes | 

**Note**  
When the Oracle source endpoint is configured to use Binary Reader, the Query Low level of the HCC compression method is supported for full-load tasks only.

## Replicating nested tables using Oracle as a source for AWS DMS
<a name="CHAP_Source.Oracle.NestedTables"></a>

AWS DMS supports the replication of Oracle tables containing columns that are nested tables or defined types. To enable this functionality, add the following extra connection attribute setting to the Oracle source endpoint.

```
allowSelectNestedTables=true;
```

AWS DMS creates the target tables from Oracle nested tables as regular parent and child tables on the target without a unique constraint. To access the correct data on the target, join the parent and child tables. To do this, first manually create a nonunique index on the `NESTED_TABLE_ID` column in the target child table. You can then use the `NESTED_TABLE_ID` column in the join `ON` clause together with the parent column that corresponds to the child table name. In addition, creating such an index improves performance when the target child table data is updated or deleted by AWS DMS. For an example, see [Example join for parent and child tables on the target](#CHAP_Source.Oracle.NestedTables.JoinExample).

We recommend that you configure the task to stop after a full load completes. Then, create these nonunique indexes for all the replicated child tables on the target and resume the task.
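The index-creation step described above can be scripted. The following Python sketch generates the nonunique-index DDL for a list of target child tables; the function, index-naming convention, and table name are illustrative assumptions, not part of AWS DMS.

```python
# Sketch: generate nonunique-index DDL for replicated child tables.
# The naming convention (idx_<table>_ntid) is a hypothetical example.
def nested_index_ddl(child_table: str) -> str:
    """Return a CREATE INDEX statement for the NESTED_TABLE_ID column."""
    return (
        f"CREATE INDEX idx_{child_table.lower()}_ntid "
        f"ON {child_table} (NESTED_TABLE_ID)"
    )

# NAME_KEY is the example child table used later in this section.
for child in ["NAME_KEY"]:
    print(nested_index_ddl(child))
```

You would run the generated statements on the target after the full load completes and before resuming the task.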

If a captured nested table is added to an existing parent table (captured or not captured), AWS DMS handles it correctly. However, the nonunique index for the corresponding target table isn't created. In this case, if the target child table becomes extremely large, performance might be affected. In such a case, we recommend that you stop the task, create the index, then resume the task.

After the nested tables are replicated to the target, have the DBA run a join on the parent and corresponding child tables to flatten the data.

### Prerequisites for replicating Oracle nested tables as a source
<a name="CHAP_Source.Oracle.NestedTables.Prerequisites"></a>

Ensure that you replicate parent tables for all the replicated nested tables. Include both the parent tables (the tables containing the nested table column) and the child (that is, nested) tables in the AWS DMS table mappings.

### Supported Oracle nested table types as a source
<a name="CHAP_Source.Oracle.NestedTables.Types"></a>

AWS DMS supports the following Oracle nested table types as a source:
+ Data type
+ User-defined object

### Limitations of AWS DMS support for Oracle nested tables as a source
<a name="CHAP_Source.Oracle.NestedTables.Limitations"></a>

AWS DMS has the following limitations in its support of Oracle nested tables as a source:
+ AWS DMS supports only one level of table nesting.
+ AWS DMS table mapping does not check that both the parent and child table or tables are selected for replication. That is, it's possible to select a parent table without a child or a child table without a parent.

### How AWS DMS replicates Oracle nested tables as a source
<a name="CHAP_Source.Oracle.NestedTables.HowReplicated"></a>

AWS DMS replicates parent and nested tables to the target as follows:
+ AWS DMS creates the parent table identical to the source. It then defines the nested column in the parent as `RAW(16)` and includes a reference to the parent's nested tables in its `NESTED_TABLE_ID` column.
+ AWS DMS creates the child table identical to the nested source, but with an additional column named `NESTED_TABLE_ID`. This column has the same type and value as the corresponding parent nested column and has the same meaning.

### Example join for parent and child tables on the target
<a name="CHAP_Source.Oracle.NestedTables.JoinExample"></a>

To flatten the parent table, run a join between the parent and child tables, as shown in the following example:

1. Create the `Type` table.

   ```
   CREATE OR REPLACE TYPE NESTED_TEST_T AS TABLE OF VARCHAR(50);
   ```

1. Create the parent table with a column of type `NESTED_TEST_T` as defined preceding.

   ```
   CREATE TABLE NESTED_PARENT_TEST (ID NUMBER(10,0) PRIMARY KEY, NAME NESTED_TEST_T) NESTED TABLE NAME STORE AS NAME_KEY;
   ```

1. Flatten the table `NESTED_PARENT_TEST` using a join with the `NAME_KEY` child table where `CHILD.NESTED_TABLE_ID` matches `PARENT.NAME`.

   ```
   SELECT … FROM NESTED_PARENT_TEST PARENT, NAME_KEY CHILD WHERE CHILD.NESTED_TABLE_ID = PARENT.NAME;
   ```

## Storing REDO on Oracle ASM when using Oracle as a source for AWS DMS
<a name="CHAP_Source.Oracle.REDOonASM"></a>

For Oracle sources that generate a high volume of REDO, storing REDO on Oracle ASM can improve performance, especially in a RAC configuration, because you can configure DMS to distribute ASM REDO reads across all ASM nodes.

To use this configuration, set the `asmServer` connection attribute. For example, the following connection string distributes DMS REDO reads across three ASM nodes:

```
asmServer=(DESCRIPTION=(CONNECT_TIMEOUT=8)(ENABLE=BROKEN)(LOAD_BALANCE=ON)(FAILOVER=ON)
(ADDRESS_LIST=
(ADDRESS=(PROTOCOL=tcp)(HOST=asm_node1_ip_address)(PORT=asm_node1_port_number))
(ADDRESS=(PROTOCOL=tcp)(HOST=asm_node2_ip_address)(PORT=asm_node2_port_number))
(ADDRESS=(PROTOCOL=tcp)(HOST=asm_node3_ip_address)(PORT=asm_node3_port_number)))
(CONNECT_DATA=(SERVICE_NAME=+ASM)))
```
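As an illustration, a connect descriptor of this shape can be assembled from a list of ASM node addresses. This Python sketch is hypothetical; the host names and port value are placeholders you would replace with your own ASM node details.

```python
# Sketch: build an asmServer connect descriptor from (host, port) pairs.
# Hosts and ports below are placeholders, not real endpoints.
def asm_server(nodes):
    """Return a connect descriptor that load-balances across ASM nodes."""
    addresses = "".join(
        f"(ADDRESS=(PROTOCOL=tcp)(HOST={host})(PORT={port}))"
        for host, port in nodes
    )
    return (
        "(DESCRIPTION=(CONNECT_TIMEOUT=8)(ENABLE=BROKEN)"
        "(LOAD_BALANCE=ON)(FAILOVER=ON)"
        f"(ADDRESS_LIST={addresses})"
        "(CONNECT_DATA=(SERVICE_NAME=+ASM)))"
    )

descriptor = asm_server(
    [("asm_node1_ip_address", 1521),
     ("asm_node2_ip_address", 1521),
     ("asm_node3_ip_address", 1521)]
)
print(descriptor)
```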

When using NFS to store Oracle REDO, make sure that the applicable Direct NFS (dNFS) client patches are applied, specifically any patch addressing Oracle bug 25224242. For additional information, see the Oracle publication on Direct NFS client related patches, [Recommended Patches for Direct NFS Client](https://support.oracle.com/knowledge/Oracle%20Cloud/1495104_1.html). 

Additionally, to improve NFS read performance, we recommend that you increase the values of `rsize` and `wsize` in `fstab` for the NFS volume, as shown in the following example.

```
NAS_name_here:/ora_DATA1_archive /u09/oradata/DATA1 nfs rw,bg,hard,nointr,tcp,nfsvers=3,_netdev,
timeo=600,rsize=262144,wsize=262144
```

Also, adjust the `tcp-max-xfer-size` value as follows:

```
vserver nfs modify -vserver vserver -tcp-max-xfer-size 262144
```

## Endpoint settings when using Oracle as a source for AWS DMS
<a name="CHAP_Source.Oracle.ConnectionAttrib"></a>

You can use endpoint settings to configure your Oracle source database, similar to using extra connection attributes. You specify the settings when you create the source endpoint using the AWS DMS console, or by using the `create-endpoint` command in the [AWS CLI](https://docs.aws.amazon.com/cli/latest/reference/dms/index.html), with the `--oracle-settings '{"EndpointSetting": "value", ...}'` JSON syntax.
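As a sketch of that JSON syntax, the settings payload can be assembled programmatically before being passed to the CLI. The helper function and the particular setting values below are illustrative assumptions, not a prescribed workflow.

```python
import json

# Sketch: serialize endpoint settings into the JSON string that the
# AWS CLI --oracle-settings option expects. Values here are examples only.
def oracle_settings(**settings) -> str:
    """Return a compact JSON object of endpoint settings."""
    return json.dumps(settings)

payload = oracle_settings(
    UseLogminerReader=False,   # use Binary Reader instead of LogMiner
    UseBfile=True,
    ReadAheadBlocks=150000,
)
print(payload)
# Pass the result as: aws dms create-endpoint ... --oracle-settings '<payload>'
```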

The following table shows the endpoint settings that you can use with Oracle as a source.


| Name | Description | 
| --- | --- | 
| AccessAlternateDirectly |  Set this attribute to false in order to use the Binary Reader to capture change data for an Amazon RDS for Oracle as the source. This tells the DMS instance to not access redo logs through any specified path prefix replacement using direct file access. For more information, see [Configuring a CDC task to use Binary Reader with an RDS for Oracle source for AWS DMS](#CHAP_Source.Oracle.Amazon-Managed.CDC). Default value: true  Valid values: true/false Example: `--oracle-settings '{"AccessAlternateDirectly": false}'`  | 
|  `AdditionalArchivedLogDestId`  |  Set this attribute with `ArchivedLogDestId` in a primary-Standby setup. This attribute is useful in a switchover when Oracle Data Guard database is used as a source. In this case, AWS DMS needs to know which destination to get archive redo logs from to read changes. This is because the previous primary is now a Standby instance after switchover. Although AWS DMS supports the use of the Oracle `RESETLOGS` option to open the database, never use `RESETLOGS` unless necessary. For additional information about `RESETLOGS`, see [RMAN Data Repair Concepts](https://docs.oracle.com/en/database/oracle/oracle-database/19/bradv/rman-data-repair-concepts.html#GUID-1805CCF7-4AF2-482D-B65A-998192F89C2B) in the *Oracle® Database Backup and Recovery User's Guide*. Valid values : Archive destination Ids Example: `--oracle-settings '{"AdditionalArchivedLogDestId": 2}'`  | 
|  `AddSupplementalLogging`  |  Set this attribute to set up table-level supplemental logging for the Oracle database. This attribute enables one of the following on all tables selected for a migration task, depending on table metadata: [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/dms/latest/userguide/CHAP_Source.Oracle.html) Default value: false  Valid values: true/false  Example: `--oracle-settings '{"AddSupplementalLogging": false}'`  If you use this option, you still need to enable database-level supplemental logging as discussed previously.    | 
|  `AllowSelectNestedTables`  |  Set this attribute to true to enable replication of Oracle tables containing columns that are nested tables or defined types. For more information, see [Replicating nested tables using Oracle as a source for AWS DMS](#CHAP_Source.Oracle.NestedTables). Default value: false  Valid values: true/false Example: `--oracle-settings '{"AllowSelectNestedTables": true}'`  | 
|  `ArchivedLogDestId`  |  Specifies the ID of the destination for the archived redo logs. This value should be the same as a number in the dest_id column of the v$archived_log view. If you work with an additional redo log destination, we recommend that you use the `AdditionalArchivedLogDestId` attribute to specify the additional destination ID. Doing this improves performance by ensuring that the correct logs are accessed from the outset.  Default value: 1 Valid values: Number  Example: `--oracle-settings '{"ArchivedLogDestId": 1}'`  | 
|  `ArchivedLogsOnly`  |  When this field is set to Y, AWS DMS only accesses the archived redo logs. If the archived redo logs are stored on Oracle ASM only, the AWS DMS user account needs to be granted ASM privileges.  Default value: N  Valid values: Y/N  Example: `--oracle-settings '{"ArchivedLogsOnly": "Y"}'`  | 
|  `asmUsePLSQLArray` (ECA Only)  |  Use this extra connection attribute (ECA) when capturing source changes with Binary Reader. This setting allows DMS to buffer 50 reads at ASM level per single read thread while controlling the number of threads using the `parallelASMReadThreads` attribute. When you set this attribute, AWS DMS Binary Reader uses an anonymous PL/SQL block to capture redo data and send it back to the replication instance as a large buffer. This reduces the number of round trips to the source. This can significantly improve source capture performance, but it does result in higher PGA memory consumption on the ASM instance. Stability issues might arise if the memory target is not sufficient. You can use the following formula to estimate the total ASM instance PGA memory usage by a single DMS task: `number_of_redo_threads * parallelASMReadThreads * 7 MB` Default value: false Valid values: true/false ECA Example: `asmUsePLSQLArray=true;`  | 
|  `ConvertTimestampWithZoneToUTC`  |  Set this attribute to `true` to convert the timestamp value of 'TIMESTAMP WITH TIME ZONE' and 'TIMESTAMP WITH LOCAL TIME ZONE' columns to UTC. By default the value of this attribute is 'false' and the data will be replicated using the source database timezone. Default value: false Valid values: true/false Example: `--oracle-settings '{"ConvertTimestampWithZoneToUTC": true}'`  | 
|  `EnableHomogenousPartitionOps`  |  Set this attribute to `true` to enable replication of Oracle partition and subpartition DDL operations for Oracle *homogeneous* migrations. This feature was introduced in AWS DMS version 3.4.7. Default value: false Valid values: true/false Example: `--oracle-settings '{"EnableHomogenousPartitionOps": true}'`  | 
|  `EnableHomogenousTablespace`  |  Set this attribute to enable homogenous tablespace replication and create existing tables or indexes under the same tablespace on the target. Default value: false Valid values: true/false Example: `--oracle-settings '{"EnableHomogenousTablespace": true}'`  | 
|  `EscapeCharacter`  |  Set this attribute to an escape character. This escape character allows you to make a single wildcard character behave as a normal character in table mapping expressions. For more information, see [Wildcards in table mapping](CHAP_Tasks.CustomizingTasks.TableMapping.SelectionTransformation.Wildcards.md). Default value: Null  Valid values: Any character other than a wildcard character Example: `--oracle-settings '{"EscapeCharacter": "#"}'` You can only use `escapeCharacter` for table names. It does not escape characters from schema names or column names.  | 
|  `ExposeViews`  |  Use this attribute to pull data once from a view; you can't use it for ongoing replication. When you extract data from a view, the view is shown as a table on the target schema. Default value: false Valid values: true/false Example: `--oracle-settings '{"ExposeViews": true}'`  | 
|  `ExtraArchivedLogDestIds`  |  Specifies the IDs of one or more destinations for archived redo logs. These IDs are the values of the dest_id column in the v$archived_log view. Use this setting with the `ArchivedLogDestId` extra connection attribute in a primary-to-single-standby or primary-to-multiple-standby setup. This setting is useful in a switchover when you use an Oracle Data Guard database as a source. In this case, AWS DMS needs information about what destination to get archive redo logs from to read changes. AWS DMS needs this because after the switchover the previous primary is a standby instance. Valid values: Archive destination IDs Example: `--oracle-settings '{"ExtraArchivedLogDestIds": [2]}'`  | 
|  `FailTasksOnLobTruncation`  |  When set to `true`, this attribute causes a task to fail if the actual size of an LOB column is greater than the specified `LobMaxSize`. If a task is set to limited LOB mode and this option is set to `true`, the task fails instead of truncating the LOB data. Default value: false  Valid values: Boolean  Example: `--oracle-settings '{"FailTasksOnLobTruncation": true}'`  | 
|  `filterTransactionsOfUser` (ECA Only)  |  Use this extra connection attribute (ECA) to allow DMS to ignore transactions from a specified user when replicating data from Oracle using LogMiner. You can pass comma-separated user names, but they must be all uppercase. ECA Example: `filterTransactionsOfUser=USERNAME;`  | 
|  `NumberDataTypeScale`  |  Specifies the number scale. You can select a scale up to 38, or you can select -1 for FLOAT, or -2 for VARCHAR. By default, the NUMBER data type is converted to precision 38, scale 10. Default value: 10  Valid values: -2 to 38 (-2 for VARCHAR, -1 for FLOAT) Example: `--oracle-settings '{"NumberDataTypeScale": 12}'`  Select a precision-scale combination, -1 (FLOAT), or -2 (VARCHAR). DMS supports any precision-scale combination supported by Oracle. If precision is 39 or above, select -2 (VARCHAR). The NumberDataTypeScale setting for the Oracle database is used for the NUMBER data type only (without the explicit precision and scale definition). Note that loss of precision can occur when this setting is configured incorrectly.   | 
|  `OpenTransactionWindow`  |   Provides the timeframe in minutes to check for any open transactions for a CDC-only task. When you set `OpenTransactionWindow` to 1 or higher, DMS uses `SCN_TO_TIMESTAMP` to convert SCN values to timestamp values. Because of Oracle Database limitations, if you specify an SCN that is too old as the CDC start point, `SCN_TO_TIMESTAMP` fails with an `ORA-08181` error, and you can't start CDC-only tasks. Default value: 0  Valid values: An integer from 0 to 240 Example: `--oracle-settings '{"OpenTransactionWindow": 15}'`  | 
| OraclePathPrefix | Set this string attribute to the required value in order to use the Binary Reader to capture change data for an Amazon RDS for Oracle source. This value specifies the default Oracle root used to access the redo logs. For more information, see [Configuring a CDC task to use Binary Reader with an RDS for Oracle source for AWS DMS](#CHAP_Source.Oracle.Amazon-Managed.CDC). Default value: none Valid value: /rdsdbdata/db/ORCL_A/ Example: `--oracle-settings '{"OraclePathPrefix": "/rdsdbdata/db/ORCL_A/"}'`  | 
| ParallelASMReadThreads |  Set this attribute to change the number of threads that DMS configures to perform change data capture (CDC) using Oracle Automatic Storage Management (ASM). You can specify an integer value between 2 (the default) and 8 (the maximum). Use this attribute together with the `ReadAheadBlocks` attribute. For more information, see [Configuring a CDC task to use Binary Reader with an RDS for Oracle source for AWS DMS](#CHAP_Source.Oracle.Amazon-Managed.CDC). Default value: 2  Valid values: An integer from 2 to 8 Example: `--oracle-settings '{"ParallelASMReadThreads": 6}'`  | 
| ReadAheadBlocks |  Set this attribute to change the number of read-ahead blocks that DMS configures to perform CDC using Oracle Automatic Storage Management (ASM) and non-ASM NAS storage. You can specify an integer value between 1000 (the default) and 2,000,000 (the maximum). Use this attribute together with the `ParallelASMReadThreads` attribute. For more information, see [Configuring a CDC task to use Binary Reader with an RDS for Oracle source for AWS DMS](#CHAP_Source.Oracle.Amazon-Managed.CDC). Default value: 1000  Valid values: An integer from 1000 to 2,000,000 Example: `--oracle-settings '{"ReadAheadBlocks": 150000}'`  | 
|  `ReadTableSpaceName`  |  When set to `true`, this attribute supports tablespace replication. Default value: false  Valid values: Boolean  Example: `--oracle-settings '{"ReadTableSpaceName": true}'`  | 
| ReplacePathPrefix | Set this attribute to true in order to use the Binary Reader to capture change data for an Amazon RDS for Oracle source. This setting tells the DMS instance to replace the default Oracle root with the specified `UsePathPrefix` setting to access the redo logs. For more information, see [Configuring a CDC task to use Binary Reader with an RDS for Oracle source for AWS DMS](#CHAP_Source.Oracle.Amazon-Managed.CDC). Default value: false Valid values: true/false Example: `--oracle-settings '{"ReplacePathPrefix": true}'`  | 
|  `RetryInterval`  |  Specifies the number of seconds that the system waits before resending a query.  Default value: 5  Valid values: Numbers starting from 1  Example: `--oracle-settings '{"RetryInterval": 6}'`  | 
|  `SecurityDbEncryptionName`  |  Specifies the name of a key used for the transparent data encryption (TDE) of the columns and tablespace in the Oracle source database. For more information on setting this attribute and its associated password on the Oracle source endpoint, see [Supported encryption methods for using Oracle as a source for AWS DMS](#CHAP_Source.Oracle.Encryption). Default value: ""  Valid values: String  Example: `--oracle-settings '{"SecurityDbEncryptionName": "ORACLE.SECURITY.DB.ENCRYPTION.Adg8m2dhkU/0v/m5QUaaNJEAAAAAAAAAAAAAAAAAAAAAAAAAAAAA"}'`  | 
|  `SpatialSdo2GeoJsonFunctionName`  |  For Oracle version 12.1 or earlier sources migrating to PostgreSQL targets, use this attribute to convert SDO_GEOMETRY to GEOJSON format. By default, AWS DMS calls the `SDO2GEOJSON` custom function, which must be present and accessible to the AWS DMS user. Or you can create your own custom function that mimics the operation of `SDO2GEOJSON` and set `SpatialSdo2GeoJsonFunctionName` to call it instead.  Default value: SDO2GEOJSON Valid values: String  Example: `--oracle-settings '{"SpatialSdo2GeoJsonFunctionName": "myCustomSDO2GEOJSONFunction"}'`  | 
|  `StandbyDelayTime`  |  Use this attribute to specify a time in minutes for the delay in standby sync. If the source is an Active Data Guard standby database, use this attribute to specify the time lag between primary and standby databases. In AWS DMS, you can create an Oracle CDC task that uses an Active Data Guard standby instance as a source for replicating ongoing changes. Doing this eliminates the need to connect to an active database that might be in production. Default value: 0  Valid values: Number  Example: `--oracle-settings '{"StandbyDelayTime": 1}'` **Note:** When using AWS DMS version 3.4.6 or higher, use of this connection setting is optional. In those versions, `dms_user` should have `select` permission on `V_$DATAGUARD_STATS`, allowing DMS to calculate the standby delay time.  | 
| UseAlternateFolderForOnline | Set this attribute to true in order to use the Binary Reader to capture change data for an Amazon RDS for Oracle source. This tells the DMS instance to use any specified prefix replacement to access all online redo logs. For more information, see [Configuring a CDC task to use Binary Reader with an RDS for Oracle source for AWS DMS](#CHAP_Source.Oracle.Amazon-Managed.CDC). Default value: false Valid values: true/false Example: `--oracle-settings '{"UseAlternateFolderForOnline": true}'`  | 
| UseBfile |  Set this attribute to Y in order to capture change data using the Binary Reader utility. To set this attribute to Y, also set `UseLogminerReader` to N. To use the Binary Reader with an Amazon RDS for Oracle source, you set additional attributes. For more information on this setting and using Oracle Automatic Storage Management (ASM), see [Using Oracle LogMiner or AWS DMS Binary Reader for CDC](#CHAP_Source.Oracle.CDC). Note: When setting this value as an extra connection attribute (ECA), valid values are Y and N. When setting this value as an endpoint setting, valid values are `true` and `false`. Default value: N  Valid values: Y/N (as an ECA); true/false (as an endpoint setting) Example: `--oracle-settings '{"UseBfile": true}'`  | 
|  `UseLogminerReader`  |  Set this attribute to Y to capture change data using the LogMiner utility (the default). Set this option to N if you want AWS DMS to access the redo logs as a binary file. When you set this option to N, also set `UseBfile` to Y. For more information on this setting and using Oracle Automatic Storage Management (ASM), see [Using Oracle LogMiner or AWS DMS Binary Reader for CDC](#CHAP_Source.Oracle.CDC). Note: When setting this value as an extra connection attribute (ECA), valid values are Y and N. When setting this value as an endpoint setting, valid values are `true` and `false`. Default value: Y  Valid values: Y/N (as an ECA); true/false (as an endpoint setting) Example: `--oracle-settings '{"UseLogminerReader": true}'`  | 
| UsePathPrefix | Set this string attribute to the required value in order to use the Binary Reader to capture change data for an Amazon RDS for Oracle source. This value specifies the path prefix used to replace the default Oracle root to access the redo logs. For more information, see [Configuring a CDC task to use Binary Reader with an RDS for Oracle source for AWS DMS](#CHAP_Source.Oracle.Amazon-Managed.CDC). Default value: none Valid value: /rdsdbdata/log/ Example: `--oracle-settings '{"UsePathPrefix": "/rdsdbdata/log/"}'`  | 

## Source data types for Oracle
<a name="CHAP_Source.Oracle.DataTypes"></a>

The Oracle endpoint for AWS DMS supports most Oracle data types. The following table shows the Oracle source data types that are supported when using AWS DMS and the default mapping to AWS DMS data types.

**Note**  
With the exception of the LONG and LONG RAW data types, when replicating from an Oracle source to an Oracle target (a *homogeneous replication*), all of the source and target data types will be identical. But the LONG data type will be mapped to CLOB and the LONG RAW data type will be mapped to BLOB. 

For information on how to view the data type that is mapped in the target, see the section for the target endpoint you are using.

For additional information about AWS DMS data types, see [Data types for AWS Database Migration Service](CHAP_Reference.DataTypes.md).


|  Oracle data type  |  AWS DMS data type  | 
| --- | --- | 
|  BINARY_FLOAT  |  REAL4  | 
|  BINARY_DOUBLE  |  REAL8  | 
|  BINARY  |  BYTES  | 
|  FLOAT (P)  |  If precision is less than or equal to 24, use REAL4. If precision is greater than 24, use REAL8.  | 
|  NUMBER (P,S)  |  When scale is greater than 0, use NUMERIC. When scale is 0: [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/dms/latest/userguide/CHAP_Source.Oracle.html) When scale is less than 0, use REAL8. | 
|  DATE  |  DATETIME  | 
|  INTERVAL YEAR TO MONTH  |  STRING (with interval year_to_month indication)  | 
|  INTERVAL DAY TO SECOND  |  STRING (with interval day_to_second indication)  | 
|  TIMESTAMP  |  DATETIME  | 
|  TIMESTAMP WITH TIME ZONE  |  STRING (with timestamp_with_timezone indication)  | 
|  TIMESTAMP WITH LOCAL TIME ZONE  |  STRING (with timestamp_with_local_timezone indication)  | 
|  CHAR  |  STRING  | 
|  VARCHAR2  |  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/dms/latest/userguide/CHAP_Source.Oracle.html)  | 
|  NCHAR  |  WSTRING  | 
|  NVARCHAR2  |  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/dms/latest/userguide/CHAP_Source.Oracle.html)  | 
|  RAW  |  BYTES  | 
|  REAL  |  REAL8  | 
|  BLOB  |  BLOB To use this data type with AWS DMS, you must enable the use of BLOB data types for a specific task. AWS DMS supports BLOB data types only in tables that include a primary key.  | 
|  CLOB  |  CLOB To use this data type with AWS DMS, you must enable the use of CLOB data types for a specific task. During CDC, AWS DMS supports CLOB data types only in tables that include a primary key.  | 
|  NCLOB  |  NCLOB To use this data type with AWS DMS, you must enable the use of NCLOB data types for a specific task. During CDC, AWS DMS supports NCLOB data types only in tables that include a primary key.  | 
|  LONG  |  CLOB The LONG data type isn't supported in batch-optimized apply mode (TurboStream CDC mode). To use this data type with AWS DMS, enable the use of LOBs for a specific task. During CDC or full load, AWS DMS supports LOB data types only in tables that have a primary key. Also, AWS DMS does not support full LOB mode for loading LONG columns. Instead, you can use limited LOB mode for migrating LONG columns to an Oracle target. In limited LOB mode, AWS DMS truncates data in LONG columns longer than 64 KB to 64 KB. For more information about LOB support in AWS DMS, see [Setting LOB support for source databases in an AWS DMS task](CHAP_Tasks.LOBSupport.md).  | 
|  LONG RAW  |  BLOB The LONG RAW data type isn't supported in batch-optimized apply mode (TurboStream CDC mode). To use this data type with AWS DMS, enable the use of LOBs for a specific task. During CDC or full load, AWS DMS supports LOB data types only in tables that have a primary key. Also, AWS DMS does not support full LOB mode for loading LONG RAW columns. Instead, you can use limited LOB mode for migrating LONG RAW columns to an Oracle target. In limited LOB mode, AWS DMS truncates data in LONG RAW columns longer than 64 KB to 64 KB. For more information about LOB support in AWS DMS, see [Setting LOB support for source databases in an AWS DMS task](CHAP_Tasks.LOBSupport.md).  | 
|  XMLTYPE  |  CLOB  | 
| SDO_GEOMETRY | BLOB (for an Oracle to Oracle migration); CLOB (for an Oracle to PostgreSQL migration) | 
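The conditional rows of the mapping table (FLOAT and NUMBER) can be expressed as a small mapping function. This Python sketch mirrors the table's stated rules and is illustrative only, not AWS DMS's actual implementation; the NUMBER scale-0 case depends on precision thresholds detailed in the linked AWS documentation, so it is left undetermined here.

```python
from typing import Optional

# Sketch of the default type-mapping rules for Oracle FLOAT and NUMBER,
# as stated in the table above. Illustrative names, not DMS internals.
def map_float(precision: int) -> str:
    """FLOAT (P): REAL4 when precision <= 24, otherwise REAL8."""
    return "REAL4" if precision <= 24 else "REAL8"

def map_number(precision: int, scale: int) -> Optional[str]:
    """NUMBER (P,S): NUMERIC when scale > 0, REAL8 when scale < 0."""
    if scale > 0:
        return "NUMERIC"
    if scale < 0:
        return "REAL8"
    # Scale 0: the integer-type choice depends on precision; see the
    # AWS documentation link in the table for the exact thresholds.
    return None

print(map_float(24), map_number(38, 10))
```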

Oracle tables used as a source with columns of the following data types are not supported and can't be replicated. Replicating columns with these data types results in a null column.
+ BFILE
+ ROWID
+ REF
+ UROWID
+ User-defined data types
+ ANYDATA
+ VARRAY

**Note**  
Virtual columns are not supported.

### Migrating Oracle spatial data types
<a name="CHAP_Source.Oracle.DataTypes.Spatial"></a>

*Spatial data* identifies the geometry information for an object or location in space. In an Oracle database, the geometric description of a spatial object is stored in an object of type SDO_GEOMETRY. Within this object, the geometric description is stored in a single row in a single column of a user-defined table. 

AWS DMS supports migrating the Oracle type SDO_GEOMETRY from an Oracle source to either an Oracle or PostgreSQL target.

When you migrate Oracle spatial data types using AWS DMS, be aware of these considerations:
+ When migrating to an Oracle target, make sure to manually transfer USER_SDO_GEOM_METADATA entries that include type information. 
+ When migrating from an Oracle source endpoint to a PostgreSQL target endpoint, AWS DMS creates target columns. These columns have default geometry and geography type information with a 2D dimension and a spatial reference identifier (SRID) equal to zero (0). An example is `GEOMETRY, 2, 0`.
+ For Oracle version 12.1 or earlier sources migrating to PostgreSQL targets, convert `SDO_GEOMETRY` objects to `GEOJSON` format by using the `SDO2GEOJSON` function, or the `spatialSdo2GeoJsonFunctionName` extra connection attribute. For more information, see [Endpoint settings when using Oracle as a source for AWS DMS](#CHAP_Source.Oracle.ConnectionAttrib).
+ AWS DMS supports Oracle Spatial Column migrations for Full LOB mode only. AWS DMS does not support Limited LOB or Inline LOB modes. For more information about LOB mode, see [Setting LOB support for source databases in an AWS DMS task](CHAP_Tasks.LOBSupport.md).
+ Because AWS DMS only supports Full LOB mode for migrating Oracle Spatial Columns, the columns' table needs a primary key and a unique key. If the table does not have a primary key and a unique key, the table is skipped from migration.

# Using a Microsoft SQL Server database as a source for AWS DMS
<a name="CHAP_Source.SQLServer"></a>

Migrate data from one or many Microsoft SQL Server databases using AWS DMS. With a SQL Server database as a source, you can migrate data to another SQL Server database, or to one of the other AWS DMS supported databases. 

For information about versions of SQL Server that AWS DMS supports as a source, see [Sources for AWS DMS](CHAP_Introduction.Sources.md).

The source SQL Server database can be installed on any computer in your network. A SQL Server account with appropriate access privileges to the source database for the type of task you chose is required for use with AWS DMS. For more information, see [Permissions for SQL Server tasks](#CHAP_Source.SQLServer.Permissions).

AWS DMS supports migrating data from named instances of SQL Server. You can use the following notation in the server name when you create the source endpoint.

```
IPAddress\InstanceName
```

For example, the following is a correct source endpoint server name. Here, the first part of the name is the IP address of the server, and the second part is the SQL Server instance name (in this example, SQLTest).

```
10.0.0.25\SQLTest
```
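As a sketch, splitting such a server name into its host and instance parts looks like the following; the function name is illustrative and not part of AWS DMS.

```python
# Sketch: split a SQL Server endpoint server name of the form
# "IPAddress\InstanceName" into its two components.
def parse_server_name(server: str):
    """Return (host, instance); instance is None for a default instance."""
    host, sep, instance = server.partition("\\")
    return (host, instance if sep else None)

print(parse_server_name(r"10.0.0.25\SQLTest"))
```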

Also, obtain the port number that your named instance of SQL Server listens on, and use it to configure your AWS DMS source endpoint. 

**Note**  
Port 1433 is the default for Microsoft SQL Server. However, SQL Server instances are often configured to use dynamic ports that change each time SQL Server starts, or specific static port numbers for connecting to SQL Server through a firewall. So, make sure that you know the actual port number of your named instance of SQL Server when you create the AWS DMS source endpoint.

You can use SSL to encrypt connections between your SQL Server endpoint and the replication instance. For more information on using SSL with a SQL Server endpoint, see [Using SSL with AWS Database Migration Service](CHAP_Security.SSL.md).

You can use CDC for ongoing migration from a SQL Server database. For information about configuring your source SQL server database for CDC, see [Capturing data changes for ongoing replication from SQL Server](CHAP_Source.SQLServer.CDC.md).

For additional details on working with SQL Server source databases and AWS DMS, see the following.

**Topics**
+ [

## Limitations on using SQL Server as a source for AWS DMS
](#CHAP_Source.SQLServer.Limitations)
+ [

## Permissions for SQL Server tasks
](#CHAP_Source.SQLServer.Permissions)
+ [

## Prerequisites for using ongoing replication (CDC) from a SQL Server source
](#CHAP_Source.SQLServer.Prerequisites)
+ [

## Supported compression methods for SQL Server
](#CHAP_Source.SQLServer.Compression)
+ [

## Working with self-managed SQL Server AlwaysOn availability groups
](#CHAP_Source.SQLServer.AlwaysOn)
+ [

## Endpoint settings when using SQL Server as a source for AWS DMS
](#CHAP_Source.SQLServer.ConnectionAttrib)
+ [

## Source data types for SQL Server
](#CHAP_Source.SQLServer.DataTypes)
+ [

# Capturing data changes for ongoing replication from SQL Server
](CHAP_Source.SQLServer.CDC.md)

## Limitations on using SQL Server as a source for AWS DMS
<a name="CHAP_Source.SQLServer.Limitations"></a>

The following limitations apply when using a SQL Server database as a source for AWS DMS:
+ The identity property for a column isn't migrated to a target database column.
+ The SQL Server endpoint doesn't support the use of tables with sparse columns.
+ Windows Authentication isn't supported.
+ Changes to computed fields in a SQL Server database aren't replicated.
+ Temporal tables aren't supported.
+ SQL Server partition switching isn't supported.
+ When using the WRITETEXT and UPDATETEXT utilities, AWS DMS doesn't capture events applied on the source database.
+ The following data manipulation language (DML) pattern isn't supported. 

  ```
  SELECT * INTO new_table FROM existing_table
  ```
+ When using SQL Server as a source, column-level encryption isn't supported.
+ AWS DMS doesn't support server-level audits on SQL Server 2008 or SQL Server 2008 R2 as sources. This is because of a known issue with SQL Server 2008 and 2008 R2. For example, running the following command causes AWS DMS to fail.

  ```
  USE [master]
  GO 
  ALTER SERVER AUDIT [my_audit_test-20140710] WITH (STATE=on)
  GO
  ```
+ Geometry and Geography columns aren't supported in full LOB mode when using SQL Server as a source. Instead, use limited LOB mode or set the `InlineLobMaxSize` task setting to use inline LOB mode.
+ When using a Microsoft SQL Server source database in a replication task, the SQL Server Replication Publisher definitions aren't removed if you remove the task. A Microsoft SQL Server system administrator must delete those definitions from Microsoft SQL Server.
+ Migrating data from schema-bound and non-schema-bound views is supported for full-load only tasks. 
+ Renaming tables using `sp_rename` isn't supported (for example, `sp_rename 'Sales.SalesRegion', 'SalesReg';`).
+ Renaming columns using `sp_rename` isn't supported (for example, `sp_rename 'Sales.Sales.Region', 'RegID', 'COLUMN';`).
+ AWS DMS doesn't support change processing to set and unset column default values (using the `ALTER COLUMN SET DEFAULT` clause with `ALTER TABLE` statements).
+ AWS DMS doesn't support change processing to set column nullability (using the `ALTER COLUMN [SET|DROP] NOT NULL` clause with `ALTER TABLE` statements).
+ With SQL Server 2012 and SQL Server 2014, when using DMS replication with Availability Groups, the distribution database can't be placed in an availability group. SQL 2016 supports placing the distribution database into an availability group, except for distribution databases used in merge, bidirectional, or peer-to-peer replication topologies.
+ For partitioned tables, AWS DMS doesn't support different data compression settings for each partition.
+ When inserting a value into SQL Server spatial data types (GEOGRAPHY and GEOMETRY), you can either ignore the spatial reference system identifier (SRID) property or specify a different number. When replicating tables with spatial data types, AWS DMS replaces the SRID with the default SRID (0 for GEOMETRY and 4326 for GEOGRAPHY).
+ If your database isn't configured for MS-REPLICATION or MS-CDC, you can still capture tables that do not have a Primary Key, but only INSERT/DELETE DML events are captured. UPDATE and TRUNCATE TABLE events are ignored.
+ Columnstore indexes aren't supported.
+ Memory-optimized tables (using In-Memory OLTP) aren't supported.
+ When replicating a table with a primary key that consists of multiple columns, updating the primary key columns during full load isn't supported.
+ Delayed durability isn't supported.
+ The `readBackupOnly=true` endpoint setting (extra connection attribute) doesn't work on RDS for SQL Server source instances because of the way RDS performs backups.
+ `EXCLUSIVE_AUTOMATIC_TRUNCATION` doesn’t work on Amazon RDS SQL Server source instances because RDS users don't have access to run the SQL Server stored procedure, `sp_repldone`.
+ AWS DMS doesn't capture truncate commands.
+ AWS DMS doesn't support replication from databases with accelerated database recovery (ADR) turned on.
+ AWS DMS doesn't support capturing data definition language (DDL) and data manipulation language (DML) statements within a single transaction.
+ AWS DMS doesn't support the replication of data-tier application packages (DACPAC).
+ UPDATE statements that involve primary keys or unique indexes and update multiple data rows can cause conflicts when you apply changes to the target database. This might happen, for example, when the target database applies updates as INSERT and DELETE statements instead of a single UPDATE statement. With the batch optimized apply mode, the table might be ignored. With the transactional apply mode, the UPDATE operation might result in constraint violations. To avoid this issue, reload the relevant table. Alternatively, locate the problematic records in the Apply Exceptions control table (`dmslogs.awsdms_apply_exceptions`) and edit them manually in the target database. For more information, see [Change processing tuning settings](CHAP_Tasks.CustomizingTasks.TaskSettings.ChangeProcessingTuning.md).
+ AWS DMS doesn't support the replication of tables and schemas, where the name includes a special character from the following set.

  `\\ -- \n \" \b \r ' \t ;` 
+ Data masking isn't supported. AWS DMS migrates masked data without masking.
+ AWS DMS replicates up to 32,767 tables with primary keys and up to 1,000 columns for each table. This is because AWS DMS creates a SQL Server replication article for each replicated table, and SQL Server replication articles have these limitations.
+ When using Change Data Capture (CDC), you must define all columns that make up a unique index as `NOT NULL`. If this requirement is not met, SQL Server system error 22838 will result. 
+ You might lose events if SQL Server archives them from the active transaction log to the backup log, or truncates them from the active transaction log, before AWS DMS can read them.
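For the unique-index requirement noted above (SQL Server system error 22838), you can check a source database ahead of time. The following catalog query, a verification sketch using standard SQL Server system views, lists unique indexes that include nullable columns:

```
-- Find nullable columns that participate in a unique index
-- (the condition that triggers SQL Server error 22838 under CDC).
SELECT t.name AS table_name, i.name AS index_name, c.name AS column_name
FROM sys.indexes i
JOIN sys.index_columns ic
  ON ic.object_id = i.object_id AND ic.index_id = i.index_id
JOIN sys.columns c
  ON c.object_id = ic.object_id AND c.column_id = ic.column_id
JOIN sys.tables t
  ON t.object_id = i.object_id
WHERE i.is_unique = 1
  AND c.is_nullable = 1
ORDER BY t.name, i.name;
```

Altering the listed columns to `NOT NULL` (or excluding the affected tables) avoids the error during CDC.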

The following limitations apply when accessing the backup transaction logs:
+ Encrypted backups aren't supported.
+ Backups stored at a URL or on Windows Azure aren't supported.
+ AWS DMS doesn't support direct processing of transaction log backups at the file level from alternative shared folders.
+ For cloud SQL Server sources other than Amazon RDS for Microsoft SQL Server, AWS DMS supports ongoing replication (CDC) with the active transaction log only. You can't use the backup log with CDC. You might lose events if SQL Server archives them from the active transaction log to the backup log, or truncates them from the active transaction log, before DMS can read them.
+ For Amazon RDS for Microsoft SQL Server sources, AWS DMS versions 3.5.2 and earlier support ongoing replication (CDC) with the active transaction log only, because DMS can't access the backup log with CDC. You might lose events if RDS for SQL Server archives them from the active transaction log to the backup log, or truncates them from the active transaction log, before DMS can read them. This limitation doesn't apply to AWS DMS version 3.5.3 and later.
+ AWS DMS does not support CDC for Amazon RDS Proxy for SQL Server as a source.
+ If the SQL Server source becomes unavailable during a full load task, AWS DMS might mark the task as completed after multiple reconnection attempts, even though the data migration remains incomplete. In this scenario, the target tables contain only the records migrated before the connection loss, potentially creating data inconsistencies between the source and target systems. To ensure data completeness, you must either restart the full load task entirely or reload the specific tables affected by the connection interruption.

## Permissions for SQL Server tasks
<a name="CHAP_Source.SQLServer.Permissions"></a>

**Topics**
+ [

### Permissions for full load only tasks
](#CHAP_Source.SQLServer.Permissions.FullLoad)
+ [

### Permissions for tasks with ongoing replication
](#CHAP_Source.SQLServer.Permissions.Ongoing)

### Permissions for full load only tasks
<a name="CHAP_Source.SQLServer.Permissions.FullLoad"></a>

The following permissions are required to perform full load only tasks. Note that AWS DMS doesn't create the `dms_user` login. For information about creating a login for SQL Server, see the [Create a database user](https://learn.microsoft.com/en-us/sql/relational-databases/security/authentication-access/create-a-database-user?view=sql-server-ver16) topic in the Microsoft documentation.

```
USE db_name;

CREATE USER dms_user FOR LOGIN dms_user;
ALTER ROLE [db_datareader] ADD MEMBER dms_user;
GRANT VIEW DATABASE STATE TO dms_user;
GRANT VIEW DEFINITION TO dms_user;

USE master;

GRANT VIEW SERVER STATE TO dms_user;
```
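To confirm that the grants took effect, you can list the effective database-level permissions for `dms_user`. This is a verification sketch using the built-in `fn_my_permissions` function; running `EXECUTE AS` requires a login with IMPERSONATE permission (for example, a sysadmin).

```
-- Impersonate dms_user and list its effective database-level permissions.
USE db_name;
EXECUTE AS USER = 'dms_user';
SELECT permission_name FROM fn_my_permissions(NULL, 'DATABASE');
REVERT;
```

The result should include CONNECT, SELECT, VIEW DATABASE STATE, and VIEW DEFINITION.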

### Permissions for tasks with ongoing replication
<a name="CHAP_Source.SQLServer.Permissions.Ongoing"></a>

Self-managed SQL Server instances can be configured for ongoing replication using DMS with or without using the `sysadmin` role. For SQL Server instances where you can't grant the `sysadmin` role, ensure that the DMS user has the privileges described as follows.

**Set up permissions for ongoing replication from a self-managed SQL Server database**

1. Create a new SQL Server account with password authentication using SQL Server Management Studio (SSMS) or as described previously in [Permissions for full load only tasks](#CHAP_Source.SQLServer.Permissions.FullLoad), for example, `self_managed_user`.

1. Run the following `GRANT` commands: 

   ```
   GRANT VIEW SERVER STATE TO self_managed_user;
   
   USE msdb;
       GRANT SELECT ON msdb.dbo.backupset TO self_managed_user;
       GRANT SELECT ON msdb.dbo.backupmediafamily TO self_managed_user;
       GRANT SELECT ON msdb.dbo.backupfile TO self_managed_user;
       
   USE db_name;
       CREATE USER self_managed_user FOR LOGIN self_managed_user;
       ALTER ROLE [db_owner] ADD MEMBER self_managed_user;
       GRANT VIEW DEFINITION to self_managed_user;
   ```

1. In addition to the preceding permissions, the user needs one of the following:
   + The user must be a member of the `sysadmin` fixed server role
   + Configurations and permissions as described in [Setting up ongoing replication on a SQL Server in an availability group environment: Without sysadmin role](CHAP_Source.SQLServer.CDC.md#CHAP_SupportScripts.SQLServer.ag) or [Setting up ongoing replication on a standalone SQL Server: Without sysadmin role](CHAP_Source.SQLServer.CDC.md#CHAP_SupportScripts.SQLServer.standalone), depending on your source configuration.

#### Set up permissions for ongoing replication from a cloud SQL Server database
<a name="CHAP_Source.SQLServer.Permissions.Cloud"></a>

A cloud-hosted SQL Server instance is an instance running on Amazon RDS for Microsoft SQL Server, an Azure SQL Managed Instance, or any other managed cloud SQL Server instance supported by DMS.

Create a new SQL Server account with password authentication using SQL Server Management Studio (SSMS) or as described previously in [Permissions for full load only tasks](#CHAP_Source.SQLServer.Permissions.FullLoad), for example, `rds_user`.

Run the following grant commands.

```
GRANT VIEW SERVER STATE TO rds_user;
```

For Amazon RDS for Microsoft SQL Server sources, DMS versions 3.5.3 and later support reading from transaction log backups. To ensure that DMS can access the log backups, in addition to the preceding, either grant `master` user privileges or grant the following privileges on an RDS SQL Server source:

```
USE msdb;
    GRANT EXEC ON msdb.dbo.rds_dms_tlog_download TO rds_user;
    GRANT EXEC ON msdb.dbo.rds_dms_tlog_read TO rds_user;
    GRANT EXEC ON msdb.dbo.rds_dms_tlog_list_current_lsn TO rds_user;
    GRANT EXEC ON msdb.dbo.rds_task_status TO rds_user;
    
USE db_name;
    CREATE USER rds_user FOR LOGIN rds_user;
    ALTER ROLE [db_owner] ADD MEMBER rds_user;
    GRANT VIEW DEFINITION to rds_user;
```

For Azure SQL Managed Instance, grant the following privileges:

```
GRANT SELECT ON msdb.dbo.backupset TO rds_user;
GRANT SELECT ON msdb.dbo.backupmediafamily TO rds_user;
GRANT SELECT ON msdb.dbo.backupfile TO rds_user;
```

## Prerequisites for using ongoing replication (CDC) from a SQL Server source
<a name="CHAP_Source.SQLServer.Prerequisites"></a>

You can use ongoing replication (change data capture, or CDC) for a self-managed SQL Server database on-premises or on Amazon EC2, or a cloud database such as Amazon RDS or a Microsoft Azure SQL managed instance.

The following requirements apply specifically when using ongoing replication with a SQL Server database as a source for AWS DMS:
+ SQL Server must be configured for full backups, and you must perform a backup before beginning to replicate data.
+ The recovery model must be set to **Bulk logged** or **Full**.
+ SQL Server backup to multiple disks isn't supported. If the backup is defined to write the database backup to multiple files over different disks, AWS DMS can't read the data and the AWS DMS task fails.
+ For self-managed SQL Server sources, SQL Server Replication Publisher definitions for the source used in a DMS CDC task aren't removed when you remove the task. A SQL Server system administrator must delete these definitions from SQL Server for self-managed sources.
+ During CDC, AWS DMS needs to look up SQL Server transaction log backups to read changes. AWS DMS doesn't support SQL Server transaction log backups created using third-party backup software that *aren't* in native format. To support transaction log backups that *are* in native format and created using third-party backup software, add the `use3rdPartyBackupDevice=Y` connection attribute to the source endpoint.
+ For self-managed SQL Server sources, be aware that SQL Server doesn't capture changes on newly created tables until they've been published. When tables are added to a SQL Server source, AWS DMS manages creating the publication. However, this process might take several minutes. Operations made to newly created tables during this delay aren't captured or replicated to the target. 
+ AWS DMS change data capture requires full transaction logging to be turned on in SQL Server. To turn on full transaction logging in SQL Server, either enable MS-REPLICATION or CHANGE DATA CAPTURE (CDC).
+ SQL Server transaction log (*tlog*) entries aren't marked for reuse until the MS-CDC capture job processes those changes.
+ CDC operations aren't supported on memory-optimized tables. This limitation applies to SQL Server 2014 (when the feature was first introduced) and higher.
+ AWS DMS change data capture requires a distribution database by default when using Amazon EC2 or on-premises SQL Server as a source. So, ensure that you have activated the distributor while configuring MS replication for tables with primary keys.
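You can verify the first two prerequisites above from a query window. The recovery model is exposed in `sys.databases`, and full backups are recorded in `msdb.dbo.backupset`; both are standard SQL Server catalog objects, and `db_name` is a placeholder for your source database name.

```
-- Check the recovery model of the source database (must be FULL or BULK_LOGGED).
SELECT name, recovery_model_desc
FROM sys.databases
WHERE name = 'db_name';

-- Check for an existing full database backup (type 'D' = full database backup).
SELECT TOP 1 backup_finish_date
FROM msdb.dbo.backupset
WHERE database_name = 'db_name' AND type = 'D'
ORDER BY backup_finish_date DESC;
```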

## Supported compression methods for SQL Server
<a name="CHAP_Source.SQLServer.Compression"></a>

Note the following about support for SQL Server compression methods in AWS DMS:
+ AWS DMS supports Row/Page compression in SQL Server version 2008 and later.
+ AWS DMS doesn't support the Vardecimal storage format.
+ AWS DMS doesn't support sparse columns and columnar structure compression.

## Working with self-managed SQL Server AlwaysOn availability groups
<a name="CHAP_Source.SQLServer.AlwaysOn"></a>

SQL Server Always On availability groups provide high availability and disaster recovery as an enterprise-level alternative to database mirroring. 

In AWS DMS, you can migrate changes from a single primary or secondary availability group replica.

### Working with the primary availability group replica
<a name="CHAP_Source.SQLServer.AlwaysOn.Primary"></a>

 

**To use the primary availability group as a source in AWS DMS, do the following:**

1. Turn on the distribution option for all SQL Server instances in your availability replicas. For more information, see [Setting up ongoing replication on a self-managed SQL Server](CHAP_Source.SQLServer.CDC.md#CHAP_Source.SQLServer.CDC.MSCDC).

1. In the AWS DMS console, open the SQL Server source database settings. For **Server Name**, specify the Domain Name Service (DNS) name or IP address that was configured for your availability group listener. 

When you start an AWS DMS task for the first time, it might take longer than usual to start. This slowness occurs because the availability group server duplicates the creation of the table articles.

### Working with a secondary availability group replica
<a name="CHAP_Source.SQLServer.AlwaysOn.Secondary"></a>

**To use a secondary availability group as a source in AWS DMS, do the following:**

1. Use the same credentials for connecting to individual replicas as those used by the AWS DMS source endpoint user.

1. Ensure that your AWS DMS replication instance can resolve DNS names for all existing replicas, and connect to them. You can use the following SQL query to get DNS names for all of your replicas.

   ```
   select ar.replica_server_name, ar.endpoint_url from sys.availability_replicas ar
   JOIN sys.availability_databases_cluster adc
   ON adc.group_id = ar.group_id AND adc.database_name = '<source_database_name>';
   ```

1. When you create the source endpoint, specify the DNS name of the availability group listener for the endpoint's **Server name** or for the endpoint secret's **Server address**. For more information about availability group listeners, see [What is an availability group listener?](https://docs.microsoft.com/en-us/sql/database-engine/availability-groups/windows/availability-group-listener-overview?view=sql-server-ver15) in the SQL Server documentation.

   You can use either a public DNS server or an on-premises DNS server to resolve the availability group listener, the primary replica, and the secondary replicas. To use an on-premises DNS server, configure the Amazon Route 53 Resolver. For more information, see [Using your own on-premises name server](CHAP_BestPractices.md#CHAP_BestPractices.Rte53DNSResolver).

1. Add the following extra connection attributes to your source endpoint.    
[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/dms/latest/userguide/CHAP_Source.SQLServer.html)

1. Enable the distribution option on all replicas in your availability group. Add all nodes to the distributors list. For more information, see [To set up distribution](CHAP_Source.SQLServer.CDC.md#CHAP_Source.SQLServer.CDC.MSCDC.Setup).

1. Run the following query on the primary read-write replica to enable publication of your database. You run this query only once for your database. 

   ```
   sp_replicationdboption @dbname = N'<source DB name>', @optname = N'publish', @value = N'true';
   ```



#### Limitations
<a name="CHAP_Source.SQLServer.AlwaysOn.Secondary.limitations"></a>

Following are limitations for working with a secondary availability group replica:
+ AWS DMS doesn't support Safeguard when using a read-only availability group replica as a source. For more information, see [Endpoint settings when using SQL Server as a source for AWS DMS](#CHAP_Source.SQLServer.ConnectionAttrib).
+ AWS DMS doesn't support the `setUpMsCdcForTables` extra connection attribute when using a read-only availability group replica as a source. For more information, see [Endpoint settings when using SQL Server as a source for AWS DMS](#CHAP_Source.SQLServer.ConnectionAttrib).
+ AWS DMS can use a self-managed secondary availability group replica as a source database for ongoing replication (change data capture, or CDC) starting from version 3.4.7. Cloud SQL Server Multi-AZ read replicas are not supported. If you use previous versions of AWS DMS, make sure that you use the primary availability group replica as a source database for CDC.

#### Failover to other nodes
<a name="CHAP_Source.SQLServer.AlwaysOn.Secondary.failover"></a>

If you set the `ApplicationIntent` extra connection attribute for your endpoint to `ReadOnly`, your AWS DMS task connects to the read-only node with the highest read-only routing priority. It then fails over to other read-only nodes in your availability group when the highest priority read-only node is unavailable. If you don't set `ApplicationIntent`, your AWS DMS task only connects to the primary (read/write) node in your availability group.

## Endpoint settings when using SQL Server as a source for AWS DMS
<a name="CHAP_Source.SQLServer.ConnectionAttrib"></a>

You can use endpoint settings to configure your SQL Server source database similar to using extra connection attributes. You specify the settings when you create the source endpoint using the AWS DMS console, or by using the `create-endpoint` command in the [AWS CLI](https://docs.aws.amazon.com/cli/latest/reference/dms/index.html), with the `--microsoft-sql-server-settings '{"EndpointSetting": "value", ...}'` JSON syntax.

The following table shows the endpoint settings that you can use with SQL Server as a source.

[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/dms/latest/userguide/CHAP_Source.SQLServer.html)

## Source data types for SQL Server
<a name="CHAP_Source.SQLServer.DataTypes"></a>

Data migration that uses SQL Server as a source for AWS DMS supports most SQL Server data types. The following table shows the SQL Server source data types that are supported when using AWS DMS and the default mapping from AWS DMS data types.

For information on how to view the data type that is mapped in the target, see the section for the target endpoint you are using.

For additional information about AWS DMS data types, see [Data types for AWS Database Migration Service](CHAP_Reference.DataTypes.md).


|  SQL Server data types  |  AWS DMS data types  | 
| --- | --- | 
|  BIGINT  |  INT8  | 
|  BIT  |  BOOLEAN  | 
|  DECIMAL  |  NUMERIC  | 
|  INT  |  INT4  | 
|  MONEY  |  NUMERIC  | 
|  NUMERIC (p,s)  |  NUMERIC   | 
|  SMALLINT  |  INT2  | 
|  SMALLMONEY  |  NUMERIC  | 
|  TINYINT  |  UINT1  | 
|  REAL  |  REAL4  | 
|  FLOAT  |  REAL8  | 
|  DATETIME  |  DATETIME  | 
|  DATETIME2 (SQL Server 2008 and higher)  |  DATETIME  | 
|  SMALLDATETIME  |  DATETIME  | 
|  DATE  |  DATE  | 
|  TIME  |  TIME  | 
|  DATETIMEOFFSET  |  WSTRING  | 
|  CHAR  |  STRING  | 
|  VARCHAR  |  STRING  | 
|  VARCHAR (max)  |  CLOB TEXT To use this data type with AWS DMS, you must enable the use of CLOB data types for a specific task. For SQL Server tables, AWS DMS updates LOB columns in the target even for UPDATE statements that don't change the value of the LOB column in SQL Server. During CDC, AWS DMS supports CLOB data types only in tables that include a primary key.  | 
|  NCHAR  |  WSTRING  | 
|  NVARCHAR (length)  |  WSTRING  | 
|  NVARCHAR (max)  |  NCLOB NTEXT To use this data type with AWS DMS, you must enable the use of SupportLobs for a specific task. For more information about enabling Lob support, see [Setting LOB support for source databases in an AWS DMS task](CHAP_Tasks.LOBSupport.md).  For SQL Server tables, AWS DMS updates LOB columns in the target even for UPDATE statements that don't change the value of the LOB column in SQL Server. During CDC, AWS DMS supports CLOB data types only in tables that include a primary key.  | 
|  BINARY  |  BYTES  | 
|  VARBINARY  |  BYTES  | 
|  VARBINARY (max)  |  BLOB IMAGE For SQL Server tables, AWS DMS updates LOB columns in the target even for UPDATE statements that don't change the value of the LOB column in SQL Server. To use this data type with AWS DMS, you must enable the use of BLOB data types for a specific task. AWS DMS supports BLOB data types only in tables that include a primary key.  | 
|  TIMESTAMP  |  BYTES  | 
|  UNIQUEIDENTIFIER  |  STRING  | 
|  HIERARCHYID   |  Use HIERARCHYID when replicating to a SQL Server target endpoint. Use WSTRING (250) when replicating to all other target endpoints.  | 
|  XML  |  NCLOB For SQL Server tables, AWS DMS updates LOB columns in the target even for UPDATE statements that don't change the value of the LOB column in SQL Server. To use this data type with AWS DMS, you must enable the use of NCLOB data types for a specific task. During CDC, AWS DMS supports NCLOB data types only in tables that include a primary key.  | 
|  GEOMETRY  |  Use GEOMETRY when replicating to target endpoints that support this data type. Use CLOB when replicating to target endpoints that don't support this data type.  | 
|  GEOGRAPHY  |  Use GEOGRAPHY when replicating to target endpoints that support this data type. Use CLOB when replicating to target endpoints that don't support this data type.  | 

AWS DMS doesn't support tables that include fields with the following data types. 
+ CURSOR
+ SQL\_VARIANT
+ TABLE
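Before migrating, you can locate columns that use an unsupported type. Of the types above, only `sql_variant` can appear as a table column data type (CURSOR and TABLE are limited to variables and parameters), so a catalog query like the following sketch finds the affected columns:

```
-- Find columns declared as sql_variant in the current database.
SELECT SCHEMA_NAME(t.schema_id) AS schema_name,
       t.name AS table_name,
       c.name AS column_name
FROM sys.columns c
JOIN sys.tables t ON t.object_id = c.object_id
JOIN sys.types ty ON ty.user_type_id = c.user_type_id
WHERE ty.name = 'sql_variant'
ORDER BY schema_name, table_name;
```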

**Note**  
User-defined data types are supported according to their base type. For example, a user-defined data type based on DATETIME is handled as a DATETIME data type.

# Capturing data changes for ongoing replication from SQL Server
<a name="CHAP_Source.SQLServer.CDC"></a>

This topic describes how to set up CDC replication on a SQL Server source.

**Topics**
+ [

## Capturing data changes for self-managed SQL Server on-premises or on Amazon EC2
](#CHAP_Source.SQLServer.CDC.Selfmanaged)
+ [

## Setting up ongoing replication on a cloud SQL Server DB instance
](#CHAP_Source.SQLServer.Configuration)

## Capturing data changes for self-managed SQL Server on-premises or on Amazon EC2
<a name="CHAP_Source.SQLServer.CDC.Selfmanaged"></a>

To capture changes from a source Microsoft SQL Server database, make sure that the database is configured for full backups. Configure the database either in full recovery mode or bulk-logged mode.

For a self-managed SQL Server source, AWS DMS uses the following:

**MS-Replication**  
To capture changes for tables with primary keys. You can configure this automatically by giving sysadmin privileges to the AWS DMS endpoint user on the source SQL Server instance. Or you can follow the steps in this section to prepare the source and use a user that doesn't have sysadmin privileges for the AWS DMS endpoint.

**MS-CDC**  
To capture changes for tables without primary keys. Enable MS-CDC at the database level and for all of the tables individually.

When setting up a SQL Server database for ongoing replication (CDC), you can do one of the following:
+ Set up ongoing replication using the sysadmin role.
+ Set up ongoing replication to not use the sysadmin role.

**Note**  
You can use the following script to find all tables without a primary or a unique key:  

```
USE [DBname]
SELECT SCHEMA_NAME(schema_id) AS schema_name, name AS table_name
FROM sys.tables
WHERE OBJECTPROPERTY(object_id, 'TableHasPrimaryKey') = 0
        AND  OBJECTPROPERTY(object_id, 'TableHasUniqueCnst') = 0
ORDER BY schema_name, table_name;
```

### Setting up ongoing replication on a self-managed SQL Server
<a name="CHAP_Source.SQLServer.CDC.MSCDC"></a>

This section contains information about setting up ongoing replication on a self-managed SQL server with or without using the sysadmin role.

**Topics**
+ [

#### Setting up ongoing replication on a self-managed SQL Server: Using sysadmin role
](#CHAP_Source.SQLServer.CDC.MSCDC.Sysadmin)
+ [

#### Setting up ongoing replication on a standalone SQL Server: Without sysadmin role
](#CHAP_SupportScripts.SQLServer.standalone)
+ [

#### Setting up ongoing replication on a SQL Server in an availability group environment: Without sysadmin role
](#CHAP_SupportScripts.SQLServer.ag)

#### Setting up ongoing replication on a self-managed SQL Server: Using sysadmin role
<a name="CHAP_Source.SQLServer.CDC.MSCDC.Sysadmin"></a>

AWS DMS ongoing replication for SQL Server uses native SQL Server replication for tables with primary keys, and change data capture (CDC) for tables without primary keys.

Before setting up ongoing replication, see [Prerequisites for using ongoing replication (CDC) from a SQL Server source](CHAP_Source.SQLServer.md#CHAP_Source.SQLServer.Prerequisites). 

For tables with primary keys, AWS DMS can generally configure the required artifacts on the source. However, for SQL Server source instances that are self-managed, make sure to first configure the SQL Server distribution manually. After you do so, AWS DMS source users with sysadmin permission can automatically create the publication for tables with primary keys.

To check if distribution has already been configured, run the following command.

```
sp_get_distributor
```

If the result is `NULL` for column distribution, distribution isn't configured. You can use the following procedure to set up distribution.<a name="CHAP_Source.SQLServer.CDC.MSCDC.Setup"></a>

**To set up distribution**

1. Connect to your SQL Server source database using the SQL Server Management Studio (SSMS) tool.

1. Open the context (right-click) menu for the **Replication** folder, and choose **Configure Distribution**. The Configure Distribution wizard appears. 

1. Follow the wizard to enter the default values and create the distribution.<a name="CHAP_Source.SQLServer.CDC.MSCDC.Setup.CDC"></a>

**To set up CDC**

AWS DMS version 3.4.7 and greater can set up MS CDC for your database and all of your tables automatically if you aren't using a read-only replica. To use this feature, set the `SetUpMsCdcForTables` ECA to true. For information about ECAs, see [Endpoint settings](CHAP_Source.SQLServer.md#CHAP_Source.SQLServer.ConnectionAttrib).

For versions of AWS DMS earlier than 3.4.7, or for a read-only replica as a source, perform the following steps:

1. For tables without primary keys, set up MS-CDC for the database. To do so, use an account that has the sysadmin role assigned to it, and run the following command.

   ```
   use [DBname]
   EXEC sys.sp_cdc_enable_db
   ```

1. Next, set up MS-CDC for each of the source tables. For each table with unique keys but no primary key, run the following query to set up MS-CDC.

   ```
   exec sys.sp_cdc_enable_table
   @source_schema = N'schema_name',
   @source_name = N'table_name',
   @index_name = N'unique_index_name',
   @role_name = NULL,
   @supports_net_changes = 1
   GO
   ```

1. For each table with no primary key or no unique keys, run the following query to set up MS-CDC.

   ```
   exec sys.sp_cdc_enable_table
   @source_schema = N'schema_name',
   @source_name = N'table_name',
   @role_name = NULL
   GO
   ```

For more information on setting up MS-CDC for specific tables, see the [SQL Server documentation](https://msdn.microsoft.com/en-us/library/cc627369.aspx). 
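After running these commands, you can confirm that MS-CDC is active at both the database and table level. The `is_cdc_enabled` and `is_tracked_by_cdc` flags are standard SQL Server catalog columns; `DBname` is a placeholder for your source database name.

```
-- Verify that CDC is enabled at the database level.
SELECT name, is_cdc_enabled
FROM sys.databases
WHERE name = 'DBname';

-- Verify which tables are tracked by CDC.
USE [DBname];
SELECT SCHEMA_NAME(schema_id) AS schema_name, name AS table_name
FROM sys.tables
WHERE is_tracked_by_cdc = 1;
```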

#### Setting up ongoing replication on a standalone SQL Server: Without sysadmin role
<a name="CHAP_SupportScripts.SQLServer.standalone"></a>

This section describes how to set up ongoing replication for a standalone SQL Server database source that doesn't require the user account to have sysadmin privileges.

**Note**  
After running the steps in this section, the non-sysadmin DMS user will have permissions to do the following:  
+ Read changes from the online transaction log file.
+ Read changes from transaction log backup files on disk.
+ Add or alter the publication that DMS uses.
+ Add articles to the publication.

1. Set up Microsoft SQL Server for Replication as described in [Capturing data changes for ongoing replication from SQL Server](#CHAP_Source.SQLServer.CDC).

1. Enable MS-REPLICATION on the source database. This can either be done manually or by running the task once as a sysadmin user.

1. Create the `awsdms` schema in the `master` database using the following script:

   ```
   use master
   go
   create schema awsdms
   go

   -- Create the table valued function [awsdms].[split_partition_list] on the Master database, as follows:
   USE [master]
   GO

   set ansi_nulls on
   go

   set quoted_identifier on
   go

   if (object_id('[awsdms].[split_partition_list]','TF')) is not null
     drop function [awsdms].[split_partition_list];
   go

   create function [awsdms].[split_partition_list]
   (
     @plist varchar(8000), --A delimited list of partitions
     @dlm nvarchar(1)      --Delimiting character
   )
   returns @partitionsTable table --Table holding the BIGINT values of the string fragments
   (
     pid bigint primary key
   )
   as
   begin
     declare @partition_id bigint;
     declare @dlm_pos integer;
     declare @dlm_len integer;
     set @dlm_len = len(@dlm);
     while (charindex(@dlm,@plist)>0)
     begin
       set @dlm_pos = charindex(@dlm,@plist);
       set @partition_id = cast( ltrim(rtrim(substring(@plist,1,@dlm_pos-1))) as bigint);
       insert into @partitionsTable (pid) values (@partition_id)
       set @plist = substring(@plist,@dlm_pos+@dlm_len,len(@plist));
     end
     set @partition_id = cast (ltrim(rtrim(@plist)) as bigint);
     insert into @partitionsTable (pid) values (@partition_id);
     return
   end
   GO
   ```
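
   Because the function takes literal strings, you can sanity-check it directly; it should return one row per value in the list:

   ```
   select pid from [awsdms].[split_partition_list]('100,200,300', ',');
   -- Returns pid values 100, 200, and 300.
   ```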

1. Create the `[awsdms].[rtm_dump_dblog]` procedure on the Master database using the following script:

   ```
   use [MASTER]
   go

   if (object_id('[awsdms].[rtm_dump_dblog]','P')) is not null
     drop procedure [awsdms].[rtm_dump_dblog];
   go

   set ansi_nulls on
   go

   set quoted_identifier on
   GO

   CREATE procedure [awsdms].[rtm_dump_dblog]
   (
     @start_lsn            varchar(32),
     @seqno                integer,
     @filename             varchar(260),
     @partition_list       varchar(8000), -- A comma delimited list: P1,P2,... Pn
     @programmed_filtering integer,
     @minPartition         bigint,
     @maxPartition         bigint
   )
   as begin
     declare @start_lsn_cmp varchar(32); -- Stands against the GT comparator

     SET NOCOUNT ON -- Disable "rows affected display"

     set @start_lsn_cmp = @start_lsn;
     if (@start_lsn_cmp) is null
       set @start_lsn_cmp = '00000000:00000000:0000';

     if (@partition_list is null)
     begin
       RAISERROR ('Null partition list was passed',16,1);
       return
     end

     if (@start_lsn) is not null
       set @start_lsn = '0x'+@start_lsn;

     if (@programmed_filtering=0)
       SELECT
         [Current LSN],
         [operation],
         [Context],
         [Transaction ID],
         [Transaction Name],
         [Begin Time],
         [End Time],
         [Flag Bits],
         [PartitionID],
         [Page ID],
         [Slot ID],
         [RowLog Contents 0],
         [Log Record],
         [RowLog Contents 1]
       FROM
         fn_dump_dblog (
           @start_lsn, NULL, N'DISK', @seqno, @filename,
           default, default, default, default, default, default, default,
           default, default, default, default, default, default, default,
           default, default, default, default, default, default, default,
           default, default, default, default, default, default, default,
           default, default, default, default, default, default, default,
           default, default, default, default, default, default, default,
           default, default, default, default, default, default, default,
           default, default, default, default, default, default, default,
           default, default, default, default, default, default, default)
       where [Current LSN] collate SQL_Latin1_General_CP1_CI_AS > @start_lsn_cmp collate SQL_Latin1_General_CP1_CI_AS
       and
       (
         ( [operation] in ('LOP_BEGIN_XACT','LOP_COMMIT_XACT','LOP_ABORT_XACT') )
         or
         ( [operation] in ('LOP_INSERT_ROWS','LOP_DELETE_ROWS','LOP_MODIFY_ROW')
           and
           ( ( [context] in ('LCX_HEAP','LCX_CLUSTERED','LCX_MARK_AS_GHOST') ) or ([context] = 'LCX_TEXT_MIX' and (datalength([RowLog Contents 0]) in (0,1))))
           and [PartitionID] in ( select * from master.awsdms.split_partition_list (@partition_list,','))
         )
         or
         ([operation] = 'LOP_HOBT_DDL')
       )
     else
       SELECT
         [Current LSN],
         [operation],
         [Context],
         [Transaction ID],
         [Transaction Name],
         [Begin Time],
         [End Time],
         [Flag Bits],
         [PartitionID],
         [Page ID],
         [Slot ID],
         [RowLog Contents 0],
         [Log Record],
         [RowLog Contents 1] -- After Image
       FROM
         fn_dump_dblog (
           @start_lsn, NULL, N'DISK', @seqno, @filename,
           default, default, default, default, default, default, default,
           default, default, default, default, default, default, default,
           default, default, default, default, default, default, default,
           default, default, default, default, default, default, default,
           default, default, default, default, default, default, default,
           default, default, default, default, default, default, default,
           default, default, default, default, default, default, default,
           default, default, default, default, default, default, default,
           default, default, default, default, default, default, default)
       where [Current LSN] collate SQL_Latin1_General_CP1_CI_AS > @start_lsn_cmp collate SQL_Latin1_General_CP1_CI_AS
       and
       (
         ( [operation] in ('LOP_BEGIN_XACT','LOP_COMMIT_XACT','LOP_ABORT_XACT') )
         or
         ( [operation] in ('LOP_INSERT_ROWS','LOP_DELETE_ROWS','LOP_MODIFY_ROW')
           and
           ( ( [context] in ('LCX_HEAP','LCX_CLUSTERED','LCX_MARK_AS_GHOST') ) or ([context] = 'LCX_TEXT_MIX' and (datalength([RowLog Contents 0]) in (0,1))))
           and ([PartitionID] is not null) and ([PartitionID] >= @minPartition and [PartitionID]<=@maxPartition)
         )
         or
         ([operation] = 'LOP_HOBT_DDL')
       )

     SET NOCOUNT OFF -- Re-enable "rows affected display"
   end
   GO
   ```
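
   AWS DMS invokes this procedure itself during change processing, so you don't normally run it by hand. For troubleshooting, a call looks like the following sketch; the backup file path and partition ID are placeholder values, not real identifiers:

   ```
   exec [awsdms].[rtm_dump_dblog]
   @start_lsn = NULL,
   @seqno = 1,
   @filename = N'D:\backup\DBname_log_1.trn',
   @partition_list = '72057594040352768',
   @programmed_filtering = 0,
   @minPartition = NULL,
   @maxPartition = NULL
   GO
   ```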

1. Create the certificate on the Master database using the following script:

   ```
   Use [master]
   Go
   
   CREATE CERTIFICATE [awsdms_rtm_dump_dblog_cert] ENCRYPTION BY PASSWORD = N'@5trongpassword'
   
   WITH SUBJECT = N'Certificate for FN_DUMP_DBLOG Permissions';
   ```

1. Create the login from the certificate using the following script: 

   ```
   Use [master]
   Go
   
   CREATE LOGIN awsdms_rtm_dump_dblog_login FROM CERTIFICATE [awsdms_rtm_dump_dblog_cert];
   ```

1. Add the login to the sysadmin server role using the following script:

   ```
   ALTER SERVER ROLE [sysadmin] ADD MEMBER [awsdms_rtm_dump_dblog_login];
   ```

1. Add the signature to `[master].[awsdms].[rtm_dump_dblog]` using the certificate, using the following script: 

   ```
   Use [master]
   GO
   ADD SIGNATURE
   TO [master].[awsdms].[rtm_dump_dblog] BY CERTIFICATE [awsdms_rtm_dump_dblog_cert] WITH PASSWORD = '@5trongpassword';
   ```
**Note**  
If you recreate the stored procedure, you need to add the signature again.

1. Create the `[awsdms].[rtm_position_1st_timestamp]` procedure on the Master database using the following script:

   ```
   use [master]
       if object_id('[awsdms].[rtm_position_1st_timestamp]','P') is not null
       DROP PROCEDURE [awsdms].[rtm_position_1st_timestamp];
       go
       create procedure [awsdms].[rtm_position_1st_timestamp]
       (
       @dbname                sysname,      -- Database name
       @seqno                 integer,      -- Backup set sequence/position number within file
       @filename              varchar(260), -- The backup filename
       @1stTimeStamp          varchar(40)   -- The timestamp to position by
       ) 
       as begin
   
       SET NOCOUNT ON       -- Disable "rows affected display"
   
       declare @firstMatching table
       (
       cLsn varchar(32),
       bTim datetime
       )
   
       declare @sql nvarchar(4000)
       declare @nl                       char(2)
       declare @tb                       char(2)
       declare @fnameVar                 nvarchar(254) = 'NULL'
   
       set @nl  = char(10); -- New line
       set @tb  = char(9)   -- Tab separator
   
       if (@filename is not null)
       set @fnameVar = ''''+@filename +''''
   
       set @sql='use ['+@dbname+'];'+@nl+
       'select top 1 [Current LSN],[Begin Time]'+@nl+
       'FROM fn_dump_dblog (NULL, NULL, NULL, '+ cast(@seqno as varchar(10))+','+ @fnameVar+','+@nl+
       @tb+'default, default, default, default, default, default, default,'+@nl+
       @tb+'default, default, default, default, default, default, default,'+@nl+
       @tb+'default, default, default, default, default, default, default,'+@nl+
       @tb+'default, default, default, default, default, default, default,'+@nl+
       @tb+'default, default, default, default, default, default, default,'+@nl+
       @tb+'default, default, default, default, default, default, default,'+@nl+
       @tb+'default, default, default, default, default, default, default,'+@nl+
       @tb+'default, default, default, default, default, default, default,'+@nl+
       @tb+'default, default, default, default, default, default, default)'+@nl+
       'where operation=''LOP_BEGIN_XACT''' +@nl+
       'and [Begin Time]>= cast('+''''+@1stTimeStamp+''''+' as datetime)'+@nl
   
       --print @sql
       delete from  @firstMatching 
       insert into @firstMatching  exec sp_executesql @sql    -- Get them all
   
       select top 1 cLsn as [matching LSN],convert(varchar,bTim,121) as [matching Timestamp] from @firstMatching;
   
       SET NOCOUNT OFF      -- Re-enable "rows affected display"
   
       end
       GO
   ```
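
   As with `rtm_dump_dblog`, AWS DMS calls this procedure for you. A manual call for troubleshooting looks like the following sketch; the database name, backup file path, and timestamp are placeholder values:

   ```
   exec [awsdms].[rtm_position_1st_timestamp]
   @dbname = N'DBname',
   @seqno = 1,
   @filename = N'D:\backup\DBname_log_1.trn',
   @1stTimeStamp = '2023-01-01 00:00:00'
   GO
   ```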

1. Create the certificate on the Master database using the following script:

   ```
   Use [master]
   Go
   CREATE CERTIFICATE [awsdms_rtm_position_1st_timestamp_cert]
   ENCRYPTION BY PASSWORD = '@5trongpassword'
   WITH SUBJECT = N'Certificate for FN_POSITION_1st_TIMESTAMP Permissions';
   ```

1. Create the login from the certificate using the following script:

   ```
   Use [master]
   Go
   CREATE LOGIN awsdms_rtm_position_1st_timestamp_login FROM CERTIFICATE [awsdms_rtm_position_1st_timestamp_cert];
   ```

1. Add the login to the sysadmin role using the following script:

   ```
   ALTER SERVER ROLE [sysadmin] ADD MEMBER [awsdms_rtm_position_1st_timestamp_login];
   ```

1. Add the signature to `[master].[awsdms].[rtm_position_1st_timestamp]` using the certificate, using the following script:

   ```
   Use [master]
       GO
       ADD SIGNATURE
       TO [master].[awsdms].[rtm_position_1st_timestamp]
       BY CERTIFICATE [awsdms_rtm_position_1st_timestamp_cert]
       WITH PASSWORD = '@5trongpassword';
   ```

1. Grant the DMS user execute access to the new stored procedure using the following script:

   ```
   use master
   go
   GRANT execute on [awsdms].[rtm_position_1st_timestamp] to dms_user;
   ```

1. Create a user with the following permissions and roles in each of the following databases:
**Note**  
You should create the dmsnosysadmin user account with the same SID on each replica. The following SQL query can help verify the dmsnosysadmin account SID value on each replica. For more information about creating a user, see [CREATE USER (Transact-SQL)](https://learn.microsoft.com/en-us/sql/t-sql/statements/create-user-transact-sql) in the [Microsoft SQL Server documentation](https://learn.microsoft.com/en-us/sql/). For more information about creating SQL user accounts for Azure SQL database, see [Active geo-replication](https://learn.microsoft.com/en-us/azure/azure-sql/database/active-geo-replication-overview).
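
   The SID check described in the note looks like the following; it returns the login's SID so you can compare it across replicas:

   ```
   SELECT @@servername servername, name, sid, create_date, modify_date
     FROM sys.server_principals
     WHERE name = 'dmsnosysadmin';
   ```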

   ```
   use master
   go
   grant select on sys.fn_dblog to [DMS_user]
   grant view any definition to [DMS_user]
   grant view server state to [DMS_user]--(should be granted to the login).
   grant execute on sp_repldone to [DMS_user]
   grant execute on sp_replincrementlsn to [DMS_user]
   grant execute on sp_addpublication to [DMS_user]
   grant execute on sp_addarticle to [DMS_user]
   grant execute on sp_articlefilter to [DMS_user]
   grant select on [awsdms].[split_partition_list] to [DMS_user]
   grant execute on [awsdms].[rtm_dump_dblog] to [DMS_user]
   ```

   ```
   use msdb
   go
   grant select on msdb.dbo.backupset to self_managed_user
   grant select on msdb.dbo.backupmediafamily to self_managed_user
   grant select on msdb.dbo.backupfile to self_managed_user
   ```

   Run the following script on the source database:

   ```
   use Source_DB
       Go
       EXEC sp_addrolemember N'db_owner', N'DMS_user'
   ```

1. Lastly, add an Extra Connection Attribute (ECA) to the source SQL Server endpoint:

   ```
   enableNonSysadminWrapper=true;
   ```

#### Setting up ongoing replication on a SQL Server in an availability group environment: Without sysadmin role
<a name="CHAP_SupportScripts.SQLServer.ag"></a>

This section describes how to set up ongoing replication for a SQL Server database source in an availability group environment that doesn't require the user account to have sysadmin privileges.

**Note**  
After running the steps in this section, the non-sysadmin DMS user will have permissions to do the following:  
Read changes from the online transaction log file
Access the disk to read changes from transaction log backup files
Add or alter the publication that DMS uses
Add articles to the publication

**To set up ongoing replication without using the sysadmin user in an Availability Group environment**

1. Set up Microsoft SQL Server for Replication as described in [Capturing data changes for ongoing replication from SQL Server](#CHAP_Source.SQLServer.CDC).

1. Enable MS-REPLICATION on the source database. This can either be done manually or by running the task once using a sysadmin user.
**Note**  
You should either configure the MS-REPLICATION distributor as local or in a way that allows access to non-sysadmin users via the associated linked server.

1. If the **Exclusively use sp_repldone within a single task** endpoint option is enabled, stop the MS-REPLICATION Log Reader job.

1. Perform the following steps on each replica:

   1. Create the `[awsdms]` schema in the master database:

      ```
      CREATE SCHEMA [awsdms]
      ```

   1. Create the `[awsdms].[split_partition_list]` table valued function on the Master database:

      ```
      USE [master]
      GO
      
      SET ansi_nulls on
      GO
        
      SET quoted_identifier on
      GO
      
      IF (object_id('[awsdms].[split_partition_list]','TF')) is not null
        DROP FUNCTION [awsdms].[split_partition_list];
      GO
      
      CREATE FUNCTION [awsdms].[split_partition_list] 
      ( 
        @plist varchar(8000),    --A delimited list of partitions    
        @dlm nvarchar(1)    --Delimiting character
      ) 
      RETURNS @partitionsTable table --Table holding the BIGINT values of the string fragments
      (
        pid bigint primary key
      ) 
      AS 
      BEGIN
        DECLARE @partition_id bigint;
        DECLARE @dlm_pos integer;
        DECLARE @dlm_len integer;  
        SET @dlm_len = len(@dlm);
        WHILE (charindex(@dlm,@plist)>0)
        BEGIN 
          SET @dlm_pos = charindex(@dlm,@plist);
          SET @partition_id = cast( ltrim(rtrim(substring(@plist,1,@dlm_pos-1))) as bigint);
          INSERT into @partitionsTable (pid) values (@partition_id)
          SET @plist = substring(@plist,@dlm_pos+@dlm_len,len(@plist));
        END 
        SET @partition_id = cast (ltrim(rtrim(@plist)) as bigint);
        INSERT into @partitionsTable (pid) values (  @partition_id  );
        RETURN
      END
      GO
      ```

   1. Create the `[awsdms].[rtm_dump_dblog]` procedure on the Master database:

      ```
      USE [MASTER] 
      GO
      
      IF (object_id('[awsdms].[rtm_dump_dblog]','P')) is not null
        DROP PROCEDURE [awsdms].[rtm_dump_dblog]; 
      GO
      
      SET ansi_nulls on
      GO 
      
      SET quoted_identifier on 
      GO
                                          
      CREATE PROCEDURE [awsdms].[rtm_dump_dblog]
      (
        @start_lsn            varchar(32),
        @seqno                integer,
        @filename             varchar(260),
        @partition_list       varchar(8000), -- A comma delimited list: P1,P2,... Pn
        @programmed_filtering integer,
        @minPartition         bigint,
        @maxPartition         bigint
      ) 
      AS 
      BEGIN
      
        DECLARE @start_lsn_cmp varchar(32); -- Stands against the GT comparator
      
        SET NOCOUNT ON  -- Disable "rows affected display"
      
        SET @start_lsn_cmp = @start_lsn;
        IF (@start_lsn_cmp) is null
          SET @start_lsn_cmp = '00000000:00000000:0000';
      
        IF (@partition_list is null)
          BEGIN
            RAISERROR ('Null partition list was passed',16,1);
            return
            --set @partition_list = '0,';    -- A dummy which is never matched
          END
      
        IF (@start_lsn) is not null
          SET @start_lsn = '0x'+@start_lsn;
      
        IF (@programmed_filtering=0)
          SELECT
            [Current LSN],
            [operation],
            [Context],
            [Transaction ID],
            [Transaction Name],
            [Begin Time],
            [End Time],
            [Flag Bits],
            [PartitionID],
            [Page ID],
            [Slot ID],
            [RowLog Contents 0],
            [Log Record],
            [RowLog Contents 1] -- After Image
          FROM
            fn_dump_dblog (
              @start_lsn, NULL, N'DISK', @seqno, @filename,
              default, default, default, default, default, default, default,
              default, default, default, default, default, default, default,
              default, default, default, default, default, default, default,
              default, default, default, default, default, default, default,
              default, default, default, default, default, default, default,
              default, default, default, default, default, default, default,
              default, default, default, default, default, default, default,
              default, default, default, default, default, default, default,
              default, default, default, default, default, default, default)
          WHERE 
            [Current LSN] collate SQL_Latin1_General_CP1_CI_AS > @start_lsn_cmp collate SQL_Latin1_General_CP1_CI_AS -- This aims for implementing FN_DBLOG based on GT comparator.
            AND
            (
              (  [operation] in ('LOP_BEGIN_XACT','LOP_COMMIT_XACT','LOP_ABORT_XACT') )
              OR
              (  [operation] in ('LOP_INSERT_ROWS','LOP_DELETE_ROWS','LOP_MODIFY_ROW')
                AND
                ( ( [context]   in ('LCX_HEAP','LCX_CLUSTERED','LCX_MARK_AS_GHOST') ) or ([context] = 'LCX_TEXT_MIX') )
                AND       
                [PartitionID] in ( select * from master.awsdms.split_partition_list (@partition_list,','))
              )
            OR
            ([operation] = 'LOP_HOBT_DDL')
          )
          ELSE
            SELECT
              [Current LSN],
              [operation],
              [Context],
              [Transaction ID],
              [Transaction Name],
              [Begin Time],
              [End Time],
              [Flag Bits],
              [PartitionID],
              [Page ID],
              [Slot ID],
              [RowLog Contents 0],
              [Log Record],
              [RowLog Contents 1] -- After Image
            FROM
              fn_dump_dblog (
                @start_lsn, NULL, N'DISK', @seqno, @filename,
                default, default, default, default, default, default, default,
                default, default, default, default, default, default, default,
                default, default, default, default, default, default, default,
                default, default, default, default, default, default, default,
                default, default, default, default, default, default, default,
                default, default, default, default, default, default, default,
                default, default, default, default, default, default, default,
                default, default, default, default, default, default, default,
                default, default, default, default, default, default, default)
            WHERE [Current LSN] collate SQL_Latin1_General_CP1_CI_AS > @start_lsn_cmp collate SQL_Latin1_General_CP1_CI_AS -- This aims for implementing FN_DBLOG based on GT comparator.
            AND
            (
              (  [operation] in ('LOP_BEGIN_XACT','LOP_COMMIT_XACT','LOP_ABORT_XACT') )
              OR
              (  [operation] in ('LOP_INSERT_ROWS','LOP_DELETE_ROWS','LOP_MODIFY_ROW')
                AND
                ( ( [context]   in ('LCX_HEAP','LCX_CLUSTERED','LCX_MARK_AS_GHOST') ) or ([context] = 'LCX_TEXT_MIX') )
                AND ([PartitionID] is not null) and ([PartitionID] >= @minPartition and [PartitionID]<=@maxPartition)
              )
              OR
              ([operation] = 'LOP_HOBT_DDL')
            )
            SET NOCOUNT OFF -- Re-enable "rows affected display"
      END
      GO
      ```

   1. Create a certificate on the Master Database:

      ```
      USE [master]
      GO
      CREATE CERTIFICATE [awsdms_rtm_dump_dblog_cert]
        ENCRYPTION BY PASSWORD = N'@hardpassword1'
        WITH SUBJECT = N'Certificate for FN_DUMP_DBLOG Permissions'
      ```

   1. Create a login from the certificate:

      ```
      USE [master]
      GO
      CREATE LOGIN awsdms_rtm_dump_dblog_login FROM CERTIFICATE
        [awsdms_rtm_dump_dblog_cert];
      ```

   1. Add the login to the sysadmin server role:

      ```
      ALTER SERVER ROLE [sysadmin] ADD MEMBER [awsdms_rtm_dump_dblog_login];
      ```

   1. Add the signature to the `[master].[awsdms].[rtm_dump_dblog]` procedure using the certificate:

      ```
      USE [master]
      GO
      
      ADD SIGNATURE
        TO [master].[awsdms].[rtm_dump_dblog]
        BY CERTIFICATE [awsdms_rtm_dump_dblog_cert]
        WITH PASSWORD = '@hardpassword1';
      ```
**Note**  
If you recreate the stored procedure, you need to add the signature again.

   1. Create the `[awsdms].[rtm_position_1st_timestamp]` procedure on the Master database:

      ```
      USE [master]
      IF object_id('[awsdms].[rtm_position_1st_timestamp]','P') is not null
        DROP PROCEDURE [awsdms].[rtm_position_1st_timestamp];
      GO
      CREATE PROCEDURE [awsdms].[rtm_position_1st_timestamp]
      (
        @dbname                sysname,      -- Database name
        @seqno                 integer,      -- Backup set sequence/position number within file
        @filename              varchar(260), -- The backup filename
        @1stTimeStamp          varchar(40)   -- The timestamp to position by
      ) 
      AS 
      BEGIN
        SET NOCOUNT ON       -- Disable "rows affected display"
      
        DECLARE @firstMatching table
        (
          cLsn varchar(32),
          bTim datetime
        )
        DECLARE @sql nvarchar(4000)
        DECLARE @nl                       char(2)
        DECLARE @tb                       char(2)
        DECLARE @fnameVar                 nvarchar(262) = 'NULL'
      
        SET @nl  = char(10); -- New line
        SET @tb  = char(9)   -- Tab separator
      
        IF (@filename is not null)
          SET @fnameVar = ''''+@filename +''''
        SET @sql='use ['+@dbname+'];'+@nl+
          'SELECT TOP 1 [Current LSN],[Begin Time]'+@nl+
          'FROM fn_dump_dblog (NULL, NULL, NULL, '+ cast(@seqno as varchar(10))+','+ @fnameVar +','+@nl+
          @tb+'default, default, default, default, default, default, default,'+@nl+
          @tb+'default, default, default, default, default, default, default,'+@nl+
          @tb+'default, default, default, default, default, default, default,'+@nl+
          @tb+'default, default, default, default, default, default, default,'+@nl+
          @tb+'default, default, default, default, default, default, default,'+@nl+
          @tb+'default, default, default, default, default, default, default,'+@nl+
          @tb+'default, default, default, default, default, default, default,'+@nl+
          @tb+'default, default, default, default, default, default, default,'+@nl+
          @tb+'default, default, default, default, default, default, default)'+@nl+
          'WHERE operation=''LOP_BEGIN_XACT''' +@nl+
          'AND [Begin Time]>= cast('+''''+@1stTimeStamp+''''+' as datetime)'+@nl
      
          --print @sql
          DELETE FROM @firstMatching 
          INSERT INTO @firstMatching  exec sp_executesql @sql    -- Get them all
          SELECT TOP 1 cLsn as [matching LSN],convert(varchar,bTim,121) AS [matching Timestamp] FROM @firstMatching;
      
          SET NOCOUNT OFF      -- Re-enable "rows affected display"
      
      END
      GO
      ```

   1. Create a certificate on the Master database:

      ```
      USE [master]
      GO
      CREATE CERTIFICATE [awsdms_rtm_position_1st_timestamp_cert]
        ENCRYPTION BY PASSWORD = N'@hardpassword1'
        WITH SUBJECT = N'Certificate for FN_POSITION_1st_TIMESTAMP Permissions';
      ```

   1. Create a login from the certificate:

      ```
      USE [master]
      GO
      CREATE LOGIN awsdms_rtm_position_1st_timestamp_login FROM CERTIFICATE
        [awsdms_rtm_position_1st_timestamp_cert];
      ```

   1. Add the login to the sysadmin server role:

      ```
      ALTER SERVER ROLE [sysadmin] ADD MEMBER [awsdms_rtm_position_1st_timestamp_login];
      ```

   1. Add the signature to the `[master].[awsdms].[rtm_position_1st_timestamp]` procedure using the certificate:

      ```
      USE [master]
      GO
      ADD SIGNATURE
        TO [master].[awsdms].[rtm_position_1st_timestamp]
        BY CERTIFICATE [awsdms_rtm_position_1st_timestamp_cert]
        WITH PASSWORD = '@hardpassword1';
      ```
**Note**  
If you recreate the stored procedure, you need to add the signature again.

   1. Create a user with the following permissions/roles in each of the following databases:
**Note**  
You should create the dmsnosysadmin user account with the same SID on each replica. The following SQL query can help verify the dmsnosysadmin account SID value on each replica. For more information about creating a user, see [CREATE USER (Transact-SQL)](https://learn.microsoft.com/en-us/sql/t-sql/statements/create-user-transact-sql) in the [Microsoft SQL Server documentation](https://learn.microsoft.com/en-us/sql/). For more information about creating SQL user accounts for Azure SQL database, see [Active geo-replication](https://learn.microsoft.com/en-us/azure/azure-sql/database/active-geo-replication-overview).

      ```
      SELECT @@servername servername, name, sid, create_date, modify_date
        FROM sys.server_principals
        WHERE name = 'dmsnosysadmin';
      ```

   1. Grant permissions on the master database on each replica:

      ```
      USE master
      GO 
      
      GRANT select on sys.fn_dblog to dmsnosysadmin;
      GRANT view any definition to dmsnosysadmin;
      GRANT view server state to dmsnosysadmin -- (should be granted to the login).
      GRANT execute on sp_repldone to dmsnosysadmin;
      GRANT execute on sp_replincrementlsn to dmsnosysadmin;
      GRANT execute on sp_addpublication to dmsnosysadmin;
      GRANT execute on sp_addarticle to dmsnosysadmin;
      GRANT execute on sp_articlefilter to dmsnosysadmin;
      GRANT select on [awsdms].[split_partition_list] to dmsnosysadmin;
      GRANT execute on [awsdms].[rtm_dump_dblog] to dmsnosysadmin;
      GRANT execute on [awsdms].[rtm_position_1st_timestamp] to dmsnosysadmin;
      ```

   1. Grant permissions on the msdb database on each replica:

      ```
      USE msdb
      GO
      GRANT select on msdb.dbo.backupset TO self_managed_user
      GRANT select on msdb.dbo.backupmediafamily TO self_managed_user
      GRANT select on msdb.dbo.backupfile TO self_managed_user
      ```

   1. Add the `db_owner` role to `dmsnosysadmin` on the source database. Because the database is synchronized, you can add the role on the primary replica only.

      ```
      use <source DB>
      GO 
      EXEC sp_addrolemember N'db_owner', N'dmsnosysadmin'
      ```

## Setting up ongoing replication on a cloud SQL Server DB instance
<a name="CHAP_Source.SQLServer.Configuration"></a>

This section describes how to set up CDC on a cloud-hosted SQL Server database instance. A cloud-hosted SQL Server instance is an instance running on Amazon RDS for SQL Server, an Azure SQL Managed Instance, or any other managed cloud SQL Server instance. For information about limitations on ongoing replication for each database type, see [Limitations on using SQL Server as a source for AWS DMS](CHAP_Source.SQLServer.md#CHAP_Source.SQLServer.Limitations). 

Before setting up ongoing replication, see [Prerequisites for using ongoing replication (CDC) from a SQL Server source](CHAP_Source.SQLServer.md#CHAP_Source.SQLServer.Prerequisites). 

Unlike self-managed Microsoft SQL Server sources, Amazon RDS for SQL Server doesn't support MS-Replication. Therefore, AWS DMS needs to use MS-CDC for tables with or without primary keys.

Amazon RDS doesn't grant the sysadmin privileges needed to set up the replication artifacts that AWS DMS uses for ongoing changes in a source SQL Server instance. Make sure to turn on MS-CDC for the Amazon RDS instance (using master user privileges), as in the following procedure.

**To turn on MS-CDC for a cloud SQL Server DB instance**

1. Run one of the following queries at the database level.

   For an RDS for SQL Server DB instance, use this query.

   ```
   exec msdb.dbo.rds_cdc_enable_db 'DB_name'
   ```

   For an Azure SQL managed DB instance, use this query.

   ```
   USE DB_name 
   GO 
   EXEC sys.sp_cdc_enable_db 
   GO
   ```

1. For each table with a primary key, run the following query to turn on MS-CDC.

   ```
   exec sys.sp_cdc_enable_table
   @source_schema = N'schema_name',
   @source_name = N'table_name',
   @role_name = NULL,
   @supports_net_changes = 1
   GO
   ```

   For each table with unique keys but no primary key, run the following query to turn on MS-CDC.

   ```
   exec sys.sp_cdc_enable_table
   @source_schema = N'schema_name',
   @source_name = N'table_name',
   @index_name = N'unique_index_name',
   @role_name = NULL,
   @supports_net_changes = 1
   GO
   ```

   For each table with neither a primary key nor any unique key, run the following query to turn on MS-CDC.

   ```
   exec sys.sp_cdc_enable_table
   @source_schema = N'schema_name',
   @source_name = N'table_name',
   @role_name = NULL
   GO
   ```

1. Set the retention period:
   + For RDS for SQL Server instances that are replicating using DMS version 3.5.3 and above, make sure that the retention period is set to the default value of 5 seconds. If you’re upgrading or moving from DMS 3.5.2 and below to DMS 3.5.3 and above, change the polling interval value after the tasks are running on the new or upgraded instance. The following script sets the retention period to 5 seconds:

     ```
     use dbname
     EXEC sys.sp_cdc_change_job @job_type = 'capture' ,@pollinginterval = 5
     exec sp_cdc_stop_job 'capture'
     exec sp_cdc_start_job 'capture'
     ```
   + The parameter `@pollinginterval` is measured in seconds with a recommended value set to 86399. This means that the transaction log retains changes for 86,399 seconds (one day) when `@pollinginterval = 86399`. The procedure `exec sp_cdc_start_job 'capture'` initiates the settings.
**Note**  
With some versions of SQL Server, if the value of `pollinginterval` is set to more than 3599 seconds, the value resets to the default five seconds. When this happens, T-Log entries are purged before AWS DMS can read them. To determine which SQL Server versions are affected by this known issue, see [this Microsoft KB article](https://support.microsoft.com/en-us/topic/kb4459220-fix-incorrect-results-occur-when-you-convert-pollinginterval-parameter-from-seconds-to-hours-in-sys-sp-cdc-scan-in-sql-server-dac8aefe-b60b-7745-f987-582dda2cfa78).

     If you are using Amazon RDS with Multi-AZ, make sure that you also set your secondary to have the right values in case of failover.

     ```
     exec rdsadmin..rds_set_configuration 'cdc_capture_pollinginterval' , <5 or preferred value>
     ```
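To confirm the values that the capture and cleanup jobs are using after these changes, you can query the CDC job definitions stored in `msdb`. The `cdc_jobs` table is standard SQL Server metadata; the columns shown are a subset of what it contains.

```
-- Inspect capture and cleanup job settings, including the polling interval
SELECT job_type, pollinginterval, maxtrans, maxscans, retention
FROM msdb.dbo.cdc_jobs;
```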

**To maintain the retention period when an AWS DMS replication task is stopped for more than one hour**
**Note**  
The following steps aren’t needed for an RDS for SQL Server source replicating using DMS 3.5.3 and above.

1. Stop the job truncating the transaction logs by using the following command. 

   ```
   exec sp_cdc_stop_job 'capture'
   ```

1. Find your task on the AWS DMS console and resume the task.

1. Choose the **Monitoring** tab, and check the `CDCLatencySource` metric. 

1. After the `CDCLatencySource` metric equals 0 (zero) and stays there, restart the job truncating the transaction logs using the following command.

   ```
   exec sp_cdc_start_job 'capture'
   ```

Remember to start the job that truncates SQL Server transaction logs. Otherwise, storage on your SQL Server instance might fill up.

### Recommended settings when using RDS for SQL Server as a source for AWS DMS
<a name="CHAP_Source.SQLServer.Configuration.Settings"></a>

#### For AWS DMS 3.5.3 and above
<a name="CHAP_Source.SQLServer.Configuration.Settings.353"></a>

**Note**  
The RDS for SQL Server log backup feature is enabled by default for endpoints that you create or modify after the release of DMS version 3.5.3. To use this feature for existing endpoints, modify the endpoint without making any changes.

AWS DMS version 3.5.3 introduces support for reading from log backups. DMS primarily relies on reading from the active transaction logs to replicate events. If a transaction is backed up before DMS can read it from the active log, the task accesses the RDS backups on-demand and reads from the subsequent backup logs until it catches up to the active transaction log. To ensure that DMS has access to log backups, set the RDS automated backup retention period to at least one day. For information about setting the automated backup retention period, see [ Backup retention period](https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/USER_ManagingAutomatedBackups.html#USER_WorkingWithAutomatedBackups.BackupRetention) in the *Amazon RDS User Guide*.

A DMS task accessing log backups uses storage on the RDS instance. Note that the task only accesses the log backups needed for replication. Amazon RDS removes these downloaded backups within a few hours. This removal doesn't affect the Amazon RDS backups retained in Amazon S3, or Amazon RDS `RESTORE DATABASE` functionality. It is advisable to allocate additional storage on your RDS for SQL Server source if you intend to replicate using DMS. One way to estimate the amount of storage needed is to identify the backup from which DMS will start or resume replicating, and add up the file sizes of all the subsequent backups using the RDS `tlog backup` metadata function. For more information about the `tlog backup` function, see [ Listing available transaction log backups](https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/USER.SQLServer.AddlFeat.TransactionLogAccess.html#USER.SQLServer.AddlFeat.TransactionLogAccess.Listing) in the *Amazon RDS User Guide*.
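As a sketch of that estimate, you can sum the backup sizes directly. The function and the `backup_file_size_bytes` column are described in the Amazon RDS documentation linked above; verify both against your engine version, and replace `DB_name` with your database name.

```
-- Total size, in MB, of the available transaction log backups for one database
SELECT SUM(backup_file_size_bytes) / 1048576 AS total_backup_mb
FROM msdb.dbo.rds_fn_list_tlog_backup_metadata('DB_name');
```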

Alternatively, you can enable storage autoscaling, or trigger storage scaling based on the CloudWatch `FreeStorageSpace` metric for your Amazon RDS instance.

We strongly recommend that you don’t start or resume from a point too far back in the transaction log backups, as it might lead to storage on your SQL Server instance filling up. In such cases, it is advisable to start a full load. Replicating from the transaction log backup is slower than reading from the active transaction logs. For more information, see [Transaction log backup processing for RDS for SQL Server](CHAP_Troubleshooting_Latency_Source_SQLServer.md#CHAP_Troubleshooting_Latency_Source_SQLServer_backup).

Note that accessing the log backups requires additional privileges. For more information, see [Set up permissions for ongoing replication from a cloud SQL Server database](CHAP_Source.SQLServer.md#CHAP_Source.SQLServer.Permissions.Cloud). Make sure that you grant these privileges before the task starts replicating.

#### For AWS DMS 3.5.2 and below
<a name="CHAP_Source.SQLServer.Configuration.Settings.352"></a>

When you work with Amazon RDS for SQL Server as a source, the MS-CDC capture job relies on the parameters `maxscans` and `maxtrans`. These parameters govern the maximum number of scans that the MS-CDC capture does on the transaction log and the number of transactions that are processed for each scan.

For databases where the number of transactions is greater than `maxtrans*maxscans`, increasing the `polling_interval` value can cause an accumulation of active transaction log records. In turn, this accumulation can lead to an increase in the size of the transaction log.

Note that AWS DMS doesn't rely on the MS-CDC capture job to read changes. However, the MS-CDC capture job marks the transaction log entries as processed, which allows the transaction log backup job to remove those entries from the transaction log.

We recommend that you monitor the size of the transaction log and the success of the MS-CDC jobs. If the MS-CDC jobs fail, the transaction log could grow excessively and cause AWS DMS replication failures. You can monitor MS-CDC capture job errors using the `sys.dm_cdc_errors` dynamic management view in the source database. You can monitor the transaction log size using the `DBCC SQLPERF(LOGSPACE)` management command.
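For example, the following query surfaces recent capture job errors, newest first. `sys.dm_cdc_errors` is a standard SQL Server dynamic management view, and the columns shown are a subset of what it returns.

```
-- Show the most recent MS-CDC capture errors, newest first
SELECT entry_time, error_severity, error_message
FROM sys.dm_cdc_errors
ORDER BY entry_time DESC;
```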

**To address the transaction log increase that is caused by MS-CDC**

1. Check the `Log Space Used %` for the database AWS DMS is replicating from and validate that it increases continuously.

   ```
   DBCC SQLPERF(LOGSPACE)
   ```

1. Identify what is blocking the transaction log backup process.

   ```
   Select log_reuse_wait, log_reuse_wait_desc, name from sys.databases where name = db_name();
   ```

   If the `log_reuse_wait_desc` value equals `REPLICATION`, log truncation is being held back by MS-CDC latency.

1. Increase the number of events processed by the capture job by increasing the `maxtrans` and `maxscans` parameter values.

   ```
   EXEC sys.sp_cdc_change_job @job_type = 'capture' ,@maxtrans = 5000, @maxscans = 20 
   exec sp_cdc_stop_job 'capture'
   exec sp_cdc_start_job 'capture'
   ```

To address this issue, set the values of `maxscans` and `maxtrans` so that `maxtrans*maxscans` is equal to the average number of events generated for tables that AWS DMS replicates from the source database for each day.

If you set these parameters higher than the recommended value, the capture jobs process all events in the transaction logs. If you set these parameters below the recommended value, MS-CDC latency increases and your transaction log grows.

Identifying appropriate values for `maxscans` and `maxtrans` can be difficult because changes in workload produce varying numbers of events. In this case, we recommend that you set up monitoring of MS-CDC latency. For more information, see [ Monitor the process](https://docs.microsoft.com/en-us/sql/relational-databases/track-changes/administer-and-monitor-change-data-capture-sql-server?view=sql-server-ver15#Monitor) in the SQL Server documentation. Then configure `maxtrans` and `maxscans` dynamically based on the monitoring results.
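One way to monitor that latency from the database itself is the log scan session view. `sys.dm_cdc_log_scan_sessions` is a standard SQL Server dynamic management view; the columns shown are a subset of what it reports.

```
-- Review recent MS-CDC log scans to gauge transaction volume and latency
SELECT session_id, start_time, end_time, tran_count, latency
FROM sys.dm_cdc_log_scan_sessions
ORDER BY start_time DESC;
```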

If the AWS DMS task is unable to find the log sequence numbers (LSNs) needed to resume or continue the task, the task may fail and require a complete reload.

**Note**  
When using AWS DMS to replicate data from an RDS for SQL Server source, you may encounter errors when trying to resume replication after a stop-start event of the Amazon RDS instance. This happens because the SQL Server Agent restarts the capture job process after the stop-start event, which bypasses the MS-CDC polling interval.  
As a result, on databases whose transaction volume is lower than what the MS-CDC capture job can process, data can be marked as replicated and backed up before AWS DMS resumes from where it stopped, resulting in the following error:  

```
[SOURCE_CAPTURE ]E: Failed to access LSN '0000dbd9:0006f9ad:0003' in the backup log sets since BACKUP/LOG-s are not available. [1020465] (sqlserver_endpoint_capture.c:764)
```
To mitigate this issue, set the `maxtrans` and `maxscans` values as recommended previously.

# Using Microsoft Azure SQL database as a source for AWS DMS
<a name="CHAP_Source.AzureSQL"></a>

With AWS DMS, you can use Microsoft Azure SQL Database as a source in much the same way as you do SQL Server. AWS DMS supports, as a source, the same list of database versions that are supported for SQL Server running on-premises or on an Amazon EC2 instance. 

For more information, see [Using a Microsoft SQL Server database as a source for AWS DMS](CHAP_Source.SQLServer.md).

**Note**  
AWS DMS doesn't support change data capture operations (CDC) with Azure SQL Database.

# Using Microsoft Azure SQL Managed Instance as a source for AWS DMS
<a name="CHAP_Source.AzureMgd"></a>

With AWS DMS, you can use Microsoft Azure SQL Managed Instance as a source in much the same way as you do SQL Server. AWS DMS supports, as a source, the same list of database versions that are supported for SQL Server running on-premises or on an Amazon EC2 instance. 

For more information, see [Using a Microsoft SQL Server database as a source for AWS DMS](CHAP_Source.SQLServer.md).

# Using Microsoft Azure Database for PostgreSQL flexible server as a source for AWS DMS
<a name="CHAP_Source.AzureDBPostgreSQL"></a>

With AWS DMS, you can use Microsoft Azure Database for PostgreSQL flexible server as a source in much the same way as you do PostgreSQL.

For information about versions of Microsoft Azure Database for PostgreSQL flexible server that AWS DMS supports as a source, see [Sources for AWS DMS](CHAP_Introduction.Sources.md).

## Setting up Microsoft Azure for PostgreSQL flexible server for logical replication and decoding
<a name="CHAP_Source.AzureDBPostgreSQL.setup"></a>

You can use logical replication and decoding features in Microsoft Azure Database for PostgreSQL flexible server during database migration.

For logical decoding, DMS uses either the `test_decoding` or the `pglogical` plugin. If the `pglogical` plugin is available on a source PostgreSQL database, DMS creates a replication slot using `pglogical`; otherwise, DMS uses the `test_decoding` plugin.

To configure your Microsoft Azure for PostgreSQL flexible server as a source endpoint for DMS, perform the following steps: 

1. Open the Server Parameters page on the portal.

1. Set the `wal_level` server parameter to `LOGICAL`.

1. If you want to use the `pglogical` extension, set the `shared_preload_libraries` and `azure.extensions` parameters to `pglogical`.

1. Set the `max_replication_slots` parameter to the maximum number of DMS tasks that you plan to run concurrently. In Microsoft Azure, the default value for this parameter is 10. This parameter's maximum value depends on the available memory of your PostgreSQL instance, allowing for between 2 and 8 replication slots per GB of memory.

1. Set the `max_wal_senders` parameter to a value greater than 1. The `max_wal_senders` parameter sets the number of concurrent tasks that can run. The default value is 10.

1. Set the `max_worker_processes` parameter value to at least 16. Otherwise, you may see errors such as the following:

   ```
   WARNING: out of background worker slots.
   ```

1. Save the changes. Restart the server to apply the changes.

1. Confirm that your PostgreSQL instance allows network traffic from your connecting resource.

1. Grant an existing user replication permissions, or create a new user with replication permissions, using the following commands. 
   + Grant an existing user replication permissions using the following command:

     ```
     ALTER USER <existing_user> WITH REPLICATION;
     ```
   + Create a new user with replication permissions using the following command: 

     ```
     CREATE USER aws_dms_user PASSWORD 'aws_dms_user_password';
     GRANT azure_pg_admin to aws_dms_user;
     ALTER ROLE aws_dms_user REPLICATION LOGIN;
     ```
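After the server restart, you can verify the configuration and the user's replication attribute from any SQL client. `aws_dms_user` is the example user created in the previous step.

```
-- Confirm that logical decoding is enabled
SHOW wal_level;

-- Confirm that the migration user has the REPLICATION attribute
SELECT rolname, rolreplication FROM pg_roles WHERE rolname = 'aws_dms_user';
```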

For more information about logical replication with PostgreSQL, see the following topics:
+ [Enabling change data capture (CDC) using logical replication](CHAP_Source.PostgreSQL.md#CHAP_Source.PostgreSQL.Security)
+ [Using native CDC start points to set up a CDC load of a PostgreSQL source](CHAP_Source.PostgreSQL.md#CHAP_Source.PostgreSQL.v10)
+ [ Logical replication and logical decoding in Azure Database for PostgreSQL - Flexible Server](https://learn.microsoft.com/en-us/azure/postgresql/flexible-server/concepts-logical) in the [Azure Database for PostgreSQL documentation](https://learn.microsoft.com/en-us/azure/postgresql/).

# Using Microsoft Azure Database for MySQL flexible server as a source for AWS DMS
<a name="CHAP_Source.AzureDBMySQL"></a>

With AWS DMS, you can use Microsoft Azure Database for MySQL flexible server as a source in much the same way as you do MySQL.

For information about versions of Microsoft Azure Database for MySQL flexible server that AWS DMS supports as a source, see [Sources for AWS DMS](CHAP_Introduction.Sources.md). 

For more information about using a customer-managed MySQL-compatible database with AWS DMS, see [Using a self-managed MySQL-compatible database as a source for AWS DMS](CHAP_Source.MySQL.md#CHAP_Source.MySQL.CustomerManaged).

## Limitations when using Azure MySQL as a source for AWS Database Migration Service
<a name="CHAP_Source.AzureDBMySQL.limitations"></a>
+ The default value for the Azure MySQL flexible server system variable `sql_generate_invisible_primary_key` is `ON`, so the server automatically adds a generated invisible primary key (GIPK) to any table that is created without an explicit primary key. AWS DMS doesn’t support ongoing replication for MySQL tables with GIPK constraints.
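To check whether existing tables are affected, you can look for the generated `my_row_id` column that MySQL adds for GIPKs. The column name comes from the MySQL GIPK feature; on Azure, change the `sql_generate_invisible_primary_key` server parameter itself through the portal's server parameters page.

```
-- Find tables that received a generated invisible primary key
SELECT table_schema, table_name
FROM information_schema.columns
WHERE column_name = 'my_row_id';
```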

# Using OCI MySQL Heatwave as a source for AWS DMS
<a name="CHAP_Source.heatwave"></a>

With AWS DMS, you can use OCI MySQL Heatwave as a source in much the same way as you do MySQL. Using OCI MySQL Heatwave as a source requires a few additional configuration changes.

For information about versions of OCI MySQL Heatwave that AWS DMS supports as a source, see [Sources for AWS DMS](CHAP_Introduction.Sources.md).

## Setting up OCI MySQL Heatwave for logical replication
<a name="CHAP_Source.heatwave.setup"></a>

To configure your OCI MySQL Heatwave instance as a source endpoint for DMS, do the following:

1. Sign in to the OCI Console, and open the main hamburger menu (≡) in the top left corner.

1. Choose **Databases**, **DB Systems**.

1. Open the **Configurations** menu.

1. Choose **Create configuration**.

1. Enter a configuration name, such as **dms-configuration**.

1. Choose the shape of your current OCI MySQL Heatwave instance. You can find the shape on the instance's **DB system configuration** properties tab under the **DB system configuration:Shape** section.

1. In the **User variables** section, choose the `binlog_row_value_options` system variable. Its default value is `PARTIAL_JSON`. Clear the value.

1. Choose the **Create** button.

1. Open your OCI MySQL Heatwave instance, and choose the **Edit** button.

1. In the **Configuration** section, choose the **Change configuration** button, and choose the shape configuration that you created in step 4.

1. Once the changes take effect, your instance is ready for logical replication.

# Using Google Cloud for MySQL as a source for AWS DMS
<a name="CHAP_Source.GC"></a>

With AWS DMS, you can use Google Cloud for MySQL as a source in much the same way as you do MySQL. 

For information about versions of GCP MySQL that AWS DMS supports as a source, see [Sources for AWS DMS](CHAP_Introduction.Sources.md). 

For more information, see [Using a MySQL-compatible database as a source for AWS DMS](CHAP_Source.MySQL.md).

**Note**  
Support for GCP MySQL 8.0 as a source is available in AWS DMS version 3.4.6.  
AWS DMS doesn't support the SSL mode `verify-full` for GCP for MySQL instances.  
The GCP MySQL security setting `Allow only SSL connections` isn't supported, because it requires both server and client certificate verification. AWS DMS only supports server certificate verification.  
AWS DMS supports the default GCP CloudSQL for MySQL value of `CRC32` for the `binlog_checksum` database flag.
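Before creating the endpoint, you can confirm the binary log settings that CDC depends on with standard MySQL system variables:

```
-- Check binary logging configuration relevant to DMS CDC
SELECT @@log_bin, @@binlog_format, @@binlog_row_image, @@binlog_checksum;
```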

# Using Google Cloud for PostgreSQL as a source for AWS DMS
<a name="CHAP_Source.GCPostgres"></a>

With AWS DMS, you can use Google Cloud for PostgreSQL as a source in much the same way as you do self-managed PostgreSQL databases.

For information about versions of GCP PostgreSQL that AWS DMS supports as a source, see [Sources for AWS DMS](CHAP_Introduction.Sources.md). 

For more information, see [Using a PostgreSQL database as an AWS DMS source](CHAP_Source.PostgreSQL.md).

## Set up Google Cloud for PostgreSQL for logical replication and decoding
<a name="CHAP_Source.GCPostgres.setup"></a>

You can use logical replication and decoding features in Google Cloud SQL for PostgreSQL during database migration.

For logical decoding, DMS uses one of the following plugins:
+ `test_decoding`
+ `pglogical`

If the `pglogical` plugin is available on a source PostgreSQL database, DMS creates a replication slot using `pglogical`; otherwise, DMS uses the `test_decoding` plugin.

Note the following about using logical decoding with AWS DMS:

1. With Google Cloud SQL for PostgreSQL, enable logical decoding by setting the `cloudsql.logical_decoding` flag to `on`.

1. To enable `pglogical`, set the `cloudsql.enable_pglogical` flag to `on`, and restart the database.

1. To use logical decoding features, you create a PostgreSQL user with the `REPLICATION` attribute. When you are using the `pglogical` extension, the user must have the `cloudsqlsuperuser` role. To create a user with the `cloudsqlsuperuser` role, do the following:

   ```
   CREATE USER new_aws_dms_user WITH REPLICATION
   IN ROLE cloudsqlsuperuser LOGIN PASSWORD 'new_aws_dms_user_password';
   ```

   To set this attribute on an existing user, do the following:

   ```
   ALTER USER existing_user WITH REPLICATION;
   ```

1. Set the `max_replication_slots` parameter to the maximum number of DMS tasks that you plan to run concurrently. In Google Cloud SQL, the default value for this parameter is 10. This parameter's maximum value depends on the available memory of your PostgreSQL instance, allowing for between 2 and 8 replication slots per GB of memory.
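You can verify these settings, and see how many replication slots are already allocated, with standard PostgreSQL catalog queries:

```
-- Verify the logical decoding configuration
SELECT name, setting FROM pg_settings
WHERE name IN ('wal_level', 'max_replication_slots', 'max_wal_senders');

-- List replication slots that are currently allocated
SELECT slot_name, plugin, active FROM pg_replication_slots;
```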

For more information about logical replication with PostgreSQL, see the following topics:
+ [Enabling change data capture (CDC) using logical replication](CHAP_Source.PostgreSQL.md#CHAP_Source.PostgreSQL.Security)
+ [Using native CDC start points to set up a CDC load of a PostgreSQL source](CHAP_Source.PostgreSQL.md#CHAP_Source.PostgreSQL.v10)
+ [ Set up logical replication and decoding](https://cloud.google.com/sql/docs/postgres/replication/configure-logical-replication) in the [Cloud SQL for PostgreSQL documentation](https://cloud.google.com/sql/docs/postgres).

# Using a PostgreSQL database as an AWS DMS source
<a name="CHAP_Source.PostgreSQL"></a>

You can migrate data from one or many PostgreSQL databases using AWS DMS. With a PostgreSQL database as a source, you can migrate data to either another PostgreSQL database or one of the other supported databases. 

For information about versions of PostgreSQL that AWS DMS supports as a source, see [Sources for AWS DMS](CHAP_Introduction.Sources.md). 

AWS DMS supports PostgreSQL for these types of databases: 
+ On-premises databases
+ Databases on an Amazon EC2 instance
+ Databases on an Amazon RDS DB instance
+ Databases on a DB instance based on Amazon Aurora PostgreSQL-Compatible Edition
+ Databases on a DB instance based on Amazon Aurora PostgreSQL-Compatible Serverless Edition

**Note**  
DMS supports Amazon Aurora PostgreSQL Serverless v1 as a source for Full load only. But you can use Amazon Aurora PostgreSQL Serverless v2 as a source for Full load, Full load + CDC, and CDC only tasks.

You can use Secure Socket Layers (SSL) to encrypt connections between your PostgreSQL endpoint and the replication instance. For more information on using SSL with a PostgreSQL endpoint, see [Using SSL with AWS Database Migration Service](CHAP_Security.SSL.md).

As an additional security requirement when using PostgreSQL as a source, the user account specified must be a registered user in the PostgreSQL database.

To configure a PostgreSQL database as an AWS DMS source endpoint, do the following:
+ Create a PostgreSQL user with appropriate permissions to provide AWS DMS access to your PostgreSQL source database.
**Note**  
If your PostgreSQL source database is self-managed, see [Working with self-managed PostgreSQL databases as a source in AWS DMS](#CHAP_Source.PostgreSQL.Prerequisites) for more information.
If your PostgreSQL source database is managed by Amazon RDS, see [Working with AWS-managed PostgreSQL databases as a DMS source](#CHAP_Source.PostgreSQL.RDSPostgreSQL) for more information.
+ Create a PostgreSQL source endpoint that conforms with your chosen PostgreSQL database configuration.
+ Create a task or set of tasks to migrate your tables.

  To create a full-load-only task, no further endpoint configuration is needed.

  Before you create a task for change data capture (a CDC-only or full-load and CDC task), see [Enabling CDC using a self-managed PostgreSQL database as a AWS DMS source](#CHAP_Source.PostgreSQL.Prerequisites.CDC) or [Enabling CDC with an AWS-managed PostgreSQL DB instance with AWS DMS](#CHAP_Source.PostgreSQL.RDSPostgreSQL.CDC).

**Topics**
+ [

## Working with self-managed PostgreSQL databases as a source in AWS DMS
](#CHAP_Source.PostgreSQL.Prerequisites)
+ [

## Working with AWS-managed PostgreSQL databases as a DMS source
](#CHAP_Source.PostgreSQL.RDSPostgreSQL)
+ [

## Enabling change data capture (CDC) using logical replication
](#CHAP_Source.PostgreSQL.Security)
+ [

## Using native CDC start points to set up a CDC load of a PostgreSQL source
](#CHAP_Source.PostgreSQL.v10)
+ [

## Migrating from PostgreSQL to PostgreSQL using AWS DMS
](#CHAP_Source.PostgreSQL.Homogeneous)
+ [

## Migrating from Babelfish for Amazon Aurora PostgreSQL using AWS DMS
](#CHAP_Source.PostgreSQL.Babelfish)
+ [

## Removing AWS DMS artifacts from a PostgreSQL source database
](#CHAP_Source.PostgreSQL.CleanUp)
+ [

## Additional configuration settings when using a PostgreSQL database as a DMS source
](#CHAP_Source.PostgreSQL.Advanced)
+ [

## Read replica as a source for PostgreSQL
](#CHAP_Source.PostgreSQL.ReadReplica)
+ [

## Using the `MapBooleanAsBoolean` PostgreSQL endpoint setting
](#CHAP_Source.PostgreSQL.ConnectionAttrib.Endpointsetting)
+ [

## Endpoint settings and Extra Connection Attributes (ECAs) when using PostgreSQL as a DMS source
](#CHAP_Source.PostgreSQL.ConnectionAttrib)
+ [

## Limitations on using a PostgreSQL database as a DMS source
](#CHAP_Source.PostgreSQL.Limitations)
+ [

## Source data types for PostgreSQL
](#CHAP_Source-PostgreSQL-DataTypes)

## Working with self-managed PostgreSQL databases as a source in AWS DMS
<a name="CHAP_Source.PostgreSQL.Prerequisites"></a>

With a self-managed PostgreSQL database as a source, you can migrate data to either another PostgreSQL database, or one of the other target databases supported by AWS DMS. The database source can be an on-premises database or a self-managed engine running on an Amazon EC2 instance. You can use a DB instance for both full-load tasks and change data capture (CDC) tasks.

### Prerequisites to using a self-managed PostgreSQL database as an AWS DMS source
<a name="CHAP_Source.PostgreSQL.Prerequisites.SelfManaged"></a>

Before migrating data from a self-managed PostgreSQL source database, do the following: 
+ Make sure that you use a PostgreSQL database that is version 9.4.x or higher.
+ For full-load plus CDC tasks or CDC-only tasks, grant superuser permissions for the user account specified for the PostgreSQL source database. The user account needs superuser permissions to access replication-specific functions in the source. The DMS user account also needs SELECT permissions on all columns to migrate tables successfully. If permissions are missing on some columns, DMS creates the target table using regular DMS data type mappings, which leads to metadata differences and task failures.
+ Add the IP address of the AWS DMS replication server to the `pg_hba.conf` configuration file and enable replication and socket connections. An example follows.

  ```
  # Replication Instance
  host all all 12.3.4.56/32 md5
  # Allow replication connections from the replication instance, by a user with the
  # replication privilege.
  host replication dms 12.3.4.56/32 md5
  ```

  PostgreSQL's `pg_hba.conf` configuration file controls client authentication. (HBA stands for host-based authentication.) The file is traditionally stored in the database cluster's data directory. 
+ If you're configuring a database as a source for logical replication using AWS DMS, see [Enabling CDC using a self-managed PostgreSQL database as a AWS DMS source](#CHAP_Source.PostgreSQL.Prerequisites.CDC).
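After editing `pg_hba.conf`, PostgreSQL re-reads it on a configuration reload, which you can trigger without restarting the server (a standard PostgreSQL function):

```
-- Re-read pg_hba.conf and postgresql.conf without a restart
SELECT pg_reload_conf();
```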

**Note**  
Some AWS DMS transactions are idle for some time before the DMS engine uses them again. By using the parameter `idle_in_transaction_session_timeout` in PostgreSQL versions 9.6 and higher, you can cause idle transactions to time out and fail. When you use AWS DMS, don't use this parameter to end idle transactions.

### Enabling CDC using a self-managed PostgreSQL database as a AWS DMS source
<a name="CHAP_Source.PostgreSQL.Prerequisites.CDC"></a>

AWS DMS supports change data capture (CDC) using logical replication. To enable logical replication of a self-managed PostgreSQL source database, set the following parameters and values in the `postgresql.conf` configuration file:
+ Set `wal_level = logical`.
+ Set `max_replication_slots` to a value greater than 1.

  Set the `max_replication_slots` value according to the number of tasks that you want to run. For example, to run five tasks you set a minimum of five slots. Slots open automatically as soon as a task starts and remain open even when the task is no longer running, so make sure to manually delete slots that you no longer need. Note that DMS automatically drops a replication slot when you delete the task, if DMS created the slot.
+ Set `max_wal_senders` to a value greater than 1.

  The `max_wal_senders` parameter sets the number of concurrent tasks that can run.
+ The `wal_sender_timeout` parameter ends replication connections that are inactive longer than the specified number of milliseconds. The default for an on-premises PostgreSQL database is 60000 milliseconds (60 seconds). Setting the value to 0 (zero) disables the timeout mechanism, and is a valid setting for DMS.

  When setting `wal_sender_timeout` to a non-zero value, a DMS task with CDC requires a minimum of 10000 milliseconds (10 seconds), and fails if the value is less than 10000. Keep the value less than 5 minutes to avoid causing a delay during a Multi-AZ failover of a DMS replication instance.
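Taken together, a minimal `postgresql.conf` fragment for the settings above might look like the following. The numeric values are illustrative; size `max_replication_slots` to the number of tasks that you plan to run.

```
wal_level = logical
max_replication_slots = 10    # at least one slot per concurrent DMS task
max_wal_senders = 10          # concurrent senders; must be greater than 1
wal_sender_timeout = 0        # disables the timeout; a valid setting for DMS
```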

Some parameters are static, and you can only set them at server start. Any changes to their entries in the configuration file (for a self-managed database) or DB parameter group (for an RDS for PostgreSQL database) are ignored until the server is restarted. For more information, see the [PostgreSQL documentation](https://www.postgresql.org/docs/current/intro-whatis.html).

For more information about enabling CDC, see [Enabling change data capture (CDC) using logical replication](#CHAP_Source.PostgreSQL.Security).

## Working with AWS-managed PostgreSQL databases as a DMS source
<a name="CHAP_Source.PostgreSQL.RDSPostgreSQL"></a>

You can use an AWS-managed PostgreSQL DB instance as a source for AWS DMS. You can perform both full-load tasks and change data capture (CDC) tasks using an AWS-managed PostgreSQL source. 

### Prerequisites for using an AWS-managed PostgreSQL database as a DMS source
<a name="CHAP_Source.PostgreSQL.RDSPostgreSQL.Prerequisites"></a>

Before migrating data from an AWS-managed PostgreSQL source database, do the following:
+ We recommend that you use an AWS user account with the minimum required permissions for the PostgreSQL DB instance as the user account for the PostgreSQL source endpoint for AWS DMS. Using the master account is not recommended. The account must have the `rds_superuser` role and the `rds_replication` role. The `rds_replication` role grants permissions to manage logical slots and to stream data using logical slots.

  Make sure to create several objects from the master user account for the account that you use. For information about creating these, see [Migrating an Amazon RDS for PostgreSQL database without using the master user account](#CHAP_Source.PostgreSQL.RDSPostgreSQL.NonMasterUser).
+ If your source database is in a virtual private cloud (VPC), choose the VPC security group that provides access to the DB instance where the database resides. The DMS replication instance needs this access to connect successfully to the source DB instance. When the database and the DMS replication instance are in the same VPC, add the security group to its own inbound rules.

**Note**  
Some AWS DMS transactions are idle for some time before the DMS engine uses them again. By using the parameter `idle_in_transaction_session_timeout` in PostgreSQL versions 9.6 and higher, you can cause idle transactions to time out and fail. When you use AWS DMS, don't use this parameter to end idle transactions.

### Enabling CDC with an AWS-managed PostgreSQL DB instance with AWS DMS
<a name="CHAP_Source.PostgreSQL.RDSPostgreSQL.CDC"></a>

AWS DMS supports CDC on Amazon RDS PostgreSQL databases when the DB instance is configured to use logical replication. The following table summarizes the logical replication compatibility of each AWS-managed PostgreSQL version. 


|  PostgreSQL version  |  AWS DMS full load support   |  AWS DMS CDC support  | 
| --- | --- | --- | 
|  Aurora PostgreSQL version 2.1 with PostgreSQL 10.5 compatibility (or lower)  |  Yes  |  No  | 
|  Aurora PostgreSQL version 2.2 with PostgreSQL 10.6 compatibility (or higher)   |  Yes  |  Yes  | 
|  RDS for PostgreSQL with PostgreSQL 10.21 compatibility (or higher)  |  Yes  |  Yes  | 

**To enable logical replication for an RDS PostgreSQL DB instance**

1. Use the AWS master user account for the PostgreSQL DB instance as the user account for the PostgreSQL source endpoint. The master user account has the required roles that allow it to set up CDC. 

   If you use an account other than the master user account, make sure to create several objects from the master account for the account that you use. For more information, see [Migrating an Amazon RDS for PostgreSQL database without using the master user account](#CHAP_Source.PostgreSQL.RDSPostgreSQL.NonMasterUser).

1. Set the `rds.logical_replication` parameter in your DB CLUSTER parameter group to 1. This static parameter requires a reboot of the DB instance to take effect. As part of applying this parameter, AWS DMS sets the `wal_level`, `max_wal_senders`, `max_replication_slots`, and `max_connections` parameters. These parameter changes can increase write ahead log (WAL) generation, so only set `rds.logical_replication` when you use logical replication slots.

1. The `wal_sender_timeout` parameter ends replication connections that are inactive longer than the specified number of milliseconds. The default for an AWS-managed PostgreSQL database is 30000 milliseconds (30 seconds). Setting the value to 0 (zero) disables the timeout mechanism, and is a valid setting for DMS.

   When setting `wal_sender_timeout` to a non-zero value, a DMS task with CDC requires a minimum of 10000 milliseconds (10 seconds), and fails if the value is between 0 and 10000. Keep the value less than 5 minutes to avoid causing a delay during a Multi-AZ failover of a DMS replication instance.

1.  Ensure that the value of the `max_worker_processes` parameter in your DB cluster parameter group is equal to or higher than the total combined values of `max_logical_replication_workers`, `autovacuum_max_workers`, and `max_parallel_workers`. A high number of background worker processes might impact application workloads on small instances, so monitor the performance of your database if you set `max_worker_processes` higher than the default value.

1.  When using Aurora PostgreSQL as a source with CDC, set `synchronous_commit` to `ON`.
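The numeric recommendations above for `wal_sender_timeout` and `max_worker_processes` can be checked programmatically before you start a task. The following is a minimal sketch; the function name and messages are illustrative, not part of AWS DMS:

```python
def check_cdc_parameters(wal_sender_timeout_ms,
                         max_worker_processes,
                         max_logical_replication_workers,
                         autovacuum_max_workers,
                         max_parallel_workers):
    """Check the wal_sender_timeout and max_worker_processes guidance above.

    Returns a list of problem descriptions (empty when both checks pass).
    """
    problems = []
    # 0 disables the timeout; values between 0 and 10000 ms fail a CDC task.
    if 0 < wal_sender_timeout_ms < 10000:
        problems.append("wal_sender_timeout must be 0 or at least 10000 ms")
    # Values of 5 minutes or more risk delays during a Multi-AZ failover.
    elif wal_sender_timeout_ms >= 300000:
        problems.append("keep wal_sender_timeout below 5 minutes")
    # max_worker_processes should cover the combined background worker demand.
    required = (max_logical_replication_workers
                + autovacuum_max_workers
                + max_parallel_workers)
    if max_worker_processes < required:
        problems.append(f"max_worker_processes should be at least {required}")
    return problems
```

For example, `check_cdc_parameters(30000, 16, 4, 3, 8)` passes both checks, while a `wal_sender_timeout` of 5000 ms is flagged as invalid for CDC.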

**To use a PostgreSQL Multi-AZ DB cluster read replica for CDC (ongoing replication)**

1. Set the `rds.logical_replication` and `sync_replication_slots` parameters in your DB CLUSTER parameter group to 1. These static parameters require a reboot of the DB instance to take effect.

1. Run the following command on the writer to create the `awsdms_ddl_audit` table, replacing `objects_schema` with the name of the schema to use:

   ```
   CREATE TABLE objects_schema.awsdms_ddl_audit
   (
     c_key    bigserial primary key,
     c_time   timestamp,    -- Informational
     c_user   varchar(64),  -- Informational: current_user
     c_txn    varchar(16),  -- Informational: current transaction
     c_tag    varchar(24),  -- Either 'CREATE TABLE' or 'ALTER TABLE' or 'DROP TABLE'
     c_oid    integer,      -- For future use - TG_OBJECTID
     c_name   varchar(64),  -- For future use - TG_OBJECTNAME
     c_schema varchar(64),  -- For future use - TG_SCHEMANAME. For now - holds current_schema
     c_ddlqry  text         -- The DDL query associated with the current DDL event
   );
   ```

1. Run the following command to create the `awsdms_intercept_ddl` function, replacing `objects_schema` with the name of the schema to use:

   ```
   CREATE OR REPLACE FUNCTION objects_schema.awsdms_intercept_ddl()
     RETURNS event_trigger
   LANGUAGE plpgsql
   SECURITY DEFINER
     AS $$
     declare _qry text;
   BEGIN
     if (tg_tag='CREATE TABLE' or tg_tag='ALTER TABLE' or tg_tag='DROP TABLE' or tg_tag = 'CREATE TABLE AS') then
            SELECT current_query() into _qry;
            insert into objects_schema.awsdms_ddl_audit
            values
            (
            default,current_timestamp,current_user,cast(TXID_CURRENT()as varchar(16)),tg_tag,0,'',current_schema,_qry
            );
            delete from objects_schema.awsdms_ddl_audit;
   end if;
   END;
   $$;
   ```

1. Run the following command to create the `awsdms_intercept_ddl` event trigger:

   ```
   CREATE EVENT TRIGGER awsdms_intercept_ddl ON ddl_command_end EXECUTE PROCEDURE objects_schema.awsdms_intercept_ddl();
   ```

   Ensure that all the users and roles that access these events have the necessary DDL permissions. For example:

   ```
   grant all on public.awsdms_ddl_audit to public;
   grant all on public.awsdms_ddl_audit_c_key_seq to public;
   ```

1. Create the replication slot on the writer:

   ```
   SELECT * FROM pg_create_logical_replication_slot('dms_read_replica_slot', 'test_decoding', false, true);
   ```

1. Ensure that the replication slot is available on the reader:

   ```
   select * from pg_catalog.pg_replication_slots where slot_name = 'dms_read_replica_slot';
   
   slot_name            |plugin       |slot_type|datoid|database|temporary|active|active_pid|xmin|catalog_xmin|restart_lsn|confirmed_flush_lsn|wal_status|safe_wal_size|two_phase|inactive_since               |conflicting|invalidation_reason|failover|synced|
   ---------------------+-------------+---------+------+--------+---------+------+----------+----+------------+-----------+-------------------+----------+-------------+---------+-----------------------------+-----------+-------------------+--------+------+
   dms_read_replica_slot|test_decoding|logical  |     5|postgres|false    |false |          |    |3559        |0/180011B8 |0/180011F0         |reserved  |             |true     |2025-02-10 15:45:04.083 +0100|false      |                   |false   |false |
   ```

1. Create a DMS source endpoint for the read replica, and set the logical replication slot name by using the following extra connection attribute:

   ```
   slotName=dms_read_replica_slot;
   ```

1. Create and start the CDC or full load and CDC task.
**Note**  
For CDC and full load and CDC migrations, DMS uses the task start time as the CDC start position. All older LSNs from the replication slot are ignored.

### Migrating an Amazon RDS for PostgreSQL database without using the master user account
<a name="CHAP_Source.PostgreSQL.RDSPostgreSQL.NonMasterUser"></a>

In some cases, you might not use the master user account for the Amazon RDS PostgreSQL DB instance that you are using as a source. In these cases, you create several objects to capture data definition language (DDL) events. You create these objects in an account other than the master account and then create an event trigger in the master user account.

**Note**  
If you set the `CaptureDdls` endpoint setting to `false` on the source endpoint, you don't have to create the following table and trigger on the source database.

Use the following procedure to create these objects.

**To create objects**

1. Choose the schema where the objects are to be created. The default schema is `public`. Ensure that the schema exists and is accessible by the `OtherThanMaster` account. 

1. Log in to the PostgreSQL DB instance using the user account other than the master account, here the `OtherThanMaster` account.

1. Create the table `awsdms_ddl_audit` by running the following command, replacing `objects_schema` in the following code with the name of the schema to use.

   ```
   CREATE TABLE objects_schema.awsdms_ddl_audit
   (
     c_key    bigserial primary key,
     c_time   timestamp,    -- Informational
     c_user   varchar(64),  -- Informational: current_user
     c_txn    varchar(16),  -- Informational: current transaction
     c_tag    varchar(24),  -- Either 'CREATE TABLE' or 'ALTER TABLE' or 'DROP TABLE'
     c_oid    integer,      -- For future use - TG_OBJECTID
     c_name   varchar(64),  -- For future use - TG_OBJECTNAME
     c_schema varchar(64),  -- For future use - TG_SCHEMANAME. For now - holds current_schema
     c_ddlqry  text         -- The DDL query associated with the current DDL event
   );
   ```

1. Create the function `awsdms_intercept_ddl` by running the following command, replacing `objects_schema` in the code following with the name of the schema to use.

   ```
   CREATE OR REPLACE FUNCTION objects_schema.awsdms_intercept_ddl()
     RETURNS event_trigger
   LANGUAGE plpgsql
   SECURITY DEFINER
     AS $$
     declare _qry text;
   BEGIN
     if (tg_tag='CREATE TABLE' or tg_tag='ALTER TABLE' or tg_tag='DROP TABLE' or tg_tag = 'CREATE TABLE AS') then
            SELECT current_query() into _qry;
            insert into objects_schema.awsdms_ddl_audit
            values
            (
            default,current_timestamp,current_user,cast(TXID_CURRENT()as varchar(16)),tg_tag,0,'',current_schema,_qry
            );
            delete from objects_schema.awsdms_ddl_audit;
   end if;
   END;
   $$;
   ```

1. Log out of the `OtherThanMaster` account and log in with an account that has the `rds_superuser` role assigned to it.

1. Create the event trigger `awsdms_intercept_ddl` by running the following command.

   ```
   CREATE EVENT TRIGGER awsdms_intercept_ddl ON ddl_command_end 
   EXECUTE PROCEDURE objects_schema.awsdms_intercept_ddl();
   ```

1. Make sure that all users and roles that access these events have the necessary DDL permissions. For example:

   ```
   grant all on public.awsdms_ddl_audit to public;
   grant all on public.awsdms_ddl_audit_c_key_seq to public;
   ```

When you have completed the procedure preceding, you can create the AWS DMS source endpoint using the `OtherThanMaster` account.

**Note**  
These events are triggered by `CREATE TABLE`, `ALTER TABLE`, and `DROP TABLE` statements.

## Enabling change data capture (CDC) using logical replication
<a name="CHAP_Source.PostgreSQL.Security"></a>

You can use PostgreSQL's native logical replication feature to enable change data capture (CDC) during database migration for PostgreSQL sources. You can use this feature with a self-managed PostgreSQL database and also with an Amazon RDS for PostgreSQL DB instance. This approach reduces downtime and helps ensure that the target database is in sync with the source PostgreSQL database.

AWS DMS supports CDC for PostgreSQL tables with primary keys. If a table does not have a primary key, the write-ahead logs (WAL) don't include a before image of the database row. In this case, DMS can't update the table. Here, you can use additional configuration settings and use table replica identity as a workaround. However, this approach can generate extra logs. We recommend that you use table replica identity as a workaround only after careful testing. For more information, see [Additional configuration settings when using a PostgreSQL database as a DMS source](#CHAP_Source.PostgreSQL.Advanced).

**Note**  
REPLICA IDENTITY FULL is supported with a logical decoding plugin, but isn't supported with a pglogical plugin. For more information, see [pglogical documentation](https://github.com/2ndQuadrant/pglogical#primary-key-or-replica-identity-required).

For full load and CDC tasks and for CDC-only tasks, AWS DMS uses logical replication slots to retain WAL logs for replication until the logs are decoded. On restart (not resume) of a full load and CDC task or a CDC-only task, the replication slot is recreated.

**Note**  
For logical decoding, DMS uses either the `test_decoding` or the pglogical plugin. If the pglogical plugin is available on a source PostgreSQL database, DMS creates a replication slot using pglogical; otherwise, the `test_decoding` plugin is used. For more information about the `test_decoding` plugin, see [PostgreSQL Documentation](https://www.postgresql.org/docs/9.4/test-decoding.html).  
If the database parameter `max_slot_wal_keep_size` is set to a non-default value, and the `restart_lsn` of a replication slot falls behind the current LSN by more than this size, the DMS task fails due to removal of required WAL files.
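The lag condition in the note above can be computed from `pg_replication_slots` output. The following sketch parses PostgreSQL LSN strings and compares the lag against a byte threshold; the helper names are illustrative, and note that on the server `max_slot_wal_keep_size` is configured in memory units (megabytes by default), so convert before comparing:

```python
def parse_lsn(lsn: str) -> int:
    """Convert a PostgreSQL LSN such as '4AF/B00000D0' to a byte offset.

    The part before the slash is the high 32 bits and the part after is
    the low 32 bits, both hexadecimal.
    """
    hi, lo = lsn.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)

def slot_at_risk(restart_lsn: str, current_lsn: str,
                 max_slot_wal_keep_bytes: int) -> bool:
    """True when the slot's restart_lsn lags the current LSN by more than
    the retention threshold -- the failure condition described above."""
    return parse_lsn(current_lsn) - parse_lsn(restart_lsn) > max_slot_wal_keep_bytes
```

You would feed in `restart_lsn` from `pg_replication_slots` and the value of `pg_current_wal_lsn()`.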

### Configuring the pglogical plugin
<a name="CHAP_Source.PostgreSQL.Security.Pglogical"></a>

Implemented as a PostgreSQL extension, the pglogical plugin is a logical replication system and model for selective data replication. The following table identifies source PostgreSQL database versions that support the pglogical plugin.


|  PostgreSQL source   |  Supports pglogical  | 
| --- | --- | 
|  Self-managed PostgreSQL 9.4 or higher  |  Yes  | 
|  Amazon RDS PostgreSQL 9.5 or lower  |  No  | 
|  Amazon RDS PostgreSQL 9.6 or higher  |  Yes  | 
|  Aurora PostgreSQL 1.x through 2.5.x  |  No  | 
|  Aurora PostgreSQL 2.6.x or higher  |  Yes  | 
|  Aurora PostgreSQL 3.3.x or higher  |  Yes  | 

Before configuring pglogical for use with AWS DMS, first enable logical replication for change data capture (CDC) on your PostgreSQL source database. 
+ For information about enabling logical replication for CDC on *self-managed* PostgreSQL source databases, see [Enabling CDC using a self-managed PostgreSQL database as an AWS DMS source](#CHAP_Source.PostgreSQL.Prerequisites.CDC).
+ For information about enabling logical replication for CDC on *AWS-managed* PostgreSQL source databases, see [Enabling CDC with an AWS-managed PostgreSQL DB instance with AWS DMS](#CHAP_Source.PostgreSQL.RDSPostgreSQL.CDC).

After logical replication is enabled on your PostgreSQL source database, use the following steps to configure pglogical for use with DMS.

**To use the pglogical plugin for logical replication on a PostgreSQL source database with AWS DMS**

1. Create a pglogical extension on your source PostgreSQL database:

   1. Set the correct parameter:
      + For self-managed PostgreSQL databases, set the database parameter `shared_preload_libraries= 'pglogical'`.
      + For PostgreSQL on Amazon RDS and Amazon Aurora PostgreSQL-Compatible Edition databases, set the parameter `shared_preload_libraries` to `pglogical` in the parameter group assigned to the DB instance or cluster.

   1. Restart your PostgreSQL source database.

   1. On the PostgreSQL database, run the command `create extension pglogical;`.

1. Run the following command to verify that pglogical installed successfully:

   `select * from pg_catalog.pg_extension;`

You can now create an AWS DMS task that performs change data capture for your PostgreSQL source database endpoint.

**Note**  
If you don't enable pglogical on your PostgreSQL source database, AWS DMS uses the `test_decoding` plugin by default. When pglogical is enabled for logical decoding, AWS DMS uses pglogical by default. However, you can set the `PluginName` extra connection attribute to use the `test_decoding` plugin instead.
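Extra connection attributes are a semicolon-delimited string of `name=value` pairs. The following sketch assembles one for a PostgreSQL source; the helper name is illustrative, while the `PluginName` and `slotName` attribute names come from this guide:

```python
def build_postgres_source_ecas(plugin_name=None, slot_name=None):
    """Assemble a semicolon-delimited extra connection attribute string
    for a PostgreSQL source endpoint from the optional attributes above."""
    attributes = []
    if plugin_name:
        attributes.append(f"PluginName={plugin_name}")
    if slot_name:
        attributes.append(f"slotName={slot_name}")
    # The DMS examples terminate each attribute with a semicolon.
    return "".join(a + ";" for a in attributes)
```

For example, `build_postgres_source_ecas("test_decoding")` yields `PluginName=test_decoding;`, which forces the `test_decoding` plugin even when pglogical is available.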

## Using native CDC start points to set up a CDC load of a PostgreSQL source
<a name="CHAP_Source.PostgreSQL.v10"></a>

To enable native CDC start points with PostgreSQL as a source, set the `slotName` extra connection attribute to the name of an existing logical replication slot when you create the endpoint. This logical replication slot holds ongoing changes from the time of endpoint creation, so it supports replication from a previous point in time. 

PostgreSQL writes the database changes to WAL files that are discarded only after AWS DMS successfully reads changes from the logical replication slot. Using logical replication slots can protect logged changes from being deleted before they are consumed by the replication engine. 

However, depending on the rate of change and consumption, changes held in a logical replication slot can cause elevated disk usage. We recommend that you set space usage alarms in the source PostgreSQL instance when you use logical replication slots. For more information on setting the `slotName` extra connection attribute, see [Endpoint settings and Extra Connection Attributes (ECAs) when using PostgreSQL as a DMS source](#CHAP_Source.PostgreSQL.ConnectionAttrib).

The following procedure walks through this approach in more detail.

**To use a native CDC start point to set up a CDC load of a PostgreSQL source endpoint**

1. Identify the logical replication slot used by an earlier replication task (a parent task) that you want to use as a start point. Then query the `pg_replication_slots` view on your source database to make sure that this slot does not have any active connections. If it does, resolve and close them before proceeding.

   For the following steps, assume that your logical replication slot is `abc1d2efghijk_34567890_z0yx98w7_6v54_32ut_1srq_1a2b34c5d67ef`. 

1. Create a new source endpoint that includes the following extra connection attribute setting.

   ```
   slotName=abc1d2efghijk_34567890_z0yx98w7_6v54_32ut_1srq_1a2b34c5d67ef;
   ```

1. Create a new CDC-only task using the console, AWS CLI, or AWS DMS API. For example, using the CLI you might run the following `create-replication-task` command. 

   ```
   aws dms create-replication-task --replication-task-identifier postgresql-slot-name-test 
   --source-endpoint-arn arn:aws:dms:us-west-2:012345678901:endpoint:ABCD1EFGHIJK2LMNOPQRST3UV4 
   --target-endpoint-arn arn:aws:dms:us-west-2:012345678901:endpoint:ZYX9WVUTSRQONM8LKJIHGF7ED6 
   --replication-instance-arn arn:aws:dms:us-west-2:012345678901:rep:AAAAAAAAAAA5BB4CCC3DDDD2EE 
   --migration-type cdc --table-mappings "file://mappings.json" --cdc-start-position "4AF/B00000D0" 
   --replication-task-settings "file://task-pg.json"
   ```

   In the preceding command, the following options are set:
   + The `source-endpoint-arn` option is set to the new value that you created in step 2.
   + The `replication-instance-arn` option is set to the same value as for the parent task from step 1.
   + The `table-mappings` and `replication-task-settings` options are set to the same values as for the parent task from step 1.
   + The `cdc-start-position` option is set to a start position value. To find this start position, either query the `pg_replication_slots` view on your source database or view the console details for the parent task in step 1. For more information, see [Determining a CDC native start point](CHAP_Task.CDC.md#CHAP_Task.CDC.StartPoint.Native).

   To enable custom CDC start mode when creating a new CDC-only task using the AWS DMS console, do the following:
   + In the **Task settings** section, for **CDC start mode for source transactions**, choose **Enable custom CDC start mode**.
   + For **Custom CDC start point for source transactions**, either choose **Specify a log sequence number** and specify the log sequence number, or choose **Specify a recovery checkpoint** and provide a recovery checkpoint.

   When this CDC task runs, AWS DMS raises an error if the specified logical replication slot does not exist. It also raises an error if the task isn't created with a valid setting for `cdc-start-position`.
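The CLI options above map directly to the AWS SDK parameters for `CreateReplicationTask`. The following sketch collects them into a request dictionary and mirrors the validation DMS performs on the start position; the function itself is illustrative and makes no AWS calls:

```python
def cdc_task_request(task_id, source_arn, target_arn, instance_arn,
                     cdc_start_position, table_mappings, task_settings):
    """Build a CreateReplicationTask request body for a CDC-only task.

    Raises ValueError when cdc_start_position is missing, mirroring the
    error DMS raises for a task without a valid start position.
    """
    if not cdc_start_position:
        raise ValueError("a CDC-only task needs a valid cdc-start-position")
    return {
        "ReplicationTaskIdentifier": task_id,
        "SourceEndpointArn": source_arn,
        "TargetEndpointArn": target_arn,
        "ReplicationInstanceArn": instance_arn,
        "MigrationType": "cdc",
        "CdcStartPosition": cdc_start_position,
        "TableMappings": table_mappings,
        "ReplicationTaskSettings": task_settings,
    }
```

In practice you would pass the resulting dictionary as keyword arguments to the SDK's `create_replication_task` call.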

When using native CDC start points with the pglogical plugin and you want to use a new replication slot, complete the setup steps following before creating a CDC task. 

**To use a new replication slot not previously created as part of another DMS task**

1. Create a replication slot, as shown following:

   ```
   SELECT * FROM pg_create_logical_replication_slot('replication_slot_name', 'pglogical');
   ```

1. After the database creates the replication slot, get and note the `restart_lsn` and `confirmed_flush_lsn` values for the slot:

   ```
   select * from pg_replication_slots where slot_name like 'replication_slot_name';
   ```

   Note that the native CDC start position for a CDC task created after the replication slot can't be older than the `confirmed_flush_lsn` value.

   For information about the `restart_lsn` and `confirmed_flush_lsn` values, see [pg_replication_slots](https://www.postgresql.org/docs/14/view-pg-replication-slots.html).

1. Create a pglogical node.

   ```
   SELECT pglogical.create_node(node_name := 'node_name', dsn := 'your_dsn_name');
   ```

1. Create two replication sets using the `pglogical.create_replication_set` function. The first replication set tracks updates and deletes for tables that have primary keys. The second replication set tracks only inserts, and has the same name as the first replication set, with the added prefix 'i'.

   ```
   SELECT pglogical.create_replication_set('replication_slot_name', false, true, true, false);
   SELECT pglogical.create_replication_set('ireplication_slot_name', true, false, false, true);
   ```

1. Add a table to the replication set.

   ```
   SELECT pglogical.replication_set_add_table('replication_slot_name', 'schemaname.tablename', true);
   SELECT pglogical.replication_set_add_table('ireplication_slot_name', 'schemaname.tablename', true);
   ```

1. Set the extra connection attribute (ECA) following when you create your source endpoint.

   ```
   PluginName=PGLOGICAL;slotName=slot_name;
   ```

You can now create a CDC-only task with a PostgreSQL native start point using the new replication slot. For more information about the pglogical plugin, see the [pglogical 3.7 documentation](https://www.enterprisedb.com/docs/pgd/3.7/pglogical/).
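The replication set and table registration steps above follow a fixed pattern: the insert-only set carries the `i` prefix, and every table is added to both sets. A sketch that generates the statements for one table (names are placeholders):

```python
def pglogical_setup_statements(slot_name, schema, table):
    """Emit the pglogical SQL from the steps above for one table: two
    replication sets (the insert-only set gets the 'i' prefix) and the
    matching replication_set_add_table calls."""
    insert_set = "i" + slot_name
    return [
        f"SELECT pglogical.create_replication_set("
        f"'{slot_name}', false, true, true, false);",
        f"SELECT pglogical.create_replication_set("
        f"'{insert_set}', true, false, false, true);",
        f"SELECT pglogical.replication_set_add_table("
        f"'{slot_name}', '{schema}.{table}', true);",
        f"SELECT pglogical.replication_set_add_table("
        f"'{insert_set}', '{schema}.{table}', true);",
    ]
```

For multiple tables, repeat only the two `replication_set_add_table` calls per table; the replication sets are created once.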

## Migrating from PostgreSQL to PostgreSQL using AWS DMS
<a name="CHAP_Source.PostgreSQL.Homogeneous"></a>

When you migrate from a database engine other than PostgreSQL to a PostgreSQL database, AWS DMS is almost always the best migration tool to use. But when you are migrating from a PostgreSQL database to a PostgreSQL database, PostgreSQL tools can be more effective.

### Using PostgreSQL native tools to migrate data
<a name="CHAP_Source.PostgreSQL.Homogeneous.Native"></a>

We recommend that you use PostgreSQL database migration tools such as `pg_dump` under the following conditions: 
+ You have a homogeneous migration, where you are migrating from a source PostgreSQL database to a target PostgreSQL database. 
+ You are migrating an entire database.
+ The native tools allow you to migrate your data with minimal downtime. 

The `pg_dump` utility uses the COPY command to create a schema and data dump of a PostgreSQL database. The dump script generated by `pg_dump` loads data into a database with the same name and recreates the tables, indexes, and foreign keys. To restore the data to a database with a different name, use the `pg_restore` command and the `-d` parameter.

If you are migrating data from a PostgreSQL source database running on EC2 to an Amazon RDS for PostgreSQL target, you can use the pglogical plugin.

For more information about importing a PostgreSQL database into Amazon RDS for PostgreSQL or Amazon Aurora PostgreSQL-Compatible Edition, see [https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/PostgreSQL.Procedural.Importing.html](https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/PostgreSQL.Procedural.Importing.html).

### Using DMS to migrate data from PostgreSQL to PostgreSQL
<a name="CHAP_Source.PostgreSQL.Homogeneous.DMS"></a>

AWS DMS can migrate data, for example, from a source PostgreSQL database that is on premises to a target Amazon RDS for PostgreSQL or Aurora PostgreSQL instance. Core or basic PostgreSQL data types most often migrate successfully.

**Note**  
When replicating partitioned tables from a PostgreSQL source to PostgreSQL target, you don’t need to mention the parent table as part of the selection criteria in the DMS task. Mentioning the parent table causes data to be duplicated in child tables on the target, possibly causing a PK violation. By selecting child tables alone in the table mapping selection criteria, the parent table is automatically populated.

Data types that are supported on the source database but aren't supported on the target might not migrate successfully. AWS DMS streams some data types as strings if the data type is unknown. Some data types, such as XML and JSON, can successfully migrate as small files but can fail if they are large documents. 

When performing data type migration, be aware of the following:
+ The PostgreSQL NUMERIC(p,s) data type can be used without specifying any precision and scale. For DMS versions 3.4.2 and earlier, DMS uses a precision of 28 and a scale of 6 by default, NUMERIC(28,6). For example, the value 0.611111104488373 from the source is converted to 0.611111 on the PostgreSQL target.
+ A table with an ARRAY data type must have a primary key. A table with an ARRAY data type missing a primary key gets suspended during full load.
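The default NUMERIC(28,6) behavior described above can be modeled with Python's `decimal` module. Whether DMS rounds or truncates digits beyond the sixth isn't stated here; this sketch truncates, which reproduces the documented example:

```python
from decimal import Decimal, ROUND_DOWN

def numeric_default_cast(value: str) -> Decimal:
    """Cut an unbounded NUMERIC value to the default scale of 6 that
    DMS 3.4.2 and earlier apply, i.e. NUMERIC(28,6)."""
    return Decimal(value).quantize(Decimal("0.000001"), rounding=ROUND_DOWN)
```

For example, `numeric_default_cast("0.611111104488373")` yields `0.611111`, matching the conversion described above.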

The following table shows source PostgreSQL data types and whether they can be migrated successfully.


| Data type | Migrates successfully | Partially migrates | Does not migrate | Comments | 
| --- | --- | --- | --- | --- | 
| INTEGER | X |  |  |  | 
| SMALLINT | X |  |  |  | 
| BIGINT | X |  |  |  | 
| NUMERIC/DECIMAL(p,s) |  | X |  | Where 0<p<39 and 0<s | 
| NUMERIC/DECIMAL |  | X |  | Where p>38 or p=s=0 | 
| REAL | X |  |  |  | 
| DOUBLE | X |  |  |  | 
| SMALLSERIAL | X |  |  |  | 
| SERIAL | X |  |  |  | 
| BIGSERIAL | X |  |  |  | 
| MONEY | X |  |  |  | 
| CHAR |  | X |  | Without specified precision | 
| CHAR(n) | X |  |  |  | 
| VARCHAR |  | X |  | Without specified precision | 
| VARCHAR(n) | X |  |  |  | 
| TEXT | X |  |  |  | 
| BYTEA | X |  |  |  | 
| TIMESTAMP | X |  |  | Positive and negative infinity values are truncated to '9999-12-31 23:59:59' and '4713-01-01 00:00:00 BC' respectively. | 
| TIMESTAMP WITH TIME ZONE |  | X |  |  | 
| DATE | X |  |  |  | 
| TIME | X |  |  |  | 
| TIME WITH TIME ZONE | X |  |  |  | 
| INTERVAL |  | X |  |  | 
| BOOLEAN | X |  |  |  | 
| ENUM |  |  | X |  | 
| CIDR | X |  |  |  | 
| INET |  |  | X |  | 
| MACADDR |  |  | X |  | 
| TSVECTOR |  |  | X |  | 
| TSQUERY |  |  | X |  | 
| XML |  | X |  |  | 
| POINT | X |  |  | PostGIS spatial data type | 
| LINE |  |  | X |  | 
| LSEG |  |  | X |  | 
| BOX |  |  | X |  | 
| PATH |  |  | X |  | 
| POLYGON | X |  |  | PostGIS spatial data type | 
| CIRCLE |  |  | X |  | 
| JSON |  | X |  |  | 
| ARRAY | X |  |  | Requires Primary Key | 
| COMPOSITE |  |  | X |  | 
| RANGE |  |  | X |  | 
| LINESTRING | X |  |  | PostGIS spatial data type | 
| MULTIPOINT | X |  |  | PostGIS spatial data type | 
| MULTILINESTRING | X |  |  | PostGIS spatial data type | 
| MULTIPOLYGON | X |  |  | PostGIS spatial data type | 
| GEOMETRYCOLLECTION | X |  |  | PostGIS spatial data type | 

### Migrating PostGIS spatial data types
<a name="CHAP_Source.PostgreSQL.DataTypes.Spatial"></a>

*Spatial data* identifies the geometry information of an object or location in space. PostgreSQL object-relational databases support PostGIS spatial data types. 

Before migrating PostgreSQL spatial data objects, ensure that the PostGIS plugin is enabled at the global level. Doing this ensures that AWS DMS creates exact replicas of the source spatial data columns on the target PostgreSQL DB instance.

For PostgreSQL to PostgreSQL homogeneous migrations, AWS DMS supports the migration of PostGIS geometric and geographic (geodetic coordinates) data object types and subtypes such as the following:
+  POINT 
+  LINESTRING 
+  POLYGON 
+  MULTIPOINT 
+  MULTILINESTRING 
+  MULTIPOLYGON 
+  GEOMETRYCOLLECTION 

## Migrating from Babelfish for Amazon Aurora PostgreSQL using AWS DMS
<a name="CHAP_Source.PostgreSQL.Babelfish"></a>

You can migrate Babelfish for Aurora PostgreSQL source tables to any supported target endpoints using AWS DMS.

When you create your AWS DMS source endpoint using the DMS console, API, or CLI commands, you set the source to **Amazon Aurora PostgreSQL**, and the database name to `babelfish_db`. In the **Endpoint Settings** section, make sure that **DatabaseMode** is set to **Babelfish**, and **BabelfishDatabaseName** is set to the name of the source Babelfish T-SQL database. Instead of using the Babelfish TCP port **1433**, use the Aurora PostgreSQL TCP port **5432**.

You must create your tables before migrating data to make sure that DMS uses the correct data types and table metadata. If you don't create your tables on the target before running migration, DMS may create the tables with incorrect data types and permissions.

### Adding transformation rules to your migration task
<a name="CHAP_Source.PostgreSQL.Babelfish.Transform"></a>

When you create a migration task for a Babelfish source, you need to include transformation rules that ensure that DMS uses the pre-created target tables.

If you set multi-database migration mode when you defined your Babelfish for PostgreSQL cluster, add a transformation rule that renames the schema name to the T-SQL schema. For example, if the T-SQL schema name is `dbo`, and your Babelfish for PostgreSQL schema name is `mydb_dbo`, rename the schema to `dbo` using a transformation rule. To find the PostgreSQL schema name, see [Babelfish architecture](https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/babelfish-architecture.html) in the *Amazon Aurora User Guide*. 

If you use single-database mode, you don't need to use a transformation rule to rename database schemas. PostgreSQL schema names have a one-to-one mapping to the schema names in the T-SQL database.

The following transformation rule example shows how to rename the schema name from `mydb_dbo` back to `dbo`:

```
{
    "rules": [
        {
            "rule-type": "transformation",
            "rule-id": "566251737",
            "rule-name": "566251737",
            "rule-target": "schema",
            "object-locator": {
                "schema-name": "mydb_dbo"
            },
            "rule-action": "rename",
            "value": "dbo",
            "old-value": null
        },
        {
            "rule-type": "selection",
            "rule-id": "566111704",
            "rule-name": "566111704",
            "object-locator": {
                "schema-name": "mydb_dbo",
                "table-name": "%"
            },
            "rule-action": "include",
            "filters": []
        }
    ]
}
```
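For multi-database mode, a rule pair like the one above is needed for each T-SQL schema. The following sketch generates the pair for one schema; the rule ids are arbitrary placeholders, and the rule structure follows the example above:

```python
import json

def babelfish_schema_rules(pg_schema: str, tsql_schema: str) -> str:
    """Build the rename + selection rule pair shown above for one schema."""
    rules = {
        "rules": [
            {
                "rule-type": "transformation",
                "rule-id": "1",
                "rule-name": "1",
                "rule-target": "schema",
                "object-locator": {"schema-name": pg_schema},
                "rule-action": "rename",
                "value": tsql_schema,
                "old-value": None,
            },
            {
                "rule-type": "selection",
                "rule-id": "2",
                "rule-name": "2",
                "object-locator": {"schema-name": pg_schema,
                                   "table-name": "%"},
                "rule-action": "include",
                "filters": [],
            },
        ]
    }
    return json.dumps(rules, indent=4)
```

For example, `babelfish_schema_rules("mydb_dbo", "dbo")` emits JSON equivalent to the example above.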

### Limitations for using a PostgreSQL source endpoint with Babelfish tables
<a name="CHAP_Source.PostgreSQL.Babelfish.Limitations"></a>

The following limitations apply when using a PostgreSQL source endpoint with Babelfish tables:
+ DMS supports migration only from Babelfish version 16.2/15.6 and later, and requires DMS version 3.5.3 or later.
+ DMS does not replicate Babelfish table definition changes to the target endpoint. A workaround for this limitation is to first apply the table definition changes on the target, and then change the table definition on the Babelfish source.
+ When you create Babelfish tables with the BYTEA data type, DMS converts them to the `varbinary(max)` data type when migrating to SQL Server as a target.
+ DMS does not support Full LOB mode for binary data types. Use limited LOB mode for binary data types instead.
+ DMS does not support data validation for Babelfish as a source.
+ For the **Target table preparation mode** task setting, use only the **Do nothing** or **Truncate** modes. Don't use the **Drop tables on target** mode. When using **Drop tables on target**, DMS may create the tables with incorrect data types.
+ When using ongoing replication (CDC or Full load and CDC), set the `PluginName` extra connection attribute (ECA) to `TEST_DECODING`.
+ DMS does not support replication (CDC or Full load and CDC) of partitioned tables for Babelfish as a source.

## Removing AWS DMS artifacts from a PostgreSQL source database
<a name="CHAP_Source.PostgreSQL.CleanUp"></a>

To capture DDL events, AWS DMS creates various artifacts in the PostgreSQL database when a migration task starts. When the task completes, you might want to remove these artifacts.

To remove the artifacts, issue the following statements (in the order they appear), where `{AmazonRDSMigration}` is the schema in which the artifacts were created. Dropping a schema should be done with extreme caution. Never drop an operational schema, especially not a public one.

```
drop event trigger awsdms_intercept_ddl;
```

The event trigger does not belong to a specific schema.

```
drop function {AmazonRDSMigration}.awsdms_intercept_ddl()
drop table {AmazonRDSMigration}.awsdms_ddl_audit
drop schema {AmazonRDSMigration}
```

## Additional configuration settings when using a PostgreSQL database as a DMS source
<a name="CHAP_Source.PostgreSQL.Advanced"></a>

You can add additional configuration settings when migrating data from a PostgreSQL database in two ways:
+ You can add values to the extra connection attribute to capture DDL events and to specify the schema in which the operational DDL database artifacts are created. For more information, see [Endpoint settings and Extra Connection Attributes (ECAs) when using PostgreSQL as a DMS source](#CHAP_Source.PostgreSQL.ConnectionAttrib).
+ You can override connection string parameters. Choose this option to do either of the following:
  + Specify internal AWS DMS parameters. Such parameters are rarely required and so aren't exposed in the user interface.
  + Specify pass-through (passthru) values for the specific database client. AWS DMS includes pass-through parameters in the connection string passed to the database client.
+ By using the table-level parameter `REPLICA IDENTITY` in PostgreSQL versions 9.4 and higher, you can control the information written to write-ahead logs (WALs), in particular the information that identifies rows that are updated or deleted. `REPLICA IDENTITY FULL` records the old values of all columns in the row. Use `REPLICA IDENTITY FULL` carefully for each table, because `FULL` generates extra WAL volume that might not be necessary. For more information, see [ALTER TABLE-REPLICA IDENTITY](https://www.postgresql.org/docs/devel/sql-altertable.html).

## Read replica as a source for PostgreSQL
<a name="CHAP_Source.PostgreSQL.ReadReplica"></a>

You can use PostgreSQL read replicas as CDC sources in AWS DMS to reduce the operational load on your primary database. This feature is available from PostgreSQL 16.x and requires AWS DMS version 3.6.1 or later.

**Note**  
Amazon RDS PostgreSQL version 16.x has limitations for read replica logical replication in Three-AZ (TAZ) configurations. For full read replica logical replication support in TAZ deployments, use PostgreSQL version 17.x or later.

### Prerequisites
<a name="CHAP_Source.PostgreSQL.ReadReplica.prereq"></a>

Before using a read replica as a CDC source for AWS DMS, enable logical replication on both the primary DB instance and its read replica so that logical decoding can run on the replica. Perform the following actions:
+ Enable logical replication on both your primary DB instance and its read replica, along with any other required database parameters. For more information, see [Working with AWS-managed PostgreSQL databases as a DMS source](https://docs.aws.amazon.com/dms/latest/userguide/CHAP_Source.PostgreSQL.html#CHAP_Source.PostgreSQL.RDSPostgreSQL).
+ For CDC-only tasks, create a replication slot on the primary (writer) instance. For more information, see [Using native CDC start points to set up a CDC load of a PostgreSQL source](https://docs.aws.amazon.com/dms/latest/userguide/CHAP_Source.PostgreSQL.html#CHAP_Source.PostgreSQL.v10). This action is necessary because read replicas don't support the creation of replication slots.
+ For PostgreSQL version 16, the slot must be manually created on the read replica.
+ For PostgreSQL version 17 and above, the replication slot must be created on the primary, and it is automatically synchronized to the read replica.
+ When using full load plus CDC or CDC-only tasks, AWS DMS can automatically manage logical replication slots on primary instances but not on read replicas. For PostgreSQL version 16 read replicas, you must manually drop and recreate replication slots before restarting a task (not resuming). Skipping this step can cause task failures or incorrect CDC starting positions. From PostgreSQL version 17 onwards, logical slot synchronization from the primary instance automates this process.
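For PostgreSQL version 16, creating the slot manually on the read replica might look like the following sketch. The slot name is illustrative; use the output plugin that matches your endpoint configuration.

```
-- Run on the PostgreSQL 16 read replica before starting the DMS task.
SELECT pg_create_logical_replication_slot('dms_replica_slot', 'test_decoding');

-- Confirm the slot exists and is available.
SELECT slot_name, plugin, active
FROM pg_replication_slots;
```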

After completing the prerequisites, you can set up your AWS DMS source endpoint with the replication `SlotName` of the read replica source in the endpoint settings, and configure your AWS DMS task using native CDC start points. For more information, see [Endpoint settings and Extra Connection Attributes (ECAs) when using PostgreSQL as a DMS source](https://docs.aws.amazon.com/dms/latest/userguide/CHAP_Source.PostgreSQL.html#CHAP_Source.PostgreSQL.ConnectionAttrib) and [Using native CDC start points to set up a CDC load of a PostgreSQL source](https://docs.aws.amazon.com/dms/latest/userguide/CHAP_Source.PostgreSQL.html#CHAP_Source.PostgreSQL.v10).

## Using the `MapBooleanAsBoolean` PostgreSQL endpoint setting
<a name="CHAP_Source.PostgreSQL.ConnectionAttrib.Endpointsetting"></a>

You can use PostgreSQL endpoint settings to map a boolean as a boolean from your PostgreSQL source to an Amazon Redshift target. By default, a BOOLEAN type is migrated as varchar(5). You can specify `MapBooleanAsBoolean` to migrate the BOOLEAN type as BOOLEAN, as shown in the following example.

```
--postgre-sql-settings '{"MapBooleanAsBoolean": true}'
```

Note that you must set this setting on both the source and target endpoints for it to take effect.

Since MySQL does not have a BOOLEAN type, use a transformation rule rather than this setting when migrating BOOLEAN data to MySQL.

## Endpoint settings and Extra Connection Attributes (ECAs) when using PostgreSQL as a DMS source
<a name="CHAP_Source.PostgreSQL.ConnectionAttrib"></a>

You can use endpoint settings and extra connection attributes (ECAs) to configure your PostgreSQL source database. You specify endpoint settings when you create the source endpoint using the AWS DMS console, or by using the `create-endpoint` command in the [AWS CLI](https://docs.aws.amazon.com/cli/latest/reference/dms/index.html), with the `--postgre-sql-settings '{"EndpointSetting": "value", ...}'` JSON syntax.

The following table shows the endpoint settings and ECAs that you can use with PostgreSQL as a source.

[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/dms/latest/userguide/CHAP_Source.PostgreSQL.html)

## Limitations on using a PostgreSQL database as a DMS source
<a name="CHAP_Source.PostgreSQL.Limitations"></a>

The following limitations apply when using PostgreSQL as a source for AWS DMS:
+ AWS DMS does not work with Amazon RDS for PostgreSQL 10.4 or Amazon Aurora PostgreSQL 10.4 either as source or target.
+ A captured table must have a primary key. If a table does not have a primary key, AWS DMS ignores DELETE and UPDATE record operations for that table. As a workaround, see [Enabling change data capture (CDC) using logical replication](#CHAP_Source.PostgreSQL.Security). 

  **Note:** We don't recommend migrating tables without a primary key or unique index. Without one, additional limitations apply, such as no batch-apply capability, no full LOB support, no data validation, and inefficient replication to an Amazon Redshift target.
+ AWS DMS ignores an attempt to update a primary key segment. In these cases, the target identifies the update as one that didn't update any rows. However, because the results of updating a primary key in PostgreSQL are unpredictable, no records are written to the exceptions table.
+ AWS DMS does not support the **Start Process Changes from Timestamp** run option.
+ AWS DMS does not replicate changes that result from partition or subpartition operations (`ADD`, `DROP`, or `TRUNCATE`).
+ Replication of multiple tables with the same name where each name has a different case (for example, table1, TABLE1, and Table1) can cause unpredictable behavior. Because of this issue, AWS DMS does not support this type of replication.
+ In most cases, AWS DMS supports change processing of CREATE, ALTER, and DROP DDL statements for tables. AWS DMS does not support this change processing if the tables are held in an inner function or procedure body block or in other nested constructs.

  For example, the following change isn't captured.

  ```
  CREATE OR REPLACE FUNCTION attu.create_distributors1() RETURNS void
  LANGUAGE plpgsql
  AS $$
  BEGIN
  create table attu.distributors1(did serial PRIMARY KEY,name
  varchar(40) NOT NULL);
  END;
  $$;
  ```
+ Currently, `boolean` data types in a PostgreSQL source are migrated to a SQL Server target as `bit` data type with inconsistent values. As a workaround, pre-create the table with a `VARCHAR(1)` data type for the column (or have AWS DMS create the table). Then have downstream processing treat an "F" as False and a "T" as True.
+ AWS DMS does not support change processing of TRUNCATE operations.
+ The OID LOB data type isn't migrated to the target.
+ AWS DMS supports the PostGIS data type for only homogeneous migrations.
+ If your source is a PostgreSQL database that is on-premises or on an Amazon EC2 instance, ensure that the `test_decoding` output plugin is installed on your source endpoint. You can find this plugin in the PostgreSQL contrib package. For more information about the `test_decoding` plugin, see the [PostgreSQL documentation](https://www.postgresql.org/docs/10/static/test-decoding.html).
+ AWS DMS does not support change processing to set and unset column default values (using the ALTER COLUMN SET DEFAULT clause on ALTER TABLE statements).
+ AWS DMS does not support change processing to set column nullability (using the ALTER COLUMN [SET|DROP] NOT NULL clause on ALTER TABLE statements).
+ When logical replication is enabled, the maximum number of changes kept in memory per transaction is 4 MB. After that, changes are spilled to disk. As a result, `ReplicationSlotDiskUsage` increases, and `restart_lsn` doesn't advance until the transaction completes or is stopped and the rollback finishes. Because rolling back a long transaction can itself take a long time, avoid long-running transactions or many subtransactions when logical replication is enabled. Instead, break the transaction into several smaller transactions. 

  On Aurora PostgreSQL versions 13 and later, you can tune the `logical_decoding_work_mem` parameter to control when DMS spills change data to disk. For more information, see [Spill files in Aurora PostgreSQL](CHAP_Troubleshooting_Latency_Source_PostgreSQL.md#CHAP_Troubleshooting_Latency_Source_PostgreSQL_Spill).
+ A table with an ARRAY data type must have a primary key. A table with an ARRAY data type missing a primary key gets suspended during full load.
+ AWS DMS does not support migrating table metadata related to table partitioning or [table inheritance](https://www.postgresql.org/docs/15/ddl-inherit.html). When AWS DMS encounters a partitioned table or a table that uses inheritance, the following behavior is observed:
  + AWS DMS identifies and reports both parent and child tables involved in partitioning or inheritance on the source database.
  + **Table Creation on Target**: On the target database, AWS DMS creates the table as a standard (non-partitioned, non-inherited) table, preserving the structure and properties of the selected table(s) but not the partitioning or inheritance logic.
  + **Record Differentiation in Inherited Tables**: For tables that use inheritance, AWS DMS does not distinguish records belonging to child tables when populating the parent table. As a result, it does not utilize SQL queries with syntax such as: `SELECT * FROM ONLY parent_table_name`.
+ To replicate partitioned tables from a PostgreSQL source to a PostgreSQL target, first manually create the parent and child tables on the target. Then define a separate task to replicate to those tables. In such a case, set the task configuration to **Truncate before loading**.
+ The PostgreSQL `NUMERIC` data type isn't fixed in size. When transferring data that is a `NUMERIC` data type but without precision and scale, DMS uses `NUMERIC(28,6)` (a precision of 28 and scale of 6) by default. As an example, the value 0.611111104488373 from the source is converted to 0.611111 on the PostgreSQL target.
+ AWS DMS supports Aurora PostgreSQL Serverless V1 as a source for full load tasks only. AWS DMS supports Aurora PostgreSQL Serverless V2 as a source for full load, full load and CDC, and CDC-only tasks.
+ AWS DMS does not support replication of a table with a unique index created with a coalesce function.
+ If primary key definition on source and target does not match, results of replication may be unpredictable.
+ When using the Parallel Load feature, table segmentation according to partitions or sub-partitions isn't supported. For more information about Parallel Load, see [Using parallel load for selected tables, views, and collections](CHAP_Tasks.CustomizingTasks.TableMapping.SelectionTransformation.Tablesettings.md#CHAP_Tasks.CustomizingTasks.TableMapping.SelectionTransformation.Tablesettings.ParallelLoad) 
+ AWS DMS does not support Deferred Constraints.
+ AWS DMS version 3.4.7 supports PostgreSQL 14.x as a source with these limitations:
  + AWS DMS does not support change processing of two phase commits.
  + AWS DMS does not support logical replication to stream long in-progress transactions.
+ AWS DMS does not support CDC for Amazon RDS Proxy for PostgreSQL as a source.
+ When using [source filters](CHAP_Tasks.CustomizingTasks.Filters.md) that don't contain a Primary Key column, `DELETE` operations won't be captured.
+ If your source database is also a target for another third-party replication system, DDL changes might not migrate during CDC, because that situation can prevent the `awsdms_intercept_ddl` event trigger from firing. To work around the situation, modify that trigger on your source database as follows:

  ```
  alter event trigger awsdms_intercept_ddl enable always;
  ```
+ AWS DMS does not support replication of changes made to primary key definitions in the source database. If the primary key structure is altered during an active replication task, subsequent changes to affected tables are not replicated to the target.
+ In DDL replication as part of a script, the maximum total number of DDL commands per script is 8192 and maximum total number of lines per script is 8192 lines.
+ AWS DMS does not support Materialized Views.
+ For full load and CDC tasks using a read replica as the source, AWS DMS cannot create replication slots on read replicas.
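For the logical replication limitation above, you can monitor how far each slot's `restart_lsn` lags behind the current WAL position; a large, growing lag often indicates a long-running transaction holding the slot back. A sketch:

```
-- Lag in bytes between the current WAL position and each slot's restart_lsn.
SELECT slot_name,
       pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn) AS lag_bytes
FROM pg_replication_slots;
```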

## Source data types for PostgreSQL
<a name="CHAP_Source-PostgreSQL-DataTypes"></a>

The following table shows the PostgreSQL source data types that are supported when using AWS DMS and the default mapping to AWS DMS data types.

For information on how to view the data type that is mapped in the target, see the section for the target endpoint you are using.

For additional information about AWS DMS data types, see [Data types for AWS Database Migration Service](CHAP_Reference.DataTypes.md).


|  PostgreSQL data types  |  DMS data types  | 
| --- | --- | 
|  INTEGER  |  INT4  | 
|  SMALLINT  |  INT2  | 
|  BIGINT  |  INT8  | 
|  NUMERIC (p,s)  |  If precision is from 0 through 38, then use NUMERIC. If precision is 39 or greater, then use STRING.  | 
|  DECIMAL(P,S)  |  If precision is from 0 through 38, then use NUMERIC. If precision is 39 or greater, then use STRING.  | 
|  REAL  |  REAL4  | 
|  DOUBLE  |  REAL8  | 
|  SMALLSERIAL  |  INT2  | 
|  SERIAL  |  INT4  | 
|  BIGSERIAL  |  INT8  | 
|  MONEY  |  NUMERIC(38,4) The MONEY data type is mapped to FLOAT in SQL Server.  | 
|  CHAR  |  WSTRING (1)  | 
|  CHAR(N)  |  WSTRING (n)  | 
|  VARCHAR(N)  |  WSTRING (n)  | 
|  TEXT  |  NCLOB  | 
|  CITEXT  |  NCLOB  | 
|  BYTEA  |  BLOB  | 
|  TIMESTAMP  |  DATETIME  | 
|  TIMESTAMP WITH TIME ZONE  |  DATETIME  | 
|  DATE  |  DATE  | 
|  TIME  |  TIME  | 
|  TIME WITH TIME ZONE  |  TIME  | 
|  INTERVAL  |  STRING (128)—1 YEAR, 2 MONTHS, 3 DAYS, 4 HOURS, 5 MINUTES, 6 SECONDS  | 
|  BOOLEAN  |  CHAR (5) false or true  | 
|  ENUM  |  STRING (64)  | 
|  CIDR  |  STRING (50)  | 
|  INET  |  STRING (50)  | 
|  MACADDR  |  STRING (18)  | 
|  BIT (n)  |  STRING (n)  | 
|  BIT VARYING (n)  |  STRING (n)  | 
|  UUID  |  STRING  | 
|  TSVECTOR  |  CLOB  | 
|  TSQUERY  |  CLOB  | 
|  XML  |  CLOB  | 
|  POINT  |  STRING (255) "(x,y)"  | 
|  LINE  |  STRING (255) "(x,y,z)"  | 
|  LSEG  |  STRING (255) "((x1,y1),(x2,y2))"  | 
|  BOX  |  STRING (255) "((x1,y1),(x2,y2))"  | 
|  PATH  |  CLOB "((x1,y1),(xn,yn))"  | 
|  POLYGON  |  CLOB "((x1,y1),(xn,yn))"  | 
|  CIRCLE  |  STRING (255) "(x,y),r"  | 
|  JSON  |  NCLOB  | 
|  JSONB  |  NCLOB  | 
|  ARRAY  |  NCLOB  | 
|  COMPOSITE  |  NCLOB  | 
|  HSTORE  |  NCLOB  | 
|  INT4RANGE  |  STRING (255)  | 
|  INT8RANGE  |  STRING (255)  | 
|  NUMRANGE  |  STRING (255)  | 
|  STRRANGE  |  STRING (255)  | 

### Working with LOB source data types for PostgreSQL
<a name="CHAP_Source-PostgreSQL-DataTypes-LOBs"></a>

PostgreSQL column sizes affect the conversion of PostgreSQL LOB data types to AWS DMS data types. To work with this conversion, take the following steps for these AWS DMS data types:
+ BLOB – Set **Limit LOB size to** the **Maximum LOB size (KB)** value at task creation.
+ CLOB – Replication handles each character as a UTF8 character. Therefore, find the length of the longest character text in the column, shown here as `max_num_chars_text`. Use this length to specify the value for **Limit LOB size to**. If the data includes 4-byte characters, multiply by 2 to specify the **Limit LOB size to** value, which is in bytes. In this case, **Limit LOB size to** is equal to `max_num_chars_text` multiplied by 2.
+ NCLOB – Replication handles each character as a double-byte character. Therefore, find the length of the longest character text in the column (`max_num_chars_text`) and multiply by 2. You do this to specify the value for **Limit LOB size to**. In this case, **Limit LOB size to** is equal to `max_num_chars_text` multiplied by 2. If the data includes 4-byte characters, multiply by 2 again. In this case, **Limit LOB size to** is equal to `max_num_chars_text` multiplied by 4.
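To find `max_num_chars_text` for the CLOB and NCLOB sizing above, a query along these lines can help (the table and column names are illustrative):

```
-- char_length counts characters, not bytes; use the result as
-- max_num_chars_text in the sizing guidance above.
SELECT max(char_length(description)) AS max_num_chars_text
FROM my_schema.my_table;
```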

# Using a MySQL-compatible database as a source for AWS DMS
<a name="CHAP_Source.MySQL"></a>

You can migrate data from any MySQL-compatible database (MySQL, MariaDB, or Amazon Aurora MySQL) using AWS Database Migration Service. 

For information about versions of MySQL that AWS DMS supports as a source, see [Sources for AWS DMS](CHAP_Introduction.Sources.md). 

You can use SSL to encrypt connections between your MySQL-compatible endpoint and the replication instance. For more information on using SSL with a MySQL-compatible endpoint, see [Using SSL with AWS Database Migration Service](CHAP_Security.SSL.md).

In the following sections, the term "self-managed" applies to any database that is installed either on-premises or on Amazon EC2. The term "AWS-managed" applies to any database on Amazon RDS, Amazon Aurora, or Amazon S3.

For additional details on working with MySQL-compatible databases and AWS DMS, see the following sections.

**Topics**
+ [

## Migrating from MySQL to MySQL using AWS DMS
](#CHAP_Source.MySQL.Homogeneous)
+ [

## Using any MySQL-compatible database as a source for AWS DMS
](#CHAP_Source.MySQL.Prerequisites)
+ [

## Using a self-managed MySQL-compatible database as a source for AWS DMS
](#CHAP_Source.MySQL.CustomerManaged)
+ [

## Using an AWS-managed MySQL-compatible database as a source for AWS DMS
](#CHAP_Source.MySQL.AmazonManaged)
+ [

## Limitations on using a MySQL database as a source for AWS DMS
](#CHAP_Source.MySQL.Limitations)
+ [

## Support for XA transactions
](#CHAP_Source.MySQL.XA)
+ [

## Endpoint settings when using MySQL as a source for AWS DMS
](#CHAP_Source.MySQL.ConnectionAttrib)
+ [

## Source data types for MySQL
](#CHAP_Source.MySQL.DataTypes)

**Note**  
When configuring AWS Database Migration Service (AWS DMS) mapping rules, it is important to avoid using wildcards (%) for database or schema names. Instead, you must explicitly specify only the user-created databases that need to be migrated. Using a wildcard character includes all databases in the migration process, including system databases which are not required on the target instance. Since the MySQL Amazon RDS master user lacks the necessary permissions to import data into target system databases, attempting to migrate these system databases fails.

## Migrating from MySQL to MySQL using AWS DMS
<a name="CHAP_Source.MySQL.Homogeneous"></a>

For a heterogeneous migration, where you migrate from a database engine other than MySQL to a MySQL database, AWS DMS is almost always the best migration tool to use. But for a homogeneous migration, where you migrate from a MySQL database to a MySQL database, we recommend that you use homogeneous data migrations in a DMS migration project. Homogeneous data migrations use native database tools to provide improved migration performance and accuracy compared to AWS DMS.

## Using any MySQL-compatible database as a source for AWS DMS
<a name="CHAP_Source.MySQL.Prerequisites"></a>

Before you begin to work with a MySQL database as a source for AWS DMS, make sure that you have the following prerequisites. These prerequisites apply to either self-managed or AWS-managed sources.

You must have an account for AWS DMS that has the Replication Admin role. The role needs the following privileges:
+ **REPLICATION CLIENT** – This privilege is required for CDC tasks only. In other words, full-load-only tasks don't require this privilege. 
**Note**  
For MariaDB version 10.5.2 and higher, you can use BINLOG MONITOR as a replacement for REPLICATION CLIENT.
+ **REPLICATION SLAVE** – This privilege is required for CDC tasks only. In other words, full-load-only tasks don't require this privilege.
+ **SUPER** – This privilege is required only in MySQL versions before 5.6.6.

The AWS DMS user must also have SELECT privileges for the source tables designated for replication.

Grant the following privileges if you use MySQL-specific premigration assessments:

```
grant select on mysql.user to <dms_user>;
grant select on mysql.db to <dms_user>;
grant select on mysql.tables_priv to <dms_user>;
grant select on mysql.role_edges to <dms_user>;  #only for MySQL version 8.0.11 and higher
grant select on performance_schema.replication_connection_status to <dms_user>;  #Required for primary instance validation - MySQL version 5.7 and higher only
```

If you're using an RDS source and plan to run MySQL-specific premigration assessments, add the following permission:

```
grant select on mysql.rds_configuration to <dms_user>;  #Required for binary log retention check
```

If the `BatchEnable` parameter is set to `true`, you must also grant the following:

```
grant create temporary tables on `<schema>`.* to <dms_user>;
```

## Using a self-managed MySQL-compatible database as a source for AWS DMS
<a name="CHAP_Source.MySQL.CustomerManaged"></a>

You can use the following self-managed MySQL-compatible databases as sources for AWS DMS:
+ MySQL Community Edition
+ MySQL Standard Edition
+ MySQL Enterprise Edition
+ MySQL Cluster Carrier Grade Edition
+ MariaDB Community Edition
+ MariaDB Enterprise Edition
+ MariaDB Column Store

To use CDC, make sure to enable binary logging. To enable binary logging, the following parameters must be configured in MySQL's `my.ini` (Windows) or `my.cnf` (UNIX) file.


| Parameter | Value | 
| --- | --- | 
| `server_id` | Set this parameter to a value of 1 or greater. | 
| `log-bin` | Set the path to the binary log file, such as `log-bin=E:\MySql_Logs\BinLog`. Don't include the file extension. | 
| `binlog_format` | Set this parameter to `ROW`. We recommend this setting during replication because in certain cases when `binlog_format` is set to `STATEMENT`, it can cause inconsistency when replicating data to the target. The database engine also writes similar inconsistent data to the target when `binlog_format` is set to `MIXED`, because the database engine automatically switches to `STATEMENT`-based logging which can result in writing inconsistent data on the target database. | 
| `expire_logs_days` | Set this parameter to a value of 1 or greater. To prevent overuse of disk space, we recommend that you don't use the default value of 0. | 
| `binlog_checksum` | Set this parameter to `NONE` for DMS version 3.4.7 or prior. | 
| `binlog_row_image` | Set this parameter to `FULL`. | 
| `log_slave_updates` | Set this parameter to `TRUE` if you are using a MySQL or MariaDB read-replica as a source. | 
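After changing these parameters, you can confirm the running values from a client session, for example:

```
-- Confirm binary logging settings on a self-managed MySQL source.
SHOW VARIABLES LIKE 'server_id';
SHOW VARIABLES LIKE 'log_bin';          -- expect ON
SHOW VARIABLES LIKE 'binlog_format';    -- expect ROW
SHOW VARIABLES LIKE 'binlog_row_image'; -- expect FULL
```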

If you are using a MySQL or MariaDB read replica as a source for a DMS migration task in **Migrate existing data and replicate ongoing changes** mode, there is a possibility of data loss. DMS doesn't write a transaction during either full load or CDC when both of the following conditions are true:
+ The transaction had been committed to the primary instance before the DMS task started.
+ The transaction hadn't been committed to the replica until after the DMS task started, due to lag between the primary instance and the replica.

The longer the lag between the primary instance and the replica, the greater potential there is for data loss.

If your source uses the NDB (clustered) database engine, the following parameters must be configured to enable CDC on tables that use that storage engine. Add these changes in MySQL's `my.ini` (Windows) or `my.cnf` (UNIX) file.


| Parameter | Value | 
| --- | --- | 
| `ndb_log_bin` | Set this parameter to `ON`. This value ensures that changes in clustered tables are logged to the binary log. | 
| `ndb_log_update_as_write` | Set this parameter to `OFF`. This value prevents writing UPDATE statements as INSERT statements in the binary log. | 
| `ndb_log_updated_only` | Set this parameter to `OFF`. This value ensures that the binary log contains the entire row and not just the changed columns. | 

## Using an AWS-managed MySQL-compatible database as a source for AWS DMS
<a name="CHAP_Source.MySQL.AmazonManaged"></a>

You can use the following AWS-managed MySQL-compatible databases as sources for AWS DMS:
+ MySQL Community Edition
+ MariaDB Community Edition
+ Amazon Aurora MySQL-Compatible Edition

When using an AWS-managed MySQL-compatible database as a source for AWS DMS, make sure that you have the following prerequisites for CDC:
+ To enable binary logs for RDS for MySQL and for RDS for MariaDB, enable automatic backups at the instance level. To enable binary logs for an Aurora MySQL cluster, change the variable `binlog_format` in the parameter group.

  For more information about setting up automatic backups, see [Working with automated backups](https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/USER_WorkingWithAutomatedBackups.html) in the *Amazon RDS User Guide*.

  For more information about setting up binary logging for an Amazon RDS for MySQL database, see [ Setting the binary logging format](https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/USER_LogAccess.MySQL.BinaryFormat.html) in the *Amazon RDS User Guide*. 

  For more information about setting up binary logging for an Aurora MySQL cluster, see [ How do I turn on binary logging for my Amazon Aurora MySQL cluster?](https://aws.amazon.com/premiumsupport/knowledge-center/enable-binary-logging-aurora/). 
+ If you plan to use CDC, turn on binary logging. For more information on setting up binary logging for an Amazon RDS for MySQL database, see [Setting the binary logging format](https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/USER_LogAccess.MySQL.BinaryFormat.html) in the *Amazon RDS User Guide*.
+ Ensure that the binary logs are available to AWS DMS. Because AWS-managed MySQL-compatible databases purge the binary logs as soon as possible, you should increase the length of time that the logs remain available. For example, to increase log retention to 24 hours, run the following command. 

  ```
   call mysql.rds_set_configuration('binlog retention hours', 24);
  ```
+ Set the `binlog_format` parameter to `"ROW"`.
**Note**  
On MySQL or MariaDB, `binlog_format` is a dynamic parameter, so you don't have to reboot for the new value to take effect. However, the new value applies only to new sessions. If you switch `binlog_format` to `ROW` for replication purposes, your database can still write subsequent binary logs in the `MIXED` format for sessions that started before you changed the value. This can prevent AWS DMS from properly capturing all changes on the source database. When you change the `binlog_format` setting on a MySQL or MariaDB database, restart the database to close all existing sessions, or restart any application performing DML (Data Manipulation Language) operations. This ensures that the database writes all subsequent changes using the `ROW` format, so that AWS DMS can capture them.
+ Set the `binlog_row_image` parameter to `"Full"`. 
+ Set the `binlog_checksum` parameter to `"NONE"` for DMS version 3.4.7 or prior. For more information about setting parameters in Amazon RDS MySQL, see [Working with automated backups](https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/USER_WorkingWithAutomatedBackups.html) in the *Amazon RDS User Guide*.
+ If you are using an Amazon RDS MySQL or Amazon RDS MariaDB read replica as a source, enable backups on the read replica, and ensure the `log_slave_updates` parameter is set to `TRUE`.
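To confirm that the binary log retention setting above took effect on Amazon RDS, you can query the RDS configuration:

```
-- Shows the current 'binlog retention hours' value on Amazon RDS for MySQL.
call mysql.rds_show_configuration;
```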

## Limitations on using a MySQL database as a source for AWS DMS
<a name="CHAP_Source.MySQL.Limitations"></a>

When using a MySQL database as a source, consider the following:
+  Change data capture (CDC) isn't supported for Amazon RDS MySQL 5.5 or lower. For Amazon RDS MySQL, you must use version 5.6, 5.7, or 8.0 to enable CDC. CDC is supported for self-managed MySQL 5.5 sources. 
+ For CDC, `CREATE TABLE`, `ADD COLUMN`, `DROP COLUMN`, changing the column data type, and renaming a column are supported. However, `DROP TABLE`, `RENAME TABLE`, and updates made to other attributes, such as column default value, column nullability, character set, and so on, aren't supported.
+  For partitioned tables on the source, when you set **Target table preparation mode** to **Drop tables on target**, AWS DMS creates a simple table without any partitions on the MySQL target. To migrate partitioned tables to a partitioned table on the target, precreate the partitioned tables on the target MySQL database.
+  Using an `ALTER TABLE table_name ADD COLUMN column_name` statement to add columns to the beginning (FIRST) or the middle of a table (AFTER) isn't supported for relational targets. Columns are always added to the end of the table. When the target is Amazon S3 or Amazon Kinesis Data Streams, adding columns using FIRST or AFTER is supported.
+ CDC isn't supported when a table name contains uppercase and lowercase characters, and the source engine is hosted on an operating system with case-insensitive file names. An example is Microsoft Windows or OS X using HFS+.
+ You can use Aurora MySQL-Compatible Edition Serverless v1 for full load, but you can't use it for CDC. This is because you can't enable the prerequisites for MySQL. For more information, see [ Parameter groups and Aurora Serverless v1](https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/aurora-serverless.how-it-works.html#aurora-serverless.parameter-groups). 

  Aurora MySQL-Compatible Edition Serverless v2 supports CDC.
+  The AUTO_INCREMENT attribute on a column isn't migrated to a target database column.
+  Capturing changes when the binary logs aren't stored on standard block storage isn't supported. For example, CDC does not work when the binary logs are stored on Amazon S3.
+  AWS DMS creates target tables with the InnoDB storage engine by default. If you need to use a storage engine other than InnoDB, you must manually create the table and migrate to it using [do nothing](https://docs.aws.amazon.com/dms/latest/userguide/CHAP_GettingStarted.html) mode.
+ You can't use Aurora MySQL replicas as a source for AWS DMS unless your DMS migration task mode is **Migrate existing data**—full load only.
+  If the MySQL-compatible source is stopped during full load, the AWS DMS task does not stop with an error. The task ends successfully, but the target might be out of sync with the source. If this happens, either restart the task or reload the affected tables.
+  Indexes created on a portion of a column value aren't migrated. For example, the index `CREATE INDEX first_ten_chars ON customer (name(10))` isn't created on the target.
+ In some cases, the task is configured to not replicate LOBs ("SupportLobs" is false in task settings or **Don't include LOB columns** is chosen in the task console). In these cases, AWS DMS does not migrate any MEDIUMBLOB, LONGBLOB, MEDIUMTEXT, and LONGTEXT columns to the target.

  BLOB, TINYBLOB, TEXT, and TINYTEXT columns aren't affected and are migrated to the target.
+ Temporal data tables or system-versioned tables aren't supported on MariaDB source and target databases.
+ If migrating between two Amazon RDS Aurora MySQL clusters, the RDS Aurora MySQL source endpoint must be a read/write instance, not a replica instance. 
+ AWS DMS currently does not support views migration for MariaDB.
+ AWS DMS does not support DDL changes for partitioned tables for MySQL. To skip table suspension for partition DDL changes during CDC, set `skipTableSuspensionForPartitionDdl` to `true`.
+ AWS DMS only supports XA transactions in version 3.5.0 and higher. Previous versions do not support XA transactions. AWS DMS does not support XA transactions in MariaDB version 10.6 or higher. For more information, see [Support for XA transactions](#CHAP_Source.MySQL.XA) following.
+ AWS DMS does not use GTIDs for replication, even if the source data contains them. 
+ AWS DMS does not support Aurora MySQL enhanced binary log.
+ AWS DMS does not support binary log transaction compression.
+ AWS DMS does not propagate ON DELETE CASCADE and ON UPDATE CASCADE events for MySQL databases using the InnoDB storage engine. For these events, MySQL does not generate binlog events to reflect the cascaded operations on the child tables. Consequently, AWS DMS can't replicate the corresponding changes to the child tables. For more information, see [Indexes, Foreign Keys, or Cascade Updates or Deletes Not Migrated](CHAP_Troubleshooting.md#CHAP_Troubleshooting.MySQL.FKsAndIndexes).
+ AWS DMS does not capture changes to computed (`VIRTUAL` and `GENERATED ALWAYS`) columns. To work around this limitation, do the following:
  + Pre-create the target table in the target database, and create the AWS DMS task with the `DO_NOTHING` or `TRUNCATE_BEFORE_LOAD` full-load task setting.
  + Add a transformation rule to remove the computed column from the task scope. For information about transformation rules, see [Transformation rules and actions](CHAP_Tasks.CustomizingTasks.TableMapping.SelectionTransformation.Transformations.md).
+ Due to an internal MySQL limitation, AWS DMS can process binlogs no larger than 4 GB in size. Binlogs larger than 4 GB can result in DMS task failures or other unpredictable behavior. Reduce the size of your transactions to avoid binlogs larger than 4 GB.
+ AWS DMS does not support backticks (`` ` ``) or single quotes (`'`) in schema, table, and column names.
+ AWS DMS does not migrate data from invisible columns in your source database. To include these columns in your migration scope, use the ALTER TABLE statement to make these columns visible.
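The computed-columns workaround above relies on a transformation rule that removes the column from the task scope. The following is a minimal sketch of such a rule; the schema, table, and column names are hypothetical, and a complete table mapping also needs a selection rule for the same table.

```
{
  "rules": [
    {
      "rule-type": "transformation",
      "rule-id": "1",
      "rule-name": "drop-computed-column",
      "rule-target": "column",
      "object-locator": {
        "schema-name": "mydb",
        "table-name": "orders",
        "column-name": "total_generated"
      },
      "rule-action": "remove-column"
    }
  ]
}
```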

## Support for XA transactions
<a name="CHAP_Source.MySQL.XA"></a>

An Extended Architecture (XA) transaction is a transaction that groups a series of operations from multiple transactional resources into a single, reliable global transaction. An XA transaction uses a two-phase commit protocol. In general, capturing changes while there are open XA transactions might lead to loss of data. If your database does not use XA transactions, you can skip the permission following and keep the `IgnoreOpenXaTransactionsCheck` setting at its default value of `TRUE`. To start replicating from a source that has XA transactions, do the following:
+ Ensure that the AWS DMS endpoint user has the following permission:

  ```
  grant XA_RECOVER_ADMIN on *.* to 'userName'@'%';
  ```
+ Set the endpoint setting `IgnoreOpenXaTransactionsCheck` to `false`.
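As a sketch, you might set `IgnoreOpenXaTransactionsCheck` on an existing endpoint with the AWS CLI as follows. The endpoint ARN here is a placeholder.

```
aws dms modify-endpoint \
    --endpoint-arn arn:aws:dms:us-east-1:111122223333:endpoint:EXAMPLE \
    --my-sql-settings '{"IgnoreOpenXaTransactionsCheck": false}'
```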

**Note**  
AWS DMS doesn’t support XA transactions on MariaDB source DB version 10.6 or higher.

## Endpoint settings when using MySQL as a source for AWS DMS
<a name="CHAP_Source.MySQL.ConnectionAttrib"></a>

You can use endpoint settings to configure your MySQL source database similar to using extra connection attributes. You specify the settings when you create the source endpoint using the AWS DMS console, or by using the `create-endpoint` command in the [AWS CLI](https://docs.aws.amazon.com/cli/latest/reference/dms/index.html), with the `--my-sql-settings '{"EndpointSetting": "value", ...}'` JSON syntax.
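For example, a `create-endpoint` call with MySQL endpoint settings might look like the following sketch. The identifier, server name, and credentials are placeholders, and the settings shown (`EventsPollInterval`, `ServerTimezone`) are illustrative choices from the settings table.

```
aws dms create-endpoint \
    --endpoint-identifier mysql-source-example \
    --endpoint-type source \
    --engine-name mysql \
    --server-name mysql.example.com \
    --port 3306 \
    --username admin \
    --password "examplePassword" \
    --my-sql-settings '{"EventsPollInterval": 5, "ServerTimezone": "UTC"}'
```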

The following table shows the endpoint settings that you can use with MySQL as a source.

[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/dms/latest/userguide/CHAP_Source.MySQL.html)

## Source data types for MySQL
<a name="CHAP_Source.MySQL.DataTypes"></a>

The following table shows the MySQL database source data types that are supported when using AWS DMS and the default mapping from AWS DMS data types.

For information on how to view the data type that is mapped in the target, see the section for the target endpoint you are using.

For additional information about AWS DMS data types, see [Data types for AWS Database Migration Service](CHAP_Reference.DataTypes.md).


|  MySQL data types  |  AWS DMS data types  | 
| --- | --- | 
|  INT  |  INT4  | 
|  BIGINT  |  INT8  | 
|  MEDIUMINT  |  INT4  | 
|  TINYINT  |  INT1  | 
|  SMALLINT  |  INT2  | 
|  UNSIGNED TINYINT  |  UINT1  | 
|  UNSIGNED SMALLINT  |  UINT2  | 
|  UNSIGNED MEDIUMINT  |  UINT4  | 
|  UNSIGNED INT  |  UINT4  | 
|  UNSIGNED BIGINT  |  UINT8  | 
|  DECIMAL(10)  |  NUMERIC (10,0)  | 
|  BINARY  |  BYTES(1)  | 
|  BIT  |  BOOLEAN  | 
|  BIT(64)  |  BYTES(8)  | 
|  BLOB  |  BYTES(65535)  | 
|  LONGBLOB  |  BLOB  | 
|  MEDIUMBLOB  |  BLOB  | 
|  TINYBLOB  |  BYTES(255)  | 
|  DATE  |  DATE  | 
|  DATETIME  |  DATETIME DATETIME without a parenthetical value is replicated without milliseconds. DATETIME with a parenthetical value of 1 to 5 (such as `DATETIME(5)`) is replicated with milliseconds. When replicating a DATETIME column, the time remains the same on the target. It is not converted to UTC.  | 
|  TIME  |  STRING  | 
|  TIMESTAMP  |  DATETIME When replicating a TIMESTAMP column, the time is converted to UTC on the target.  | 
|  YEAR  |  INT2  | 
|  DOUBLE  |  REAL8  | 
|  FLOAT  |  REAL(DOUBLE) If the FLOAT values are not in the range following, use a transformation to map FLOAT to STRING. For more information about transformations, see [Transformation rules and actions](CHAP_Tasks.CustomizingTasks.TableMapping.SelectionTransformation.Transformations.md). The supported FLOAT range is -1.79E+308 to -2.23E-308, 0, and 2.23E-308 to 1.79E+308  | 
|  VARCHAR (45)  |  WSTRING (45)  | 
|  VARCHAR (2000)  |  WSTRING (2000)  | 
|  VARCHAR (4000)  |  WSTRING (4000)  | 
|  VARBINARY (4000)  |  BYTES (4000)  | 
|  VARBINARY (2000)  |  BYTES (2000)  | 
|  CHAR  |  WSTRING  | 
|  TEXT  |  WSTRING  | 
|  LONGTEXT  |  NCLOB  | 
|  MEDIUMTEXT  |  NCLOB  | 
|  TINYTEXT  |  WSTRING(255)  | 
|  GEOMETRY  |  BLOB  | 
|  POINT  |  BLOB  | 
|  LINESTRING  |  BLOB  | 
|  POLYGON  |  BLOB  | 
|  MULTIPOINT  |  BLOB  | 
|  MULTILINESTRING  |  BLOB  | 
|  MULTIPOLYGON  |  BLOB  | 
|  GEOMETRYCOLLECTION  |  BLOB  | 
|  ENUM  |  WSTRING (*length*) Here, *length* is the length of the longest value in the ENUM.  | 
|  SET  |  WSTRING (*length*) Here, *length* is the total length of all values in the SET, including commas.  | 
|  JSON  |  CLOB  | 

**Note**  
In some cases, you might specify the DATETIME and TIMESTAMP data types with a "zero" value (that is, 0000-00-00). If so, make sure that the target database in the replication task supports "zero" values for the DATETIME and TIMESTAMP data types. Otherwise, these values are recorded as null on the target.
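The FLOAT row in the preceding table suggests mapping out-of-range FLOAT values to STRING with a transformation. A sketch of such a rule follows; the schema, table, and column names are hypothetical, and the string length is an arbitrary example.

```
{
  "rule-type": "transformation",
  "rule-id": "1",
  "rule-name": "float-to-string",
  "rule-target": "column",
  "object-locator": {
    "schema-name": "mydb",
    "table-name": "measurements",
    "column-name": "reading"
  },
  "rule-action": "change-data-type",
  "data-type": {
    "type": "string",
    "length": 50
  }
}
```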

# Using an SAP ASE database as a source for AWS DMS
<a name="CHAP_Source.SAP"></a>

You can migrate data from an SAP Adaptive Server Enterprise (ASE) database (formerly known as Sybase) using AWS DMS. With an SAP ASE database as a source, you can migrate data to any of the other supported AWS DMS target databases. 

For information about versions of SAP ASE that AWS DMS supports as a source, see [Sources for AWS DMS](CHAP_Introduction.Sources.md).

For additional details on working with SAP ASE databases and AWS DMS, see the following sections.

**Topics**
+ [

## Prerequisites for using an SAP ASE database as a source for AWS DMS
](#CHAP_Source.SAP.Prerequisites)
+ [

## Limitations on using SAP ASE as a source for AWS DMS
](#CHAP_Source.SAP.Limitations)
+ [

## Permissions required for using SAP ASE as a source for AWS DMS
](#CHAP_Source.SAP.Security)
+ [

## Removing the truncation point
](#CHAP_Source.SAP.Truncation)
+ [

## Endpoint settings when using SAP ASE as a source for AWS DMS
](#CHAP_Source.SAP.ConnectionAttrib)
+ [

## Source data types for SAP ASE
](#CHAP_Source.SAP.DataTypes)

## Prerequisites for using an SAP ASE database as a source for AWS DMS
<a name="CHAP_Source.SAP.Prerequisites"></a>

For an SAP ASE database to be a source for AWS DMS, do the following:
+ Enable SAP ASE replication for tables by using the `sp_setreptable` command. For more information, see [Sybase Infocenter Archive]( http://infocenter.sybase.com/help/index.jsp?topic=/com.sybase.dc32410_1501/html/refman/X37830.htm). 
+ Disable `RepAgent` on the SAP ASE database. For more information, see [Stop and disable the RepAgent thread in the primary database](http://infocenter-archive.sybase.com/help/index.jsp?topic=/com.sybase.dc20096_1260/html/mra126ag/mra126ag65.htm). 
+ To replicate to SAP ASE version 15.7 on a Windows EC2 instance configured for non-Latin characters (for example, Chinese), install SAP ASE 15.7 SP121 on the target computer.

**Note**  
For ongoing change data capture (CDC) replication, DMS runs `dbcc logtransfer` and `dbcc log` to read data from the transaction log.

## Limitations on using SAP ASE as a source for AWS DMS
<a name="CHAP_Source.SAP.Limitations"></a>

The following limitations apply when using an SAP ASE database as a source for AWS DMS:
+ You can run only one AWS DMS task with ongoing replication or CDC for each SAP ASE database. You can run multiple full-load-only tasks in parallel.
+ You can't rename a table. For example, the following command fails.

  ```
sp_rename 'Sales.SalesRegion', 'SalesReg';
  ```
+ You can't rename a column. For example, the following command fails.

  ```
  sp_rename 'Sales.Sales.Region', 'RegID', 'COLUMN';
  ```
+ Zero values located at the end of binary data type strings are truncated when replicated to the target database. For example, `0x0000000000000000000000000100000100000000` in the source table becomes `0x00000000000000000000000001000001` in the target table.
+ If the database default is set not to allow NULL values, AWS DMS creates the target table with columns that don't allow NULL values. Consequently, if a full load or CDC replication task contains empty values, AWS DMS throws an error. You can prevent these errors by allowing NULL values in the source database by using the following commands.

  ```
  sp_dboption database_name, 'allow nulls by default', 'true'
  go
  use database_name
  CHECKPOINT
  go
  ```
+ The `reorg rebuild` index command isn't supported.
+ AWS DMS does not support clusters or using MSA (Multi-Site Availability)/Warm Standby as a source.
+ When the `AR_H_TIMESTAMP` transformation header expression is used in mapping rules, milliseconds aren't captured for the added column.
+ Running Merge operations during CDC will result in a non-recoverable error. To bring the target back in sync, run a full load.
+ Rollback trigger events are not supported for tables that use a data row locking scheme.
+ AWS DMS can't resume a replication task after a table within the task scope is dropped from a source SAP database. If you stopped the DMS replication task, performed any DML operation (INSERT, UPDATE, DELETE), and then dropped the table, you must restart the replication task.

## Permissions required for using SAP ASE as a source for AWS DMS
<a name="CHAP_Source.SAP.Security"></a>

To use an SAP ASE database as a source in an AWS DMS task, you need to grant permissions. Grant the user account specified in the AWS DMS database definitions the following permissions in the SAP ASE database: 
+ sa_role
+ replication_role
+ sybase_ts_role
+ By default, AWS DMS enables the SAP ASE replication option by running the `sp_setreptable` stored procedure, which is why the user needs permission to run it. If you want to run `sp_setreptable` on a table directly from the database endpoint and not through AWS DMS itself, you can use the `enableReplication` extra connection attribute. For more information, see [Endpoint settings when using SAP ASE as a source for AWS DMS](#CHAP_Source.SAP.ConnectionAttrib).

## Removing the truncation point
<a name="CHAP_Source.SAP.Truncation"></a>

When a task starts, AWS DMS establishes a `$replication_truncation_point` entry in the `syslogshold` system view, indicating that a replication process is in progress. While AWS DMS is working, it advances the replication truncation point at regular intervals, according to the amount of data that has already been copied to the target.

After the `$replication_truncation_point` entry is established, keep the AWS DMS task running to prevent the database log from becoming excessively large. If you want to stop the AWS DMS task permanently, remove the replication truncation point by issuing the following command:

```
dbcc settrunc('ltm','ignore')
```

After the truncation point is removed, you can't resume the AWS DMS task. The log continues to be truncated automatically at the checkpoints (if automatic truncation is set).

## Endpoint settings when using SAP ASE as a source for AWS DMS
<a name="CHAP_Source.SAP.ConnectionAttrib"></a>

You can use endpoint settings to configure your SAP ASE source database similar to using extra connection attributes. You specify the settings when you create the source endpoint using the AWS DMS console, or by using the `create-endpoint` command in the [AWS CLI](https://docs.aws.amazon.com/cli/latest/reference/dms/index.html), with the `--sybase-settings '{"EndpointSetting": "value", ...}'` JSON syntax.

The following table shows the endpoint settings that you can use with SAP ASE as a source.

[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/dms/latest/userguide/CHAP_Source.SAP.html)

## Source data types for SAP ASE
<a name="CHAP_Source.SAP.DataTypes"></a>

For a list of the SAP ASE source data types that are supported when using AWS DMS and the default mapping from AWS DMS data types, see the following table. AWS DMS doesn't support SAP ASE source tables with columns of the user-defined type (UDT) data type. Replicated columns with this data type are created as NULL. 

For information on how to view the data type that is mapped in the target, see the [Targets for data migration](CHAP_Target.md) section for your target endpoint.

For additional information about AWS DMS data types, see [Data types for AWS Database Migration Service](CHAP_Reference.DataTypes.md).


|  SAP ASE data types  |  AWS DMS data types  | 
| --- | --- | 
| BIGINT | INT8 | 
| UNSIGNED BIGINT | UINT8 | 
| INT | INT4 | 
| UNSIGNED INT | UINT4 | 
| SMALLINT | INT2 | 
| UNSIGNED SMALLINT | UINT2 | 
| TINYINT | UINT1 | 
| DECIMAL | NUMERIC | 
| NUMERIC | NUMERIC | 
| FLOAT | REAL8 | 
| DOUBLE | REAL8 | 
| REAL | REAL4 | 
| MONEY | NUMERIC | 
| SMALLMONEY | NUMERIC | 
| DATETIME | DATETIME | 
| BIGDATETIME | DATETIME(6) | 
| SMALLDATETIME | DATETIME | 
| DATE | DATE | 
| TIME | TIME | 
| BIGTIME | TIME | 
| CHAR | STRING | 
| UNICHAR | WSTRING | 
| NCHAR | WSTRING | 
| VARCHAR | STRING | 
| UNIVARCHAR | WSTRING | 
| NVARCHAR | WSTRING | 
| BINARY | BYTES | 
| VARBINARY | BYTES | 
| BIT | BOOLEAN | 
| TEXT | CLOB | 
| UNITEXT | NCLOB | 
| IMAGE | BLOB | 

# Using MongoDB as a source for AWS DMS
<a name="CHAP_Source.MongoDB"></a>

 For information about versions of MongoDB that AWS DMS supports as a source, see [Sources for AWS DMS](CHAP_Introduction.Sources.md). 

Note the following about MongoDB version support:
+ Versions of AWS DMS 3.4.5 and later support MongoDB versions 4.2 and 4.4. 
+ Versions of AWS DMS 3.4.5 and later and versions of MongoDB 4.2 and later support distributed transactions. For more information on MongoDB distributed transactions, see [Transactions](https://docs.mongodb.com/manual/core/transactions/) in the [MongoDB documentation](https://www.mongodb.com/docs/).
+ Versions of AWS DMS 3.5.0 and later don't support versions of MongoDB prior to 3.6.
+ Versions of AWS DMS 3.5.1 and later support MongoDB version 5.0.
+ Versions of AWS DMS 3.5.2 and later support MongoDB version 6.0.
+ Versions of AWS DMS 3.5.4 and later support MongoDB version 7.0 and 8.0.



If you are new to MongoDB, be aware of the following important MongoDB database concepts: 
+ A record in MongoDB is a *document*, which is a data structure composed of field and value pairs. The value of a field can include other documents, arrays, and arrays of documents. A document is roughly equivalent to a row in a relational database table.
+ A *collection* in MongoDB is a group of documents, and is roughly equivalent to a relational database table.
+ A *database* in MongoDB is a set of collections, and is roughly equivalent to a schema in a relational database.
+ Internally, a MongoDB document is stored as a binary JSON (BSON) file in a compressed format that includes a type for each field in the document. Each document has a unique ID.

AWS DMS supports two migration modes when using MongoDB as a source: *Document mode* and *Table mode*. You specify which migration mode to use when you create the MongoDB endpoint, or by setting the **Metadata mode** parameter from the AWS DMS console. Optionally, you can create a second column named `_id` that acts as the primary key by selecting the check mark button for **_id as a separate column** in the endpoint configuration panel. 

Your choice of migration mode affects the resulting format of the target data, as explained following. 

**Document mode**  
In document mode, the MongoDB document is migrated as is, meaning that the document data is consolidated into a single column named `_doc` in a target table. Document mode is the default setting when you use MongoDB as a source endpoint.  
For example, consider the following documents in a MongoDB collection called myCollection.  

```
 db.myCollection.find()
{ "_id" : ObjectId("5a94815f40bd44d1b02bdfe0"), "a" : 1, "b" : 2, "c" : 3 }
{ "_id" : ObjectId("5a94815f40bd44d1b02bdfe1"), "a" : 4, "b" : 5, "c" : 6 }
```
After migrating the data to a relational database table using document mode, the data is structured as follows. The data fields in the MongoDB document are consolidated into the `_doc` column.  
[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/dms/latest/userguide/CHAP_Source.MongoDB.html)
You can optionally set the extra connection attribute `extractDocID` to *true* to create a second column named `"_id"` that acts as the primary key. If you are going to use CDC, set this parameter to *true*.  
When using CDC with sources that produce [multi-document transactions](https://www.mongodb.com/docs/manual/reference/method/Session.startTransaction/#mongodb-method-Session.startTransaction), the `ExtractDocId` parameter **must be** set to *true*. If this parameter is not enabled, the AWS DMS task will fail when it encounters a multi-document transaction.  
In document mode, AWS DMS manages the creation and renaming of collections like this:  
+ If you add a new collection to the source database, AWS DMS creates a new target table for the collection and replicates any documents. 
+ If you rename an existing collection on the source database, AWS DMS doesn't rename the target table. 
If the target endpoint is Amazon DocumentDB, run the migration in **Document mode**.

**Table mode**  
In table mode, AWS DMS transforms each top-level field in a MongoDB document into a column in the target table. If a field is nested, AWS DMS flattens the nested values into a single column. AWS DMS then adds a key field and data types to the target table's column set.   
For each MongoDB document, AWS DMS adds each key and type to the target table's column set. For example, using table mode, AWS DMS migrates the previous example into the following table.      
[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/dms/latest/userguide/CHAP_Source.MongoDB.html)
Nested values are flattened into a column containing dot-separated key names. The column is named as the concatenation of the flattened field names separated by periods. For example, AWS DMS migrates a JSON document with a field of nested values such as `{"a" : {"b" : {"c": 1}}}` into a column named `a.b.c`.  
To create the target columns, AWS DMS scans a specified number of MongoDB documents and creates a set of all the fields and their types. AWS DMS then uses this set to create the columns of the target table. If you create or modify your MongoDB source endpoint using the console, you can specify the number of documents to scan. The default value is 1000 documents. If you use the AWS CLI, you can use the extra connection attribute `docsToInvestigate`.  
In table mode, AWS DMS manages documents and collections like this:  
+ When you add a document to an existing collection, the document is replicated. If there are fields that don't exist in the target, those fields aren't replicated.
+ When you update a document, the updated document is replicated. If there are fields that don't exist in the target, those fields aren't replicated.
+ Deleting a document is fully supported.
+ Adding a new collection doesn't result in a new table on the target when done during a CDC task.
+ In the change data capture (CDC) phase, AWS DMS doesn't support renaming a collection.
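The dot-separated flattening that table mode applies to nested fields can be sketched in Python. This is an illustration of the naming scheme only, not AWS DMS's actual implementation.

```python
def flatten(doc, prefix=""):
    """Flatten nested document fields into dot-separated column names."""
    columns = {}
    for key, value in doc.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            # Nested documents become columns like "a.b.c".
            columns.update(flatten(value, prefix=name + "."))
        else:
            columns[name] = value
    return columns

# The example document from the text:
print(flatten({"a": {"b": {"c": 1}}}))   # {'a.b.c': 1}
```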

**Topics**
+ [

## Permissions needed when using MongoDB as a source for AWS DMS
](#CHAP_Source.MongoDB.PrerequisitesCDC)
+ [

## Configuring a MongoDB replica set for CDC
](#CHAP_Source.MongoDB.PrerequisitesCDC.ReplicaSet)
+ [

## Security requirements when using MongoDB as a source for AWS DMS
](#CHAP_Source.MongoDB.Security)
+ [

## Segmenting MongoDB collections and migrating in parallel
](#CHAP_Source.MongoDB.ParallelLoad)
+ [

## Migrating multiple databases when using MongoDB as a source for AWS DMS
](#CHAP_Source.MongoDB.Multidatabase)
+ [

## Limitations when using MongoDB as a source for AWS DMS
](#CHAP_Source.MongoDB.Limitations)
+ [

## Endpoint configuration settings when using MongoDB as a source for AWS DMS
](#CHAP_Source.MongoDB.Configuration)
+ [

## Source data types for MongoDB
](#CHAP_Source.MongoDB.DataTypes)

## Permissions needed when using MongoDB as a source for AWS DMS
<a name="CHAP_Source.MongoDB.PrerequisitesCDC"></a>

For an AWS DMS migration with a MongoDB source, you can create either a user account with root privileges, or a user with permissions only on the database to migrate. 

The following code creates a user to be the root account.

```
use admin
db.createUser(
  {
    user: "root",
    pwd: "password",
    roles: [ { role: "root", db: "admin" } ]
  }
)
```

For a MongoDB 3.x source, the following code creates a user with minimal privileges on the database to be migrated.

```
use database_to_migrate
db.createUser( 
{ 
    user: "dms-user",
    pwd: "password",
    roles: [ { role: "read", db: "local" }, "read"] 
})
```

For a MongoDB 4.x source, the following code creates a user with minimal privileges.

```
{ resource: { db: "", collection: "" }, actions: [ "find", "changeStream" ] }
```

For example, create the following role in the "admin" database.

```
use admin
db.createRole(
{
role: "changestreamrole",
privileges: [
{ resource: { db: "", collection: "" }, actions: [ "find","changeStream" ] }
],
roles: []
}
)
```

And once the role is created, create a user in the database to be migrated.

```
use test
db.createUser( 
{ 
user: "dms-user12345",
pwd: "password",
roles: [ { role: "changestreamrole", db: "admin" }, "read"] 
})
```

## Configuring a MongoDB replica set for CDC
<a name="CHAP_Source.MongoDB.PrerequisitesCDC.ReplicaSet"></a>

To use ongoing replication or CDC with MongoDB, AWS DMS requires access to the MongoDB operations log (oplog). To create the oplog, you need to deploy a replica set if one doesn't exist. For more information, see [ the MongoDB documentation](https://docs.mongodb.com/manual/tutorial/deploy-replica-set/).

You can use CDC with either the primary or secondary node of a MongoDB replica set as the source endpoint.

**To convert a standalone instance to a replica set**

1. Using the command line, connect to your MongoDB instance with the `mongo` shell.

   ```
   mongo localhost
   ```

1. Stop the `mongod` service.

   ```
   service mongod stop
   ```

1. Restart `mongod` using the following command:

   ```
mongod --replSet "rs0" --auth --port port_number
   ```

1. Test the connection to the replica set using the following commands:

   ```
   mongo -u root -p password --host rs0/localhost:port_number 
     --authenticationDatabase "admin"
   ```

If you plan to perform a document mode migration, select option `_id as a separate column` when you create the MongoDB endpoint. Selecting this option creates a second column named `_id` that acts as the primary key. This second column is required by AWS DMS to support data manipulation language (DML) operations.

**Note**  
AWS DMS uses the operations log (oplog) to capture changes during ongoing replication. If MongoDB flushes out the records from the oplog before AWS DMS reads them, your tasks fail. We recommend sizing the oplog to retain changes for at least 24 hours. 

## Security requirements when using MongoDB as a source for AWS DMS
<a name="CHAP_Source.MongoDB.Security"></a>

AWS DMS supports two authentication methods for MongoDB. These methods are used to encrypt the password, so they apply only when the `authType` parameter is set to *PASSWORD*.

The MongoDB authentication methods are as follows:
+ **MONGODB-CR** – For backward compatibility
+ **SCRAM-SHA-1** – The default when using MongoDB version 3.x and 4.0

If an authentication method isn't specified, AWS DMS uses the default method for the version of the MongoDB source.

## Segmenting MongoDB collections and migrating in parallel
<a name="CHAP_Source.MongoDB.ParallelLoad"></a>

To improve performance of a migration task, MongoDB source endpoints support two options for parallel full load in table mapping. 

In other words, you can migrate a collection in parallel by using either autosegmentation or range segmentation with table mapping for a parallel full load in JSON settings. With autosegmentation, you can specify the criteria for AWS DMS to automatically segment your source for migration in each thread. With range segmentation, you can tell AWS DMS the specific range of each segment for DMS to migrate in each thread. For more information on these settings, see [Table and collection settings rules and operations](CHAP_Tasks.CustomizingTasks.TableMapping.SelectionTransformation.Tablesettings.md).

### Migrating a MongoDB database in parallel using autosegmentation ranges
<a name="CHAP_Source.MongoDB.ParallelLoad.AutoPartitioned"></a>

You can migrate your documents in parallel by specifying the criteria for AWS DMS to automatically partition (segment) your data for each thread. In particular, you specify the number of documents to migrate per thread. Using this approach, AWS DMS attempts to optimize segment boundaries for maximum performance per thread.

You can specify the segmentation criteria using the table-settings options following in table mapping.


|  Table-settings option  |  Description  | 
| --- | --- | 
|  `"type"`  |  (Required) Set to `"partitions-auto"` for MongoDB as a source.  | 
|  `"number-of-partitions"`  |  (Optional) Total number of partitions (segments) used for migration. The default is 16.  | 
|  `"collection-count-from-metadata"`  |  (Optional) If this option is set to `true`, AWS DMS uses an estimated collection count for determining the number of partitions. If this option is set to `false`, AWS DMS uses the actual collection count. The default is `true`.  | 
|  `"max-records-skip-per-page"`  |  (Optional) The number of records to skip at once when determining the boundaries for each partition. AWS DMS uses a paginated skip approach to determine the minimum boundary for a partition. The default is 10,000.  Setting a relatively large value can result in cursor timeouts and task failures. Setting a relatively low value results in more operations per page and a slower full load.   | 
|  `"batch-size"`  |  (Optional) Limits the number of documents returned in one batch. Each batch requires a round trip to the server. If the batch size is zero (0), the cursor uses the server-defined maximum batch size. The default is 0.  | 

The example following shows a table mapping for autosegmentation.

```
{
    "rules": [
        {
            "rule-type": "selection",
            "rule-id": "1",
            "rule-name": "1",
            "object-locator": {
                "schema-name": "admin",
                "table-name": "departments"
            },
            "rule-action": "include",
            "filters": []
        },
        {
            "rule-type": "table-settings",
            "rule-id": "2",
            "rule-name": "2",
            "object-locator": {
                "schema-name": "admin",
                "table-name": "departments"
            },
            "parallel-load": {
                "type": "partitions-auto",
                "number-of-partitions": 5,
                "collection-count-from-metadata": "true",
                "max-records-skip-per-page": 1000000,
                "batch-size": 50000
            }
        }
    ]
}
```

Autosegmentation has the limitation following. The migration for each segment fetches the collection count and the minimum `_id` for the collection separately. It then uses a paginated skip to calculate the minimum boundary for that segment. 

Therefore, ensure that the minimum `_id` value for each collection remains constant until all the segment boundaries in the collection are calculated. If you change the minimum `_id` value for a collection during its segment boundary calculation, it can cause data loss or duplicate row errors.
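The paginated-skip idea can be illustrated with a short Python sketch: the cursor advances toward each segment's starting offset in pages of at most `page_size` records, and the `_id` at that offset becomes the segment's minimum boundary. This is a simplified model for illustration, not DMS's actual algorithm.

```python
def segment_boundaries(sorted_ids, number_of_partitions, page_size):
    """Compute the minimum _id boundary of each segment by advancing
    a cursor in pages of at most page_size skips (paginated skip)."""
    count = len(sorted_ids)
    per_segment = count // number_of_partitions
    boundaries = []
    for seg in range(1, number_of_partitions):
        target = seg * per_segment      # offset where this segment starts
        pos = 0
        while pos < target:             # advance in pages, not one big skip
            pos += min(page_size, target - pos)
        boundaries.append(sorted_ids[pos])
    return boundaries

ids = [f"id{i:03d}" for i in range(100)]
print(segment_boundaries(ids, 4, 10))   # ['id025', 'id050', 'id075']
```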

### Migrating a MongoDB database in parallel using range segmentation
<a name="CHAP_Source.MongoDB.ParallelLoad.Ranges"></a>

You can migrate your documents in parallel by specifying the ranges for each segment in a thread. Using this approach, you tell AWS DMS the specific documents to migrate in each thread according to your choice of document ranges per thread.

The image following shows a MongoDB collection that has seven items, and `_id` as the primary key.

![\[MongoDB collection with seven items.\]](http://docs.aws.amazon.com/dms/latest/userguide/images/datarep-docdb-collection.png)


To split the collection into three specific segments for AWS DMS to migrate in parallel, you can add table mapping rules to your migration task. This approach is shown in the following JSON example.

```
{ // Task table mappings:
  "rules": [
    {
      "rule-type": "selection",
      "rule-id": "1",
      "rule-name": "1",
      "object-locator": {
        "schema-name": "testdatabase",
        "table-name": "testtable"
      },
      "rule-action": "include"
    }, // "selection" :"rule-type"
    {
      "rule-type": "table-settings",
      "rule-id": "2",
      "rule-name": "2",
      "object-locator": {
        "schema-name": "testdatabase",
        "table-name": "testtable"
      },
      "parallel-load": {
        "type": "ranges",
        "columns": [
           "_id",
           "num"
        ],
        "boundaries": [
          // First segment selects documents with _id <= 5f805c97873173399a278d79
          // and num <= 2.
          [
             "5f805c97873173399a278d79",
             "2"
          ],
          // Second segment selects documents with _id > 5f805c97873173399a278d79 and
          // _id <= 5f805cc5873173399a278d7c and
          // num > 2 and num <= 5.
          [
             "5f805cc5873173399a278d7c",
             "5"
          ]                                   
          // Third segment is implied and selects documents with _id > 5f805cc5873173399a278d7c.
        ] // :"boundaries"
      } // :"parallel-load"
    } // "table-settings" :"rule-type"
  ] // :"rules"
} // :Task table mappings
```

That table mapping definition splits the source collection into three segments and migrates in parallel. The following are the segmentation boundaries.

```
Data with _id <= "5f805c97873173399a278d79" and num <= 2 (2 records)
Data with _id > "5f805c97873173399a278d79" and _id <= "5f805cc5873173399a278d7c" and num > 2 and num <= 5 (3 records)
Data with _id > "5f805cc5873173399a278d7c" and num > 5 (2 records)
```
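
The boundary logic above can be illustrated with a short Python sketch (not DMS code). The two boundary `_id` values come from the example; the seven documents and their `num` values are made up to match the record counts. ObjectId hex strings of equal length compare correctly as plain strings.

```python
def segment_for(doc, boundaries):
    """Return the 0-based segment index for a document under "ranges" rules."""
    for i, (id_bound, num_bound) in enumerate(boundaries):
        # A document belongs to the first segment whose boundary row it
        # satisfies on every listed column (here: _id, then num).
        if doc["_id"] <= id_bound and doc["num"] <= num_bound:
            return i
    return len(boundaries)  # the implied final segment

boundaries = [("5f805c97873173399a278d79", 2),
              ("5f805cc5873173399a278d7c", 5)]

docs = [  # hypothetical documents standing in for the pictured collection
    {"_id": "5f805c97873173399a278d78", "num": 1},
    {"_id": "5f805c97873173399a278d79", "num": 2},
    {"_id": "5f805cc5873173399a278d7a", "num": 3},
    {"_id": "5f805cc5873173399a278d7b", "num": 4},
    {"_id": "5f805cc5873173399a278d7c", "num": 5},
    {"_id": "5f805cc5873173399a278d7d", "num": 6},
    {"_id": "5f805cc5873173399a278d7e", "num": 7},
]

counts = [0, 0, 0]
for d in docs:
    counts[segment_for(d, boundaries)] += 1
print(counts)  # [2, 3, 2] -- matches the 2/3/2 record split above
```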

After the migration task is complete, you can verify from the task logs that the tables loaded in parallel, as shown in the following example. You can also verify the MongoDB `find` clause used to unload each segment from the source table.

```
[TASK_MANAGER    ] I:  Start loading segment #1 of 3 of table 'testdatabase'.'testtable' (Id = 1) by subtask 1. Start load timestamp 0005B191D638FE86  (replicationtask_util.c:752)

[SOURCE_UNLOAD   ] I:  Range Segmentation filter for Segment #0 is initialized.   (mongodb_unload.c:157)

[SOURCE_UNLOAD   ] I:  Range Segmentation filter for Segment #0 is: { "_id" : { "$lte" : { "$oid" : "5f805c97873173399a278d79" } }, "num" : { "$lte" : { "$numberInt" : "2" } } }  (mongodb_unload.c:328)

[SOURCE_UNLOAD   ] I: Unload finished for segment #1 of segmented table 'testdatabase'.'testtable' (Id = 1). 2 rows sent.

[TARGET_LOAD     ] I: Load finished for segment #1 of segmented table 'testdatabase'.'testtable' (Id = 1). 1 rows received. 0 rows skipped. Volume transfered 480.

[TASK_MANAGER    ] I: Load finished for segment #1 of table 'testdatabase'.'testtable' (Id = 1) by subtask 1. 2 records transferred.
```

Currently, AWS DMS supports the following MongoDB data types as a segment key column:
+ Double
+ String
+ ObjectId
+ 32-bit integer
+ 64-bit integer

## Migrating multiple databases when using MongoDB as a source for AWS DMS
<a name="CHAP_Source.MongoDB.Multidatabase"></a>

AWS DMS versions 3.4.5 and higher support migrating multiple databases in a single task for all supported MongoDB versions. If you want to migrate multiple databases, take these steps:

1. When you create the MongoDB source endpoint, do one of the following:
   + On the DMS console's **Create endpoint** page, make sure that **Database name** is empty under **Endpoint configuration**.
   + Using the AWS CLI `CreateEndpoint` command, assign an empty string value to the `DatabaseName` parameter in `MongoDBSettings`.

1. For each database that you want to migrate from a MongoDB source, specify the database name as a schema name in the table mapping for the task. You can do this using either the guided input in the console or directly in JSON. For more information on the guided input, see [Specifying table selection and transformations rules from the console](CHAP_Tasks.CustomizingTasks.TableMapping.Console.md). For more information on the JSON, see [Selection rules and actions](CHAP_Tasks.CustomizingTasks.TableMapping.SelectionTransformation.Selections.md).

For example, you might specify the JSON following to migrate three MongoDB databases.

**Example Migrate all tables in a schema**  
The JSON following migrates all tables from the `Customers`, `Orders`, and `Inventory` databases in your source endpoint to your target endpoint.  

```
{
    "rules": [
        {
            "rule-type": "selection",
            "rule-id": "1",
            "rule-name": "1",
            "object-locator": {
                "schema-name": "Customers",
                "table-name": "%"
            },
            "rule-action": "include",
            "filters": []
        },
        {
            "rule-type": "selection",
            "rule-id": "2",
            "rule-name": "2",
            "object-locator": {
                "schema-name": "Orders",
                "table-name": "%"
            },
            "rule-action": "include",
            "filters": []
        },
        {
            "rule-type": "selection",
            "rule-id": "3",
            "rule-name": "3",
            "object-locator": {
                "schema-name": "Inventory",
                "table-name": "%"
            },
            "rule-action": "include",
            "filters": []
        }
    ]
}
```
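
If you migrate many databases, you can generate selection rules like these programmatically rather than writing them by hand. The following Python sketch (illustrative, not part of AWS DMS) builds one include-all rule per database name:

```python
import json

def selection_rules(databases):
    """Build one include-all selection rule per database (schema) name."""
    return {
        "rules": [
            {
                "rule-type": "selection",
                "rule-id": str(i),
                "rule-name": str(i),
                "object-locator": {"schema-name": db, "table-name": "%"},
                "rule-action": "include",
                "filters": [],
            }
            for i, db in enumerate(databases, start=1)
        ]
    }

mapping = selection_rules(["Customers", "Orders", "Inventory"])
print(json.dumps(mapping, indent=4))  # same shape as the JSON above
```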

## Limitations when using MongoDB as a source for AWS DMS
<a name="CHAP_Source.MongoDB.Limitations"></a>

The following are limitations when using MongoDB as a source for AWS DMS:
+ In table mode, the documents in a collection must be consistent in the data type that they use for the value in the same field. For example, if a document in a collection includes `'{ a:{ b:value ... }'`, all documents in the collection that reference the `value` of the `a.b` field must use the same data type for `value`, wherever it appears in the collection.
+ When the `_id` option is set as a separate column, the ID string can't exceed 200 characters.
+ Object ID and array type keys are converted to columns that are prefixed with `oid` and `array` in table mode.

  Internally, these columns are referenced with the prefixed names. If you use transformation rules in AWS DMS that reference these columns, make sure to specify the prefixed column. For example, you specify `${oid__id}` and not `${_id}`, or `${array__addresses}` and not `${_addresses}`. 
+ Collection names and key names can't include the dollar sign ($).
+ AWS DMS doesn't support collections containing the same field with different case (upper, lower) in table mode with an RDBMS target. For example, AWS DMS does not support having two collections named `Field1` and `field1`. 
+ Table mode and document mode have the limitations described preceding.
+ Migrating in parallel using autosegmentation has the limitations described preceding.
+ Source filters aren't supported for MongoDB.
+ AWS DMS doesn't support documents where the nesting level is greater than 97.
+ AWS DMS requires UTF-8 encoded source data when migrating to non-DocumentDB targets. For sources with non-UTF-8 characters, convert them to UTF-8 before migration or migrate to Amazon DocumentDB instead.
+ AWS DMS doesn't support the following features of MongoDB version 5.0:
  + Live resharding
  + Client-Side Field Level Encryption (CSFLE)
  + Timeseries collection migration
**Note**  
A timeseries collection migrated in the full-load phase will be converted to a normal collection in Amazon DocumentDB, because DocumentDB doesn't support timeseries collections.

## Endpoint configuration settings when using MongoDB as a source for AWS DMS
<a name="CHAP_Source.MongoDB.Configuration"></a>

When you set up your MongoDB source endpoint, you can specify multiple endpoint configuration settings using the AWS DMS console. 

The following table describes the configuration settings available when using MongoDB databases as an AWS DMS source. 


| Setting (attribute) | Valid values | Default value and description | 
| --- | --- | --- | 
|  **Authentication mode**  |  `"none"` `"password"`  |  The value `"password"` prompts for a user name and password. When `"none"` is specified, user name and password parameters aren't used.  | 
|  **Authentication source**  |  A valid MongoDB database name.  |  The name of the MongoDB database that you want to use to validate your credentials for authentication. The default value is `"admin"`.   | 
|  **Authentication mechanism**  |  `"default"` `"mongodb_cr"` `"scram_sha_1"`  |  The authentication mechanism. The value `"default"` is `"scram_sha_1"`. This setting isn't used when `authType` is set to `"no"`.  | 
|  **Metadata mode**  |  Document and table  |  Chooses document mode or table mode.   | 
|  **Number of documents to scan** (`docsToInvestigate`)  |  An integer greater than `0`.  |  Use this option in table mode only to define the target table definition.  | 
|  **\_id as a separate column**  |  Check mark in box  |  Optional check mark box that creates a second column named `_id` that acts as the primary key.  | 
|   `ExtractDocID`   |  `true` `false`  |  `false` – Use this attribute when `NestingLevel` is set to `"none"`.  When using CDC with sources that produce [multi-document transactions](https://www.mongodb.com/docs/manual/reference/method/Session.startTransaction/#mongodb-method-Session.startTransaction), the `ExtractDocId` parameter **must be** set to `true`. If this parameter is not enabled, the AWS DMS task will fail when it encounters a multi-document transaction.  | 
|  `socketTimeoutMS`  |  An integer greater than or equal to 0. Extra Connection Attribute (ECA) only.  |  This setting is in units of milliseconds and configures the connection timeout for MongoDB clients. If the value is less than or equal to zero, then the MongoDB client default is used.  | 
|   `UseUpdateLookUp`   |  `true` `false`  |  When true, during CDC update events, AWS DMS copies over the entire updated document to the target. When set to false, AWS DMS uses the MongoDB update command to only update modified fields in the document on the target.  | 
|   `ReplicateShardCollections`   |  `true` `false`  |  When true, AWS DMS replicates data to shard collections. AWS DMS only uses this setting if the target endpoint is a DocumentDB elastic cluster. When this setting is true, note the following: [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/dms/latest/userguide/CHAP_Source.MongoDB.html)  | 
|  `useTransactionVerification`  |  `true` `false`  |  When `false`, disables the verification between the change stream and oplogs.   You can miss operations if discrepancies between change streams and oplog entries occur, as the default DMS behavior is to fail the task in such scenarios. Default: `true`.   | 
|  `useOplog`  |  `true` `false`  |  When `true`, enables the DMS task to read directly from the oplog rather than using the change stream. Default: `false`.  | 

If you choose **Document** as **Metadata mode**, different options are available. 

If the target endpoint is DocumentDB, make sure to run the migration in **Document mode**. Also, modify your source endpoint and select the option **\_id as a separate column**. This is a mandatory prerequisite if your source MongoDB workload involves transactions.

## Source data types for MongoDB
<a name="CHAP_Source.MongoDB.DataTypes"></a>

Data migration that uses MongoDB as a source for AWS DMS supports most MongoDB data types. In the following table, you can find the MongoDB source data types that are supported when using AWS DMS and the default mapping from AWS DMS data types. For more information about MongoDB data types, see [BSON types](https://docs.mongodb.com/manual/reference/bson-types) in the MongoDB documentation.

For information on how to view the data type that is mapped in the target, see the section for the target endpoint that you are using.

For additional information about AWS DMS data types, see [Data types for AWS Database Migration Service](CHAP_Reference.DataTypes.md).


|  MongoDB data types  |  AWS DMS data types  | 
| --- | --- | 
| Boolean | Bool | 
| Binary | BLOB | 
| Date | Date | 
| Timestamp | Date | 
| Int | INT4 | 
| Long | INT8 | 
| Double | REAL8 | 
| String (UTF-8) | CLOB | 
| Array | CLOB | 
| OID | String | 
| REGEX | CLOB | 
| CODE | CLOB | 

# Using Amazon DocumentDB (with MongoDB compatibility) as a source for AWS DMS
<a name="CHAP_Source.DocumentDB"></a>

For information about versions of Amazon DocumentDB (with MongoDB compatibility) that AWS DMS supports as a source, see [Sources for AWS DMS](CHAP_Introduction.Sources.md).

 Using Amazon DocumentDB as a source, you can migrate data from one Amazon DocumentDB cluster to another Amazon DocumentDB cluster. You can also migrate data from an Amazon DocumentDB cluster to one of the other target endpoints supported by AWS DMS.

If you are new to Amazon DocumentDB, be aware of the following important concepts for Amazon DocumentDB databases:
+ A record in Amazon DocumentDB is a *document*, a data structure composed of field and value pairs. The value of a field can include other documents, arrays, and arrays of documents. A document is roughly equivalent to a row in a relational database table.
+ A *collection* in Amazon DocumentDB is a group of documents, and is roughly equivalent to a relational database table.
+ A *database* in Amazon DocumentDB is a set of collections, and is roughly equivalent to a schema in a relational database.

AWS DMS supports two migration modes when using Amazon DocumentDB as a source, document mode and table mode. You specify the migration mode when you create the Amazon DocumentDB source endpoint in the AWS DMS console, using either the **Metadata mode** option or the extra connection attribute `nestingLevel`. Following, you can find an explanation of how the choice of migration mode affects the resulting format of the target data.

**Document mode**  
In *document mode*, the JSON document is migrated as is. That means the document data is consolidated into one of two items. When you use a relational database as a target, the data is a single column named `_doc` in a target table. When you use a nonrelational database as a target, the data is a single JSON document. Document mode is the default mode, which we recommend when migrating to an Amazon DocumentDB target.  
For example, consider the following documents in an Amazon DocumentDB collection called `myCollection`.  

```
 db.myCollection.find()
{ "_id" : ObjectId("5a94815f40bd44d1b02bdfe0"), "a" : 1, "b" : 2, "c" : 3 }
{ "_id" : ObjectId("5a94815f40bd44d1b02bdfe1"), "a" : 4, "b" : 5, "c" : 6 }
```
After migrating the data to a relational database table using document mode, the data is structured as follows. The data fields in the document are consolidated into the `_doc` column.      
[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/dms/latest/userguide/CHAP_Source.DocumentDB.html)
You can optionally set the extra connection attribute `extractDocID` to `true` to create a second column named `"_id"` that acts as the primary key. If you are going to use change data capture (CDC), set this parameter to `true` except when using Amazon DocumentDB as the target.  
When using CDC with sources that produce [multi-document transactions](https://www.mongodb.com/docs/manual/reference/method/Session.startTransaction/#mongodb-method-Session.startTransaction), the `ExtractDocId` parameter **must be** set to `true`. If this parameter is not enabled, the AWS DMS task will fail when it encounters a multi-document transaction.  
If you add a new collection to the source database, AWS DMS creates a new target table for the collection and replicates any documents. 

**Table mode**  
In *table mode*, AWS DMS transforms each top-level field in an Amazon DocumentDB document into a column in the target table. If a field is nested, AWS DMS flattens the nested values into a single column. AWS DMS then adds a key field and data types to the target table's column set.   
For each Amazon DocumentDB document, AWS DMS adds each key and type to the target table's column set. For example, using table mode, AWS DMS migrates the previous example into the following table.      
[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/dms/latest/userguide/CHAP_Source.DocumentDB.html)
Nested values are flattened into a column containing dot-separated key names. The column is named using the concatenation of the flattened field names separated by periods. For example, AWS DMS migrates a JSON document with a field of nested values such as `{"a" : {"b" : {"c": 1}}}` into a column named `a.b.c`.  
To create the target columns, AWS DMS scans a specified number of Amazon DocumentDB documents and creates a set of all the fields and their types. AWS DMS then uses this set to create the columns of the target table. If you create or modify your Amazon DocumentDB source endpoint using the console, you can specify the number of documents to scan. The default value is 1,000 documents. If you use the AWS CLI, you can use the extra connection attribute `docsToInvestigate`.  
In table mode, AWS DMS manages documents and collections like this:  
+ When you add a document to an existing collection, the document is replicated. If there are fields that don't exist in the target, those fields aren't replicated.
+ When you update a document, the updated document is replicated. If there are fields that don't exist in the target, those fields aren't replicated.
+ Deleting a document is fully supported.
+ Adding a new collection doesn't result in a new table on the target when done during a CDC task.
+ In the change data capture (CDC) phase, AWS DMS doesn't support renaming a collection.
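
The flattening of nested fields described above can be sketched in a few lines of Python (an illustration, not DMS's actual implementation):

```python
def flatten(doc, prefix=""):
    """Flatten nested fields into dot-separated column names, as table mode does."""
    cols = {}
    for key, value in doc.items():
        name = prefix + key
        if isinstance(value, dict):
            cols.update(flatten(value, prefix=name + "."))  # recurse into nested docs
        else:
            cols[name] = value
    return cols

print(flatten({"a": {"b": {"c": 1}}, "x": 2}))  # {'a.b.c': 1, 'x': 2}
```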

**Topics**
+ [

## Setting permissions to use Amazon DocumentDB as a source
](#CHAP_Source.DocumentDB.Permissions)
+ [

## Configuring CDC for an Amazon DocumentDB cluster
](#CHAP_Source.DocumentDB.ConfigureCDC)
+ [

## Connecting to Amazon DocumentDB using TLS
](#CHAP_Source.DocumentDB.TLS)
+ [

## Creating an Amazon DocumentDB source endpoint
](#CHAP_Source.DocumentDB.ConfigureEndpoint)
+ [

## Segmenting Amazon DocumentDB collections and migrating in parallel
](#CHAP_Source.DocumentDB.ParallelLoad)
+ [

## Migrating multiple databases when using Amazon DocumentDB as a source for AWS DMS
](#CHAP_Source.DocumentDB.Multidatabase)
+ [

## Limitations when using Amazon DocumentDB as a source for AWS DMS
](#CHAP_Source.DocumentDB.Limitations)
+ [

## Using endpoint settings with Amazon DocumentDB as a source
](#CHAP_Source.DocumentDB.ECAs)
+ [

## Source data types for Amazon DocumentDB
](#CHAP_Source.DocumentDB.DataTypes)

## Setting permissions to use Amazon DocumentDB as a source
<a name="CHAP_Source.DocumentDB.Permissions"></a>

When using an Amazon DocumentDB source for an AWS DMS migration, you can create a user account with root privileges. Or you can create a user with permissions only for the database to be migrated. 

The following code creates a user as the root account.

```
use admin
db.createUser(
  {
    user: "root",
    pwd: "password",
    roles: [ { role: "root", db: "admin" } ]
  })
```

For Amazon DocumentDB 3.6, the code following creates a user with minimal privileges on the database to be migrated.

```
use db_name
db.createUser( 
    {
        user: "dms-user",
        pwd: "password",
        roles: [{ role: "read", db: "db_name" }]
    }
)
```

For Amazon DocumentDB 4.0 and higher, AWS DMS uses a deployment-wide change stream. Here, the code following creates a user with minimal privileges.

```
db.createUser( 
{ 
    user: "dms-user",
    pwd: "password",
    roles: [ { role: "readAnyDatabase", db: "admin" }] 
})
```

## Configuring CDC for an Amazon DocumentDB cluster
<a name="CHAP_Source.DocumentDB.ConfigureCDC"></a>

To use ongoing replication or CDC with Amazon DocumentDB, AWS DMS requires access to the Amazon DocumentDB cluster's change streams. For a description of the time-ordered sequence of update events in your cluster's collections and databases, see [Using change streams](https://docs.aws.amazon.com/documentdb/latest/developerguide/change_streams.html) in the *Amazon DocumentDB Developer Guide*. 

Authenticate to your Amazon DocumentDB cluster using the MongoDB shell. Then run the following command to enable change streams.

```
db.adminCommand({modifyChangeStreams: 1,
    database: "DB_NAME",
    collection: "", 
    enable: true});
```

This approach enables the change stream for all collections in your database. After change streams are enabled, you can create a migration task that migrates existing data and at the same time replicates ongoing changes. AWS DMS continues to capture and apply changes even after the bulk data is loaded. Eventually, the source and target databases synchronize, minimizing downtime for a migration.

**Note**  
AWS DMS uses the operations log (oplog) to capture changes during ongoing replication. If Amazon DocumentDB flushes out the records from the oplog before AWS DMS reads them, your tasks will fail. We recommend sizing the oplog to retain changes for at least 24 hours.

## Connecting to Amazon DocumentDB using TLS
<a name="CHAP_Source.DocumentDB.TLS"></a>

By default, a newly created Amazon DocumentDB cluster accepts secure connections only using Transport Layer Security (TLS). When TLS is enabled, every connection to Amazon DocumentDB requires a public key.

You can retrieve the public key for Amazon DocumentDB by downloading the file `rds-combined-ca-bundle.pem` from an AWS-hosted Amazon S3 bucket. For more information on downloading this file, see [Encrypting connections using TLS](https://docs.aws.amazon.com/documentdb/latest/developerguide/security.encryption.ssl.html) in the *Amazon DocumentDB Developer Guide.* 

After you download the `rds-combined-ca-bundle.pem` file, you can import the public key that it contains into AWS DMS. The following steps describe how to do so.

**To import your public key using the AWS DMS console**

1. Sign in to the AWS Management Console and choose AWS DMS.

1. In the navigation pane, choose **Certificates**.

1. Choose **Import certificate**. The **Import new CA certificate** page appears.

1. In the **Certificate configuration** section, do the following:
   + For **Certificate identifier**, enter a unique name for the certificate, such as `docdb-cert`.
   + Choose **Choose file**, navigate to the location where you saved the `rds-combined-ca-bundle.pem` file, and select it.

1. Choose **Add new CA certificate**.

The following AWS CLI example uses the AWS DMS `import-certificate` command to import the `rds-combined-ca-bundle.pem` public key file.

```
aws dms import-certificate \
    --certificate-identifier docdb-cert \
    --certificate-pem file://./rds-combined-ca-bundle.pem
```

## Creating an Amazon DocumentDB source endpoint
<a name="CHAP_Source.DocumentDB.ConfigureEndpoint"></a>

You can create an Amazon DocumentDB source endpoint using either the console or AWS CLI. Use the procedure following with the console.

**To configure an Amazon DocumentDB source endpoint using the AWS DMS console**

1. Sign in to the AWS Management Console and choose AWS DMS.

1. Choose **Endpoints** from the navigation pane, then choose **Create Endpoint**.

1. For **Endpoint identifier**, provide a name that helps you easily identify it, such as `docdb-source`.

1. For **Source engine**, choose **Amazon DocumentDB (with MongoDB compatibility)**.

1. For **Server name**, enter the name of the server where your Amazon DocumentDB database endpoint resides. For example, you might enter the cluster endpoint of your Amazon DocumentDB cluster, such as `democluster.cluster-cjf6q8nxfefi.us-east-2.docdb.amazonaws.com`.

1. For **Port**, enter 27017.

1. For **SSL mode**, choose **verify-full**. If you have disabled SSL on your Amazon DocumentDB cluster, you can skip this step.

1. For **CA certificate**, choose the Amazon DocumentDB certificate, `rds-combined-ca-bundle.pem`. For instructions on adding this certificate, see [Connecting to Amazon DocumentDB using TLS](#CHAP_Source.DocumentDB.TLS).

1. For **Database name**, enter the name of the database to be migrated.

Use the following procedure with the CLI.

**To configure an Amazon DocumentDB source endpoint using the AWS CLI**
+ Run the following AWS DMS `create-endpoint` command to configure an Amazon DocumentDB source endpoint, replacing placeholders with your own values.

  ```
  aws dms create-endpoint \
             --endpoint-identifier a_memorable_name \
             --endpoint-type source \
             --engine-name docdb \
             --username value \
             --password value \
             --server-name servername_where_database_endpoint_resides \
             --port 27017 \
             --database-name name_of_endpoint_database
  ```

## Segmenting Amazon DocumentDB collections and migrating in parallel
<a name="CHAP_Source.DocumentDB.ParallelLoad"></a>

To improve the performance of a migration task, Amazon DocumentDB source endpoints support two options of the parallel full load feature in table mapping: autosegmentation and range segmentation. With autosegmentation, you specify the criteria for AWS DMS to automatically segment your source for migration in each thread. With range segmentation, you specify the specific range of each segment for AWS DMS to migrate in each thread. For more information on these settings, see [Table and collection settings rules and operations](CHAP_Tasks.CustomizingTasks.TableMapping.SelectionTransformation.Tablesettings.md).

### Migrating an Amazon DocumentDB database in parallel using autosegmentation ranges
<a name="CHAP_Source.DocumentDB.ParallelLoad.AutoPartitioned"></a>

You can migrate your documents in parallel by specifying the criteria for AWS DMS to automatically partition (segment) your data for each thread, especially the number of documents to migrate per thread. Using this approach, AWS DMS attempts to optimize segment boundaries for maximum performance per thread.

You can specify the segmentation criteria using the table-settings options following in table-mapping:


|  Table-settings option  |  Description  | 
| --- | --- | 
|  `"type"`  |  (Required) Set to `"partitions-auto"` for Amazon DocumentDB as a source.  | 
|  `"number-of-partitions"`  |  (Optional) Total number of partitions (segments) used for migration. The default is 16.  | 
|  `"collection-count-from-metadata"`  |  (Optional) If set to `true`, AWS DMS uses an estimated collection count for determining the number of partitions. If set to `false`, AWS DMS uses the actual collection count. The default is `true`.  | 
|  `"max-records-skip-per-page"`  |  (Optional) The number of records to skip at once when determining the boundaries for each partition. AWS DMS uses a paginated skip approach to determine the minimum boundary for a partition. The default is 10000. Setting a relatively large value might result in cursor timeouts and task failures. Setting a relatively low value results in more operations per page and a slower full load.   | 
|  `"batch-size"`  |  (Optional) Limits the number of documents returned in one batch. Each batch requires a round trip to the server. If the batch size is zero (0), the cursor uses the server-defined maximum batch size. The default is 0.  | 

The example following shows a table mapping for autosegmentation.

```
{
    "rules": [
        {
            "rule-type": "selection",
            "rule-id": "1",
            "rule-name": "1",
            "object-locator": {
                "schema-name": "admin",
                "table-name": "departments"
            },
            "rule-action": "include",
            "filters": []
        },
        {
            "rule-type": "table-settings",
            "rule-id": "2",
            "rule-name": "2",
            "object-locator": {
                "schema-name": "admin",
                "table-name": "departments"
            },
            "parallel-load": {
                "type": "partitions-auto",
                "number-of-partitions": 5,
                "collection-count-from-metadata": "true",
                "max-records-skip-per-page": 1000000,
                "batch-size": 50000
            }
        }
    ]
}
```

Autosegmentation has the following limitation. The migration for each segment fetches the collection count and the minimum `_id` for the collection separately. It then uses a paginated skip to calculate the minimum boundary for that segment. Therefore, ensure that the minimum `_id` value for each collection remains constant until all the segment boundaries in the collection are calculated. If you change the minimum `_id` value for a collection during its segment boundary calculation, this might cause data loss or duplicate row errors.
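
As a rough sketch of that boundary calculation, the following Python models the paginated skip over a sorted list (an assumption-laden illustration: the real implementation issues skip queries against the source rather than slicing a list, and none of these names are DMS internals):

```python
def partition_boundaries(sorted_ids, number_of_partitions, max_skip_per_page):
    """Approximate how partitions-auto finds each partition's upper boundary."""
    per_partition = len(sorted_ids) // number_of_partitions
    boundaries = []
    for p in range(1, number_of_partitions):
        remaining = p * per_partition   # total documents to skip past
        position = 0
        # Paginated skip: each loop iteration models one skip query (one round
        # trip). A smaller max-records-skip-per-page means more round trips and
        # a slower load; a very large one risks cursor timeouts on the server.
        while remaining > 0:
            step = min(max_skip_per_page, remaining)
            position += step
            remaining -= step
        boundaries.append(sorted_ids[position - 1])
    return boundaries

ids = ["%04x" % n for n in range(100)]       # 100 fake hex _id values
print(partition_boundaries(ids, 4, 10))      # ['0018', '0031', '004a']
```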

### Migrating an Amazon DocumentDB database in parallel using specific segment ranges
<a name="CHAP_Source.DocumentDB.ParallelLoad.Ranges"></a>

The following example shows an Amazon DocumentDB collection that has seven items, and `_id` as the primary key.

![\[Amazon DocumentDB collection with seven items.\]](http://docs.aws.amazon.com/dms/latest/userguide/images/datarep-docdb-collection.png)


To split the collection into three segments and migrate in parallel, you can add table mapping rules to your migration task as shown in the following JSON example.

```
{ // Task table mappings:
  "rules": [
    {
      "rule-type": "selection",
      "rule-id": "1",
      "rule-name": "1",
      "object-locator": {
        "schema-name": "testdatabase",
        "table-name": "testtable"
      },
      "rule-action": "include"
    }, // "selection" :"rule-type"
    {
      "rule-type": "table-settings",
      "rule-id": "2",
      "rule-name": "2",
      "object-locator": {
        "schema-name": "testdatabase",
        "table-name": "testtable"
      },
      "parallel-load": {
        "type": "ranges",
        "columns": [
           "_id",
           "num"
        ],
        "boundaries": [
          // First segment selects documents with _id <= 5f805c97873173399a278d79
          // and num <= 2.
          [
             "5f805c97873173399a278d79",
             "2"
          ],
          // Second segment selects documents with _id > 5f805c97873173399a278d79 and
          // _id <= 5f805cc5873173399a278d7c and
          // num > 2 and num <= 5.
          [
             "5f805cc5873173399a278d7c",
             "5"
          ]                                   
          // Third segment is implied and selects documents with _id > 5f805cc5873173399a278d7c.
        ] // :"boundaries"
      } // :"parallel-load"
    } // "table-settings" :"rule-type"
  ] // :"rules"
} // :Task table mappings
```

That table mapping definition splits the source collection into three segments and migrates in parallel. The following are the segmentation boundaries.

```
Data with _id <= "5f805c97873173399a278d79" and num <= 2 (2 records)
Data with _id <= "5f805cc5873173399a278d7c" and num <= 5 and not in (_id <= "5f805c97873173399a278d79" and num <= 2) (3 records)
Data not in (_id <= "5f805cc5873173399a278d7c" and num <= 5) (2 records)
```

After the migration task is complete, you can verify from the task logs that the tables loaded in parallel, as shown in the following example. You can also verify the Amazon DocumentDB `find` clause used to unload each segment from the source table.

```
[TASK_MANAGER    ] I:  Start loading segment #1 of 3 of table 'testdatabase'.'testtable' (Id = 1) by subtask 1. Start load timestamp 0005B191D638FE86  (replicationtask_util.c:752)

[SOURCE_UNLOAD   ] I:  Range Segmentation filter for Segment #0 is initialized.   (mongodb_unload.c:157)

[SOURCE_UNLOAD   ] I:  Range Segmentation filter for Segment #0 is: { "_id" : { "$lte" : { "$oid" : "5f805c97873173399a278d79" } }, "num" : { "$lte" : { "$numberInt" : "2" } } }  (mongodb_unload.c:328)

[SOURCE_UNLOAD   ] I: Unload finished for segment #1 of segmented table 'testdatabase'.'testtable' (Id = 1). 2 rows sent.

[TARGET_LOAD     ] I: Load finished for segment #1 of segmented table 'testdatabase'.'testtable' (Id = 1). 1 rows received. 0 rows skipped. Volume transfered 480.

[TASK_MANAGER    ] I: Load finished for segment #1 of table 'testdatabase'.'testtable' (Id = 1) by subtask 1. 2 records transferred.
```

Currently, AWS DMS supports the following Amazon DocumentDB data types as a segment key column:
+ Double
+ String
+ ObjectId
+ 32-bit integer
+ 64-bit integer

## Migrating multiple databases when using Amazon DocumentDB as a source for AWS DMS
<a name="CHAP_Source.DocumentDB.Multidatabase"></a>

AWS DMS versions 3.4.5 and higher support migrating multiple databases in a single task only for Amazon DocumentDB versions 4.0 and higher. If you want to migrate multiple databases, do the following:

1. When you create the Amazon DocumentDB source endpoint:
   + In the AWS Management Console for AWS DMS, leave **Database name** empty under **Endpoint configuration** on the **Create endpoint** page.
   + In the AWS Command Line Interface (AWS CLI), assign an empty string value to the **DatabaseName** parameter in **DocumentDBSettings** that you specify for the **CreateEndpoint** action.

1. For each database that you want to migrate from this Amazon DocumentDB source endpoint, specify the name of each database as the name of a schema in the table-mapping for the task using either the guided input in the console or directly in JSON. For more information on the guided input, see the description of the [Specifying table selection and transformations rules from the console](CHAP_Tasks.CustomizingTasks.TableMapping.Console.md). For more information on the JSON, see [Selection rules and actions](CHAP_Tasks.CustomizingTasks.TableMapping.SelectionTransformation.Selections.md).
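As a sketch, a multi-database source endpoint might be created with the AWS CLI as follows. The identifier, server, and credential values are placeholders; the significant part is the empty `DatabaseName` in `--doc-db-settings`.

```shell
# Placeholder identifiers, host, and credentials; replace with your own values.
# An empty DatabaseName tells DMS to migrate multiple databases from this endpoint.
aws dms create-endpoint \
    --endpoint-identifier docdb-multi-db-source \
    --endpoint-type source \
    --engine-name docdb \
    --server-name your-docdb-cluster-endpoint \
    --port 27017 \
    --username dms_user \
    --password "your-password" \
    --doc-db-settings '{"DatabaseName": ""}'
```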

For example, you might specify the JSON following to migrate three Amazon DocumentDB databases.

**Example Migrate all tables in a schema**  
The JSON following migrates all tables from the `Customers`, `Orders`, and `Suppliers` databases in your source endpoint to your target endpoint. Because each selection rule's `object-locator` can name only one schema, each database gets its own rule.  

```
{
    "rules": [
        {
            "rule-type": "selection",
            "rule-id": "1",
            "rule-name": "1",
            "object-locator": {
                "schema-name": "Customers",
                "table-name": "%"
            },
            "rule-action": "include"
        },
        {
            "rule-type": "selection",
            "rule-id": "2",
            "rule-name": "2",
            "object-locator": {
                "schema-name": "Orders",
                "table-name": "%"
            },
            "rule-action": "include"
        },
        {
            "rule-type": "selection",
            "rule-id": "3",
            "rule-name": "3",
            "object-locator": {
                "schema-name": "Suppliers",
                "table-name": "%"
            },
            "rule-action": "include"
        }
    ]
}
```

## Limitations when using Amazon DocumentDB as a source for AWS DMS
<a name="CHAP_Source.DocumentDB.Limitations"></a>

The following are limitations when using Amazon DocumentDB as a source for AWS DMS:
+ When the `_id` option is set as a separate column, the ID string can't exceed 200 characters.
+ Object ID and array type keys are converted to columns that are prefixed with `oid` and `array` in table mode.

  Internally, these columns are referenced with the prefixed names. If you use transformation rules in AWS DMS that reference these columns, make sure to specify the prefixed column. For example, specify `${oid__id}` and not `${_id}`, or `${array__addresses}` and not `${_addresses}`. 
+ Collection names and key names can't include the dollar symbol (\$).
+ Table mode and document mode have the limitations discussed preceding.
+ Migrating in parallel using autosegmentation has the limitations described preceding.
+ An Amazon DocumentDB (MongoDB compatible) source doesn’t support using a specific timestamp as a start position for change data capture (CDC). An ongoing replication task starts capturing changes regardless of the timestamp.
+ AWS DMS doesn't support documents where the nesting level is greater than 97 for AWS DMS versions lower than 3.5.2.
+ Source filters aren't supported for DocumentDB.
+ AWS DMS doesn’t support CDC (change data capture) replication for DocumentDB as a source in elastic cluster mode.

## Using endpoint settings with Amazon DocumentDB as a source
<a name="CHAP_Source.DocumentDB.ECAs"></a>

You can use endpoint settings to configure your Amazon DocumentDB source database similar to using extra connection attributes. You specify the settings when you create the source endpoint using the AWS DMS console, or by using the `create-endpoint` command in the [AWS CLI](https://docs.aws.amazon.com/cli/latest/reference/dms/index.html), with the `--doc-db-settings '{"EndpointSetting": "value", ...}'` JSON syntax.

The following table shows the endpoint settings that you can use with Amazon DocumentDB as a source.


| Attribute name | Valid values | Default value and description | 
| --- | --- | --- | 
|   `NestingLevel`   |  `"none"` `"one"`  |  `"none"` – Specify `"none"` to use document mode. Specify `"one"` to use table mode.  | 
|   `ExtractDocID`   |  `true` `false`  |  `false` – Use this attribute when `NestingLevel` is set to `"none"`.  When using CDC with sources that produce [multi-document transactions](https://www.mongodb.com/docs/manual/reference/method/Session.startTransaction/#mongodb-method-Session.startTransaction), the `ExtractDocId` parameter **must be** set to `true`. If this parameter is not enabled, the AWS DMS task will fail when it encounters a multi-document transaction.  | 
|   `DocsToInvestigate`   |  A positive integer greater than `0`.  |  `1000` – Use this attribute when `NestingLevel` is set to `"one"`.   | 
|   `ReplicateShardCollections `   |  `true` `false`  |  When true, AWS DMS replicates data to shard collections. AWS DMS only uses this setting if the target endpoint is a DocumentDB elastic cluster. When this setting is true, note the following: [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/dms/latest/userguide/CHAP_Source.DocumentDB.html)  | 
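Putting these settings together, a `create-endpoint` call might look like the following sketch. The host and credential values are placeholders; the `--doc-db-settings` keys are the attributes from the table preceding.

```shell
# Placeholder host and credentials; replace with your own values.
# NestingLevel "none" selects document mode; ExtractDocId adds the _id column.
aws dms create-endpoint \
    --endpoint-identifier docdb-source \
    --endpoint-type source \
    --engine-name docdb \
    --server-name your-docdb-host \
    --port 27017 \
    --username dms_user \
    --password "your-password" \
    --doc-db-settings '{"NestingLevel": "none", "ExtractDocId": true}'
```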

## Source data types for Amazon DocumentDB
<a name="CHAP_Source.DocumentDB.DataTypes"></a>

In the following table, you can find the Amazon DocumentDB source data types that are supported when using AWS DMS. You can also find the default mapping from AWS DMS data types in this table. For more information about data types, see [BSON types](https://docs.mongodb.com/manual/reference/bson-types) in the MongoDB documentation.

For information on how to view the data type that is mapped in the target, see the section for the target endpoint that you are using.

For additional information about AWS DMS data types, see [Data types for AWS Database Migration Service](CHAP_Reference.DataTypes.md).


|  Amazon DocumentDB data types  |  AWS DMS data types  | 
| --- | --- | 
| Boolean | Bool | 
| Binary | BLOB | 
| Date | Date | 
| Timestamp | Date | 
| Int | INT4 | 
| Long | INT8 | 
| Double | REAL8 | 
| String (UTF-8) | CLOB | 
| Array | CLOB | 
| OID | String | 

# Using Amazon S3 as a source for AWS DMS
<a name="CHAP_Source.S3"></a>

You can migrate data from an Amazon S3 bucket using AWS DMS. To do this, provide access to an Amazon S3 bucket containing one or more data files. In that S3 bucket, include a JSON file that describes the mapping between the data in those files and the database tables.

The source data files must be present in the Amazon S3 bucket before the full load starts. You specify the bucket name using the `bucketName` parameter. 

The source data files can be in the following formats:
+ Comma-separated value (.csv)
+ Parquet (DMS version 3.5.3 and later). For information about using Parquet-format files, see [Using Parquet-format files in Amazon S3 as a source for AWS DMS](#CHAP_Source.S3.Parquet).

For source data files in comma-separated value (.csv) format, name them using the following naming convention. In this convention, *`schemaName`* is the source schema and *`tableName`* is the name of a table within that schema.

```
/schemaName/tableName/LOAD001.csv
/schemaName/tableName/LOAD002.csv
/schemaName/tableName/LOAD003.csv
...
```

 For example, suppose that your data files are in `amzn-s3-demo-bucket`, at the following Amazon S3 path.

```
s3://amzn-s3-demo-bucket/hr/employee
```

At load time, AWS DMS assumes that the source schema name is `hr`, and that the source table name is `employee`.

In addition to `bucketName` (which is required), you can optionally provide a `bucketFolder` parameter to specify where AWS DMS should look for data files in the Amazon S3 bucket. Continuing the previous example, if you set `bucketFolder` to `sourcedata`, then AWS DMS reads the data files at the following path.

```
s3://amzn-s3-demo-bucket/sourcedata/hr/employee
```
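The full-load prefix that DMS reads can thus be assembled from the bucket name, the optional `bucketFolder`, the schema name, and the table name. The following helper is a small illustration of that layout, not DMS code:

```python
def source_prefix(bucket_name, schema_name, table_name, bucket_folder=None):
    """Build the S3 path where DMS looks for full-load .csv files."""
    parts = [bucket_name]
    if bucket_folder:                       # bucketFolder is optional
        parts.append(bucket_folder)
    parts += [schema_name, table_name]
    return "s3://" + "/".join(parts)

print(source_prefix("amzn-s3-demo-bucket", "hr", "employee"))
# s3://amzn-s3-demo-bucket/hr/employee
print(source_prefix("amzn-s3-demo-bucket", "hr", "employee", "sourcedata"))
# s3://amzn-s3-demo-bucket/sourcedata/hr/employee
```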

You can specify the column delimiter, row delimiter, null value indicator, and other parameters using extra connection attributes. For more information, see [Endpoint settings for Amazon S3 as a source for AWS DMS](#CHAP_Source.S3.Configuring).

You can specify a bucket owner and prevent sniping by using the `ExpectedBucketOwner` Amazon S3 endpoint setting, as shown following. Then, when you make a request to test a connection or perform a migration, S3 checks the account ID of the bucket owner against the specified parameter.

```
--s3-settings='{"ExpectedBucketOwner": "AWS_Account_ID"}'
```

**Topics**
+ [

## Defining external tables for Amazon S3 as a source for AWS DMS
](#CHAP_Source.S3.ExternalTableDef)
+ [

## Using CDC with Amazon S3 as a source for AWS DMS
](#CHAP_Source.S3.CDC)
+ [

## Prerequisites when using Amazon S3 as a source for AWS DMS
](#CHAP_Source.S3.Prerequisites)
+ [

## Limitations when using Amazon S3 as a source for AWS DMS
](#CHAP_Source.S3.Limitations)
+ [

## Endpoint settings for Amazon S3 as a source for AWS DMS
](#CHAP_Source.S3.Configuring)
+ [

## Source data types for Amazon S3
](#CHAP_Source.S3.DataTypes)
+ [

## Using Parquet-format files in Amazon S3 as a source for AWS DMS
](#CHAP_Source.S3.Parquet)

## Defining external tables for Amazon S3 as a source for AWS DMS
<a name="CHAP_Source.S3.ExternalTableDef"></a>

In addition to the data files, you must also provide an external table definition. An *external table definition* is a JSON document that describes how AWS DMS should interpret the data from Amazon S3. The maximum size of this document is 2 MB. If you create a source endpoint using the AWS DMS Management Console, you can enter the JSON directly into the table-mapping box. If you use the AWS Command Line Interface (AWS CLI) or AWS DMS API to perform migrations, you can create a JSON file to specify the external table definition.

Suppose that you have a data file that includes the following.

```
101,Smith,Bob,2014-06-04,New York
102,Smith,Bob,2015-10-08,Los Angeles
103,Smith,Bob,2017-03-13,Dallas
104,Smith,Bob,2017-03-13,Dallas
```

Following is an example external table definition for this data.

```
{
    "TableCount": "1",
    "Tables": [
        {
            "TableName": "employee",
            "TablePath": "hr/employee/",
            "TableOwner": "hr",
            "TableColumns": [
                {
                    "ColumnName": "Id",
                    "ColumnType": "INT8",
                    "ColumnNullable": "false",
                    "ColumnIsPk": "true"
                },
                {
                    "ColumnName": "LastName",
                    "ColumnType": "STRING",
                    "ColumnLength": "20"
                },
                {
                    "ColumnName": "FirstName",
                    "ColumnType": "STRING",
                    "ColumnLength": "30"
                },
                {
                    "ColumnName": "HireDate",
                    "ColumnType": "DATETIME"
                },
                {
                    "ColumnName": "OfficeLocation",
                    "ColumnType": "STRING",
                    "ColumnLength": "20"
                }
            ],
            "TableColumnsTotal": "5"
        }
    ]
}
```

The elements in this JSON document are as follows:

`TableCount` – the number of source tables. In this example, there is only one table.

`Tables` – an array consisting of one JSON map per source table. In this example, there is only one map. Each map consists of the following elements:
+ `TableName` – the name of the source table.
+ `TablePath` – the path in your Amazon S3 bucket where AWS DMS can find the full data load file. If a `bucketFolder` value is specified, its value is prepended to the path.
+ `TableOwner` – the schema name for this table.
+ `TableColumns` – an array of one or more maps, each of which describes a column in the source table:
  + `ColumnName` – the name of a column in the source table.
  + `ColumnType` – the data type for the column. For valid data types, see [Source data types for Amazon S3](#CHAP_Source.S3.DataTypes).
  + `ColumnLength` – the number of bytes in this column. The maximum column length is 2,147,483,647 bytes (2,047 MB), because an S3 source doesn't support FULL LOB mode. `ColumnLength` is valid for the following data types:
    + BYTE
    + STRING
  + `ColumnNullable` – a Boolean value that is `true` if this column can contain NULL values (default=`false`).
  + `ColumnIsPk` – a Boolean value that is `true` if this column is part of the primary key (default=`false`).
  + `ColumnDateFormat` – the input date format for a column with DATE, TIME, and DATETIME types, and used to parse a data string into a date object. Possible values include:

    ```
    - YYYY-MM-dd HH:mm:ss
    - YYYY-MM-dd HH:mm:ss.F
    - YYYY/MM/dd HH:mm:ss
    - YYYY/MM/dd HH:mm:ss.F
    - MM/dd/YYYY HH:mm:ss
    - MM/dd/YYYY HH:mm:ss.F
    - YYYYMMdd HH:mm:ss
    - YYYYMMdd HH:mm:ss.F
    ```
+ `TableColumnsTotal` – the total number of columns. This number must match the number of elements in the `TableColumns` array.

If you don't specify otherwise, AWS DMS assumes that `ColumnLength` is zero.
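The `ColumnDateFormat` strings listed preceding use `YYYY`/`MM`/`dd`/`HH`/`mm`/`ss` tokens. As an illustration (not part of DMS), they can be translated to Python `strptime` directives; token matching is case-sensitive, so `MM` (month) and `mm` (minute) don't collide:

```python
from datetime import datetime

# Illustrative mapping from DMS date-format tokens to strptime directives.
TOKEN_MAP = [("YYYY", "%Y"), ("MM", "%m"), ("dd", "%d"),
             ("HH", "%H"), ("mm", "%M"), ("ss", "%S")]

def to_strptime(dms_format):
    for token, directive in TOKEN_MAP:
        dms_format = dms_format.replace(token, directive)
    return dms_format

fmt = to_strptime("YYYY-MM-dd HH:mm:ss")
print(fmt)                                            # %Y-%m-%d %H:%M:%S
print(datetime.strptime("2014-06-04 09:30:00", fmt))  # 2014-06-04 09:30:00
```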

**Note**  
In supported versions of AWS DMS, the S3 source data can also contain an optional operation column as the first column before the `TableName` column value. This operation column identifies the operation (`INSERT`) used to migrate the data to an S3 target endpoint during a full load.   
If present, the value of this column is the initial character of the `INSERT` operation keyword (`I`). If specified, this column generally indicates that the S3 source was created by DMS as an S3 target during a previous migration.   
In DMS versions prior to 3.4.2, this column wasn't present in S3 source data created from a previous DMS full load. Adding this column to S3 target data allows the format of all rows written to the S3 target to be consistent whether they are written during a full load or during a CDC load. For more information on the options for formatting S3 target data, see [Indicating source DB operations in migrated S3 data](CHAP_Target.S3.md#CHAP_Target.S3.Configuring.InsertOps).

For a column of the NUMERIC type, specify the precision and scale. *Precision* is the total number of digits in a number, and *scale* is the number of digits to the right of the decimal point. You use the `ColumnPrecision` and `ColumnScale` elements for this, as shown following.

```
...
    {
        "ColumnName": "HourlyRate",
        "ColumnType": "NUMERIC",
        "ColumnPrecision": "5"
        "ColumnScale": "2"
    }
...
```

For a column of the DATETIME type with data that contains fractional seconds, specify the scale. *Scale* is the number of digits for the fractional seconds, and can range from 0 to 9. You use the `ColumnScale` element for this, as shown following.

```
...
{
      "ColumnName": "HireDate",
      "ColumnType": "DATETIME",
      "ColumnScale": "3"
}
...
```

If you don't specify otherwise, AWS DMS assumes `ColumnScale` is zero and truncates the fractional seconds.

## Using CDC with Amazon S3 as a source for AWS DMS
<a name="CHAP_Source.S3.CDC"></a>

After AWS DMS performs a full data load, it can optionally replicate data changes to the target endpoint. To do this, you upload change data capture files (CDC files) to your Amazon S3 bucket. AWS DMS reads these CDC files when you upload them, and then applies the changes at the target endpoint. 

The CDC files are named as follows:

```
CDC00001.csv
CDC00002.csv
CDC00003.csv
...
```

**Note**  
To replicate CDC files in the change data folder successfully, upload them in lexical (sequential) order. For example, upload the file CDC00002.csv before the file CDC00003.csv. Otherwise, CDC00002.csv is skipped and isn't replicated if you load it after CDC00003.csv. But the file CDC00004.csv replicates successfully if loaded after CDC00003.csv.

To indicate where AWS DMS can find the files, specify the `cdcPath` parameter. Continuing the previous example, if you set `cdcPath` to `changedata`, then AWS DMS reads the CDC files at the following path.

```
s3://amzn-s3-demo-bucket/changedata
```

If you set `cdcPath` to `changedata` and `bucketFolder` to `myFolder`, then AWS DMS reads the CDC files at the following path.

```
s3://amzn-s3-demo-bucket/myFolder/changedata
```

The records in a CDC file are formatted as follows:
+ Operation – the change operation to be performed: `INSERT` or `I`, `UPDATE` or `U`, or `DELETE` or `D`. These keyword and character values are case-insensitive.
**Note**  
In supported AWS DMS versions, AWS DMS can identify the operation to perform for each load record in two ways. AWS DMS can do this from the record's keyword value (for example, `INSERT`) or from its keyword initial character (for example, `I`). In prior versions, AWS DMS recognized the load operation only from the full keyword value.   
In prior versions of AWS DMS, the full keyword value was written to log the CDC data. Also, prior versions wrote the operation value to any S3 target using only the keyword initial.   
Recognizing both formats allows AWS DMS to handle the operation regardless of how the operation column is written to create the S3 source data. This approach supports using S3 target data as the source for a later migration. With this approach, you don't need to change the format of any keyword initial value that appears in the operation column of the later S3 source.
+ Table name – the name of the source table.
+ Schema name – the name of the source schema.
+ Data – one or more columns that represent the data to be changed.

Following is an example CDC file for a table named `employee`.

```
INSERT,employee,hr,101,Smith,Bob,2014-06-04,New York
UPDATE,employee,hr,101,Smith,Bob,2015-10-08,Los Angeles
UPDATE,employee,hr,101,Smith,Bob,2017-03-13,Dallas
DELETE,employee,hr,101,Smith,Bob,2017-03-13,Dallas
```
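A CDC record, then, carries the operation, table name, schema name, and data columns in order. The following sketch (illustrative only, not DMS code) parses such a line and normalizes the operation to its initial character, accepting either the full keyword or the initial as described preceding:

```python
import csv
import io

# Accept full keywords or initials, case-insensitively, normalized to I/U/D.
OPS = {"INSERT": "I", "I": "I", "UPDATE": "U", "U": "U", "DELETE": "D", "D": "D"}

def parse_cdc_line(line):
    """Split one CDC record into (operation, table, schema, data columns)."""
    fields = next(csv.reader(io.StringIO(line)))
    op = OPS[fields[0].upper()]
    table, schema, *data = fields[1:]
    return op, table, schema, data

op, table, schema, data = parse_cdc_line(
    "UPDATE,employee,hr,101,Smith,Bob,2015-10-08,Los Angeles")
print(op, table, schema, data[0])   # U employee hr 101
```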

## Prerequisites when using Amazon S3 as a source for AWS DMS
<a name="CHAP_Source.S3.Prerequisites"></a>

To use Amazon S3 as a source for AWS DMS, your source S3 bucket must be in the same AWS Region as the DMS replication instance that migrates your data. In addition, the AWS account you use for the migration must have read access to the source bucket. For AWS DMS version 3.4.7 and higher, DMS must access the source bucket through a VPC endpoint or a public route. For information about VPC endpoints, see [Configuring VPC endpoints for AWS DMS](CHAP_VPC_Endpoints.md).

The AWS Identity and Access Management (IAM) role assigned to the user account used to create the migration task must have the following set of permissions.

------
#### [ JSON ]

****  

```
{
    "Version":"2012-10-17",		 	 	 
    "Statement": [
       {
            "Effect": "Allow",
            "Action": [
                "s3:GetObject"
            ],
            "Resource": [
                "arn:aws:s3:::amzn-s3-demo-bucket*/*"
            ]
        },
        {
            "Effect": "Allow",
            "Action": [
                "s3:ListBucket"
            ],
            "Resource": [
                "arn:aws:s3:::amzn-s3-demo-bucket*"
            ]
        }
    ]
}
```

------

The AWS Identity and Access Management (IAM) role assigned to the user account used to create the migration task must have the following set of permissions if versioning is enabled on the Amazon S3 bucket.

------
#### [ JSON ]

****  

```
{
    "Version":"2012-10-17",		 	 	 
    "Statement": [
       {
            "Effect": "Allow",
            "Action": [
                "s3:GetObject",
                "s3:GetObjectVersion"
            ],
            "Resource": [
                "arn:aws:s3:::amzn-s3-demo-bucket*/*"
            ]
        },
        {
            "Effect": "Allow",
            "Action": [
                "s3:ListBucket"
            ],
            "Resource": [
                "arn:aws:s3:::amzn-s3-demo-bucket*"
            ]
        }
    ]
}
```

------

## Limitations when using Amazon S3 as a source for AWS DMS
<a name="CHAP_Source.S3.Limitations"></a>

The following limitations apply when using Amazon S3 as a source:
+ Don’t enable versioning for S3. If you need S3 versioning, use lifecycle policies to actively delete old versions. Otherwise, you might encounter endpoint test connection failures because of an S3 `list-object` call timeout. To create a lifecycle policy for an S3 bucket, see [ Managing your storage lifecycle](https://docs.aws.amazon.com/AmazonS3/latest/userguide/object-lifecycle-mgmt.html). To delete a version of an S3 object, see [ Deleting object versions from a versioning-enabled bucket](https://docs.aws.amazon.com/AmazonS3/latest/dev/DeletingObjectVersions.html).
+ A VPC-enabled (gateway VPC) S3 bucket is supported in versions 3.4.7 and higher.
+ MySQL converts the `time` data type to `string`. To see `time` data type values in MySQL, define the column in the target table as `string`, and set the task's **Target table preparation mode** setting to **Truncate**.
+ AWS DMS uses the `BYTE` data type internally for data in both `BYTE` and `BYTES` data types.
+ S3 source endpoints do not support the DMS table reload feature.
+ AWS DMS doesn't support Full LOB mode with Amazon S3 as a Source.

The following limitations apply when using Parquet-format files in Amazon S3 as a source:
+ Dates in `MMYYYYDD` or `DDMMYYYY` format aren't supported for the S3 Parquet source date-partitioning feature.

## Endpoint settings for Amazon S3 as a source for AWS DMS
<a name="CHAP_Source.S3.Configuring"></a>

You can use endpoint settings to configure your Amazon S3 source database similar to using extra connection attributes. You specify the settings when you create the source endpoint using the AWS DMS console, or by using the `create-endpoint` command in the [AWS CLI](https://docs.aws.amazon.com/cli/latest/reference/dms/index.html), with the `--s3-settings '{"EndpointSetting": "value", ...}'` JSON syntax.

**Note**  
AWS DMS defaults to a secure connection to the Amazon S3 endpoint; you don't need to specify an SSL mode or certificate.

The following table shows the endpoint settings that you can use with Amazon S3 as a source.


| **Option** | **Description** | 
| --- | --- | 
| BucketFolder |  (Optional) A folder name in the S3 bucket. If this attribute is provided, source data files and CDC files are read from the path `s3://amzn-s3-demo-bucket/bucketFolder/schemaName/tableName/` and `s3://amzn-s3-demo-bucket/bucketFolder/` respectively. If this attribute isn't specified, then the path used is `schemaName/tableName/`.  `'{"BucketFolder": "sourceData"}'`  | 
| BucketName |  The name of the S3 bucket. `'{"BucketName": "amzn-s3-demo-bucket"}'`  | 
| CdcPath | The location of CDC files. This attribute is required if a task captures change data; otherwise, it's optional. If CdcPath is present, then AWS DMS reads CDC files from this path and replicates the data changes to the target endpoint. For more information, see [Using CDC with Amazon S3 as a source for AWS DMS](#CHAP_Source.S3.CDC). `'{"CdcPath": "changeData"}'`  | 
| CsvDelimiter |  The delimiter used to separate columns in the source files. The default is a comma. An example follows. `'{"CsvDelimiter": ","}'`  | 
| CsvNullValue |  A user-defined string that AWS DMS treats as null when reading from the source. The default is an empty string. If you don't set this parameter, AWS DMS treats an empty string as a null value. If you set this parameter to a string such as "\N", AWS DMS treats this string as the null value, and treats empty strings as empty string values.  | 
| CsvRowDelimiter |  The delimiter used to separate rows in the source files. The default is a newline (`\n`). `'{"CsvRowDelimiter": "\n"}'`  | 
| DataFormat |  Set this value to `Parquet` to read data in Parquet format. `'{"DataFormat": "Parquet"}'`  | 
| IgnoreHeaderRows |  When this value is set to 1, AWS DMS ignores the first row header in a .csv file. A value of 1 enables the feature, a value of 0 disables the feature. The default is 0. `'{"IgnoreHeaderRows": 1}'`  | 
| Rfc4180 |  When this value is set to `true` or `y`, each leading double quotation mark has to be followed by an ending double quotation mark. This formatting complies with RFC 4180. When this value is set to `false` or `n`, string literals are copied to the target as is. In this case, a delimiter (row or column) signals the end of the field. Thus, you can't use a delimiter as part of the string, because it signals the end of the value. The default is `true`. Valid values: `true`, `false`, `y`, `n` `'{"Rfc4180": false}'`  | 
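Combining several of these settings, a `create-endpoint` call for an S3 source might look like the following sketch. The endpoint identifier, bucket names, and role ARN are placeholders; the role must carry the read permissions shown preceding.

```shell
# Placeholder identifier, bucket, and role ARN; replace with your own values.
aws dms create-endpoint \
    --endpoint-identifier s3-source \
    --endpoint-type source \
    --engine-name s3 \
    --s3-settings '{"ServiceAccessRoleArn": "arn:aws:iam::123456789012:role/dms-s3-role",
                    "BucketName": "amzn-s3-demo-bucket",
                    "BucketFolder": "sourceData",
                    "CsvDelimiter": ",",
                    "IgnoreHeaderRows": 1}'
```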

## Source data types for Amazon S3
<a name="CHAP_Source.S3.DataTypes"></a>

Data migration that uses Amazon S3 as a source for AWS DMS needs to map data from Amazon S3 to AWS DMS data types. For more information, see [Defining external tables for Amazon S3 as a source for AWS DMS](#CHAP_Source.S3.ExternalTableDef).

For information on how to view the data type that is mapped in the target, see the section for the target endpoint you are using.

For additional information about AWS DMS data types, see [Data types for AWS Database Migration Service](CHAP_Reference.DataTypes.md).

The following AWS DMS data types are used with Amazon S3 as a source:
+ BYTE – Requires `ColumnLength`. For more information, see [Defining external tables for Amazon S3 as a source for AWS DMS](#CHAP_Source.S3.ExternalTableDef).
+ DATE
+ TIME
+ DATETIME – For more information and an example, see the DATETIME type example in [Defining external tables for Amazon S3 as a source for AWS DMS](#CHAP_Source.S3.ExternalTableDef).
+ INT1
+ INT2
+ INT4
+ INT8
+ NUMERIC – Requires `ColumnPrecision` and `ColumnScale`. AWS DMS supports the following maximum values:
  + **ColumnPrecision: 38**
  + **ColumnScale: 31**

  For more information and an example, see the NUMERIC type example in [Defining external tables for Amazon S3 as a source for AWS DMS](#CHAP_Source.S3.ExternalTableDef).
+ REAL4
+ REAL8
+ STRING – Requires `ColumnLength`. For more information, see [Defining external tables for Amazon S3 as a source for AWS DMS](#CHAP_Source.S3.ExternalTableDef).
+ UINT1
+ UINT2
+ UINT4
+ UINT8
+ BLOB
+ CLOB
+ BOOLEAN

## Using Parquet-format files in Amazon S3 as a source for AWS DMS
<a name="CHAP_Source.S3.Parquet"></a>

In AWS DMS version 3.5.3 and later, you can use Parquet-format files in an S3 bucket as a source for both full-load and CDC replication. 

As a source, DMS supports only Parquet files that DMS itself generated by migrating data to an S3 target endpoint. File names must be in the supported format; otherwise, DMS doesn't include them in the migration.

Source data files in Parquet format must use the following folder and naming convention.

```
schema/table1/LOAD00001.parquet
schema/table2/LOAD00002.parquet
schema/table2/LOAD00003.parquet
```

For source data files for CDC data in Parquet format, name and store them using the following folder and naming convention.

```
schema/table/20230405-094615814.parquet
schema/table/20230405-094615853.parquet
schema/table/20230405-094615922.parquet
```
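The two naming conventions can be checked with simple patterns. The following is an illustrative sketch, not how DMS itself matches file names:

```python
import re

# Full-load files: LOAD followed by five digits, e.g. LOAD00001.parquet.
FULL_LOAD = re.compile(r"LOAD\d{5}\.parquet$")
# CDC files: date-time stamp, e.g. 20230405-094615814.parquet.
CDC_FILE = re.compile(r"\d{8}-\d{9}\.parquet$")

print(bool(FULL_LOAD.search("schema/table1/LOAD00001.parquet")))          # True
print(bool(CDC_FILE.search("schema/table/20230405-094615814.parquet")))   # True
```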

To access files in Parquet format, set the following endpoint settings:
+ Set `DataFormat` to `Parquet`. 
+ Do not set the `cdcPath` setting. Make sure that you create your Parquet-format files in the specified schema/table folders. 

For more information about settings for S3 endpoints, see [S3Settings](https://docs.aws.amazon.com/dms/latest/APIReference/API_S3Settings.html) in the *AWS Database Migration Service API Reference*.

### Supported datatypes for Parquet-format files
<a name="CHAP_Source.S3.Parquet.Datatypes"></a>

AWS DMS supports the following source and target data types when migrating data from Parquet-format files. Ensure that your target table has columns of the correct data types before migrating.


| Source data type | Target data type | 
| --- | --- | 
| BYTE | BINARY | 
| DATE | DATE32 | 
| TIME | TIME32 | 
| DATETIME | TIMESTAMP | 
| INT1 | INT8 | 
| INT2 | INT16 | 
| INT4 | INT32 | 
| INT8 | INT64 | 
| NUMERIC | DECIMAL | 
| REAL4 | FLOAT | 
| REAL8 | DOUBLE | 
| STRING | STRING | 
| UINT1 | UINT8 | 
| UINT2 | UINT16 | 
| UINT4 | UINT32 | 
| UINT8 | UINT64 | 
| WSTRING | STRING | 
| BLOB | BINARY | 
| NCLOB | STRING | 
| CLOB | STRING | 
| BOOLEAN | BOOL | 

# Using IBM Db2 for Linux, Unix, Windows, and Amazon RDS database (Db2 LUW) as a source for AWS DMS
<a name="CHAP_Source.DB2"></a>

You can migrate data from an IBM Db2 for Linux, Unix, Windows, and Amazon RDS (Db2 LUW) database to any supported target database using AWS Database Migration Service (AWS DMS). 

For information about versions of Db2 on Linux, Unix, Windows, and RDS that AWS DMS supports as a source, see [Sources for AWS DMS](CHAP_Introduction.Sources.md). 

You can use Secure Sockets Layer (SSL) to encrypt connections between your Db2 LUW endpoint and the replication instance. For more information on using SSL with a Db2 LUW endpoint, see [Using SSL with AWS Database Migration Service](CHAP_Security.SSL.md).

When AWS DMS reads data from an IBM Db2 source database, it uses the default isolation level CURSOR STABILITY (CS) for Db2 version 9.7 and above. For more information, see [IBM Db2 for Linux, UNIX and Windows](https://www.ibm.com/docs/en/db2/12.1.0) documentation.

## Prerequisites when using Db2 LUW as a source for AWS DMS
<a name="CHAP_Source.DB2.Prerequisites"></a>

The following prerequisites are required before you can use a Db2 LUW database as a source.

To enable ongoing replication, also called change data capture (CDC), do the following:
+ Set the database to be recoverable, which AWS DMS requires to capture changes. A database is recoverable if either or both of the database configuration parameters `LOGARCHMETH1` and `LOGARCHMETH2` are set to `ON`.

  If your database is recoverable, then AWS DMS can access the Db2 `ARCHIVE LOG` if needed.
+ Ensure that the DB2 transaction logs are available, with a sufficient retention period to be processed by AWS DMS. 
+ DB2 requires `SYSADM` or `DBADM` authorization to extract transaction log records. Grant the user account the following permissions:
  + `SYSADM` or `DBADM`
  + `DATAACCESS`
**Note**  
For full-load only tasks, the DMS user account needs DATAACCESS permission.
+ When using IBM DB2 for LUW version 9.7 as a source, set the extra connection attribute (ECA), `CurrentLsn` as follows:

  `CurrentLsn=LSN` where `LSN` specifies a log sequence number (LSN) where you want the replication to start. Or, `CurrentLsn=scan`.
+ When using Amazon RDS for Db2 LUW as a source, ensure that the archive logs are available to AWS DMS. Because AWS-managed Db2 databases purge the archive logs as soon as possible, you should increase the length of time that the logs remain available. For example, to increase log retention to 24 hours, run the following command:

  ```
  db2 "call rdsadmin.set_archive_log_retention( ?, 'TESTDB', '24')"
  ```

  For more information about Amazon RDS for Db2 LUW procedures, see the [Amazon RDS for Db2 stored procedure reference](https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/db2-stored-procedures.html) in the *Amazon Relational Database Service User Guide*.
+ Grant the following privileges if you use Db2-specific premigration assessments:

  ```
  GRANT CONNECT ON DATABASE TO USER <DMS_USER>;
  GRANT SELECT ON SYSIBM.SYSDUMMY1 TO USER <DMS_USER>;
  GRANT SELECT ON SYSIBMADM.ENV_INST_INFO TO USER <DMS_USER>;
  GRANT SELECT ON SYSIBMADM.DBCFG TO USER <DMS_USER>;
  GRANT SELECT ON SYSCAT.SCHEMATA TO USER <DMS_USER>;
  GRANT SELECT ON SYSCAT.COLUMNS TO USER <DMS_USER>;
  GRANT SELECT ON SYSCAT.TABLES TO USER <DMS_USER>;
  GRANT EXECUTE ON FUNCTION SYSPROC.AUTH_LIST_AUTHORITIES_FOR_AUTHID TO <DMS_USER>;
  GRANT EXECUTE ON PACKAGE NULLID.SYSSH200 TO USER <DMS_USER>;
  ```

## Limitations when using Db2 LUW as a source for AWS DMS
<a name="CHAP_Source.DB2.Limitations"></a>

AWS DMS doesn't support clustered databases. However, you can define a separate Db2 LUW endpoint for each node of a cluster. For example, you can create a Full Load migration task with any one of the nodes in the cluster, and then create separate tasks from each node.

AWS DMS doesn't support the `BOOLEAN` data type in your source Db2 LUW database.

When using ongoing replication (CDC), the following limitations apply:
+ When a table with multiple partitions is truncated, the number of DDL events shown in the AWS DMS console is equal to the number of partitions. This is because Db2 LUW records a separate DDL for each partition.
+ The following DDL actions aren't supported on partitioned tables:
  + ALTER TABLE ADD PARTITION
  + ALTER TABLE DETACH PARTITION
  + ALTER TABLE ATTACH PARTITION
+ AWS DMS doesn't support an ongoing replication migration from a DB2 high availability disaster recovery (HADR) standby instance. The standby is inaccessible.
+ The DECFLOAT data type isn't supported. Consequently, changes to DECFLOAT columns are ignored during ongoing replication.
+ The RENAME COLUMN statement isn't supported.
+ When performing updates to Multi-Dimensional Clustering (MDC) tables, each update is shown in the AWS DMS console as an INSERT plus a DELETE.
+ When the task setting **Include LOB columns in replication** isn't enabled, any table that has LOB columns is suspended during ongoing replication.
+ For Db2 LUW versions 10.5 and higher, variable-length string columns with data that is stored out-of-row are ignored. This limitation applies only to tables created with extended row size for columns with data types like VARCHAR and VARGRAPHIC. To work around this limitation, move the table to a table space with a larger page size. For more information, see [What can I do if I want to change the pagesize of DB2 tablespaces](https://www.ibm.com/support/pages/what-can-i-do-if-i-want-change-pagesize-db2-tablespaces).
+ For ongoing replication, DMS doesn't support migrating data loaded at the page level by the Db2 LOAD utility. Instead, use the IMPORT utility, which uses SQL inserts. For more information, see [differences between the import and load utilities](https://www.ibm.com/docs/en/db2/11.1?topic=utilities-differences-between-import-load-utility).
+ While a replication task is running, DMS captures CREATE TABLE DDLs only if the tables were created with the DATA CAPTURE CHANGE attribute.
+ DMS has the following limitations when using the Db2 Database Partition Feature (DPF):
  + DMS can't coordinate transactions across Db2 nodes in a DPF environment, due to constraints within the IBM DB2READLOG API. In DPF, transactions can span multiple Db2 nodes, depending on how Db2 partitions the data. As a result, your DMS solution must capture transactions from each Db2 node independently.
  + DMS can capture local transactions from each Db2 node in the DPF cluster by setting `connectNode` to `1` on multiple DMS source endpoints. This configuration corresponds to logical node numbers defined in the DB2 server configuration file `db2nodes.cfg`.
  + Local transactions on individual Db2 nodes may be parts of a larger, global transaction. DMS applies each local transaction independently on the target, without coordination with transactions on other Db2 nodes. This independent processing can lead to complications, especially when rows are moved between partitions.
  + When DMS replicates from multiple Db2 nodes, there is no assurance of the correct order of operations on the target, because DMS applies operations independently for each Db2 node. You must ensure that capturing local transactions independently from each Db2 node works for your specific use case.
  + When migrating from a DPF environment, we recommend first running a Full Load task without cached events, and then running CDC-only tasks. We recommend running one task per Db2 node, starting from the Full Load start timestamp or the LRI (log record identifier) that you set using the `StartFromContext` endpoint extra connection attribute. For information about determining your replication start point, see [Finding the LSN or LRI value for replication start](https://www.ibm.com/support/pages/db2-finding-lsn-or-lri-value-replication-start) in the IBM Support documentation.
+ For ongoing replication (CDC), if you plan to start replication from a specific timestamp, you must set the `StartFromContext` extra connection attribute to the required timestamp.
+ Currently, DMS doesn't support the Db2 pureScale Feature, an extension of DB2 LUW that you can use to scale your database solution.
+ The `DATA CAPTURE CHANGES` table option is a crucial prerequisite for Db2 data replication. Neglecting to enable this option when creating tables can cause missing data, especially for CDC-only replication tasks that start from an earlier point in the log. AWS DMS enables this attribute by default when restarting a CDC or full-load-and-CDC task. However, any changes made in the source database before the task restart might be missed. To enable the option on an existing table, run a statement like the following.

  ```
  ALTER TABLE TABLE_SCHEMA.TABLE_NAME DATA CAPTURE CHANGES INCLUDE LONGVAR COLUMNS;
  ```

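If you have many existing tables to prepare, you can generate the required `ALTER TABLE` statements programmatically. The following is a minimal sketch; the schema and table names are hypothetical.

```python
# Generate the ALTER TABLE ... DATA CAPTURE CHANGES statement for each
# schema-qualified table name (names here are hypothetical examples).
def data_capture_ddl(tables):
    return [
        f"ALTER TABLE {t} DATA CAPTURE CHANGES INCLUDE LONGVAR COLUMNS;"
        for t in tables
    ]

for stmt in data_capture_ddl(["SALES.ORDERS", "SALES.ORDER_ITEMS"]):
    print(stmt)
```

You can then run the generated statements with the `db2` command line processor before starting the task.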
## Endpoint settings when using Db2 LUW as a source for AWS DMS
<a name="CHAP_Source.DB2.ConnectionSettings"></a>

You can specify the settings when you create the source endpoint using the AWS DMS console, or by using the `create-endpoint` command in the [AWS CLI](https://docs.aws.amazon.com/cli/latest/reference/dms/create-endpoint.html), with the

`--ibm-db2-settings '{"EndpointSetting1": "value1","EndpointSetting2": "value2"}'`

JSON syntax.
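
Before calling the CLI, you can compose and validate the settings JSON; the following is a minimal sketch, and the setting values are illustrative.

```python
import json

# Compose the value passed to --ibm-db2-settings; the keys are Db2 LUW
# endpoint settings, and the values here are illustrative.
settings = {
    "CurrentLsn": "scan",           # where CDC replication starts
    "MaxKBytesPerRead": 64,         # maximum KB per read
    "SetDataCaptureChanges": True,  # enable ongoing replication (CDC)
}
arg = json.dumps(settings)
print(arg)
```

Pass the printed string as the argument to `--ibm-db2-settings`.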

The following table shows the endpoint settings that you can use with Db2 LUW as a source.


| Setting Name | Description | 
| --- | --- | 
|  `CurrentLsn`  |  For ongoing replication (CDC), use `CurrentLsn` to specify a log sequence number (LSN) where you want the replication to start.   | 
|  `MaxKBytesPerRead`  |  Maximum number of bytes per read, as a NUMBER value. The default is 64 KB.  | 
|  `SetDataCaptureChanges`  |  Enables ongoing replication (CDC) as a BOOLEAN value. The default is true.  | 

## Extra Connection Attributes (ECAs) when using Db2 LUW as a source for AWS DMS
<a name="CHAP_Source.DB2.ConnectionAttrib"></a>

You can specify the Extra Connection Attributes (ECAs) when you create the source endpoint using the AWS DMS console, or by using the `create-endpoint` command in the [AWS CLI](https://docs.aws.amazon.com/cli/latest/reference/dms/create-endpoint.html), with the

`--extra-connection-attributes 'ECAname1=value1;ECAname2=value2;'`

syntax.
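
Because the ECA string is a semicolon-delimited list of `name=value` pairs, you can compose it from a dictionary; the following is a minimal sketch with illustrative values.

```python
# Build the --extra-connection-attributes string from name/value pairs.
# The attribute names are the Db2 LUW ECAs; the values are illustrative.
ecas = {"ConnectionTimeout": "30", "executeTimeout": "120"}
eca_string = "".join(f"{name}={value};" for name, value in ecas.items())
print(eca_string)  # ConnectionTimeout=30;executeTimeout=120;
```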

The following table shows the ECAs that you can use with Db2 LUW as a source.


| Attribute Name | Description | 
| --- | --- | 
|  `ConnectionTimeout`  |  Use this ECA to set the endpoint connection timeout for the Db2 LUW endpoint, in seconds. The default value is 10 seconds. Example: `ConnectionTimeout=30;`  | 
|  `executeTimeout`  |  Use this ECA to set the statement (query) timeout for the Db2 LUW endpoint, in seconds. The default value is 60 seconds. Example: `executeTimeout=120;`  | 
|  `StartFromContext`  |  For ongoing replication (CDC), use `StartFromContext` to specify the lower limit in the log from which to start the replication. `StartFromContext` accepts different forms of values. For the valid values, see [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/dms/latest/userguide/CHAP_Source.DB2.html). To determine the LRI/LSN range of a log file, run the `db2flsn` command as shown in the following example. <pre>db2flsn -db SAMPLE -lrirange 2</pre> The output from that example is similar to the following.  <pre><br />S0000002.LOG: has LRI range 00000000000000010000000000002254000000000004F9A6 to <br />000000000000000100000000000022CC000000000004FB13</pre> In that output, the log file is S0000002.LOG and the `StartFromContext` LRI value is the 34 bytes at the end of the range. <pre>0100000000000022CC000000000004FB13</pre>  | 
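
As a sketch, you can extract the `StartFromContext` value from the `db2flsn` output programmatically: take the upper bound of the LRI range and keep its last 34 characters.

```python
# Extract the StartFromContext LRI from sample db2flsn output: the last
# 34 hex characters of the upper bound of the reported LRI range.
sample = ("S0000002.LOG: has LRI range "
          "00000000000000010000000000002254000000000004F9A6 to "
          "000000000000000100000000000022CC000000000004FB13")

upper_bound = sample.rsplit(" to ", 1)[1].strip()
start_from_context = upper_bound[-34:]
print(start_from_context)  # 0100000000000022CC000000000004FB13
```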

## Source data types for IBM Db2 LUW
<a name="CHAP_Source.DB2.DataTypes"></a>

Data migration that uses Db2 LUW as a source for AWS DMS supports most Db2 LUW data types. The following table shows the Db2 LUW source data types that are supported when using AWS DMS and the default mapping from AWS DMS data types. For more information about Db2 LUW data types, see the [Db2 LUW documentation](https://www.ibm.com/support/knowledgecenter/SSEPGG_10.5.0/com.ibm.db2.luw.sql.ref.doc/doc/r0008483.html).

For information on how to view the data type that is mapped in the target, see the section for the target endpoint that you're using.

For additional information about AWS DMS data types, see [Data types for AWS Database Migration Service](CHAP_Reference.DataTypes.md).


|  Db2 LUW data types  |  AWS DMS data types  | 
| --- | --- | 
|  INTEGER  |  INT4  | 
|  SMALLINT  |  INT2  | 
|  BIGINT  |  INT8  | 
|  DECIMAL (p,s)  |  NUMERIC (p,s)  | 
|  FLOAT  |  REAL8  | 
|  DOUBLE  |  REAL8  | 
|  REAL  |  REAL4  | 
|  DECFLOAT (p)  |  If precision is 16, then REAL8; if precision is 34, then STRING  | 
|  GRAPHIC (n)  |  WSTRING, for fixed-length graphic strings of double byte chars with a length greater than 0 and less than or equal to 127  | 
|  VARGRAPHIC (n)  |  WSTRING, for varying-length graphic strings with a length greater than 0 and less than or equal to 16,352 double byte chars  | 
|  LONG VARGRAPHIC (n)  |  CLOB, for varying-length graphic strings with a length greater than 0 and less than or equal to 16,352 double byte chars  | 
|  CHARACTER (n)  |  STRING, for fixed-length strings of double byte chars with a length greater than 0 and less than or equal to 255  | 
|  VARCHAR (n)  |  STRING, for varying-length strings of double byte chars with a length greater than 0 and less than or equal to 32,704  | 
|  LONG VARCHAR (n)  |  CLOB, for varying-length strings of double byte chars with a length greater than 0 and less than or equal to 32,704  | 
|  CHAR (n) FOR BIT DATA  |  BYTES  | 
|  VARCHAR (n) FOR BIT DATA  |  BYTES  | 
|  LONG VARCHAR FOR BIT DATA  |  BYTES  | 
|  DATE  |  DATE  | 
|  TIME  |  TIME  | 
|  TIMESTAMP  |  DATETIME  | 
|  BLOB (n)  |  BLOB Maximum length is 2,147,483,647 bytes  | 
|  CLOB (n)  |  CLOB Maximum length is 2,147,483,647 bytes  | 
|  DBCLOB (n)  |  CLOB Maximum length is 1,073,741,824 double byte chars  | 
|  XML  |  CLOB  | 

# Using IBM Db2 for z/OS databases as a source for AWS DMS
<a name="CHAP_Source.DB2zOS"></a>

You can migrate data from an IBM for z/OS database to any supported target database using AWS Database Migration Service (AWS DMS). 

For information about versions of Db2 for z/OS that AWS DMS supports as a source, see [Sources for AWS DMS](CHAP_Introduction.Sources.md).

## Prerequisites when using Db2 for z/OS as a source for AWS DMS
<a name="CHAP_Source.DB2zOS.Prerequisites"></a>

To use an IBM Db2 for z/OS database as a source in AWS DMS, grant the following privileges to the Db2 for z/OS user specified in the source endpoint connection settings.

```
GRANT SELECT ON SYSIBM.SYSTABLES TO Db2USER;
GRANT SELECT ON SYSIBM.SYSTABLESPACE TO Db2USER;
GRANT SELECT ON SYSIBM.SYSTABLEPART TO Db2USER;                    
GRANT SELECT ON SYSIBM.SYSCOLUMNS TO Db2USER;
GRANT SELECT ON SYSIBM.SYSDATABASE TO Db2USER;
GRANT SELECT ON SYSIBM.SYSDUMMY1 TO Db2USER;
```

Also grant SELECT on the user-defined source tables.

An AWS DMS IBM Db2 for z/OS source endpoint relies on the IBM Data Server Driver for ODBC to access data. The database server must have a valid IBM ODBC Connect license for DMS to connect to this endpoint.

## Limitations when using Db2 for z/OS as a source for AWS DMS
<a name="CHAP_Source.DB2zOS.Limitations"></a>

The following limitations apply when using an IBM Db2 for z/OS database as a source for AWS DMS:
+ Only Full Load replication tasks are supported. Change data capture (CDC) isn't supported.
+ Parallel load isn't supported.
+ Data validation of views isn't supported.
+ Schema, table, and column names must be specified in uppercase in table mappings for column- and table-level transformations and row-level selection filters.

## Source data types for IBM Db2 for z/OS
<a name="CHAP_Source.DB2zOS.DataTypes"></a>

Data migrations that use Db2 for z/OS as a source for AWS DMS support most Db2 for z/OS data types. The following table shows the Db2 for z/OS source data types that are supported when using AWS DMS, and the default mapping from AWS DMS data types.

For more information about Db2 for z/OS data types, see the [IBM Db2 for z/OS documentation](https://www.ibm.com/docs/en/db2-for-zos/12?topic=elements-data-types).

For information on how to view the data type that is mapped in the target, see the section for the target endpoint that you're using.

For additional information about AWS DMS data types, see [Data types for AWS Database Migration Service](CHAP_Reference.DataTypes.md).


|  Db2 for z/OS data types  |  AWS DMS data types  | 
| --- | --- | 
|  INTEGER  |  INT4  | 
|  SMALLINT  |  INT2  | 
|  BIGINT  |  INT8  | 
|  DECIMAL (p,s)  |  NUMERIC (p,s). If the decimal point is set to a comma (,) in the Db2 configuration, configure AWS DMS to support that setting.   | 
|  FLOAT  |  REAL8  | 
|  DOUBLE  |  REAL8  | 
|  REAL  |  REAL4  | 
|  DECFLOAT (p)  |  If precision is 16, then REAL8; if precision is 34, then STRING  | 
|  GRAPHIC (n)  |  WSTRING, for fixed-length graphic strings of double byte chars with a length greater than 0 and less than or equal to 127  | 
|  VARGRAPHIC (n)  |  WSTRING, for varying-length graphic strings with a length greater than 0 and less than or equal to 16,352 double byte chars  | 
|  LONG VARGRAPHIC (n)  |  CLOB, for varying-length graphic strings with a length greater than 0 and less than or equal to 16,352 double byte chars  | 
|  CHARACTER (n)  |  STRING, for fixed-length strings of double byte chars with a length greater than 0 and less than or equal to 255  | 
|  VARCHAR (n)  |  STRING, for varying-length strings of double byte chars with a length greater than 0 and less than or equal to 32,704  | 
|  LONG VARCHAR (n)  |  CLOB, for varying-length strings of double byte chars with a length greater than 0 and less than or equal to 32,704  | 
|  CHAR (n) FOR BIT DATA  |  BYTES  | 
|  VARCHAR (n) FOR BIT DATA  |  BYTES  | 
|  LONG VARCHAR FOR BIT DATA  |  BYTES  | 
|  DATE  |  DATE  | 
|  TIME  |  TIME  | 
|  TIMESTAMP  |  DATETIME  | 
|  BLOB (n)  |  BLOB Maximum length is 2,147,483,647 bytes  | 
|  CLOB (n)  |  CLOB Maximum length is 2,147,483,647 bytes  | 
|  DBCLOB (n)  |  CLOB Maximum length is 1,073,741,824 double byte chars  | 
|  XML  |  CLOB  | 
|  BINARY  |  BYTES  | 
|  VARBINARY  |  BYTES  | 
|  ROWID  |  BYTES. For more information about working with ROWID, see following.   | 
|  TIMESTAMP WITH TIME ZONE  |  Not supported.  | 

ROWID columns are migrated by default when the target table prep mode for the task is set to DROP_AND_CREATE (the default). Data validation ignores these columns because the rows are meaningless outside the specific database and table. To turn off migration of these columns, you can do one of the following preparatory steps: 
+ Precreate the target table without these columns. Then, set the target table prep mode of the task to either DO_NOTHING or TRUNCATE_BEFORE_LOAD. You can use AWS Schema Conversion Tool (AWS SCT) to precreate the target table without the columns.
+ Add a table mapping rule to a task that filters out these columns so that they're ignored. For more information, see [Transformation rules and actions](CHAP_Tasks.CustomizingTasks.TableMapping.SelectionTransformation.Transformations.md).
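
As a sketch of the second approach, the following builds a table-mapping rule with the `remove-column` transformation action; the schema, table, and column names are hypothetical.

```python
import json

# Table-mapping rule that filters out a hypothetical ROWID column so that
# AWS DMS ignores it during migration.
mapping = {
    "rules": [
        {
            "rule-type": "transformation",
            "rule-id": "1",
            "rule-name": "drop-rowid-column",
            "rule-action": "remove-column",
            "rule-target": "column",
            "object-locator": {
                "schema-name": "MYSCHEMA",
                "table-name": "MYTABLE",
                "column-name": "ROW_ID"
            }
        }
    ]
}
print(json.dumps(mapping, indent=4))
```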

## EBCDIC collations in PostgreSQL for AWS Mainframe Modernization service
<a name="CHAP_Source.DB2zOS.EBCDIC"></a>

AWS Mainframe Modernization helps you modernize your mainframe applications to AWS managed runtime environments. It provides tools and resources to help you plan and implement migration and modernization projects. For more information about mainframe modernization and migration, see [Mainframe Modernization with AWS](https://aws.amazon.com/mainframe/).

Some IBM Db2 for z/OS data sets are encoded in the Extended Binary Coded Decimal Interchange Code (EBCDIC) character set. This character set was developed before ASCII (American Standard Code for Information Interchange) became commonly used. A *code page* maps each character of text to the characters in a character set. A traditional code page contains the mapping information between a code point and a character ID. A *character ID* is an 8-byte character data string. A *code point* is an 8-bit binary number that represents a character. Code points are usually shown as hexadecimal representations of their binary values.

If you currently use either the Micro Focus or BluAge component of the Mainframe Modernization service, you must tell AWS DMS to *shift* (translate) certain code points. You can use AWS DMS task settings to perform the shifts. The following example shows how to use the `CharacterSetSettings` task setting to map the shifts.

```
"CharacterSetSettings": {
        "CharacterSetSupport": null,
        "CharacterReplacements": [
{"SourceCharacterCodePoint": "0000","TargetCharacterCodePoint": "0180"}
,{"SourceCharacterCodePoint": "00B8","TargetCharacterCodePoint": "0160"}
,{"SourceCharacterCodePoint": "00BC","TargetCharacterCodePoint": "0161"}
,{"SourceCharacterCodePoint": "00BD","TargetCharacterCodePoint": "017D"}
,{"SourceCharacterCodePoint": "00BE","TargetCharacterCodePoint": "017E"}
,{"SourceCharacterCodePoint": "00A8","TargetCharacterCodePoint": "0152"}
,{"SourceCharacterCodePoint": "00B4","TargetCharacterCodePoint": "0153"}
,{"SourceCharacterCodePoint": "00A6","TargetCharacterCodePoint": "0178"}
            }
        ]
    }
```

Some EBCDIC collations already exist for PostgreSQL that understand the shifting that's needed. Several different code pages are supported. The sections following provide JSON samples of the shifts required for each supported code page. You can copy and paste the JSON that you need into your DMS task.
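
To illustrate what the replacements do, the following sketch applies the source-to-target substitutions from the example task setting above to a few code points; any code point without an entry passes through unchanged.

```python
# Apply CharacterReplacements-style substitutions to code points. The
# mapping values are taken from the example task setting; any code point
# without an entry is left as-is.
replacements = {0x00B8: 0x0160, 0x00BC: 0x0161, 0x00A6: 0x0178}

def shift(code_points):
    return [replacements.get(cp, cp) for cp in code_points]

print([f"{cp:04X}" for cp in shift([0x0041, 0x00B8, 0x00A6])])  # ['0041', '0160', '0178']
```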

### Micro Focus specific EBCDIC collations
<a name="CHAP_Source.DB2zOS.EBCDIC.MicroFocus"></a>

For Micro Focus, shift a subset of characters as needed for the following collations.

```
 da-DK-cp1142m-x-icu
 de-DE-cp1141m-x-icu
 en-GB-cp1146m-x-icu
 en-US-cp1140m-x-icu
 es-ES-cp1145m-x-icu
 fi-FI-cp1143m-x-icu
 fr-FR-cp1147m-x-icu
 it-IT-cp1144m-x-icu
 nl-BE-cp1148m-x-icu
```

**Example Micro Focus data shifts per collation:**  
**en_us_cp1140m**  
Code Shift:  

```
0000    0180
00A6    0160
00B8    0161
00BC    017D
00BD    017E
00BE    0152
00A8    0153
00B4    0178
```
Corresponding input mapping for an AWS DMS task:  

```
 {"SourceCharacterCodePoint": "0000","TargetCharacterCodePoint": "0180"}
,{"SourceCharacterCodePoint": "00A6","TargetCharacterCodePoint": "0160"}
,{"SourceCharacterCodePoint": "00B8","TargetCharacterCodePoint": "0161"}
,{"SourceCharacterCodePoint": "00BC","TargetCharacterCodePoint": "017D"}
,{"SourceCharacterCodePoint": "00BD","TargetCharacterCodePoint": "017E"}
,{"SourceCharacterCodePoint": "00BE","TargetCharacterCodePoint": "0152"}
,{"SourceCharacterCodePoint": "00A8","TargetCharacterCodePoint": "0153"}
,{"SourceCharacterCodePoint": "00B4","TargetCharacterCodePoint": "0178"}
```
**en_us_cp1141m**  
Code Shift:  

```
0000    0180
00B8    0160
00BC    0161
00BD    017D
00BE    017E
00A8    0152
00B4    0153
00A6    0178
```
Corresponding input mapping for an AWS DMS task:  

```
 {"SourceCharacterCodePoint": "0000","TargetCharacterCodePoint": "0180"}
,{"SourceCharacterCodePoint": "00B8","TargetCharacterCodePoint": "0160"}
,{"SourceCharacterCodePoint": "00BC","TargetCharacterCodePoint": "0161"}
,{"SourceCharacterCodePoint": "00BD","TargetCharacterCodePoint": "017D"}
,{"SourceCharacterCodePoint": "00BE","TargetCharacterCodePoint": "017E"}
,{"SourceCharacterCodePoint": "00A8","TargetCharacterCodePoint": "0152"}
,{"SourceCharacterCodePoint": "00B4","TargetCharacterCodePoint": "0153"}
,{"SourceCharacterCodePoint": "00A6","TargetCharacterCodePoint": "0178"}
```
**en_us_cp1142m**  
Code Shift:  

```
0000    0180
00A6    0160
00B8    0161
00BC    017D
00BD    017E
00BE    0152
00A8    0153
00B4    0178
```
Corresponding input mapping for an AWS DMS task:  

```
 {"SourceCharacterCodePoint": "0000","TargetCharacterCodePoint": "0180"}
,{"SourceCharacterCodePoint": "00A6","TargetCharacterCodePoint": "0160"}
,{"SourceCharacterCodePoint": "00B8","TargetCharacterCodePoint": "0161"}
,{"SourceCharacterCodePoint": "00BC","TargetCharacterCodePoint": "017D"}
,{"SourceCharacterCodePoint": "00BD","TargetCharacterCodePoint": "017E"}
,{"SourceCharacterCodePoint": "00BE","TargetCharacterCodePoint": "0152"}
,{"SourceCharacterCodePoint": "00A8","TargetCharacterCodePoint": "0153"}
,{"SourceCharacterCodePoint": "00B4","TargetCharacterCodePoint": "0178"}
```
**en_us_cp1143m**  
Code Shift:  

```
0000    0180
00B8    0160
00BC    0161
00BD    017D
00BE    017E
00A8    0152
00B4    0153
00A6    0178
```
Corresponding input mapping for an AWS DMS task:  

```
 {"SourceCharacterCodePoint": "0000","TargetCharacterCodePoint": "0180"}
,{"SourceCharacterCodePoint": "00B8","TargetCharacterCodePoint": "0160"}
,{"SourceCharacterCodePoint": "00BC","TargetCharacterCodePoint": "0161"}
,{"SourceCharacterCodePoint": "00BD","TargetCharacterCodePoint": "017D"}
,{"SourceCharacterCodePoint": "00BE","TargetCharacterCodePoint": "017E"}
,{"SourceCharacterCodePoint": "00A8","TargetCharacterCodePoint": "0152"}
,{"SourceCharacterCodePoint": "00B4","TargetCharacterCodePoint": "0153"}
,{"SourceCharacterCodePoint": "00A6","TargetCharacterCodePoint": "0178"}
```
**en_us_cp1144m**  
Code Shift:  

```
0000    0180
00B8    0160
00BC    0161
00BD    017D
00BE    017E
00A8    0152
00B4    0153
00A6    0178
```
Corresponding input mapping for an AWS DMS task:  

```
 {"SourceCharacterCodePoint": "0000","TargetCharacterCodePoint": "0180"}
,{"SourceCharacterCodePoint": "00B8","TargetCharacterCodePoint": "0160"}
,{"SourceCharacterCodePoint": "00BC","TargetCharacterCodePoint": "0161"}
,{"SourceCharacterCodePoint": "00BD","TargetCharacterCodePoint": "017D"}
,{"SourceCharacterCodePoint": "00BE","TargetCharacterCodePoint": "017E"}
,{"SourceCharacterCodePoint": "00A8","TargetCharacterCodePoint": "0152"}
,{"SourceCharacterCodePoint": "00B4","TargetCharacterCodePoint": "0153"}
,{"SourceCharacterCodePoint": "00A6","TargetCharacterCodePoint": "0178"}
```
**en_us_cp1145m**  
Code Shift:  

```
0000    0180
00A6    0160
00B8    0161
00A8    017D
00BC    017E
00BD    0152
00BE    0153
00B4    0178
```
Corresponding input mapping for an AWS DMS task:  

```
 {"SourceCharacterCodePoint": "0000","TargetCharacterCodePoint": "0180"}
,{"SourceCharacterCodePoint": "00A6","TargetCharacterCodePoint": "0160"}
,{"SourceCharacterCodePoint": "00B8","TargetCharacterCodePoint": "0161"}
,{"SourceCharacterCodePoint": "00A8","TargetCharacterCodePoint": "017D"}
,{"SourceCharacterCodePoint": "00BC","TargetCharacterCodePoint": "017E"}
,{"SourceCharacterCodePoint": "00BD","TargetCharacterCodePoint": "0152"}
,{"SourceCharacterCodePoint": "00BE","TargetCharacterCodePoint": "0153"}
,{"SourceCharacterCodePoint": "00B4","TargetCharacterCodePoint": "0178"}
```
**en_us_cp1146m**  
Code Shift:  

```
0000    0180
00A6    0160
00B8    0161
00BC    017D
00BD    017E
00BE    0152
00A8    0153
00B4    0178
```
Corresponding input mapping for an AWS DMS task:  

```
 {"SourceCharacterCodePoint": "0000","TargetCharacterCodePoint": "0180"}
,{"SourceCharacterCodePoint": "00A6","TargetCharacterCodePoint": "0160"}
,{"SourceCharacterCodePoint": "00B8","TargetCharacterCodePoint": "0161"}
,{"SourceCharacterCodePoint": "00BC","TargetCharacterCodePoint": "017D"}
,{"SourceCharacterCodePoint": "00BD","TargetCharacterCodePoint": "017E"}
,{"SourceCharacterCodePoint": "00BE","TargetCharacterCodePoint": "0152"}
,{"SourceCharacterCodePoint": "00A8","TargetCharacterCodePoint": "0153"}
,{"SourceCharacterCodePoint": "00B4","TargetCharacterCodePoint": "0178"}
```
**en_us_cp1147m**  
Code Shift:  

```
0000    0180
00B8    0160
00A8    0161
00BC    017D
00BD    017E
00BE    0152
00B4    0153
00A6    0178
```
Corresponding input mapping for an AWS DMS task:  

```
 {"SourceCharacterCodePoint": "0000","TargetCharacterCodePoint": "0180"}
,{"SourceCharacterCodePoint": "00B8","TargetCharacterCodePoint": "0160"}
,{"SourceCharacterCodePoint": "00A8","TargetCharacterCodePoint": "0161"}
,{"SourceCharacterCodePoint": "00BC","TargetCharacterCodePoint": "017D"}
,{"SourceCharacterCodePoint": "00BD","TargetCharacterCodePoint": "017E"}
,{"SourceCharacterCodePoint": "00BE","TargetCharacterCodePoint": "0152"}
,{"SourceCharacterCodePoint": "00B4","TargetCharacterCodePoint": "0153"}
,{"SourceCharacterCodePoint": "00A6","TargetCharacterCodePoint": "0178"}
```
**en_us_cp1148m**  
Code Shift:  

```
0000    0180
00A6    0160
00B8    0161
00BC    017D
00BD    017E
00BE    0152
00A8    0153
00B4    0178
```
Corresponding input mapping for an AWS DMS task:  

```
 {"SourceCharacterCodePoint": "0000","TargetCharacterCodePoint": "0180"}
,{"SourceCharacterCodePoint": "00A6","TargetCharacterCodePoint": "0160"}
,{"SourceCharacterCodePoint": "00B8","TargetCharacterCodePoint": "0161"}
,{"SourceCharacterCodePoint": "00BC","TargetCharacterCodePoint": "017D"}
,{"SourceCharacterCodePoint": "00BD","TargetCharacterCodePoint": "017E"}
,{"SourceCharacterCodePoint": "00BE","TargetCharacterCodePoint": "0152"}
,{"SourceCharacterCodePoint": "00A8","TargetCharacterCodePoint": "0153"}
,{"SourceCharacterCodePoint": "00B4","TargetCharacterCodePoint": "0178"}
```

### BluAge specific EBCDIC collations
<a name="CHAP_Source.DB2zOS.EBCDIC.BluAge"></a>

For BluAge, shift all of the following *low values* and *high values* as needed. Use these collations only to support the AWS Mainframe Modernization BluAge service.

```
 da-DK-cp1142b-x-icu
 da-DK-cp277b-x-icu
 de-DE-cp1141b-x-icu
 de-DE-cp273b-x-icu
 en-GB-cp1146b-x-icu
 en-GB-cp285b-x-icu
 en-US-cp037b-x-icu
 en-US-cp1140b-x-icu
 es-ES-cp1145b-x-icu
 es-ES-cp284b-x-icu
 fi-FI-cp1143b-x-icu
 fi-FI-cp278b-x-icu 
 fr-FR-cp1147b-x-icu
 fr-FR-cp297b-x-icu
 it-IT-cp1144b-x-icu
 it-IT-cp280b-x-icu
 nl-BE-cp1148b-x-icu
 nl-BE-cp500b-x-icu
```

**Example BluAge Data Shifts:**  
**da-DK-cp277b** and **da-DK-cp1142b**  
Code Shift:  

```
0180    0180
0001    0181
0002    0182
0003    0183
009C    0184
0009    0185
0086    0186
007F    0187
0097    0188
008D    0189
008E    018A
000B    018B
000C    018C
000D    018D
000E    018E
000F    018F
0010    0190
0011    0191
0012    0192
0013    0193
009D    0194
0085    0195
0008    0196
0087    0197
0018    0198
0019    0199
0092    019A
008F    019B
001C    019C
001D    019D
001E    019E
001F    019F
0080    01A0
0081    01A1
0082    01A2
0083    01A3
0084    01A4
000A    01A5
0017    01A6
001B    01A7
0088    01A8
0089    01A9
008A    01AA
008B    01AB
008C    01AC
0005    01AD
0006    01AE
0007    01AF
0090    01B0
0091    01B1
0016    01B2
0093    01B3
0094    01B4
0095    01B5
0096    01B6
0004    01B7
0098    01B8
0099    01B9
009A    01BA
009B    01BB
0014    01BC
0015    01BD
009E    01BE
001A    01BF
009F    027F
```
Corresponding input mapping for an AWS DMS task:  

```
 {"SourceCharacterCodePoint": "0180","TargetCharacterCodePoint": "0180"}
,{"SourceCharacterCodePoint": "0001","TargetCharacterCodePoint": "0181"}
,{"SourceCharacterCodePoint": "0002","TargetCharacterCodePoint": "0182"}
,{"SourceCharacterCodePoint": "0003","TargetCharacterCodePoint": "0183"}
,{"SourceCharacterCodePoint": "009C","TargetCharacterCodePoint": "0184"}
,{"SourceCharacterCodePoint": "0009","TargetCharacterCodePoint": "0185"}
,{"SourceCharacterCodePoint": "0086","TargetCharacterCodePoint": "0186"}
,{"SourceCharacterCodePoint": "007F","TargetCharacterCodePoint": "0187"}
,{"SourceCharacterCodePoint": "0097","TargetCharacterCodePoint": "0188"}
,{"SourceCharacterCodePoint": "008D","TargetCharacterCodePoint": "0189"}
,{"SourceCharacterCodePoint": "008E","TargetCharacterCodePoint": "018A"}
,{"SourceCharacterCodePoint": "000B","TargetCharacterCodePoint": "018B"}
,{"SourceCharacterCodePoint": "000C","TargetCharacterCodePoint": "018C"}
,{"SourceCharacterCodePoint": "000D","TargetCharacterCodePoint": "018D"}
,{"SourceCharacterCodePoint": "000E","TargetCharacterCodePoint": "018E"}
,{"SourceCharacterCodePoint": "000F","TargetCharacterCodePoint": "018F"}
,{"SourceCharacterCodePoint": "0010","TargetCharacterCodePoint": "0190"}
,{"SourceCharacterCodePoint": "0011","TargetCharacterCodePoint": "0191"}
,{"SourceCharacterCodePoint": "0012","TargetCharacterCodePoint": "0192"}
,{"SourceCharacterCodePoint": "0013","TargetCharacterCodePoint": "0193"}
,{"SourceCharacterCodePoint": "009D","TargetCharacterCodePoint": "0194"}
,{"SourceCharacterCodePoint": "0085","TargetCharacterCodePoint": "0195"}
,{"SourceCharacterCodePoint": "0008","TargetCharacterCodePoint": "0196"}
,{"SourceCharacterCodePoint": "0087","TargetCharacterCodePoint": "0197"}
,{"SourceCharacterCodePoint": "0018","TargetCharacterCodePoint": "0198"}
,{"SourceCharacterCodePoint": "0019","TargetCharacterCodePoint": "0199"}
,{"SourceCharacterCodePoint": "0092","TargetCharacterCodePoint": "019A"}
,{"SourceCharacterCodePoint": "008F","TargetCharacterCodePoint": "019B"}
,{"SourceCharacterCodePoint": "001C","TargetCharacterCodePoint": "019C"}
,{"SourceCharacterCodePoint": "001D","TargetCharacterCodePoint": "019D"}
,{"SourceCharacterCodePoint": "001E","TargetCharacterCodePoint": "019E"}
,{"SourceCharacterCodePoint": "001F","TargetCharacterCodePoint": "019F"}
,{"SourceCharacterCodePoint": "0080","TargetCharacterCodePoint": "01A0"}
,{"SourceCharacterCodePoint": "0081","TargetCharacterCodePoint": "01A1"}
,{"SourceCharacterCodePoint": "0082","TargetCharacterCodePoint": "01A2"}
,{"SourceCharacterCodePoint": "0083","TargetCharacterCodePoint": "01A3"}
,{"SourceCharacterCodePoint": "0084","TargetCharacterCodePoint": "01A4"}
,{"SourceCharacterCodePoint": "000A","TargetCharacterCodePoint": "01A5"}
,{"SourceCharacterCodePoint": "0017","TargetCharacterCodePoint": "01A6"}
,{"SourceCharacterCodePoint": "001B","TargetCharacterCodePoint": "01A7"}
,{"SourceCharacterCodePoint": "0088","TargetCharacterCodePoint": "01A8"}
,{"SourceCharacterCodePoint": "0089","TargetCharacterCodePoint": "01A9"}
,{"SourceCharacterCodePoint": "008A","TargetCharacterCodePoint": "01AA"}
,{"SourceCharacterCodePoint": "008B","TargetCharacterCodePoint": "01AB"}
,{"SourceCharacterCodePoint": "008C","TargetCharacterCodePoint": "01AC"}
,{"SourceCharacterCodePoint": "0005","TargetCharacterCodePoint": "01AD"}
,{"SourceCharacterCodePoint": "0006","TargetCharacterCodePoint": "01AE"}
,{"SourceCharacterCodePoint": "0007","TargetCharacterCodePoint": "01AF"}
,{"SourceCharacterCodePoint": "0090","TargetCharacterCodePoint": "01B0"}
,{"SourceCharacterCodePoint": "0091","TargetCharacterCodePoint": "01B1"}
,{"SourceCharacterCodePoint": "0016","TargetCharacterCodePoint": "01B2"}
,{"SourceCharacterCodePoint": "0093","TargetCharacterCodePoint": "01B3"}
,{"SourceCharacterCodePoint": "0094","TargetCharacterCodePoint": "01B4"}
,{"SourceCharacterCodePoint": "0095","TargetCharacterCodePoint": "01B5"}
,{"SourceCharacterCodePoint": "0096","TargetCharacterCodePoint": "01B6"}
,{"SourceCharacterCodePoint": "0004","TargetCharacterCodePoint": "01B7"}
,{"SourceCharacterCodePoint": "0098","TargetCharacterCodePoint": "01B8"}
,{"SourceCharacterCodePoint": "0099","TargetCharacterCodePoint": "01B9"}
,{"SourceCharacterCodePoint": "009A","TargetCharacterCodePoint": "01BA"}
,{"SourceCharacterCodePoint": "009B","TargetCharacterCodePoint": "01BB"}
,{"SourceCharacterCodePoint": "0014","TargetCharacterCodePoint": "01BC"}
,{"SourceCharacterCodePoint": "0015","TargetCharacterCodePoint": "01BD"}
,{"SourceCharacterCodePoint": "009E","TargetCharacterCodePoint": "01BE"}
,{"SourceCharacterCodePoint": "001A","TargetCharacterCodePoint": "01BF"}
,{"SourceCharacterCodePoint": "009F","TargetCharacterCodePoint": "027F"}
```
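The mapping entries in this section are supplied as a JSON array in the task definition. A minimal sketch, assuming they populate the `CharacterReplacements` array of the task's `CharacterSetSettings` (verify against the task-settings reference for your engine version; only the first two pairs are shown here, and the remaining pairs follow the same pattern):

```json
{
  "CharacterSetSettings": {
    "CharacterReplacements": [
      {"SourceCharacterCodePoint": "0180", "TargetCharacterCodePoint": "0180"},
      {"SourceCharacterCodePoint": "0001", "TargetCharacterCodePoint": "0181"}
    ]
  }
}
```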
**de-DE-273b** and **de-DE-1141b**  
Code Shift (source code point → target code point):  

```
0180    0180
0001    0181
0002    0182
0003    0183
009C    0184
0009    0185
0086    0186
007F    0187
0097    0188
008D    0189
008E    018A
000B    018B
000C    018C
000D    018D
000E    018E
000F    018F
0010    0190
0011    0191
0012    0192
0013    0193
009D    0194
0085    0195
0008    0196
0087    0197
0018    0198
0019    0199
0092    019A
008F    019B
001C    019C
001D    019D
001E    019E
001F    019F
0080    01A0
0081    01A1
0082    01A2
0083    01A3
0084    01A4
000A    01A5
0017    01A6
001B    01A7
0088    01A8
0089    01A9
008A    01AA
008B    01AB
008C    01AC
0005    01AD
0006    01AE
0007    01AF
0090    01B0
0091    01B1
0016    01B2
0093    01B3
0094    01B4
0095    01B5
0096    01B6
0004    01B7
0098    01B8
0099    01B9
009A    01BA
009B    01BB
0014    01BC
0015    01BD
009E    01BE
001A    01BF
009F    027F
```
Corresponding input mapping for an AWS DMS task:  

```
 {"SourceCharacterCodePoint": "0180","TargetCharacterCodePoint": "0180"}
,{"SourceCharacterCodePoint": "0001","TargetCharacterCodePoint": "0181"}
,{"SourceCharacterCodePoint": "0002","TargetCharacterCodePoint": "0182"}
,{"SourceCharacterCodePoint": "0003","TargetCharacterCodePoint": "0183"}
,{"SourceCharacterCodePoint": "009C","TargetCharacterCodePoint": "0184"}
,{"SourceCharacterCodePoint": "0009","TargetCharacterCodePoint": "0185"}
,{"SourceCharacterCodePoint": "0086","TargetCharacterCodePoint": "0186"}
,{"SourceCharacterCodePoint": "007F","TargetCharacterCodePoint": "0187"}
,{"SourceCharacterCodePoint": "0097","TargetCharacterCodePoint": "0188"}
,{"SourceCharacterCodePoint": "008D","TargetCharacterCodePoint": "0189"}
,{"SourceCharacterCodePoint": "008E","TargetCharacterCodePoint": "018A"}
,{"SourceCharacterCodePoint": "000B","TargetCharacterCodePoint": "018B"}
,{"SourceCharacterCodePoint": "000C","TargetCharacterCodePoint": "018C"}
,{"SourceCharacterCodePoint": "000D","TargetCharacterCodePoint": "018D"}
,{"SourceCharacterCodePoint": "000E","TargetCharacterCodePoint": "018E"}
,{"SourceCharacterCodePoint": "000F","TargetCharacterCodePoint": "018F"}
,{"SourceCharacterCodePoint": "0010","TargetCharacterCodePoint": "0190"}
,{"SourceCharacterCodePoint": "0011","TargetCharacterCodePoint": "0191"}
,{"SourceCharacterCodePoint": "0012","TargetCharacterCodePoint": "0192"}
,{"SourceCharacterCodePoint": "0013","TargetCharacterCodePoint": "0193"}
,{"SourceCharacterCodePoint": "009D","TargetCharacterCodePoint": "0194"}
,{"SourceCharacterCodePoint": "0085","TargetCharacterCodePoint": "0195"}
,{"SourceCharacterCodePoint": "0008","TargetCharacterCodePoint": "0196"}
,{"SourceCharacterCodePoint": "0087","TargetCharacterCodePoint": "0197"}
,{"SourceCharacterCodePoint": "0018","TargetCharacterCodePoint": "0198"}
,{"SourceCharacterCodePoint": "0019","TargetCharacterCodePoint": "0199"}
,{"SourceCharacterCodePoint": "0092","TargetCharacterCodePoint": "019A"}
,{"SourceCharacterCodePoint": "008F","TargetCharacterCodePoint": "019B"}
,{"SourceCharacterCodePoint": "001C","TargetCharacterCodePoint": "019C"}
,{"SourceCharacterCodePoint": "001D","TargetCharacterCodePoint": "019D"}
,{"SourceCharacterCodePoint": "001E","TargetCharacterCodePoint": "019E"}
,{"SourceCharacterCodePoint": "001F","TargetCharacterCodePoint": "019F"}
,{"SourceCharacterCodePoint": "0080","TargetCharacterCodePoint": "01A0"}
,{"SourceCharacterCodePoint": "0081","TargetCharacterCodePoint": "01A1"}
,{"SourceCharacterCodePoint": "0082","TargetCharacterCodePoint": "01A2"}
,{"SourceCharacterCodePoint": "0083","TargetCharacterCodePoint": "01A3"}
,{"SourceCharacterCodePoint": "0084","TargetCharacterCodePoint": "01A4"}
,{"SourceCharacterCodePoint": "000A","TargetCharacterCodePoint": "01A5"}
,{"SourceCharacterCodePoint": "0017","TargetCharacterCodePoint": "01A6"}
,{"SourceCharacterCodePoint": "001B","TargetCharacterCodePoint": "01A7"}
,{"SourceCharacterCodePoint": "0088","TargetCharacterCodePoint": "01A8"}
,{"SourceCharacterCodePoint": "0089","TargetCharacterCodePoint": "01A9"}
,{"SourceCharacterCodePoint": "008A","TargetCharacterCodePoint": "01AA"}
,{"SourceCharacterCodePoint": "008B","TargetCharacterCodePoint": "01AB"}
,{"SourceCharacterCodePoint": "008C","TargetCharacterCodePoint": "01AC"}
,{"SourceCharacterCodePoint": "0005","TargetCharacterCodePoint": "01AD"}
,{"SourceCharacterCodePoint": "0006","TargetCharacterCodePoint": "01AE"}
,{"SourceCharacterCodePoint": "0007","TargetCharacterCodePoint": "01AF"}
,{"SourceCharacterCodePoint": "0090","TargetCharacterCodePoint": "01B0"}
,{"SourceCharacterCodePoint": "0091","TargetCharacterCodePoint": "01B1"}
,{"SourceCharacterCodePoint": "0016","TargetCharacterCodePoint": "01B2"}
,{"SourceCharacterCodePoint": "0093","TargetCharacterCodePoint": "01B3"}
,{"SourceCharacterCodePoint": "0094","TargetCharacterCodePoint": "01B4"}
,{"SourceCharacterCodePoint": "0095","TargetCharacterCodePoint": "01B5"}
,{"SourceCharacterCodePoint": "0096","TargetCharacterCodePoint": "01B6"}
,{"SourceCharacterCodePoint": "0004","TargetCharacterCodePoint": "01B7"}
,{"SourceCharacterCodePoint": "0098","TargetCharacterCodePoint": "01B8"}
,{"SourceCharacterCodePoint": "0099","TargetCharacterCodePoint": "01B9"}
,{"SourceCharacterCodePoint": "009A","TargetCharacterCodePoint": "01BA"}
,{"SourceCharacterCodePoint": "009B","TargetCharacterCodePoint": "01BB"}
,{"SourceCharacterCodePoint": "0014","TargetCharacterCodePoint": "01BC"}
,{"SourceCharacterCodePoint": "0015","TargetCharacterCodePoint": "01BD"}
,{"SourceCharacterCodePoint": "009E","TargetCharacterCodePoint": "01BE"}
,{"SourceCharacterCodePoint": "001A","TargetCharacterCodePoint": "01BF"}
,{"SourceCharacterCodePoint": "009F","TargetCharacterCodePoint": "027F"}
```
**en-GB-285b** and **en-GB-1146b**  
Code Shift (source code point → target code point):  

```
0180    0180
0001    0181
0002    0182
0003    0183
009C    0184
0009    0185
0086    0186
007F    0187
0097    0188
008D    0189
008E    018A
000B    018B
000C    018C
000D    018D
000E    018E
000F    018F
0010    0190
0011    0191
0012    0192
0013    0193
009D    0194
0085    0195
0008    0196
0087    0197
0018    0198
0019    0199
0092    019A
008F    019B
001C    019C
001D    019D
001E    019E
001F    019F
0080    01A0
0081    01A1
0082    01A2
0083    01A3
0084    01A4
000A    01A5
0017    01A6
001B    01A7
0088    01A8
0089    01A9
008A    01AA
008B    01AB
008C    01AC
0005    01AD
0006    01AE
0007    01AF
0090    01B0
0091    01B1
0016    01B2
0093    01B3
0094    01B4
0095    01B5
0096    01B6
0004    01B7
0098    01B8
0099    01B9
009A    01BA
009B    01BB
0014    01BC
0015    01BD
009E    01BE
001A    01BF
009F    027F
```
Corresponding input mapping for an AWS DMS task:  

```
{"SourceCharacterCodePoint": "0180","TargetCharacterCodePoint": "0180"}
,{"SourceCharacterCodePoint": "0001","TargetCharacterCodePoint": "0181"}
,{"SourceCharacterCodePoint": "0002","TargetCharacterCodePoint": "0182"}
,{"SourceCharacterCodePoint": "0003","TargetCharacterCodePoint": "0183"}
,{"SourceCharacterCodePoint": "009C","TargetCharacterCodePoint": "0184"}
,{"SourceCharacterCodePoint": "0009","TargetCharacterCodePoint": "0185"}
,{"SourceCharacterCodePoint": "0086","TargetCharacterCodePoint": "0186"}
,{"SourceCharacterCodePoint": "007F","TargetCharacterCodePoint": "0187"}
,{"SourceCharacterCodePoint": "0097","TargetCharacterCodePoint": "0188"}
,{"SourceCharacterCodePoint": "008D","TargetCharacterCodePoint": "0189"}
,{"SourceCharacterCodePoint": "008E","TargetCharacterCodePoint": "018A"}
,{"SourceCharacterCodePoint": "000B","TargetCharacterCodePoint": "018B"}
,{"SourceCharacterCodePoint": "000C","TargetCharacterCodePoint": "018C"}
,{"SourceCharacterCodePoint": "000D","TargetCharacterCodePoint": "018D"}
,{"SourceCharacterCodePoint": "000E","TargetCharacterCodePoint": "018E"}
,{"SourceCharacterCodePoint": "000F","TargetCharacterCodePoint": "018F"}
,{"SourceCharacterCodePoint": "0010","TargetCharacterCodePoint": "0190"}
,{"SourceCharacterCodePoint": "0011","TargetCharacterCodePoint": "0191"}
,{"SourceCharacterCodePoint": "0012","TargetCharacterCodePoint": "0192"}
,{"SourceCharacterCodePoint": "0013","TargetCharacterCodePoint": "0193"}
,{"SourceCharacterCodePoint": "009D","TargetCharacterCodePoint": "0194"}
,{"SourceCharacterCodePoint": "0085","TargetCharacterCodePoint": "0195"}
,{"SourceCharacterCodePoint": "0008","TargetCharacterCodePoint": "0196"}
,{"SourceCharacterCodePoint": "0087","TargetCharacterCodePoint": "0197"}
,{"SourceCharacterCodePoint": "0018","TargetCharacterCodePoint": "0198"}
,{"SourceCharacterCodePoint": "0019","TargetCharacterCodePoint": "0199"}
,{"SourceCharacterCodePoint": "0092","TargetCharacterCodePoint": "019A"}
,{"SourceCharacterCodePoint": "008F","TargetCharacterCodePoint": "019B"}
,{"SourceCharacterCodePoint": "001C","TargetCharacterCodePoint": "019C"}
,{"SourceCharacterCodePoint": "001D","TargetCharacterCodePoint": "019D"}
,{"SourceCharacterCodePoint": "001E","TargetCharacterCodePoint": "019E"}
,{"SourceCharacterCodePoint": "001F","TargetCharacterCodePoint": "019F"}
,{"SourceCharacterCodePoint": "0080","TargetCharacterCodePoint": "01A0"}
,{"SourceCharacterCodePoint": "0081","TargetCharacterCodePoint": "01A1"}
,{"SourceCharacterCodePoint": "0082","TargetCharacterCodePoint": "01A2"}
,{"SourceCharacterCodePoint": "0083","TargetCharacterCodePoint": "01A3"}
,{"SourceCharacterCodePoint": "0084","TargetCharacterCodePoint": "01A4"}
,{"SourceCharacterCodePoint": "000A","TargetCharacterCodePoint": "01A5"}
,{"SourceCharacterCodePoint": "0017","TargetCharacterCodePoint": "01A6"}
,{"SourceCharacterCodePoint": "001B","TargetCharacterCodePoint": "01A7"}
,{"SourceCharacterCodePoint": "0088","TargetCharacterCodePoint": "01A8"}
,{"SourceCharacterCodePoint": "0089","TargetCharacterCodePoint": "01A9"}
,{"SourceCharacterCodePoint": "008A","TargetCharacterCodePoint": "01AA"}
,{"SourceCharacterCodePoint": "008B","TargetCharacterCodePoint": "01AB"}
,{"SourceCharacterCodePoint": "008C","TargetCharacterCodePoint": "01AC"}
,{"SourceCharacterCodePoint": "0005","TargetCharacterCodePoint": "01AD"}
,{"SourceCharacterCodePoint": "0006","TargetCharacterCodePoint": "01AE"}
,{"SourceCharacterCodePoint": "0007","TargetCharacterCodePoint": "01AF"}
,{"SourceCharacterCodePoint": "0090","TargetCharacterCodePoint": "01B0"}
,{"SourceCharacterCodePoint": "0091","TargetCharacterCodePoint": "01B1"}
,{"SourceCharacterCodePoint": "0016","TargetCharacterCodePoint": "01B2"}
,{"SourceCharacterCodePoint": "0093","TargetCharacterCodePoint": "01B3"}
,{"SourceCharacterCodePoint": "0094","TargetCharacterCodePoint": "01B4"}
,{"SourceCharacterCodePoint": "0095","TargetCharacterCodePoint": "01B5"}
,{"SourceCharacterCodePoint": "0096","TargetCharacterCodePoint": "01B6"}
,{"SourceCharacterCodePoint": "0004","TargetCharacterCodePoint": "01B7"}
,{"SourceCharacterCodePoint": "0098","TargetCharacterCodePoint": "01B8"}
,{"SourceCharacterCodePoint": "0099","TargetCharacterCodePoint": "01B9"}
,{"SourceCharacterCodePoint": "009A","TargetCharacterCodePoint": "01BA"}
,{"SourceCharacterCodePoint": "009B","TargetCharacterCodePoint": "01BB"}
,{"SourceCharacterCodePoint": "0014","TargetCharacterCodePoint": "01BC"}
,{"SourceCharacterCodePoint": "0015","TargetCharacterCodePoint": "01BD"}
,{"SourceCharacterCodePoint": "009E","TargetCharacterCodePoint": "01BE"}
,{"SourceCharacterCodePoint": "001A","TargetCharacterCodePoint": "01BF"}
,{"SourceCharacterCodePoint": "009F","TargetCharacterCodePoint": "027F"}
```
**en-US-037b** and **en-US-1140b**  
Code Shift (source code point → target code point):  

```
0180    0180
0001    0181
0002    0182
0003    0183
009C    0184
0009    0185
0086    0186
007F    0187
0097    0188
008D    0189
008E    018A
000B    018B
000C    018C
000D    018D
000E    018E
000F    018F
0010    0190
0011    0191
0012    0192
0013    0193
009D    0194
0085    0195
0008    0196
0087    0197
0018    0198
0019    0199
0092    019A
008F    019B
001C    019C
001D    019D
001E    019E
001F    019F
0080    01A0
0081    01A1
0082    01A2
0083    01A3
0084    01A4
000A    01A5
0017    01A6
001B    01A7
0088    01A8
0089    01A9
008A    01AA
008B    01AB
008C    01AC
0005    01AD
0006    01AE
0007    01AF
0090    01B0
0091    01B1
0016    01B2
0093    01B3
0094    01B4
0095    01B5
0096    01B6
0004    01B7
0098    01B8
0099    01B9
009A    01BA
009B    01BB
0014    01BC
0015    01BD
009E    01BE
001A    01BF
009F    027F
```
Corresponding input mapping for an AWS DMS task:  

```
{"SourceCharacterCodePoint": "0180","TargetCharacterCodePoint": "0180"}
,{"SourceCharacterCodePoint": "0001","TargetCharacterCodePoint": "0181"}
,{"SourceCharacterCodePoint": "0002","TargetCharacterCodePoint": "0182"}
,{"SourceCharacterCodePoint": "0003","TargetCharacterCodePoint": "0183"}
,{"SourceCharacterCodePoint": "009C","TargetCharacterCodePoint": "0184"}
,{"SourceCharacterCodePoint": "0009","TargetCharacterCodePoint": "0185"}
,{"SourceCharacterCodePoint": "0086","TargetCharacterCodePoint": "0186"}
,{"SourceCharacterCodePoint": "007F","TargetCharacterCodePoint": "0187"}
,{"SourceCharacterCodePoint": "0097","TargetCharacterCodePoint": "0188"}
,{"SourceCharacterCodePoint": "008D","TargetCharacterCodePoint": "0189"}
,{"SourceCharacterCodePoint": "008E","TargetCharacterCodePoint": "018A"}
,{"SourceCharacterCodePoint": "000B","TargetCharacterCodePoint": "018B"}
,{"SourceCharacterCodePoint": "000C","TargetCharacterCodePoint": "018C"}
,{"SourceCharacterCodePoint": "000D","TargetCharacterCodePoint": "018D"}
,{"SourceCharacterCodePoint": "000E","TargetCharacterCodePoint": "018E"}
,{"SourceCharacterCodePoint": "000F","TargetCharacterCodePoint": "018F"}
,{"SourceCharacterCodePoint": "0010","TargetCharacterCodePoint": "0190"}
,{"SourceCharacterCodePoint": "0011","TargetCharacterCodePoint": "0191"}
,{"SourceCharacterCodePoint": "0012","TargetCharacterCodePoint": "0192"}
,{"SourceCharacterCodePoint": "0013","TargetCharacterCodePoint": "0193"}
,{"SourceCharacterCodePoint": "009D","TargetCharacterCodePoint": "0194"}
,{"SourceCharacterCodePoint": "0085","TargetCharacterCodePoint": "0195"}
,{"SourceCharacterCodePoint": "0008","TargetCharacterCodePoint": "0196"}
,{"SourceCharacterCodePoint": "0087","TargetCharacterCodePoint": "0197"}
,{"SourceCharacterCodePoint": "0018","TargetCharacterCodePoint": "0198"}
,{"SourceCharacterCodePoint": "0019","TargetCharacterCodePoint": "0199"}
,{"SourceCharacterCodePoint": "0092","TargetCharacterCodePoint": "019A"}
,{"SourceCharacterCodePoint": "008F","TargetCharacterCodePoint": "019B"}
,{"SourceCharacterCodePoint": "001C","TargetCharacterCodePoint": "019C"}
,{"SourceCharacterCodePoint": "001D","TargetCharacterCodePoint": "019D"}
,{"SourceCharacterCodePoint": "001E","TargetCharacterCodePoint": "019E"}
,{"SourceCharacterCodePoint": "001F","TargetCharacterCodePoint": "019F"}
,{"SourceCharacterCodePoint": "0080","TargetCharacterCodePoint": "01A0"}
,{"SourceCharacterCodePoint": "0081","TargetCharacterCodePoint": "01A1"}
,{"SourceCharacterCodePoint": "0082","TargetCharacterCodePoint": "01A2"}
,{"SourceCharacterCodePoint": "0083","TargetCharacterCodePoint": "01A3"}
,{"SourceCharacterCodePoint": "0084","TargetCharacterCodePoint": "01A4"}
,{"SourceCharacterCodePoint": "000A","TargetCharacterCodePoint": "01A5"}
,{"SourceCharacterCodePoint": "0017","TargetCharacterCodePoint": "01A6"}
,{"SourceCharacterCodePoint": "001B","TargetCharacterCodePoint": "01A7"}
,{"SourceCharacterCodePoint": "0088","TargetCharacterCodePoint": "01A8"}
,{"SourceCharacterCodePoint": "0089","TargetCharacterCodePoint": "01A9"}
,{"SourceCharacterCodePoint": "008A","TargetCharacterCodePoint": "01AA"}
,{"SourceCharacterCodePoint": "008B","TargetCharacterCodePoint": "01AB"}
,{"SourceCharacterCodePoint": "008C","TargetCharacterCodePoint": "01AC"}
,{"SourceCharacterCodePoint": "0005","TargetCharacterCodePoint": "01AD"}
,{"SourceCharacterCodePoint": "0006","TargetCharacterCodePoint": "01AE"}
,{"SourceCharacterCodePoint": "0007","TargetCharacterCodePoint": "01AF"}
,{"SourceCharacterCodePoint": "0090","TargetCharacterCodePoint": "01B0"}
,{"SourceCharacterCodePoint": "0091","TargetCharacterCodePoint": "01B1"}
,{"SourceCharacterCodePoint": "0016","TargetCharacterCodePoint": "01B2"}
,{"SourceCharacterCodePoint": "0093","TargetCharacterCodePoint": "01B3"}
,{"SourceCharacterCodePoint": "0094","TargetCharacterCodePoint": "01B4"}
,{"SourceCharacterCodePoint": "0095","TargetCharacterCodePoint": "01B5"}
,{"SourceCharacterCodePoint": "0096","TargetCharacterCodePoint": "01B6"}
,{"SourceCharacterCodePoint": "0004","TargetCharacterCodePoint": "01B7"}
,{"SourceCharacterCodePoint": "0098","TargetCharacterCodePoint": "01B8"}
,{"SourceCharacterCodePoint": "0099","TargetCharacterCodePoint": "01B9"}
,{"SourceCharacterCodePoint": "009A","TargetCharacterCodePoint": "01BA"}
,{"SourceCharacterCodePoint": "009B","TargetCharacterCodePoint": "01BB"}
,{"SourceCharacterCodePoint": "0014","TargetCharacterCodePoint": "01BC"}
,{"SourceCharacterCodePoint": "0015","TargetCharacterCodePoint": "01BD"}
,{"SourceCharacterCodePoint": "009E","TargetCharacterCodePoint": "01BE"}
,{"SourceCharacterCodePoint": "001A","TargetCharacterCodePoint": "01BF"}
,{"SourceCharacterCodePoint": "009F","TargetCharacterCodePoint": "027F"}
```
**es-ES-284b** and **es-ES-1145b**  
Code Shift (source code point → target code point):  

```
0180    0180
0001    0181
0002    0182
0003    0183
009C    0184
0009    0185
0086    0186
007F    0187
0097    0188
008D    0189
008E    018A
000B    018B
000C    018C
000D    018D
000E    018E
000F    018F
0010    0190
0011    0191
0012    0192
0013    0193
009D    0194
0085    0195
0008    0196
0087    0197
0018    0198
0019    0199
0092    019A
008F    019B
001C    019C
001D    019D
001E    019E
001F    019F
0080    01A0
0081    01A1
0082    01A2
0083    01A3
0084    01A4
000A    01A5
0017    01A6
001B    01A7
0088    01A8
0089    01A9
008A    01AA
008B    01AB
008C    01AC
0005    01AD
0006    01AE
0007    01AF
0090    01B0
0091    01B1
0016    01B2
0093    01B3
0094    01B4
0095    01B5
0096    01B6
0004    01B7
0098    01B8
0099    01B9
009A    01BA
009B    01BB
0014    01BC
0015    01BD
009E    01BE
001A    01BF
009F    027F
```
Corresponding input mapping for an AWS DMS task:  

```
 {"SourceCharacterCodePoint": "0180","TargetCharacterCodePoint": "0180"}
,{"SourceCharacterCodePoint": "0001","TargetCharacterCodePoint": "0181"}
,{"SourceCharacterCodePoint": "0002","TargetCharacterCodePoint": "0182"}
,{"SourceCharacterCodePoint": "0003","TargetCharacterCodePoint": "0183"}
,{"SourceCharacterCodePoint": "009C","TargetCharacterCodePoint": "0184"}
,{"SourceCharacterCodePoint": "0009","TargetCharacterCodePoint": "0185"}
,{"SourceCharacterCodePoint": "0086","TargetCharacterCodePoint": "0186"}
,{"SourceCharacterCodePoint": "007F","TargetCharacterCodePoint": "0187"}
,{"SourceCharacterCodePoint": "0097","TargetCharacterCodePoint": "0188"}
,{"SourceCharacterCodePoint": "008D","TargetCharacterCodePoint": "0189"}
,{"SourceCharacterCodePoint": "008E","TargetCharacterCodePoint": "018A"}
,{"SourceCharacterCodePoint": "000B","TargetCharacterCodePoint": "018B"}
,{"SourceCharacterCodePoint": "000C","TargetCharacterCodePoint": "018C"}
,{"SourceCharacterCodePoint": "000D","TargetCharacterCodePoint": "018D"}
,{"SourceCharacterCodePoint": "000E","TargetCharacterCodePoint": "018E"}
,{"SourceCharacterCodePoint": "000F","TargetCharacterCodePoint": "018F"}
,{"SourceCharacterCodePoint": "0010","TargetCharacterCodePoint": "0190"}
,{"SourceCharacterCodePoint": "0011","TargetCharacterCodePoint": "0191"}
,{"SourceCharacterCodePoint": "0012","TargetCharacterCodePoint": "0192"}
,{"SourceCharacterCodePoint": "0013","TargetCharacterCodePoint": "0193"}
,{"SourceCharacterCodePoint": "009D","TargetCharacterCodePoint": "0194"}
,{"SourceCharacterCodePoint": "0085","TargetCharacterCodePoint": "0195"}
,{"SourceCharacterCodePoint": "0008","TargetCharacterCodePoint": "0196"}
,{"SourceCharacterCodePoint": "0087","TargetCharacterCodePoint": "0197"}
,{"SourceCharacterCodePoint": "0018","TargetCharacterCodePoint": "0198"}
,{"SourceCharacterCodePoint": "0019","TargetCharacterCodePoint": "0199"}
,{"SourceCharacterCodePoint": "0092","TargetCharacterCodePoint": "019A"}
,{"SourceCharacterCodePoint": "008F","TargetCharacterCodePoint": "019B"}
,{"SourceCharacterCodePoint": "001C","TargetCharacterCodePoint": "019C"}
,{"SourceCharacterCodePoint": "001D","TargetCharacterCodePoint": "019D"}
,{"SourceCharacterCodePoint": "001E","TargetCharacterCodePoint": "019E"}
,{"SourceCharacterCodePoint": "001F","TargetCharacterCodePoint": "019F"}
,{"SourceCharacterCodePoint": "0080","TargetCharacterCodePoint": "01A0"}
,{"SourceCharacterCodePoint": "0081","TargetCharacterCodePoint": "01A1"}
,{"SourceCharacterCodePoint": "0082","TargetCharacterCodePoint": "01A2"}
,{"SourceCharacterCodePoint": "0083","TargetCharacterCodePoint": "01A3"}
,{"SourceCharacterCodePoint": "0084","TargetCharacterCodePoint": "01A4"}
,{"SourceCharacterCodePoint": "000A","TargetCharacterCodePoint": "01A5"}
,{"SourceCharacterCodePoint": "0017","TargetCharacterCodePoint": "01A6"}
,{"SourceCharacterCodePoint": "001B","TargetCharacterCodePoint": "01A7"}
,{"SourceCharacterCodePoint": "0088","TargetCharacterCodePoint": "01A8"}
,{"SourceCharacterCodePoint": "0089","TargetCharacterCodePoint": "01A9"}
,{"SourceCharacterCodePoint": "008A","TargetCharacterCodePoint": "01AA"}
,{"SourceCharacterCodePoint": "008B","TargetCharacterCodePoint": "01AB"}
,{"SourceCharacterCodePoint": "008C","TargetCharacterCodePoint": "01AC"}
,{"SourceCharacterCodePoint": "0005","TargetCharacterCodePoint": "01AD"}
,{"SourceCharacterCodePoint": "0006","TargetCharacterCodePoint": "01AE"}
,{"SourceCharacterCodePoint": "0007","TargetCharacterCodePoint": "01AF"}
,{"SourceCharacterCodePoint": "0090","TargetCharacterCodePoint": "01B0"}
,{"SourceCharacterCodePoint": "0091","TargetCharacterCodePoint": "01B1"}
,{"SourceCharacterCodePoint": "0016","TargetCharacterCodePoint": "01B2"}
,{"SourceCharacterCodePoint": "0093","TargetCharacterCodePoint": "01B3"}
,{"SourceCharacterCodePoint": "0094","TargetCharacterCodePoint": "01B4"}
,{"SourceCharacterCodePoint": "0095","TargetCharacterCodePoint": "01B5"}
,{"SourceCharacterCodePoint": "0096","TargetCharacterCodePoint": "01B6"}
,{"SourceCharacterCodePoint": "0004","TargetCharacterCodePoint": "01B7"}
,{"SourceCharacterCodePoint": "0098","TargetCharacterCodePoint": "01B8"}
,{"SourceCharacterCodePoint": "0099","TargetCharacterCodePoint": "01B9"}
,{"SourceCharacterCodePoint": "009A","TargetCharacterCodePoint": "01BA"}
,{"SourceCharacterCodePoint": "009B","TargetCharacterCodePoint": "01BB"}
,{"SourceCharacterCodePoint": "0014","TargetCharacterCodePoint": "01BC"}
,{"SourceCharacterCodePoint": "0015","TargetCharacterCodePoint": "01BD"}
,{"SourceCharacterCodePoint": "009E","TargetCharacterCodePoint": "01BE"}
,{"SourceCharacterCodePoint": "001A","TargetCharacterCodePoint": "01BF"}
,{"SourceCharacterCodePoint": "009F","TargetCharacterCodePoint": "027F"}
```
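Because every code page in this section shares the same two-column code-shift layout, the JSON mapping entries can be generated rather than typed by hand. A short sketch of that conversion (the helper name `code_shift_to_mappings` is illustrative, not part of any AWS API):

```python
import json

def code_shift_to_mappings(code_shift_text):
    """Turn code-shift lines like '0180    0180' into DMS-style mapping dicts."""
    mappings = []
    for line in code_shift_text.strip().splitlines():
        # Each line holds a source code point and a target code point.
        source, target = line.split()
        mappings.append({
            "SourceCharacterCodePoint": source,
            "TargetCharacterCodePoint": target,
        })
    return mappings

# A three-line excerpt of a code-shift table from this section.
shift = """
0180    0180
0001    0181
009F    027F
"""
entries = code_shift_to_mappings(shift)
print(json.dumps(entries[0]))
# prints {"SourceCharacterCodePoint": "0180", "TargetCharacterCodePoint": "0180"}
```

Joining the resulting objects with commas reproduces the mapping blocks shown throughout this section.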
**fi-FI-278b** and **fi-FI-1143b**  
Code Shift (source code point → target code point):  

```
0180    0180
0001    0181
0002    0182
0003    0183
009C    0184
0009    0185
0086    0186
007F    0187
0097    0188
008D    0189
008E    018A
000B    018B
000C    018C
000D    018D
000E    018E
000F    018F
0010    0190
0011    0191
0012    0192
0013    0193
009D    0194
0085    0195
0008    0196
0087    0197
0018    0198
0019    0199
0092    019A
008F    019B
001C    019C
001D    019D
001E    019E
001F    019F
0080    01A0
0081    01A1
0082    01A2
0083    01A3
0084    01A4
000A    01A5
0017    01A6
001B    01A7
0088    01A8
0089    01A9
008A    01AA
008B    01AB
008C    01AC
0005    01AD
0006    01AE
0007    01AF
0090    01B0
0091    01B1
0016    01B2
0093    01B3
0094    01B4
0095    01B5
0096    01B6
0004    01B7
0098    01B8
0099    01B9
009A    01BA
009B    01BB
0014    01BC
0015    01BD
009E    01BE
001A    01BF
009F    027F
```
Corresponding input mapping for an AWS DMS task:  

```
 {"SourceCharacterCodePoint": "0180","TargetCharacterCodePoint": "0180"}
,{"SourceCharacterCodePoint": "0001","TargetCharacterCodePoint": "0181"}
,{"SourceCharacterCodePoint": "0002","TargetCharacterCodePoint": "0182"}
,{"SourceCharacterCodePoint": "0003","TargetCharacterCodePoint": "0183"}
,{"SourceCharacterCodePoint": "009C","TargetCharacterCodePoint": "0184"}
,{"SourceCharacterCodePoint": "0009","TargetCharacterCodePoint": "0185"}
,{"SourceCharacterCodePoint": "0086","TargetCharacterCodePoint": "0186"}
,{"SourceCharacterCodePoint": "007F","TargetCharacterCodePoint": "0187"}
,{"SourceCharacterCodePoint": "0097","TargetCharacterCodePoint": "0188"}
,{"SourceCharacterCodePoint": "008D","TargetCharacterCodePoint": "0189"}
,{"SourceCharacterCodePoint": "008E","TargetCharacterCodePoint": "018A"}
,{"SourceCharacterCodePoint": "000B","TargetCharacterCodePoint": "018B"}
,{"SourceCharacterCodePoint": "000C","TargetCharacterCodePoint": "018C"}
,{"SourceCharacterCodePoint": "000D","TargetCharacterCodePoint": "018D"}
,{"SourceCharacterCodePoint": "000E","TargetCharacterCodePoint": "018E"}
,{"SourceCharacterCodePoint": "000F","TargetCharacterCodePoint": "018F"}
,{"SourceCharacterCodePoint": "0010","TargetCharacterCodePoint": "0190"}
,{"SourceCharacterCodePoint": "0011","TargetCharacterCodePoint": "0191"}
,{"SourceCharacterCodePoint": "0012","TargetCharacterCodePoint": "0192"}
,{"SourceCharacterCodePoint": "0013","TargetCharacterCodePoint": "0193"}
,{"SourceCharacterCodePoint": "009D","TargetCharacterCodePoint": "0194"}
,{"SourceCharacterCodePoint": "0085","TargetCharacterCodePoint": "0195"}
,{"SourceCharacterCodePoint": "0008","TargetCharacterCodePoint": "0196"}
,{"SourceCharacterCodePoint": "0087","TargetCharacterCodePoint": "0197"}
,{"SourceCharacterCodePoint": "0018","TargetCharacterCodePoint": "0198"}
,{"SourceCharacterCodePoint": "0019","TargetCharacterCodePoint": "0199"}
,{"SourceCharacterCodePoint": "0092","TargetCharacterCodePoint": "019A"}
,{"SourceCharacterCodePoint": "008F","TargetCharacterCodePoint": "019B"}
,{"SourceCharacterCodePoint": "001C","TargetCharacterCodePoint": "019C"}
,{"SourceCharacterCodePoint": "001D","TargetCharacterCodePoint": "019D"}
,{"SourceCharacterCodePoint": "001E","TargetCharacterCodePoint": "019E"}
,{"SourceCharacterCodePoint": "001F","TargetCharacterCodePoint": "019F"}
,{"SourceCharacterCodePoint": "0080","TargetCharacterCodePoint": "01A0"}
,{"SourceCharacterCodePoint": "0081","TargetCharacterCodePoint": "01A1"}
,{"SourceCharacterCodePoint": "0082","TargetCharacterCodePoint": "01A2"}
,{"SourceCharacterCodePoint": "0083","TargetCharacterCodePoint": "01A3"}
,{"SourceCharacterCodePoint": "0084","TargetCharacterCodePoint": "01A4"}
,{"SourceCharacterCodePoint": "000A","TargetCharacterCodePoint": "01A5"}
,{"SourceCharacterCodePoint": "0017","TargetCharacterCodePoint": "01A6"}
,{"SourceCharacterCodePoint": "001B","TargetCharacterCodePoint": "01A7"}
,{"SourceCharacterCodePoint": "0088","TargetCharacterCodePoint": "01A8"}
,{"SourceCharacterCodePoint": "0089","TargetCharacterCodePoint": "01A9"}
,{"SourceCharacterCodePoint": "008A","TargetCharacterCodePoint": "01AA"}
,{"SourceCharacterCodePoint": "008B","TargetCharacterCodePoint": "01AB"}
,{"SourceCharacterCodePoint": "008C","TargetCharacterCodePoint": "01AC"}
,{"SourceCharacterCodePoint": "0005","TargetCharacterCodePoint": "01AD"}
,{"SourceCharacterCodePoint": "0006","TargetCharacterCodePoint": "01AE"}
,{"SourceCharacterCodePoint": "0007","TargetCharacterCodePoint": "01AF"}
,{"SourceCharacterCodePoint": "0090","TargetCharacterCodePoint": "01B0"}
,{"SourceCharacterCodePoint": "0091","TargetCharacterCodePoint": "01B1"}
,{"SourceCharacterCodePoint": "0016","TargetCharacterCodePoint": "01B2"}
,{"SourceCharacterCodePoint": "0093","TargetCharacterCodePoint": "01B3"}
,{"SourceCharacterCodePoint": "0094","TargetCharacterCodePoint": "01B4"}
,{"SourceCharacterCodePoint": "0095","TargetCharacterCodePoint": "01B5"}
,{"SourceCharacterCodePoint": "0096","TargetCharacterCodePoint": "01B6"}
,{"SourceCharacterCodePoint": "0004","TargetCharacterCodePoint": "01B7"}
,{"SourceCharacterCodePoint": "0098","TargetCharacterCodePoint": "01B8"}
,{"SourceCharacterCodePoint": "0099","TargetCharacterCodePoint": "01B9"}
,{"SourceCharacterCodePoint": "009A","TargetCharacterCodePoint": "01BA"}
,{"SourceCharacterCodePoint": "009B","TargetCharacterCodePoint": "01BB"}
,{"SourceCharacterCodePoint": "0014","TargetCharacterCodePoint": "01BC"}
,{"SourceCharacterCodePoint": "0015","TargetCharacterCodePoint": "01BD"}
,{"SourceCharacterCodePoint": "009E","TargetCharacterCodePoint": "01BE"}
,{"SourceCharacterCodePoint": "001A","TargetCharacterCodePoint": "01BF"}
,{"SourceCharacterCodePoint": "009F","TargetCharacterCodePoint": "027F"}
```
**fr-FR-297b** and **fr-FR-1147b**  
Code Shift (source code point → target code point):  

```
0180    0180
0001    0181
0002    0182
0003    0183
009C    0184
0009    0185
0086    0186
007F    0187
0097    0188
008D    0189
008E    018A
000B    018B
000C    018C
000D    018D
000E    018E
000F    018F
0010    0190
0011    0191
0012    0192
0013    0193
009D    0194
0085    0195
0008    0196
0087    0197
0018    0198
0019    0199
0092    019A
008F    019B
001C    019C
001D    019D
001E    019E
001F    019F
0080    01A0
0081    01A1
0082    01A2
0083    01A3
0084    01A4
000A    01A5
0017    01A6
001B    01A7
0088    01A8
0089    01A9
008A    01AA
008B    01AB
008C    01AC
0005    01AD
0006    01AE
0007    01AF
0090    01B0
0091    01B1
0016    01B2
0093    01B3
0094    01B4
0095    01B5
0096    01B6
0004    01B7
0098    01B8
0099    01B9
009A    01BA
009B    01BB
0014    01BC
0015    01BD
009E    01BE
001A    01BF
009F    027F
```
Corresponding input mapping for an AWS DMS task:  

```
{"SourceCharacterCodePoint": "0180","TargetCharacterCodePoint": "0180"}
,{"SourceCharacterCodePoint": "0001","TargetCharacterCodePoint": "0181"}
,{"SourceCharacterCodePoint": "0002","TargetCharacterCodePoint": "0182"}
,{"SourceCharacterCodePoint": "0003","TargetCharacterCodePoint": "0183"}
,{"SourceCharacterCodePoint": "009C","TargetCharacterCodePoint": "0184"}
,{"SourceCharacterCodePoint": "0009","TargetCharacterCodePoint": "0185"}
,{"SourceCharacterCodePoint": "0086","TargetCharacterCodePoint": "0186"}
,{"SourceCharacterCodePoint": "007F","TargetCharacterCodePoint": "0187"}
,{"SourceCharacterCodePoint": "0097","TargetCharacterCodePoint": "0188"}
,{"SourceCharacterCodePoint": "008D","TargetCharacterCodePoint": "0189"}
,{"SourceCharacterCodePoint": "008E","TargetCharacterCodePoint": "018A"}
,{"SourceCharacterCodePoint": "000B","TargetCharacterCodePoint": "018B"}
,{"SourceCharacterCodePoint": "000C","TargetCharacterCodePoint": "018C"}
,{"SourceCharacterCodePoint": "000D","TargetCharacterCodePoint": "018D"}
,{"SourceCharacterCodePoint": "000E","TargetCharacterCodePoint": "018E"}
,{"SourceCharacterCodePoint": "000F","TargetCharacterCodePoint": "018F"}
,{"SourceCharacterCodePoint": "0010","TargetCharacterCodePoint": "0190"}
,{"SourceCharacterCodePoint": "0011","TargetCharacterCodePoint": "0191"}
,{"SourceCharacterCodePoint": "0012","TargetCharacterCodePoint": "0192"}
,{"SourceCharacterCodePoint": "0013","TargetCharacterCodePoint": "0193"}
,{"SourceCharacterCodePoint": "009D","TargetCharacterCodePoint": "0194"}
,{"SourceCharacterCodePoint": "0085","TargetCharacterCodePoint": "0195"}
,{"SourceCharacterCodePoint": "0008","TargetCharacterCodePoint": "0196"}
,{"SourceCharacterCodePoint": "0087","TargetCharacterCodePoint": "0197"}
,{"SourceCharacterCodePoint": "0018","TargetCharacterCodePoint": "0198"}
,{"SourceCharacterCodePoint": "0019","TargetCharacterCodePoint": "0199"}
,{"SourceCharacterCodePoint": "0092","TargetCharacterCodePoint": "019A"}
,{"SourceCharacterCodePoint": "008F","TargetCharacterCodePoint": "019B"}
,{"SourceCharacterCodePoint": "001C","TargetCharacterCodePoint": "019C"}
,{"SourceCharacterCodePoint": "001D","TargetCharacterCodePoint": "019D"}
,{"SourceCharacterCodePoint": "001E","TargetCharacterCodePoint": "019E"}
,{"SourceCharacterCodePoint": "001F","TargetCharacterCodePoint": "019F"}
,{"SourceCharacterCodePoint": "0080","TargetCharacterCodePoint": "01A0"}
,{"SourceCharacterCodePoint": "0081","TargetCharacterCodePoint": "01A1"}
,{"SourceCharacterCodePoint": "0082","TargetCharacterCodePoint": "01A2"}
,{"SourceCharacterCodePoint": "0083","TargetCharacterCodePoint": "01A3"}
,{"SourceCharacterCodePoint": "0084","TargetCharacterCodePoint": "01A4"}
,{"SourceCharacterCodePoint": "000A","TargetCharacterCodePoint": "01A5"}
,{"SourceCharacterCodePoint": "0017","TargetCharacterCodePoint": "01A6"}
,{"SourceCharacterCodePoint": "001B","TargetCharacterCodePoint": "01A7"}
,{"SourceCharacterCodePoint": "0088","TargetCharacterCodePoint": "01A8"}
,{"SourceCharacterCodePoint": "0089","TargetCharacterCodePoint": "01A9"}
,{"SourceCharacterCodePoint": "008A","TargetCharacterCodePoint": "01AA"}
,{"SourceCharacterCodePoint": "008B","TargetCharacterCodePoint": "01AB"}
,{"SourceCharacterCodePoint": "008C","TargetCharacterCodePoint": "01AC"}
,{"SourceCharacterCodePoint": "0005","TargetCharacterCodePoint": "01AD"}
,{"SourceCharacterCodePoint": "0006","TargetCharacterCodePoint": "01AE"}
,{"SourceCharacterCodePoint": "0007","TargetCharacterCodePoint": "01AF"}
,{"SourceCharacterCodePoint": "0090","TargetCharacterCodePoint": "01B0"}
,{"SourceCharacterCodePoint": "0091","TargetCharacterCodePoint": "01B1"}
,{"SourceCharacterCodePoint": "0016","TargetCharacterCodePoint": "01B2"}
,{"SourceCharacterCodePoint": "0093","TargetCharacterCodePoint": "01B3"}
,{"SourceCharacterCodePoint": "0094","TargetCharacterCodePoint": "01B4"}
,{"SourceCharacterCodePoint": "0095","TargetCharacterCodePoint": "01B5"}
,{"SourceCharacterCodePoint": "0096","TargetCharacterCodePoint": "01B6"}
,{"SourceCharacterCodePoint": "0004","TargetCharacterCodePoint": "01B7"}
,{"SourceCharacterCodePoint": "0098","TargetCharacterCodePoint": "01B8"}
,{"SourceCharacterCodePoint": "0099","TargetCharacterCodePoint": "01B9"}
,{"SourceCharacterCodePoint": "009A","TargetCharacterCodePoint": "01BA"}
,{"SourceCharacterCodePoint": "009B","TargetCharacterCodePoint": "01BB"}
,{"SourceCharacterCodePoint": "0014","TargetCharacterCodePoint": "01BC"}
,{"SourceCharacterCodePoint": "0015","TargetCharacterCodePoint": "01BD"}
,{"SourceCharacterCodePoint": "009E","TargetCharacterCodePoint": "01BE"}
,{"SourceCharacterCodePoint": "001A","TargetCharacterCodePoint": "01BF"}
,{"SourceCharacterCodePoint": "009F","TargetCharacterCodePoint": "027F"}
```
**it-IT-280b** and **it-IT-1144b**  
Code Shift:  

```
0180    0180
0001    0181
0002    0182
0003    0183
009C    0184
0009    0185
0086    0186
007F    0187
0097    0188
008D    0189
008E    018A
000B    018B
000C    018C
000D    018D
000E    018E
000F    018F
0010    0190
0011    0191
0012    0192
0013    0193
009D    0194
0085    0195
0008    0196
0087    0197
0018    0198
0019    0199
0092    019A
008F    019B
001C    019C
001D    019D
001E    019E
001F    019F
0080    01A0
0081    01A1
0082    01A2
0083    01A3
0084    01A4
000A    01A5
0017    01A6
001B    01A7
0088    01A8
0089    01A9
008A    01AA
008B    01AB
008C    01AC
0005    01AD
0006    01AE
0007    01AF
0090    01B0
0091    01B1
0016    01B2
0093    01B3
0094    01B4
0095    01B5
0096    01B6
0004    01B7
0098    01B8
0099    01B9
009A    01BA
009B    01BB
0014    01BC
0015    01BD
009E    01BE
001A    01BF
009F    027F
```
Corresponding input mapping for an AWS DMS task:  

```
 {"SourceCharacterCodePoint": "0180","TargetCharacterCodePoint": "0180"}
,{"SourceCharacterCodePoint": "0001","TargetCharacterCodePoint": "0181"}
,{"SourceCharacterCodePoint": "0002","TargetCharacterCodePoint": "0182"}
,{"SourceCharacterCodePoint": "0003","TargetCharacterCodePoint": "0183"}
,{"SourceCharacterCodePoint": "009C","TargetCharacterCodePoint": "0184"}
,{"SourceCharacterCodePoint": "0009","TargetCharacterCodePoint": "0185"}
,{"SourceCharacterCodePoint": "0086","TargetCharacterCodePoint": "0186"}
,{"SourceCharacterCodePoint": "007F","TargetCharacterCodePoint": "0187"}
,{"SourceCharacterCodePoint": "0097","TargetCharacterCodePoint": "0188"}
,{"SourceCharacterCodePoint": "008D","TargetCharacterCodePoint": "0189"}
,{"SourceCharacterCodePoint": "008E","TargetCharacterCodePoint": "018A"}
,{"SourceCharacterCodePoint": "000B","TargetCharacterCodePoint": "018B"}
,{"SourceCharacterCodePoint": "000C","TargetCharacterCodePoint": "018C"}
,{"SourceCharacterCodePoint": "000D","TargetCharacterCodePoint": "018D"}
,{"SourceCharacterCodePoint": "000E","TargetCharacterCodePoint": "018E"}
,{"SourceCharacterCodePoint": "000F","TargetCharacterCodePoint": "018F"}
,{"SourceCharacterCodePoint": "0010","TargetCharacterCodePoint": "0190"}
,{"SourceCharacterCodePoint": "0011","TargetCharacterCodePoint": "0191"}
,{"SourceCharacterCodePoint": "0012","TargetCharacterCodePoint": "0192"}
,{"SourceCharacterCodePoint": "0013","TargetCharacterCodePoint": "0193"}
,{"SourceCharacterCodePoint": "009D","TargetCharacterCodePoint": "0194"}
,{"SourceCharacterCodePoint": "0085","TargetCharacterCodePoint": "0195"}
,{"SourceCharacterCodePoint": "0008","TargetCharacterCodePoint": "0196"}
,{"SourceCharacterCodePoint": "0087","TargetCharacterCodePoint": "0197"}
,{"SourceCharacterCodePoint": "0018","TargetCharacterCodePoint": "0198"}
,{"SourceCharacterCodePoint": "0019","TargetCharacterCodePoint": "0199"}
,{"SourceCharacterCodePoint": "0092","TargetCharacterCodePoint": "019A"}
,{"SourceCharacterCodePoint": "008F","TargetCharacterCodePoint": "019B"}
,{"SourceCharacterCodePoint": "001C","TargetCharacterCodePoint": "019C"}
,{"SourceCharacterCodePoint": "001D","TargetCharacterCodePoint": "019D"}
,{"SourceCharacterCodePoint": "001E","TargetCharacterCodePoint": "019E"}
,{"SourceCharacterCodePoint": "001F","TargetCharacterCodePoint": "019F"}
,{"SourceCharacterCodePoint": "0080","TargetCharacterCodePoint": "01A0"}
,{"SourceCharacterCodePoint": "0081","TargetCharacterCodePoint": "01A1"}
,{"SourceCharacterCodePoint": "0082","TargetCharacterCodePoint": "01A2"}
,{"SourceCharacterCodePoint": "0083","TargetCharacterCodePoint": "01A3"}
,{"SourceCharacterCodePoint": "0084","TargetCharacterCodePoint": "01A4"}
,{"SourceCharacterCodePoint": "000A","TargetCharacterCodePoint": "01A5"}
,{"SourceCharacterCodePoint": "0017","TargetCharacterCodePoint": "01A6"}
,{"SourceCharacterCodePoint": "001B","TargetCharacterCodePoint": "01A7"}
,{"SourceCharacterCodePoint": "0088","TargetCharacterCodePoint": "01A8"}
,{"SourceCharacterCodePoint": "0089","TargetCharacterCodePoint": "01A9"}
,{"SourceCharacterCodePoint": "008A","TargetCharacterCodePoint": "01AA"}
,{"SourceCharacterCodePoint": "008B","TargetCharacterCodePoint": "01AB"}
,{"SourceCharacterCodePoint": "008C","TargetCharacterCodePoint": "01AC"}
,{"SourceCharacterCodePoint": "0005","TargetCharacterCodePoint": "01AD"}
,{"SourceCharacterCodePoint": "0006","TargetCharacterCodePoint": "01AE"}
,{"SourceCharacterCodePoint": "0007","TargetCharacterCodePoint": "01AF"}
,{"SourceCharacterCodePoint": "0090","TargetCharacterCodePoint": "01B0"}
,{"SourceCharacterCodePoint": "0091","TargetCharacterCodePoint": "01B1"}
,{"SourceCharacterCodePoint": "0016","TargetCharacterCodePoint": "01B2"}
,{"SourceCharacterCodePoint": "0093","TargetCharacterCodePoint": "01B3"}
,{"SourceCharacterCodePoint": "0094","TargetCharacterCodePoint": "01B4"}
,{"SourceCharacterCodePoint": "0095","TargetCharacterCodePoint": "01B5"}
,{"SourceCharacterCodePoint": "0096","TargetCharacterCodePoint": "01B6"}
,{"SourceCharacterCodePoint": "0004","TargetCharacterCodePoint": "01B7"}
,{"SourceCharacterCodePoint": "0098","TargetCharacterCodePoint": "01B8"}
,{"SourceCharacterCodePoint": "0099","TargetCharacterCodePoint": "01B9"}
,{"SourceCharacterCodePoint": "009A","TargetCharacterCodePoint": "01BA"}
,{"SourceCharacterCodePoint": "009B","TargetCharacterCodePoint": "01BB"}
,{"SourceCharacterCodePoint": "0014","TargetCharacterCodePoint": "01BC"}
,{"SourceCharacterCodePoint": "0015","TargetCharacterCodePoint": "01BD"}
,{"SourceCharacterCodePoint": "009E","TargetCharacterCodePoint": "01BE"}
,{"SourceCharacterCodePoint": "001A","TargetCharacterCodePoint": "01BF"}
,{"SourceCharacterCodePoint": "009F","TargetCharacterCodePoint": "027F"}
```
**nl-BE-500b** and **nl-BE-1148b**  
Code Shift:  

```
0180    0180
0001    0181
0002    0182
0003    0183
009C    0184
0009    0185
0086    0186
007F    0187
0097    0188
008D    0189
008E    018A
000B    018B
000C    018C
000D    018D
000E    018E
000F    018F
0010    0190
0011    0191
0012    0192
0013    0193
009D    0194
0085    0195
0008    0196
0087    0197
0018    0198
0019    0199
0092    019A
008F    019B
001C    019C
001D    019D
001E    019E
001F    019F
0080    01A0
0081    01A1
0082    01A2
0083    01A3
0084    01A4
000A    01A5
0017    01A6
001B    01A7
0088    01A8
0089    01A9
008A    01AA
008B    01AB
008C    01AC
0005    01AD
0006    01AE
0007    01AF
0090    01B0
0091    01B1
0016    01B2
0093    01B3
0094    01B4
0095    01B5
0096    01B6
0004    01B7
0098    01B8
0099    01B9
009A    01BA
009B    01BB
0014    01BC
0015    01BD
009E    01BE
001A    01BF
009F    027F
```
Corresponding input mapping for an AWS DMS task:  

```
 {"SourceCharacterCodePoint": "0180","TargetCharacterCodePoint": "0180"}
,{"SourceCharacterCodePoint": "0001","TargetCharacterCodePoint": "0181"}
,{"SourceCharacterCodePoint": "0002","TargetCharacterCodePoint": "0182"}
,{"SourceCharacterCodePoint": "0003","TargetCharacterCodePoint": "0183"}
,{"SourceCharacterCodePoint": "009C","TargetCharacterCodePoint": "0184"}
,{"SourceCharacterCodePoint": "0009","TargetCharacterCodePoint": "0185"}
,{"SourceCharacterCodePoint": "0086","TargetCharacterCodePoint": "0186"}
,{"SourceCharacterCodePoint": "007F","TargetCharacterCodePoint": "0187"}
,{"SourceCharacterCodePoint": "0097","TargetCharacterCodePoint": "0188"}
,{"SourceCharacterCodePoint": "008D","TargetCharacterCodePoint": "0189"}
,{"SourceCharacterCodePoint": "008E","TargetCharacterCodePoint": "018A"}
,{"SourceCharacterCodePoint": "000B","TargetCharacterCodePoint": "018B"}
,{"SourceCharacterCodePoint": "000C","TargetCharacterCodePoint": "018C"}
,{"SourceCharacterCodePoint": "000D","TargetCharacterCodePoint": "018D"}
,{"SourceCharacterCodePoint": "000E","TargetCharacterCodePoint": "018E"}
,{"SourceCharacterCodePoint": "000F","TargetCharacterCodePoint": "018F"}
,{"SourceCharacterCodePoint": "0010","TargetCharacterCodePoint": "0190"}
,{"SourceCharacterCodePoint": "0011","TargetCharacterCodePoint": "0191"}
,{"SourceCharacterCodePoint": "0012","TargetCharacterCodePoint": "0192"}
,{"SourceCharacterCodePoint": "0013","TargetCharacterCodePoint": "0193"}
,{"SourceCharacterCodePoint": "009D","TargetCharacterCodePoint": "0194"}
,{"SourceCharacterCodePoint": "0085","TargetCharacterCodePoint": "0195"}
,{"SourceCharacterCodePoint": "0008","TargetCharacterCodePoint": "0196"}
,{"SourceCharacterCodePoint": "0087","TargetCharacterCodePoint": "0197"}
,{"SourceCharacterCodePoint": "0018","TargetCharacterCodePoint": "0198"}
,{"SourceCharacterCodePoint": "0019","TargetCharacterCodePoint": "0199"}
,{"SourceCharacterCodePoint": "0092","TargetCharacterCodePoint": "019A"}
,{"SourceCharacterCodePoint": "008F","TargetCharacterCodePoint": "019B"}
,{"SourceCharacterCodePoint": "001C","TargetCharacterCodePoint": "019C"}
,{"SourceCharacterCodePoint": "001D","TargetCharacterCodePoint": "019D"}
,{"SourceCharacterCodePoint": "001E","TargetCharacterCodePoint": "019E"}
,{"SourceCharacterCodePoint": "001F","TargetCharacterCodePoint": "019F"}
,{"SourceCharacterCodePoint": "0080","TargetCharacterCodePoint": "01A0"}
,{"SourceCharacterCodePoint": "0081","TargetCharacterCodePoint": "01A1"}
,{"SourceCharacterCodePoint": "0082","TargetCharacterCodePoint": "01A2"}
,{"SourceCharacterCodePoint": "0083","TargetCharacterCodePoint": "01A3"}
,{"SourceCharacterCodePoint": "0084","TargetCharacterCodePoint": "01A4"}
,{"SourceCharacterCodePoint": "000A","TargetCharacterCodePoint": "01A5"}
,{"SourceCharacterCodePoint": "0017","TargetCharacterCodePoint": "01A6"}
,{"SourceCharacterCodePoint": "001B","TargetCharacterCodePoint": "01A7"}
,{"SourceCharacterCodePoint": "0088","TargetCharacterCodePoint": "01A8"}
,{"SourceCharacterCodePoint": "0089","TargetCharacterCodePoint": "01A9"}
,{"SourceCharacterCodePoint": "008A","TargetCharacterCodePoint": "01AA"}
,{"SourceCharacterCodePoint": "008B","TargetCharacterCodePoint": "01AB"}
,{"SourceCharacterCodePoint": "008C","TargetCharacterCodePoint": "01AC"}
,{"SourceCharacterCodePoint": "0005","TargetCharacterCodePoint": "01AD"}
,{"SourceCharacterCodePoint": "0006","TargetCharacterCodePoint": "01AE"}
,{"SourceCharacterCodePoint": "0007","TargetCharacterCodePoint": "01AF"}
,{"SourceCharacterCodePoint": "0090","TargetCharacterCodePoint": "01B0"}
,{"SourceCharacterCodePoint": "0091","TargetCharacterCodePoint": "01B1"}
,{"SourceCharacterCodePoint": "0016","TargetCharacterCodePoint": "01B2"}
,{"SourceCharacterCodePoint": "0093","TargetCharacterCodePoint": "01B3"}
,{"SourceCharacterCodePoint": "0094","TargetCharacterCodePoint": "01B4"}
,{"SourceCharacterCodePoint": "0095","TargetCharacterCodePoint": "01B5"}
,{"SourceCharacterCodePoint": "0096","TargetCharacterCodePoint": "01B6"}
,{"SourceCharacterCodePoint": "0004","TargetCharacterCodePoint": "01B7"}
,{"SourceCharacterCodePoint": "0098","TargetCharacterCodePoint": "01B8"}
,{"SourceCharacterCodePoint": "0099","TargetCharacterCodePoint": "01B9"}
,{"SourceCharacterCodePoint": "009A","TargetCharacterCodePoint": "01BA"}
,{"SourceCharacterCodePoint": "009B","TargetCharacterCodePoint": "01BB"}
,{"SourceCharacterCodePoint": "0014","TargetCharacterCodePoint": "01BC"}
,{"SourceCharacterCodePoint": "0015","TargetCharacterCodePoint": "01BD"}
,{"SourceCharacterCodePoint": "009E","TargetCharacterCodePoint": "01BE"}
,{"SourceCharacterCodePoint": "001A","TargetCharacterCodePoint": "01BF"}
,{"SourceCharacterCodePoint": "009F","TargetCharacterCodePoint": "027F"}
```

# Targets for data migration
<a name="CHAP_Target"></a>

AWS Database Migration Service (AWS DMS) can use many of the most popular databases as a target for data replication. The target can be on an Amazon Elastic Compute Cloud (Amazon EC2) instance, an Amazon Relational Database Service (Amazon RDS) instance, or an on-premises database. 

For a comprehensive list of valid targets, see [Targets for AWS DMS](CHAP_Introduction.Targets.md).

**Note**  
AWS DMS doesn't support migration across AWS Regions for the following target endpoint types:  
Amazon DynamoDB
Amazon OpenSearch Service
Amazon Kinesis Data Streams
Amazon Aurora PostgreSQL Limitless is available as a target for AWS Database Migration Service (AWS DMS). For more information, see [Using a PostgreSQL database as a target for AWS Database Migration Service](https://docs.aws.amazon.com/dms/latest/userguide/CHAP_Target.PostgreSQL.html).

**Topics**
+ [

# Using an Oracle database as a target for AWS Database Migration Service
](CHAP_Target.Oracle.md)
+ [

# Using a Microsoft SQL Server database as a target for AWS Database Migration Service
](CHAP_Target.SQLServer.md)
+ [

# Using a PostgreSQL database as a target for AWS Database Migration Service
](CHAP_Target.PostgreSQL.md)
+ [

# Using a MySQL-compatible database as a target for AWS Database Migration Service
](CHAP_Target.MySQL.md)
+ [

# Using an Amazon Redshift database as a target for AWS Database Migration Service
](CHAP_Target.Redshift.md)
+ [

# Using a SAP ASE database as a target for AWS Database Migration Service
](CHAP_Target.SAP.md)
+ [

# Using Amazon S3 as a target for AWS Database Migration Service
](CHAP_Target.S3.md)
+ [

# Using an Amazon DynamoDB database as a target for AWS Database Migration Service
](CHAP_Target.DynamoDB.md)
+ [

# Using Amazon Kinesis Data Streams as a target for AWS Database Migration Service
](CHAP_Target.Kinesis.md)
+ [

# Using Apache Kafka as a target for AWS Database Migration Service
](CHAP_Target.Kafka.md)
+ [

# Using an Amazon OpenSearch Service cluster as a target for AWS Database Migration Service
](CHAP_Target.Elasticsearch.md)
+ [

# Using Amazon DocumentDB as a target for AWS Database Migration Service
](CHAP_Target.DocumentDB.md)
+ [

# Using Amazon Neptune as a target for AWS Database Migration Service
](CHAP_Target.Neptune.md)
+ [

# Using Redis OSS as a target for AWS Database Migration Service
](CHAP_Target.Redis.md)
+ [

# Using Babelfish as a target for AWS Database Migration Service
](CHAP_Target.Babelfish.md)
+ [

# Using Amazon Timestream as a target for AWS Database Migration Service
](CHAP_Target.Timestream.md)
+ [

# Using Amazon RDS for Db2 and IBM Db2 LUW as a target for AWS DMS
](CHAP_Target.DB2.md)

# Using an Oracle database as a target for AWS Database Migration Service
<a name="CHAP_Target.Oracle"></a>

You can migrate data to Oracle database targets using AWS DMS, either from another Oracle database or from one of the other supported databases. You can use Secure Sockets Layer (SSL) to encrypt connections between your Oracle endpoint and the replication instance. For more information on using SSL with an Oracle endpoint, see [Using SSL with AWS Database Migration Service](CHAP_Security.SSL.md). AWS DMS also supports the use of Oracle transparent data encryption (TDE) to encrypt data at rest in the target database because Oracle TDE does not require an encryption key or password to write to the database.

For information about versions of Oracle that AWS DMS supports as a target, see [Targets for AWS DMS](CHAP_Introduction.Targets.md). 

When you use Oracle as a target, we assume that the data is to be migrated into the schema or user that is used for the target connection. If you want to migrate data to a different schema, use a schema transformation to do so. For example, suppose that your target endpoint connects to the user `RDSMASTER` and you want to migrate from the user `PERFDATA1` to `PERFDATA2`. In this case, create a transformation like the following.

```
{
   "rule-type": "transformation",
   "rule-id": "2",
   "rule-name": "2",
   "rule-action": "rename",
   "rule-target": "schema",
   "object-locator": {
   "schema-name": "PERFDATA1"
},
"value": "PERFDATA2"
}
```

When using Oracle as a target, AWS DMS migrates all tables and indexes to default table and index tablespaces in the target. If you want to migrate tables and indexes to different table and index tablespaces, use a tablespace transformation to do so. For example, suppose that you have a set of tables in the `INVENTORY` schema assigned to some tablespaces in the Oracle source. For the migration, you want to assign all of these tables to a single `INVENTORYSPACE` tablespace in the target. In this case, create a transformation like the following.

```
{
   "rule-type": "transformation",
   "rule-id": "3",
   "rule-name": "3",
   "rule-action": "rename",
   "rule-target": "table-tablespace",
   "object-locator": {
      "schema-name": "INVENTORY",
      "table-name": "%",
      "table-tablespace-name": "%"
   },
   "value": "INVENTORYSPACE"
}
```

For more information about transformations, see [Specifying table selection and transformations rules using JSON](CHAP_Tasks.CustomizingTasks.TableMapping.SelectionTransformation.md).

If Oracle is both source and target, you can preserve existing table or index tablespace assignments by setting the Oracle source extra connection attribute `enableHomogenousTablespace=true`. For more information, see [Endpoint settings when using Oracle as a source for AWS DMS](CHAP_Source.Oracle.md#CHAP_Source.Oracle.ConnectionAttrib).

For additional details on working with Oracle databases as a target for AWS DMS, see the following sections: 

**Topics**
+ [

## Limitations on Oracle as a target for AWS Database Migration Service
](#CHAP_Target.Oracle.Limitations)
+ [

## User account privileges required for using Oracle as a target
](#CHAP_Target.Oracle.Privileges)
+ [

## Configuring an Oracle database as a target for AWS Database Migration Service
](#CHAP_Target.Oracle.Configuration)
+ [

## Endpoint settings when using Oracle as a target for AWS DMS
](#CHAP_Target.Oracle.ConnectionAttrib)
+ [

## Target data types for Oracle
](#CHAP_Target.Oracle.DataTypes)

## Limitations on Oracle as a target for AWS Database Migration Service
<a name="CHAP_Target.Oracle.Limitations"></a>

Limitations when using Oracle as a target for data migration include the following:
+ AWS DMS doesn't create schemas on the target Oracle database, so any schema that you want to use on the target must already exist. Tables from the source schema are imported to the user or schema that AWS DMS uses to connect to the target instance. To migrate multiple schemas, create multiple replication tasks. You can also migrate data to different schemas on a target by using schema transformation rules in the AWS DMS table mappings.
+ AWS DMS doesn't support the `Use direct path full load` option for tables with INDEXTYPE CONTEXT. As a workaround, you can use array load. 
+ With the batch optimized apply option, loading into the net changes table uses a direct path, which doesn't support XML type. As a workaround, you can use transactional apply mode.
+ Empty strings migrated from source databases can be treated differently by the Oracle target (converted to one-space strings, for example). This can result in AWS DMS validation reporting a mismatch.
+ In batch optimized apply mode, the total number of columns per table that AWS DMS supports can be expressed using the following formula:

  ```
  2 * columns_in_original_table + columns_in_primary_key <= 999
  ```

  For example, if the original table has 25 columns and its primary key consists of 5 columns, the total is 2 × 25 + 5 = 55, well under the limit. If a table exceeds the supported number of columns, all of the changes are applied in one-by-one mode.
+ AWS DMS doesn't support Autonomous DB on Oracle Cloud Infrastructure (OCI).
+ In transactional apply mode, an Oracle target can process DML statements up to 32 KB in size. While this limit is sufficient for many use cases, DML statements exceeding 32 KB will fail with the error: "ORA-01460: unimplemented or unreasonable conversion requested." To resolve this issue, you must enable the batch apply feature by setting the `BatchApplyEnabled` task setting to `true`. Batch apply reduces the overall statement size, allowing you to bypass the 32 KB limitation. For more information, see [Target metadata task settings](CHAP_Tasks.CustomizingTasks.TaskSettings.TargetMetadata.md).
+ A direct path full load of tables that contain LOB columns might fail with error ORA-39777, because LOB data requires special handling during the direct path load process. This error can disrupt migration tasks that involve LOB columns. To resolve it, disable the `useDirectPathFullLoad` setting on the target endpoint and retry the load operation.
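
The batch optimized apply column limit above can be checked ahead of time. The following is a minimal sketch; the function name is illustrative:

```python
def fits_batch_apply(table_columns, pk_columns):
    """Return True if a table stays within the batch optimized apply
    limit: 2 * columns_in_original_table + columns_in_primary_key <= 999."""
    return 2 * table_columns + pk_columns <= 999

# The example from the text: 25 columns, 5 of them in the primary key.
print(fits_batch_apply(25, 5))   # True: 2*25 + 5 = 55
```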

## User account privileges required for using Oracle as a target
<a name="CHAP_Target.Oracle.Privileges"></a>

To use an Oracle target in an AWS Database Migration Service task, grant the following privileges in the Oracle database. You grant these to the user account specified in the Oracle database definitions for AWS DMS.
+ SELECT ANY TRANSACTION 
+ SELECT on V\$NLS\_PARAMETERS 
+ SELECT on V\$TIMEZONE\_NAMES 
+ SELECT on ALL\_INDEXES 
+ SELECT on ALL\_OBJECTS 
+ SELECT on DBA\_OBJECTS
+ SELECT on ALL\_TABLES 
+ SELECT on ALL\_USERS 
+ SELECT on ALL\_CATALOG 
+ SELECT on ALL\_CONSTRAINTS 
+ SELECT on ALL\_CONS\_COLUMNS 
+ SELECT on ALL\_TAB\_COLS 
+ SELECT on ALL\_IND\_COLUMNS 
+ DROP ANY TABLE 
+ SELECT ANY TABLE
+ INSERT ANY TABLE 
+ UPDATE ANY TABLE
+ CREATE ANY VIEW
+ DROP ANY VIEW
+ CREATE ANY PROCEDURE
+ ALTER ANY PROCEDURE
+ DROP ANY PROCEDURE
+ CREATE ANY SEQUENCE
+ ALTER ANY SEQUENCE
+ DROP ANY SEQUENCE 
+ DELETE ANY TABLE

For the following requirements, grant these additional privileges:
+ To use a specific table list, grant SELECT and ALTER on each replicated table.
+ To allow a user to create a table in a default tablespace, grant the UNLIMITED TABLESPACE privilege.
+ For logon, grant the CREATE SESSION privilege.
+ If you are using a direct path (which is the default for full load), run `GRANT LOCK ANY TABLE TO dms_user;`.
+ If the target table schema differs from the DMS user's when using "DROP and CREATE" table preparation mode, run `GRANT CREATE ANY INDEX TO dms_user;`.
+ For some full load scenarios, you might choose the "DROP and CREATE table" or "TRUNCATE before loading" option where a target table schema is different from the DMS user's. In this case, grant DROP ANY TABLE.
+ To store changes in change tables or an audit table where the target table schema is different from the DMS user's, grant CREATE ANY TABLE and CREATE ANY INDEX.
+ To validate LOB columns with the validation feature, grant EXECUTE on `SYS.DBMS_CRYPTO` to the DMS user.

### Read privileges required for AWS Database Migration Service on the target database
<a name="CHAP_Target.Oracle.Privileges.Read"></a>

The AWS DMS user account must be granted read permissions for the following DBA tables:
+ SELECT on DBA\_USERS
+ SELECT on DBA\_TAB\_PRIVS
+ SELECT on DBA\_OBJECTS
+ SELECT on DBA\_SYNONYMS
+ SELECT on DBA\_SEQUENCES
+ SELECT on DBA\_TYPES
+ SELECT on DBA\_INDEXES
+ SELECT on DBA\_TABLES
+ SELECT on DBA\_TRIGGERS
+ SELECT on SYS.DBA\_REGISTRY

If any of the required privileges cannot be granted to V\$xxx, then grant them to V\_\$xxx.

### Premigration assessments
<a name="CHAP_Target.Oracle.Privileges.Premigration"></a>

To use the premigration assessments listed in [Oracle assessments](CHAP_Tasks.AssessmentReport.Oracle.md) with Oracle as a target, you must add the following permissions to the user account specified in the Oracle database target endpoint:

```
GRANT SELECT ON V_$INSTANCE TO dms_user;
GRANT EXECUTE ON SYS.DBMS_XMLGEN TO dms_user;
```

## Configuring an Oracle database as a target for AWS Database Migration Service
<a name="CHAP_Target.Oracle.Configuration"></a>

Before using an Oracle database as a data migration target, you must provide an Oracle user account to AWS DMS. The user account must have read/write privileges on the Oracle database, as specified in [User account privileges required for using Oracle as a target](#CHAP_Target.Oracle.Privileges).

## Endpoint settings when using Oracle as a target for AWS DMS
<a name="CHAP_Target.Oracle.ConnectionAttrib"></a>

You can use endpoint settings to configure your Oracle target database, much as you would use extra connection attributes. You specify the settings when you create the target endpoint using the AWS DMS console, or by using the `create-endpoint` command in the [AWS CLI](https://docs.aws.amazon.com/cli/latest/reference/dms/index.html) with the `--oracle-settings '{"EndpointSetting": "value", ...}'` JSON syntax.

The following table shows the endpoint settings that you can use with Oracle as a target.

[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/dms/latest/userguide/CHAP_Target.Oracle.html)
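
As an illustration of the `--oracle-settings` syntax, the following sketch assembles a quoted settings argument for the shell. `UseDirectPathFullLoad` corresponds to the `useDirectPathFullLoad` setting discussed in the limitations above; the helper name is ours, and the payload is an example, not a recommended configuration:

```python
import json
import shlex

def oracle_settings_arg(settings):
    """Render a settings dict as the shell-quoted value to pass after
    --oracle-settings in an `aws dms create-endpoint` command."""
    return shlex.quote(json.dumps(settings))

# Disable direct path full load (see the LOB limitation earlier).
print("--oracle-settings " + oracle_settings_arg({"UseDirectPathFullLoad": False}))
# prints: --oracle-settings '{"UseDirectPathFullLoad": false}'
```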

## Target data types for Oracle
<a name="CHAP_Target.Oracle.DataTypes"></a>

A target Oracle database used with AWS DMS supports most Oracle data types. The following table shows the Oracle target data types that are supported when using AWS DMS and the default mapping from AWS DMS data types. For more information about how to view the data type that is mapped from the source, see the section for the source you are using.


|  AWS DMS data type  |  Oracle data type  | 
| --- | --- | 
|  BOOLEAN  |  NUMBER (1)  | 
|  BYTES  |  RAW (length)  | 
|  DATE  |  DATETIME  | 
|  TIME  | TIMESTAMP (0) | 
|  DATETIME  |  TIMESTAMP (scale)  | 
|  INT1  | NUMBER (3) | 
|  INT2  |  NUMBER (5)  | 
|  INT4  | NUMBER (10) | 
|  INT8  |  NUMBER (19)  | 
|  NUMERIC  |  NUMBER (p,s)  | 
|  REAL4  |  FLOAT  | 
|  REAL8  | FLOAT | 
|  STRING  |  With date indication: DATE  With time indication: TIMESTAMP  With timestamp indication: TIMESTAMP  With timestamp\_with\_timezone indication: TIMESTAMP WITH TIMEZONE  With timestamp\_with\_local\_timezone indication: TIMESTAMP WITH LOCAL TIMEZONE  With interval\_year\_to\_month indication: INTERVAL YEAR TO MONTH  With interval\_day\_to\_second indication: INTERVAL DAY TO SECOND  If length > 4000: CLOB  In all other cases: VARCHAR2 (length)  | 
|  UINT1  |  NUMBER (3)  | 
|  UINT2  |  NUMBER (5)  | 
|  UINT4  |  NUMBER (10)  | 
|  UINT8  |  NUMBER (19)  | 
|  WSTRING  |  If length > 2000: NCLOB In all other cases: NVARCHAR2 (length)  | 
|  BLOB  |  BLOB To use this data type with AWS DMS, you must enable the use of BLOBs for a specific task. BLOB data types are supported only in tables that include a primary key.  | 
|  CLOB  |  CLOB To use this data type with AWS DMS, you must enable the use of CLOBs for a specific task. During change data capture (CDC), CLOB data types are supported only in tables that include a primary key. STRING An Oracle VARCHAR2 data type on the source with a declared size greater than 4000 bytes maps through the AWS DMS CLOB to a STRING on the Oracle target.  | 
|  NCLOB  |  NCLOB To use this data type with AWS DMS, you must enable the use of NCLOBs for a specific task. During CDC, NCLOB data types are supported only in tables that include a primary key. WSTRING An Oracle VARCHAR2 data type on the source with a declared size greater than 4000 bytes maps through the AWS DMS NCLOB to a WSTRING on the Oracle target.   | 
| XMLTYPE |  The XMLTYPE target data type is only relevant in Oracle-to-Oracle replication tasks. When the source database is Oracle, the source data types are replicated as-is to the Oracle target. For example, an XMLTYPE data type on the source is created as an XMLTYPE data type on the target.  | 

# Using a Microsoft SQL Server database as a target for AWS Database Migration Service
<a name="CHAP_Target.SQLServer"></a>

You can migrate data to Microsoft SQL Server databases using AWS DMS. With an SQL Server database as a target, you can migrate data from either another SQL Server database or one of the other supported databases.

For information about versions of SQL Server that AWS DMS supports as a target, see [Targets for AWS DMS](CHAP_Introduction.Targets.md). 

AWS DMS supports the Enterprise, Standard, Workgroup, and Developer editions of SQL Server, both on premises and on Amazon RDS.

For additional details on working with AWS DMS and SQL Server target databases, see the following.

**Topics**
+ [

## Limitations on using SQL Server as a target for AWS Database Migration Service
](#CHAP_Target.SQLServer.Limitations)
+ [

## Security requirements when using SQL Server as a target for AWS Database Migration Service
](#CHAP_Target.SQLServer.Security)
+ [

## Endpoint settings when using SQL Server as a target for AWS DMS
](#CHAP_Target.SQLServer.ConnectionAttrib)
+ [

## Target data types for Microsoft SQL Server
](#CHAP_Target.SQLServer.DataTypes)

## Limitations on using SQL Server as a target for AWS Database Migration Service
<a name="CHAP_Target.SQLServer.Limitations"></a>

The following limitations apply when using a SQL Server database as a target for AWS DMS:
+ When you manually create a SQL Server target table with a computed column, full load replication is not supported when using the BCP bulk-copy utility. To use full load replication, disable BCP loading by setting the extra connection attribute (ECA) `'useBCPFullLoad=false'` on the endpoint. For information about setting ECAs on endpoints, see [Creating source and target endpoints](CHAP_Endpoints.Creating.md). For more information on working with BCP, see the [Microsoft SQL Server documentation](https://docs.microsoft.com/en-us/sql/relational-databases/import-export/import-and-export-bulk-data-by-using-the-bcp-utility-sql-server).
+ When replicating tables with SQL Server spatial data types (GEOMETRY and GEOGRAPHY), AWS DMS replaces any spatial reference identifier (SRID) that you might have inserted with the default SRID. The default SRID is 0 for GEOMETRY and 4326 for GEOGRAPHY.
+ Temporal tables aren't supported. Migrating temporal tables might work with a replication-only task in transactional apply mode if you manually create those tables on the target.
+ Currently, `boolean` data types in a PostgreSQL source are migrated to a SQL Server target as the `bit` data type with inconsistent values. 

  As a workaround, do one of the following:
  + Precreate the table with a `VARCHAR(1)` data type for the column (or let AWS DMS create the table). Then have downstream processing treat an "F" as False and a "T" as True.
  + To avoid changing downstream processing, add a transformation rule to the task that changes the "F" values to "0" and the "T" values to "1", and stores them as the SQL Server `bit` data type.
+ AWS DMS doesn't support change processing to set column nullability (using the `ALTER COLUMN [SET|DROP] NOT NULL` clause with `ALTER TABLE` statements).
+ Windows Authentication isn't supported.
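
For example, to turn off BCP loading on an existing SQL Server target endpoint as described in the first limitation, you can set the ECA with the AWS CLI. The following is a sketch only; the endpoint ARN shown is a placeholder.

```
aws dms modify-endpoint \
    --endpoint-arn arn:aws:dms:us-east-1:123456789012:endpoint:EXAMPLE \
    --extra-connection-attributes "useBCPFullLoad=false"
```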

## Security requirements when using SQL Server as a target for AWS Database Migration Service
<a name="CHAP_Target.SQLServer.Security"></a>

The following describes the security requirements for using AWS DMS with a Microsoft SQL Server target:
+ The AWS DMS user account must have at least the `db_owner` user role on the SQL Server database that you are connecting to.
+ A SQL Server system administrator must provide this permission to all AWS DMS user accounts.

## Endpoint settings when using SQL Server as a target for AWS DMS
<a name="CHAP_Target.SQLServer.ConnectionAttrib"></a>

You can use endpoint settings to configure your SQL Server target database similar to using extra connection attributes. You specify the settings when you create the target endpoint using the AWS DMS console, or by using the `create-endpoint` command in the [AWS CLI](https://docs.aws.amazon.com/cli/latest/reference/dms/index.html), with the `--microsoft-sql-server-settings '{"EndpointSetting": "value", ...}'` JSON syntax.

The following table shows the endpoint settings that you can use with SQL Server as a target.

[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/dms/latest/userguide/CHAP_Target.SQLServer.html)
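
For example, the following AWS CLI command sketches creating a SQL Server target endpoint with endpoint settings. The connection details are placeholders, and the `UseBcpFullLoad` and `BcpPacketSize` values are illustrative assumptions; adjust them for your environment.

```
aws dms create-endpoint \
    --endpoint-identifier sqlserver-target \
    --endpoint-type target \
    --engine-name sqlserver \
    --server-name sqlserver.example.com \
    --port 1433 \
    --database-name targetdb \
    --username admin \
    --password "your-password" \
    --microsoft-sql-server-settings '{"UseBcpFullLoad": true, "BcpPacketSize": 32768}'
```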

## Target data types for Microsoft SQL Server
<a name="CHAP_Target.SQLServer.DataTypes"></a>

The following table shows the Microsoft SQL Server target data types that are supported when using AWS DMS and the default mapping from AWS DMS data types. For additional information about AWS DMS data types, see [Data types for AWS Database Migration Service](CHAP_Reference.DataTypes.md).


|  AWS DMS data type  |  SQL Server data type  | 
| --- | --- | 
|  BOOLEAN  |  TINYINT  | 
|  BYTES  |  VARBINARY(length)  | 
|  DATE  |  For SQL Server 2008 and higher, use DATE. For earlier versions, if the scale is 3 or less use DATETIME. In all other cases, use VARCHAR (37).  | 
|  TIME  |  For SQL Server 2008 and higher, use DATETIME2 (%d). For earlier versions, if the scale is 3 or less use DATETIME. In all other cases, use VARCHAR (37).  | 
|  DATETIME  |  For SQL Server 2008 and higher, use DATETIME2 (scale).  For earlier versions, if the scale is 3 or less use DATETIME. In all other cases, use VARCHAR (37).  | 
|  INT1  | SMALLINT | 
|  INT2  |  SMALLINT  | 
|  INT4  | INT | 
|  INT8  |  BIGINT  | 
|  NUMERIC  |  NUMERIC (p,s)  | 
|  REAL4  |  REAL  | 
|  REAL8  | FLOAT | 
|  STRING  |  If the column is a date or time column, then do the following:  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/dms/latest/userguide/CHAP_Target.SQLServer.html) If the column is not a date or time column, use VARCHAR (length).  | 
|  UINT1  |  TINYINT  | 
|  UINT2  |  SMALLINT  | 
|  UINT4  |  INT  | 
|  UINT8  |  BIGINT  | 
|  WSTRING  |  NVARCHAR (length)  | 
|  BLOB  |  VARBINARY(max) IMAGE To use this data type with AWS DMS, you must enable the use of BLOBs for a specific task. AWS DMS supports BLOB data types only in tables that include a primary key.  | 
|  CLOB  |  VARCHAR(max) To use this data type with AWS DMS, you must enable the use of CLOBs for a specific task. During change data capture (CDC), AWS DMS supports CLOB data types only in tables that include a primary key.  | 
|  NCLOB  |  NVARCHAR(max) To use this data type with AWS DMS, you must enable the use of NCLOBs for a specific task. During CDC, AWS DMS supports NCLOB data types only in tables that include a primary key.  | 

# Using a PostgreSQL database as a target for AWS Database Migration Service
<a name="CHAP_Target.PostgreSQL"></a>

You can migrate data to PostgreSQL databases using AWS DMS, either from another PostgreSQL database or from one of the other supported databases. 

For information about versions of PostgreSQL that AWS DMS supports as a target, see [Targets for AWS DMS](CHAP_Introduction.Targets.md).

**Note**  
Amazon Aurora Serverless is available as a target for Amazon Aurora with PostgreSQL compatibility. For more information about Amazon Aurora Serverless, see [Using Amazon Aurora Serverless v2](https://docs.aws.amazon.com//AmazonRDS/latest/AuroraUserGuide/aurora-serverless-v2.html) in the *Amazon Aurora User Guide*.
Aurora Serverless DB clusters are accessible only from an Amazon VPC and can't use a [public IP address](https://docs.aws.amazon.com//AmazonRDS/latest/AuroraUserGuide/aurora-serverless-v2.requirements.html). So, if you intend to have a replication instance in a different AWS Region than Aurora PostgreSQL Serverless, you must configure [VPC peering](https://docs.aws.amazon.com//dms/latest/userguide/CHAP_ReplicationInstance.VPC.html#CHAP_ReplicationInstance.VPC.Configurations.ScenarioVPCPeer). Otherwise, check the [Regions](https://docs.aws.amazon.com//AmazonRDS/latest/AuroraUserGuide/Concepts.AuroraFeaturesRegionsDBEngines.grids.html#Concepts.Aurora_Fea_Regions_DB-eng.Feature.Serverless) where Aurora PostgreSQL Serverless is available, and use one of those Regions for both Aurora PostgreSQL Serverless and your replication instance.
Babelfish capability is built into Amazon Aurora and doesn't have an additional cost. For more information, see [Using Babelfish for Aurora PostgreSQL as a target for AWS Database Migration Service](#CHAP_Target.PostgreSQL.Babelfish).

AWS DMS takes a table-by-table approach when migrating data from source to target in the Full Load phase. Table order during the full load phase cannot be guaranteed. Tables are out of sync during the full load phase and while cached transactions for individual tables are being applied. As a result, active referential integrity constraints can result in task failure during the full load phase.

In PostgreSQL, foreign keys (referential integrity constraints) are implemented using triggers. During the full load phase, AWS DMS loads each table one at a time. We strongly recommend that you disable foreign key constraints during a full load, using one of the following methods:
+ Temporarily disable all triggers from the instance, and finish the full load.
+ Use the `session_replication_role` parameter in PostgreSQL.

At any given time, a trigger can be in one of the following states: `origin`, `replica`, `always`, or `disabled`. When the `session_replication_role` parameter is set to `replica`, only triggers in the `replica` state are active, and they are fired when they are called. Otherwise, the triggers remain inactive. 

PostgreSQL has a failsafe mechanism to prevent a table from being truncated, even when `session_replication_role` is set. You can use this as an alternative to disabling triggers, to help the full load run to completion. To do this, set the target table preparation mode to `DO_NOTHING`. Otherwise, DROP and TRUNCATE operations fail when there are foreign key constraints.

In Amazon RDS, you can set this parameter using a parameter group. For a PostgreSQL instance running on Amazon EC2, you can set the parameter directly.
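
As a sketch, on a self-managed instance you can switch the parameter for the session that performs the load, and restore it afterward:

```
-- Fire only triggers in the replica state for this session;
-- ordinary foreign key triggers remain inactive.
SET session_replication_role = 'replica';

-- Confirm the current value.
SHOW session_replication_role;

-- After the full load completes, restore the default.
SET session_replication_role = 'origin';
```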



For additional details on working with a PostgreSQL database as a target for AWS DMS, see the following sections: 

**Topics**
+ [

## Limitations on using PostgreSQL as a target for AWS Database Migration Service
](#CHAP_Target.PostgreSQL.Limitations)
+ [

## Limitations on using Amazon Aurora PostgreSQL Limitless as a target for AWS Database Migration Service
](#CHAP_Target.PostgreSQL.Aurora.Limitations)
+ [

## Security requirements when using a PostgreSQL database as a target for AWS Database Migration Service
](#CHAP_Target.PostgreSQL.Security)
+ [

## Endpoint settings and Extra Connection Attributes (ECAs) when using PostgreSQL as a target for AWS DMS
](#CHAP_Target.PostgreSQL.ConnectionAttrib)
+ [

## Target data types for PostgreSQL
](#CHAP_Target.PostgreSQL.DataTypes)
+ [

## Using Babelfish for Aurora PostgreSQL as a target for AWS Database Migration Service
](#CHAP_Target.PostgreSQL.Babelfish)

## Limitations on using PostgreSQL as a target for AWS Database Migration Service
<a name="CHAP_Target.PostgreSQL.Limitations"></a>

The following limitations apply when using a PostgreSQL database as a target for AWS DMS:
+ For heterogeneous migrations, the JSON data type is converted internally to the native CLOB data type.
+ In an Oracle to PostgreSQL migration, if a column in Oracle contains a NULL character (hex value U+0000), AWS DMS converts the NULL character to a space (hex value U+0020). This is due to a PostgreSQL limitation.
+ AWS DMS doesn't support replication to a table with a unique index created with the COALESCE function.
+ If your tables use sequences, then update the value of `NEXTVAL` for each sequence in the target database after you stop the replication from the source database. AWS DMS copies data from your source database, but doesn't migrate sequences to the target during the ongoing replication.

## Limitations on using Amazon Aurora PostgreSQL Limitless as a target for AWS Database Migration Service
<a name="CHAP_Target.PostgreSQL.Aurora.Limitations"></a>

The following limitations apply when using Amazon Aurora PostgreSQL Limitless as a target for AWS DMS:
+ AWS DMS Data Validation does not support Amazon Aurora PostgreSQL Limitless.
+ AWS DMS migrates source tables as Standard tables, which are not distributed. After migration, you can convert these Standard tables to Limitless tables by following the official conversion guide.

## Security requirements when using a PostgreSQL database as a target for AWS Database Migration Service
<a name="CHAP_Target.PostgreSQL.Security"></a>

For security purposes, the user account used for the data migration must be a registered user in any PostgreSQL database that you use as a target.

Your PostgreSQL target endpoint requires minimum user permissions to run an AWS DMS migration. The following examples show two approaches to granting them.

```
CREATE USER newuser WITH PASSWORD 'your-password';
ALTER SCHEMA schema_name OWNER TO newuser;
```

Or,

```
GRANT USAGE ON SCHEMA schema_name TO myuser;
GRANT CONNECT ON DATABASE postgres TO myuser;
GRANT CREATE ON DATABASE postgres TO myuser;
GRANT CREATE ON SCHEMA schema_name TO myuser;
GRANT UPDATE, INSERT, SELECT, DELETE, TRUNCATE ON ALL TABLES IN SCHEMA schema_name TO myuser;
GRANT TRUNCATE ON schema_name."BasicFeed" TO myuser;
```

## Endpoint settings and Extra Connection Attributes (ECAs) when using PostgreSQL as a target for AWS DMS
<a name="CHAP_Target.PostgreSQL.ConnectionAttrib"></a>

You can use endpoint settings and Extra Connection Attributes (ECAs) to configure your PostgreSQL target database. 

You specify the settings when you create the target endpoint using the AWS DMS console, or by using the `create-endpoint` command in the [AWS CLI](https://docs.aws.amazon.com/cli/latest/reference/dms/index.html), with the `--postgre-sql-settings '{"EndpointSetting": "value", ...}'` JSON syntax.

You specify ECAs using the `ExtraConnectionAttributes` parameter for your endpoint.

The following table shows the endpoint settings that you can use with PostgreSQL as a target.

[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/dms/latest/userguide/CHAP_Target.PostgreSQL.html)
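
For example, the following AWS CLI command sketches creating a PostgreSQL target endpoint with endpoint settings. The connection details are placeholders, and the `AfterConnectScript` and `MaxFileSize` values are illustrative assumptions; adjust them for your environment.

```
aws dms create-endpoint \
    --endpoint-identifier postgresql-target \
    --endpoint-type target \
    --engine-name postgres \
    --server-name postgresql.example.com \
    --port 5432 \
    --database-name targetdb \
    --username postgres \
    --password "your-password" \
    --postgre-sql-settings '{"AfterConnectScript": "SET session_replication_role=replica", "MaxFileSize": 32768}'
```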

## Target data types for PostgreSQL
<a name="CHAP_Target.PostgreSQL.DataTypes"></a>

The PostgreSQL database endpoint for AWS DMS supports most PostgreSQL database data types. The following table shows the PostgreSQL database target data types that are supported when using AWS DMS and the default mapping from AWS DMS data types.

For additional information about AWS DMS data types, see [Data types for AWS Database Migration Service](CHAP_Reference.DataTypes.md).


|  AWS DMS data type  |  PostgreSQL data type  | 
| --- | --- | 
|  BOOLEAN  |  BOOLEAN  | 
|  BLOB  |  BYTEA  | 
|  BYTES  |  BYTEA  | 
|  DATE  |  DATE  | 
|  TIME  |  TIME  | 
|  DATETIME  |  If the scale is from 0 through 6, then use TIMESTAMP. If the scale is from 7 through 9, then use VARCHAR (37).  | 
|  INT1  |  SMALLINT  | 
|  INT2  |  SMALLINT  | 
|  INT4  |  INTEGER  | 
|  INT8  |  BIGINT  | 
|  NUMERIC   |  DECIMAL (P,S)  | 
|  REAL4  |  FLOAT4  | 
|  REAL8  |  FLOAT8  | 
|  STRING  |  If the length is from 1 through 21,845, then use VARCHAR (length in bytes).  If the length is 21,846 through 2,147,483,647, then use VARCHAR (65535).  | 
|  UINT1  |  SMALLINT  | 
|  UINT2  |  INTEGER  | 
|  UINT4  |  BIGINT  | 
|  UINT8  |  BIGINT  | 
|  WSTRING  |  If the length is from 1 through 21,845, then use VARCHAR (length in bytes).  If the length is 21,846 through 2,147,483,647, then use VARCHAR (65535).  | 
|  NCLOB  |  TEXT  | 
|  CLOB  |  TEXT  | 

**Note**  
When replicating from a PostgreSQL source, AWS DMS creates the target table with the same data types for all columns, apart from columns with user-defined data types. In such cases, the data type is created as "character varying" in the target.

## Using Babelfish for Aurora PostgreSQL as a target for AWS Database Migration Service
<a name="CHAP_Target.PostgreSQL.Babelfish"></a>

You can migrate SQL Server source tables to a Babelfish for Amazon Aurora PostgreSQL target using AWS Database Migration Service. With Babelfish, Aurora PostgreSQL understands T-SQL, Microsoft SQL Server's proprietary SQL dialect, and supports the same communications protocol. So, applications written for SQL Server can now work with Aurora with fewer code changes. Babelfish capability is built into Amazon Aurora and doesn't have an additional cost. You can activate Babelfish on your Amazon Aurora cluster from the Amazon RDS console.

When you create your AWS DMS target endpoint using the AWS DMS console, API, or CLI commands, specify the target engine as **Amazon Aurora PostgreSQL**, and name the database **babelfish_db**. In the **Endpoint Settings** section, add settings to set `DatabaseMode` to `Babelfish` and `BabelfishDatabaseName` to the name of the target Babelfish T-SQL database.
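
A sketch of such an endpoint definition using the AWS CLI follows. The cluster endpoint, credentials, and the Babelfish database name `mydb` are placeholders for your own values.

```
aws dms create-endpoint \
    --endpoint-identifier babelfish-target \
    --endpoint-type target \
    --engine-name aurora-postgresql \
    --server-name my-aurora-cluster.example.us-east-1.rds.amazonaws.com \
    --port 5432 \
    --database-name babelfish_db \
    --username postgres \
    --password "your-password" \
    --postgre-sql-settings '{"DatabaseMode": "babelfish", "BabelfishDatabaseName": "mydb"}'
```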

### Adding transformation rules to your migration task
<a name="CHAP_Target.PostgreSQL.Babelfish.transform"></a>

When you define a migration task for a Babelfish target, you need to include transformation rules that ensure DMS uses the pre-created T-SQL Babelfish tables in the target database.

First, add a transformation rule to your migration task that makes all table names lowercase. Babelfish stores the names of tables that you create using T-SQL as lowercase in the PostgreSQL `pg_class` catalog. However, when you have SQL Server tables with mixed-case names, DMS creates the tables using PostgreSQL native data types instead of the T-SQL compatible data types. For that reason, be sure to add a transformation rule that makes all table names lowercase. Note that column names should not be transformed to lowercase.

Next, if you used the multidatabase migration mode when you defined your cluster, add a transformation rule that renames the original SQL Server schema. Make sure to rename the SQL Server schema name to include the name of the T-SQL database. For example, if the original SQL Server schema name is dbo, and your T-SQL database name is mydb, rename the schema to mydb_dbo using a transformation rule.

**Note**  
When using Babelfish for Aurora PostgreSQL 16 or later, the default migration mode is multidatabase. When running DMS migration tasks, make sure to review the migration mode parameter and update the transformation rules if needed.

If you use single database mode, you don't need a transformation rule to rename schema names. Schema names have a one-to-one mapping with the target T-SQL database in Babelfish.

The following sample transformation rule makes all table names lowercase, and renames the original SQL Server schema name from `dbo` to `mydb_dbo`.

```
{
   "rules": [
   {
      "rule-type": "transformation",
      "rule-id": "566251737",
      "rule-name": "566251737",
      "rule-target": "schema",
      "object-locator": {
         "schema-name": "dbo"
      },
      "rule-action": "rename",
      "value": "mydb_dbo",
      "old-value": null
   },
   {
      "rule-type": "transformation",
      "rule-id": "566139410",
      "rule-name": "566139410",
      "rule-target": "table",
      "object-locator": {
         "schema-name": "%",
         "table-name": "%"
      },
      "rule-action": "convert-lowercase",
      "value": null,
      "old-value": null
   },
   {
      "rule-type": "selection",
      "rule-id": "566111704",
      "rule-name": "566111704",
      "object-locator": {
         "schema-name": "dbo",
         "table-name": "%"
      },
      "rule-action": "include",
      "filters": []
   }
]
}
```

### Limitations to using a PostgreSQL target endpoint with Babelfish tables
<a name="CHAP_Target.PostgreSQL.Babelfish.limitations"></a>

The following limitations apply when using a PostgreSQL target endpoint with Babelfish tables:
+ For **Target table preparation** mode, use only the **Do nothing** or **Truncate** modes. Don't use the **Drop tables on target** mode. In that mode, DMS creates the tables as PostgreSQL tables that T-SQL might not recognize.
+ AWS DMS doesn't support the `sql_variant` data type.
+ The Babelfish for PostgreSQL endpoint doesn't support the `HIERARCHYID`, `GEOMETRY` (prior to 3.5.4), and `GEOGRAPHY` (prior to 3.5.4) data types. To migrate these data types, you can add transformation rules to convert the data type to `wstring(250)`.
+ Babelfish only supports migrating `BINARY`, `VARBINARY`, and `IMAGE` data types using the `BYTEA` data type. For earlier versions of Aurora PostgreSQL, you can use DMS to migrate these tables to a [Babelfish target endpoint](CHAP_Target.Babelfish.md). You don't have to specify a length for the `BYTEA` data type, as shown in the following example.

  ```
  [Picture] [VARBINARY](max) NULL
  ```

  Change the preceding T-SQL data type to the T-SQL supported `BYTEA` data type.

  ```
  [Picture] BYTEA NULL
  ```
+ For earlier versions of Aurora PostgreSQL Babelfish, if you create a migration task for ongoing replication from SQL Server to Babelfish using the PostgreSQL target endpoint, you need to assign the `SERIAL` data type to any tables that use `IDENTITY` columns. Starting with Aurora PostgreSQL (version 15.3/14.8 and higher) and Babelfish (version 3.2.0 and higher), the identity column is supported, and it is no longer required to assign the SERIAL data type. For more information, see [SERIAL Usage](https://docs.aws.amazon.com/dms/latest/sql-server-to-aurora-postgresql-migration-playbook/chap-sql-server-aurora-pg.tsql.sequences..html) in the Sequences and Identity section of the *SQL Server to Aurora PostgreSQL Migration Playbook*. Then, when you create the table in Babelfish, change the column definition from the following.

  ```
      [IDCol] [INT] IDENTITY(1,1) NOT NULL PRIMARY KEY
  ```

  Change the preceding into the following.

  ```
      [IDCol] SERIAL PRIMARY KEY
  ```

  Babelfish-compatible Aurora PostgreSQL creates a sequence using the default configuration and adds a `NOT NULL` constraint to the column. The newly created sequence behaves like a regular sequence (incremented by 1) and has no composite `SERIAL` option.
+ After migrating data with tables that use `IDENTITY` columns or the `SERIAL` data type, reset the PostgreSQL-based sequence object based on the maximum value for the column. After performing a full load of the tables, use the following T-SQL query to generate statements to seed the associated sequence object.

  ```
  DECLARE @schema_prefix NVARCHAR(200) = ''
  
  IF current_setting('babelfishpg_tsql.migration_mode') = 'multi-db'
          SET @schema_prefix = db_name() + '_'
  
  SELECT 'SELECT setval(pg_get_serial_sequence(''' + @schema_prefix + schema_name(tables.schema_id) + '.' + tables.name + ''', ''' + columns.name + ''')
                 ,(select max(' + columns.name + ') from ' + schema_name(tables.schema_id) + '.' + tables.name + '));'
  FROM sys.tables tables
  JOIN sys.columns columns ON tables.object_id = columns.object_id
  WHERE columns.is_identity = 1
  
  UNION ALL
  
  SELECT 'SELECT setval(pg_get_serial_sequence(''' + @schema_prefix + table_schema + '.' + table_name + ''', 
  ''' + column_name + '''),(select max(' + column_name + ') from ' + table_schema + '.' + table_name + '));'
  FROM information_schema.columns
  WHERE column_default LIKE 'nextval(%';
  ```

  The query generates a series of SELECT statements that you execute in order to update the maximum IDENTITY and SERIAL values.
+ For Babelfish versions prior to 3.2, **Full LOB mode** might result in a table error. If that happens, create a separate task for the tables that failed to load. Then use **Limited LOB mode** to specify the appropriate value for **Maximum LOB size (KB)**. Another option is to set the SQL Server endpoint connection attribute `ForceFullLob=True`.
+ For Babelfish versions prior to 3.2, performing data validation with Babelfish tables that don't use integer based primary keys generates a message that a suitable unique key can't be found. Starting with Aurora PostgreSQL (version 15.3/14.8 and higher) and Babelfish (version 3.2.0 and higher), data validation for non-integer primary keys is supported. 
+ Because of precision differences in the number of decimal places for seconds, DMS reports data validation failures for Babelfish tables that use `DATETIME` data types. To suppress those failures, you can add the following validation rule type for `DATETIME` data types.

  ```
  {
           "rule-type": "validation",
           "rule-id": "3",
           "rule-name": "3",
           "rule-target": "column",
           "object-locator": {
               "schema-name": "dbo",
               "table-name": "%",
               "column-name": "%",
               "data-type": "datetime"
           },
           "rule-action": "override-validation-function",
           "source-function": "case when ${column-name} is NULL then NULL else 0 end",
           "target-function": "case when ${column-name} is NULL then NULL else 0 end"
       }
  ```

# Using a MySQL-compatible database as a target for AWS Database Migration Service
<a name="CHAP_Target.MySQL"></a>

You can migrate data to any MySQL-compatible database using AWS DMS, from any of the source data engines that AWS DMS supports. If you are migrating to an on-premises MySQL-compatible database, then AWS DMS requires that your source engine reside within the AWS ecosystem. The engine can be on an AWS-managed service such as Amazon RDS, Amazon Aurora, or Amazon S3. Or the engine can be on a self-managed database on Amazon EC2. 

You can use SSL to encrypt connections between your MySQL-compatible endpoint and the replication instance. For more information on using SSL with a MySQL-compatible endpoint, see [Using SSL with AWS Database Migration Service](CHAP_Security.SSL.md). 

For information about versions of MySQL that AWS DMS supports as a target, see [Targets for AWS DMS](CHAP_Introduction.Targets.md).

You can use the following MySQL-compatible databases as targets for AWS DMS:
+ MySQL Community Edition
+ MySQL Standard Edition
+ MySQL Enterprise Edition
+ MySQL Cluster Carrier Grade Edition
+ MariaDB Community Edition
+ MariaDB Enterprise Edition
+ MariaDB Column Store
+ Amazon Aurora MySQL

**Note**  
Regardless of the source storage engine (MyISAM, MEMORY, and so on), AWS DMS creates a MySQL-compatible target table as an InnoDB table by default.   
If you need a table in a storage engine other than InnoDB, you can manually create the table on the MySQL-compatible target and migrate the table using the **Do nothing** option. For more information, see [Full-load task settings](CHAP_Tasks.CustomizingTasks.TaskSettings.FullLoad.md).
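
For example, a hypothetical table that must remain on the MyISAM engine could be precreated on the target like this before you run the task with the **Do nothing** option:

```
CREATE TABLE myschema.legacy_log (
    id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
    message VARCHAR(255)
) ENGINE=MyISAM;
```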

For additional details on working with a MySQL-compatible database as a target for AWS DMS, see the following sections. 

**Topics**
+ [

## Using any MySQL-compatible database as a target for AWS Database Migration Service
](#CHAP_Target.MySQL.Prerequisites)
+ [

## Limitations on using a MySQL-compatible database as a target for AWS Database Migration Service
](#CHAP_Target.MySQL.Limitations)
+ [

## Endpoint settings when using a MySQL-compatible database as a target for AWS DMS
](#CHAP_Target.MySQL.ConnectionAttrib)
+ [

## Target data types for MySQL
](#CHAP_Target.MySQL.DataTypes)

## Using any MySQL-compatible database as a target for AWS Database Migration Service
<a name="CHAP_Target.MySQL.Prerequisites"></a>

Before you begin to work with a MySQL-compatible database as a target for AWS DMS, make sure that you have completed the following prerequisites:
+ Provide a user account to AWS DMS that has read/write privileges to the MySQL-compatible database. To grant the necessary privileges, run the following commands.

  ```
  CREATE USER '<user acct>'@'%' IDENTIFIED BY '<user password>';
  GRANT ALTER, CREATE, DROP, INDEX, INSERT, UPDATE, DELETE, SELECT, CREATE TEMPORARY TABLES  ON <schema>.* TO 
  '<user acct>'@'%';
  GRANT ALL PRIVILEGES ON awsdms_control.* TO '<user acct>'@'%';
  ```
+ During the full-load migration phase, you must disable foreign keys on your target tables. To disable foreign key checks on a MySQL-compatible database during a full load, you can add the following command to the **Extra connection attributes** section of the AWS DMS console for your target endpoint.

  ```
  Initstmt=SET FOREIGN_KEY_CHECKS=0;
  ```
+ Set the database parameter `local_infile = 1` to enable AWS DMS to load data into the target database.
+ Grant the following privileges if you use MySQL-specific premigration assessments.

  ```
  grant select on mysql.user to <dms_user>;
  grant select on mysql.db to <dms_user>;
  grant select on mysql.tables_priv to <dms_user>;
  grant select on mysql.role_edges to <dms_user>; # only for MySQL version 8.0.11 and higher
  ```

## Limitations on using a MySQL-compatible database as a target for AWS Database Migration Service
<a name="CHAP_Target.MySQL.Limitations"></a>

When using a MySQL database as a target, AWS DMS doesn't support the following:
+ The data definition language (DDL) statements TRUNCATE PARTITION, DROP TABLE, and RENAME TABLE.
+ Using an `ALTER TABLE table_name ADD COLUMN column_name` statement to add columns to the beginning or the middle of a table.
+ When loading data to a MySQL-compatible target in a full load task, AWS DMS doesn't report errors caused by constraints in the task logs, which can cause duplicate key errors or mismatches with the number of records. This is caused by the way MySQL handles local data with the `LOAD DATA` command. Be sure to do the following during the full load phase: 
  + Disable constraints.
  + Use AWS DMS validation to make sure the data is consistent.
+ When you update a column's value to its existing value, MySQL-compatible databases return a `0 rows affected` warning. Although this behavior isn't technically an error, it is different from how the situation is handled by other database engines. For example, Oracle performs an update of one row. For MySQL-compatible databases, AWS DMS generates an entry in the `awsdms_apply_exceptions` control table and logs the following warning.

  ```
  Some changes from the source database had no impact when applied to
  the target database. See awsdms_apply_exceptions table for details.
  ```
+ Aurora Serverless is available as a target for Amazon Aurora version 2, compatible with MySQL version 5.7. (Select Aurora MySQL version 2.07.1 to be able to use Aurora Serverless with MySQL 5.7 compatibility.) For more information about Aurora Serverless, see [Using Aurora Serverless v2](https://docs.aws.amazon.com//AmazonRDS/latest/AuroraUserGuide/aurora-serverless-v2.html) in the *Amazon Aurora User Guide*.
+ AWS DMS does not support using a reader endpoint for Aurora or Amazon RDS, unless the instances are in writable mode, that is, the `read_only` and `innodb_read_only` parameters are set to `0` or `OFF`. For more information about using Amazon RDS and Aurora as targets, see the following:
  +  [ Determining which DB instance you are connected to](https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/AuroraMySQL.BestPractices.html#AuroraMySQL.BestPractices.DeterminePrimaryInstanceConnection) 
  +  [ Updating read replicas with MySQL](https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/USER_MySQL.Replication.ReadReplicas.html#USER_MySQL.Replication.ReadReplicas.Updates) 
+ When replicating the TIME data type, the fractional part of the time value isn't replicated.
+ When replicating the TIME data type with the extra connection attribute `loadUsingCSV=false`, the time value is capped to the range `[00:00:00, 23:59:59]`.

## Endpoint settings when using a MySQL-compatible database as a target for AWS DMS
<a name="CHAP_Target.MySQL.ConnectionAttrib"></a>

You can use endpoint settings to configure your MySQL-compatible target database similar to using extra connection attributes. You specify the settings when you create the target endpoint using the AWS DMS console, or by using the `create-endpoint` command in the [AWS CLI](https://docs.aws.amazon.com/cli/latest/reference/dms/index.html), with the `--my-sql-settings '{"EndpointSetting": "value", ...}'` JSON syntax.

The following table shows the endpoint settings that you can use with MySQL as a target.

[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/dms/latest/userguide/CHAP_Target.MySQL.html)

You can also use extra connection attributes to configure your MySQL-compatible target database.

The following table shows the extra connection attributes that you can use with MySQL as a target.

[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/dms/latest/userguide/CHAP_Target.MySQL.html)

Alternatively, you can use the `AfterConnectScript` parameter of the `--my-sql-settings` command to disable foreign key checks and specify the time zone for your database.
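
As an illustration of how such a settings payload is assembled (the script contents and database-type value below are hypothetical examples, not recommendations), you might build and check the JSON before passing it to `--my-sql-settings`:

```python
import json

# Hypothetical MySQL target settings: AfterConnectScript disables foreign
# key checks and pins the session time zone after each connection.
my_sql_settings = {
    "AfterConnectScript": "SET FOREIGN_KEY_CHECKS=0; SET time_zone='+00:00';",
    "TargetDbType": "SPECIFIC_DATABASE",  # example value
}

# Serialize to the JSON string that the --my-sql-settings option expects.
settings_json = json.dumps(my_sql_settings)
print(settings_json)
```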

## Target data types for MySQL
<a name="CHAP_Target.MySQL.DataTypes"></a>

The following table shows the MySQL database target data types that are supported when using AWS DMS and the default mapping from AWS DMS data types.

For additional information about AWS DMS data types, see [Data types for AWS Database Migration Service](CHAP_Reference.DataTypes.md).


|  AWS DMS data types  |  MySQL data types  | 
| --- | --- | 
|  BOOLEAN  |  BOOLEAN  | 
|  BYTES  |  If the length is from 1 through 65,535, then use VARBINARY (length).  If the length is from 65,536 through 2,147,483,647, then use LONGBLOB.  | 
|  DATE  |  DATE  | 
|  TIME  |  TIME  | 
|  TIMESTAMP  |  If scale is from 0 through 6, then use DATETIME (scale). If scale is from 7 through 9, then use VARCHAR (37).  | 
|  INT1  |  TINYINT  | 
|  INT2  |  SMALLINT  | 
|  INT4  |  INTEGER  | 
|  INT8  |  BIGINT  | 
|  NUMERIC  |  DECIMAL (p,s)  | 
|  REAL4  |  FLOAT  | 
|  REAL8  |  DOUBLE PRECISION  | 
|  STRING  |  If the length is from 1 through 21,845, then use VARCHAR (length). If the length is from 21,846 through 2,147,483,647, then use LONGTEXT.  | 
|  UINT1  |  UNSIGNED TINYINT  | 
|  UINT2  |  UNSIGNED SMALLINT  | 
|  UINT4  |  UNSIGNED INTEGER  | 
|  UINT8  |  UNSIGNED BIGINT  | 
|  WSTRING  |  If the length is from 1 through 32,767, then use VARCHAR (length).  If the length is from 32,768 through 2,147,483,647, then use LONGTEXT.  | 
|  BLOB  |  If the length is from 1 through 65,535, then use BLOB. If the length is from 65,536 through 2,147,483,647, then use LONGBLOB. If the length is 0, then use LONGBLOB (full LOB support).  | 
|  NCLOB  |  If the length is from 1 through 65,535, then use TEXT. If the length is from 65,536 through 2,147,483,647, then use LONGTEXT with ucs2 for CHARACTER SET. If the length is 0, then use LONGTEXT (full LOB support) with ucs2 for CHARACTER SET.  | 
|  CLOB  |  If the length is from 1 through 65,535, then use TEXT. If the length is from 65,536 through 2,147,483,647, then use LONGTEXT. If the length is 0, then use LONGTEXT (full LOB support).  | 
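
The length-based rules in this table amount to simple threshold checks. The following sketch (illustrative only; AWS DMS applies these mappings internally) shows the STRING and BYTES rules as code:

```python
def mysql_type_for_string(length: int) -> str:
    # STRING: VARCHAR up to 21,845 characters, LONGTEXT beyond that.
    return f"VARCHAR({length})" if 1 <= length <= 21845 else "LONGTEXT"

def mysql_type_for_bytes(length: int) -> str:
    # BYTES: VARBINARY up to 65,535 bytes, LONGBLOB beyond that.
    return f"VARBINARY({length})" if 1 <= length <= 65535 else "LONGBLOB"

print(mysql_type_for_string(255))    # VARCHAR(255)
print(mysql_type_for_bytes(100000))  # LONGBLOB
```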

# Using an Amazon Redshift database as a target for AWS Database Migration Service
<a name="CHAP_Target.Redshift"></a>

You can migrate data to Amazon Redshift databases using AWS Database Migration Service. Amazon Redshift is a fully managed, petabyte-scale data warehouse service in the cloud. With an Amazon Redshift database as a target, you can migrate data from all of the other supported source databases.

You can use Amazon Redshift Serverless as a target for AWS DMS. For more information, see [Using AWS DMS with Amazon Redshift Serverless as a target](#CHAP_Target.Redshift.RSServerless) following.

The Amazon Redshift cluster must be in the same AWS account and the same AWS Region as the replication instance.

During a database migration to Amazon Redshift, AWS DMS first moves data to an Amazon S3 bucket. When the files reside in the bucket, AWS DMS then transfers them to the proper tables in the Amazon Redshift data warehouse. AWS DMS creates the S3 bucket in the same AWS Region as the Amazon Redshift database, and the AWS DMS replication instance must be located in that same Region.

If you use the AWS CLI or DMS API to migrate data to Amazon Redshift, set up an AWS Identity and Access Management (IAM) role to allow S3 access. For more information about creating this IAM role, see [Creating the IAM roles to use with AWS DMS](security-iam.md#CHAP_Security.APIRole).

The Amazon Redshift endpoint provides full automation for the following:
+ Schema generation and data type mapping
+ Full load of source database tables
+ Incremental load of changes made to source tables
+ Application of schema changes in data definition language (DDL) made to the source tables
+ Synchronization between full load and change data capture (CDC) processes

AWS Database Migration Service supports both full load and change processing operations. AWS DMS reads the data from the source database and creates a series of comma-separated value (.csv) files. For full-load operations, AWS DMS creates files for each table, copies the files for each table to a separate folder in Amazon S3, and then issues a copy command so the data in the files is loaded into Amazon Redshift. For change-processing operations, AWS DMS writes the net changes to .csv files, uploads them to Amazon S3, and copies the data to Amazon Redshift.

For additional details on working with Amazon Redshift as a target for AWS DMS, see the following sections: 

**Topics**
+ [

## Prerequisites for using an Amazon Redshift database as a target for AWS Database Migration Service
](#CHAP_Target.Redshift.Prerequisites)
+ [

## Privileges required for using Redshift as a target
](#CHAP_Target.Redshift.Privileges)
+ [

## Limitations on using Amazon Redshift as a target for AWS Database Migration Service
](#CHAP_Target.Redshift.Limitations)
+ [

## Configuring an Amazon Redshift database as a target for AWS Database Migration Service
](#CHAP_Target.Redshift.Configuration)
+ [

## Using enhanced VPC routing with Amazon Redshift as a target for AWS Database Migration Service
](#CHAP_Target.Redshift.EnhancedVPC)
+ [

## Creating and using AWS KMS keys to encrypt Amazon Redshift target data
](#CHAP_Target.Redshift.KMSKeys)
+ [

## Endpoint settings when using Amazon Redshift as a target for AWS DMS
](#CHAP_Target.Redshift.ConnectionAttrib)
+ [

## Using a data encryption key, and an Amazon S3 bucket as intermediate storage
](#CHAP_Target.Redshift.EndpointSettings)
+ [

## Multithreaded task settings for Amazon Redshift
](#CHAP_Target.Redshift.ParallelApply)
+ [

## Target data types for Amazon Redshift
](#CHAP_Target.Redshift.DataTypes)
+ [

## Using AWS DMS with Amazon Redshift Serverless as a Target
](#CHAP_Target.Redshift.RSServerless)

## Prerequisites for using an Amazon Redshift database as a target for AWS Database Migration Service
<a name="CHAP_Target.Redshift.Prerequisites"></a>

The following list describes the prerequisites necessary for working with Amazon Redshift as a target for data migration:
+ Use the AWS Management Console to launch an Amazon Redshift cluster. Note the basic information about your AWS account and your Amazon Redshift cluster, such as your password, user name, and database name. You need these values when creating the Amazon Redshift target endpoint. 
+ The Amazon Redshift cluster must be in the same AWS account and the same AWS Region as the replication instance.
+ The AWS DMS replication instance needs network connectivity to the Amazon Redshift endpoint (hostname and port) that your cluster uses.
+ AWS DMS uses an Amazon S3 bucket to transfer data to the Amazon Redshift database. For AWS DMS to create the bucket, the console uses an IAM role, `dms-access-for-endpoint`. If you use the AWS CLI or DMS API to create a database migration with Amazon Redshift as the target database, you must create this IAM role. For more information about creating this role, see [Creating the IAM roles to use with AWS DMS](security-iam.md#CHAP_Security.APIRole). 
+ AWS DMS converts BLOBs, CLOBs, and NCLOBs to a VARCHAR on the target Amazon Redshift instance. Amazon Redshift does not support VARCHAR data types larger than 64 KB, so you can't store traditional LOBs on Amazon Redshift. 
+ Set the target metadata task setting [BatchApplyEnabled](CHAP_Tasks.CustomizingTasks.TaskSettings.ChangeProcessingTuning.md) to `true` for AWS DMS to handle changes to Amazon Redshift target tables during CDC. A primary key on both the source and target table is required. Without a primary key, changes are applied statement by statement, which can adversely affect task performance during CDC by causing target latency and impacting the cluster commit queue.
+ When row-level security is enabled on tables in Amazon Redshift, you must grant appropriate permissions to all your DMS users.

## Privileges required for using Redshift as a target
<a name="CHAP_Target.Redshift.Privileges"></a>

Use the GRANT command to define access privileges for a user or user group. Privileges include access options such as being able to read data in tables and views, write data, and create tables. For more information about using GRANT with Amazon Redshift, see [GRANT](https://docs.aws.amazon.com//redshift/latest/dg/r_GRANT.html) in the *Amazon Redshift Database Developer Guide*.

The following is the syntax for granting table, database, schema, function, procedure, and language-level privileges in Amazon Redshift.

```
GRANT { { SELECT | INSERT | UPDATE | DELETE | REFERENCES } [,...] | ALL [ PRIVILEGES ] }
    ON { [ TABLE ] table_name [, ...] | ALL TABLES IN SCHEMA schema_name [, ...] }
    TO { username [ WITH GRANT OPTION ] | GROUP group_name | PUBLIC } [, ...]

GRANT { { CREATE | TEMPORARY | TEMP } [,...] | ALL [ PRIVILEGES ] }
    ON DATABASE db_name [, ...]
    TO { username [ WITH GRANT OPTION ] | GROUP group_name | PUBLIC } [, ...]

GRANT { { CREATE | USAGE } [,...] | ALL [ PRIVILEGES ] }
    ON SCHEMA schema_name [, ...]
    TO { username [ WITH GRANT OPTION ] | GROUP group_name | PUBLIC } [, ...]

GRANT { EXECUTE | ALL [ PRIVILEGES ] }
    ON { FUNCTION function_name ( [ [ argname ] argtype [, ...] ] ) [, ...] | ALL FUNCTIONS IN SCHEMA schema_name [, ...] }
    TO { username [ WITH GRANT OPTION ] | GROUP group_name | PUBLIC } [, ...]

GRANT { EXECUTE | ALL [ PRIVILEGES ] }
    ON { PROCEDURE procedure_name ( [ [ argname ] argtype [, ...] ] ) [, ...] | ALL PROCEDURES IN SCHEMA schema_name [, ...] }
    TO { username [ WITH GRANT OPTION ] | GROUP group_name | PUBLIC } [, ...]

GRANT USAGE 
    ON LANGUAGE language_name [, ...]
    TO { username [ WITH GRANT OPTION ] | GROUP group_name | PUBLIC } [, ...]
```

The following is the syntax for column-level privileges on Amazon Redshift tables and views. 

```
GRANT { { SELECT | UPDATE } ( column_name [, ...] ) [, ...] | ALL [ PRIVILEGES ] ( column_name [,...] ) }
     ON { [ TABLE ] table_name [, ...] }
     TO { username | GROUP group_name | PUBLIC } [, ...]
```

The following is the syntax for the ASSUMEROLE privilege granted to users and groups with a specified role.

```
GRANT ASSUMEROLE
    ON { 'iam_role' [, ...] | ALL }
    TO { username | GROUP group_name | PUBLIC } [, ...]
    FOR { ALL | COPY | UNLOAD } [, ...]
```
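
As a concrete instance of the table-level form above (the schema and user names here are hypothetical), the statement a DMS replication user typically needs can be composed like this:

```python
def grant_dms_table_privileges(schema: str, user: str) -> str:
    # Compose an illustrative GRANT statement covering the table privileges
    # a DMS replication user typically needs on the target schema.
    return (
        f"GRANT SELECT, INSERT, UPDATE, DELETE "
        f"ON ALL TABLES IN SCHEMA {schema} TO {user};"
    )

print(grant_dms_table_privileges("public", "dms_user"))
```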

## Limitations on using Amazon Redshift as a target for AWS Database Migration Service
<a name="CHAP_Target.Redshift.Limitations"></a>

The following limitations apply when using an Amazon Redshift database as a target:
+ Don’t enable versioning for the S3 bucket you use as intermediate storage for your Amazon Redshift target. If you need S3 versioning, use lifecycle policies to actively delete old versions. Otherwise, you might encounter endpoint test connection failures because of an S3 `list-object` call timeout. To create a lifecycle policy for an S3 bucket, see [ Managing your storage lifecycle](https://docs.aws.amazon.com/AmazonS3/latest/userguide/object-lifecycle-mgmt.html). To delete a version of an S3 object, see [ Deleting object versions from a versioning-enabled bucket](https://docs.aws.amazon.com/AmazonS3/latest/dev/DeletingObjectVersions.html).
+ The following DDL is not supported:

  ```
  ALTER TABLE table name MODIFY COLUMN column name data type;
  ```
+  AWS DMS can't migrate or replicate changes to a schema with a name that begins with an underscore (_). If you have schemas with names that begin with an underscore, use mapping transformations to rename the schema on the target. 
+  Amazon Redshift does not support VARCHARs larger than 64 KB. LOBs from traditional databases can't be stored in Amazon Redshift.
+  Applying a DELETE statement to a table with a multi-column primary key is not supported when any of the primary key column names use a reserved word. For a list of Amazon Redshift reserved words, see [Reserved words](https://docs.aws.amazon.com/redshift/latest/dg/r_pg_keywords.html) in the *Amazon Redshift Database Developer Guide*.
+ You might experience performance issues if your source system performs UPDATE operations on the primary key of a source table. These performance issues occur when applying changes to the target, because UPDATE (and DELETE) operations depend on the primary key value to identify the target row. If you update the primary key of a source table, your task log contains messages like the following:

  ```
  Update on table 1 changes PK to a PK that was previously updated in the same bulk update.
  ```
+ DMS doesn't support custom DNS names when configuring an endpoint for an Amazon Redshift cluster; use the Amazon-provided DNS name instead. Because the Amazon Redshift cluster must be in the same AWS account and Region as the replication instance, validation fails if you use a custom DNS endpoint.
+ Amazon Redshift has a default 4-hour idle session timeout. When there isn't any activity within the DMS replication task, Redshift disconnects the session after 4 hours. Errors can result from DMS being unable to connect and potentially needing to restart. As a workaround, set a SESSION TIMEOUT limit greater than 4 hours for the DMS replication user. For details, see the description of [ALTER USER](https://docs.aws.amazon.com/redshift/latest/dg/r_ALTER_USER.html) in the *Amazon Redshift Database Developer Guide*.
+ When AWS DMS replicates source table data without a primary or unique key, CDC latency might be high, resulting in an unacceptable level of performance.
+ Truncating partitions is not supported during CDC replication from an Oracle source to an Amazon Redshift target.
+ Duplicate records might appear in target tables because Amazon Redshift does not enforce primary keys and AWS DMS may replay CDC when a task is resumed. To prevent duplicates, use the `ApplyErrorInsertPolicy=INSERT_RECORD` setting. For more information, see [Error handling task settings](CHAP_Tasks.CustomizingTasks.TaskSettings.ErrorHandling.md). Alternatively, you can implement application-level duplicate detection and post-migration cleanup procedures.

## Configuring an Amazon Redshift database as a target for AWS Database Migration Service
<a name="CHAP_Target.Redshift.Configuration"></a>

AWS Database Migration Service must be configured to work with the Amazon Redshift instance. The following table describes the configuration properties available for the Amazon Redshift endpoint.


| Property | Description | 
| --- | --- | 
| server | The name of the Amazon Redshift cluster you are using. | 
| port | The port number for Amazon Redshift. The default value is 5439. | 
| username | An Amazon Redshift user name for a registered user. | 
| password | The password for the user named in the username property. | 
| database | The name of the Amazon Redshift data warehouse (service) you are working with. | 

If you want to add extra connection string attributes to your Amazon Redshift endpoint, you can specify the `maxFileSize` and `fileTransferUploadStreams` attributes. For more information on these attributes, see [Endpoint settings when using Amazon Redshift as a target for AWS DMS](#CHAP_Target.Redshift.ConnectionAttrib).
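
For example, the two attributes might be combined into a single settings payload like the following (in endpoint-settings form they are spelled `MaxFileSize` and `FileTransferUploadStreams`; the values shown are illustrative, not recommendations):

```python
import json

# Illustrative Redshift endpoint settings: MaxFileSize caps the size of the
# intermediate .csv files, and FileTransferUploadStreams sets how many
# threads upload them to Amazon S3 in parallel. Values are examples only.
redshift_settings = {
    "MaxFileSize": 65536,
    "FileTransferUploadStreams": 20,
}
print(json.dumps(redshift_settings))
```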

## Using enhanced VPC routing with Amazon Redshift as a target for AWS Database Migration Service
<a name="CHAP_Target.Redshift.EnhancedVPC"></a>

If you use Enhanced VPC Routing with your Amazon Redshift target, all COPY traffic between your Amazon Redshift cluster and your data repositories goes through your VPC. Because Enhanced VPC Routing affects the way that Amazon Redshift accesses other resources, COPY commands might fail if you haven't configured your VPC correctly.

AWS DMS can be affected by this behavior because it uses the COPY command to move data from S3 to an Amazon Redshift cluster.

Following are the steps AWS DMS takes to load data into an Amazon Redshift target:

1. AWS DMS copies data from the source to .csv files on the replication server.

1. AWS DMS uses the AWS SDK to copy the .csv files into an S3 bucket on your account.

1. AWS DMS then uses the COPY command in Amazon Redshift to copy data from the .csv files in S3 to an appropriate table in Amazon Redshift.
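
As a rough sketch of step 3 (the table, bucket, folder, and role names here are hypothetical, and the actual command AWS DMS issues may differ in its options), the COPY statement has this general shape:

```python
def build_copy_command(table: str, bucket: str, prefix: str, role_arn: str) -> str:
    # Compose an illustrative Redshift COPY statement that loads staged
    # .csv files from an S3 folder into a target table. Options simplified.
    return (
        f"COPY {table} "
        f"FROM 's3://{bucket}/{prefix}/' "
        f"IAM_ROLE '{role_arn}' "
        "FORMAT AS CSV"
    )

print(build_copy_command(
    "public.orders",                                 # hypothetical table
    "dms-staging-bucket",                            # hypothetical bucket
    "migration/public/orders",                       # hypothetical folder
    "arn:aws:iam::111122223333:role/dms-s3-access",  # hypothetical role
))
```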

If Enhanced VPC Routing is not enabled, Amazon Redshift routes traffic through the internet, including traffic to other services within the AWS network. If the feature is not enabled, you do not have to configure the network path. If the feature is enabled, you must specifically create a network path between your cluster's VPC and your data resources. For more information on the configuration required, see [Enhanced VPC routing](https://docs.aws.amazon.com/redshift/latest/mgmt/enhanced-vpc-routing.html) in the Amazon Redshift documentation. 

## Creating and using AWS KMS keys to encrypt Amazon Redshift target data
<a name="CHAP_Target.Redshift.KMSKeys"></a>

You can encrypt your target data pushed to Amazon S3 before it is copied to Amazon Redshift. To do so, you can create and use custom AWS KMS keys. You can use the key you created to encrypt your target data using one of the following mechanisms when you create the Amazon Redshift target endpoint:
+ Use the following option when you run the `create-endpoint` command using the AWS CLI.

  ```
  --redshift-settings '{"EncryptionMode": "SSE_KMS", "ServerSideEncryptionKmsKeyId": "your-kms-key-ARN"}'
  ```

  Here, `your-kms-key-ARN` is the Amazon Resource Name (ARN) for your KMS key. For more information, see [Using a data encryption key, and an Amazon S3 bucket as intermediate storage](#CHAP_Target.Redshift.EndpointSettings).
+ Set the extra connection attribute `encryptionMode` to the value `SSE_KMS` and the extra connection attribute `serverSideEncryptionKmsKeyId` to the ARN for your KMS key. For more information, see [Endpoint settings when using Amazon Redshift as a target for AWS DMS](#CHAP_Target.Redshift.ConnectionAttrib).

To encrypt Amazon Redshift target data using a KMS key, you need an AWS Identity and Access Management (IAM) role that has permissions to access Amazon Redshift data. This IAM role is then accessed in a policy (a key policy) attached to the encryption key that you create. You can do this in your IAM console by creating the following:
+ An IAM role with an AWS-managed policy.
+ A KMS key with a key policy that references this role.

The following procedures describe how to do this.

**To create an IAM role with the required AWS-managed policy**

1. Open the IAM console at [https://console.aws.amazon.com/iam/](https://console.aws.amazon.com/iam/).

1. In the navigation pane, choose **Roles**. The **Roles** page opens.

1. Choose **Create role**. The **Create role** page opens.

1. With **AWS service** chosen as the trusted entity, choose **DMS** as the service to use the role.

1. Choose **Next: Permissions**. The **Attach permissions policies** page appears.

1. Find and select the `AmazonDMSRedshiftS3Role` policy.

1. Choose **Next: Tags**. The **Add tags** page appears. Here, you can add any tags you want.

1. Choose **Next: Review** and review your results.

1. If the settings are what you need, enter a name for the role (for example, `DMS-Redshift-endpoint-access-role`), and any additional description, then choose **Create role**. The **Roles** page opens with a message indicating that your role has been created.

You have now created the new role to access Amazon Redshift resources for encryption with a specified name, for example `DMS-Redshift-endpoint-access-role`.

**To create an AWS KMS encryption key with a key policy that references your IAM role**
**Note**  
For more information about how AWS DMS works with AWS KMS encryption keys, see [Setting an encryption key and specifying AWS KMS permissions](CHAP_Security.md#CHAP_Security.EncryptionKey).

1. Sign in to the AWS Management Console and open the AWS Key Management Service (AWS KMS) console at [https://console.aws.amazon.com/kms](https://console.aws.amazon.com/kms).

1. To change the AWS Region, use the Region selector in the upper-right corner of the page.

1. In the navigation pane, choose **Customer managed keys**.

1. Choose **Create key**. The **Configure key** page opens.

1. For **Key type**, choose **Symmetric**.
**Note**  
When you create this key, you can only create a symmetric key, because all AWS services, such as Amazon Redshift, only work with symmetric encryption keys.

1. Choose **Advanced Options**. For **Key material origin**, make sure that **KMS** is chosen, then choose **Next**. The **Add labels** page opens.

1. For **Create alias and description**, enter an alias for the key (for example, `DMS-Redshift-endpoint-encryption-key`) and any additional description.

1. For **Tags**, add any tags that you want to help identify the key and track its usage, then choose **Next**. The **Define key administrative permissions** page opens showing a list of users and roles that you can choose from.

1. Add the users and roles that you want to manage the key. Make sure that these users and roles have the required permissions to manage the key. 

1. For **Key deletion**, choose whether key administrators can delete the key, then choose **Next**. The **Define key usage permissions** page opens showing an additional list of users and roles that you can choose from.

1. For **This account**, choose the available users you want to perform cryptographic operations on Amazon Redshift targets. Also choose the role that you previously created in **Roles** to enable access to encrypt Amazon Redshift target objects (for example, `DMS-Redshift-endpoint-access-role`).

1. If you want to add other accounts not listed to have this same access, for **Other AWS accounts**, choose **Add another AWS account**, then choose **Next**. The **Review and edit key policy** page opens, showing the JSON for the key policy that you can review and edit by typing into the existing JSON. Here, you can see where the key policy references the role and users (for example, `Admin` and `User1`) that you chose in the previous step. You can also see the different key actions permitted for the different principals (users and roles), as shown in the following example.

------
#### [ JSON ]

****  

   ```
   {
       "Id": "key-consolepolicy-3",
       "Version": "2012-10-17",
       "Statement": [
           {
               "Sid": "Enable IAM User Permissions",
               "Effect": "Allow",
               "Principal": {
                   "AWS": [
                       "arn:aws:iam::111122223333:root"
                   ]
               },
               "Action": "kms:*",
               "Resource": "*"
           },
           {
               "Sid": "Allow access for Key Administrators",
               "Effect": "Allow",
               "Principal": {
                   "AWS": [
                       "arn:aws:iam::111122223333:role/Admin"
                   ]
               },
               "Action": [
                   "kms:Create*",
                   "kms:Describe*",
                   "kms:Enable*",
                   "kms:List*",
                   "kms:Put*",
                   "kms:Update*",
                   "kms:Revoke*",
                   "kms:Disable*",
                   "kms:Get*",
                   "kms:Delete*",
                   "kms:TagResource",
                   "kms:UntagResource",
                   "kms:ScheduleKeyDeletion",
                   "kms:CancelKeyDeletion"
               ],
               "Resource": "*"
           },
           {
               "Sid": "Allow use of the key",
               "Effect": "Allow",
               "Principal": {
                   "AWS": [
                       "arn:aws:iam::111122223333:role/DMS-Redshift-endpoint-access-role",
                       "arn:aws:iam::111122223333:role/Admin",
                       "arn:aws:iam::111122223333:role/User1"
                   ]
               },
               "Action": [
                   "kms:Encrypt",
                   "kms:Decrypt",
                   "kms:ReEncrypt*",
                   "kms:GenerateDataKey*",
                   "kms:DescribeKey"
               ],
               "Resource": "*"
           },
           {
               "Sid": "Allow attachment of persistent resources",
               "Effect": "Allow",
               "Principal": {
                   "AWS": [
                       "arn:aws:iam::111122223333:role/DMS-Redshift-endpoint-access-role",
                       "arn:aws:iam::111122223333:role/Admin",
                       "arn:aws:iam::111122223333:role/User1"
                   ]
               },
               "Action": [
                   "kms:CreateGrant",
                   "kms:ListGrants",
                   "kms:RevokeGrant"
               ],
               "Resource": "*",
               "Condition": {
                   "Bool": {
                       "kms:GrantIsForAWSResource": true
                   }
               }
           }
       ]
   }
   ```

------

1. Choose **Finish**. The **Encryption keys** page opens with a message indicating that your AWS KMS key has been created.

You have now created a new KMS key with a specified alias (for example, `DMS-Redshift-endpoint-encryption-key`). This key enables AWS DMS to encrypt Amazon Redshift target data.

## Endpoint settings when using Amazon Redshift as a target for AWS DMS
<a name="CHAP_Target.Redshift.ConnectionAttrib"></a>

You can use endpoint settings to configure your Amazon Redshift target database similar to using extra connection attributes. You specify the settings when you create the target endpoint using the AWS DMS console, or by using the `create-endpoint` command in the [AWS CLI](https://docs.aws.amazon.com/cli/latest/reference/dms/index.html), with the `--redshift-settings '{"EndpointSetting": "value", ...}'` JSON syntax.

The following table shows the endpoint settings that you can use with Amazon Redshift as a target.

[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/dms/latest/userguide/CHAP_Target.Redshift.html)

## Using a data encryption key, and an Amazon S3 bucket as intermediate storage
<a name="CHAP_Target.Redshift.EndpointSettings"></a>

You can use Amazon Redshift target endpoint settings to configure the following:
+ A custom AWS KMS data encryption key. You can then use this key to encrypt your data pushed to Amazon S3 before it is copied to Amazon Redshift.
+ A custom S3 bucket as intermediate storage for data migrated to Amazon Redshift.
+ Map a boolean as a boolean from a PostgreSQL source. By default, a BOOLEAN type is migrated as varchar(1). You can specify `MapBooleanAsBoolean` to let your Redshift target migrate the boolean type as boolean, as shown in the example following.

  ```
  --redshift-settings '{"MapBooleanAsBoolean": true}'
  ```

  Note that you must set this setting on both the source and target endpoints for it to take effect.

### KMS key settings for data encryption
<a name="CHAP_Target.Redshift.EndpointSettings.KMSkeys"></a>

The following examples show configuring a custom KMS key to encrypt your data pushed to S3. To start, you might make the following `create-endpoint` call using the AWS CLI.

```
aws dms create-endpoint --endpoint-identifier redshift-target-endpoint --endpoint-type target 
--engine-name redshift --username your-username --password your-password 
--server-name your-server-name --port 5439 --database-name your-db-name 
--redshift-settings '{"EncryptionMode": "SSE_KMS", 
"ServerSideEncryptionKmsKeyId": "arn:aws:kms:us-east-1:111122223333:key/24c3c5a1-f34a-4519-a85b-2debbef226d1"}'
```

Here, the JSON object specified by the `--redshift-settings` option defines two parameters. One is an `EncryptionMode` parameter with the value `SSE_KMS`. The other is a `ServerSideEncryptionKmsKeyId` parameter with the value `arn:aws:kms:us-east-1:111122223333:key/24c3c5a1-f34a-4519-a85b-2debbef226d1`. This value is the Amazon Resource Name (ARN) for your custom KMS key.

By default, S3 data encryption occurs using S3 server-side encryption. For the previous example's Amazon Redshift target, this default is equivalent to specifying the following endpoint settings.

```
aws dms create-endpoint --endpoint-identifier redshift-target-endpoint --endpoint-type target 
--engine-name redshift --username your-username --password your-password 
--server-name your-server-name --port 5439 --database-name your-db-name 
--redshift-settings '{"EncryptionMode": "SSE_S3"}'
```

For more information about working with S3 server-side encryption, see [Protecting data using server-side encryption](https://docs.aws.amazon.com/AmazonS3/latest/dev/serv-side-encryption.html) in the *Amazon Simple Storage Service User Guide*.

**Note**  
You can also use the CLI `modify-endpoint` command to change the value of the `EncryptionMode` parameter for an existing endpoint from `SSE_KMS` to `SSE_S3`. But you can’t change the `EncryptionMode` value from `SSE_S3` to `SSE_KMS`.

### Amazon S3 bucket settings
<a name="CHAP_Target.Redshift.EndpointSettings.S3Buckets"></a>

When you migrate data to an Amazon Redshift target endpoint, AWS DMS uses a default Amazon S3 bucket as intermediate task storage before copying the migrated data to Amazon Redshift. For example, the examples shown for creating an Amazon Redshift target endpoint with an AWS KMS data encryption key use this default S3 bucket (see [KMS key settings for data encryption](#CHAP_Target.Redshift.EndpointSettings.KMSkeys)). 

You can instead specify a custom S3 bucket for this intermediate storage by including the following parameters in the value of your `--redshift-settings` option on the AWS CLI `create-endpoint` command:
+ `BucketName` – A string you specify as the name of the S3 bucket storage. If your service access role is based on the `AmazonDMSRedshiftS3Role` policy, this value must have a prefix of `dms-`, for example, `dms-my-bucket-name`.
+ `BucketFolder` – (Optional) A string you can specify as the name of the storage folder in the specified S3 bucket.
+ `ServiceAccessRoleArn` – The ARN of an IAM role that permits administrative access to the S3 bucket. Typically, you create this role based on the `AmazonDMSRedshiftS3Role` policy. For an example, see the procedure to create an IAM role with the required AWS-managed policy in [Creating and using AWS KMS keys to encrypt Amazon Redshift target data](#CHAP_Target.Redshift.KMSKeys).
**Note**  
If you specify the ARN of a different IAM role using the `--service-access-role-arn` option of the `create-endpoint` command, this IAM role option takes precedence.

The following example shows how you might use these parameters to specify a custom Amazon S3 bucket in the following `create-endpoint` call using the AWS CLI. 

```
aws dms create-endpoint --endpoint-identifier redshift-target-endpoint --endpoint-type target 
--engine-name redshift --username your-username --password your-password 
--server-name your-server-name --port 5439 --database-name your-db-name 
--redshift-settings '{"ServiceAccessRoleArn": "your-service-access-ARN", 
"BucketName": "your-bucket-name", "BucketFolder": "your-bucket-folder-name"}'
```

## Multithreaded task settings for Amazon Redshift
<a name="CHAP_Target.Redshift.ParallelApply"></a>

You can improve performance of full load and change data capture (CDC) tasks for an Amazon Redshift target endpoint by using multithreaded task settings. They enable you to specify the number of concurrent threads and the number of records to store in a buffer.

### Multithreaded full load task settings for Amazon Redshift
<a name="CHAP_Target.Redshift.ParallelApply.FullLoad"></a>

To promote full load performance, you can use the following `ParallelLoad*` task settings:
+ `ParallelLoadThreads` – Specifies the number of concurrent threads that DMS uses during a full load to push data records to an Amazon Redshift target endpoint. The default value is zero (0) and the maximum value is 32. For more information, see [Full-load task settings](CHAP_Tasks.CustomizingTasks.TaskSettings.FullLoad.md).

  When using the `ParallelLoadThreads` task setting, you can also set the `enableParallelBatchInMemoryCSVFiles` endpoint attribute to `false`. Setting it to `false` improves performance of larger multithreaded full load tasks by having DMS write to disk instead of memory. The default value is `true`.
+ `ParallelLoadBufferSize` – Specifies the maximum number of data record requests to store in the buffer while using parallel load threads with a Redshift target. The default value is 100 and the maximum value is 1,000. We recommend using this setting when `ParallelLoadThreads` is greater than 1.
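
  For example, the following task settings fragment is a sketch of where these settings go, in the `TargetMetadata` section of the task settings JSON (the thread and buffer values are illustrative, not recommendations):

  ```
  "TargetMetadata": {
      "ParallelLoadThreads": 8,
      "ParallelLoadBufferSize": 500
  }
  ```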

**Note**  
Support for the use of `ParallelLoad*` task settings during FULL LOAD to Amazon Redshift target endpoints is available in AWS DMS versions 3.4.5 and higher.  
The `ReplaceInvalidChars` Redshift endpoint setting is not supported for use during change data capture (CDC) or during a parallel load enabled FULL LOAD migration task. It is supported for FULL LOAD migration when parallel load isn't enabled. For more information, see [RedshiftSettings](https://docs.aws.amazon.com/dms/latest/APIReference/API_RedshiftSettings.html) in the *AWS Database Migration Service API Reference*.

### Multithreaded CDC task settings for Amazon Redshift
<a name="CHAP_Target.Redshift.ParallelApply.CDC"></a>

To promote CDC performance, you can use the following `ParallelApply*` task settings:
+ `ParallelApplyThreads` – Specifies the number of concurrent threads that AWS DMS uses during a CDC load to push data records to an Amazon Redshift target endpoint. The default value is zero (0) and the maximum value is 32. The minimum recommended value equals the number of slices in your cluster.
+ `ParallelApplyBufferSize` – Specifies the maximum number of data record requests to store in the buffer while using parallel apply threads with a Redshift target. The default value is 100 and the maximum value is 1,000. We recommend using this setting when `ParallelApplyThreads` is greater than 1.

  To obtain the most benefit for Redshift as a target, we recommend that the value of `ParallelApplyBufferSize` be at least twice the value of `ParallelApplyThreads`.
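
  As a sketch, the following `TargetMetadata` task settings fragment reflects that recommendation, using 32 apply threads and a buffer of 100 records (the values are illustrative):

  ```
  "TargetMetadata": {
      "ParallelApplyThreads": 32,
      "ParallelApplyBufferSize": 100
  }
  ```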

**Note**  
Support for the use of `ParallelApply*` task settings during CDC to Amazon Redshift target endpoints is available in AWS DMS versions 3.4.3 and higher.

The level of parallelism applied depends on the correlation between the total *batch size* and the *maximum file size* used to transfer data. When using multithreaded CDC task settings with a Redshift target, benefits are gained when batch size is large in relation to the maximum file size. For example, you can use the following combination of endpoint and task settings to tune for optimal performance. 

```
// Redshift endpoint setting

MaxFileSize=250000;

// Task settings

BatchApplyEnabled=true;
BatchSplitSize=8000;
BatchApplyTimeoutMax=1800;
BatchApplyTimeoutMin=1800;
ParallelApplyThreads=32;
ParallelApplyBufferSize=100;
```

Using the settings in the previous example, a customer with a heavy transactional workload benefits because the batch buffer, which holds 8,000 records, fills within 1,800 seconds and is applied by 32 parallel threads, with a 250 MB maximum file size.

For more information, see [Change processing tuning settings](CHAP_Tasks.CustomizingTasks.TaskSettings.ChangeProcessingTuning.md).

**Note**  
DMS queries that run during ongoing replication to a Redshift cluster can share the same WLM (workload management) queue with other application queries that are running. So, consider properly configuring WLM properties to influence performance during ongoing replication to a Redshift target. For example, if other parallel ETL queries are running, DMS runs slower and performance gains are lost.

## Target data types for Amazon Redshift
<a name="CHAP_Target.Redshift.DataTypes"></a>

The Amazon Redshift endpoint for AWS DMS supports most Amazon Redshift data types. The following table shows the Amazon Redshift target data types that are supported when using AWS DMS and the default mapping from AWS DMS data types.

For additional information about AWS DMS data types, see [Data types for AWS Database Migration Service](CHAP_Reference.DataTypes.md).


| AWS DMS data types | Amazon Redshift data types | 
| --- | --- | 
| BOOLEAN | BOOL | 
| BYTES | VARCHAR (Length) | 
| DATE | DATE | 
| TIME | VARCHAR(20) | 
| DATETIME |  If the scale is from 0 through 6, then, depending on the Redshift target column type, one of the following: TIMESTAMP (s) or TIMESTAMPTZ (s). If the source timestamp contains a zone offset (such as in SQL Server or Oracle), it converts to UTC on insert/update. If it does not contain an offset, the time is considered to be in UTC already. If the scale is from 7 through 9, then:  VARCHAR (37) | 
| INT1 | INT2 | 
| INT2 | INT2 | 
| INT4 | INT4 | 
| INT8 | INT8 | 
| NUMERIC | If the scale is from 0 through 37, then:  NUMERIC (p,s)  If the scale is from 38 through 127, then:  VARCHAR (Length) | 
| REAL4 | FLOAT4 | 
| REAL8 | FLOAT8 | 
| STRING | If the length is 1–65,535, then use VARCHAR (length in bytes)  If the length is 65,536–2,147,483,647, then use VARCHAR (65535) | 
| UINT1 | INT2 | 
| UINT2 | INT2 | 
| UINT4 | INT4 | 
| UINT8 | NUMERIC (20,0) | 
| WSTRING |  If the length is 1–65,535, then use NVARCHAR (length in bytes)  If the length is 65,536–2,147,483,647, then use NVARCHAR (65535) | 
| BLOB | VARCHAR (maximum LOB size \*2)  The maximum LOB size cannot exceed 31 KB. Amazon Redshift does not support VARCHARs larger than 64 KB. | 
| NCLOB | NVARCHAR (maximum LOB size)  The maximum LOB size cannot exceed 63 KB. Amazon Redshift does not support VARCHARs larger than 64 KB. | 
| CLOB | VARCHAR (maximum LOB size)  The maximum LOB size cannot exceed 63 KB. Amazon Redshift does not support VARCHARs larger than 64 KB. | 

## Using AWS DMS with Amazon Redshift Serverless as a target
<a name="CHAP_Target.Redshift.RSServerless"></a>

AWS DMS supports using Amazon Redshift Serverless as a target endpoint. For information about using Amazon Redshift Serverless, see [Amazon Redshift Serverless](https://docs.aws.amazon.com/redshift/latest/mgmt/working-with-serverless.html) in the [Amazon Redshift Management Guide](https://docs.aws.amazon.com/redshift/latest/mgmt/welcome.html).

This topic describes how to use an Amazon Redshift Serverless endpoint with AWS DMS.

**Note**  
When creating an Amazon Redshift Serverless endpoint, for the **DatabaseName** field of your [RedshiftSettings](https://docs.aws.amazon.com/dms/latest/APIReference/API_RedshiftSettings.html) endpoint configuration, use either the name of the Amazon Redshift data warehouse or the name of the workgroup endpoint. For the **ServerName** field, use the **Endpoint** value displayed on the **Workgroup** page for the serverless cluster (for example, `default-workgroup.093291321484.us-east-1.redshift-serverless.amazonaws.com`). For information about creating an endpoint, see [Creating source and target endpoints](CHAP_Endpoints.Creating.md). For information about the workgroup endpoint, see [Connecting to Amazon Redshift Serverless](https://docs.aws.amazon.com/redshift/latest/mgmt/serverless-connecting.html).
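
For example, a `create-endpoint` call for an Amazon Redshift Serverless target might look like the following sketch, where the server name is the workgroup endpoint value and the identifier, credentials, and database name are placeholders:

```
aws dms create-endpoint --endpoint-identifier redshift-serverless-target --endpoint-type target \
    --engine-name redshift --username your-username --password your-password \
    --server-name default-workgroup.093291321484.us-east-1.redshift-serverless.amazonaws.com \
    --port 5439 --database-name your-db-name
```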

### Trust policy with Amazon Redshift Serverless as a target
<a name="CHAP_Target.Redshift.RSServerless.policy"></a>

When using Amazon Redshift Serverless as a target endpoint, you must add the Amazon Redshift Serverless service principal to the trust policy attached to the `dms-access-for-endpoint` role.

For more information about using a trust policy with AWS DMS, see [Creating the IAM roles to use with AWS DMS](security-iam.md#CHAP_Security.APIRole).
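
The following is a minimal sketch of such a trust policy, assuming the `dms-access-for-endpoint` role must be assumable by both AWS DMS and Amazon Redshift Serverless; confirm the exact service principals in the linked topic before using it:

```
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "Service": [
                    "dms.amazonaws.com",
                    "redshift-serverless.amazonaws.com"
                ]
            },
            "Action": "sts:AssumeRole"
        }
    ]
}
```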

### Limitations when using Amazon Redshift Serverless as a target
<a name="CHAP_Target.Redshift.RSServerless.Limitations"></a>

Using Redshift Serverless as a target has the following limitations:
+ AWS DMS only supports Amazon Redshift Serverless as an endpoint in regions that support Amazon Redshift Serverless. For information about which regions support Amazon Redshift Serverless, see **Redshift Serverless API** in the [Amazon Redshift endpoints and quotas](https://docs.aws.amazon.com/general/latest/gr/redshift-service.html) topic in the [AWS General Reference](https://docs.aws.amazon.com/general/latest/gr/Welcome.html).
+ When using Enhanced VPC Routing, make sure that you create an Amazon S3 endpoint in the same VPC as your Redshift Serverless or Redshift Provisioned cluster. For more information, see [Using enhanced VPC routing with Amazon Redshift as a target for AWS Database Migration Service](#CHAP_Target.Redshift.EnhancedVPC).
+ AWS DMS does not support Enhanced Throughput for Amazon Redshift Serverless as a target. For more information, see [Enhanced Throughput for Full-Load Oracle to Amazon Redshift and Amazon S3 Migrations](CHAP_Serverless.Components.md#CHAP_Serverless.Throughput).
+ AWS DMS does not support connections to Amazon Redshift Serverless when the SSL mode is set to `verify-full`. For connections requiring SSL verification to Amazon Redshift Serverless targets, use alternative SSL modes such as `require` or `verify-ca`.

# Using a SAP ASE database as a target for AWS Database Migration Service
<a name="CHAP_Target.SAP"></a>

You can migrate data to SAP Adaptive Server Enterprise (ASE) databases (formerly known as Sybase) using AWS DMS from any of the supported database sources.

For information about versions of SAP ASE that AWS DMS supports as a target, see [Targets for AWS DMS](CHAP_Introduction.Targets.md).

## Prerequisites for using a SAP ASE database as a target for AWS Database Migration Service
<a name="CHAP_Target.SAP.Prerequisites"></a>

Before you begin to work with a SAP ASE database as a target for AWS DMS, make sure that you have the following prerequisites:
+ Provide SAP ASE account access to the AWS DMS user. This user must have read/write privileges in the SAP ASE database.
+ In some cases, you might replicate to SAP ASE version 15.7 installed on an Amazon EC2 instance on Microsoft Windows that is configured with non-Latin characters (for example, Chinese). In such cases, AWS DMS requires SAP ASE 15.7 SP121 to be installed on the target SAP ASE machine.

## Limitations when using a SAP ASE database as a target for AWS DMS
<a name="CHAP_Target.SAP.Limitations"></a>

The following limitations apply when using a SAP ASE database as a target for AWS DMS:
+ AWS DMS doesn't support tables that include fields with the following data types. Replicated columns with these data types show as null. 
  + User-defined type (UDT)

## Endpoint settings when using SAP ASE as a target for AWS DMS
<a name="CHAP_Target.SAP.ConnectionAttrib"></a>

You can use endpoint settings to configure your SAP ASE target database similar to using extra connection attributes. You specify the settings when you create the target endpoint using the AWS DMS console, or by using the `create-endpoint` command in the [AWS CLI](https://docs.aws.amazon.com/cli/latest/reference/dms/index.html), with the `--sybase-settings '{"EndpointSetting": "value", ...}'` JSON syntax.
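
For example, a `create-endpoint` call for a SAP ASE target might look like the following sketch, where the identifier, credentials, and the `EndpointSetting` name/value pair are placeholders (port 5000 is the common SAP ASE default):

```
aws dms create-endpoint --endpoint-identifier sybase-target-endpoint --endpoint-type target \
    --engine-name sybase --username your-username --password your-password \
    --server-name your-server-name --port 5000 --database-name your-db-name \
    --sybase-settings '{"EndpointSetting": "value"}'
```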

The following table shows the endpoint settings that you can use with SAP ASE as a target.

[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/dms/latest/userguide/CHAP_Target.SAP.html)

## Target data types for SAP ASE
<a name="CHAP_Target.SAP.DataTypes"></a>

The following table shows the SAP ASE database target data types that are supported when using AWS DMS and the default mapping from AWS DMS data types.

For additional information about AWS DMS data types, see [Data types for AWS Database Migration Service](CHAP_Reference.DataTypes.md).


|  AWS DMS data types  |  SAP ASE data types  | 
| --- | --- | 
| BOOLEAN | BIT | 
| BYTES | VARBINARY (Length) | 
| DATE | DATE | 
| TIME | TIME | 
| TIMESTAMP |  If the scale is from 0 through 6, then: BIGDATETIME  If the scale is from 7 through 9, then: VARCHAR (37)  | 
| INT1 | TINYINT | 
| INT2 | SMALLINT | 
| INT4 | INTEGER | 
| INT8 | BIGINT | 
| NUMERIC | NUMERIC (p,s) | 
| REAL4 | REAL | 
| REAL8 | DOUBLE PRECISION | 
| STRING | VARCHAR (Length) | 
| UINT1 | TINYINT | 
| UINT2 | UNSIGNED SMALLINT | 
| UINT4 | UNSIGNED INTEGER | 
| UINT8 | UNSIGNED BIGINT | 
| WSTRING | VARCHAR (Length) | 
| BLOB | IMAGE | 
| CLOB | UNITEXT | 
| NCLOB | TEXT | 

# Using Amazon S3 as a target for AWS Database Migration Service
<a name="CHAP_Target.S3"></a>

You can migrate data to Amazon S3 using AWS DMS from any of the supported database sources. When using Amazon S3 as a target in an AWS DMS task, both full load and change data capture (CDC) data are written in comma-separated value (.csv) format by default. For more compact storage and faster query options, you also have the option to have the data written in Apache Parquet (.parquet) format. 

AWS DMS names files created during a full load using an incremental hexadecimal counter, for example LOAD00001.csv, LOAD00002.csv, and so on through LOAD00009.csv, then LOAD0000A.csv, and so on for .csv files. AWS DMS names CDC files using timestamps, for example 20141029-1134010000.csv. For each source table that contains records, AWS DMS creates a folder under the specified target folder (if the source table is not empty). AWS DMS writes all full load and CDC files to the specified Amazon S3 bucket. You can control the size of the files that AWS DMS creates by using the [MaxFileSize](https://docs.aws.amazon.com/dms/latest/APIReference/API_S3Settings.html#DMS-Type-S3Settings-MaxFileSize) endpoint setting. 

The parameter `bucketFolder` contains the location where the .csv or .parquet files are stored before being uploaded to the S3 bucket. With .csv files, table data is stored in the following format in the S3 bucket, shown with full-load files.

```
database_schema_name/table_name/LOAD00000001.csv
database_schema_name/table_name/LOAD00000002.csv
...
database_schema_name/table_name/LOAD00000009.csv
database_schema_name/table_name/LOAD0000000A.csv
database_schema_name/table_name/LOAD0000000B.csv
...
database_schema_name/table_name/LOAD0000000F.csv
database_schema_name/table_name/LOAD00000010.csv
...
```

You can specify the column delimiter, row delimiter, and other parameters using the extra connection attributes. For more information on the extra connection attributes, see [Endpoint settings when using Amazon S3 as a target for AWS DMS](#CHAP_Target.S3.Configuring) at the end of this section.

To prevent spoofing, AWS DMS validates bucket ownership before performing operations. By default, when the `ExpectedBucketOwner` Amazon S3 endpoint setting is not specified, AWS DMS uses the AWS account ID that owns the AWS DMS service role as the expected bucket owner.

To migrate data to an S3 bucket owned by a different AWS account, you must explicitly specify the actual bucket owner in the `ExpectedBucketOwner` Amazon S3 endpoint setting, as shown following. Otherwise, the cross-account replication task will fail.

```
--s3-settings '{"ExpectedBucketOwner": "AWS_Account_ID"}'
```

When you use AWS DMS to replicate data changes using a CDC task, the first column of the .csv or .parquet output file indicates how the row data was changed as shown for the following .csv file.

```
I,101,Smith,Bob,4-Jun-14,New York
U,101,Smith,Bob,8-Oct-15,Los Angeles
U,101,Smith,Bob,13-Mar-17,Dallas
D,101,Smith,Bob,13-Mar-17,Dallas
```

For this example, suppose that there is an `EMPLOYEE` table in the source database. AWS DMS writes data to the .csv or .parquet file, in response to the following events:
+ A new employee (Bob Smith, employee ID 101) is hired on 4-Jun-14 at the New York office. In the .csv or .parquet file, the `I` in the first column indicates that a new row was `INSERT`ed into the EMPLOYEE table at the source database.
+ On 8-Oct-15, Bob transfers to the Los Angeles office. In the .csv or .parquet file, the `U` indicates that the corresponding row in the EMPLOYEE table was `UPDATE`d to reflect Bob's new office location. The rest of the line reflects the row in the EMPLOYEE table as it appears after the `UPDATE`. 
+ On 13-Mar-17, Bob transfers again to the Dallas office. In the .csv or .parquet file, the `U` indicates that this row was `UPDATE`d again. The rest of the line reflects the row in the EMPLOYEE table as it appears after the `UPDATE`.
+ After some time working in Dallas, Bob leaves the company. In the .csv or .parquet file, the `D` indicates that the row was `DELETE`d in the source table. The rest of the line reflects how the row in the EMPLOYEE table appeared before it was deleted.

Note that by default for CDC, AWS DMS stores the row changes for each database table without regard to transaction order. If you want to store the row changes in CDC files according to transaction order, you need to use S3 endpoint settings to specify this and the folder path where you want the CDC transaction files to be stored on the S3 target. For more information, see [Capturing data changes (CDC) including transaction order on the S3 target](#CHAP_Target.S3.EndpointSettings.CdcPath).
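
For example, the following `--s3-settings` fragment is a sketch of enabling transaction order on the target; the `CdcPath` folder name is illustrative:

```
--s3-settings '{"PreserveTransactions": true, "CdcPath": "cdc-transaction-files"}'
```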

To control the frequency of writes to an Amazon S3 target during a data replication task, you can configure the `cdcMaxBatchInterval` and `cdcMinFileSize` extra connection attributes. This can result in better performance when analyzing the data without any additional overhead operations. For more information, see [Endpoint settings when using Amazon S3 as a target for AWS DMS](#CHAP_Target.S3.Configuring).
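
For example, the following `--s3-settings` fragment is a sketch that writes a file when either 60 seconds elapse or the file reaches 32,000 KB, whichever comes first (the values shown are illustrative):

```
--s3-settings '{"CdcMaxBatchInterval": 60, "CdcMinFileSize": 32000}'
```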

**Topics**
+ [Prerequisites for using Amazon S3 as a target](#CHAP_Target.S3.Prerequisites)
+ [Limitations to using Amazon S3 as a target](#CHAP_Target.S3.Limitations)
+ [Security](#CHAP_Target.S3.Security)
+ [Using Apache Parquet to store Amazon S3 objects](#CHAP_Target.S3.Parquet)
+ [Amazon S3 object tagging](#CHAP_Target.S3.Tagging)
+ [Creating AWS KMS keys to encrypt Amazon S3 target objects](#CHAP_Target.S3.KMSKeys)
+ [Using date-based folder partitioning](#CHAP_Target.S3.DatePartitioning)
+ [Parallel load of partitioned sources when using Amazon S3 as a target for AWS DMS](#CHAP_Target.S3.ParallelLoad)
+ [Endpoint settings when using Amazon S3 as a target for AWS DMS](#CHAP_Target.S3.Configuring)
+ [Using AWS Glue Data Catalog with an Amazon S3 target for AWS DMS](#CHAP_Target.S3.GlueCatalog)
+ [Using data encryption, parquet files, and CDC on your Amazon S3 target](#CHAP_Target.S3.EndpointSettings)
+ [Indicating source DB operations in migrated S3 data](#CHAP_Target.S3.Configuring.InsertOps)
+ [Target data types for S3 Parquet](#CHAP_Target.S3.DataTypes)

## Prerequisites for using Amazon S3 as a target
<a name="CHAP_Target.S3.Prerequisites"></a>

Before using Amazon S3 as a target, check that the following are true: 
+ The S3 bucket that you're using as a target is in the same AWS Region as the DMS replication instance you are using to migrate your data.
+ The AWS account that you use for the migration has an IAM role with write and delete access to the S3 bucket you are using as a target.
+ This role has tagging access so you can tag any S3 objects written to the target bucket.
+ The IAM role has AWS DMS (`dms.amazonaws.com`) added as a *trusted entity*. 
+ For AWS DMS version 3.4.7 and higher, DMS must access the source bucket through a VPC endpoint or a public route. For information about VPC endpoints, see [Configuring VPC endpoints for AWS DMS](CHAP_VPC_Endpoints.md).

To set up this account access, ensure that the role assigned to the user account used to create the migration task has the following set of permissions.

------
#### [ JSON ]


```
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "s3:PutObject",
                "s3:DeleteObject",
                "s3:PutObjectTagging"
            ],
            "Resource": [
                "arn:aws:s3:::buckettest2/*"
            ]
        },
        {
            "Effect": "Allow",
            "Action": [
                "s3:ListBucket"
            ],
            "Resource": [
                "arn:aws:s3:::buckettest2"
            ]
        }
    ]
}
```

------

For prerequisites for using validation with S3 as a target, see [S3 target validation prerequisites](CHAP_Validating_S3.md#CHAP_Validating_S3_prerequisites).

## Limitations to using Amazon S3 as a target
<a name="CHAP_Target.S3.Limitations"></a>

The following limitations apply when using Amazon S3 as a target:
+ Don’t enable versioning for S3. If you need S3 versioning, use lifecycle policies to actively delete old versions. Otherwise, you might encounter endpoint test connection failures because of an S3 `list-object` call timeout. To create a lifecycle policy for an S3 bucket, see [ Managing your storage lifecycle](https://docs.aws.amazon.com/AmazonS3/latest/userguide/object-lifecycle-mgmt.html). To delete a version of an S3 object, see [ Deleting object versions from a versioning-enabled bucket](https://docs.aws.amazon.com/AmazonS3/latest/dev/DeletingObjectVersions.html).
+ An S3 bucket accessed through a gateway VPC endpoint is supported in versions 3.4.7 and higher.
+ The following data definition language (DDL) commands are supported for change data capture (CDC): Truncate Table, Drop Table, Create Table, Rename Table, Add Column, Drop Column, Rename Column, and Change Column Data Type. Note that when a column is added, dropped, or renamed on the source database, no ALTER statement is recorded in the target S3 bucket, and AWS DMS does not alter previously created records to match the new structure. After the change, AWS DMS creates any new records using the new table structure.
**Note**  
A truncate DDL operation removes all files and corresponding table folders from an S3 bucket. You can use task settings to disable that behavior and configure the way DMS handles DDL behavior during change data capture (CDC). For more information, see [Task settings for change processing DDL handling](CHAP_Tasks.CustomizingTasks.TaskSettings.DDLHandling.md).
+ Full LOB mode is not supported.
+ Changes to the source table structure during full load are not supported. Changes to data are supported during full load.
+ Multiple tasks that replicate data from the same source table to the same target S3 endpoint bucket result in those tasks writing to the same file. We recommend that you specify different target endpoints (buckets) if your data source is from the same table.
+ `BatchApply` is not supported for an S3 endpoint. Using Batch Apply (for example, the `BatchApplyEnabled` target metadata task setting) for an S3 target might result in loss of data.
+ You can't use `DatePartitionEnabled` or `addColumnName` together with `PreserveTransactions` or `CdcPath`.
+ AWS DMS doesn't support renaming multiple source tables to the same target folder using transformation rules.
+ If there is intensive writing to the source table during the full load phase, DMS may write duplicate records to the S3 bucket or cached changes.
+ If you configure the task with a `TargetTablePrepMode` of `DO_NOTHING`, DMS may write duplicate records to the S3 bucket if the task stops and resumes abruptly during the full load phase.
+ If you configure the target endpoint with a `PreserveTransactions` setting of `true`, reloading a table doesn't clear previously generated CDC files. For more information, see [Capturing data changes (CDC) including transaction order on the S3 target](#CHAP_Target.S3.EndpointSettings.CdcPath).

For limitations for using validation with S3 as a target, see [Limitations for using S3 target validation](CHAP_Validating_S3.md#CHAP_Validating_S3_limitations).

## Security
<a name="CHAP_Target.S3.Security"></a>

To use Amazon S3 as a target, the account used for the migration must have write and delete access to the Amazon S3 bucket that is used as the target. Specify the Amazon Resource Name (ARN) of an IAM role that has the permissions required to access Amazon S3. 

AWS DMS supports a set of predefined grants for Amazon S3, known as canned access control lists (ACLs). Each canned ACL has a set of grantees and permissions that you can use to set permissions for the Amazon S3 bucket. You can specify a canned ACL using the `cannedAclForObjects` extra connection attribute for your S3 target endpoint. For more information about using the extra connection attribute `cannedAclForObjects`, see [Endpoint settings when using Amazon S3 as a target for AWS DMS](#CHAP_Target.S3.Configuring). For more information about Amazon S3 canned ACLs, see [Canned ACL](http://docs.aws.amazon.com/AmazonS3/latest/dev/acl-overview.html#canned-acl).

The IAM role that you use for the migration must be able to perform the `s3:PutObjectAcl` API operation.

## Using Apache Parquet to store Amazon S3 objects
<a name="CHAP_Target.S3.Parquet"></a>

The comma-separated value (.csv) format is the default storage format for Amazon S3 target objects. For more compact storage and faster queries, you can instead use Apache Parquet (.parquet) as the storage format.

Apache Parquet is an open-source file storage format originally designed for Hadoop. For more information on Apache Parquet, see [https://parquet.apache.org/](https://parquet.apache.org/).

To set .parquet as the storage format for your migrated S3 target objects, you can use the following mechanisms:
+ Endpoint settings that you provide as parameters of a JSON object when you create the endpoint using the AWS CLI or the API for AWS DMS. For more information, see [Using data encryption, parquet files, and CDC on your Amazon S3 target](#CHAP_Target.S3.EndpointSettings).
+ Extra connection attributes that you provide as a semicolon-separated list when you create the endpoint. For more information, see [Endpoint settings when using Amazon S3 as a target for AWS DMS](#CHAP_Target.S3.Configuring).
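
For example, the following `--s3-settings` fragment is a sketch of selecting the .parquet storage format through endpoint settings:

```
--s3-settings '{"DataFormat": "parquet", "ParquetVersion": "parquet-2-0"}'
```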

## Amazon S3 object tagging
<a name="CHAP_Target.S3.Tagging"></a>

You can tag Amazon S3 objects that a replication instance creates by specifying appropriate JSON objects as part of task-table mapping rules. For more information about requirements and options for S3 object tagging, including valid tag names, see [Object tagging](https://docs.aws.amazon.com/AmazonS3/latest/userguide/object-tagging.html) in the *Amazon Simple Storage Service User Guide*. For more information about table mapping using JSON, see [Specifying table selection and transformations rules using JSON](CHAP_Tasks.CustomizingTasks.TableMapping.SelectionTransformation.md).

You tag S3 objects created for specified tables and schemas by using one or more JSON objects of the `selection` rule type. You then follow this `selection` object (or objects) by one or more JSON objects of the `post-processing` rule type with `add-tag` action. These post-processing rules identify the S3 objects that you want to tag and specify the names and values of the tags that you want to add to these S3 objects.

You can find the parameters to specify in JSON objects of the `post-processing` rule type in the following table.

[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/dms/latest/userguide/CHAP_Target.S3.html)

When you specify multiple `post-processing` rule types to tag a selection of S3 objects, each S3 object is tagged using only one `tag-set` object from one post-processing rule. The particular tag set used to tag a given S3 object is the one from the post-processing rule whose associated object locator best matches that S3 object. 

For example, suppose that two post-processing rules identify the same S3 object. Suppose also that the object locator from one rule uses wildcards and the object locator from the other rule uses an exact match to identify the S3 object (without wildcards). In this case, the tag set associated with the post-processing rule with the exact match is used to tag the S3 object. If multiple post-processing rules match a given S3 object equally well, the tag set associated with the first such post-processing rule is used to tag the object.

**Example Adding static tags to an S3 object created for a single table and schema**  
The following selection and post-processing rules add three tags (`tag_1`, `tag_2`, and `tag_3` with corresponding static values `value_1`, `value_2`, and `value_3`) to a created S3 object. This S3 object corresponds to a single table in the source named `STOCK` with a schema named `aat2`.  

```
{
    "rules": [
        {
            "rule-type": "selection",
            "rule-id": "5",
            "rule-name": "5",
            "object-locator": {
                "schema-name": "aat2",
                "table-name": "STOCK"
            },
            "rule-action": "include"
        },
        {
            "rule-type": "post-processing",
            "rule-id": "41",
            "rule-name": "41",
            "rule-action": "add-tag",
            "object-locator": {
                "schema-name": "aat2",
                "table-name": "STOCK"
            },
            "tag-set": [
              {
                "key": "tag_1",
                "value": "value_1"
              },
              {
                "key": "tag_2",
                "value": "value_2"
              },
              {
                "key": "tag_3",
                "value": "value_3"
              }
           ]
        }
    ]
}
```

**Example Adding static and dynamic tags to S3 objects created for multiple tables and schemas**  
The following example has one selection and two post-processing rules, where input from the source includes all tables and all of their schemas.  

```
{
    "rules": [
        {
            "rule-type": "selection",
            "rule-id": "1",
            "rule-name": "1",
            "object-locator": {
                "schema-name": "%",
                "table-name": "%"
            },
            "rule-action": "include"
        },
        {
            "rule-type": "post-processing",
            "rule-id": "21",
            "rule-name": "21",
            "rule-action": "add-tag",
            "object-locator": {
                "schema-name": "%",
                "table-name": "%"
            },
            "tag-set": [
              { 
                "key": "dw-schema-name",
                "value":"${schema-name}"
              },
              {
                "key": "dw-schema-table",
                "value": "my_prefix_${table-name}"
              }
            ]
        },
        {
            "rule-type": "post-processing",
            "rule-id": "41",
            "rule-name": "41",
            "rule-action": "add-tag",
            "object-locator": {
                "schema-name": "aat",
                "table-name": "ITEM"
            },
            "tag-set": [
              {
                "key": "tag_1",
                "value": "value_1"
              },
              {
                "key": "tag_2",
                "value": "value_2"
              }           ]
        }
    ]
}
```
The first post-processing rule adds two tags (`dw-schema-name` and `dw-schema-table`) with corresponding dynamic values (`${schema-name}` and `my_prefix_${table-name}`) to almost all S3 objects created in the target. The exception is the S3 object identified and tagged with the second post-processing rule. Thus, each target S3 object identified by the wildcard object locator is created with tags that identify the schema and table to which it corresponds in the source.  
The second post-processing rule adds `tag_1` and `tag_2` with corresponding static values `value_1` and `value_2` to a created S3 object that is identified by an exact-match object locator. This created S3 object thus corresponds to the single table in the source named `ITEM` with a schema named `aat`. Because of the exact match, these tags replace any tags on this object added from the first post-processing rule, which matches S3 objects by wildcard only.

**Example Adding both dynamic tag names and values to S3 objects**  
The following example has two selection rules and one post-processing rule. Here, input from the source includes just the `ITEM` table in either the `retail` or `wholesale` schema.  

```
{
    "rules": [
        {
            "rule-type": "selection",
            "rule-id": "1",
            "rule-name": "1",
            "object-locator": {
                "schema-name": "retail",
                "table-name": "ITEM"
            },
            "rule-action": "include"
        },
        {
            "rule-type": "selection",
            "rule-id": "1",
            "rule-name": "1",
            "object-locator": {
                "schema-name": "wholesale",
                "table-name": "ITEM"
            },
            "rule-action": "include"
        },
        {
            "rule-type": "post-processing",
            "rule-id": "21",
            "rule-name": "21",
            "rule-action": "add-tag",
            "object-locator": {
                "schema-name": "%",
                "table-name": "ITEM",
            },
            "tag-set": [
              { 
                "key": "dw-schema-name",
                "value":"${schema-name}"
              },
              {
                "key": "dw-schema-table",
                "value": "my_prefix_ITEM"
              },
              {
                "key": "${schema-name}_ITEM_tag_1",
                "value": "value_1"
              },
              {
                "key": "${schema-name}_ITEM_tag_2",
                "value": "value_2"
              }
            ]
        }
    ]
}
```
The tag set for the post-processing rule adds two tags (`dw-schema-name` and `dw-schema-table`) to all S3 objects created for the `ITEM` table in the target. The first tag has the dynamic value `"${schema-name}"` and the second tag has a static value, `"my_prefix_ITEM"`. Thus, each target S3 object is created with tags that identify the schema and table to which it corresponds in the source.   
In addition, the tag set adds two additional tags with dynamic names (`${schema-name}_ITEM_tag_1` and `${schema-name}_ITEM_tag_2`). These have the corresponding static values `value_1` and `value_2`. Thus, these tags are each named for the current schema, `retail` or `wholesale`. You can't create a duplicate dynamic tag name in this object, because each object is created for a single unique schema name. The schema name is used to create an otherwise unique tag name.
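To illustrate how the dynamic `${schema-name}` and `${table-name}` placeholders resolve per S3 object, the following Python sketch expands a tag set for one source table. The `resolve_tags` helper is hypothetical, not part of AWS DMS; it only models the substitution described above.

```python
# Illustrative sketch of how ${schema-name} and ${table-name} placeholders
# resolve for one S3 object. The resolve_tags helper is hypothetical and
# not part of AWS DMS.
def resolve_tags(tag_set, schema_name, table_name):
    """Substitute dynamic placeholders in a tag set for one source table."""
    def expand(text):
        return (text.replace("${schema-name}", schema_name)
                    .replace("${table-name}", table_name))
    return [{"key": expand(t["key"]), "value": expand(t["value"])}
            for t in tag_set]

tag_set = [
    {"key": "dw-schema-name", "value": "${schema-name}"},
    {"key": "${schema-name}_ITEM_tag_1", "value": "value_1"},
]

# For the retail.ITEM object, the tag names and values resolve per schema.
print(resolve_tags(tag_set, "retail", "ITEM"))
```

Running the same tag set against the `wholesale` schema produces differently named tags, which is why no duplicate tag names arise within a single object.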

## Creating AWS KMS keys to encrypt Amazon S3 target objects
<a name="CHAP_Target.S3.KMSKeys"></a>

You can create and use custom AWS KMS keys to encrypt your Amazon S3 target objects. After you create a KMS key, you can use it to encrypt objects using one of the following approaches when you create the S3 target endpoint:
+ Use the following options for S3 target objects (with the default .csv file storage format) when you run the `create-endpoint` command using the AWS CLI.

  ```
  --s3-settings '{"ServiceAccessRoleArn": "your-service-access-ARN", 
  "CsvRowDelimiter": "\n", "CsvDelimiter": ",", "BucketFolder": "your-bucket-folder", 
  "BucketName": "your-bucket-name", "EncryptionMode": "SSE_KMS", 
  "ServerSideEncryptionKmsKeyId": "your-KMS-key-ARN"}'
  ```

  Here, `your-KMS-key-ARN` is the Amazon Resource Name (ARN) for your KMS key. Your IAM role must also have permission to use this key. For more information, see [Using data encryption, parquet files, and CDC on your Amazon S3 target](#CHAP_Target.S3.EndpointSettings).
+ Set the extra connection attribute `encryptionMode` to the value `SSE_KMS` and the extra connection attribute `serverSideEncryptionKmsKeyId` to the ARN for your KMS key. For more information, see [Endpoint settings when using Amazon S3 as a target for AWS DMS](#CHAP_Target.S3.Configuring).
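When scripting endpoint creation, it can help to build the `--s3-settings` JSON programmatically instead of hand-writing it. The following is a minimal sketch; the ARNs, bucket, and folder names are placeholders, not real resources.

```python
import json

# Sketch: build the --s3-settings JSON for an SSE_KMS-encrypted S3 target.
# All ARNs, bucket names, and folder names below are placeholders.
s3_settings = {
    "ServiceAccessRoleArn": "arn:aws:iam::111122223333:role/DMS-S3-endpoint-access-role",
    "CsvRowDelimiter": "\n",
    "CsvDelimiter": ",",
    "BucketFolder": "your-bucket-folder",
    "BucketName": "your-bucket-name",
    "EncryptionMode": "SSE_KMS",
    "ServerSideEncryptionKmsKeyId": "arn:aws:kms:us-east-1:111122223333:key/example-key-id",
}

# Serialize for use as the value passed to --s3-settings in create-endpoint.
print(json.dumps(s3_settings))
```

Generating the JSON this way avoids quoting mistakes such as unquoted values or trailing commas, which the CLI rejects.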

To encrypt Amazon S3 target objects using a KMS key, you need an IAM role that has permissions to access the Amazon S3 bucket. This IAM role is then referenced in a key policy attached to the encryption key that you create. You can do this in your IAM console by creating the following:
+ An IAM policy with permissions to access the Amazon S3 bucket.
+ An IAM role that uses this policy.
+ A KMS encryption key with a key policy that references this role.

The following procedures describe how to do this.

**To create an IAM policy with permissions to access the Amazon S3 bucket**

1. Open the IAM console at [https://console.aws.amazon.com/iam/](https://console.aws.amazon.com/iam/).

1. In the navigation pane, choose **Policies**. The **Policies** page opens.

1. Choose **Create policy**. The **Create policy** page opens.

1. Choose **Service** and choose **S3**. A list of action permissions appears.

1. Choose **Expand all** to expand the list and choose the following permissions at a minimum:
   + **ListBucket**
   + **PutObject**
   + **DeleteObject**

   Choose any other permissions you need, and then choose **Collapse all** to collapse the list.

1. Choose **Resources** to specify the resources that you want to access. At a minimum, choose **All resources** to provide general Amazon S3 resource access.

1. Add any other conditions or permissions you need, then choose **Review policy**. Check your results on the **Review policy** page.

1. If the settings are what you need, enter a name for the policy (for example, `DMS-S3-endpoint-access`), and any description, then choose **Create policy**. The **Policies** page opens with a message indicating that your policy has been created.

1. Search for and choose the policy name in the **Policies** list. The **Summary** page appears displaying JSON for the policy similar to the following.


   ```
   {
       "Version":"2012-10-17",		 	 	 
       "Statement": [
           {
               "Sid": "VisualEditor0",
               "Effect": "Allow",
               "Action": [
                   "s3:PutObject",
                   "s3:ListBucket",
                   "s3:DeleteObject"
               ],
               "Resource": "*"
           }
       ]
   }
   ```


You have now created the new policy to access Amazon S3 resources for encryption with a specified name, for example `DMS-S3-endpoint-access`.

**To create an IAM role with this policy**

1. On your IAM console, choose **Roles** in the navigation pane. The **Roles** detail page opens.

1. Choose **Create role**. The **Create role** page opens.

1. With AWS service selected as the trusted entity, choose **DMS** as the service to use the IAM role.

1. Choose **Next: Permissions**. The **Attach permissions policies** view appears in the **Create role** page.

1. Find and select the IAM policy for the IAM role that you created in the previous procedure (`DMS-S3-endpoint-access`).

1. Choose **Next: Tags**. The **Add tags** view appears in the **Create role** page. Here, you can add any tags you want.

1. Choose **Next: Review**. The **Review** view appears in the **Create role** page. Here, you can verify the results.

1. If the settings are what you need, enter a name for the role (required, for example, `DMS-S3-endpoint-access-role`), and any additional description, then choose **Create role**. The **Roles** detail page opens with a message indicating that your role has been created.

You have now created the new role to access Amazon S3 resources for encryption with a specified name, for example, `DMS-S3-endpoint-access-role`.

**To create a KMS encryption key with a key policy that references your IAM role**
**Note**  
For more information about how AWS DMS works with AWS KMS encryption keys, see [Setting an encryption key and specifying AWS KMS permissions](CHAP_Security.md#CHAP_Security.EncryptionKey).

1. Sign in to the AWS Management Console and open the AWS Key Management Service (AWS KMS) console at [https://console.aws.amazon.com/kms](https://console.aws.amazon.com/kms).

1. To change the AWS Region, use the Region selector in the upper-right corner of the page.

1. In the navigation pane, choose **Customer managed keys**.

1. Choose **Create key**. The **Configure key** page opens.

1. For **Key type**, choose **Symmetric**.
**Note**  
When you create this key, you can only create a symmetric key, because AWS services that integrate with AWS KMS, such as Amazon S3, work only with symmetric encryption keys.

1. Choose **Advanced Options**. For **Key material origin**, make sure that **KMS** is chosen, then choose **Next**. The **Add labels** page opens.

1. For **Create alias and description**, enter an alias for the key (for example, `DMS-S3-endpoint-encryption-key`) and any additional description.

1. For **Tags**, add any tags that you want to help identify the key and track its usage, then choose **Next**. The **Define key administrative permissions** page opens showing a list of users and roles that you can choose from.

1. Add the users and roles that you want to manage the key. Make sure that these users and roles have the required permissions to manage the key. 

1. For **Key deletion**, choose whether key administrators can delete the key, then choose **Next**. The **Define key usage permissions** page opens showing an additional list of users and roles that you can choose from.

1. For **This account**, choose the users that you want to be able to perform cryptographic operations on Amazon S3 targets. Also choose the role that you previously created in **Roles** (for example, `DMS-S3-endpoint-access-role`) to enable access to encrypt Amazon S3 target objects.

1. If you want to add other accounts not listed to have this same access, for **Other AWS accounts**, choose **Add another AWS account**, then choose **Next**. The **Review and edit key policy** page opens, showing the JSON for the key policy that you can review and edit by typing into the existing JSON. Here, you can see where the key policy references the role and users (for example, `Admin` and `User1`) that you chose in the previous step. You can also see the different key actions permitted for the different principals (users and roles), as shown in the example following.


   ```
   {
       "Id": "key-consolepolicy-3",
       "Version":"2012-10-17",		 	 	 
       "Statement": [
           {
               "Sid": "Enable IAM User Permissions",
               "Effect": "Allow",
               "Principal": {
                   "AWS": [
                       "arn:aws:iam::111122223333:root"
                   ]
               },
               "Action": "kms:*",
               "Resource": "*"
           },
           {
               "Sid": "Allow access for Key Administrators",
               "Effect": "Allow",
               "Principal": {
                   "AWS": [
                       "arn:aws:iam::111122223333:role/Admin"
                   ]
               },
               "Action": [
                   "kms:Create*",
                   "kms:Describe*",
                   "kms:Enable*",
                   "kms:List*",
                   "kms:Put*",
                   "kms:Update*",
                   "kms:Revoke*",
                   "kms:Disable*",
                   "kms:Get*",
                   "kms:Delete*",
                   "kms:TagResource",
                   "kms:UntagResource",
                   "kms:ScheduleKeyDeletion",
                   "kms:CancelKeyDeletion"
               ],
               "Resource": "*"
           },
           {
               "Sid": "Allow use of the key",
               "Effect": "Allow",
               "Principal": {
                   "AWS": [
                       "arn:aws:iam::111122223333:role/DMS-S3-endpoint-access-role",
                       "arn:aws:iam::111122223333:role/Admin",
                       "arn:aws:iam::111122223333:role/User1"
                   ]
               },
               "Action": [
                   "kms:Encrypt",
                   "kms:Decrypt",
                   "kms:ReEncrypt*",
                   "kms:GenerateDataKey*",
                   "kms:DescribeKey"
               ],
               "Resource": "*"
           },
           {
               "Sid": "Allow attachment of persistent resources",
               "Effect": "Allow",
               "Principal": {
                   "AWS": [
                       "arn:aws:iam::111122223333:role/DMS-S3-endpoint-access-role",
                       "arn:aws:iam::111122223333:role/Admin",
                       "arn:aws:iam::111122223333:role/User1"
                   ]
               },
               "Action": [
                   "kms:CreateGrant",
                   "kms:ListGrants",
                   "kms:RevokeGrant"
               ],
               "Resource": "*",
               "Condition": {
                   "Bool": {
                       "kms:GrantIsForAWSResource": true
                   }
               }
           }
       ]
   }
   ```


1. Choose **Finish**. The **Encryption keys** page opens with a message indicating that your KMS key has been created.

You have now created a new KMS key with a specified alias (for example, `DMS-S3-endpoint-encryption-key`). This key enables AWS DMS to encrypt Amazon S3 target objects.

## Using date-based folder partitioning
<a name="CHAP_Target.S3.DatePartitioning"></a>

AWS DMS supports S3 folder partitions based on a transaction commit date when you use Amazon S3 as your target endpoint. Using date-based folder partitioning, you can write data from a single source table to a time-hierarchy folder structure in an S3 bucket. By partitioning folders when creating an S3 target endpoint, you can do the following:
+ Better manage your S3 objects
+ Limit the size of each S3 folder
+ Optimize data lake queries or other subsequent operations

You can enable date-based folder partitioning when you create an S3 target endpoint. You can enable it when you either migrate existing data and replicate ongoing changes (full load and CDC), or replicate data changes only (CDC only). When you migrate existing data and replicate ongoing changes, only the ongoing changes are partitioned. Use the following target endpoint settings:
+ `DatePartitionEnabled` – Specifies partitioning based on dates. Set this Boolean option to `true` to partition S3 bucket folders based on transaction commit dates. 

  You can't use this setting with `PreserveTransactions` or `CdcPath`.

  The default value is `false`. 
+ `DatePartitionSequence` – Identifies the sequence of the date format to use during folder partitioning. Set this ENUM option to `YYYYMMDD`, `YYYYMMDDHH`, `YYYYMM`, `MMYYYYDD`, or `DDMMYYYY`. The default value is `YYYYMMDD`. Use this setting when `DatePartitionEnabled` is set to `true`.
+ `DatePartitionDelimiter` – Specifies a date separation delimiter to use during folder partitioning. Set this ENUM option to `SLASH`, `DASH`, `UNDERSCORE`, or `NONE`. The default value is `SLASH`. Use this setting when `DatePartitionEnabled` is set to `true`.
+ `DatePartitionTimezone` – When creating an S3 target endpoint, set `DatePartitionTimezone` to convert the current UTC time into a specified time zone. The conversion occurs when a date partition folder is created and a CDC filename is generated. The time zone format is Area/Location. Use this parameter when `DatePartitionEnabled` is set to `true`, as shown in the following example:

  ```
  s3-settings='{"DatePartitionEnabled": true, "DatePartitionSequence": "YYYYMMDDHH", "DatePartitionDelimiter": "SLASH", "DatePartitionTimezone":"Asia/Seoul", "BucketName": "dms-nattarat-test"}'
  ```

The following example shows how to enable date-based folder partitioning, with default values for the date partition sequence and the delimiter. It uses the `--s3-settings '{json-settings}'` option of the AWS CLI `create-endpoint` command. 

```
   --s3-settings '{"DatePartitionEnabled": true,"DatePartitionSequence": "YYYYMMDD","DatePartitionDelimiter": "SLASH"}'
```
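To make the combined effect of `DatePartitionSequence` and `DatePartitionDelimiter` concrete, the following sketch renders the partition folder name for a given commit date. This is a simplified approximation of the naming scheme, not DMS's exact implementation, and it covers only a subset of the sequence values.

```python
from datetime import date

# Simplified approximation of the date-based partition folder name; the
# real DMS path layout may differ in detail. Covers a subset of sequences.
DELIMITERS = {"SLASH": "/", "DASH": "-", "UNDERSCORE": "_", "NONE": ""}

def partition_folder(commit_date, sequence="YYYYMMDD", delimiter="SLASH"):
    """Render the partition folder path for one transaction commit date."""
    sep = DELIMITERS[delimiter]
    parts = {
        "YYYYMMDD": [f"{commit_date:%Y}", f"{commit_date:%m}", f"{commit_date:%d}"],
        "YYYYMM":   [f"{commit_date:%Y}", f"{commit_date:%m}"],
        "DDMMYYYY": [f"{commit_date:%d}", f"{commit_date:%m}", f"{commit_date:%Y}"],
    }[sequence]
    return sep.join(parts)

print(partition_folder(date(2024, 1, 31)))                    # 2024/01/31
print(partition_folder(date(2024, 1, 31), "YYYYMM", "DASH"))  # 2024-01
```

With the defaults, a change committed on January 31, 2024 lands under a `2024/01/31` folder hierarchy in the bucket.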

## Parallel load of partitioned sources when using Amazon S3 as a target for AWS DMS
<a name="CHAP_Target.S3.ParallelLoad"></a>

You can configure a parallel full load of partitioned data sources to Amazon S3 targets. This approach improves the load times for migrating partitioned data from supported source database engines to the S3 target. To improve the load times of partitioned source data, you create S3 target subfolders mapped to the partitions of every table in the source database. These partition-bound subfolders allow AWS DMS to run parallel processes to populate each subfolder on the target.

To configure a parallel full load of an S3 target, Amazon S3 supports three `parallel-load` rule types in the `table-settings` rule of table mapping:
+ `partitions-auto`
+ `partitions-list`
+ `ranges`

For more information on these parallel-load rule types, see [Table and collection settings rules and operations](CHAP_Tasks.CustomizingTasks.TableMapping.SelectionTransformation.Tablesettings.md).

For the `partitions-auto` and `partitions-list` rule types, AWS DMS uses each partition name from the source endpoint to identify the target subfolder structure, as follows.

```
bucket_name/bucket_folder/database_schema_name/table_name/partition_name/LOADseq_num.csv
```

Here, the subfolder path where data is migrated and stored on the S3 target includes an additional `partition_name` subfolder that corresponds to a source partition with the same name. This `partition_name` subfolder then stores one or more `LOADseq_num.csv` files containing data migrated from the specified source partition. Here, `seq_num` is the sequence number postfix on the .csv file name, such as `00000001` in the .csv file with the name, `LOAD00000001.csv`.
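The path composition described above can be sketched as follows; the folder, schema, table, and partition names here are placeholders for illustration only.

```python
# Sketch of the S3 object key for one parallel-loaded partition file.
# All names passed in are placeholders, not real resources.
def target_key(bucket_folder, schema, table, partition, seq_num):
    """Compose the S3 key: folder/schema/table/partition/LOAD<seq>.csv."""
    return f"{bucket_folder}/{schema}/{table}/{partition}/LOAD{seq_num:08d}.csv"

print(target_key("dms-folder", "hr", "EMPLOYEES", "P2024Q1", 1))
# dms-folder/hr/EMPLOYEES/P2024Q1/LOAD00000001.csv
```

The zero-padded eight-digit sequence number matches the `LOAD00000001.csv` naming shown above.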

However, some database engines, such as MongoDB and DocumentDB, don't have the concept of partitions. For these database engines, AWS DMS adds the running source segment index as a prefix to the target .csv file name, as follows.

```
.../database_schema_name/table_name/SEGMENT1_LOAD00000001.csv
.../database_schema_name/table_name/SEGMENT1_LOAD00000002.csv
...
.../database_schema_name/table_name/SEGMENT2_LOAD00000009.csv
.../database_schema_name/table_name/SEGMENT3_LOAD0000000A.csv
```

Here, the files `SEGMENT1_LOAD00000001.csv` and `SEGMENT1_LOAD00000002.csv` share the running source segment index prefix `SEGMENT1`, because the migrated source data in both files is associated with the same running source segment index. In contrast, the migrated data stored in the target `SEGMENT2_LOAD00000009.csv` and `SEGMENT3_LOAD0000000A.csv` files is associated with different running source segment indexes, so each file name is prefixed with the name of its own running segment index, `SEGMENT2` and `SEGMENT3`.

For the `ranges` parallel-load type, you define the column names and column values using the `columns` and `boundaries` settings of the `table-settings` rules. With these rules, you can specify partitions corresponding to segment names, as follows.

```
"parallel-load": {
    "type": "ranges",
    "columns": [
         "region",
         "sale"
    ],
    "boundaries": [
          [
               "NORTH",
               "1000"
          ],
          [
               "WEST",
               "3000"
          ]
    ],
    "segment-names": [
          "custom_segment1",
          "custom_segment2",
          "custom_segment3"
    ]
}
```

Here, the `segment-names` setting defines names for three partitions to migrate data in parallel on the S3 target. The migrated data is parallel-loaded and stored in .csv files under the partition subfolders in order, as follows.

```
.../database_schema_name/table_name/custom_segment1/LOAD[00000001...].csv
.../database_schema_name/table_name/custom_segment2/LOAD[00000001...].csv
.../database_schema_name/table_name/custom_segment3/LOAD[00000001...].csv
```

Here, AWS DMS stores a series of .csv files in each of the three partition subfolders. The series of .csv files in each partition subfolder is named incrementally starting from `LOAD00000001.csv` until all the data is migrated.
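Conceptually, the `boundaries` list splits rows by comparing the listed columns against each boundary tuple in order. The following sketch illustrates that segmentation idea for the example rule above; it is a conceptual illustration only, not DMS's exact comparison semantics (for instance, DMS compares typed column values, whereas this sketch uses plain tuples).

```python
# Conceptual illustration of range-based segmentation for the example
# ranges rule above. Not DMS's exact comparison semantics.
boundaries = [("NORTH", 1000), ("WEST", 3000)]
segment_names = ["custom_segment1", "custom_segment2", "custom_segment3"]

def segment_for(row):
    """Return the segment name for a (region, sale) row.

    Rows at or below the first boundary go to the first segment, rows at
    or below the second boundary to the second, and the rest to the last.
    """
    for name, bound in zip(segment_names, boundaries):
        if row <= bound:
            return name
    return segment_names[-1]

print(segment_for(("EAST", 500)))    # custom_segment1
print(segment_for(("SOUTH", 2000)))  # custom_segment2
print(segment_for(("WEST", 9000)))   # custom_segment3
```

Each segment's rows are then parallel-loaded into the correspondingly named subfolder shown above.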

In some cases, you might not explicitly name partition subfolders for a `ranges` parallel-load type using the `segment-names` setting. In these cases, AWS DMS applies the default of creating each series of .csv files under its `table_name` subfolder. Here, AWS DMS prefixes the file names of each series of .csv files with the name of the running source segment index, as follows.

```
.../database_schema_name/table_name/SEGMENT1_LOAD[00000001...].csv
.../database_schema_name/table_name/SEGMENT2_LOAD[00000001...].csv
.../database_schema_name/table_name/SEGMENT3_LOAD[00000001...].csv
...
.../database_schema_name/table_name/SEGMENTZ_LOAD[00000001...].csv
```

## Endpoint settings when using Amazon S3 as a target for AWS DMS
<a name="CHAP_Target.S3.Configuring"></a>

You can use endpoint settings to configure your Amazon S3 target, similar to using extra connection attributes. You specify the settings when you create the target endpoint using the AWS DMS console, or by using the `create-endpoint` command in the [AWS CLI](https://docs.aws.amazon.com/cli/latest/reference/dms/index.html), with the `--s3-settings '{"EndpointSetting": "value", ...}'` JSON syntax.

**Note**  
DMS writes changes to Parquet files based on the commit order from the source database. However, when you migrate multiple tables, the original transaction order is not preserved because of table-level partitioning. To keep transaction sequence information, configure the `TimestampColumnName` endpoint setting to include the source commit timestamp in each row; you can then use that timestamp in downstream processing to reconstruct the original transaction sequence. Unlike the .csv format, which offers the `PreserveTransactions` setting, Parquet files handle transactions differently because of their columnar storage structure. This approach still enables accurate tracking of source commit times, supports post-migration reconstruction of transaction order, and allows efficient data processing while maintaining data consistency.
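The downstream reconstruction that this note describes can be sketched as follows: given migrated rows that carry the commit-timestamp column (the column name `commit_ts` here is an assumption; it is whatever you set `TimestampColumnName` to), sorting on that column restores an approximate transaction sequence across tables.

```python
# Sketch: reconstruct an approximate cross-table transaction sequence by
# sorting migrated rows on the commit-timestamp column. The column name
# "commit_ts" is an assumed value for the TimestampColumnName setting.
rows = [
    {"table": "orders", "id": 2, "commit_ts": "2024-01-31T10:00:05Z"},
    {"table": "items",  "id": 7, "commit_ts": "2024-01-31T10:00:01Z"},
    {"table": "orders", "id": 1, "commit_ts": "2024-01-31T10:00:03Z"},
]

# ISO-8601 UTC timestamps sort correctly as plain strings.
ordered = sorted(rows, key=lambda r: r["commit_ts"])
print([r["id"] for r in ordered])  # [7, 1, 2]
```

Rows that committed within the same timestamp granularity can still tie, so treat the result as an approximation rather than an exact replay of the source log.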

The following table shows the endpoint settings that you can use with Amazon S3 as a target.


| **Option** | **Description** | 
| --- | --- | 
| CsvNullValue |  An optional parameter that specifies how AWS DMS treats null values. While handling the null value, you can use this parameter to pass a user-defined string as null when writing to the target. For example, when target columns are nullable, you can use this option to differentiate between the empty string value and the null value.  Default value: `""` Valid values: any valid string Example: `--s3-settings '{"CsvNullValue": "NULL"}'` With this setting, if the source column value is null, the value written to the S3 .csv file is `NULL` instead of the empty string.  | 
| AddColumnName |  An optional parameter that, when set to `true` or `y`, adds column name information to the .csv output file. You can't use this parameter with `PreserveTransactions` or `CdcPath`. Default value: `false` Valid values: `true`, `false`, `y`, `n` Example: `--s3-settings '{"AddColumnName": true}'`  | 
| AddTrailingPaddingCharacter |  Use the S3 target endpoint setting `AddTrailingPaddingCharacter` to add padding on string data. The default value is `false`. Type: Boolean Example: `--s3-settings '{"AddTrailingPaddingCharacter": true}'`  | 
| BucketFolder |  An optional parameter to set a folder name in the S3 bucket. If provided, target objects are created as .csv or .parquet files in the path `BucketFolder/schema_name/table_name/`. If this parameter isn't specified, then the path used is `schema_name/table_name/`.  Example: `--s3-settings '{"BucketFolder": "testFolder"}'`  | 
| BucketName |  The name of the S3 bucket where S3 target objects are created as .csv or .parquet files. Example: `--s3-settings '{"BucketName": "buckettest"}'`  | 
| CannedAclForObjects |  A value that enables AWS DMS to specify a predefined (canned) access control list for objects created in the S3 bucket as .csv or .parquet files. For more information about Amazon S3 canned ACLs, see [Canned ACL](http://docs.aws.amazon.com/AmazonS3/latest/dev/acl-overview.html#canned-acl) in the *Amazon S3 Developer Guide.* Default value: `NONE` Valid values for this attribute are: `NONE`, `PRIVATE`, `PUBLIC_READ`, `PUBLIC_READ_WRITE`, `AUTHENTICATED_READ`, `AWS_EXEC_READ`, `BUCKET_OWNER_READ`, and `BUCKET_OWNER_FULL_CONTROL`. Example: `--s3-settings '{"CannedAclForObjects": "PUBLIC_READ"}'`  | 
| CdcInsertsOnly |  An optional parameter during a change data capture (CDC) load to write only INSERT operations to the comma-separated value (.csv) or columnar storage (.parquet) output files. By default (the `false` setting), the first field in a .csv or .parquet record contains the letter I (INSERT), U (UPDATE), or D (DELETE). This letter indicates whether the row was inserted, updated, or deleted at the source database for a CDC load to the target. If `cdcInsertsOnly` is set to `true` or `y`, only INSERTs from the source database are migrated to the .csv or .parquet file. For .csv format only, how these INSERTS are recorded depends on the value of `IncludeOpForFullLoad`. If `IncludeOpForFullLoad` is set to `true`, the first field of every CDC record is set to I to indicate the INSERT operation at the source. If `IncludeOpForFullLoad` is set to `false`, every CDC record is written without a first field to indicate the INSERT operation at the source. For more information about how these parameters work together, see [Indicating source DB operations in migrated S3 data](#CHAP_Target.S3.Configuring.InsertOps). Default value: `false` Valid values: `true`, `false`, `y`, `n` Example: `--s3-settings '{"CdcInsertsOnly": true}'`  | 
| CdcInsertsAndUpdates |  Enables a change data capture (CDC) load to write INSERT and UPDATE operations to .csv or .parquet (columnar storage) output files. The default setting is `false`, but when `cdcInsertsAndUpdates` is set to `true` or `y`, INSERTs and UPDATEs from the source database are migrated to the .csv or .parquet file.  For .csv file format only, how these INSERTs and UPDATEs are recorded depends on the value of the `includeOpForFullLoad` parameter. If `includeOpForFullLoad` is set to `true`, the first field of every CDC record is set to either `I` or `U` to indicate INSERT and UPDATE operations at the source. But if `includeOpForFullLoad` is set to `false`, CDC records are written without an indication of INSERT or UPDATE operations at the source.   For more information about how these parameters work together, see [Indicating source DB operations in migrated S3 data](#CHAP_Target.S3.Configuring.InsertOps).  `CdcInsertsOnly` and `cdcInsertsAndUpdates` can't both be set to true for the same endpoint. Set either `cdcInsertsOnly` or `cdcInsertsAndUpdates` to `true` for the same endpoint, but not both.   Default value: `false` Valid values: `true`, `false`, `y`, `n` Example: `--s3-settings '{"CdcInsertsAndUpdates": true}'`  | 
|  `CdcPath`  |  Specifies the folder path of CDC files. For an S3 source, this setting is required if a task captures change data; otherwise, it's optional. If `CdcPath` is set, DMS reads CDC files from this path and replicates the data changes to the target endpoint. For an S3 target if you set `PreserveTransactions` to true, DMS verifies that you have set this parameter to a folder path on your S3 target where DMS can save the transaction order for the CDC load. DMS creates this CDC folder path in either your S3 target working directory or the S3 target location specified by `BucketFolder` and `BucketName`. You can't use this parameter with `DatePartitionEnabled` or `AddColumnName`. Type: String For example, if you specify `CdcPath` as `MyChangedData`, and you specify `BucketName` as `MyTargetBucket` but do not specify `BucketFolder`, DMS creates the following CDC folder path: `MyTargetBucket/MyChangedData`.  If you specify the same `CdcPath`, and you specify `BucketName` as `MyTargetBucket` and `BucketFolder` as `MyTargetData`, DMS creates the following CDC folder path: `MyTargetBucket/MyTargetData/MyChangedData`. This setting is supported in AWS DMS versions 3.4.2 and higher. When capturing data changes in transaction order, DMS always stores the row changes in .csv files regardless of the value of the DataFormat S3 setting on the target.   | 
|  `CdcMaxBatchInterval`  |  Maximum interval length condition, defined in seconds, to output a file to Amazon S3. Default Value: 60 seconds When `CdcMaxBatchInterval` is specified and `CdcMinFileSize` is specified, the file write is triggered by whichever parameter condition is met first.  Starting with AWS DMS version 3.5.3, when using PostgreSQL or Aurora PostgreSQL as the source and Amazon S3 with Parquet as the target, the frequency of `confirmed_flush_lsn` updates depends on the amount of data the target endpoint is configured to retain in memory. AWS DMS sends the `confirmed_flush_lsn` back to the source only after the data in memory is written to Amazon S3. If you configure the `CdcMaxBatchInterval` parameter to a higher value, you may observe increased replication slot usage on the source database.   | 
|  `CdcMinFileSize`  |  Minimum file size condition as defined in kilobytes to output a file to Amazon S3. Default Value: 32000 KB When `CdcMinFileSize` is specified and `CdcMaxBatchInterval` is specified, the file write is triggered by whichever parameter condition is met first.  | 
|  `PreserveTransactions`  |  If set to `true`, DMS saves the transaction order for change data capture (CDC) on the Amazon S3 target specified by `CdcPath`. You can't use this parameter with `DatePartitionEnabled` or `AddColumnName`. Type: Boolean When capturing data changes in transaction order, DMS always stores the row changes in .csv files regardless of the value of the DataFormat S3 setting on the target. This setting is supported in AWS DMS versions 3.4.2 and higher.   | 
| IncludeOpForFullLoad |  An optional parameter during a full load to write the INSERT operations to the comma-separated value (.csv) output files only. For full load, records can only be inserted. By default (the `false` setting), there is no information recorded in these output files for a full load to indicate that the rows were inserted at the source database. If `IncludeOpForFullLoad` is set to `true` or `y`, the INSERT is recorded as an I annotation in the first field of the .csv file.  This parameter works together with `CdcInsertsOnly` or `CdcInsertsAndUpdates` for output to .csv files only. For more information about how these parameters work together, see [Indicating source DB operations in migrated S3 data](#CHAP_Target.S3.Configuring.InsertOps).  Default value: `false` Valid values: `true`, `false`, `y`, `n` Example: `--s3-settings '{"IncludeOpForFullLoad": true}'`  | 
| CompressionType |  An optional parameter. When set to `GZIP`, AWS DMS uses GZIP to compress the target .csv files. When this parameter is set to the default, it leaves the files uncompressed. Default value: `NONE` Valid values: `GZIP` or `NONE` Example: `--s3-settings '{"CompressionType": "GZIP"}'`  | 
| CsvDelimiter |  The delimiter used to separate columns in .csv source files. The default is a comma (,). Example: `--s3-settings '{"CsvDelimiter": ","}'`  | 
| CsvRowDelimiter |  The delimiter used to separate rows in the .csv source files. The default is a newline (\n). Example: `--s3-settings '{"CsvRowDelimiter": "\n"}'`  | 
|   `MaxFileSize`   |  A value that specifies the maximum size (in KB) of any .csv file to be created while migrating to an S3 target during full load. Default value: 1,048,576 KB (1 GB) Valid values: 1–1,048,576 Example: `--s3-settings '{"MaxFileSize": 512}'`  | 
| Rfc4180 |  An optional parameter used to set behavior to comply with RFC for data migrated to Amazon S3 using .csv file format only. When this value is set to `true` or `y` using Amazon S3 as a target, if the data has quotation marks, commas, or newline characters in it, AWS DMS encloses the entire column with an additional pair of double quotation marks ("). Every quotation mark within the data is repeated twice. This formatting complies with RFC 4180. Default value: `true` Valid values: `true`, `false`, `y`, `n` Example: `--s3-settings '{"Rfc4180": false}'`  | 
| EncryptionMode |  The server-side encryption mode that you want to use to encrypt your .csv or .parquet object files copied to S3. The valid values are `SSE_S3` (S3 server-side encryption) or `SSE_KMS` (KMS key encryption). If you choose `SSE_KMS`, set the `ServerSideEncryptionKmsKeyId` parameter to the Amazon Resource Name (ARN) for the KMS key to be used for encryption.  You can also use the CLI `modify-endpoint` command to change the value of the `EncryptionMode` attribute for an existing endpoint from `SSE_KMS` to `SSE_S3`. But you can't change the `EncryptionMode` value from `SSE_S3` to `SSE_KMS`.  Default value: `SSE_S3` Valid values: `SSE_S3` or `SSE_KMS` Example: `--s3-settings '{"EncryptionMode": "SSE_S3"}'`  | 
| ServerSideEncryptionKmsKeyId |  If you set `EncryptionMode` to `SSE_KMS`, set this parameter to the Amazon Resource Name (ARN) for the KMS key. You can find this ARN by selecting the key alias in the list of AWS KMS keys created for your account. When you create the key, you must associate specific policies and roles associated with this KMS key. For more information, see [Creating AWS KMS keys to encrypt Amazon S3 target objects](#CHAP_Target.S3.KMSKeys). Example: `--s3-settings '{"ServerSideEncryptionKmsKeyId":"arn:aws:kms:us-east-1:111122223333:key/11a1a1a1-aaaa-9999-abab-2bbbbbb222a2"}'`  | 
| DataFormat |  The output format for the files that AWS DMS uses to create S3 objects. For Amazon S3 targets, AWS DMS supports either .csv or .parquet files. The .parquet files have a binary columnar storage format with efficient compression options and faster query performance. For more information about .parquet files, see [https://parquet.apache.org/](https://parquet.apache.org/). Default value: `csv` Valid values: `csv` or `parquet` Example: `--s3-settings '{"DataFormat": "parquet"}'`  | 
| EncodingType |  The Parquet encoding type. The encoding type options include the following: `rle-dictionary`, which uses a combination of bit-packing and run-length encoding to more efficiently store repeating values; `plain`, which uses no encoding; and `plain-dictionary`, which builds a dictionary of the values encountered in a given column and stores it in a dictionary page for each column chunk. Default value: `rle-dictionary` Valid values: `rle-dictionary`, `plain`, or `plain-dictionary` Example: `--s3-settings '{"EncodingType": "plain-dictionary"}'`  | 
| DictPageSizeLimit |  The maximum allowed size, in bytes, for a dictionary page in a .parquet file. If a dictionary page exceeds this value, the page uses plain encoding. Default value: 1,024,000 (1 MB) Valid values: Any valid integer value Example: `--s3-settings '{"DictPageSizeLimit": 2048000}'`  | 
| RowGroupLength |  The number of rows in one row group of a .parquet file. Default value: 10,024 (10 KB) Valid values: Any valid integer Example: `--s3-settings '{"RowGroupLength": 20048}'`  | 
| DataPageSize |  The maximum allowed size, in bytes, for a data page in a .parquet file. Default value: 1,024,000 (1 MB) Valid values: Any valid integer Example: `--s3-settings '{"DataPageSize": 2048000}'`  | 
| ParquetVersion |  The version of the .parquet file format. Default value: `PARQUET_1_0` Valid values: `PARQUET_1_0` or `PARQUET_2_0` Example: `--s3-settings '{"ParquetVersion": "PARQUET_2_0"}'`  | 
| EnableStatistics |  Set to `true` or `y` to enable statistics about .parquet file pages and row groups. Default value: `true` Valid values: `true`, `false`, `y`, `n` Example: `--s3-settings '{"EnableStatistics": false}'`  | 
| TimestampColumnName |  An optional parameter to include a timestamp column in the S3 target endpoint data. AWS DMS includes an additional `STRING` column in the .csv or .parquet object files of your migrated data when you set `TimestampColumnName` to a nonblank value. For a full load, each row of this timestamp column contains a timestamp for when the data was transferred from the source to the target by DMS.  For a CDC load, each row of the timestamp column contains the timestamp for the commit of that row in the source database. The string format for this timestamp column value is `yyyy-MM-dd HH:mm:ss.SSSSSS`. By default, the precision of this value is in microseconds. For a CDC load, the rounding of the precision depends on the commit timestamp supported by DMS for the source database. When the `AddColumnName` parameter is set to `true`, DMS also includes the name for the timestamp column that you set as the nonblank value of `TimestampColumnName`. Example: `--s3-settings '{"TimestampColumnName": "TIMESTAMP"}'`  | 
| UseTaskStartTimeForFullLoadTimestamp |  When set to `true`, this parameter uses the task start time as the timestamp column value instead of the time data is written to target. For full load, when `UseTaskStartTimeForFullLoadTimestamp` is set to `true`, each row of the timestamp column contains the task start time. For CDC loads, each row of the timestamp column contains the transaction commit time. When `UseTaskStartTimeForFullLoadTimestamp` is set to `false`, the full load timestamp in the timestamp column increments with the time data arrives at the target. Default value: `false` Valid values: `true`, `false` Example: `--s3-settings '{"UseTaskStartTimeForFullLoadTimestamp": true}'` `UseTaskStartTimeForFullLoadTimestamp: true` helps make the S3 target `TimestampColumnName` for a full load sortable with `TimestampColumnName` for a CDC load.  | 
| ParquetTimestampInMillisecond |  An optional parameter that specifies the precision of any `TIMESTAMP` column values written to an S3 object file in .parquet format. When this attribute is set to `true` or `y`, AWS DMS writes all `TIMESTAMP` columns in a .parquet formatted file with millisecond precision. Otherwise, DMS writes them with microsecond precision. Currently, Amazon Athena and AWS Glue can handle only millisecond precision for `TIMESTAMP` values. Set this attribute to true for .parquet formatted S3 endpoint object files only if you plan to query or process the data with Athena or AWS Glue.    AWS DMS writes any `TIMESTAMP` column values written to an S3 file in .csv format with microsecond precision.   The setting of this attribute has no effect on the string format of the timestamp column value inserted by setting the `TimestampColumnName` attribute.    Default value: `false` Valid values: `true`, `false`, `y`, `n` Example: `--s3-settings '{"ParquetTimestampInMillisecond": true}'`  | 
| GlueCatalogGeneration |  To generate an AWS Glue Data Catalog, set this endpoint setting to `true`. Default value: `false` Valid values: `true`, `false` Example: `--s3-settings '{"GlueCatalogGeneration": true}'` **Note:** Don't use `GlueCatalogGeneration` with `PreserveTransactions` and `CdcPath`.  | 
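
The settings above are passed to the AWS CLI as a single JSON object. As an illustrative sketch (the setting names come from the table above, but the values here are arbitrary placeholders, not recommendations), you can assemble and sanity-check that payload before invoking the CLI:

```python
import json

# Hypothetical S3 target endpoint settings; values are placeholders.
# CdcMaxBatchInterval and CdcMinFileSize together control when a file
# is written: whichever condition is met first triggers the write.
s3_settings = {
    "CdcMaxBatchInterval": 120,   # seconds
    "CdcMinFileSize": 64000,      # kilobytes
    "IncludeOpForFullLoad": True,
    "CompressionType": "GZIP",
}

# Serialize to the single-quoted argument form shown in the examples.
cli_argument = "--s3-settings '" + json.dumps(s3_settings) + "'"

# Round-trip to confirm the payload is valid JSON (note that JSON
# numbers must not contain thousands separators such as 1,048,576).
assert json.loads(json.dumps(s3_settings)) == s3_settings
print(cli_argument)
```

Validating the JSON locally catches quoting and number-format mistakes before they surface as endpoint creation errors.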

## Using AWS Glue Data Catalog with an Amazon S3 target for AWS DMS
<a name="CHAP_Target.S3.GlueCatalog"></a>

AWS Glue is a service that provides simple ways to categorize data, and consists of a metadata repository known as the AWS Glue Data Catalog. You can integrate the AWS Glue Data Catalog with your Amazon S3 target endpoint and query Amazon S3 data through other AWS services such as Amazon Athena. Amazon Redshift works with AWS Glue, but AWS DMS doesn't support that combination as a pre-built option. 

To generate the data catalog, set the `GlueCatalogGeneration` endpoint setting to `true`, as shown in the following AWS CLI example.

```
aws dms create-endpoint --endpoint-identifier s3-target-endpoint 
            --engine-name s3 --endpoint-type target --s3-settings '{"ServiceAccessRoleArn": 
            "your-service-access-ARN", "BucketFolder": "your-bucket-folder", "BucketName": 
            "your-bucket-name", "DataFormat": "parquet", "GlueCatalogGeneration": true}'
```

For a full-load replication task that includes `csv` type data, set `IncludeOpForFullLoad` to `true`.

Don't use `GlueCatalogGeneration` with `PreserveTransactions` and `CdcPath`. The AWS Glue crawler can't reconcile the different schemas of files stored under the specified `CdcPath`.

For Amazon Athena to index your Amazon S3 data, and for you to query your data using standard SQL queries through Amazon Athena, the IAM role attached to the endpoint must have the following policy:

------
#### [ JSON ]

****  

```
{
    "Version":"2012-10-17",
    "Statement": [ 
        {
            "Effect": "Allow", 
            "Action": [
                "s3:GetBucketLocation", 
                "s3:GetObject",
                "s3:ListBucket", 
                "s3:ListBucketMultipartUploads", 
                "s3:ListMultipartUploadParts", 
                "s3:AbortMultipartUpload" 
            ], 
            "Resource": [
                "arn:aws:s3:::bucket123", 
                "arn:aws:s3:::bucket123/*" 
            ]
        },
        {
            "Effect": "Allow", 
            "Action": [ 
                "glue:CreateDatabase", 
                "glue:GetDatabase", 
                "glue:CreateTable", 
                "glue:DeleteTable", 
                "glue:UpdateTable", 
                "glue:GetTable", 
                "glue:BatchCreatePartition", 
                "glue:CreatePartition", 
                "glue:UpdatePartition", 
                "glue:GetPartition", 
                "glue:GetPartitions", 
                "glue:BatchGetPartition"
            ], 
            "Resource": [
                "arn:aws:glue:*:111122223333:catalog", 
                "arn:aws:glue:*:111122223333:database/*", 
                "arn:aws:glue:*:111122223333:table/*" 
            ]
        }, 
        {
            "Effect": "Allow",
            "Action": [
                "athena:StartQueryExecution",
                "athena:GetQueryExecution", 
                "athena:CreateWorkGroup"
            ],
            "Resource": "arn:aws:athena:*:111122223333:workgroup/glue_catalog_generation_for_task_*"
        }
    ]
}
```

------

**References**
+ For more information about AWS Glue, see [Concepts](https://docs.aws.amazon.com/glue/latest/dg/components-key-concepts.html) in the *AWS Glue Developer Guide*.
+ For more information about the AWS Glue Data Catalog, see [Components](https://docs.aws.amazon.com/glue/latest/dg/components-overview.html) in the *AWS Glue Developer Guide*.

## Using data encryption, parquet files, and CDC on your Amazon S3 target
<a name="CHAP_Target.S3.EndpointSettings"></a>

You can use S3 target endpoint settings to configure the following:
+ A custom KMS key to encrypt your S3 target objects.
+ Parquet files as the storage format for S3 target objects.
+ Change data capture (CDC) including transaction order on the S3 target.
+ AWS Glue Data Catalog integration with your Amazon S3 target endpoint, so that you can query Amazon S3 data through other services such as Amazon Athena.

### AWS KMS key settings for data encryption
<a name="CHAP_Target.S3.EndpointSettings.KMSkeys"></a>

The following examples show how to configure a custom KMS key to encrypt your S3 target objects. To start, you might run the following `create-endpoint` CLI command.

```
aws dms create-endpoint --endpoint-identifier s3-target-endpoint --engine-name s3 --endpoint-type target 
--s3-settings '{"ServiceAccessRoleArn": "your-service-access-ARN", "CsvRowDelimiter": "\n", 
"CsvDelimiter": ",", "BucketFolder": "your-bucket-folder", 
"BucketName": "your-bucket-name", 
"EncryptionMode": "SSE_KMS", 
"ServerSideEncryptionKmsKeyId": "arn:aws:kms:us-east-1:111122223333:key/72abb6fb-1e49-4ac1-9aed-c803dfcc0480"}'
```

Here, the JSON object specified by the `--s3-settings` option defines two parameters. One is an `EncryptionMode` parameter with the value `SSE_KMS`. The other is a `ServerSideEncryptionKmsKeyId` parameter with the value of `arn:aws:kms:us-east-1:111122223333:key/72abb6fb-1e49-4ac1-9aed-c803dfcc0480`. This value is an Amazon Resource Name (ARN) for your custom KMS key. For an S3 target, you also specify additional settings. These identify the service access role, provide delimiters for the default CSV object storage format, and give the bucket location and name to store S3 target objects.

By default, S3 data encryption occurs using S3 server-side encryption. For the previous example's S3 target, this is also equivalent to specifying its endpoint settings as in the following example.

```
aws dms create-endpoint --endpoint-identifier s3-target-endpoint --engine-name s3 --endpoint-type target
--s3-settings '{"ServiceAccessRoleArn": "your-service-access-ARN", "CsvRowDelimiter": "\n", 
"CsvDelimiter": ",", "BucketFolder": "your-bucket-folder", 
"BucketName": "your-bucket-name", 
"EncryptionMode": "SSE_S3"}'
```

For more information about working with S3 server-side encryption, see [Protecting data using server-side encryption](https://docs.aws.amazon.com/AmazonS3/latest/dev/serv-side-encryption.html).

**Note**  
You can also use the CLI `modify-endpoint` command to change the value of the `EncryptionMode` parameter for an existing endpoint from `SSE_KMS` to `SSE_S3`. But you can’t change the `EncryptionMode` value from `SSE_S3` to `SSE_KMS`.

### Settings for using .parquet files to store S3 target objects
<a name="CHAP_Target.S3.EndpointSettings.Parquet"></a>

The default format for creating S3 target objects is .csv files. The following examples show some endpoint settings for specifying .parquet files as the format for creating S3 target objects. You can specify the .parquet file format with all the defaults, as in the following example.

```
aws dms create-endpoint --endpoint-identifier s3-target-endpoint --engine-name s3 --endpoint-type target 
--s3-settings '{"ServiceAccessRoleArn": "your-service-access-ARN", "DataFormat": "parquet"}'
```

Here, the `DataFormat` parameter is set to `parquet` to enable the format with all the S3 defaults. These defaults include a dictionary encoding (`"EncodingType": "rle-dictionary"`) that uses a combination of bit-packing and run-length encoding to more efficiently store repeating values.

You can add additional settings for options other than the defaults as in the following example.

```
aws dms create-endpoint --endpoint-identifier s3-target-endpoint --engine-name s3 --endpoint-type target
--s3-settings '{"ServiceAccessRoleArn": "your-service-access-ARN", "BucketFolder": "your-bucket-folder",
"BucketName": "your-bucket-name", "DataFormat": "parquet", "EncodingType": "plain-dictionary", "DictPageSizeLimit": 3072000,
"EnableStatistics": false }'
```

Here, in addition to parameters for several standard S3 bucket options and the `DataFormat` parameter, the following additional .parquet file parameters are set:
+ `EncodingType` – Set to a dictionary encoding (`plain-dictionary`) that stores values encountered in each column in a per-column chunk of the dictionary page.
+ `DictPageSizeLimit` – Set to a maximum dictionary page size of 3 MB.
+ `EnableStatistics` – Disables the default that enables the collection of statistics about Parquet file pages and row groups.
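
The remaining .parquet options fall back to the defaults listed in the settings table. As a hedged sketch (the defaults are transcribed from that table; the merge helper is hypothetical and not part of any AWS SDK), you can model the effective configuration like this:

```python
# Defaults transcribed from the endpoint settings table in this section.
PARQUET_DEFAULTS = {
    "EncodingType": "rle-dictionary",
    "DictPageSizeLimit": 1024000,  # bytes (1 MB)
    "RowGroupLength": 10024,       # rows
    "DataPageSize": 1024000,       # bytes (1 MB)
    "ParquetVersion": "PARQUET_1_0",
    "EnableStatistics": True,
}

def effective_parquet_settings(overrides):
    """Merge explicitly specified settings over the documented defaults."""
    merged = dict(PARQUET_DEFAULTS)
    merged.update(overrides)
    return merged

# The overrides from the example above; everything else keeps its default.
settings = effective_parquet_settings({
    "EncodingType": "plain-dictionary",
    "DictPageSizeLimit": 3072000,
    "EnableStatistics": False,
})
assert settings["ParquetVersion"] == "PARQUET_1_0"  # untouched default
assert settings["DictPageSizeLimit"] == 3072000     # overridden
```

Modeling the merge this way makes it easy to see which values a given `--s3-settings` payload actually changes.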

### Capturing data changes (CDC) including transaction order on the S3 target
<a name="CHAP_Target.S3.EndpointSettings.CdcPath"></a>

By default, when AWS DMS runs a CDC task, it stores all the row changes logged in your source database (or databases) in one or more files for each table. Each set of files containing changes for the same table resides in a single target directory associated with that table. AWS DMS creates as many target directories as there are database tables migrated to the Amazon S3 target endpoint. The files are stored on the S3 target in these directories without regard to transaction order. For more information on the file naming conventions, data contents, and format, see [Using Amazon S3 as a target for AWS Database Migration Service](#CHAP_Target.S3).

To capture source database changes in a manner that also preserves the transaction order, you can specify S3 endpoint settings that direct AWS DMS to store the row changes for *all* database tables in one or more .csv files created depending on transaction size. These .csv *transaction files* contain all row changes listed sequentially in transaction order for all tables involved in each transaction. These transaction files reside together in a single *transaction directory* that you also specify on the S3 target. In each transaction file, the transaction operation and the identity of the database and source table for each row change are stored as part of the row data as follows.

```
operation,table_name,database_schema_name,field_value,...
```

Here, `operation` is the transaction operation on the changed row, `table_name` is the name of the database table where the row is changed, `database_schema_name` is the name of the database schema where the table resides, and `field_value` is the first of one or more field values that specify the data for the row.

The following example of a transaction file shows changed rows for one or more transactions that involve two tables.

```
I,Names_03cdcad11a,rdsTempsdb,13,Daniel
U,Names_03cdcad11a,rdsTempsdb,23,Kathy
D,Names_03cdcad11a,rdsTempsdb,13,Cathy
I,Names_6d152ce62d,rdsTempsdb,15,Jane
I,Names_6d152ce62d,rdsTempsdb,24,Chris
I,Names_03cdcad11a,rdsTempsdb,16,Mike
```

Here, the transaction operation on each row is indicated by `I` (insert), `U` (update), or `D` (delete) in the first column. The table name is the second column value (for example, `Names_03cdcad11a`). The name of the database schema is the value of the third column (for example, `rdsTempsdb`). And the remaining columns are populated with your own row data (for example, `13,Daniel`).
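
Because the transaction-file layout is plain CSV, downstream consumers can parse it with standard tooling. The following sketch (using a subset of the example rows above) splits each row into the operation, table, schema, and field values:

```python
import csv
import io

# A subset of the example transaction file shown above.
data = """\
I,Names_03cdcad11a,rdsTempsdb,13,Daniel
U,Names_03cdcad11a,rdsTempsdb,23,Kathy
D,Names_03cdcad11a,rdsTempsdb,13,Cathy
I,Names_6d152ce62d,rdsTempsdb,15,Jane
"""

changes = []
for row in csv.reader(io.StringIO(data)):
    # First three fields are fixed; the rest is the row data itself.
    operation, table_name, schema_name, *field_values = row
    changes.append((operation, table_name, schema_name, field_values))

# Rows are already in transaction order; the first field is I, U, or D.
assert [op for op, *_ in changes] == ["I", "U", "D", "I"]
assert changes[0] == ("I", "Names_03cdcad11a", "rdsTempsdb", ["13", "Daniel"])
```

In practice you would read the file from S3 rather than an in-memory string, but the row structure is the same.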

In addition, AWS DMS names the transaction files it creates on the Amazon S3 target using a time stamp according to the following naming convention.

```
CDC_TXN-timestamp.csv
```

Here, `timestamp` is the time when the transaction file was created, as in the following example. 

```
CDC_TXN-20201117153046033.csv
```

This time stamp in the file name ensures that the transaction files are created and listed in transaction order when you list them in their transaction directory.
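
Because the embedded timestamp is fixed-width (`yyyyMMddHHmmssSSS`), sorting the file names lexicographically is the same as sorting them chronologically. A small sketch with hypothetical file names:

```python
# Hypothetical transaction-file names from an S3 listing.
files = [
    "CDC_TXN-20201117153046033.csv",
    "CDC_TXN-20201117152912001.csv",
    "CDC_TXN-20201117153101500.csv",
]

# Fixed-width timestamps sort lexicographically in chronological order.
ordered = sorted(files)
assert ordered[0] == "CDC_TXN-20201117152912001.csv"

def txn_timestamp(name):
    """Extract the raw timestamp portion of a transaction-file name."""
    return name[len("CDC_TXN-"):-len(".csv")]

assert txn_timestamp(ordered[-1]) == "20201117153101500"
```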

**Note**  
When capturing data changes in transaction order, AWS DMS always stores the row changes in .csv files regardless of the value of the `DataFormat` S3 setting on the target.

To control the frequency of writes to an Amazon S3 target during a data replication task, you can configure the `CdcMaxBatchInterval` and `CdcMinFileSize` settings. This can result in better performance when analyzing the data without any additional overhead operations. For more information, see [Endpoint settings when using Amazon S3 as a target for AWS DMS](#CHAP_Target.S3.Configuring).

**To tell AWS DMS to store all row changes in transaction order**

1. Set the `PreserveTransactions` S3 setting on the target to `true`.

1. Set the `CdcPath` S3 setting on the target to a relative folder path where you want AWS DMS to store the .csv transaction files.

   AWS DMS creates this path either under the default S3 target bucket and working directory or under the bucket and bucket folder that you specify using the `BucketName` and `BucketFolder` S3 settings on the target.

## Indicating source DB operations in migrated S3 data
<a name="CHAP_Target.S3.Configuring.InsertOps"></a>

When AWS DMS migrates records to an S3 target, it can create an additional field in each migrated record. This additional field indicates the operation applied to the record at the source database. How AWS DMS creates and sets this first field depends on the migration task type and settings of `includeOpForFullLoad`, `cdcInsertsOnly`, and `cdcInsertsAndUpdates`.

For a full load when `includeOpForFullLoad` is `true`, AWS DMS always creates an additional first field in each .csv record. This field contains the letter I (INSERT) to indicate that the row was inserted at the source database. For a CDC load when `cdcInsertsOnly` is `false` (the default), AWS DMS also always creates an additional first field in each .csv or .parquet record. This field contains the letter I (INSERT), U (UPDATE), or D (DELETE) to indicate whether the row was inserted, updated, or deleted at the source database.

In the following table, you can see how the settings of the `includeOpForFullLoad` and `cdcInsertsOnly` attributes work together to affect the setting of migrated records.


| includeOpForFullLoad | cdcInsertsOnly | For full load | For CDC load | 
| --- | --- | --- | --- | 
| true | true | Added first field value set to I | Added first field value set to I | 
| false | false | No added field | Added first field value set to I, U, or D | 
| false | true | No added field | No added field | 
| true | false | Added first field value set to I | Added first field value set to I, U, or D | 

When `includeOpForFullLoad` and `cdcInsertsOnly` are set to the same value, the target records are set according to the attribute that controls record settings for the current migration type. That attribute is `includeOpForFullLoad` for full load and `cdcInsertsOnly` for CDC load.

When `includeOpForFullLoad` and `cdcInsertsOnly` are set to different values, AWS DMS makes the target record settings consistent for both CDC and full load. It does this by making the record settings for a CDC load conform to the record settings for any earlier full load specified by `includeOpForFullLoad`. 

In other words, suppose that a full load is set to add a first field to indicate an inserted record. In this case, a following CDC load is set to add a first field that indicates an inserted, updated, or deleted record as appropriate at the source. In contrast, suppose that a full load is set to *not* add a first field to indicate an inserted record. In this case, a CDC load is also set to not add a first field to each record regardless of its corresponding record operation at the source.

Similarly, how DMS creates and sets an additional first field depends on the settings of `includeOpForFullLoad` and `cdcInsertsAndUpdates`. In the following table, you can see how the settings of the `includeOpForFullLoad` and `cdcInsertsAndUpdates` attributes work together to affect the setting of migrated records in this format. 


| includeOpForFullLoad | cdcInsertsAndUpdates | For full load | For CDC load | 
| --- | --- | --- | --- | 
| true | true | Added first field value set to I | Added first field value set to I or U | 
| false | false | No added field | Added first field value set to I, U, or D | 
| false | true | No added field | Added first field value set to I or U | 
| true | false | Added first field value set to I | Added first field value set to I, U, or D | 
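
As an illustrative sketch (the helper names are hypothetical), the two tables above can be encoded as functions that return the possible first-field values, or `None` when no field is added:

```python
def full_load_first_field(include_op_for_full_load):
    """Full load adds an I first field only when includeOpForFullLoad is true."""
    return "I" if include_op_for_full_load else None

def cdc_first_field(include_op_for_full_load, cdc_inserts_only=False,
                    cdc_inserts_and_updates=False):
    """Possible first-field values for a CDC load, per the tables above."""
    if cdc_inserts_and_updates:
        return "I or U"
    if cdc_inserts_only:
        # CDC conforms to the full-load setting in the inserts-only case.
        return "I" if include_op_for_full_load else None
    return "I, U, or D"

# Spot-check rows from the first table (includeOpForFullLoad, cdcInsertsOnly).
assert cdc_first_field(True, cdc_inserts_only=True) == "I"
assert cdc_first_field(False, cdc_inserts_only=True) is None
assert cdc_first_field(True) == "I, U, or D"
# And from the second table (cdcInsertsAndUpdates).
assert cdc_first_field(False, cdc_inserts_and_updates=True) == "I or U"
```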

## Target data types for S3 Parquet
<a name="CHAP_Target.S3.DataTypes"></a>

The following table shows the Parquet target data types that are supported when using AWS DMS and the default mapping from AWS DMS data types.

For additional information about AWS DMS data types, see [Data types for AWS Database Migration Service](CHAP_Reference.DataTypes.md).


|  AWS DMS data type  |  S3 parquet data type   | 
| --- | --- | 
| BYTES | BINARY | 
| DATE | DATE32 | 
| TIME | TIME32 | 
| DATETIME | TIMESTAMP | 
| INT1 | INT8 | 
| INT2 | INT16 | 
| INT4 | INT32 | 
| INT8 | INT64 | 
| NUMERIC | DECIMAL | 
| REAL4 | FLOAT | 
| REAL8 | DOUBLE | 
| STRING | STRING | 
| UINT1 | UINT8 | 
| UINT2 | UINT16 | 
| UINT4 | UINT32 | 
| UINT8 | UINT64 | 
| WSTRING | STRING | 
| BLOB | BINARY | 
| NCLOB | STRING | 
| CLOB | STRING | 
| BOOLEAN | BOOL | 
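
When generating downstream schemas (for example, for a query engine), the mapping above can be captured as a lookup table. A sketch (the helper function is hypothetical; the pairs are transcribed from the table):

```python
# Default AWS DMS -> S3 parquet data type mapping from the table above.
DMS_TO_PARQUET = {
    "BYTES": "BINARY", "DATE": "DATE32", "TIME": "TIME32",
    "DATETIME": "TIMESTAMP", "INT1": "INT8", "INT2": "INT16",
    "INT4": "INT32", "INT8": "INT64", "NUMERIC": "DECIMAL",
    "REAL4": "FLOAT", "REAL8": "DOUBLE", "STRING": "STRING",
    "UINT1": "UINT8", "UINT2": "UINT16", "UINT4": "UINT32",
    "UINT8": "UINT64", "WSTRING": "STRING", "BLOB": "BINARY",
    "NCLOB": "STRING", "CLOB": "STRING", "BOOLEAN": "BOOL",
}

def parquet_type(dms_type):
    """Look up the default parquet target type for an AWS DMS data type."""
    return DMS_TO_PARQUET[dms_type.upper()]
```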

# Using an Amazon DynamoDB database as a target for AWS Database Migration Service
<a name="CHAP_Target.DynamoDB"></a>

You can use AWS DMS to migrate data to an Amazon DynamoDB table. Amazon DynamoDB is a fully managed NoSQL database service that provides fast and predictable performance with seamless scalability. AWS DMS supports using a relational database or MongoDB as a source.

In DynamoDB, tables, items, and attributes are the core components that you work with. A *table* is a collection of items, and each *item* is a collection of attributes. DynamoDB uses primary keys, called partition keys, to uniquely identify each item in a table. You can also use sort keys and secondary indexes to provide more querying flexibility.

You use object mapping to migrate your data from a source database to a target DynamoDB table. Object mapping enables you to determine where the source data is located in the target. 

When AWS DMS creates tables on a DynamoDB target endpoint, it creates as many tables as there are in the source database endpoint. AWS DMS also sets several DynamoDB parameter values. The cost for the table creation depends on the amount of data and the number of tables to be migrated.

**Note**  
The **SSL Mode** option on the AWS DMS console or API doesn't apply to some data streaming and NoSQL services like Kinesis and DynamoDB. Because these services are secure by default, AWS DMS shows that the SSL mode setting is none (**SSL Mode=None**). You don't need to provide any additional configuration for your endpoint to make use of SSL. For example, when using DynamoDB as a target endpoint, it is secure by default. All API calls to DynamoDB use SSL, so there is no need for an additional SSL option in the AWS DMS endpoint. You can securely put and retrieve data through SSL endpoints using the HTTPS protocol, which AWS DMS uses by default when connecting to a DynamoDB database.

To help increase the speed of the transfer, AWS DMS supports a multithreaded full load to a DynamoDB target instance. DMS supports this multithreading with task settings that include the following:
+ `MaxFullLoadSubTasks` – Use this option to indicate the maximum number of source tables to load in parallel. DMS loads each table into its corresponding DynamoDB target table using a dedicated subtask. The default value is 8. The maximum value is 49.
+ `ParallelLoadThreads` – Use this option to specify the number of threads that AWS DMS uses to load each table into its DynamoDB target table. The default value is 0 (single-threaded). The maximum value is 200. You can ask to have this maximum limit increased.
**Note**  
DMS assigns each segment of a table to its own thread for loading. Therefore, set `ParallelLoadThreads` to the maximum number of segments that you specify for a table in the source.
+ `ParallelLoadBufferSize` – Use this option to specify the maximum number of records to store in the buffer that the parallel load threads use to load data to the DynamoDB target. The default value is 50. The maximum value is 1,000. Use this setting with `ParallelLoadThreads`. `ParallelLoadBufferSize` is valid only when there is more than one thread.
+ Table-mapping settings for individual tables – Use `table-settings` rules to identify individual tables from the source that you want to load in parallel. Also use these rules to specify how to segment the rows of each table for multithreaded loading. For more information, see [Table and collection settings rules and operations](CHAP_Tasks.CustomizingTasks.TableMapping.SelectionTransformation.Tablesettings.md).

**Note**  
When AWS DMS sets DynamoDB parameter values for a migration task, the default Read Capacity Units (RCU) parameter value is set to 200.  
The Write Capacity Units (WCU) parameter value is also set, but its value depends on several other settings:  
The default value for the WCU parameter is 200.
If the `ParallelLoadThreads` task setting is set greater than 1 (the default is 0), then the WCU parameter is set to 200 times the `ParallelLoadThreads` value.
Standard AWS DMS usage fees apply to resources you use.
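
The note above implies a simple formula for the write capacity that DMS sets. As a sketch (the function name is hypothetical):

```python
def task_write_capacity_units(parallel_load_threads=0):
    """WCU that AWS DMS sets for a migration task, per the note above:
    200 by default, or 200 x ParallelLoadThreads when that setting is
    greater than 1."""
    if parallel_load_threads > 1:
        return 200 * parallel_load_threads
    return 200

assert task_write_capacity_units() == 200    # default, single-threaded
assert task_write_capacity_units(8) == 1600  # 200 x 8 threads
```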

## Migrating from a relational database to a DynamoDB table
<a name="CHAP_Target.DynamoDB.RDBMS2DynamoDB"></a>

AWS DMS supports migrating data to DynamoDB scalar data types. When migrating from a relational database like Oracle or MySQL to DynamoDB, you might want to restructure how you store this data.

Currently AWS DMS supports single table to single table restructuring to DynamoDB scalar type attributes. If you are migrating data into DynamoDB from a relational database table, you take data from a table and reformat it into DynamoDB scalar data type attributes. These attributes can accept data from multiple columns, and you can map a column to an attribute directly.

AWS DMS supports the following DynamoDB scalar data types:
+ String
+ Number
+ Boolean

**Note**  
NULL data from the source are ignored on the target.

## Prerequisites for using DynamoDB as a target for AWS Database Migration Service
<a name="CHAP_Target.DynamoDB.Prerequisites"></a>

Before you begin to work with a DynamoDB database as a target for AWS DMS, make sure that you create an IAM role. This IAM role should allow AWS DMS to assume it, and should grant access to the DynamoDB tables that the data is being migrated into. The minimum set of access permissions is shown in the following IAM policy.

------
#### [ JSON ]

****  

```
{
   "Version":"2012-10-17",
   "Statement": [
      {
         "Sid": "",
         "Effect": "Allow",
         "Principal": {
            "Service": "dms.amazonaws.com"
         },
         "Action": "sts:AssumeRole"
      }
   ]
}
```

------

The role that you use for the migration to DynamoDB must have the following permissions.

------
#### [ JSON ]

****  

```
{
    "Version":"2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "dynamodb:PutItem",
                "dynamodb:CreateTable",
                "dynamodb:DescribeTable",
                "dynamodb:DeleteTable",
                "dynamodb:DeleteItem",
                "dynamodb:UpdateItem"
            ],
            "Resource": [
                "arn:aws:dynamodb:us-west-2:111122223333:table/name1",
                "arn:aws:dynamodb:us-west-2:111122223333:table/OtherName*",
                "arn:aws:dynamodb:us-west-2:111122223333:table/awsdms_apply_exceptions",
                "arn:aws:dynamodb:us-west-2:111122223333:table/awsdms_full_load_exceptions"
            ]
        },
        {
            "Effect": "Allow",
            "Action": [
                "dynamodb:ListTables"
            ],
            "Resource": "*"
        }
    ]
}
```

------

## Limitations when using DynamoDB as a target for AWS Database Migration Service
<a name="CHAP_Target.DynamoDB.Limitations"></a>

The following limitations apply when using DynamoDB as a target:
+ DynamoDB limits the precision of the Number data type to 38 digits. Store all data with higher precision as a String. You need to explicitly specify this conversion using the object-mapping feature.
+ Because DynamoDB doesn't have a Date data type, data of the Date data type is converted to strings.
+ DynamoDB doesn't allow updates to the primary key attributes. This restriction is important when using ongoing replication with change data capture (CDC) because it can result in unwanted data in the target. Depending on how you configure the object mapping, a CDC operation that updates the primary key can do one of two things: it can either fail or insert a new item with the updated primary key and incomplete data.
+ AWS DMS only supports replication of tables with noncomposite primary keys. The exception is if you specify an object mapping for the target table with a custom partition key or sort key, or both.
+ AWS DMS doesn't support LOB data unless it is a CLOB. AWS DMS converts CLOB data into a DynamoDB string when migrating the data.
+ When you use DynamoDB as target, only the Apply Exceptions control table (`dmslogs.awsdms_apply_exceptions`) is supported. For more information about control tables, see [Control table task settings](CHAP_Tasks.CustomizingTasks.TaskSettings.ControlTable.md).
+ AWS DMS doesn't support the task setting `TargetTablePrepMode=TRUNCATE_BEFORE_LOAD` for DynamoDB as a target. 
+ AWS DMS doesn't support the task setting `TaskRecoveryTableEnabled` for DynamoDB as a target. 
+ `BatchApply` is not supported for a DynamoDB endpoint.
+ AWS DMS cannot migrate attributes whose names match reserved words in DynamoDB. For more information, see [Reserved words in DynamoDB](https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/ReservedWords.html) in the *Amazon DynamoDB Developer Guide*.

## Using object mapping to migrate data to DynamoDB
<a name="CHAP_Target.DynamoDB.ObjectMapping"></a>

AWS DMS uses table-mapping rules to map data from the source to the target DynamoDB table. To map data to a DynamoDB target, you use a type of table-mapping rule called *object-mapping*. Object mapping lets you define the attribute names and the data to be migrated to them. You must have selection rules when you use object mapping.

DynamoDB doesn't have a preset structure other than having a partition key and an optional sort key. If you have a noncomposite primary key, AWS DMS uses it. If you have a composite primary key or you want to use a sort key, define these keys and the other attributes in your target DynamoDB table.

To create an object-mapping rule, you specify the `rule-type` as *object-mapping*. This rule specifies what type of object mapping you want to use. 

The structure for the rule is as follows:

```
{ "rules": [
    {
      "rule-type": "object-mapping",
      "rule-id": "<id>",
      "rule-name": "<name>",
      "rule-action": "<valid object-mapping rule action>",
      "object-locator": {
      "schema-name": "<case-sensitive schema name>",
      "table-name": "<table name>"
      },
      "target-table-name": "<table_name>"
    }
  ]
}
```

AWS DMS currently supports `map-record-to-record` and `map-record-to-document` as the only valid values for the `rule-action` parameter. These values specify what AWS DMS does by default to records that aren't excluded as part of the `exclude-columns` attribute list. These values don't affect the attribute mappings in any way. 
+ You can use `map-record-to-record` when migrating from a relational database to DynamoDB. It uses the primary key from the relational database as the partition key in DynamoDB and creates an attribute for each column in the source database. When using `map-record-to-record`, for any column in the source table not listed in the `exclude-columns` attribute list, AWS DMS creates a corresponding attribute on the target DynamoDB instance. It does so regardless of whether that source column is used in an attribute mapping. 
+ You use `map-record-to-document` to put source columns into a single, flat DynamoDB map attribute on the target. This attribute is called "_doc". This placement applies to any column in the source table not listed in the `exclude-columns` attribute list. 
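As a sketch of the behavior described above (this is illustrative Python, not DMS internals; the column names are taken from the examples in this section), the two rule actions shape the non-excluded columns differently:

```python
# Columns not excluded are either promoted to top-level attributes
# (map-record-to-record) or folded into a single "_doc" map attribute
# (map-record-to-document).
def shape_item(row, exclude_columns, rule_action):
    remaining = {k: v for k, v in row.items() if k not in exclude_columns}
    if rule_action == "map-record-to-record":
        return remaining
    elif rule_action == "map-record-to-document":
        return {"_doc": remaining}
    raise ValueError("unsupported rule-action: " + rule_action)

row = {"NickName": "Ace", "income": "180000", "FirstName": "Randy"}
shape_item(row, ["FirstName"], "map-record-to-record")
# → {'NickName': 'Ace', 'income': '180000'}
shape_item(row, ["FirstName"], "map-record-to-document")
# → {'_doc': {'NickName': 'Ace', 'income': '180000'}}
```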

One way to understand the difference between the `rule-action` parameters `map-record-to-record` and `map-record-to-document` is to see the two parameters in action. For this example, assume that you are starting with a relational database table row with the following structure and data:

![\[sample database for example\]](http://docs.aws.amazon.com/dms/latest/userguide/images/datarep-dynamodb1.png)


To migrate this information to DynamoDB, you create rules to map the data into a DynamoDB table item. Note the columns listed for the `exclude-columns` parameter. These columns are not directly mapped over to the target. Instead, attribute mapping is used to combine the data into new items, such as where *FirstName* and *LastName* are grouped together to become *CustomerName* on the DynamoDB target. *NickName* and *income* are not excluded.

```
{
    "rules": [
        {
            "rule-type": "selection",
            "rule-id": "1",
            "rule-name": "1",
            "object-locator": {
                "schema-name": "test",
                "table-name": "%"
            },
            "rule-action": "include"
        },
        {
            "rule-type": "object-mapping",
            "rule-id": "2",
            "rule-name": "TransformToDDB",
            "rule-action": "map-record-to-record",
            "object-locator": {
                "schema-name": "test",
                "table-name": "customer"
            },
            "target-table-name": "customer_t",
            "mapping-parameters": {
                "partition-key-name": "CustomerName",
                "exclude-columns": [
                    "FirstName",
                    "LastName",
                    "HomeAddress",
                    "HomePhone",
                    "WorkAddress",
                    "WorkPhone"
                ],
                "attribute-mappings": [
                    {
                        "target-attribute-name": "CustomerName",
                        "attribute-type": "scalar",
                        "attribute-sub-type": "string",
                        "value": "${FirstName},${LastName}"
                    },
                    {
                        "target-attribute-name": "ContactDetails",
                        "attribute-type": "document",
                        "attribute-sub-type": "dynamodb-map",
                        "value": {
                            "M": {
                                "Home": {
                                    "M": {
                                        "Address": {
                                            "S": "${HomeAddress}"
                                        },
                                        "Phone": {
                                            "S": "${HomePhone}"
                                        }
                                    }
                                },
                                "Work": {
                                    "M": {
                                        "Address": {
                                            "S": "${WorkAddress}"
                                        },
                                        "Phone": {
                                            "S": "${WorkPhone}"
                                        }
                                    }
                                }
                            }
                        }
                    }
                ]
            }
        }
    ]
}
```

By using the `rule-action` parameter *map-record-to-record*, the data for *NickName* and *income* is mapped to attributes of the same name in the DynamoDB target. 

![\[Get started with AWS DMS\]](http://docs.aws.amazon.com/dms/latest/userguide/images/datarep-dynamodb2.png)


However, suppose that you use the same rules but change the `rule-action` parameter to *map-record-to-document*. In this case, the columns not listed in the `exclude-columns` parameter, *NickName* and *income*, are mapped to a *_doc* attribute.

![\[Get started with AWS DMS\]](http://docs.aws.amazon.com/dms/latest/userguide/images/datarep-dynamodb3.png)


### Using custom condition expressions with object mapping
<a name="CHAP_Target.DynamoDB.ObjectMapping.ConditionExpression"></a>

You can use a feature of DynamoDB called conditional expressions to manipulate data that is being written to a DynamoDB table. For more information about condition expressions in DynamoDB, see [Condition expressions](https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Expressions.ConditionExpressions.html).

A condition expression member consists of the following: 
+ An expression (required).
+ Expression attribute values (required). These specify a DynamoDB JSON structure for the attribute value. They are useful for comparing an attribute with a value in DynamoDB that you might not know until runtime; you can define an expression attribute value as a placeholder for an actual value.
+ Expression attribute names (required). These help you avoid potential conflicts with DynamoDB reserved words, attribute names containing special characters, and similar.
+ Options for when to use the condition expression (optional). The defaults are `apply-during-cdc = false` and `apply-during-full-load = true`.
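A minimal sketch of how these defaults might be applied when assembling the `condition-expression` member follows. It is illustrative only: neither the helper function nor the sample expression is part of any DMS API.

```python
# Illustrative only: fills in the documented defaults
# (apply-during-cdc = false, apply-during-full-load = true).
def build_condition_expression(expression, attribute_values,
                               apply_during_cdc=False,
                               apply_during_full_load=True):
    return {
        "expression": expression,
        "expression-attribute-values": attribute_values,
        "apply-during-cdc": apply_during_cdc,
        "apply-during-full-load": apply_during_full_load,
    }

# Hypothetical condition: only write rows whose StoreId is at least :min
rule = build_condition_expression(
    "StoreId >= :min",
    [{"name": ":min", "value": {"N": "1"}}],
)
```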

The structure for the rule is as follows:

```
"target-table-name": "customer_t",
      "mapping-parameters": {
        "partition-key-name": "CustomerName",
        "condition-expression": {
          "expression":"<conditional expression>",
          "expression-attribute-values": [
              {
                "name":"<attribute name>",
                "value":<attribute value>
              }
          ],
          "apply-during-cdc":<optional Boolean value>,
          "apply-during-full-load": <optional Boolean value>
        }
```

The following sample highlights the sections used for condition expression.

![\[Get started with AWS DMS\]](http://docs.aws.amazon.com/dms/latest/userguide/images/datarep-Tasks-conditional1.png)


### Using attribute mapping with object mapping
<a name="CHAP_Target.DynamoDB.ObjectMapping.AttributeMapping"></a>

Attribute mapping lets you specify a template string that uses source column names to restructure data on the target. No formatting is applied other than what you specify in the template.

The following example shows the structure of the source database (in this case, an Oracle database) and the desired structure of the data in the DynamoDB target. The example concludes with the JSON used to create the desired target structure.

The structure of the Oracle data is as follows:


| FirstName | LastName | StoreId | HomeAddress | HomePhone | WorkAddress | WorkPhone | DateOfBirth | 
| --- | --- | --- | --- | --- | --- | --- | --- | 
| Primary Key | Primary Key | N/A | N/A | N/A | N/A | N/A | N/A | 
| Randy | Marsh | 5 | 221B Baker Street  | 1234567890 | 31 Spooner Street, Quahog  | 9876543210  | 02/29/1988  | 

The structure of the DynamoDB data is as follows:


| CustomerName | StoreId | ContactDetails | DateOfBirth | 
| --- | --- | --- | --- | 
| Partition Key | Sort Key | N/A | N/A | 
| <pre>Randy,Marsh</pre> | <pre>5</pre> | <pre>{<br />    "Name": "Randy",<br />    "Home": {<br />        "Address": "221B Baker Street",<br />        "Phone": 1234567890<br />    },<br />    "Work": {<br />        "Address": "31 Spooner Street, Quahog",<br />        "Phone": 9876543210<br />    }<br />}</pre> | <pre>02/29/1988</pre> | 

The following JSON shows the object mapping and column mapping used to achieve the DynamoDB structure:

```
{
    "rules": [
        {
            "rule-type": "selection",
            "rule-id": "1",
            "rule-name": "1",
            "object-locator": {
                "schema-name": "test",
                "table-name": "%"
            },
            "rule-action": "include"
        },
        {
            "rule-type": "object-mapping",
            "rule-id": "2",
            "rule-name": "TransformToDDB",
            "rule-action": "map-record-to-record",
            "object-locator": {
                "schema-name": "test",
                "table-name": "customer"
            },
            "target-table-name": "customer_t",
            "mapping-parameters": {
                "partition-key-name": "CustomerName",
                "sort-key-name": "StoreId",
                "exclude-columns": [
                    "FirstName",
                    "LastName",
                    "HomeAddress",
                    "HomePhone",
                    "WorkAddress",
                    "WorkPhone"
                ],
                "attribute-mappings": [
                    {
                        "target-attribute-name": "CustomerName",
                        "attribute-type": "scalar",
                        "attribute-sub-type": "string",
                        "value": "${FirstName},${LastName}"
                    },
                    {
                        "target-attribute-name": "StoreId",
                        "attribute-type": "scalar",
                        "attribute-sub-type": "string",
                        "value": "${StoreId}"
                    },
                    {
                        "target-attribute-name": "ContactDetails",
                        "attribute-type": "scalar",
                        "attribute-sub-type": "string",
                        "value": "{\"Name\":\"${FirstName}\",\"Home\":{\"Address\":\"${HomeAddress}\",\"Phone\":\"${HomePhone}\"}, \"Work\":{\"Address\":\"${WorkAddress}\",\"Phone\":\"${WorkPhone}\"}}"
                    }
                ]
            }
        }
    ]
}
```
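Because the `ContactDetails` value is a JSON document encoded as an escaped string, it is easy to get the quoting wrong. One way to sanity-check such a template is to substitute sample values and parse the result. The `${...}` placeholders happen to match Python's `string.Template` syntax, so this dry run (a sketch, not DMS behavior) can borrow it:

```python
import json
from string import Template

# The ContactDetails template from the mapping above, unescaped.
template = ('{"Name":"${FirstName}","Home":{"Address":"${HomeAddress}",'
            '"Phone":"${HomePhone}"}, "Work":{"Address":"${WorkAddress}",'
            '"Phone":"${WorkPhone}"}}')
row = {
    "FirstName": "Randy", "HomeAddress": "221B Baker Street",
    "HomePhone": "1234567890", "WorkAddress": "31 Spooner Street, Quahog",
    "WorkPhone": "9876543210",
}

# If json.loads succeeds, the template's quoting is well formed.
contact_details = json.loads(Template(template).substitute(row))
print(contact_details["Home"]["Address"])  # → 221B Baker Street
```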

Another way to use column mapping is to use DynamoDB format as your document type. The following code example uses *dynamodb-map* as the `attribute-sub-type` for attribute mapping. 

```
{
    "rules": [
        {
            "rule-type": "selection",
            "rule-id": "1",
            "rule-name": "1",
            "object-locator": {
                "schema-name": "test",
                "table-name": "%"
            },
            "rule-action": "include"
        },
        {
            "rule-type": "object-mapping",
            "rule-id": "2",
            "rule-name": "TransformToDDB",
            "rule-action": "map-record-to-record",
            "object-locator": {
                "schema-name": "test",
                "table-name": "customer"
            },
            "target-table-name": "customer_t",
            "mapping-parameters": {
                "partition-key-name": "CustomerName",
                "sort-key-name": "StoreId",
                "exclude-columns": [
                    "FirstName",
                    "LastName",
                    "HomeAddress",
                    "HomePhone",
                    "WorkAddress",
                    "WorkPhone"
                ],
                "attribute-mappings": [
                    {
                        "target-attribute-name": "CustomerName",
                        "attribute-type": "scalar",
                        "attribute-sub-type": "string",
                        "value": "${FirstName},${LastName}"
                    },
                    {
                        "target-attribute-name": "StoreId",
                        "attribute-type": "scalar",
                        "attribute-sub-type": "string",
                        "value": "${StoreId}"
                    },
                    {
                        "target-attribute-name": "ContactDetails",
                        "attribute-type": "document",
                        "attribute-sub-type": "dynamodb-map",
                        "value": {
                          "M": {
                            "Name": {
                              "S": "${FirstName}"
                            },
                            "Home": {
                                    "M": {
                                        "Address": {
                                            "S": "${HomeAddress}"
                                        },
                                        "Phone": {
                                            "S": "${HomePhone}"
                                        }
                                    }
                                },
                                "Work": {
                                    "M": {
                                        "Address": {
                                            "S": "${WorkAddress}"
                                        },
                                        "Phone": {
                                            "S": "${WorkPhone}"
                                        }
                                    }
                                }
                            }
                        }        
                    }
                ]
            }
        }
    ]
}
```

As an alternative to `dynamodb-map`, you can use `dynamodb-list` as the `attribute-sub-type` for attribute mapping, as shown in the following example.

```
{
    "target-attribute-name": "ContactDetailsList",
    "attribute-type": "document",
    "attribute-sub-type": "dynamodb-list",
    "value": {
        "L": [
            {
                "S": "${FirstName}"
            },
            {
                "S": "${HomeAddress}"
            },
            {
                "S": "${HomePhone}"
            },
            {
                "S": "${WorkAddress}"
            },
            {
                "S": "${WorkPhone}"
            }
        ]
    }
}
```
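In DynamoDB's typed JSON, each list element carries a type descriptor such as `S` (string) or `N` (number). The following hypothetical helper (illustrative only, not DMS code) shows how a typed list like the one above is assembled; note that DynamoDB sends numeric values as strings on the wire.

```python
# Hypothetical helper: wraps each value in a DynamoDB type descriptor,
# numbers as "N" and everything else as "S".
def to_dynamodb_list(values):
    items = []
    for v in values:
        if isinstance(v, (int, float)) and not isinstance(v, bool):
            items.append({"N": str(v)})
        else:
            items.append({"S": str(v)})
    return {"L": items}

to_dynamodb_list(["221B Baker Street", 1234567890])
# → {'L': [{'S': '221B Baker Street'}, {'N': '1234567890'}]}
```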

### Example 1: Using attribute mapping with object mapping
<a name="CHAP_Target.DynamoDB.ColumnMappingExample1"></a>

The following example migrates data from two MySQL database tables, *nfl_data* and *sport_team*, to two DynamoDB tables called *NFLTeams* and *SportTeams*. The structure of the tables and the JSON used to map the data from the MySQL database tables to the DynamoDB tables are shown following.

The structure of the MySQL database table *nfl_data* is shown below:

```
mysql> desc nfl_data;
+---------------+-------------+------+-----+---------+-------+
| Field         | Type        | Null | Key | Default | Extra |
+---------------+-------------+------+-----+---------+-------+
| Position      | varchar(5)  | YES  |     | NULL    |       |
| player_number | smallint(6) | YES  |     | NULL    |       |
| Name          | varchar(40) | YES  |     | NULL    |       |
| status        | varchar(10) | YES  |     | NULL    |       |
| stat1         | varchar(10) | YES  |     | NULL    |       |
| stat1_val     | varchar(10) | YES  |     | NULL    |       |
| stat2         | varchar(10) | YES  |     | NULL    |       |
| stat2_val     | varchar(10) | YES  |     | NULL    |       |
| stat3         | varchar(10) | YES  |     | NULL    |       |
| stat3_val     | varchar(10) | YES  |     | NULL    |       |
| stat4         | varchar(10) | YES  |     | NULL    |       |
| stat4_val     | varchar(10) | YES  |     | NULL    |       |
| team          | varchar(10) | YES  |     | NULL    |       |
+---------------+-------------+------+-----+---------+-------+
```

The structure of the MySQL database table *sport_team* is shown below:

```
mysql> desc sport_team;
+---------------------------+--------------+------+-----+---------+----------------+
| Field                     | Type         | Null | Key | Default | Extra          |
+---------------------------+--------------+------+-----+---------+----------------+
| id                        | mediumint(9) | NO   | PRI | NULL    | auto_increment |
| name                      | varchar(30)  | NO   |     | NULL    |                |
| abbreviated_name          | varchar(10)  | YES  |     | NULL    |                |
| home_field_id             | smallint(6)  | YES  | MUL | NULL    |                |
| sport_type_name           | varchar(15)  | NO   | MUL | NULL    |                |
| sport_league_short_name   | varchar(10)  | NO   |     | NULL    |                |
| sport_division_short_name | varchar(10)  | YES  |     | NULL    |                |
+---------------------------+--------------+------+-----+---------+----------------+
```

The table-mapping rules used to map the two tables to the two DynamoDB tables are shown below:

```
{
  "rules":[
    {
      "rule-type": "selection",
      "rule-id": "1",
      "rule-name": "1",
      "object-locator": {
        "schema-name": "dms_sample",
        "table-name": "nfl_data"
      },
      "rule-action": "include"
    },
    {
      "rule-type": "selection",
      "rule-id": "2",
      "rule-name": "2",
      "object-locator": {
        "schema-name": "dms_sample",
        "table-name": "sport_team"
      },
      "rule-action": "include"
    },
    {
      "rule-type":"object-mapping",
      "rule-id":"3",
      "rule-name":"MapNFLData",
      "rule-action":"map-record-to-record",
      "object-locator":{
        "schema-name":"dms_sample",
        "table-name":"nfl_data"
      },
      "target-table-name":"NFLTeams",
      "mapping-parameters":{
        "partition-key-name":"Team",
        "sort-key-name":"PlayerName",
        "exclude-columns": [
          "player_number", "team", "name"
        ],
        "attribute-mappings":[
          {
            "target-attribute-name":"Team",
            "attribute-type":"scalar",
            "attribute-sub-type":"string",
            "value":"${team}"
          },
          {
            "target-attribute-name":"PlayerName",
            "attribute-type":"scalar",
            "attribute-sub-type":"string",
            "value":"${name}"
          },
          {
            "target-attribute-name":"PlayerInfo",
            "attribute-type":"scalar",
            "attribute-sub-type":"string",
            "value":"{\"Number\": \"${player_number}\",\"Position\": \"${Position}\",\"Status\": \"${status}\",\"Stats\": {\"Stat1\": \"${stat1}:${stat1_val}\",\"Stat2\": \"${stat2}:${stat2_val}\",\"Stat3\": \"${stat3}:${stat3_val}\",\"Stat4\": \"${stat4}:${stat4_val}\"}}"
          }
        ]
      }
    },
    {
      "rule-type":"object-mapping",
      "rule-id":"4",
      "rule-name":"MapSportTeam",
      "rule-action":"map-record-to-record",
      "object-locator":{
        "schema-name":"dms_sample",
        "table-name":"sport_team"
      },
      "target-table-name":"SportTeams",
      "mapping-parameters":{
        "partition-key-name":"TeamName",
        "exclude-columns": [
          "name", "id"
        ],
        "attribute-mappings":[
          {
            "target-attribute-name":"TeamName",
            "attribute-type":"scalar",
            "attribute-sub-type":"string",
            "value":"${name}"
          },
          {
            "target-attribute-name":"TeamInfo",
            "attribute-type":"scalar",
            "attribute-sub-type":"string",
            "value":"{\"League\": \"${sport_league_short_name}\",\"Division\": \"${sport_division_short_name}\"}"
          }
        ]
      }
    }
  ]
}
```
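Conceptually, the `MapNFLData` rule excludes `player_number`, `team`, and `name`, rebuilds them into `Team`, `PlayerName`, and `PlayerInfo`, and passes every other column through under its original name; that is why `Position`, `status`, and the `stat` columns appear unchanged in the sample output. A simplified dry run of that behavior (illustrative Python, not DMS internals):

```python
from string import Template

# Simplified model of map-record-to-record: excluded columns are dropped,
# attribute mappings are template substitutions over the source row, and
# all other columns pass through unchanged.
def map_record_to_record(row, exclude_columns, attribute_mappings):
    item = {k: v for k, v in row.items() if k not in exclude_columns}
    for attr, tmpl in attribute_mappings.items():
        item[attr] = Template(tmpl).substitute(row)
    return item

row = {"team": "NE", "name": "Allen, Ryan", "Position": "P", "player_number": "6"}
item = map_record_to_record(
    row,
    exclude_columns=["player_number", "team", "name"],
    attribute_mappings={"Team": "${team}", "PlayerName": "${name}"},
)
# item → {'Position': 'P', 'Team': 'NE', 'PlayerName': 'Allen, Ryan'}
```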

The sample output for the *NFLTeams* DynamoDB table is shown below:

```
{
  "PlayerInfo": "{\"Number\": \"6\",\"Position\": \"P\",\"Status\": \"ACT\",\"Stats\": {\"Stat1\": \"PUNTS:73\",\"Stat2\": \"AVG:46\",\"Stat3\": \"LNG:67\",\"Stat4\": \"IN 20:31\"}}",
  "PlayerName": "Allen, Ryan",
  "Position": "P",
  "stat1": "PUNTS",
  "stat1_val": "73",
  "stat2": "AVG",
  "stat2_val": "46",
  "stat3": "LNG",
  "stat3_val": "67",
  "stat4": "IN 20",
  "stat4_val": "31",
  "status": "ACT",
  "Team": "NE"
}
```

The sample output for the *SportTeams* DynamoDB table is shown below:

```
{
  "abbreviated_name": "IND",
  "home_field_id": 53,
  "sport_division_short_name": "AFC South",
  "sport_league_short_name": "NFL",
  "sport_type_name": "football",
  "TeamInfo": "{\"League\": \"NFL\",\"Division\": \"AFC South\"}",
  "TeamName": "Indianapolis Colts"
}
```

## Target data types for DynamoDB
<a name="CHAP_Target.DynamoDB.DataTypes"></a>

The DynamoDB endpoint for AWS DMS supports most DynamoDB data types. The following table shows the AWS DMS target data types that are supported when you use AWS DMS with a DynamoDB target, and the default mapping from AWS DMS data types.

For additional information about AWS DMS data types, see [Data types for AWS Database Migration Service](CHAP_Reference.DataTypes.md).

When AWS DMS migrates data from heterogeneous databases, we map data types from the source database to intermediate data types called AWS DMS data types. We then map the intermediate data types to the target data types. The following table shows each AWS DMS data type and the data type it maps to in DynamoDB:


| AWS DMS data type | DynamoDB data type | 
| --- | --- | 
|  String  |  String  | 
|  WString  |  String  | 
|  Boolean  |  Boolean  | 
|  Date  |  String  | 
|  DateTime  |  String  | 
|  INT1  |  Number  | 
|  INT2  |  Number  | 
|  INT4  |  Number  | 
|  INT8  |  Number  | 
|  Numeric  |  Number  | 
|  Real4  |  Number  | 
|  Real8  |  Number  | 
|  UINT1  |  Number  | 
|  UINT2  |  Number  | 
|  UINT4  |  Number  | 
| UINT8 | Number | 
| CLOB | String | 
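For quick reference in validation scripts, the mapping in the preceding table can be captured as a lookup. This is a convenience sketch, not an AWS DMS artifact:

```python
# The preceding table as a lookup from AWS DMS type to DynamoDB type.
DMS_TO_DYNAMODB_TYPE = {
    "String": "String", "WString": "String", "Boolean": "Boolean",
    "Date": "String", "DateTime": "String",
    "INT1": "Number", "INT2": "Number", "INT4": "Number", "INT8": "Number",
    "Numeric": "Number", "Real4": "Number", "Real8": "Number",
    "UINT1": "Number", "UINT2": "Number", "UINT4": "Number", "UINT8": "Number",
    "CLOB": "String",
}

def dynamodb_type_for(dms_type):
    # Raises KeyError for AWS DMS types with no documented DynamoDB mapping
    return DMS_TO_DYNAMODB_TYPE[dms_type]
```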

# Using Amazon Kinesis Data Streams as a target for AWS Database Migration Service
<a name="CHAP_Target.Kinesis"></a>

You can use AWS DMS to migrate data to an Amazon Kinesis data stream. Amazon Kinesis data streams are part of the Amazon Kinesis Data Streams service. You can use Kinesis data streams to collect and process large streams of data records in real time.

A Kinesis data stream is made up of shards. *Shards* are uniquely identified sequences of data records in a stream. For more information on shards in Amazon Kinesis Data Streams, see [Shard](https://docs.aws.amazon.com/streams/latest/dev/key-concepts.html#shard) in the *Amazon Kinesis Data Streams Developer Guide.*

AWS Database Migration Service publishes records to a Kinesis data stream using JSON. During conversion, AWS DMS serializes each record from the source database into an attribute-value pair in JSON format or a `JSON_UNFORMATTED` message format. A `JSON_UNFORMATTED` message format is a single-line JSON string with a newline delimiter. It allows Amazon Data Firehose to deliver Kinesis data to an Amazon S3 destination, and then query it using various query engines, including Amazon Athena.
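The difference between the two message formats can be sketched as follows. This is illustrative, not DMS's exact serializer, but it shows why `JSON_UNFORMATTED` suits newline-delimited S3/Athena pipelines:

```python
import json

record = {"Name": "Randy", "Team": "NE"}

# MessageFormat=JSON: readable, potentially multi-line output
formatted = json.dumps(record, indent=4)

# MessageFormat=JSON_UNFORMATTED: one compact record per line, ending in
# a newline delimiter, so Firehose can write S3 objects that query
# engines like Athena can scan line by line
unformatted = json.dumps(record, separators=(",", ":")) + "\n"
```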

You use object mapping to migrate your data from any supported data source to a target stream. With object mapping, you determine how to structure the data records in the stream. You also define a partition key for each table, which Kinesis Data Streams uses to group the data into its shards. 

AWS DMS also sets several Kinesis Data Streams parameter values. The cost for the table creation depends on the amount of data and the number of tables to be migrated.

**Note**  
The **SSL Mode** option on the AWS DMS console or API doesn't apply to some data streaming and NoSQL services like Kinesis and DynamoDB. They are secure by default, so AWS DMS shows the SSL mode setting as none (**SSL Mode=None**). You don't need to provide any additional configuration for your endpoint to make use of SSL. For example, when using Kinesis as a target endpoint, it is secure by default. All API calls to Kinesis use SSL, so there is no need for an additional SSL option in the AWS DMS endpoint. You can securely put and retrieve data through SSL endpoints using the HTTPS protocol, which AWS DMS uses by default when connecting to a Kinesis data stream.

**Kinesis Data Streams endpoint settings**

When you use Kinesis Data Streams target endpoints, you can get transaction and control details using the `KinesisSettings` option in the AWS DMS API. 

You can set connection settings in the following ways:
+ In the AWS DMS console, using endpoint settings.
+ In the CLI, using the `kinesis-settings` option of the [CreateEndpoint](https://docs.aws.amazon.com/dms/latest/APIReference/API_CreateEndpoint.html) command.

In the CLI, use the following request parameters of the `kinesis-settings` option:
**Note**  
Support for the `IncludeNullAndEmpty` endpoint setting is available in AWS DMS version 3.4.1 and higher. Support for the other endpoint settings that follow is available in all supported AWS DMS versions. 
+ `MessageFormat` – The output format for the records created on the endpoint. The message format is `JSON` (default) or `JSON_UNFORMATTED` (a single line with no tab).
+ `IncludeControlDetails` – Shows detailed control information for table definition, column definition, and table and column changes in the Kinesis message output. The default is `false`.
+ `IncludeNullAndEmpty` – Include NULL and empty columns in the target. The default is `false`.
+ `IncludePartitionValue` – Shows the partition value within the Kinesis message output, unless the partition type is `schema-table-type`. The default is `false`.
+ `IncludeTableAlterOperations` – Includes any data definition language (DDL) operations that change the table in the control data, such as `rename-table`, `drop-table`, `add-column`, `drop-column`, and `rename-column`. The default is `false`.
+ `IncludeTransactionDetails` – Provides detailed transaction information from the source database. This information includes a commit timestamp, a log position, and values for `transaction_id`, `previous_transaction_id`, and `transaction_record_id` (the record offset within a transaction). The default is `false`.
+ `PartitionIncludeSchemaTable` – Prefixes schema and table names to partition values, when the partition type is `primary-key-type`. Doing this increases data distribution among Kinesis shards. For example, suppose that a `SysBench` schema has thousands of tables and each table has only limited range for a primary key. In this case, the same primary key is sent from thousands of tables to the same shard, which causes throttling. The default is `false`.
+ `UseLargeIntegerValue` – Use integers of up to 18 digits instead of casting them to doubles. Available in AWS DMS version 3.5.4 and higher. The default is `false`.

The following example shows the `kinesis-settings` option in use with an example `create-endpoint` command issued using the AWS CLI.

```
aws dms \
  create-endpoint \
    --region <aws-region> \
    --endpoint-identifier <user-endpoint-identifier> \
    --endpoint-type target \
    --engine-name kinesis \
    --kinesis-settings ServiceAccessRoleArn=arn:aws:iam::<account-id>:role/<kinesis-role-name>,StreamArn=arn:aws:kinesis:<aws-region>:<account-id>:stream/<stream-name>,MessageFormat=json-unformatted,IncludeControlDetails=true,IncludeTransactionDetails=true,IncludePartitionValue=true,PartitionIncludeSchemaTable=true,IncludeTableAlterOperations=true
```

**Multithreaded full load task settings**

To help increase the speed of the transfer, AWS DMS supports a multithreaded full load to a Kinesis Data Streams target instance. DMS supports this multithreading with task settings that include the following:
+ `MaxFullLoadSubTasks` – Use this option to indicate the maximum number of source tables to load in parallel. DMS loads each table into its corresponding Kinesis target table using a dedicated subtask. The default is 8; the maximum value is 49.
+ `ParallelLoadThreads` – Use this option to specify the number of threads that AWS DMS uses to load each table into its Kinesis target table. The maximum value for a Kinesis Data Streams target is 32. You can ask to have this maximum limit increased.
+ `ParallelLoadBufferSize` – Use this option to specify the maximum number of records to store in the buffer that the parallel load threads use to load data to the Kinesis target. The default value is 50. The maximum value is 1,000. Use this setting with `ParallelLoadThreads`. `ParallelLoadBufferSize` is valid only when there is more than one thread.
+ `ParallelLoadQueuesPerThread` – Use this option to specify the number of queues each concurrent thread accesses to take data records out of queues and generate a batch load for the target. The default is 1. However, for Kinesis targets of various payload sizes, the valid range is 5–512 queues per thread.
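
As a sketch, these full-load settings could be assembled and validated before being passed as task settings JSON. This is illustrative only; `full_load_settings` is a hypothetical helper, and the key placement follows the standard DMS task-settings layout (`MaxFullLoadSubTasks` under `FullLoadSettings`, the `ParallelLoad*` settings under `TargetMetadata`):

```python
import json

# Hypothetical helper: assemble the multithreaded full-load task settings
# described above. MaxFullLoadSubTasks caps parallel tables (default 8, max 49);
# ParallelLoadThreads and ParallelLoadBufferSize tune per-table threading.
def full_load_settings(sub_tasks=8, threads=16, buffer_size=50, queues_per_thread=16):
    assert 1 <= sub_tasks <= 49, "MaxFullLoadSubTasks maximum is 49"
    assert threads <= 32, "Kinesis targets allow at most 32 ParallelLoadThreads"
    assert 1 <= buffer_size <= 1000, "ParallelLoadBufferSize maximum is 1,000"
    return {
        "FullLoadSettings": {"MaxFullLoadSubTasks": sub_tasks},
        "TargetMetadata": {
            "ParallelLoadThreads": threads,
            "ParallelLoadBufferSize": buffer_size,
            "ParallelLoadQueuesPerThread": queues_per_thread,
        },
    }

settings_json = json.dumps(full_load_settings(), indent=4)
```

The resulting JSON can be supplied to `create-replication-task` through its `--replication-task-settings` parameter.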

**Multithreaded CDC load task settings**

You can improve the performance of change data capture (CDC) for real-time data streaming target endpoints like Kinesis by using task settings to modify the behavior of the `PutRecords` API call. To do this, you can specify the number of concurrent threads, queues per thread, and the number of records to store in a buffer using `ParallelApply*` task settings. For example, suppose that you want to perform a CDC load and apply 32 threads in parallel, accessing 64 queues per thread, with 50 records stored per buffer.

To promote CDC performance, AWS DMS supports these task settings:
+ `ParallelApplyThreads` – Specifies the number of concurrent threads that AWS DMS uses during a CDC load to push data records to a Kinesis target endpoint. The default value is zero (0) and the maximum value is 32.
+ `ParallelApplyBufferSize` – Specifies the maximum number of records to store in each buffer queue for concurrent threads to push to a Kinesis target endpoint during a CDC load. The default value is 100 and the maximum value is 1,000. Use this option when `ParallelApplyThreads` specifies more than one thread. 
+ `ParallelApplyQueuesPerThread` – Specifies the number of queues that each thread accesses to take data records out of queues and generate a batch load for a Kinesis endpoint during CDC. The default value is 1 and the maximum value is 512.

When using `ParallelApply*` task settings, the `partition-key-type` default is the `primary-key` of the table, not `schema-name.table-name`.
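
As an illustrative sketch (not output from DMS), the `ParallelApply*` settings above sit in the `TargetMetadata` section of the task settings JSON; the values here stay within the stated maxima:

```python
import json

# The ParallelApply* CDC settings described above, expressed as TargetMetadata
# task settings. Values stay within the stated maxima (threads <= 32,
# queues per thread <= 512, buffer size <= 1,000).
cdc_tuning = {
    "TargetMetadata": {
        "ParallelApplyThreads": 32,
        "ParallelApplyQueuesPerThread": 64,
        "ParallelApplyBufferSize": 50,
    }
}
cdc_tuning_json = json.dumps(cdc_tuning)
```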

## Using a before image to view original values of CDC rows for a Kinesis data stream as a target
<a name="CHAP_Target.Kinesis.BeforeImage"></a>

When writing CDC updates to a data-streaming target like Kinesis, you can view a source database row's original values before change by an update. To make this possible, AWS DMS populates a *before image* of update events based on data supplied by the source database engine. 

Different source database engines provide different amounts of information for a before image: 
+ Oracle provides updates to columns only if they change. 
+ PostgreSQL provides only data for columns that are part of the primary key (changed or not). To provide data for all columns (changed or not), you need to set `REPLICA_IDENTITY` to `FULL` instead of `DEFAULT`. Note that you should choose the `REPLICA_IDENTITY` setting carefully for each table. If you set `REPLICA_IDENTITY` to `FULL`, all of the column values are written to write-ahead logging (WAL) continuously. This may cause performance or resource issues with tables that are updated frequently.
+ MySQL generally provides data for all columns except for BLOB and CLOB data types (changed or not).
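
For PostgreSQL sources, the `REPLICA_IDENTITY` change is per-table DDL. The following minimal sketch generates the statement for one table (the table name is illustrative; run the DDL on the source database for each table that needs full before images):

```python
# Build the PostgreSQL DDL that makes before images carry all columns
# for one table, as described above. Weigh the WAL overhead first.
def replica_identity_full(table: str) -> str:
    return f"ALTER TABLE {table} REPLICA IDENTITY FULL;"

ddl = replica_identity_full("employees")
```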

To enable before imaging to add original values from the source database to the AWS DMS output, use either the `BeforeImageSettings` task setting or the `add-before-image-columns` parameter. This parameter applies a column transformation rule. 

`BeforeImageSettings` adds a new JSON attribute to every update operation with values collected from the source database system, as shown following.

```
"BeforeImageSettings": {
    "EnableBeforeImage": boolean,
    "FieldName": string,  
    "ColumnFilter": pk-only (default) / non-lob / all (but only one)
}
```

**Note**  
Only apply `BeforeImageSettings` to AWS DMS tasks that contain a CDC component, such as full load plus CDC tasks (which migrate existing data and replicate ongoing changes), or to CDC only tasks (which replicate data changes only). Don't apply `BeforeImageSettings` to tasks that are full load only.

For `BeforeImageSettings` options, the following applies:
+ Set the `EnableBeforeImage` option to `true` to enable before imaging. The default is `false`. 
+ Use the `FieldName` option to assign a name to the new JSON attribute. When `EnableBeforeImage` is `true`, `FieldName` is required and can't be empty.
+ The `ColumnFilter` option specifies a column to add by using before imaging. To add only columns that are part of the table's primary keys, use the default value, `pk-only`. To add any column that has a before image value, use `all`. Note that the before image does not contain columns with LOB data types, such as CLOB or BLOB.

  ```
  "BeforeImageSettings": {
      "EnableBeforeImage": true,
      "FieldName": "before-image",
      "ColumnFilter": "pk-only"
    }
  ```

**Note**  
Amazon S3 targets don't support `BeforeImageSettings`. For S3 targets, use only the `add-before-image-columns` transformation rule to perform before imaging during CDC.

### Using a before image transformation rule
<a name="CHAP_Target.Kinesis.BeforeImage.Transform-Rule"></a>

As an alternative to task settings, you can use the `add-before-image-columns` parameter, which applies a column transformation rule. With this parameter, you can enable before imaging during CDC on data streaming targets like Kinesis. 

By using `add-before-image-columns` in a transformation rule, you can apply more fine-grained control of the before image results. Transformation rules enable you to use an object locator that gives you control over tables selected for the rule. Also, you can chain transformation rules together, which allows different rules to be applied to different tables. You can then manipulate the columns produced by using other rules. 

**Note**  
Don't use the `add-before-image-columns` parameter together with the `BeforeImageSettings` task setting within the same task. Instead, use either the parameter or the setting, but not both, for a single task.

A `transformation` rule type with the `add-before-image-columns` parameter for a column must provide a `before-image-def` section. The following shows an example.

```
    {
      "rule-type": "transformation",
      …
      "rule-target": "column",
      "rule-action": "add-before-image-columns",
      "before-image-def":{
        "column-filter": one-of  (pk-only / non-lob / all),
        "column-prefix": string,
        "column-suffix": string,
      }
    }
```

The value of `column-prefix` is prepended to a column name, and the default value of `column-prefix` is `BI_`. The value of `column-suffix` is appended to the column name, and the default is empty. Don't set both `column-prefix` and `column-suffix` to empty strings.

Choose one value for `column-filter`. To add only columns that are part of table primary keys, choose `pk-only`. Choose `non-lob` to add only columns that are not of LOB type. Or choose `all` to add any column that has a before-image value.
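
Taken together, the naming rules above behave like this sketch (a hypothetical helper, with the defaults from the text: prefix `BI_`, empty suffix):

```python
# Compute a before-image column name from column-prefix / column-suffix.
# Defaults mirror the text: prefix "BI_", suffix empty; both empty is invalid.
def before_image_column(name: str, prefix: str = "BI_", suffix: str = "") -> str:
    if not prefix and not suffix:
        raise ValueError("column-prefix and column-suffix can't both be empty")
    return f"{prefix}{name}{suffix}"
```

For example, `before_image_column("emp_no")` yields `BI_emp_no`, matching the example in the next section.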

### Example for a before image transformation rule
<a name="CHAP_Target.Kinesis.BeforeImage.Example"></a>

The transformation rule in the following example adds a new column called `BI_emp_no` in the target. So a statement like `UPDATE employees SET emp_no = 3 WHERE emp_no = 1;` populates the `BI_emp_no` field with 1. When you write CDC updates to Amazon S3 targets, the `BI_emp_no` column makes it possible to tell which original row was updated.

```
{
  "rules": [
    {
      "rule-type": "selection",
      "rule-id": "1",
      "rule-name": "1",
      "object-locator": {
        "schema-name": "%",
        "table-name": "%"
      },
      "rule-action": "include"
    },
    {
      "rule-type": "transformation",
      "rule-id": "2",
      "rule-name": "2",
      "rule-target": "column",
      "object-locator": {
        "schema-name": "%",
        "table-name": "employees"
      },
      "rule-action": "add-before-image-columns",
      "before-image-def": {
        "column-prefix": "BI_",
        "column-suffix": "",
        "column-filter": "pk-only"
      }
    }
  ]
}
```

For information on using the `add-before-image-columns` rule action, see [Transformation rules and actions](CHAP_Tasks.CustomizingTasks.TableMapping.SelectionTransformation.Transformations.md).

## Prerequisites for using a Kinesis data stream as a target for AWS Database Migration Service
<a name="CHAP_Target.Kinesis.Prerequisites"></a>

### IAM role for using a Kinesis data stream as a target for AWS Database Migration Service
<a name="CHAP_Target.Kinesis.Prerequisites.IAM"></a>

Before you set up a Kinesis data stream as a target for AWS DMS, make sure that you create an IAM role. This role must allow AWS DMS to assume it, and it must grant access to the Kinesis data streams being migrated into. The minimum set of access permissions is shown in the following IAM policy.

------
#### [ JSON ]

****  

```
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "1",
      "Effect": "Allow",
      "Principal": {
        "Service": "dms.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
```

------

The role that you use for the migration to a Kinesis data stream must have the following permissions.

------
#### [ JSON ]

****  

```
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "kinesis:DescribeStream",
        "kinesis:PutRecord",
        "kinesis:PutRecords"
      ],
      "Resource": "*"
    }
  ]
}
```

------

### Accessing a Kinesis data stream as a target for AWS Database Migration Service
<a name="CHAP_Target.Kinesis.Prerequisites.Access"></a>

In AWS DMS version 3.4.7 and higher, to connect to a Kinesis endpoint, you must do one of the following:
+ Configure DMS to use VPC endpoints. For information about configuring DMS to use VPC endpoints, see [Configuring VPC endpoints for AWS DMS](CHAP_VPC_Endpoints.md).
+ Configure DMS to use public routes, that is, make your replication instance public. For information about public replication instances, see [Public and private replication instances](CHAP_ReplicationInstance.PublicPrivate.md).

## Limitations when using Kinesis Data Streams as a target for AWS Database Migration Service
<a name="CHAP_Target.Kinesis.Limitations"></a>

The following limitations apply when using Kinesis Data Streams as a target:
+ AWS DMS publishes each update to a single record in the source database as one data record in a given Kinesis data stream regardless of transactions. However, you can include transaction details for each data record by using relevant parameters of the `KinesisSettings` API.
+ Full LOB mode is not supported.
+ The maximum supported LOB size is 1 MB.
+ Kinesis Data Streams don't support deduplication. Applications that consume data from a stream need to handle duplicate records. For more information, see [Handling duplicate records](https://docs.aws.amazon.com/streams/latest/dev/kinesis-record-processor-duplicates.html) in the *Amazon Kinesis Data Streams Developer Guide.*
+ AWS DMS supports the following four forms for partition keys:
  + `SchemaName.TableName`: A combination of the schema and table name.
  + `${AttributeName}`: The value of one of the fields in the JSON, or the primary key of the table in the source database.
  + `transaction-id`: The CDC transaction ID. All records within the same transaction go to the same partition.
  + `constant`: A fixed literal value for every record regardless of table or data. All records are sent to the same partition key value "constant", providing strict global ordering across all tables.

  ```
  {
      "rule-type": "object-mapping",
      "rule-id": "2",
      "rule-name": "PartitionKeyTypeExample",
      "rule-action": "map-record-to-document",
      "object-locator": {
          "schema-name": "onprem",
          "table-name": "it_system"
      },
      "mapping-parameters": {
          "partition-key-type": "transaction-id | constant | attribute-name | schema-table"
      }
  }
  ```
+ For information about encrypting your data at rest within Kinesis Data Streams, see [Data protection in Kinesis Data Streams](https://docs.aws.amazon.com/streams/latest/dev/server-side-encryption.html) in the *Amazon Kinesis Data Streams Developer Guide*. 
+ The `IncludeTransactionDetails` endpoint setting is only supported when the source endpoint is Oracle, SQL Server, PostgreSQL, or MySQL. For other source endpoint types, transaction details will not be included.
+ `BatchApply` is not supported for a Kinesis endpoint. Using Batch Apply (for example, the `BatchApplyEnabled` target metadata task setting) for a Kinesis target causes task failure and data loss. Do not enable `BatchApply` when using Kinesis as a target endpoint.
+ Kinesis targets are only supported for a Kinesis data stream in the same AWS account and the same AWS Region as the replication instance.
+ When migrating from a MySQL source, the BeforeImage data doesn't include CLOB and BLOB data types. For more information, see [Using a before image to view original values of CDC rows for a Kinesis data stream as a target](#CHAP_Target.Kinesis.BeforeImage).
+ AWS DMS doesn't support migrating values of `BigInt` data type with more than 16 digits. To work around this limitation, you can use the following transformation rule to convert the `BigInt` column to a string. For more information about transformation rules, see [Transformation rules and actions](CHAP_Tasks.CustomizingTasks.TableMapping.SelectionTransformation.Transformations.md).

  ```
  {
      "rule-type": "transformation",
      "rule-id": "id",
      "rule-name": "name",
      "rule-target": "column",
      "object-locator": {
          "schema-name": "valid object-mapping rule action",
          "table-name": "",
          "column-name": ""
      },
      "rule-action": "change-data-type",
      "data-type": {
          "type": "string",
          "length": 20
      }
  }
  ```
+ When multiple DML operations within a single transaction modify a Large Object (LOB) column on the source database, the target database retains only the final LOB value from the last operation in that transaction. The intermediate LOB values set by earlier operations in the same transaction are overwritten, which can result in potential data loss or inconsistencies. This behavior occurs due to how LOB data is processed during replication.
+ AWS DMS does not support source data containing embedded `'\0'` characters when using Kinesis as a target endpoint. Data containing embedded `'\0'` characters will be truncated at the first `'\0'` character.
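
One of the limitations above lists four supported partition-key forms. They can be summarized as a single selection function; this is an illustrative sketch of the resolution logic, not DMS internals, and all names are hypothetical:

```python
# Sketch of how the four partition-key forms described above resolve to a
# partition key for one record. Names are illustrative, not DMS internals.
def partition_key(key_type, schema, table, record=None, attribute=None, txn_id=None):
    if key_type == "schema-table":
        return f"{schema}.{table}"
    if key_type == "attribute-name":
        return str(record[attribute])   # value of one field in the JSON
    if key_type == "transaction-id":
        return str(txn_id)              # same transaction -> same partition
    if key_type == "constant":
        return "constant"               # one key for every record
    raise ValueError(f"unknown partition-key-type: {key_type}")
```

For `constant`, every record resolves to the same key, which is what concentrates all data on one shard and yields global ordering across tables.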

## Using object mapping to migrate data to a Kinesis data stream
<a name="CHAP_Target.Kinesis.ObjectMapping"></a>

AWS DMS uses table-mapping rules to map data from the source to the target Kinesis data stream. To map data to a target stream, you use a type of table-mapping rule called object mapping. You use object mapping to define how data records in the source map to the data records published to the Kinesis data stream. 

Kinesis data streams don't have a preset structure other than having a partition key. In an object mapping rule, the possible values of a `partition-key-type` for data records are `schema-table`, `transaction-id`, `primary-key`, `constant`, and `attribute-name`.

To create an object-mapping rule, you specify `rule-type` as `object-mapping`. This rule specifies what type of object mapping you want to use. 

The structure for the rule is as follows.

```
{
    "rules": [
        {
            "rule-type": "object-mapping",
            "rule-id": "id",
            "rule-name": "name",
            "rule-action": "valid object-mapping rule action",
            "object-locator": {
                "schema-name": "case-sensitive schema name",
                "table-name": ""
            }
        }
    ]
}
```

AWS DMS currently supports `map-record-to-record` and `map-record-to-document` as the only valid values for the `rule-action` parameter. These settings affect values that aren't excluded as part of the `exclude-columns` attribute list. The `map-record-to-record` and `map-record-to-document` values specify how AWS DMS handles these records by default. These values don't affect the attribute mappings in any way. 

Use `map-record-to-record` when migrating from a relational database to a Kinesis data stream. This rule type uses the `taskResourceId.schemaName.tableName` value from the relational database as the partition key in the Kinesis data stream and creates an attribute for each column in the source database. 

When using `map-record-to-record`, note the following:
+ This setting affects only columns that are not excluded by the `exclude-columns` list.
+ For every such column, AWS DMS creates a corresponding attribute in the target data record.
+ AWS DMS creates this corresponding attribute regardless of whether the source column is used in an attribute mapping. 

Use `map-record-to-document` to put source columns into a single, flat document in the appropriate target stream using the attribute name `_doc`. AWS DMS places the data into a single, flat map named `_doc`. This placement applies to any column in the source table not listed in the `exclude-columns` attribute list.

One way to understand `map-record-to-record` is to see it in action. For this example, assume that you are starting with a relational database table row with the following structure and data.


| FirstName | LastName | StoreId | HomeAddress | HomePhone | WorkAddress | WorkPhone | DateofBirth | 
| --- | --- | --- | --- | --- | --- | --- | --- | 
| Randy | Marsh | 5 | 221B Baker Street | 1234567890 | 31 Spooner Street, Quahog  | 9876543210 | 02/29/1988 | 

To migrate this information from a schema named `Test` to a Kinesis data stream, you create rules to map the data to the target stream. The following rule illustrates the mapping. 

```
{
    "rules": [
        {
            "rule-type": "selection",
            "rule-id": "1",
            "rule-name": "1",
            "rule-action": "include",
            "object-locator": {
                "schema-name": "Test",
                "table-name": "%"
            }
        },
        {
            "rule-type": "object-mapping",
            "rule-id": "2",
            "rule-name": "DefaultMapToKinesis",
            "rule-action": "map-record-to-record",
            "object-locator": {
                "schema-name": "Test",
                "table-name": "Customers"
            }
        }
    ]
}
```

The following illustrates the resulting record format in the Kinesis data stream: 
+ StreamName: XXX
+ PartitionKey: Test.Customers //schemaName.tableName
+ Data: //The following JSON message

  ```
    {
       "FirstName": "Randy",
       "LastName": "Marsh",
       "StoreId":  "5",
       "HomeAddress": "221B Baker Street",
       "HomePhone": "1234567890",
       "WorkAddress": "31 Spooner Street, Quahog",
       "WorkPhone": "9876543210",
       "DateOfBirth": "02/29/1988"
    }
  ```

However, suppose that you use the same rules but change the `rule-action` parameter to `map-record-to-document` and exclude certain columns. The following rule illustrates the mapping.

```
{
	"rules": [
	   {
			"rule-type": "selection",
			"rule-id": "1",
			"rule-name": "1",
			"rule-action": "include",
			"object-locator": {
				"schema-name": "Test",
				"table-name": "%"
			}
		},
		{
			"rule-type": "object-mapping",
			"rule-id": "2",
			"rule-name": "DefaultMapToKinesis",
			"rule-action": "map-record-to-document",
			"object-locator": {
				"schema-name": "Test",
				"table-name": "Customers"
			},
			"mapping-parameters": {
				"exclude-columns": [
					"homeaddress",
					"homephone",
					"workaddress",
					"workphone"
				]
			}
		}
	]
}
```

In this case, the columns not listed in the `exclude-columns` parameter, `FirstName`, `LastName`, `StoreId` and `DateOfBirth`, are mapped to `_doc`. The following illustrates the resulting record format. 

```
       {
            "data":{
                "_doc":{
                    "FirstName": "Randy",
                    "LastName": "Marsh",
                    "StoreId":  "5",
                    "DateOfBirth": "02/29/1988"
                }
            }
        }
```

### Restructuring data with attribute mapping
<a name="CHAP_Target.Kinesis.AttributeMapping"></a>

You can restructure the data while you are migrating it to a Kinesis data stream using an attribute map. For example, you might want to combine several fields in the source into a single field in the target. The following attribute map illustrates how to restructure the data.

```
{
    "rules": [
        {
            "rule-type": "selection",
            "rule-id": "1",
            "rule-name": "1",
            "rule-action": "include",
            "object-locator": {
                "schema-name": "Test",
                "table-name": "%"
            }
        },
        {
            "rule-type": "object-mapping",
            "rule-id": "2",
            "rule-name": "TransformToKinesis",
            "rule-action": "map-record-to-record",
            "target-table-name": "CustomerData",
            "object-locator": {
                "schema-name": "Test",
                "table-name": "Customers"
            },
            "mapping-parameters": {
                "partition-key-type": "attribute-name",
                "partition-key-name": "CustomerName",
                "exclude-columns": [
                    "firstname",
                    "lastname",
                    "homeaddress",
                    "homephone",
                    "workaddress",
                    "workphone"
                ],
                "attribute-mappings": [
                    {
                        "target-attribute-name": "CustomerName",
                        "attribute-type": "scalar",
                        "attribute-sub-type": "string",
                        "value": "${lastname}, ${firstname}"
                    },
                    {
                        "target-attribute-name": "ContactDetails",
                        "attribute-type": "document",
                        "attribute-sub-type": "json",
                        "value": {
                            "Home": {
                                "Address": "${homeaddress}",
                                "Phone": "${homephone}"
                            },
                            "Work": {
                                "Address": "${workaddress}",
                                "Phone": "${workphone}"
                            }
                        }
                    }
                ]
            }
        }
    ]
}
```
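
The `${column}` placeholders in the `value` fields are filled in from the source row. The following rough sketch (a hypothetical `resolve` helper, not DMS code) shows how the `CustomerName` template above resolves:

```python
import re

# Rough sketch: resolve "${column}" placeholders in an attribute-mapping
# value against a source row, as in the CustomerName mapping above.
def resolve(template: str, row: dict) -> str:
    return re.sub(r"\$\{(\w+)\}", lambda m: str(row[m.group(1)]), template)

row = {"firstname": "Randy", "lastname": "Marsh"}
customer_name = resolve("${lastname}, ${firstname}", row)
```

Here `customer_name` becomes `Marsh, Randy`, which serves both as the attribute value and, because `partition-key-name` is `CustomerName`, as the record's partition key.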

To set a constant value for the partition key, specify `"partition-key-type": "constant"`. This sets the partition key to the literal value `constant`. For example, you might do this to force all the data to be stored in a single shard. The following mapping illustrates this approach. 

```
{
    "rules": [
        {
            "rule-type": "selection",
            "rule-id": "1",
            "rule-name": "1",
            "object-locator": {
                "schema-name": "Test",
                "table-name": "%"
            },
            "rule-action": "include"
        },
        {
            "rule-type": "object-mapping",
            "rule-id": "2",
            "rule-name": "TransformToKinesis",
            "rule-action": "map-record-to-document",
            "object-locator": {
                "schema-name": "Test",
                "table-name": "Customer"
            },
            "mapping-parameters": {
                "partition-key-type": "constant",
                "exclude-columns": [
                    "FirstName",
                    "LastName",
                    "HomeAddress",
                    "HomePhone",
                    "WorkAddress",
                    "WorkPhone"
                ],
                "attribute-mappings": [
                    {
                        "target-attribute-name": "CustomerName",
                        "attribute-type": "scalar",
                        "attribute-sub-type": "string",
                        "value": "${FirstName},${LastName}"

                    },
                    {
                        "target-attribute-name": "ContactDetails",
                        "attribute-type": "scalar",
                        "attribute-sub-type": "string",
                        "value": {
                            "Home": {
                                "Address": "${HomeAddress}",
                                "Phone": "${HomePhone}"
                            },
                            "Work": {
                                "Address": "${WorkAddress}",
                                "Phone": "${WorkPhone}"
                            }
                        }
                    },
                    {
                        "target-attribute-name": "DateOfBirth",
                        "attribute-type": "scalar",
                        "attribute-sub-type": "string",
                        "value": "${DateOfBirth}"
                    }
                ]
            }
        }
    ]
}
```

**Note**  
The `partition-key` value for a control record that is for a specific table is `TaskId.SchemaName.TableName`. The `partition-key` value for a control record that is for a specific task is that record's `TaskId`. Specifying a `partition-key` value in the object mapping has no impact on the `partition-key` for a control record.  
 When `partition-key-type` is set to `attribute-name` in a table mapping rule, you must specify `partition-key-name`, which must reference either a column from the source table or a custom column defined in the mapping. Additionally, `attribute-mappings` must be provided to define how source columns map to the target Kinesis Stream.

### Message format for Kinesis Data Streams
<a name="CHAP_Target.Kinesis.Messageformat"></a>

The JSON output is simply a list of key-value pairs. The `JSON_UNFORMATTED` message format is a single-line JSON string with a newline delimiter.

AWS DMS provides the following reserved fields to make it easier to consume the data from the Kinesis Data Streams: 

**RecordType**  
The record type can be either data or control. *Data records* represent the actual rows in the source. *Control records* are for important events in the stream, for example, a restart of the task.

**Operation**  
For data records, the operation can be `load`, `insert`, `update`, or `delete`.  
For control records, the operation can be `create-table`, `rename-table`, `drop-table`, `change-columns`, `add-column`, `drop-column`, `rename-column`, or `column-type-change`.

**SchemaName**  
The source schema for the record. This field can be empty for a control record.

**TableName**  
The source table for the record. This field can be empty for a control record.

**Timestamp**  
The timestamp for when the JSON message was constructed. The field is formatted with the ISO 8601 format.
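
Putting the reserved fields together, a data record might be envisioned as in the following sketch. Field spellings and the exact envelope in real DMS output may differ; the values here are illustrative:

```python
import json
from datetime import datetime, timezone

# Illustrative message built from the reserved fields listed above.
# This is a sketch, not verbatim DMS output.
message = {
    "metadata": {
        "RecordType": "data",
        "Operation": "insert",
        "SchemaName": "Test",
        "TableName": "Customers",
        "Timestamp": datetime(2024, 1, 1, tzinfo=timezone.utc).isoformat(),
    },
    "data": {"FirstName": "Randy", "LastName": "Marsh"},
}
line = json.dumps(message)  # JSON_UNFORMATTED: one line per message
```

With `MessageFormat=json-unformatted`, each such message is serialized as a single line, with a newline as the delimiter between messages.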

# Using Apache Kafka as a target for AWS Database Migration Service
<a name="CHAP_Target.Kafka"></a>

You can use AWS DMS to migrate data to an Apache Kafka cluster. Apache Kafka is a distributed streaming platform. You can use Apache Kafka for ingesting and processing streaming data in real-time.

AWS also offers Amazon Managed Streaming for Apache Kafka (Amazon MSK) to use as an AWS DMS target. Amazon MSK is a fully managed Apache Kafka streaming service that simplifies the implementation and management of Apache Kafka instances. It works with open-source Apache Kafka versions, and you access Amazon MSK instances as AWS DMS targets exactly like any Apache Kafka instance. For more information, see [What is Amazon MSK?](https://docs.aws.amazon.com/msk/latest/developerguide/what-is-msk.html) in the *Amazon Managed Streaming for Apache Kafka Developer Guide*.

A Kafka cluster stores streams of records in categories called topics that are divided into partitions. *Partitions* are uniquely identified sequences of data records (messages) in a topic. Partitions can be distributed across multiple brokers in a cluster to enable parallel processing of a topic’s records. For more information on topics and partitions and their distribution in Apache Kafka, see [Topics and logs](https://kafka.apache.org/documentation/#intro_topics) and [Distribution](https://kafka.apache.org/documentation/#intro_distribution).

Your Kafka cluster can be either an Amazon MSK instance, a cluster running on an Amazon EC2 instance, or an on-premises cluster. An Amazon MSK instance or a cluster on an Amazon EC2 instance can be in the same VPC or a different one. If your cluster is on-premises, you can use your own on-premises name server for your replication instance to resolve the cluster's host name. For information about setting up a name server for your replication instance, see [Using your own on-premises name server](CHAP_BestPractices.md#CHAP_BestPractices.Rte53DNSResolver). For more information about setting up a network, see [Setting up a network for a replication instance](CHAP_ReplicationInstance.VPC.md).

When using an Amazon MSK cluster, make sure that its security group allows access from your replication instance. For information about changing the security group for an Amazon MSK cluster, see [Changing an Amazon MSK cluster's security group](https://docs.aws.amazon.com/msk/latest/developerguide/change-security-group.html).

AWS Database Migration Service publishes records to a Kafka topic using JSON. During conversion, AWS DMS serializes each record from the source database into an attribute-value pair in JSON format.

To migrate your data from any supported data source to a target Kafka cluster, you use object mapping. With object mapping, you determine how to structure the data records in the target topic. You also define a partition key for each table, which Apache Kafka uses to group the data into its partitions. 

Currently, AWS DMS supports a single topic per task. For a single task with multiple tables, all messages go to a single topic. Each message includes a metadata section that identifies the target schema and table. AWS DMS versions 3.4.6 and higher support multitopic replication using object mapping. For more information, see [Multitopic replication using object mapping](#CHAP_Target.Kafka.MultiTopic).

**Apache Kafka endpoint settings**

You can specify connection details through endpoint settings in the AWS DMS console, or the `--kafka-settings` option in the CLI. The requirements for each setting follow:
+ `Broker` – Specify the locations of one or more brokers in your Kafka cluster in the form of a comma-separated list of each `broker-hostname:port`. An example is `"ec2-12-345-678-901.compute-1.amazonaws.com:2345,ec2-10-987-654-321.compute-1.amazonaws.com:9876"`. This setting can specify the locations of any or all brokers in the cluster. The cluster brokers all communicate to handle the partitioning of data records migrated to the topic.
+ `Topic` – (Optional) Specify the topic name with a maximum length of 255 letters and symbols. You can use period (.), underscore (\_), and minus (-). Topic names with a period (.) or underscore (\_) can collide in internal data structures. Use either one, but not both of these symbols in the topic name. If you don't specify a topic name, AWS DMS uses `"kafka-default-topic"` as the migration topic.
**Note**  
To have AWS DMS create either a migration topic you specify or the default topic, set `auto.create.topics.enable = true` as part of your Kafka cluster configuration. For more information, see [Limitations when using Apache Kafka as a target for AWS Database Migration Service](#CHAP_Target.Kafka.Limitations).
+ `MessageFormat` – The output format for the records created on the endpoint. The message format is `JSON` (default) or `JSON_UNFORMATTED` (a single line with no tab).
+ `MessageMaxBytes` – The maximum size in bytes for records created on the endpoint. The default is 1,000,000.
**Note**  
You can only use the AWS CLI/SDK to change `MessageMaxBytes` to a non-default value. For example, to modify your existing Kafka endpoint and change `MessageMaxBytes`, use the following command.  

  ```
  aws dms modify-endpoint --endpoint-arn your-endpoint 
  --kafka-settings Broker="broker1-server:broker1-port,broker2-server:broker2-port,...",
  Topic=topic-name,MessageMaxBytes=integer-of-max-message-size-in-bytes
  ```
+ `IncludeTransactionDetails` – Provides detailed transaction information from the source database. This information includes a commit timestamp, a log position, and values for `transaction_id`, `previous_transaction_id`, and `transaction_record_id` (the record offset within a transaction). The default is `false`.
+ `IncludePartitionValue` – Shows the partition value within the Kafka message output, unless the partition type is `schema-table-type`. The default is `false`.
+ `PartitionIncludeSchemaTable` – Prefixes schema and table names to partition values, when the partition type is `primary-key-type`. Doing this increases data distribution among Kafka partitions. For example, suppose that a `SysBench` schema has thousands of tables and each table has only limited range for a primary key. In this case, the same primary key is sent from thousands of tables to the same partition, which causes throttling. The default is `false`.
+ `IncludeTableAlterOperations` – Includes any data definition language (DDL) operations that change the table in the control data, such as `rename-table`, `drop-table`, `add-column`, `drop-column`, and `rename-column`. The default is `false`. 
+ `IncludeControlDetails` – Shows detailed control information for table definition, column definition, and table and column changes in the Kafka message output. The default is `false`.
+ `IncludeNullAndEmpty` – Includes NULL and empty columns in the target. The default is `false`.
+ `SecurityProtocol` – Sets a secure connection to a Kafka target endpoint using Transport Layer Security (TLS). Options include `ssl-authentication`, `ssl-encryption`, and `sasl-ssl`. Using `sasl-ssl` requires `SaslUsername` and `SaslPassword`.
+ `SslEndpointIdentificationAlgorithm` – Sets hostname verification for the certificate. This setting is supported in AWS DMS version 3.5.1 and later. Options include the following: 
  + `NONE`: Disable hostname verification of the broker in the client connection.
  + `HTTPS`: Enable hostname verification of the broker in the client connection.
+ `useLargeIntegerValue` – Use integers of up to 18 digits instead of casting them to doubles. This setting is available in AWS DMS version 3.5.4 and later. The default is `false`.
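Taken together, these settings form the JSON document that you pass as `--kafka-settings`. The following sketch (with hypothetical broker hostnames; the helper function is not part of any AWS SDK) assembles such a document and enforces the topic-name rule described above:

```python
import json

def build_kafka_settings(broker, topic="kafka-default-topic",
                         message_format="JSON", message_max_bytes=1_000_000):
    """Assemble a KafkaSettings document for a DMS Kafka target endpoint."""
    # Topic names may use '.' or '_' but not both, to avoid collisions
    # in Kafka's internal data structures.
    if "." in topic and "_" in topic:
        raise ValueError("Use either '.' or '_' in the topic name, not both")
    if len(topic) > 255:
        raise ValueError("Topic name must be at most 255 characters")
    return {
        "Broker": broker,
        "Topic": topic,
        "MessageFormat": message_format,
        "MessageMaxBytes": message_max_bytes,
        "IncludeTransactionDetails": False,
        "SecurityProtocol": "ssl-encryption",
    }

settings = build_kafka_settings(
    "broker1.example.com:9092,broker2.example.com:9092",
    topic="dms.migration-topic",
)
print(json.dumps(settings, indent=4))
```

The resulting document can be supplied to `aws dms create-endpoint --kafka-settings` or the equivalent SDK call.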

You can use settings to help increase the speed of your transfer. To do so, AWS DMS supports a multithreaded full load to an Apache Kafka target cluster. AWS DMS supports this multithreading with task settings that include the following:
+ `MaxFullLoadSubTasks` – Use this option to indicate the maximum number of source tables to load in parallel. AWS DMS loads each table into its corresponding Kafka target table using a dedicated subtask. The default is 8; the maximum value is 49.
+ `ParallelLoadThreads` – Use this option to specify the number of threads that AWS DMS uses to load each table into its Kafka target table. The maximum value for an Apache Kafka target is 32. You can ask to have this maximum limit increased.
+ `ParallelLoadBufferSize` – Use this option to specify the maximum number of records to store in the buffer that the parallel load threads use to load data to the Kafka target. The default value is 50. The maximum value is 1,000. Use this setting with `ParallelLoadThreads`. `ParallelLoadBufferSize` is valid only when there is more than one thread.
+ `ParallelLoadQueuesPerThread` – Use this option to specify the number of queues each concurrent thread accesses to take data records out of queues and generate a batch load for the target. The default is 1. The maximum is 512.
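As an illustration, the full-load options above might appear in a task-settings document as follows. The values are hypothetical; `MaxFullLoadSubTasks` belongs under `FullLoadSettings`, while the `ParallelLoad*` options belong under `TargetMetadata`:

```python
import json

# Hypothetical task-settings fragment tuning a multithreaded full load
# to a Kafka target. Documented maximums: MaxFullLoadSubTasks <= 49,
# ParallelLoadThreads <= 32 (for Kafka), ParallelLoadBufferSize <= 1000,
# ParallelLoadQueuesPerThread <= 512.
full_load_settings = {
    "FullLoadSettings": {
        "MaxFullLoadSubTasks": 8
    },
    "TargetMetadata": {
        "ParallelLoadThreads": 16,
        "ParallelLoadBufferSize": 500,
        "ParallelLoadQueuesPerThread": 16
    }
}
print(json.dumps(full_load_settings, indent=4))
```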

You can improve the performance of change data capture (CDC) for Kafka endpoints by tuning task settings for parallel threads and bulk operations. To do this, you can specify the number of concurrent threads, queues per thread, and the number of records to store in a buffer using `ParallelApply*` task settings. For example, suppose you want to perform a CDC load and apply 128 threads in parallel. You also want to access 64 queues per thread, with 50 records stored per buffer. 

To promote CDC performance, AWS DMS supports these task settings:
+ `ParallelApplyThreads` – Specifies the number of concurrent threads that AWS DMS uses during a CDC load to push data records to a Kafka target endpoint. The default value is zero (0) and the maximum value is 32.
+ `ParallelApplyBufferSize` – Specifies the maximum number of records to store in each buffer queue for concurrent threads to push to a Kafka target endpoint during a CDC load. The default value is 100 and the maximum value is 1,000. Use this option when `ParallelApplyThreads` specifies more than one thread. 
+ `ParallelApplyQueuesPerThread` – Specifies the number of queues that each thread accesses to take data records out of queues and generate a batch load for a Kafka endpoint during CDC. The default is 1. The maximum is 512.
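A hypothetical task-settings fragment applying the `ParallelApply*` options (values chosen to stay within the documented maximums) might look like the following; these options live under `TargetMetadata`:

```python
import json

# Hypothetical CDC tuning for a Kafka target: concurrent apply threads,
# queues per thread, and buffered records per queue. Documented maximums:
# ParallelApplyThreads <= 32, ParallelApplyBufferSize <= 1000,
# ParallelApplyQueuesPerThread <= 512.
cdc_settings = {
    "TargetMetadata": {
        "ParallelApplyThreads": 32,
        "ParallelApplyQueuesPerThread": 64,
        "ParallelApplyBufferSize": 50
    }
}
print(json.dumps(cdc_settings, indent=4))
```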

When using `ParallelApply*` task settings, the `partition-key-type` default is the `primary-key` of the table, not `schema-name.table-name`.

## Connecting to Kafka using Transport Layer Security (TLS)
<a name="CHAP_Target.Kafka.TLS"></a>

A Kafka cluster accepts secure connections using Transport Layer Security (TLS). With DMS, you can use any one of the following three security protocol options to secure a Kafka endpoint connection.

**SSL encryption (`server-encryption`)**  
Clients validate server identity through the server’s certificate. Then an encrypted connection is made between server and client.

**SSL authentication (`mutual-authentication`)**  
Server and client validate the identity with each other through their own certificates. Then an encrypted connection is made between server and client.

**SASL-SSL (`mutual-authentication`)**  
The Simple Authentication and Security Layer (SASL) method replaces the client’s certificate with a user name and password to validate a client identity. Specifically, you provide a user name and password that the server has registered so that the server can validate the identity of a client. Then an encrypted connection is made between server and client.

**Important**  
Apache Kafka and Amazon MSK accept resolved certificates. This is a known limitation of Kafka and Amazon MSK that has yet to be addressed. For more information, see [Apache Kafka issues, KAFKA-3700](https://issues.apache.org/jira/browse/KAFKA-3700).  
If you're using Amazon MSK, consider using access control lists (ACLs) as a workaround to this known limitation. For more information about using ACLs, see [Apache Kafka ACLs](https://docs.aws.amazon.com//msk/latest/developerguide/msk-acls.html) section of *Amazon Managed Streaming for Apache Kafka Developer Guide*.  
If you're using a self-managed Kafka cluster, see [Comment dated 21/Oct/18](https://issues.apache.org/jira/browse/KAFKA-3700?focusedCommentId=16658376) for information about configuring your cluster.

### Using SSL encryption with Amazon MSK or a self-managed Kafka cluster
<a name="CHAP_Target.Kafka.TLS.SSLencryption"></a>

You can use SSL encryption to secure an endpoint connection to Amazon MSK or a self-managed Kafka cluster. When you use the SSL encryption authentication method, clients validate a server's identity through the server’s certificate. Then an encrypted connection is made between server and client.

**To use SSL encryption to connect to Amazon MSK**
+ Set the security protocol endpoint setting (`SecurityProtocol`) using the `ssl-encryption` option when you create your target Kafka endpoint. 

  The JSON example following sets the security protocol as SSL encryption.

```
"KafkaSettings": {
    "SecurityProtocol": "ssl-encryption", 
}
```

**To use SSL encryption for a self-managed Kafka cluster**

1. If you're using a private Certification Authority (CA) in your on-premises Kafka cluster, upload your private CA cert and get an Amazon Resource Name (ARN). 

1. Set the security protocol endpoint setting (`SecurityProtocol`) using the `ssl-encryption` option when you create your target Kafka endpoint. The JSON example following sets the security protocol as `ssl-encryption`.

   ```
   "KafkaSettings": {
       "SecurityProtocol": "ssl-encryption", 
   }
   ```

1. If you're using a private CA, set `SslCaCertificateArn` to the ARN that you got in the first step of this procedure.

### Using SSL authentication
<a name="CHAP_Target.Kafka.TLS.SSLauthentication"></a>

You can use SSL authentication to secure an endpoint connection to Amazon MSK or a self-managed Kafka cluster.

To enable client authentication and encryption using SSL authentication to connect to Amazon MSK, do the following:
+ Prepare a private key and public certificate for Kafka.
+ Upload certificates to the DMS certificate manager.
+ Create a Kafka target endpoint with corresponding certificate ARNs specified in Kafka endpoint settings.

**To prepare a private key and public certificate for Amazon MSK**

1. Create an EC2 instance and set up a client to use authentication as described in steps 1 through 9 in the [Client Authentication](https://docs.aws.amazon.com/msk/latest/developerguide/msk-authentication.html) section of *Amazon Managed Streaming for Apache Kafka Developer Guide*.

   After you complete those steps, you have a Certificate-ARN (the public certificate ARN saved in ACM), and a private key contained within a `kafka.client.keystore.jks` file.

1. Get the public certificate and copy the certificate to the `signed-certificate-from-acm.pem` file, using the command following:

   ```
   aws acm-pca get-certificate --certificate-authority-arn Private_CA_ARN --certificate-arn Certificate_ARN
   ```

   That command returns information similar to the following example:

   ```
   {"Certificate": "123", "CertificateChain": "456"}
   ```

   You then copy your equivalent of `"123"` to the `signed-certificate-from-acm.pem` file.

1. Get the private key by importing the `msk-rsa` key from `kafka.client.keystore.jks` to `keystore.p12`, as shown in the following example.

   ```
   keytool -importkeystore \
   -srckeystore kafka.client.keystore.jks \
   -destkeystore keystore.p12 \
   -deststoretype PKCS12 \
   -srcalias msk-rsa-client \
   -deststorepass test1234 \
   -destkeypass test1234
   ```

1. Use the following command to export `keystore.p12` into `.pem` format. 

   ```
   openssl pkcs12 -in keystore.p12 -out encrypted-private-client-key.pem -nocerts
   ```

   The **Enter PEM pass phrase** message appears and identifies the key that is applied to encrypt the certificate.

1. Remove bag attributes and key attributes from the `.pem` file to make sure that the first line starts with the following string.

   ```
   -----BEGIN ENCRYPTED PRIVATE KEY-----
   ```
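The cleanup in the last step can be scripted. The following sketch (a local convenience, not part of the AWS DMS tooling) extracts only the encrypted-key block from the `.pem` export:

```python
def strip_bag_attributes(pem_text: str) -> str:
    """Return only the ENCRYPTED PRIVATE KEY block from a PEM export,
    dropping the bag and key attributes that openssl prepends."""
    begin = "-----BEGIN ENCRYPTED PRIVATE KEY-----"
    end = "-----END ENCRYPTED PRIVATE KEY-----"
    start = pem_text.index(begin)
    stop = pem_text.index(end) + len(end)
    return pem_text[start:stop] + "\n"

# Hypothetical openssl output with attribute lines before the key block.
raw = """Bag Attributes
    friendlyName: msk-rsa-client
Key Attributes: <No Attributes>
-----BEGIN ENCRYPTED PRIVATE KEY-----
MIIB...base64...
-----END ENCRYPTED PRIVATE KEY-----
"""
cleaned = strip_bag_attributes(raw)
print(cleaned.splitlines()[0])
```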

**To upload a public certificate and private key to the DMS certificate manager and test the connection to Amazon MSK**

1. Upload to DMS certificate manager using the following command.

   ```
   aws dms import-certificate --certificate-identifier signed-cert --certificate-pem file://path to signed cert
   aws dms import-certificate --certificate-identifier private-key --certificate-pem file://path to private key
   ```

1. Create an Amazon MSK target endpoint and test connection to make sure that TLS authentication works.

   ```
   aws dms create-endpoint --endpoint-identifier $endpoint-identifier --engine-name kafka --endpoint-type target --kafka-settings 
   '{"Broker": "b-0.kafka260.aaaaa1.a99.kafka.us-east-1.amazonaws.com:0000", "SecurityProtocol":"ssl-authentication", 
   "SslClientCertificateArn": "arn:aws:dms:us-east-1:012346789012:cert:",
   "SslClientKeyArn": "arn:aws:dms:us-east-1:0123456789012:cert:","SslClientKeyPassword":"test1234"}'
   aws dms test-connection --replication-instance-arn $rep_inst_arn --endpoint-arn $kafka_tar_arn_msk
   ```

**Important**  
You can use SSL authentication to secure a connection to a self-managed Kafka cluster. In some cases, you might use a private Certification Authority (CA) in your on-premises Kafka cluster. If so, upload your CA chain, public certificate, and private key to the DMS certificate manager. Then, use the corresponding Amazon Resource Name (ARN) in your endpoint settings when you create your on-premises Kafka target endpoint.

**To prepare a private key and signed certificate for a self-managed Kafka cluster**

1. Generate a key pair as shown in the following example.

   ```
   keytool -genkey -keystore kafka.server.keystore.jks -validity 300 -storepass your-keystore-password 
   -keypass your-key-passphrase -dname "CN=your-cn-name" 
   -alias alias-of-key-pair -storetype pkcs12 -keyalg RSA
   ```

1. Generate a Certificate Sign Request (CSR). 

   ```
   keytool -keystore kafka.server.keystore.jks -certreq -file server-cert-sign-request-rsa -alias on-premise-rsa -storepass your-key-store-password 
   -keypass your-key-password
   ```

1. Use the CA in your cluster truststore to sign the CSR. If you don't have a CA, you can create your own private CA.

   ```
   openssl req -new -x509 -keyout ca-key -out ca-cert -days validate-days                            
   ```

1. Import `ca-cert` into the server truststore and keystore. If you don't have a truststore, use the following command to create the truststore and import `ca-cert` into it. 

   ```
   keytool -keystore kafka.server.truststore.jks -alias CARoot -import -file ca-cert
   keytool -keystore kafka.server.keystore.jks -alias CARoot -import -file ca-cert
   ```

1. Sign the certificate.

   ```
   openssl x509 -req -CA ca-cert -CAkey ca-key -in server-cert-sign-request-rsa -out signed-server-certificate.pem 
   -days validate-days -CAcreateserial -passin pass:ca-password
   ```

1. Import the signed certificate to the keystore.

   ```
   keytool -keystore kafka.server.keystore.jks -import -file signed-certificate.pem -alias on-premise-rsa -storepass your-keystore-password 
   -keypass your-key-password
   ```

1. Use the following command to import the `on-premise-rsa` key from `kafka.server.keystore.jks` to `keystore.p12`.

   ```
   keytool -importkeystore \
   -srckeystore kafka.server.keystore.jks \
   -destkeystore keystore.p12 \
   -deststoretype PKCS12 \
   -srcalias on-premise-rsa \
   -deststorepass your-truststore-password \
   -destkeypass your-key-password
   ```

1. Use the following command to export `keystore.p12` into `.pem` format.

   ```
   openssl pkcs12 -in keystore.p12 -out encrypted-private-server-key.pem -nocerts
   ```

1. Upload `encrypted-private-server-key.pem`, `signed-certificate.pem`, and `ca-cert` to the DMS certificate manager.

1. Create an endpoint by using the returned ARNs.

   ```
   aws dms create-endpoint --endpoint-identifier $endpoint-identifier --engine-name kafka --endpoint-type target --kafka-settings 
   '{"Broker": "b-0.kafka260.aaaaa1.a99.kafka.us-east-1.amazonaws.com:9092", "SecurityProtocol":"ssl-authentication", 
   "SslClientCertificateArn": "your-client-cert-arn","SslClientKeyArn": "your-client-key-arn","SslClientKeyPassword":"your-client-key-password", 
   "SslCaCertificateArn": "your-ca-certificate-arn"}'
                               
   aws dms test-connection --replication-instance-arn $rep_inst_arn --endpoint-arn $kafka_tar_arn_msk
   ```

### Using SASL-SSL authentication to connect to Amazon MSK
<a name="CHAP_Target.Kafka.TLS.SSL-SASL"></a>

The Simple Authentication and Security Layer (SASL) method uses a user name and password to validate a client identity, and makes an encrypted connection between server and client.

To use SASL, you first create a secure user name and password when you set up your Amazon MSK cluster. For a description of how to set up a secure user name and password for an Amazon MSK cluster, see [Setting up SASL/SCRAM authentication for an Amazon MSK cluster](https://docs.aws.amazon.com/msk/latest/developerguide/msk-password.html#msk-password-tutorial) in the *Amazon Managed Streaming for Apache Kafka Developer Guide*.

Then, when you create your Kafka target endpoint, set the security protocol endpoint setting (`SecurityProtocol`) using the `sasl-ssl` option. You also set `SaslUsername` and `SaslPassword` options. Make sure these are consistent with the secure user name and password that you created when you first set up your Amazon MSK cluster, as shown in the following JSON example.

```
                   
"KafkaSettings": {
    "SecurityProtocol": "sasl-ssl",
    "SaslUsername":"Amazon MSK cluster secure user name",
    "SaslPassword":"Amazon MSK cluster secure password"                    
}
```

**Note**  
Currently, AWS DMS supports only public CA backed SASL-SSL. DMS does not support SASL-SSL for use with self-managed Kafka that is backed by private CA.
For SASL-SSL authentication, AWS DMS supports the SCRAM-SHA-512 mechanism by default. AWS DMS versions 3.5.0 and higher also support the Plain mechanism. To use the Plain mechanism, set the `SaslMechanism` parameter of the `KafkaSettings` API data type to `PLAIN`. The `PLAIN` mechanism is supported by Apache Kafka, but not by Amazon MSK.

## Using a before image to view original values of CDC rows for Apache Kafka as a target
<a name="CHAP_Target.Kafka.BeforeImage"></a>

When writing CDC updates to a data-streaming target like Kafka, you can view a source database row's original values before they were changed by an update. To make this possible, AWS DMS populates a *before image* of update events based on data supplied by the source database engine. 

Different source database engines provide different amounts of information for a before image: 
+ Oracle provides updates to columns only if they change. 
+ PostgreSQL provides only data for columns that are part of the primary key (changed or not). If logical replication is in use and `REPLICA IDENTITY FULL` is set for the source table, you can get the entire before and after information for the row, written to the WALs and available there.
+ MySQL generally provides data for all columns (changed or not).

To enable before imaging to add original values from the source database to the AWS DMS output, use either the `BeforeImageSettings` task setting or the `add-before-image-columns` parameter. This parameter applies a column transformation rule. 

`BeforeImageSettings` adds a new JSON attribute to every update operation with values collected from the source database system, as shown following.

```
"BeforeImageSettings": {
    "EnableBeforeImage": boolean,
    "FieldName": string,  
    "ColumnFilter": pk-only (default) / non-lob / all (choose one)
}
```

**Note**  
Apply `BeforeImageSettings` to full load plus CDC tasks (which migrate existing data and replicate ongoing changes), or to CDC only tasks (which replicate data changes only). Don't apply `BeforeImageSettings` to tasks that are full load only.

For `BeforeImageSettings` options, the following applies:
+ Set the `EnableBeforeImage` option to `true` to enable before imaging. The default is `false`. 
+ Use the `FieldName` option to assign a name to the new JSON attribute. When `EnableBeforeImage` is `true`, `FieldName` is required and can't be empty.
+ The `ColumnFilter` option specifies a column to add by using before imaging. To add only columns that are part of the table's primary keys, use the default value, `pk-only`. To add only columns that are not of LOB type, use `non-lob`. To add any column that has a before image value, use `all`. 

  ```
  "BeforeImageSettings": {
      "EnableBeforeImage": true,
      "FieldName": "before-image",
      "ColumnFilter": "pk-only"
    }
  ```

### Using a before image transformation rule
<a name="CHAP_Target.Kafka.BeforeImage.Transform-Rule"></a>

As an alternative to task settings, you can use the `add-before-image-columns` parameter, which applies a column transformation rule. With this parameter, you can enable before imaging during CDC on data streaming targets like Kafka.

By using `add-before-image-columns` in a transformation rule, you can apply more fine-grained control of the before image results. Transformation rules enable you to use an object locator that gives you control over tables selected for the rule. Also, you can chain transformation rules together, which allows different rules to be applied to different tables. You can then manipulate the columns produced by using other rules. 

**Note**  
Don't use the `add-before-image-columns` parameter together with the `BeforeImageSettings` task setting within the same task. Instead, use either the parameter or the setting, but not both, for a single task.

A `transformation` rule type with the `add-before-image-columns` parameter for a column must provide a `before-image-def` section. The following shows an example.

```
    {
      "rule-type": "transformation",
      …
      "rule-target": "column",
      "rule-action": "add-before-image-columns",
      "before-image-def": {
        "column-filter": one-of (pk-only / non-lob / all),
        "column-prefix": string,
        "column-suffix": string
      }
    }
```

The value of `column-prefix` is prepended to a column name, and the default value of `column-prefix` is `BI_`. The value of `column-suffix` is appended to the column name, and the default is empty. Don't set both `column-prefix` and `column-suffix` to empty strings.

Choose one value for `column-filter`. To add only columns that are part of table primary keys, choose `pk-only`. Choose `non-lob` to add only columns that are not of LOB type. Or choose `all` to add any column that has a before-image value.

### Example for a before image transformation rule
<a name="CHAP_Target.Kafka.BeforeImage.Example"></a>

The transformation rule in the following example adds a new column called `BI_emp_no` in the target. So a statement like `UPDATE employees SET emp_no = 3 WHERE emp_no = 1;` populates the `BI_emp_no` field with 1. When you write CDC updates to Amazon S3 targets, the `BI_emp_no` column makes it possible to tell which original row was updated.

```
{
  "rules": [
    {
      "rule-type": "selection",
      "rule-id": "1",
      "rule-name": "1",
      "object-locator": {
        "schema-name": "%",
        "table-name": "%"
      },
      "rule-action": "include"
    },
    {
      "rule-type": "transformation",
      "rule-id": "2",
      "rule-name": "2",
      "rule-target": "column",
      "object-locator": {
        "schema-name": "%",
        "table-name": "employees"
      },
      "rule-action": "add-before-image-columns",
      "before-image-def": {
        "column-prefix": "BI_",
        "column-suffix": "",
        "column-filter": "pk-only"
      }
    }
  ]
}
```

For information on using the `add-before-image-columns` rule action, see [Transformation rules and actions](CHAP_Tasks.CustomizingTasks.TableMapping.SelectionTransformation.Transformations.md).

## Limitations when using Apache Kafka as a target for AWS Database Migration Service
<a name="CHAP_Target.Kafka.Limitations"></a>

The following limitations apply when using Apache Kafka as a target:
+ AWS DMS Kafka target endpoints don't support IAM access control for Amazon Managed Streaming for Apache Kafka (Amazon MSK).
+ Full LOB mode is not supported.
+ Specify a Kafka configuration file for your cluster with properties that allow AWS DMS to automatically create new topics. Include the setting `auto.create.topics.enable = true`. If you are using Amazon MSK, you can specify the default configuration when you create your Kafka cluster, then change the `auto.create.topics.enable` setting to `true`. For more information about the default configuration settings, see [The default Amazon MSK configuration](https://docs.aws.amazon.com/msk/latest/developerguide/msk-default-configuration.html) in the *Amazon Managed Streaming for Apache Kafka Developer Guide*. If you need to modify an existing Kafka cluster created using Amazon MSK, run the AWS CLI command `aws kafka create-configuration` to update your Kafka configuration, as in the following example:

  ```
  14:38:41 $ aws kafka create-configuration --name "kafka-configuration" --kafka-versions "2.2.1" --server-properties file://~/kafka_configuration
  {
      "LatestRevision": {
          "Revision": 1,
          "CreationTime": "2019-09-06T14:39:37.708Z"
      },
      "CreationTime": "2019-09-06T14:39:37.708Z",
      "Name": "kafka-configuration",
      "Arn": "arn:aws:kafka:us-east-1:111122223333:configuration/kafka-configuration/7e008070-6a08-445f-9fe5-36ccf630ecfd-3"
  }
  ```

  Here, `//~/kafka_configuration` is the configuration file you have created with the required property settings.

  If you are using your own Kafka instance installed on Amazon EC2, modify the Kafka cluster configuration with the `auto.create.topics.enable = true` setting to allow AWS DMS to automatically create new topics, using the options provided with your instance.
+ AWS DMS publishes each update to a single record in the source database as one data record (message) in a given Kafka topic regardless of transactions.
+ AWS DMS supports the following four forms for partition keys:
  + `SchemaName.TableName`: A combination of the schema and table name.
  + `${AttributeName}`: The value of one of the fields in the JSON, or the primary key of the table in the source database.
  + `transaction-id`: The CDC transaction ID. All records within the same transaction go to the same partition.
  + `constant`: A fixed literal value for every record regardless of table or data. All records are sent to the same partition key value "constant", providing strict global ordering across all tables.

  ```
  {
      "rule-type": "object-mapping",
      "rule-id": "2",
      "rule-name": "TransactionIdPartitionKey",
      "rule-action": "map-record-to-document",
      "object-locator": {
          "schema-name": "onprem",
          "table-name": "it_system"
      },
      "mapping-parameters": {
          "partition-key-type": "transaction-id | constant | attribute-name | schema-table"
      }
  }
  ```
+ The `IncludeTransactionDetails` endpoint setting is only supported when the source endpoint is Oracle, SQL Server, PostgreSQL, or MySQL. For other source endpoint types, transaction details will not be included.
+ `BatchApply` is not supported for a Kafka endpoint. Using Batch Apply (for example, the `BatchApplyEnabled` target metadata task setting) for a Kafka target might result in loss of data.
+ AWS DMS does not support migrating values of `BigInt` data type with more than 16 digits. To work around this limitation, you can use the following transformation rule to convert the `BigInt` column to a string. For more information about transformation rules, see [Transformation rules and actions](CHAP_Tasks.CustomizingTasks.TableMapping.SelectionTransformation.Transformations.md).

  ```
  {
      "rule-type": "transformation",
      "rule-id": "id",
      "rule-name": "name",
      "rule-target": "column",
      "object-locator": {
          "schema-name": "valid object-mapping rule action",
          "table-name": "",
          "column-name": ""
      },
      "rule-action": "change-data-type",
      "data-type": {
          "type": "string",
          "length": 20
      }
  }
  ```
+ AWS DMS Kafka target endpoints don't support Amazon MSK Serverless.
+ When defining mapping rules, you can't use both an object-mapping rule and a transformation rule. Set only one of the two.
+ AWS DMS supports SASL Authentication for Apache Kafka versions up to 3.8. If you are using Kafka 4.0 or higher, you can only connect without SASL authentication.
+ AWS DMS does not support source data containing embedded `'\0'` characters when using Kafka as a target endpoint. Data containing embedded `'\0'` characters will be truncated at the first `'\0'` character.

## Using object mapping to migrate data to a Kafka topic
<a name="CHAP_Target.Kafka.ObjectMapping"></a>

AWS DMS uses table-mapping rules to map data from the source to the target Kafka topic. To map data to a target topic, you use a type of table-mapping rule called object mapping. You use object mapping to define how data records in the source map to the data records published to a Kafka topic. 

Kafka topics don't have a preset structure other than having a partition key.

**Note**  
You don't have to use object mapping. You can use regular table mapping for various transformations. However, the partition key type then follows these default behaviors:   
The primary key is used as the partition key for full load.
If no parallel-apply task settings are used, `schema.table` is used as the partition key for CDC.
If parallel-apply task settings are used, the primary key is used as the partition key for CDC.
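The default behaviors in the note above can be summarized as a small decision function. This is an illustrative restatement of the documented defaults, not DMS code:

```python
def default_partition_key(full_load: bool, parallel_apply: bool) -> str:
    """Default Kafka partition key DMS uses with regular table mapping
    (no object-mapping rule), per the documented behaviors."""
    if full_load:
        return "primary-key"
    # CDC phase: schema.table unless parallel-apply settings are in use.
    return "primary-key" if parallel_apply else "schema.table"

print(default_partition_key(full_load=False, parallel_apply=False))
```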

To create an object-mapping rule, specify `rule-type` as `object-mapping`. This rule specifies what type of object mapping you want to use. 

The structure for the rule is as follows.

```
{
    "rules": [
        {
            "rule-type": "object-mapping",
            "rule-id": "id",
            "rule-name": "name",
            "rule-action": "valid object-mapping rule action",
            "object-locator": {
                "schema-name": "case-sensitive schema name",
                "table-name": ""
            }
        }
    ]
}
```

AWS DMS currently supports `map-record-to-record` and `map-record-to-document` as the only valid values for the `rule-action` parameter. These settings affect values that aren't excluded as part of the `exclude-columns` attribute list. The `map-record-to-record` and `map-record-to-document` values specify how AWS DMS handles these records by default. These values don't affect the attribute mappings in any way. 

Use `map-record-to-record` when migrating from a relational database to a Kafka topic. This rule type uses the `taskResourceId.schemaName.tableName` value from the relational database as the partition key in the Kafka topic and creates an attribute for each column in the source database. 

When using `map-record-to-record`, note the following:
+ This setting affects only columns that aren't excluded by the `exclude-columns` list.
+ For every such column, AWS DMS creates a corresponding attribute in the target topic.
+ AWS DMS creates this corresponding attribute regardless of whether the source column is used in an attribute mapping. 

One way to understand `map-record-to-record` is to see it in action. For this example, assume that you are starting with a relational database table row with the following structure and data.


| FirstName | LastName | StoreId | HomeAddress | HomePhone | WorkAddress | WorkPhone | DateofBirth | 
| --- | --- | --- | --- | --- | --- | --- | --- | 
| Randy | Marsh | 5 | 221B Baker Street | 1234567890 | 31 Spooner Street, Quahog  | 9876543210 | 02/29/1988 | 

To migrate this information from a schema named `Test` to a Kafka topic, you create rules to map the data to the target topic. The following rule illustrates the mapping. 

```
{
    "rules": [
        {
            "rule-type": "selection",
            "rule-id": "1",
            "rule-name": "1",
            "rule-action": "include",
            "object-locator": {
                "schema-name": "Test",
                "table-name": "%"
            }
        },
        {
            "rule-type": "object-mapping",
            "rule-id": "2",
            "rule-name": "DefaultMapToKafka",
            "rule-action": "map-record-to-record",
            "object-locator": {
                "schema-name": "Test",
                "table-name": "Customers"
            }
        }
    ]
}
```

Given a Kafka topic and a partition key (in this case, `taskResourceId.schemaName.tableName`), the following illustrates the resulting record format using our sample data in the Kafka target topic: 

```
  {
     "FirstName": "Randy",
     "LastName": "Marsh",
     "StoreId":  "5",
     "HomeAddress": "221B Baker Street",
     "HomePhone": "1234567890",
     "WorkAddress": "31 Spooner Street, Quahog",
     "WorkPhone": "9876543210",
     "DateOfBirth": "02/29/1988"
  }
```

**Topics**
+ [

### Restructuring data with attribute mapping
](#CHAP_Target.Kafka.AttributeMapping)
+ [

### Multitopic replication using object mapping
](#CHAP_Target.Kafka.MultiTopic)
+ [

### Message format for Apache Kafka
](#CHAP_Target.Kafka.Messageformat)

### Restructuring data with attribute mapping
<a name="CHAP_Target.Kafka.AttributeMapping"></a>

You can restructure the data while you are migrating it to a Kafka topic using an attribute map. For example, you might want to combine several fields in the source into a single field in the target. The following attribute map illustrates how to restructure the data.

```
{
    "rules": [
        {
            "rule-type": "selection",
            "rule-id": "1",
            "rule-name": "1",
            "rule-action": "include",
            "object-locator": {
                "schema-name": "Test",
                "table-name": "%"
            }
        },
        {
            "rule-type": "object-mapping",
            "rule-id": "2",
            "rule-name": "TransformToKafka",
            "rule-action": "map-record-to-record",
            "target-table-name": "CustomerData",
            "object-locator": {
                "schema-name": "Test",
                "table-name": "Customers"
            },
            "mapping-parameters": {
                "partition-key-type": "attribute-name",
                "partition-key-name": "CustomerName",
                "exclude-columns": [
                    "firstname",
                    "lastname",
                    "homeaddress",
                    "homephone",
                    "workaddress",
                    "workphone"
                ],
                "attribute-mappings": [
                    {
                        "target-attribute-name": "CustomerName",
                        "attribute-type": "scalar",
                        "attribute-sub-type": "string",
                        "value": "${lastname}, ${firstname}"
                    },
                    {
                        "target-attribute-name": "ContactDetails",
                        "attribute-type": "document",
                        "attribute-sub-type": "json",
                        "value": {
                            "Home": {
                                "Address": "${homeaddress}",
                                "Phone": "${homephone}"
                            },
                            "Work": {
                                "Address": "${workaddress}",
                                "Phone": "${workphone}"
                            }
                        }
                    }
                ]
            }
        }
    ]
}
```

To set a constant value for `partition-key`, specify `"partition-key-type": "constant"`. You might do this to force all the data to be stored in a single partition. The following mapping illustrates this approach. 

```
{
    "rules": [
        {
            "rule-type": "selection",
            "rule-id": "1",
            "rule-name": "1",
            "object-locator": {
                "schema-name": "Test",
                "table-name": "%"
            },
            "rule-action": "include"
        },
        {
            "rule-type": "object-mapping",
            "rule-id": "1",
            "rule-name": "TransformToKafka",
            "rule-action": "map-record-to-document",
            "object-locator": {
                "schema-name": "Test",
                "table-name": "Customer"
            },
            "mapping-parameters": {
                "partition-key-type": "constant",
                "exclude-columns": [
                    "FirstName",
                    "LastName",
                    "HomeAddress",
                    "HomePhone",
                    "WorkAddress",
                    "WorkPhone"
                ],
                "attribute-mappings": [
                    {
                        "attribute-name": "CustomerName",
                        "value": "${FirstName},${LastName}"
                    },
                    {
                        "attribute-name": "ContactDetails",
                        "value": {
                            "Home": {
                                "Address": "${HomeAddress}",
                                "Phone": "${HomePhone}"
                            },
                            "Work": {
                                "Address": "${WorkAddress}",
                                "Phone": "${WorkPhone}"
                            }
                        }
                    },
                    {
                        "attribute-name": "DateOfBirth",
                        "value": "${DateOfBirth}"
                    }
                ]
            }
        }
    ]
}
```

**Note**  
The `partition-key` value for a control record that is for a specific table is `TaskId.SchemaName.TableName`. The `partition-key` value for a control record that is for a specific task is that record's `TaskId`. Specifying a `partition-key` value in the object mapping has no impact on the `partition-key` for a control record.  
 When `partition-key-type` is set to `attribute-name` in a table mapping rule, you must specify `partition-key-name`, which must reference either a column from the source table or a custom column defined in the mapping. Additionally, `attribute-mappings` must be provided to define how source columns map to the target Kafka topic.

### Multitopic replication using object mapping
<a name="CHAP_Target.Kafka.MultiTopic"></a>

By default, AWS DMS tasks migrate all source data to one of the following Kafka topics:
+ As specified in the **Topic** field of the AWS DMS target endpoint.
+ As specified by `kafka-default-topic` if the **Topic** field of the target endpoint isn't populated and the Kafka `auto.create.topics.enable` setting is set to `true`.

With AWS DMS engine versions 3.4.6 and higher, you can use the `kafka-target-topic` attribute to map each migrated source table to a separate topic. For example, the object mapping rules following migrate the source tables `Customer` and `Address` to the Kafka topics `customer_topic` and `address_topic`, respectively. At the same time, AWS DMS migrates all other source tables, including the `Bills` table in the `Test` schema, to the topic specified in the target endpoint.

```
{
    "rules": [
        {
            "rule-type": "selection",
            "rule-id": "1",
            "rule-name": "1",
            "rule-action": "include",
            "object-locator": {
                "schema-name": "Test",
                "table-name": "%"
            }
        },
        {
            "rule-type": "object-mapping",
            "rule-id": "2",
            "rule-name": "MapToKafka1",
            "rule-action": "map-record-to-record",
            "kafka-target-topic": "customer_topic",
            "object-locator": {
                "schema-name": "Test",
                "table-name": "Customer" 
            },
            "partition-key-type": "constant"
        },
        {
            "rule-type": "object-mapping",
            "rule-id": "3",
            "rule-name": "MapToKafka2",
            "rule-action": "map-record-to-record",
            "kafka-target-topic": "address_topic",
            "object-locator": {
                "schema-name": "Test",
                "table-name": "Address"
            },
            "partition-key-type": "constant"
        },
        {
            "rule-type": "object-mapping",
            "rule-id": "4",
            "rule-name": "DefaultMapToKafka",
            "rule-action": "map-record-to-record",
            "object-locator": {
                "schema-name": "Test",
                "table-name": "Bills"
            }
        }
    ]
}
```

By using Kafka multitopic replication, you can group and migrate source tables to separate Kafka topics using a single replication task.

### Message format for Apache Kafka
<a name="CHAP_Target.Kafka.Messageformat"></a>

AWS DMS writes each message to a Kafka topic as a JSON object, which is simply a list of key-value pairs. The metadata portion of a message includes the following fields. 

**RecordType**  
The record type can be either data or control. *Data records* represent the actual rows in the source. *Control records* are for important events in the stream, for example, a restart of the task.

**Operation**  
For data records, the operation can be `load`, `insert`, `update`, or `delete`.  
For control records, the operation can be `create-table`, `rename-table`, `drop-table`, `change-columns`, `add-column`, `drop-column`, `rename-column`, or `column-type-change`.

**SchemaName**  
The source schema for the record. This field can be empty for a control record.

**TableName**  
The source table for the record. This field can be empty for a control record.

**Timestamp**  
The timestamp for when the JSON message was constructed. The field uses ISO 8601 format.

The following JSON message example illustrates a data type message with all additional metadata.

```
{ 
   "data":{ 
      "id":100000161,
      "fname":"val61s",
      "lname":"val61s",
      "REGION":"val61s"
   },
   "metadata":{ 
      "timestamp":"2019-10-31T22:53:59.721201Z",
      "record-type":"data",
      "operation":"insert",
      "partition-key-type":"primary-key",
      "partition-key-value":"sbtest.sbtest_x.100000161",
      "schema-name":"sbtest",
      "table-name":"sbtest_x",
      "transaction-id":9324410911751,
      "transaction-record-id":1,
      "prev-transaction-id":9324410910341,
      "prev-transaction-record-id":10,
      "commit-timestamp":"2019-10-31T22:53:55.000000Z",
      "stream-position":"mysql-bin-changelog.002171:36912271:0:36912333:9324410911751:mysql-bin-changelog.002171:36912209"
   }
}
```

The following JSON message example illustrates a control type message.

```
{ 
   "control":{ 
      "table-def":{ 
         "columns":{ 
            "id":{ 
               "type":"WSTRING",
               "length":512,
               "nullable":false
            },
            "fname":{ 
               "type":"WSTRING",
               "length":255,
               "nullable":true
            },
            "lname":{ 
               "type":"WSTRING",
               "length":255,
               "nullable":true
            },
            "REGION":{ 
               "type":"WSTRING",
               "length":1000,
               "nullable":true
            }
         },
         "primary-key":[ 
            "id"
         ],
         "collation-name":"latin1_swedish_ci"
      }
   },
   "metadata":{ 
      "timestamp":"2019-11-21T19:14:22.223792Z",
      "record-type":"control",
      "operation":"create-table",
      "partition-key-type":"task-id",
      "schema-name":"sbtest",
      "table-name":"sbtest_t1"
   }
}
```

# Using an Amazon OpenSearch Service cluster as a target for AWS Database Migration Service
<a name="CHAP_Target.Elasticsearch"></a>

You can use AWS DMS to migrate data to Amazon OpenSearch Service (OpenSearch Service). OpenSearch Service is a managed service that makes it easy to deploy, operate, and scale an OpenSearch Service cluster. 

In OpenSearch Service, you work with indexes and documents. An *index* is a collection of documents, and a *document* is a JSON object containing scalar values, arrays, and other objects. OpenSearch provides a JSON-based query language, so that you can query data in an index and retrieve the corresponding documents.

When AWS DMS creates indexes for a target endpoint for OpenSearch Service, it creates one index for each table from the source endpoint. The cost for creating an OpenSearch Service index depends on several factors. These are the number of indexes created, the total amount of data in these indexes, and the small amount of metadata that OpenSearch stores for each document.

Configure your OpenSearch Service cluster with compute and storage resources that are appropriate for the scope of your migration. We recommend that you consider the following factors, depending on the replication task you want to use:
+ For a full data load, consider the total amount of data that you want to migrate, and also the speed of the transfer.
+ For replicating ongoing changes, consider the frequency of updates, and your end-to-end latency requirements.

Also, configure the index settings on your OpenSearch cluster, paying close attention to the document count.

**Multithreaded full load task settings**

To help increase the speed of the transfer, AWS DMS supports a multithreaded full load to an OpenSearch Service target cluster. AWS DMS supports this multithreading with task settings that include the following:
+ `MaxFullLoadSubTasks` – Use this option to indicate the maximum number of source tables to load in parallel. DMS loads each table into its corresponding OpenSearch Service target index using a dedicated subtask. The default is 8; the maximum value is 49.
+ `ParallelLoadThreads` – Use this option to specify the number of threads that AWS DMS uses to load each table into its OpenSearch Service target index. The maximum value for an OpenSearch Service target is 32. You can ask to have this maximum limit increased.
**Note**  
If you don't change `ParallelLoadThreads` from its default (0), AWS DMS transfers a single record at a time. This approach puts undue load on your OpenSearch Service cluster. Make sure that you set this option to 1 or more.
+ `ParallelLoadBufferSize` – Use this option to specify the maximum number of records to store in the buffer that the parallel load threads use to load data to the OpenSearch Service target. The default value is 50. The maximum value is 1,000. Use this setting with `ParallelLoadThreads`. `ParallelLoadBufferSize` is valid only when there is more than one thread.
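Taken together, these options appear in different sections of the task settings JSON. The following fragment is a sketch with illustrative values, not tuning recommendations: `MaxFullLoadSubTasks` belongs under `FullLoadSettings`, while the `ParallelLoad*` options belong under `TargetMetadata`.

```
{
    "FullLoadSettings": {
        "MaxFullLoadSubTasks": 8
    },
    "TargetMetadata": {
        "ParallelLoadThreads": 16,
        "ParallelLoadBufferSize": 500
    }
}
```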

For more information on how DMS loads an OpenSearch Service cluster using multithreading, see the AWS blog post [Scale Amazon OpenSearch Service for AWS Database Migration Service migrations](https://aws.amazon.com/blogs/database/scale-amazon-elasticsearch-service-for-aws-database-migration-service-migrations/). 

**Multithreaded CDC load task settings**

You can improve the performance of change data capture (CDC) for an OpenSearch Service target cluster by using task settings to modify how AWS DMS pushes data records to the target. To do this, you can specify the number of concurrent threads, queues per thread, and the number of records to store in a buffer using `ParallelApply*` task settings. For example, suppose you want to perform a CDC load and apply 32 threads in parallel. You also want to access 64 queues per thread, with 50 records stored per buffer. 
**Note**  
Support for the use of `ParallelApply*` task settings during CDC to Amazon OpenSearch Service target endpoints is available in AWS DMS versions 3.4.0 and higher.

To promote CDC performance, AWS DMS supports these task settings:
+ `ParallelApplyThreads` – Specifies the number of concurrent threads that AWS DMS uses during a CDC load to push data records to an OpenSearch Service target endpoint. The default value is zero (0) and the maximum value is 32.
+ `ParallelApplyBufferSize` – Specifies the maximum number of records to store in each buffer queue for concurrent threads to push to an OpenSearch Service target endpoint during a CDC load. The default value is 100 and the maximum value is 1,000. Use this option when `ParallelApplyThreads` specifies more than one thread. 
+ `ParallelApplyQueuesPerThread` – Specifies the number of queues that each thread accesses to take data records out of queues and generate a batch load for an OpenSearch Service endpoint during CDC.

When using `ParallelApply*` task settings, the `partition-key-type` default is the `primary-key` of the table, not `schema-name.table-name`.
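As a sketch, the scenario described earlier (32 threads, 64 queues per thread, and 50 records per buffer) corresponds to a task settings fragment like the following. The values are illustrative only.

```
{
    "TargetMetadata": {
        "ParallelApplyThreads": 32,
        "ParallelApplyQueuesPerThread": 64,
        "ParallelApplyBufferSize": 50
    }
}
```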

## Migrating from a relational database table to an OpenSearch Service index
<a name="CHAP_Target.Elasticsearch.RDBMS2Elasticsearch"></a>

AWS DMS supports migrating data to OpenSearch Service's scalar data types. When migrating from a relational database like Oracle or MySQL to OpenSearch Service, you might want to restructure how you store this data.

AWS DMS supports the following OpenSearch Service scalar data types: 
+ Boolean 
+ Date
+ Float
+ Int
+ String

AWS DMS converts data of type Date into type String. You can specify custom mapping to interpret these dates.

AWS DMS does not support migration of LOB data types.

## Prerequisites for using Amazon OpenSearch Service as a target for AWS Database Migration Service
<a name="CHAP_Target.Elasticsearch.Prerequisites"></a>

Before you begin work with an OpenSearch Service database as a target for AWS DMS, make sure that you create an AWS Identity and Access Management (IAM) role. This role should let AWS DMS access the OpenSearch Service indexes at the target endpoint. First, the role needs a trust relationship that allows AWS DMS to assume it, as shown in the following policy.


```
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "1",
            "Effect": "Allow",
            "Principal": {
                "Service": "dms.amazonaws.com"
            },
            "Action": "sts:AssumeRole"
        }
    ]
}
```


The role that you use for the migration to OpenSearch Service must have the following permissions.


```
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "es:ESHttpDelete",
                "es:ESHttpGet",
                "es:ESHttpHead",
                "es:ESHttpPost",
                "es:ESHttpPut"
            ],
            "Resource": "arn:aws:es:region:account-id:domain/domain-name/*"
        }
    ]
}
```


In the preceding policy, replace *`region`* with the AWS Region identifier, *`account-id`* with your AWS account ID, and *`domain-name`* with the name of your Amazon OpenSearch Service domain. An example is `arn:aws:es:us-west-2:123456789012:domain/my-es-domain`.

## Endpoint settings when using OpenSearch Service as a target for AWS DMS
<a name="CHAP_Target.Elasticsearch.Configuration"></a>

You can use endpoint settings to configure your OpenSearch Service target database similar to using extra connection attributes. You specify the settings when you create the target endpoint using the AWS DMS console, or by using the `create-endpoint` command in the [AWS CLI](https://docs.aws.amazon.com/cli/latest/reference/dms/index.html), with the `--elasticsearch-settings '{"EndpointSetting": "value", ...}'` JSON syntax.
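For example, the following AWS CLI call sketches how you might create an OpenSearch Service target endpoint. The identifier, role ARN, and endpoint URI are placeholders, and depending on your AWS DMS version the engine name may be `opensearch` or `elasticsearch`.

```
aws dms create-endpoint \
    --endpoint-identifier my-opensearch-target \
    --endpoint-type target \
    --engine-name opensearch \
    --elasticsearch-settings '{
        "ServiceAccessRoleArn": "arn:aws:iam::123456789012:role/my-dms-role",
        "EndpointUri": "https://my-domain.us-west-2.es.amazonaws.com",
        "FullLoadErrorPercentage": 10,
        "ErrorRetryDuration": 300
    }'
```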

The following table shows the endpoint settings that you can use with OpenSearch Service as a target.


| Attribute name | Valid values | Default value and description | 
| --- | --- | --- | 
|  `FullLoadErrorPercentage`   |  A positive integer greater than 0 but no larger than 100.  |  10 – For a full load task, this attribute determines the threshold of errors allowed before the task fails. For example, suppose that there are 1,500 rows at the source endpoint and this parameter is set to 10. Then the task fails if AWS DMS encounters more than 150 errors (10 percent of the row count) when writing to the target endpoint.  | 
|   `ErrorRetryDuration`   |  A positive integer greater than 0.  |  300 – If an error occurs at the target endpoint, AWS DMS retries for this many seconds. Otherwise, the task fails.  | 
|  `UseNewMappingType`  | true or false |  `false` – Set this attribute to `true` to work with OpenSearch version 2.x.  | 

## Limitations when using Amazon OpenSearch Service as a target for AWS Database Migration Service
<a name="CHAP_Target.Elasticsearch.Limitations"></a>

The following limitations apply when using Amazon OpenSearch Service as a target:
+ OpenSearch Service uses dynamic mapping (auto guess) to determine the data types to use for migrated data.
+ OpenSearch Service stores each document with a unique ID. The following is an example ID. 

  ```
  "_id": "D359F8B537F1888BC71FE20B3D79EAE6674BE7ACA9B645B0279C7015F6FF19FD"
  ```

  Each document ID is 64 bytes long, so anticipate this as a storage requirement. For example, if you migrate 100,000 rows from an AWS DMS source, the resulting OpenSearch Service index requires storage for an additional 6,400,000 bytes.
+ With OpenSearch Service, you can't make updates to the primary key attributes. This restriction is important when using ongoing replication with change data capture (CDC) because it can result in unwanted data in the target. In CDC mode, primary keys are mapped to SHA256 values, which are 32 bytes long. These are converted to human-readable 64-byte strings, and are used as OpenSearch Service document IDs.
+ If AWS DMS encounters any items that can't be migrated, it writes error messages to Amazon CloudWatch Logs. This behavior differs from that of other AWS DMS target endpoints, which write errors to an exceptions table.
+ AWS DMS does not support connecting to an OpenSearch Service cluster that has fine-grained access control enabled with a master user and password.
+ AWS DMS does not support Amazon OpenSearch Serverless.
+ OpenSearch Service does not support writing data to pre-existing indexes.
+ The replication task setting `TargetTablePrepMode: TRUNCATE_BEFORE_LOAD` is not supported for use with an OpenSearch Service target endpoint.
+ When migrating data to OpenSearch Service using AWS DMS, the source data must have a primary key or a unique identifier column. If the source data does not have one, define one using the `define-primary-key` transformation rule.
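The 64-character document IDs noted above are the hexadecimal rendering of a 32-byte SHA-256 hash. The following Python sketch illustrates only the shape of such an ID; the exact input that AWS DMS hashes is an assumption here, not documented behavior.

```python
import hashlib

def opensearch_doc_id(schema: str, table: str, pk_value: str) -> str:
    # Hypothetical key layout, shown only to illustrate why every
    # document ID is a 64-character hexadecimal string.
    key = f"{schema}.{table}.{pk_value}"
    return hashlib.sha256(key.encode("utf-8")).hexdigest().upper()

print(opensearch_doc_id("sbtest", "sbtest_x", "100000161"))
```

Every ID produced this way is 64 characters long, which is why each migrated row adds roughly 64 bytes of ID storage to the index.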

## Target data types for Amazon OpenSearch Service
<a name="CHAP_Target.Elasticsearch.DataTypes"></a>

When AWS DMS migrates data from heterogeneous databases, the service maps data types from the source database to intermediate data types called AWS DMS data types. The service then maps the intermediate data types to the target data types. The following table shows each AWS DMS data type and the data type it maps to in OpenSearch Service.


| AWS DMS data type | OpenSearch Service data type | 
| --- | --- | 
|  Boolean  |  boolean  | 
|  Date  |  string  | 
|  Time  |  date  | 
|  Timestamp  |  date  | 
|  INT4  |  integer  | 
|  Real4  |  float  | 
|  UINT4  |  integer  | 

For additional information about AWS DMS data types, see [Data types for AWS Database Migration Service](CHAP_Reference.DataTypes.md).

# Using Amazon DocumentDB as a target for AWS Database Migration Service
<a name="CHAP_Target.DocumentDB"></a>

For information about which versions of Amazon DocumentDB (with MongoDB compatibility) AWS DMS supports, see [Targets for AWS DMS](CHAP_Introduction.Targets.md). You can use AWS DMS to migrate data to Amazon DocumentDB (with MongoDB compatibility) from any of the source data engines that AWS DMS supports. The source engine can be on an AWS managed service such as Amazon RDS, Aurora, or Amazon S3. Or the engine can be a self-managed database, such as MongoDB running on Amazon EC2 or on-premises.

You can use AWS DMS to replicate source data to Amazon DocumentDB databases, collections, or documents. 

**Note**  
If your source endpoint is MongoDB or Amazon DocumentDB, run the migration in **Document mode**.

MongoDB stores data in a binary JSON format (BSON). AWS DMS supports all of the BSON data types that are supported by Amazon DocumentDB. For a list of these data types, see [Supported MongoDB APIs, operations, and data types](https://docs.aws.amazon.com/documentdb/latest/developerguide/mongo-apis.html) in the *Amazon DocumentDB Developer Guide*.

If the source endpoint is a relational database, AWS DMS maps database objects to Amazon DocumentDB as follows:
+ A relational database, or database schema, maps to an Amazon DocumentDB *database*. 
+ Tables within a relational database map to *collections* in Amazon DocumentDB.
+ Records in a relational table map to *documents* in Amazon DocumentDB. Each document is constructed from data in the source record.

If the source endpoint is Amazon S3, then the resulting Amazon DocumentDB objects correspond to AWS DMS mapping rules for Amazon S3. For example, consider the following URI.

```
s3://amzn-s3-demo-bucket/hr/employee
```

In this case, AWS DMS maps the objects in `amzn-s3-demo-bucket` to Amazon DocumentDB as follows:
+ The top-level URI part (`hr`) maps to an Amazon DocumentDB database. 
+ The next URI part (`employee`) maps to an Amazon DocumentDB collection.
+ Each object in `employee` maps to a document in Amazon DocumentDB.

For more information on mapping rules for Amazon S3, see [Using Amazon S3 as a source for AWS DMS](CHAP_Source.S3.md).

**Amazon DocumentDB endpoint settings**

In AWS DMS versions 3.5.0 and higher, you can improve the performance of change data capture (CDC) for Amazon DocumentDB endpoints by tuning task settings for parallel threads and bulk operations. To do this, you can specify the number of concurrent threads, queues per thread, and the number of records to store in a buffer using `ParallelApply*` task settings. For example, suppose you want to perform a CDC load and apply 32 threads in parallel. You also want to access 64 queues per thread, with 50 records stored per buffer. 

To promote CDC performance, AWS DMS supports these task settings:
+ `ParallelApplyThreads` – Specifies the number of concurrent threads that AWS DMS uses during a CDC load to push data records to an Amazon DocumentDB target endpoint. The default value is zero (0) and the maximum value is 32.
+ `ParallelApplyBufferSize` – Specifies the maximum number of records to store in each buffer queue for concurrent threads to push to an Amazon DocumentDB target endpoint during a CDC load. The default value is 100 and the maximum value is 1,000. Use this option when `ParallelApplyThreads` specifies more than one thread. 
+ `ParallelApplyQueuesPerThread` – Specifies the number of queues that each thread accesses to take data records out of queues and generate a batch load for an Amazon DocumentDB endpoint during CDC. The default is 1. The maximum is 512.

**Note**  
 For Amazon DocumentDB targets, parallel CDC apply can cause duplicate key errors or stalled CDC apply for workloads that use secondary unique indexes or require strict ordering of changes. Use the default single-threaded CDC apply configuration for these workloads. 

For additional details on working with Amazon DocumentDB as a target for AWS DMS, see the following sections:

**Topics**
+ [

## Mapping data from a source to an Amazon DocumentDB target
](#CHAP_Target.DocumentDB.data-mapping)
+ [

## Connecting to Amazon DocumentDB Elastic Clusters as a target
](#CHAP_Target.DocumentDB.data-mapping.elastic-cluster-connect)
+ [

## Ongoing replication with Amazon DocumentDB as a target
](#CHAP_Target.DocumentDB.data-mapping.ongoing-replication)
+ [

## Limitations to using Amazon DocumentDB as a target
](#CHAP_Target.DocumentDB.limitations)
+ [

## Using endpoint settings with Amazon DocumentDB as a target
](#CHAP_Target.DocumentDB.ECAs)
+ [

## Target data types for Amazon DocumentDB
](#CHAP_Target.DocumentDB.datatypes)

**Note**  
For a step-by-step walkthrough of the migration process, see [Migrating from MongoDB to Amazon DocumentDB](https://docs.aws.amazon.com/dms/latest/sbs/CHAP_MongoDB2DocumentDB.html) in the *AWS Database Migration Service Step-by-Step Migration Guide*.

## Mapping data from a source to an Amazon DocumentDB target
<a name="CHAP_Target.DocumentDB.data-mapping"></a>

AWS DMS reads records from the source endpoint, and constructs JSON documents based on the data it reads. For each JSON document, AWS DMS must determine an `_id` field to act as a unique identifier. It then writes the JSON document to an Amazon DocumentDB collection, using the `_id` field as a primary key.

### Source data that is a single column
<a name="CHAP_Target.DocumentDB.data-mapping.single-column"></a>

If the source data consists of a single column, the data must be of a string type. (Depending on the source engine, the actual data type might be VARCHAR, NVARCHAR, TEXT, LOB, CLOB, or similar.) AWS DMS assumes that the data is a valid JSON document, and replicates the data to Amazon DocumentDB as is.

If the resulting JSON document contains a field named `_id`, then that field is used as the unique `_id` in Amazon DocumentDB.

If the JSON doesn't contain an `_id` field, then Amazon DocumentDB generates an `_id` value automatically.

### Source data that is multiple columns
<a name="CHAP_Target.DocumentDB.data-mapping.multiple-columns"></a>

If the source data consists of multiple columns, then AWS DMS constructs a JSON document from all of these columns. To determine the `_id` field for the document, AWS DMS proceeds as follows:
+ If one of the columns is named `_id`, then the data in that column is used as the target `_id`.
+ If there is no `_id` column, but the source data has a primary key or a unique index, then AWS DMS uses that key or index value as the `_id` value. The data from the primary key or unique index also appears as explicit fields in the JSON document.
+ If there is no `_id` column, and no primary key or a unique index, then Amazon DocumentDB generates an `_id` value automatically.
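The selection order above can be sketched as a small helper function. This is a conceptual illustration of the rules, not AWS DMS's actual implementation; in the last case, the `_id` is generated by Amazon DocumentDB itself.

```python
def choose_document_id(record: dict, key_columns: list) -> tuple:
    """Pick the _id for a target document, following the rules above."""
    # Rule 1: an explicit _id column wins.
    if "_id" in record:
        return ("column", record["_id"])
    # Rule 2: otherwise, use the primary key or unique index value.
    if key_columns:
        return ("key", ".".join(str(record[c]) for c in key_columns))
    # Rule 3: otherwise, the target generates the _id automatically.
    return ("generated", None)

print(choose_document_id({"_id": "1", "FirstName": "John"}, []))        # ('column', '1')
print(choose_document_id({"cust_id": 5, "FirstName": "John"}, ["cust_id"]))  # ('key', '5')
```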

### Coercing a data type at the target endpoint
<a name="CHAP_Target.DocumentDB.coercing-datatype"></a>

AWS DMS can modify data structures when it writes to an Amazon DocumentDB target endpoint. You can request these changes by renaming columns and tables at the source endpoint, or by providing transformation rules that are applied when a task is running.

#### Using a nested JSON document (json_ prefix)
<a name="CHAP_Target.DocumentDB.coercing-datatype.json"></a>

To coerce a data type, you can prefix the source column name with `json_` (that is, `json_columnName`) either manually or using a transformation. In this case, the column is created as a nested JSON document within the target document, rather than as a string field.

For example, suppose that you want to migrate the following document from a MongoDB source endpoint.

```
{
    "_id": "1", 
    "FirstName": "John", 
    "LastName": "Doe",
    "ContactDetails": "{"Home": {"Address": "Boston","Phone": "1111111"},"Work": { "Address": "Boston", "Phone": "2222222222"}}"
}
```

If you don't coerce any of the source data types, the embedded `ContactDetails` document is migrated as a string.

```
{
    "_id": "1", 
    "FirstName": "John", 
    "LastName": "Doe",
    "ContactDetails": "{\"Home\": {\"Address\": \"Boston\",\"Phone\": \"1111111\"},\"Work\": { \"Address\": \"Boston\", \"Phone\": \"2222222222\"}}"
}
```

However, you can add a transformation rule to coerce `ContactDetails` to a JSON object. For example, suppose that the original source column name is `ContactDetails`. To coerce the data type to nested JSON, rename the column to `json_ContactDetails`, either by adding the `json_` prefix on the source manually or through a transformation rule such as the following.

```
{
    "rules": [
    {
    "rule-type": "transformation",
    "rule-id": "1",
    "rule-name": "1",
    "rule-target": "column",
    "object-locator": {
    "schema-name": "%",
    "table-name": "%",
    "column-name": "ContactDetails"
     },
    "rule-action": "rename",
    "value": "json_ContactDetails",
    "old-value": null
    }
    ]
}
```

AWS DMS replicates the `ContactDetails` field as nested JSON, as follows.

```
{
    "_id": "1",
    "FirstName": "John",
    "LastName": "Doe",
    "ContactDetails": {
        "Home": {
            "Address": "Boston",
            "Phone": "1111111111"
        },
        "Work": {
            "Address": "Boston",
            "Phone": "2222222222"
        }
    }
}
```

#### Using a JSON array (array_ prefix)
<a name="CHAP_Target.DocumentDB.coercing-datatype.array"></a>

To coerce a data type, you can prefix a column name with `array_` (that is, `array_columnName`), either manually or using a transformation. In this case, AWS DMS considers the column as a JSON array, and creates it as such in the target document.

Suppose that you want to migrate the following document from a MongoDB source endpoint.

```
{
    "_id" : "1",
    "FirstName": "John",
    "LastName": "Doe", 
    "ContactAddresses": ["Boston", "New York"],             
    "ContactPhoneNumbers": ["1111111111", "2222222222"]
}
```

If you don't coerce any of the source data types, the embedded `ContactAddresses` and `ContactPhoneNumbers` arrays are migrated as strings.

```
{
    "_id": "1",
    "FirstName": "John",
    "LastName": "Doe", 
    "ContactAddresses": "[\"Boston\", \"New York\"]",             
    "ContactPhoneNumbers": "[\"1111111111\", \"2222222222\"]" 
}
```

However, you can add transformation rules to coerce `ContactAddresses` and `ContactPhoneNumbers` to JSON arrays, as shown in the following table.


****  

| Original source column name | Renamed source column | 
| --- | --- | 
| ContactAddresses | array_ContactAddresses | 
| ContactPhoneNumbers | array_ContactPhoneNumbers | 
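
Following the pattern of the nested JSON example, transformation rules like the following perform these renames (the rule identifiers and `%` wildcards are illustrative):

```
{
    "rules": [
        {
            "rule-type": "transformation",
            "rule-id": "1",
            "rule-name": "1",
            "rule-target": "column",
            "object-locator": {
                "schema-name": "%",
                "table-name": "%",
                "column-name": "ContactAddresses"
            },
            "rule-action": "rename",
            "value": "array_ContactAddresses",
            "old-value": null
        },
        {
            "rule-type": "transformation",
            "rule-id": "2",
            "rule-name": "2",
            "rule-target": "column",
            "object-locator": {
                "schema-name": "%",
                "table-name": "%",
                "column-name": "ContactPhoneNumbers"
            },
            "rule-action": "rename",
            "value": "array_ContactPhoneNumbers",
            "old-value": null
        }
    ]
}
```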

AWS DMS replicates `ContactAddresses` and `ContactPhoneNumbers` as follows.

```
{
    "_id": "1",
    "FirstName": "John",
    "LastName": "Doe",
    "ContactAddresses": [
        "Boston",
        "New York"
    ],
    "ContactPhoneNumbers": [
        "1111111111",
        "2222222222"
    ]
}
```

### Connecting to Amazon DocumentDB using TLS
<a name="CHAP_Target.DocumentDB.tls"></a>

By default, a newly created Amazon DocumentDB cluster accepts secure connections only using Transport Layer Security (TLS). When TLS is enabled, every connection to Amazon DocumentDB requires a public key.

You can retrieve the public key for Amazon DocumentDB by downloading the file `rds-combined-ca-bundle.pem` from an AWS-hosted Amazon S3 bucket. For more information on downloading this file, see [Encrypting connections using TLS](https://docs.aws.amazon.com/documentdb/latest/developerguide/security.encryption.ssl.html) in the *Amazon DocumentDB Developer Guide*.

After you download this .pem file, you can import the public key that it contains into AWS DMS as described following.

#### AWS Management Console
<a name="CHAP_Target.DocumentDB.tls.con"></a>

**To import the public key (.pem) file**

1. Open the AWS DMS console at [https://console.aws.amazon.com/dms](https://console.aws.amazon.com/dms).

1. In the navigation pane, choose **Certificates**.

1. Choose **Import certificate** and do the following:
   + For **Certificate identifier**, enter a unique name for the certificate, for example `docdb-cert`.
   + For **Import file**, navigate to the location where you saved the .pem file.

   When the settings are as you want them, choose **Add new CA certificate**.

#### AWS CLI
<a name="CHAP_Target.DocumentDB.tls.cli"></a>

Use the `aws dms import-certificate` command, as shown in the following example.

```
aws dms import-certificate \
    --certificate-identifier docdb-cert \
    --certificate-pem file://./rds-combined-ca-bundle.pem
```

When you create an AWS DMS target endpoint, provide the certificate identifier (for example, `docdb-cert`). Also, set the SSL mode parameter to `verify-full`.
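
In the AWS CLI, you reference the imported certificate by its ARN. The following `create-endpoint` call is a sketch; the server name, credentials, database name, and certificate ARN are placeholders.

```
aws dms create-endpoint \
    --endpoint-identifier docdb-target-endpoint \
    --endpoint-type target \
    --engine-name docdb \
    --server-name docdb-cluster.cluster-example.us-east-1.docdb.amazonaws.com \
    --port 27017 \
    --username master-user \
    --password "master-password" \
    --database-name mydb \
    --ssl-mode verify-full \
    --certificate-arn arn:aws:dms:us-east-1:123456789012:cert:docdb-cert
```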

## Connecting to Amazon DocumentDB Elastic Clusters as a target
<a name="CHAP_Target.DocumentDB.data-mapping.elastic-cluster-connect"></a>

In AWS DMS versions 3.4.7 and higher, you can create an Amazon DocumentDB target endpoint as an Elastic Cluster. If you create your target endpoint as an Elastic Cluster, attach a new SSL certificate to your Amazon DocumentDB Elastic Cluster endpoint, because your existing SSL certificate won't work.

**To attach a new SSL certificate to your Amazon DocumentDB Elastic Cluster endpoint**

1. In a browser, open [https://www.amazontrust.com/repository/SFSRootCAG2.pem](https://www.amazontrust.com/repository/SFSRootCAG2.pem) and save the contents to a `.pem` file with a unique file name, for example `SFSRootCAG2.pem`. This is the certificate file that you import in subsequent steps.

1. Create the Elastic Cluster endpoint and set the following options:

   1. Under **Endpoint Configuration**, choose **Add new CA certificate**.

   1. For **Certificate identifier**, enter **SFSRootCAG2.pem**.

   1. For **Import certificate file**, choose **Choose file**, then navigate to the `SFSRootCAG2.pem` file that you previously downloaded.

   1. Select and open the downloaded `SFSRootCAG2.pem` file.

   1. Choose **Import certificate**.

   1. From the **Choose a certificate** drop down, choose **SFSRootCAG2.pem**.

The new SSL certificate from the downloaded `SFSRootCAG2.pem` file is now attached to your Amazon DocumentDB Elastic Cluster endpoint.

## Ongoing replication with Amazon DocumentDB as a target
<a name="CHAP_Target.DocumentDB.data-mapping.ongoing-replication"></a>

If ongoing replication (change data capture, CDC) is enabled for Amazon DocumentDB as a target, AWS DMS versions 3.5.0 and higher provide roughly a twentyfold performance improvement over prior releases: where AWS DMS previously handled up to 250 records per second, it now efficiently processes over 5,000 records per second. AWS DMS also ensures that documents in Amazon DocumentDB stay in sync with the source. When a source record is created or updated, AWS DMS first determines which Amazon DocumentDB record is affected by doing the following:
+ If the source record has a column named `_id`, the value of that column determines the corresponding `_id` in the Amazon DocumentDB collection.
+ If there is no `_id` column, but the source data has a primary key or unique index, then AWS DMS uses that key or index value as the `_id` for the Amazon DocumentDB collection.
+ If the source record doesn't have an `_id` column, a primary key, or a unique index, then AWS DMS matches all of the source columns to the corresponding fields in the Amazon DocumentDB collection.

When a new source record is created, AWS DMS writes a corresponding document to Amazon DocumentDB. If an existing source record is updated, AWS DMS updates the corresponding fields in the target document in Amazon DocumentDB. Any fields that exist in the target document but not in the source record remain untouched.
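
For example, suppose that a source update changes only the `LastName` value. AWS DMS rewrites only that field, and a field such as `Nickname` that exists only in the target document is left in place. The field names here are illustrative.

```
// Target document before the source update
{ "_id": "1", "FirstName": "John", "LastName": "Doe", "Nickname": "JD" }

// Target document after the source runs UPDATE ... SET LastName = 'Smith'
{ "_id": "1", "FirstName": "John", "LastName": "Smith", "Nickname": "JD" }
```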

When a source record is deleted, AWS DMS deletes the corresponding document from Amazon DocumentDB.

### Structural changes (DDL) at the source
<a name="CHAP_Target.DocumentDB.data-mapping.ongoing-replication.ddl"></a>

With ongoing replication, any changes to source data structures (such as tables, columns, and so on) are propagated to their counterparts in Amazon DocumentDB. In relational databases, these changes are initiated using data definition language (DDL) statements. You can see how AWS DMS propagates these changes to Amazon DocumentDB in the following table.


****  

| DDL at source | Effect at Amazon DocumentDB target | 
| --- | --- | 
| CREATE TABLE | Creates an empty collection. | 
| Statement that renames a table (RENAME TABLE, ALTER TABLE...RENAME, and similar) | Renames the collection. | 
| TRUNCATE TABLE | Removes all the documents from the collection, but only if HandleSourceTableTruncated is true. For more information, see [Task settings for change processing DDL handling](CHAP_Tasks.CustomizingTasks.TaskSettings.DDLHandling.md). | 
| DROP TABLE | Deletes the collection, but only if HandleSourceTableDropped is true. For more information, see [Task settings for change processing DDL handling](CHAP_Tasks.CustomizingTasks.TaskSettings.DDLHandling.md). | 
| Statement that adds a column to a table (ALTER TABLE...ADD and similar) | The DDL statement is ignored, and a warning is issued. When the first INSERT is performed at the source, the new field is added to the target document. | 
| ALTER TABLE...RENAME COLUMN | The DDL statement is ignored, and a warning is issued. When the first INSERT is performed at the source, the newly named field is added to the target document. | 
| ALTER TABLE...DROP COLUMN | The DDL statement is ignored, and a warning is issued. | 
| Statement that changes the column data type (ALTER COLUMN...MODIFY and similar) | The DDL statement is ignored, and a warning is issued. When the first INSERT is performed at the source with the new data type, the target document is created with a field of that new data type. | 

## Limitations to using Amazon DocumentDB as a target
<a name="CHAP_Target.DocumentDB.limitations"></a>

The following limitations apply when using Amazon DocumentDB as a target for AWS DMS:
+ In Amazon DocumentDB, collection names can't contain the dollar sign ($). In addition, database names can't contain any Unicode characters.
+ AWS DMS doesn't support merging of multiple source tables into a single Amazon DocumentDB collection.
+ When AWS DMS processes changes from a source table that doesn't have a primary key, any LOB columns in that table are ignored.
+ If the **Change table** option is enabled and AWS DMS encounters a source column named `_id`, then that column appears as `__id` (two underscores) in the change table.
+ If you choose Oracle as a source endpoint, then the Oracle source must have full supplemental logging enabled. Otherwise, if there are columns at the source that weren't changed, then the data is loaded into Amazon DocumentDB as null values.
+ The replication task setting `TargetTablePrepMode: TRUNCATE_BEFORE_LOAD` isn't supported for use with a DocumentDB target endpoint.
+ MongoDB capped collections aren't supported in Amazon DocumentDB. However, AWS DMS automatically migrates such objects as uncapped collections on the target DocumentDB database.
+ Parallel CDC apply to Amazon DocumentDB targets can cause duplicate key errors or stalled CDC apply for workloads that use secondary unique indexes or require strict ordering of changes. For such workloads, use the default single-threaded CDC apply configuration.

## Using endpoint settings with Amazon DocumentDB as a target
<a name="CHAP_Target.DocumentDB.ECAs"></a>

You can use endpoint settings to configure your Amazon DocumentDB target database similar to using extra connection attributes. You specify the settings when you create the target endpoint using the AWS DMS console, or by using the `create-endpoint` command in the [AWS CLI](https://docs.aws.amazon.com/cli/latest/reference/dms/index.html), with the `--doc-db-settings '{"EndpointSetting": "value", ...}'` JSON syntax.
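
For example, the following sketch shows the `--doc-db-settings` JSON syntax in a `create-endpoint` call. The connection options are placeholders, and the setting name and value shown are illustrative.

```
aws dms create-endpoint \
    --endpoint-identifier docdb-target-endpoint \
    --endpoint-type target \
    --engine-name docdb \
    --server-name docdb-cluster.cluster-example.us-east-1.docdb.amazonaws.com \
    --port 27017 \
    --doc-db-settings '{"ReplicateShardCollections": true}'
```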

The following table shows the endpoint settings that you can use with Amazon DocumentDB as a target.


| Attribute name | Valid values | Default value and description | 
| --- | --- | --- | 
|   `replicateShardCollections`   |  boolean `true` `false`  |  When `true`, this endpoint setting has the following effects and imposes the following limitations: [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/dms/latest/userguide/CHAP_Target.DocumentDB.html)  | 

## Target data types for Amazon DocumentDB
<a name="CHAP_Target.DocumentDB.datatypes"></a>

In the following table, you can find the Amazon DocumentDB target data types that are supported when using AWS DMS, and the default mapping from AWS DMS data types. For more information about AWS DMS data types, see [Data types for AWS Database Migration Service](CHAP_Reference.DataTypes.md).


|  AWS DMS data type  |  Amazon DocumentDB data type  | 
| --- | --- | 
|  BOOLEAN  |  Boolean  | 
|  BYTES  |  Binary data  | 
|  DATE  | Date | 
|  TIME  | String (UTF8) | 
|  DATETIME  | Date | 
|  INT1  | 32-bit integer | 
|  INT2  |  32-bit integer  | 
|  INT4  | 32-bit integer | 
|  INT8  |  64-bit integer  | 
|  NUMERIC  | String (UTF8) | 
|  REAL4  |  Double  | 
|  REAL8  | Double | 
|  STRING  |  If the data is recognized as JSON, then AWS DMS migrates it to Amazon DocumentDB as a document. Otherwise, the data is mapped to String (UTF8).  | 
|  UINT1  | 32-bit integer | 
|  UINT2  | 32-bit integer | 
|  UINT4  | 64-bit integer | 
|  UINT8  |  String (UTF8)  | 
|  WSTRING  | If the data is recognized as JSON, then AWS DMS migrates it to Amazon DocumentDB as a document. Otherwise, the data is mapped to String (UTF8). | 
|  BLOB  | Binary | 
|  CLOB  | If the data is recognized as JSON, then AWS DMS migrates it to Amazon DocumentDB as a document. Otherwise, the data is mapped to String (UTF8). | 
|  NCLOB  | If the data is recognized as JSON, then AWS DMS migrates it to Amazon DocumentDB as a document. Otherwise, the data is mapped to String (UTF8). | 

# Using Amazon Neptune as a target for AWS Database Migration Service
<a name="CHAP_Target.Neptune"></a>

Amazon Neptune is a fast, reliable, fully managed graph database service that makes it easy to build and run applications that work with highly connected datasets. The core of Neptune is a purpose-built, high-performance graph database engine. This engine is optimized for storing billions of relationships and querying the graph with millisecond latency. Neptune supports the popular graph query languages Apache TinkerPop Gremlin and W3C's SPARQL. For more information on Amazon Neptune, see [What is Amazon Neptune?](https://docs.aws.amazon.com/neptune/latest/userguide/intro.html) in the *Amazon Neptune User Guide*. 

Without a graph database such as Neptune, you probably model highly connected data in a relational database. Because the data has potentially dynamic connections, applications that use such data sources have to model connected-data queries in SQL. This approach requires you to write an extra layer to convert graph queries into SQL. Also, relational databases come with schema rigidity: any change in the schema to model changing connections requires downtime and additional maintenance of the query conversion to support the new schema. Query performance is another big constraint to consider while designing your applications.

Graph databases can greatly simplify such situations. Freedom from a rigid schema, a rich graph query layer (Gremlin or SPARQL), and indexes optimized for graph queries all increase flexibility and performance. The Amazon Neptune graph database also offers enterprise features such as encryption at rest, a secure authorization layer, default backups, Multi-AZ support, and read replica support.

Using AWS DMS, you can migrate relational data that models a highly connected graph to a Neptune target endpoint from a DMS source endpoint for any supported SQL database.

For more details, see the following.

**Topics**
+ [

## Overview of migrating to Amazon Neptune as a target
](#CHAP_Target.Neptune.MigrationOverview)
+ [

## Specifying endpoint settings for Amazon Neptune as a target
](#CHAP_Target.Neptune.EndpointSettings)
+ [

## Creating an IAM service role for accessing Amazon Neptune as a target
](#CHAP_Target.Neptune.ServiceRole)
+ [

## Specifying graph-mapping rules using Gremlin and R2RML for Amazon Neptune as a target
](#CHAP_Target.Neptune.GraphMapping)
+ [

## Data types for Gremlin and R2RML migration to Amazon Neptune as a target
](#CHAP_Target.Neptune.DataTypes)
+ [

## Limitations of using Amazon Neptune as a target
](#CHAP_Target.Neptune.Limitations)

## Overview of migrating to Amazon Neptune as a target
<a name="CHAP_Target.Neptune.MigrationOverview"></a>

Before starting a migration to a Neptune target, create the following resources in your AWS account:
+ A Neptune cluster for the target endpoint. 
+ A SQL relational database supported by AWS DMS for the source endpoint. 
+ An Amazon S3 bucket for the target endpoint. Create this S3 bucket in the same AWS Region as your Neptune cluster. AWS DMS uses this S3 bucket as intermediate file storage for the target data that it bulk loads to the Neptune database. For more information on creating an S3 bucket, see [Creating a bucket](https://docs.aws.amazon.com/AmazonS3/latest/gsg/CreatingABucket.html) in the *Amazon Simple Storage Service User Guide.*
+ A virtual private cloud (VPC) endpoint for S3 in the same VPC as the Neptune cluster. 
+ An AWS Identity and Access Management (IAM) role that includes an IAM policy. This policy should specify the `GetObject`, `PutObject`, `DeleteObject`, and `ListBucket` permissions for the S3 bucket of your target endpoint. Both AWS DMS and Neptune assume this role to gain IAM access to both the target S3 bucket and the Neptune database. For more information, see [Creating an IAM service role for accessing Amazon Neptune as a target](#CHAP_Target.Neptune.ServiceRole).

After you have these resources, setting up and starting a migration to a Neptune target is similar to any full load migration using the console or DMS API. However, a migration to a Neptune target requires some unique steps.

**To migrate an AWS DMS relational database to Neptune**

1. Create a replication instance as described in [Creating a replication instance](CHAP_ReplicationInstance.Creating.md).

1. Create and test a SQL relational database supported by AWS DMS for the source endpoint.

1. Create and test the target endpoint for your Neptune database. 

   To connect the target endpoint to the Neptune database, specify the server name for either the Neptune cluster endpoint or the Neptune writer instance endpoint. Also, specify the S3 bucket folder for AWS DMS to store its intermediate files for bulk load to the Neptune database. 

   During migration, AWS DMS stores all migrated target data in this S3 bucket folder up to a maximum file size that you specify. When this file storage reaches this maximum size, AWS DMS bulk loads the stored S3 data into the target database. It clears the folder to enable storage of any additional target data for subsequent loading to the target database. For more information on specifying these settings, see [Specifying endpoint settings for Amazon Neptune as a target](#CHAP_Target.Neptune.EndpointSettings).

1. Create a full-load replication task with the resources created in steps 1–3 and do the following: 

   1. Use task table mapping as usual to identify specific source schemas, tables, and views to migrate from your relational database using appropriate selection and transformation rules. For more information, see [Using table mapping to specify task settings](CHAP_Tasks.CustomizingTasks.TableMapping.md). 

   1. Specify target mappings by choosing one of the following to specify mapping rules from source tables and views to your Neptune target database graph:
      + Gremlin JSON – For information on using Gremlin JSON to load a Neptune database, see [Gremlin load data format](https://docs.aws.amazon.com/neptune/latest/userguide/bulk-load-tutorial-format-gremlin.html) in the *Amazon Neptune User Guide*.
      + SPARQL RDB to Resource Description Framework Mapping Language (R2RML) – For information on using SPARQL R2RML, see the W3C specification [R2RML: RDB to RDF mapping language](https://www.w3.org/TR/r2rml/).

   1. Do one of the following:
      + Using the AWS DMS console, specify graph-mapping options using **Graph mapping rules** on the **Create database migration task** page. 
      + Using the AWS DMS API, specify these options using the `TaskData` request parameter of the `CreateReplicationTask` API call. 

      For more information and examples using Gremlin JSON and SPARQL R2RML to specify graph-mapping rules, see [Specifying graph-mapping rules using Gremlin and R2RML for Amazon Neptune as a target](#CHAP_Target.Neptune.GraphMapping).

1. Start the replication for your migration task.

## Specifying endpoint settings for Amazon Neptune as a target
<a name="CHAP_Target.Neptune.EndpointSettings"></a>

To create or modify a target endpoint, you can use the console or the `CreateEndpoint` or `ModifyEndpoint` API operations. 

For a Neptune target in the AWS DMS console, specify **Endpoint-specific settings** on the **Create endpoint** or **Modify endpoint** console page. For `CreateEndpoint` and `ModifyEndpoint`, specify request parameters for the `NeptuneSettings` option. The following example shows how to do this using the CLI. 

```
aws dms create-endpoint --endpoint-identifier my-neptune-target-endpoint \
    --endpoint-type target --engine-name neptune \
    --server-name my-neptune-db.cluster-cspckvklbvgf.us-east-1.neptune.amazonaws.com \
    --port 8192 \
    --neptune-settings \
        '{"ServiceAccessRoleArn":"arn:aws:iam::123456789012:role/myNeptuneRole",
          "S3BucketName":"amzn-s3-demo-bucket",
          "S3BucketFolder":"amzn-s3-demo-bucket-folder",
          "ErrorRetryDuration":57,
          "MaxFileSize":100,
          "MaxRetryCount":10,
          "IAMAuthEnabled":false}'
```

Here, the CLI `--server-name` option specifies the server name for the Neptune cluster writer endpoint. Or you can specify the server name for a Neptune writer instance endpoint. 

The `--neptune-settings` option request parameters follow:
+ `ServiceAccessRoleArn` – (Required) The Amazon Resource Name (ARN) of the service role that you created for the Neptune target endpoint. For more information, see [Creating an IAM service role for accessing Amazon Neptune as a target](#CHAP_Target.Neptune.ServiceRole).
+ `S3BucketName` – (Required) The name of the S3 bucket where DMS can temporarily store migrated graph data in .csv files before bulk loading it to the Neptune target database. DMS maps the SQL source data to graph data before storing it in these .csv files.
+ `S3BucketFolder` – (Required) A folder path where you want DMS to store migrated graph data in the S3 bucket specified by `S3BucketName`.
+ `ErrorRetryDuration` – (Optional) The number of milliseconds for DMS to wait to retry a bulk load of migrated graph data to the Neptune target database before raising an error. The default is 250.
+ `MaxFileSize` – (Optional) The maximum size in KB of migrated graph data stored in a .csv file before DMS bulk loads the data to the Neptune target database. The default is 1,048,576 KB (1 GB). If successful, DMS clears the bucket, ready to store the next batch of migrated graph data.
+ `MaxRetryCount` – (Optional) The number of times for DMS to retry a bulk load of migrated graph data to the Neptune target database before raising an error. The default is 5.
+ `IAMAuthEnabled` – (Optional) If you want IAM authorization enabled for this endpoint, set this parameter to `true` and attach the appropriate IAM policy document to your service role specified by `ServiceAccessRoleArn`. The default is `false`.

## Creating an IAM service role for accessing Amazon Neptune as a target
<a name="CHAP_Target.Neptune.ServiceRole"></a>

To access Neptune as a target, create a service role using IAM. Depending on your Neptune endpoint configuration, attach some or all of the following IAM policy and trust documents to this role. When you create the Neptune endpoint, you provide the ARN of this service role. Doing so enables AWS DMS and Amazon Neptune to assume permissions to access both Neptune and its associated Amazon S3 bucket.

If you set the `IAMAuthEnabled` parameter in `NeptuneSettings` to `true` in your Neptune endpoint configuration, attach an IAM policy like the following to your service role. If you set `IAMAuthEnabled` to `false`, you can ignore this policy.

```
// Policy to access Neptune

    {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "VisualEditor0",
                "Effect": "Allow",
                "Action": "neptune-db:*",
                "Resource": "arn:aws:neptune-db:us-east-1:123456789012:cluster-CLG7H7FHK54AZGHEH6MNS55JKM/*"
            }
        ]
    }
```

The preceding IAM policy allows full access to the Neptune target cluster specified by `Resource`.

Attach an IAM policy like the following to your service role. This policy allows DMS to temporarily store migrated graph data in the S3 bucket that you created for bulk loading to the Neptune target database.

```
//Policy to access S3 bucket

{
	"Version": "2012-10-17",
	"Statement": [{
			"Sid": "ListObjectsInBucket",
			"Effect": "Allow",
			"Action": "s3:ListBucket",
			"Resource": [
				"arn:aws:s3:::amzn-s3-demo-bucket"
			]
		},
		{
			"Sid": "AllObjectActions",
			"Effect": "Allow",
			"Action": ["s3:GetObject",
				"s3:PutObject",
				"s3:DeleteObject"
			],
			"Resource": [
				"arn:aws:s3:::amzn-s3-demo-bucket/*"
			]
		}
	]
}
```

The preceding IAM policy allows your account to list the contents of the S3 bucket (`arn:aws:s3:::amzn-s3-demo-bucket`) created for your Neptune target. It also allows your account to fully operate on the contents of all bucket files and folders (`arn:aws:s3:::amzn-s3-demo-bucket/*`).

Edit the trust relationship of your service role to include the following policy document. This allows both AWS DMS and the Amazon Neptune database service to assume the role.

```
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "",
      "Effect": "Allow",
      "Principal": {
        "Service": "dms.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    },
    {
      "Sid": "neptune",
      "Effect": "Allow",
      "Principal": {
        "Service": "rds.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
```


For information about specifying this service role for your Neptune target endpoint, see [Specifying endpoint settings for Amazon Neptune as a target](#CHAP_Target.Neptune.EndpointSettings).

## Specifying graph-mapping rules using Gremlin and R2RML for Amazon Neptune as a target
<a name="CHAP_Target.Neptune.GraphMapping"></a>

The graph-mapping rules that you create specify how data extracted from an SQL relational database source is loaded into a Neptune database cluster target. The format of these mapping rules differs depending on whether the rules are for loading property-graph data using Apache TinkerPop Gremlin or Resource Description Framework (RDF) data using R2RML. Following, you can find information about these formats and where to learn more.

You can specify these mapping rules when you create the migration task using either the console or DMS API. 

Using the console, specify these mapping rules using **Graph mapping rules** on the **Create database migration task** page. In **Graph mapping rules**, you can enter and edit the mapping rules directly using the editor provided. Or you can browse for a file that contains the mapping rules in the appropriate graph-mapping format. 

Using the API, specify these options using the `TaskData` request parameter of the `CreateReplicationTask` API call. Set `TaskData` to the path of a file containing the mapping rules in the appropriate graph-mapping format.
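
For example, a `create-replication-task` call like the following sketch supplies the graph-mapping rules from a local file. The ARNs and file names are placeholders.

```
aws dms create-replication-task \
    --replication-task-identifier neptune-migration-task \
    --source-endpoint-arn arn:aws:dms:us-east-1:123456789012:endpoint:SOURCEEXAMPLE \
    --target-endpoint-arn arn:aws:dms:us-east-1:123456789012:endpoint:TARGETEXAMPLE \
    --replication-instance-arn arn:aws:dms:us-east-1:123456789012:rep:INSTANCEEXAMPLE \
    --migration-type full-load \
    --table-mappings file://table-mappings.json \
    --task-data file://graph-mapping-rules.json
```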

### Graph-mapping rules for generating property-graph data using Gremlin
<a name="CHAP_Target.Neptune.GraphMapping.Gremlin"></a>

To generate property-graph data using Gremlin, specify a JSON object with a mapping rule for each graph entity to be generated from the source data. The format of this JSON is defined specifically for bulk loading Amazon Neptune. The following template shows what each rule in this object looks like.

```
{
    "rules": [
        {
            "rule_id": "(an identifier for this rule)",
            "rule_name": "(a name for this rule)",
            "table_name": "(the name of the table or view being loaded)",
            "vertex_definitions": [
                {
                    "vertex_id_template": "{col1}",
                    "vertex_label": "(the vertex to create)",
                    "vertex_definition_id": "(an identifier for this vertex)",
                    "vertex_properties": [
                        {
                            "property_name": "(name of the property)",
                            "property_value_template": "{col2} or text",
                            "property_value_type": "(data type of the property)"
                        }
                    ]
                }
            ]
        },
        {
            "rule_id": "(an identifier for this rule)",
            "rule_name": "(a name for this rule)",
            "table_name": "(the name of the table or view being loaded)",
            "edge_definitions": [
                {
                    "from_vertex": {
                        "vertex_id_template": "{col1}",
                        "vertex_definition_id": "(an identifier for the vertex referenced above)"
                    },
                    "to_vertex": {
                        "vertex_id_template": "{col3}",
                        "vertex_definition_id": "(an identifier for the vertex referenced above)"
                    },
                    "edge_id_template": {
                        "label": "(the edge label to add)",
                        "template": "{col1}_{col3}"
                    },
                    "edge_properties":[
                        {
                            "property_name": "(the property to add)",
                            "property_value_template": "{col4} or text",
                            "property_value_type": "(data type like String, int, double)"
                        }
                    ]
                }
            ]
        }
    ]
}
```

The presence of a vertex label implies that the vertex is being created here. Its absence implies that the vertex is created by a different source, and this definition is only adding vertex properties. Specify as many vertex and edge definitions as required to specify the mappings for your entire relational database source.

A sample rule for an `employee` table follows.

```
{
    "rules": [
        {
            "rule_id": "1",
            "rule_name": "vertex_mapping_rule_from_nodes",
            "table_name": "nodes",
            "vertex_definitions": [
                {
                    "vertex_id_template": "{emp_id}",
                    "vertex_label": "employee",
                    "vertex_definition_id": "1",
                    "vertex_properties": [
                        {
                            "property_name": "name",
                            "property_value_template": "{emp_name}",
                            "property_value_type": "String"
                        }
                    ]
                }
            ]
        },
        {
            "rule_id": "2",
            "rule_name": "edge_mapping_rule_from_emp",
            "table_name": "nodes",
            "edge_definitions": [
                {
                    "from_vertex": {
                        "vertex_id_template": "{emp_id}",
                        "vertex_definition_id": "1"
                    },
                    "to_vertex": {
                        "vertex_id_template": "{mgr_id}",
                        "vertex_definition_id": "1"
                    },
                    "edge_id_template": {
                        "label": "reportsTo",
                        "template": "{emp_id}_{mgr_id}"
                    },
                    "edge_properties":[
                        {
                            "property_name": "team",
                            "property_value_template": "{team}",
                            "property_value_type": "String"
                        }
                    ]
                }
            ]
        }
    ]
}
```

Here, the vertex and edge definitions map a reporting relationship between an `employee` node identified by an employee ID (`emp_id`) and an `employee` node identified by a manager ID (`mgr_id`).
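As an illustration only (not part of AWS DMS itself), the way a template such as `{emp_id}` or `{emp_id}_{mgr_id}` expands from a source row can be sketched in Python. The `fill_template` helper and the sample row are hypothetical:

```python
import re

def fill_template(template: str, row: dict) -> str:
    """Replace each {column} placeholder with the value from the source row."""
    return re.sub(r"\{(\w+)\}", lambda m: str(row[m.group(1)]), template)

# Hypothetical source row from the "nodes" table
row = {"emp_id": 101, "mgr_id": 42, "emp_name": "Ana", "team": "data"}

vertex_id = fill_template("{emp_id}", row)          # "101"
edge_id = fill_template("{emp_id}_{mgr_id}", row)   # "101_42"
```

Each row of the source table thus yields one `employee` vertex and, through the edge definition, one `reportsTo` edge from the employee to the manager.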

For more information about creating graph-mapping rules using Gremlin JSON, see [Gremlin load data format](https://docs.aws.amazon.com/neptune/latest/userguide/bulk-load-tutorial-format-gremlin.html) in the *Amazon Neptune User Guide*.

### Graph-mapping rules for generating RDF/SPARQL data
<a name="CHAP_Target.Neptune.GraphMapping.R2RML"></a>

If you are loading RDF data to be queried using SPARQL, write the graph-mapping rules in R2RML. R2RML is a standard W3C language for mapping relational data to RDF. In an R2RML file, a *triples map* (for example, `<#TriplesMap1>` following) specifies a rule for translating each row of a logical table to zero or more RDF triples. A *subject map* (for example, any `rr:subjectMap` following) specifies a rule for generating the subjects of the RDF triples generated by a triples map. A *predicate-object map* (for example, any `rr:predicateObjectMap` following) is a function that creates one or more predicate-object pairs for each row of a logical table.

A simple example for a `nodes` table follows.

```
@prefix rr: <http://www.w3.org/ns/r2rml#>.
@prefix ex: <http://example.com/ns#>.

<#TriplesMap1>
    rr:logicalTable [ rr:tableName "nodes" ];
    rr:subjectMap [
        rr:template "http://data.example.com/employee/{id}";
        rr:class ex:Employee;
    ];
    rr:predicateObjectMap [
        rr:predicate ex:name;
        rr:objectMap [ rr:column "label" ];
    ]
```

In the previous example, the mapping defines graph nodes generated from a table of employees.

Another simple example for a `Student` table follows.

```
@prefix rr: <http://www.w3.org/ns/r2rml#>.
@prefix ex: <http://example.com/#>.
@prefix foaf: <http://xmlns.com/foaf/0.1/>.
@prefix xsd: <http://www.w3.org/2001/XMLSchema#>.

<#TriplesMap2>
    rr:logicalTable [ rr:tableName "Student" ];
    rr:subjectMap   [ rr:template "http://example.com/{ID}{Name}";
                      rr:class foaf:Person ];
    rr:predicateObjectMap [
        rr:predicate ex:id ;
        rr:objectMap  [ rr:column "ID";
                        rr:datatype xsd:integer ]
    ];
    rr:predicateObjectMap [
        rr:predicate foaf:name ;
        rr:objectMap  [ rr:column "Name" ]
    ].
```

In the previous example, the mapping defines graph nodes for the persons in a `Student` table, using the friend-of-a-friend (FOAF) vocabulary.

For more information about creating graph-mapping rules using SPARQL R2RML, see the W3C specification [R2RML: RDB to RDF mapping language](https://www.w3.org/TR/r2rml/).

## Data types for Gremlin and R2RML migration to Amazon Neptune as a target
<a name="CHAP_Target.Neptune.DataTypes"></a>

AWS DMS performs data type mapping from your SQL source endpoint to your Neptune target in one of two ways. Which way you use depends on the graph mapping format that you're using to load the Neptune database: 
+ Apache TinkerPop Gremlin, using a JSON representation of the migration data.
+ W3C's SPARQL, using an R2RML representation of the migration data. 

For more information on these two graph mapping formats, see [Specifying graph-mapping rules using Gremlin and R2RML for Amazon Neptune as a target](#CHAP_Target.Neptune.GraphMapping).

Following, you can find descriptions of the data type mappings for each format.

### SQL source to Gremlin target data type mappings
<a name="CHAP_Target.Neptune.DataTypes.Gremlin"></a>

The following table shows the data type mappings from a SQL source to a Gremlin formatted target. 

AWS DMS maps any unlisted SQL source data type to a Gremlin `String`.



[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/dms/latest/userguide/CHAP_Target.Neptune.html)

For more information on the Gremlin data types for loading Neptune, see [Gremlin data types](https://docs.aws.amazon.com//neptune/latest/userguide/bulk-load-tutorial-format-gremlin.html#bulk-load-tutorial-format-gremlin-datatypes) in the *Neptune User Guide.*

### SQL source to R2RML (RDF) target data type mappings
<a name="CHAP_Target.Neptune.DataTypes.R2RML"></a>

The following table shows the data type mappings from a SQL source to an R2RML formatted target.

All listed RDF data types are case-sensitive, except RDF literal. AWS DMS maps any unlisted SQL source data type to an RDF literal. 

An *RDF literal* is one of a variety of literal lexical forms and data types. For more information, see [RDF literals](https://www.w3.org/TR/2004/REC-rdf-concepts-20040210/#section-Graph-Literal) in the W3C specification *Resource Description Framework (RDF): Concepts and Abstract Syntax*.

[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/dms/latest/userguide/CHAP_Target.Neptune.html)

For more information on the RDF data types for loading Neptune and their mappings to SQL source data types, see [Datatype conversions](https://www.w3.org/TR/r2rml/#datatype-conversions) in the W3C specification *R2RML: RDB to RDF Mapping Language*.

## Limitations of using Amazon Neptune as a target
<a name="CHAP_Target.Neptune.Limitations"></a>

The following limitations apply when using Neptune as a target:
+ AWS DMS currently supports full load tasks only for migration to a Neptune target. Change data capture (CDC) migration to a Neptune target isn't supported.
+ Make sure that your target Neptune database is manually cleared of all data before starting the migration task, as in the following examples.

  To drop all data (vertices and edges) within the graph, run the following Gremlin command.

  ```
  gremlin> g.V().drop().iterate()
  ```

  To drop vertices that have the label `'customer'`, run the following Gremlin command.

  ```
  gremlin> g.V().hasLabel('customer').drop()
  ```
**Note**  
It can take some time to drop a large dataset. You might want to iterate `drop()` with a limit, for example, `limit(1000)`.

  To drop edges that have the label `'rated'`, run the following Gremlin command.

  ```
  gremlin> g.E().hasLabel('rated').drop()
  ```
**Note**  
It can take some time to drop a large dataset. You might want to iterate `drop()` with a limit, for example `limit(1000)`.
+ The DMS API operation `DescribeTableStatistics` can return inaccurate results about a given table because of the nature of Neptune graph data structures.

  During migration, AWS DMS scans each source table and uses graph mapping to convert the source data into a Neptune graph. The converted data is first stored in the S3 bucket folder specified for the target endpoint. If the source is scanned and this intermediate S3 data is generated successfully, `DescribeTableStatistics` assumes that the data was successfully loaded into the Neptune target database. But this isn't always true. To verify that the data was loaded correctly for a given table, compare `count()` return values at both ends of the migration for that table. 

  In the following example, AWS DMS has loaded a `customer` table from the source database, which is assigned the label `'customer'` in the target Neptune database graph. You can make sure that this label is written to the target database. To do this, compare the number of `customer` rows available from the source database with the number of `'customer'` labeled vertices loaded in the Neptune target database after the task completes.

  To get the number of customer rows available from the source database using SQL, run the following.

  ```
  select count(*) from customer;
  ```

  To get the number of `'customer'` labeled vertices loaded into the target database graph using Gremlin, run the following.

  ```
  gremlin> g.V().hasLabel('customer').count()
  ```
+ Currently, if any single table fails to load, the whole task fails. Unlike in a relational database target, data in Neptune is highly connected, which makes it impossible in many cases to resume a task. If a task can't be resumed successfully because of this type of data load failure, create a new task to load the table that failed to load. Before running this new task, manually clear the partially loaded table from the Neptune target.
**Note**  
You can resume a task that fails migration to a Neptune target if the failure is recoverable (for example, a network transit error).
+ AWS DMS supports most standards for R2RML. However, AWS DMS doesn't support certain R2RML standards, including inverse expressions, joins, and views. A work-around for an R2RML view is to create a corresponding custom SQL view in the source database. In the migration task, use table mapping to choose the view as input. Then map the view to a table that is then consumed by R2RML to generate graph data.
+ When you migrate source data with unsupported SQL data types, the resulting target data can have a loss of precision. For more information, see [Data types for Gremlin and R2RML migration to Amazon Neptune as a target](#CHAP_Target.Neptune.DataTypes).
+ AWS DMS doesn't support migrating LOB data into a Neptune target.
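The row-count check described in the limitations above can be sketched as a small helper. The counts shown here are hypothetical placeholders; in practice the two numbers would come from `select count(*)` on the source table and `g.V().hasLabel(...).count()` on the Neptune target:

```python
def verify_table_load(source_count: int, target_count: int, label: str) -> bool:
    """Return True when the Neptune target holds as many labeled vertices
    as the source table has rows; report the mismatch otherwise."""
    if source_count != target_count:
        print(f"Mismatch for '{label}': source={source_count}, target={target_count}")
        return False
    return True

# Hypothetical counts gathered from both ends of the migration
ok = verify_table_load(source_count=1500, target_count=1500, label="customer")
```

Running this comparison per table gives a more reliable signal than `DescribeTableStatistics` alone.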

# Using Redis OSS as a target for AWS Database Migration Service
<a name="CHAP_Target.Redis"></a>

Redis OSS is an open-source in-memory data structure store used as a database, cache, and message broker. Managing data in memory can result in read and write operations that take less than a millisecond, and in hundreds of millions of operations per second. As an in-memory data store, Redis OSS powers the most demanding applications requiring sub-millisecond response times.

Using AWS DMS, you can migrate data from any supported source database to a target Redis OSS data store with minimal downtime. For additional information about Redis OSS, see [Redis OSS Documentation](https://redis.io/documentation).

In addition to on-premises Redis OSS, AWS Database Migration Service supports the following:
+ [Amazon ElastiCache (Redis OSS)](https://aws.amazon.com/elasticache/redis/) as a target data store. ElastiCache (Redis OSS) works with your Redis OSS clients and uses the open Redis OSS data format to store your data.
+ [Amazon MemoryDB](https://aws.amazon.com/memorydb/) as a target data store. MemoryDB is compatible with Redis OSS and enables you to build applications using all the Redis OSS data structures, APIs, and commands in use today.

For additional information about working with Redis OSS as a target for AWS DMS, see the following sections: 

**Topics**
+ [

## Prerequisites for using a Redis OSS cluster as a target for AWS DMS
](#CHAP_Target.Redis.Prerequisites)
+ [

## Limitations when using Redis as a target for AWS Database Migration Service
](#CHAP_Target.Redis.Limitations)
+ [

## Migrating data from a relational or non-relational database to a Redis OSS target
](#CHAP_Target.Redis.Migrating)
+ [

## Specifying endpoint settings for Redis OSS as a target
](#CHAP_Target.Redis.EndpointSettings)

## Prerequisites for using a Redis OSS cluster as a target for AWS DMS
<a name="CHAP_Target.Redis.Prerequisites"></a>

DMS supports an on-premises Redis OSS target in a standalone configuration, or as a Redis OSS cluster where data is automatically *sharded* across multiple nodes. Sharding is the process of separating data into smaller chunks called shards that are spread across multiple servers or nodes. In effect, a shard is a data partition that contains a subset of the total data set, and serves a slice of the overall workload.

Because Redis OSS is a key-value NoSQL data store, the Redis OSS key naming convention to use when your source is a relational database is **schema-name.table-name.primary-key**. In Redis OSS, the key and value must not contain the special character `%`. Otherwise, DMS skips the record.
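A minimal sketch of this naming convention and the `%` restriction follows; the helpers are hypothetical illustrations, not anything DMS exposes:

```python
def redis_key(schema: str, table: str, primary_key: str) -> str:
    """Build the schema-name.table-name.primary-key key DMS writes for a row."""
    return f"{schema}.{table}.{primary_key}"

def is_migratable(key: str, value: str) -> bool:
    """DMS skips a record when its key or value contains the '%' character."""
    return "%" not in key and "%" not in value

key = redis_key("sales", "customer", "101")   # "sales.customer.101"
assert is_migratable(key, "Ana")              # plain values migrate
assert not is_migratable(key, "100%")         # '%' in the value: record skipped
```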

**Note**  
If you are using ElastiCache (Redis OSS) as a target, DMS supports *cluster mode enabled* configurations only. For more information about using ElastiCache (Redis OSS) version 6.x or higher to create a cluster mode enabled target data store, see [Getting started](https://docs.aws.amazon.com/AmazonElastiCache/latest/red-ug/GettingStarted.html) in the *Amazon ElastiCache (Redis OSS) User Guide*. 

Before you begin a database migration, launch your Redis OSS cluster with the following criteria.
+ Your cluster has one or more shards.
+ If you're using an ElastiCache (Redis OSS) target, ensure that your cluster doesn't use IAM role-based access control. Instead, use Redis OSS Auth to authenticate users.
+ Enable Multi-AZ (Availability Zones).
+ Ensure the cluster has sufficient memory available to fit the data to be migrated from your database. 
+ Make sure that your target Redis OSS cluster is clear of all data before starting the initial migration task.
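Checking that the target cluster is empty can be sketched as follows. The `client` here is anything with a `dbsize()` method (for example, a redis-py `Redis` instance); the stub class is a hypothetical stand-in so the sketch runs without a live cluster:

```python
class StubRedis:
    """Stand-in for a redis-py client; real code would use redis.Redis(...)."""
    def __init__(self, keys: int) -> None:
        self._keys = keys
    def dbsize(self) -> int:
        return self._keys

def ensure_target_empty(client) -> None:
    """Raise if the target Redis OSS cluster still holds data before migration."""
    count = client.dbsize()
    if count > 0:
        raise RuntimeError(f"Target cluster holds {count} keys; clear it first")

ensure_target_empty(StubRedis(keys=0))  # empty target: no error
```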

You should determine your security requirements for the data migration prior to creating your cluster configuration. DMS supports migration to target replication groups regardless of their encryption configuration. But you can enable or disable encryption only when you create your cluster configuration.

## Limitations when using Redis as a target for AWS Database Migration Service
<a name="CHAP_Target.Redis.Limitations"></a>

The following limitations apply when using Redis OSS as a target:
+ Because Redis OSS is a key-value NoSQL data store, the Redis OSS key naming convention to use when your source is a relational database is `schema-name.table-name.primary-key`. 
+ In Redis OSS, the key and value can't contain the special character `%`. DMS doesn't migrate rows that contain the `%` character, or fields whose field name contains the `%` character.
+ Full LOB mode is not supported.
+  A private Certificate Authority (CA) isn’t supported when using ElastiCache (Redis OSS) as a target.
+ AWS DMS does not support source data containing embedded `'\0'` characters when using Redis as a target endpoint. Data containing embedded `'\0'` characters will be truncated at the first `'\0'` character.

## Migrating data from a relational or non-relational database to a Redis OSS target
<a name="CHAP_Target.Redis.Migrating"></a>

You can migrate data from any source SQL or NoSQL data store directly to a Redis OSS target. Setting up and starting a migration to a Redis OSS target is similar to any full load and change data capture migration using the DMS console or API. To perform a database migration to a Redis OSS target, you do the following.
+ Create a replication instance to perform all the processes for the migration. For more information, see [Creating a replication instance](CHAP_ReplicationInstance.Creating.md).
+ Specify a source endpoint. For more information, see [Creating source and target endpoints](CHAP_Endpoints.Creating.md).
+ Locate the DNS name and port number of your cluster.
+ Download a certificate bundle that you can use to verify SSL connections.
+ Specify a target endpoint, as described below.
+ Create a task or set of tasks to define what tables and replication processes you want to use. For more information, see [Creating a task](CHAP_Tasks.Creating.md).
+ Migrate data from your source database to your target cluster.

You begin a database migration in one of two ways:

1. You can choose the AWS DMS console and perform each step there.

1. You can use the AWS Command Line Interface (AWS CLI). For more information about using the CLI with AWS DMS, see [AWS CLI for AWS DMS](http://docs.aws.amazon.com/cli/latest/reference/dms/index.html).

**To locate the DNS name and port number of your cluster**
+ Run the following AWS CLI command, providing the name of your replication group for `--replication-group-id`.

  ```
  aws elasticache describe-replication-groups --replication-group-id myreplgroup
  ```

  Here, the output shows the DNS name in the `Address` attribute and the port number in the `Port` attribute of the primary node in the cluster. 

  ```
   ...
  "ReadEndpoint": {
      "Port": 6379,
      "Address": "myreplgroup-111.1abc1d.1111.uuu1.cache.example.com"
  }
  ...
  ```

  If you are using MemoryDB as your target, use the following AWS CLI command to get the endpoint address of your MemoryDB cluster. 

  ```
  aws memorydb describe-clusters --cluster-name mycluster
  ```

**To download a certificate bundle that you can use to verify SSL connections**
+ Enter the following `wget` command at the command line. Wget is a free GNU command-line utility used to download files from the internet.

  ```
  wget https://s3.aws-api-domain/rds-downloads/rds-combined-ca-bundle.pem
  ```

  Here, `aws-api-domain` completes the Amazon S3 domain in your AWS Region required to access the specified S3 bucket and the rds-combined-ca-bundle.pem file that it provides.

**To create a target endpoint using the AWS DMS console**

This endpoint is for your Redis OSS target that is already running. 
+ On the console, choose **Endpoints** from the navigation pane and then choose **Create Endpoint**. The following table describes the settings.    
[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/dms/latest/userguide/CHAP_Target.Redis.html)

When you're finished providing all information for your endpoint, AWS DMS creates your Redis OSS target endpoint for use during database migration.

For information about creating a migration task and starting your database migration, see [Creating a task](CHAP_Tasks.Creating.md).

## Specifying endpoint settings for Redis OSS as a target
<a name="CHAP_Target.Redis.EndpointSettings"></a>

To create or modify a target endpoint, you can use the console or the `CreateEndpoint` or `ModifyEndpoint` API operations. 

For a Redis OSS target in the AWS DMS console, specify **Endpoint-specific settings** on the **Create endpoint** or **Modify endpoint** console page.

When using `CreateEndpoint` and `ModifyEndpoint` API operations, specify request parameters for the `RedisSettings` option. The example following shows how to do this using the AWS CLI.

```
aws dms create-endpoint --endpoint-identifier my-redis-target
--endpoint-type target --engine-name redis --redis-settings 
'{"ServerName":"sample-test-sample.zz012zz.cluster.eee1.cache.bbbxxx.com","Port":6379,"AuthType":"auth-token", 
 "SslSecurityProtocol":"ssl-encryption", "AuthPassword":"notanactualpassword"}'

{
    "Endpoint": {
        "EndpointIdentifier": "my-redis-target",
        "EndpointType": "TARGET",
        "EngineName": "redis",
        "EngineDisplayName": "Redis",
        "TransferFiles": false,
        "ReceiveTransferredFiles": false,
        "Status": "active",
        "KmsKeyId": "arn:aws:kms:us-east-1:999999999999:key/x-b188188x",
        "EndpointArn": "arn:aws:dms:us-east-1:555555555555:endpoint:ABCDEFGHIJKLMONOPQRSTUVWXYZ",
        "SslMode": "none",
        "RedisSettings": {
            "ServerName": "sample-test-sample.zz012zz.cluster.eee1.cache.bbbxxx.com",
            "Port": 6379,
            "SslSecurityProtocol": "ssl-encryption",
            "AuthType": "auth-token"
        }
    }
}
```

The `--redis-settings` parameters follow:
+ `ServerName`–(Required) Of type `string`, specifies the Redis OSS cluster that data is migrated to. The cluster must be in the same VPC as your replication instance.
+ `Port`–(Required) Of type `number`, the port value used to access the endpoint.
+ `SslSecurityProtocol`–(Optional) Valid values include `plaintext` and `ssl-encryption`. The default is `ssl-encryption`. 

  The `plaintext` option doesn't provide Transport Layer Security (TLS) encryption for traffic between the endpoint and the database. 

  Use `ssl-encryption` to make an encrypted connection. `ssl-encryption` doesn’t require an SSL Certificate Authority (CA) ARN to verify a server’s certificate, but one can be identified optionally using the `SslCaCertificateArn` setting. If a certificate authority ARN isn't given, DMS uses the Amazon root CA.

  When using an on-premises Redis OSS target, you can use `SslCaCertificateArn` to import public or private Certificate Authority (CA) into DMS, and provide that ARN for server authentication. A private CA isn’t supported when using ElastiCache (Redis OSS) as a target.
+ `AuthType`–(Required) Indicates the type of authentication to perform when connecting to Redis OSS. Valid values include `none`, `auth-token`, and `auth-role`.

  The `auth-token` option requires that you provide an `AuthPassword` value, and the `auth-role` option requires that you provide both `AuthUserName` and `AuthPassword` values.
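The same settings can be assembled programmatically. This sketch only builds and validates a `RedisSettings`-shaped dictionary (the server name and password are placeholders); in practice you would pass such a structure to the DMS `CreateEndpoint` API, for example through an AWS SDK:

```python
from typing import Optional

def build_redis_settings(server: str, port: int, auth_type: str,
                         auth_password: Optional[str] = None,
                         ssl_protocol: str = "ssl-encryption") -> dict:
    """Assemble a RedisSettings payload, enforcing the AuthType rules above."""
    if auth_type not in ("none", "auth-token", "auth-role"):
        raise ValueError(f"unsupported AuthType: {auth_type}")
    if auth_type == "auth-token" and not auth_password:
        raise ValueError("auth-token requires AuthPassword")
    settings = {
        "ServerName": server,
        "Port": port,
        "AuthType": auth_type,
        "SslSecurityProtocol": ssl_protocol,
    }
    if auth_password:
        settings["AuthPassword"] = auth_password
    return settings

# Placeholder server name and password, mirroring the CLI example above
settings = build_redis_settings("my-cluster.example.com", 6379,
                                "auth-token", auth_password="notanactualpassword")
```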

# Using Babelfish as a target for AWS Database Migration Service
<a name="CHAP_Target.Babelfish"></a>

You can migrate data from a Microsoft SQL Server source database to a Babelfish target using AWS Database Migration Service. 

Babelfish for Aurora PostgreSQL extends your Amazon Aurora PostgreSQL-Compatible Edition database with the ability to accept database connections from Microsoft SQL Server clients. Doing this allows applications originally built for SQL Server to work directly with Aurora PostgreSQL with few code changes compared to a traditional migration, and without changing database drivers. 

For information about versions of Babelfish that AWS DMS supports as a target, see [Targets for AWS DMS](CHAP_Introduction.Targets.md). Earlier versions of Babelfish on Aurora PostgreSQL require an upgrade before using the Babelfish endpoint.

**Note**  
The Aurora PostgreSQL target endpoint is the preferred way to migrate data to Babelfish. For more information, see [Using Babelfish for Aurora PostgreSQL as a target](CHAP_Target.PostgreSQL.md#CHAP_Target.PostgreSQL.Babelfish). 

For information about using Babelfish as a database endpoint, see [Babelfish for Aurora PostgreSQL](https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/Aurora.AuroraPostgreSQL.html) in the *Amazon Aurora User Guide*.

## Prerequisites to using Babelfish as a target for AWS DMS
<a name="CHAP_Target.Babelfish.Prerequisites"></a>

You must create your tables before migrating data to make sure that AWS DMS uses the correct data types and table metadata. If you don't create your tables on the target before running migration, AWS DMS may create the tables with incorrect data types and permissions. For example, AWS DMS creates a timestamp column as binary(8) instead, and doesn't provide the expected timestamp/rowversion functionality.

**To prepare and create your tables prior to migration**

1. Run your create table DDL statements that include any unique constraints, primary keys, or default constraints. 

   Do not include foreign key constraints, or any DDL statements for objects like views, stored procedures, functions, or triggers. You can apply them after migrating your source database.

1. Identify any identity columns, computed columns, or columns containing rowversion or timestamp data types for your tables. Then, create the necessary transformation rules to handle known issues when running the migration task. For more information, see [Transformation rules and actions](CHAP_Tasks.CustomizingTasks.TableMapping.SelectionTransformation.Transformations.md).

1. Identify columns with data types that Babelfish doesn't support. Then, change the affected columns in the target table to use supported data types, or create a transformation rule that removes them during the migration task. For more information, see [Transformation rules and actions](CHAP_Tasks.CustomizingTasks.TableMapping.SelectionTransformation.Transformations.md).

   The following table lists source data types not supported by Babelfish, and the corresponding recommended target data type to use.    
[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/dms/latest/userguide/CHAP_Target.Babelfish.html)

**To set Aurora capacity units (ACUs) level for your Aurora PostgreSQL Serverless V2 source database**

You can improve performance of your AWS DMS migration task prior to running it by setting the minimum ACU value.
+ From the **Serverless v2 capacity settings** window, set **Minimum ACUs** to **2**, or to a reasonable level for your Aurora DB cluster.

  For additional information about setting Aurora capacity units, see [Choosing the Aurora Serverless v2 capacity range for an Aurora cluster](https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/aurora-serverless-v2.setting-capacity.html) in the *Amazon Aurora User Guide*.

After running your AWS DMS migration task, you can reset the minimum value of your ACUs to a reasonable level for your Aurora PostgreSQL Serverless V2 source database.

## Security requirements when using Babelfish as a target for AWS Database Migration Service
<a name="CHAP_Target.Babelfish.Security"></a>

The following describes the security requirements for using AWS DMS with a Babelfish target:
+ The administrator user name (the Admin user) used to create the database.
+ A PSQL login and user with sufficient SELECT, INSERT, UPDATE, DELETE, and REFERENCES permissions.

## User permissions for using Babelfish as a target for AWS DMS
<a name="CHAP_Target.Babelfish.Permissions"></a>

**Important**  
For security purposes, the user account used for the data migration must be a registered user in any Babelfish database that you use as a target.

Your Babelfish target endpoint requires minimum user permissions to run an AWS DMS migration.

**To create a login and a low-privileged Transact-SQL (T-SQL) user**

1. Create a login and password to use when connecting to the server.

   ```
   CREATE LOGIN dms_user WITH PASSWORD = 'password';
   GO
   ```

1. Create the virtual database for your Babelfish cluster.

   ```
   CREATE DATABASE my_database;
   GO
   ```

1. Create the T-SQL user for your target database.

   ```
   USE my_database
   GO
   CREATE USER dms_user FOR LOGIN dms_user;
   GO
   ```

1. For each table in your Babelfish database, GRANT permissions to the tables.

   ```
   GRANT SELECT, DELETE, INSERT, REFERENCES, UPDATE ON [dbo].[Categories] TO dms_user;  
   ```
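Step 4 must be repeated for every table you migrate. A small generator like the following (a hypothetical helper, not part of DMS) can emit the GRANT statements for a list of tables:

```python
def grant_statements(tables, user="dms_user", schema="dbo"):
    """Yield one T-SQL GRANT per table with the permissions DMS needs."""
    perms = "SELECT, DELETE, INSERT, REFERENCES, UPDATE"
    for table in tables:
        yield f"GRANT {perms} ON [{schema}].[{table}] TO {user};"

# Print the statements to run against your Babelfish database
for stmt in grant_statements(["Categories", "Products"]):
    print(stmt)
```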

## Limitations on using Babelfish as a target for AWS Database Migration Service
<a name="CHAP_Target.Babelfish.Limitations"></a>

The following limitations apply when using a Babelfish database as a target for AWS DMS:
+ Only the **Do Nothing** table preparation mode is supported.
+ The ROWVERSION data type requires a table mapping rule that removes the column name from the table during the migration task.
+ The sql\_variant data type isn't supported.
+ Full LOB mode is supported. However, when using SQL Server as a source endpoint, you must set the SQL Server endpoint connection attribute `ForceFullLob=True` for LOBs to be migrated to the target endpoint.
+ Replication task settings have the following limitations:

  ```
  {
     "FullLoadSettings": {
        "TargetTablePrepMode": "DO_NOTHING",
        "CreatePkAfterFullLoad": false
     }
  }
  ```
+ TIME(7), DATETIME2(7), and DATETIMEOFFSET(7) data types in Babelfish limit the precision value for the seconds portion of the time to 6 digits. Consider using a precision value of 6 for your target table when using these data types. For Babelfish versions 2.2.0 and higher, when using TIME(7) and DATETIME2(7), the seventh digit of precision is always zero.
+ In DO\_NOTHING mode, DMS checks to see if the table already exists. If the table doesn't exist in the target schema, DMS creates the table based on the source table definition, and maps any user-defined data types to their base data types.
+ An AWS DMS migration task to a Babelfish target doesn't support tables that have columns using ROWVERSION or TIMESTAMP data types. You can use a table mapping rule that removes the column name from the table during the transfer process. In the following transformation rule example, a table named `Actor` in your source is transformed to remove all columns starting with the characters `col` from the `Actor` table in your target.

  ```
  {
   	"rules": [{
  		"rule-type": "selection",
  		"rule-id": "1",
  		"rule-name": "1",
  		"object-locator": {
  			"schema-name": "test",
  			"table-name": "%"
  		},
  		"rule-action": "include"
  	}, {
  		"rule-type": "transformation",
  		"rule-id": "2",
  		"rule-name": "2",
  		"rule-action": "remove-column",
  		"rule-target": "column",
  		"object-locator": {
  			"schema-name": "test",
  			"table-name": "Actor",
  			"column-name": "col%"
  		}
  	}]
   }
  ```
+ For tables with identity or computed columns, where the target tables use mixed case names like Categories, you must create a transformation rule action that converts the table names to lowercase for your DMS task. The following example shows how to create the transformation rule action, **Make lowercase** using the AWS DMS console. For more information, see [Transformation rules and actions](CHAP_Tasks.CustomizingTasks.TableMapping.SelectionTransformation.Transformations.md).  
![\[Babelfish transformation rule\]](http://docs.aws.amazon.com/dms/latest/userguide/images/datarep-babelfish-transform-1.png)
+ Prior to Babelfish version 2.2.0, DMS limited the number of columns that you could replicate to a Babelfish target endpoint to twenty (20) columns. With Babelfish 2.2.0 the limit increased to 100 columns. But with Babelfish versions 2.4.0 and higher, the number of columns that you can replicate increases again. You can run the following code sample against your SQL Server database to determine which tables are too long.

  ```
  USE myDB;
  GO
  DECLARE @Babelfish_version_string_limit INT = 8000; -- Use 380 for Babelfish versions before 2.2.0
  WITH bfendpoint
  AS (
  SELECT 
  	[TABLE_SCHEMA]
        ,[TABLE_NAME]
  	  , COUNT( [COLUMN_NAME] ) AS NumberColumns
  	  , ( SUM( LEN( [COLUMN_NAME] ) + 3)  
  		+ SUM( LEN( FORMAT(ORDINAL_POSITION, 'N0') ) + 3 )  
  	    + LEN( TABLE_SCHEMA ) + 3
  		+ 12 -- INSERT INTO string
  		+ 12)  AS InsertIntoCommandLength -- values string
        , CASE WHEN ( SUM( LEN( [COLUMN_NAME] ) + 3)  
  		+ SUM( LEN( FORMAT(ORDINAL_POSITION, 'N0') ) + 3 )  
  	    + LEN( TABLE_SCHEMA ) + 3
  		+ 12 -- INSERT INTO string
  		+ 12)  -- values string
  			>= @Babelfish_version_string_limit
  			THEN 1
  			ELSE 0
  		END AS IsTooLong
  FROM [INFORMATION_SCHEMA].[COLUMNS]
  GROUP BY [TABLE_SCHEMA], [TABLE_NAME]
  )
  SELECT * 
  FROM bfendpoint
  WHERE IsTooLong = 1
  ORDER BY TABLE_SCHEMA, InsertIntoCommandLength DESC, TABLE_NAME
  ;
  ```

## Target data types for Babelfish
<a name="CHAP_Target.Babelfish.DataTypes"></a>

The following table shows the Babelfish target data types that are supported when using AWS DMS and the default mapping from AWS DMS data types.

For additional information about AWS DMS data types, see [Data types for AWS Database Migration Service](CHAP_Reference.DataTypes.md). 


|  AWS DMS data type  |  Babelfish data type   | 
| --- | --- | 
|  BOOLEAN  |  TINYINT  | 
|  BYTES  |  VARBINARY(length)  | 
|  DATE  |  DATE  | 
|  TIME  |  TIME  | 
|  INT1  |  SMALLINT  | 
|  INT2  |  SMALLINT  | 
|  INT4  |  INT  | 
|  INT8  |  BIGINT  | 
|  NUMERIC   |  NUMERIC(p,s)  | 
|  REAL4  |  REAL  | 
|  REAL8  |  FLOAT  | 
|  STRING  |  If the column is a date or time column, then do the following:  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/dms/latest/userguide/CHAP_Target.Babelfish.html) If the column is not a date or time column, use VARCHAR (length).  | 
|  UINT1  |  TINYINT  | 
|  UINT2  |  SMALLINT  | 
|  UINT4  |  INT  | 
|  UINT8  |  BIGINT  | 
|  WSTRING  |  NVARCHAR(length)  | 
|  BLOB  |  VARBINARY(max) To use this data type with DMS, you must enable the use of BLOBs for a specific task. DMS supports BLOB data types only in tables that include a primary key.  | 
|  CLOB  |  VARCHAR(max) To use this data type with DMS, you must enable the use of CLOBs for a specific task.  | 
|  NCLOB  |  NVARCHAR(max) To use this data type with DMS, you must enable the use of NCLOBs for a specific task. During CDC, DMS supports NCLOB data types only in tables that include a primary key.  | 

# Using Amazon Timestream as a target for AWS Database Migration Service
<a name="CHAP_Target.Timestream"></a>

You can use AWS Database Migration Service to migrate data from your source database to an Amazon Timestream target endpoint, with support for full load and CDC data migrations.

Amazon Timestream is a fast, scalable, and serverless time series database service built for high-volume data ingestion. Time series data is a sequence of data points collected over a time interval, and is used for measuring events that change over time. It is used to collect, store, and analyze metrics from IoT applications, DevOps applications, and analytics applications. Once you have your data in Timestream, you can visualize and identify trends and patterns in your data in near real-time. For information about Amazon Timestream, see [What is Amazon Timestream?](https://docs.aws.amazon.com/timestream/latest/developerguide/what-is-timestream.html) in the *Amazon Timestream Developer Guide*.

**Topics**
+ [

## Prerequisites for using Amazon Timestream as a target for AWS Database Migration Service
](#CHAP_Target.Timestream.Prerequisites)
+ [

## Multithreaded full load task settings
](#CHAP_Target.Timestream.FLTaskSettings)
+ [

## Multithreaded CDC load task settings
](#CHAP_Target.Timestream.CDCTaskSettings)
+ [

## Endpoint settings when using Timestream as a target for AWS DMS
](#CHAP_Target.Timestream.ConnectionAttrib)
+ [

## Creating and modifying an Amazon Timestream target endpoint
](#CHAP_Target.Timestream.CreateModifyEndpoint)
+ [

## Using object mapping to migrate data to a Timestream table
](#CHAP_Target.Timestream.ObjectMapping)
+ [

## Limitations when using Amazon Timestream as a target for AWS Database Migration Service
](#CHAP_Target.Timestream.Limitations)

## Prerequisites for using Amazon Timestream as a target for AWS Database Migration Service
<a name="CHAP_Target.Timestream.Prerequisites"></a>

Before you set up Amazon Timestream as a target for AWS DMS, make sure that you create an IAM role. This role must allow AWS DMS to gain access to the data being migrated into Amazon Timestream. The minimum set of access permissions for the role that you use to migrate to Timestream is shown in the following IAM policy.

------
#### [ JSON ]

****  

```
{
  "Version":"2012-10-17",		 	 	 
  "Statement": [
    {
      "Sid": "AllowDescribeEndpoints",
      "Effect": "Allow",
      "Action": [
        "timestream:DescribeEndpoints"
      ],
      "Resource": "*"
    },
    {
      "Sid": "VisualEditor0",
      "Effect": "Allow",
      "Action": [
        "timestream:ListTables",
        "timestream:DescribeDatabase"
      ],
      "Resource": "arn:aws:timestream:us-east-1:123456789012:database/DATABASE_NAME"
    },
    {
      "Sid": "VisualEditor1",
      "Effect": "Allow",
      "Action": [
        "timestream:DeleteTable",
        "timestream:WriteRecords",
        "timestream:UpdateTable",
        "timestream:CreateTable"
      ],
      "Resource": "arn:aws:timestream:us-east-1:123456789012:database/DATABASE_NAME/table/TABLE_NAME"
    }
  ]
}
```

------

If you intend to migrate all tables, use `*` for *TABLE\_NAME* in the example above.

Note the following about using Timestream as a target:
+ If you intend to ingest historical data with timestamps more than 1 year old, we recommend using AWS DMS to write the data to Amazon S3 in comma-separated values (CSV) format. Then, use Timestream’s batch load to ingest the data into Timestream. For more information, see [Using batch load in Timestream](https://docs.aws.amazon.com/timestream/latest/developerguide/batch-load.html) in the [Amazon Timestream developer guide](https://docs.aws.amazon.com/timestream/latest/developerguide/what-is-timestream.html).
+ For full-load data migrations of data less than 1 year old, we recommend setting the memory store retention period of the Timestream table to a value greater than or equal to the age of the oldest timestamp. Then, once the migration completes, edit the table's memory store retention to the desired value. For example, to migrate data whose oldest timestamp is 2 months old, do the following:
  + Set the Timestream target table's memory store retention to 2 months.
  + Start the data migration using AWS DMS.
  + Once the data migration completes, change the retention period of the target Timestream table to your desired value. 

   We recommend estimating the memory store cost prior to the migration, using the information on the following pages:
  + [Amazon Timestream pricing](https://aws.amazon.com/timestream/pricing)
  + [AWS pricing calculator](https://calculator.aws/#/addService) 
+ For CDC data migrations, we recommend setting the memory store retention period of the target table such that ingested data falls within the memory store retention bounds. For more information, see [Writes best practices](https://docs.aws.amazon.com/timestream/latest/developerguide/data-ingest.html) in the [Amazon Timestream developer guide](https://docs.aws.amazon.com/timestream/latest/developerguide/what-is-timestream.html).
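
As an illustration, after a full load completes you might change the table's retention with the AWS CLI. The database and table names below are placeholders, and 1440 hours is roughly 2 months:

```
aws timestream-write update-table \
    --database-name my_dms_database \
    --table-name my_dms_table \
    --retention-properties "MemoryStoreRetentionPeriodInHours=1440,MagneticStoreRetentionPeriodInDays=365"
```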

## Multithreaded full load task settings
<a name="CHAP_Target.Timestream.FLTaskSettings"></a>

To help increase the speed of data transfer, AWS DMS supports a multithreaded full load migration task to a Timestream target endpoint with these task settings:
+ `MaxFullLoadSubTasks` – Use this option to indicate the maximum number of source tables to load in parallel. DMS loads each table into its corresponding Amazon Timestream target table using a dedicated subtask. The default is 8; the maximum value is 49.
+ `ParallelLoadThreads` – Use this option to specify the number of threads that AWS DMS uses to load each table into its Amazon Timestream target table. The maximum value for a Timestream target is 32. You can ask to have this maximum limit increased.
+ `ParallelLoadBufferSize` – Use this option to specify the maximum number of records to store in the buffer that the parallel load threads use to load data to the Amazon Timestream target. The default value is 50. The maximum value is 1,000. Use this setting with `ParallelLoadThreads`. `ParallelLoadBufferSize` is valid only when there is more than one thread.
+ `ParallelLoadQueuesPerThread` – Use this option to specify the number of queues each concurrent thread accesses to take data records out of queues and generate a batch load for the target. The default is 1. However, for Amazon Timestream targets of various payload sizes, the valid range is 5–512 queues per thread.
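
As a sketch of where these settings live, `MaxFullLoadSubTasks` belongs to the `FullLoadSettings` section of the task-settings JSON, while the `ParallelLoad*` settings belong to `TargetMetadata`. The values below are examples, not recommendations:

```
{
  "FullLoadSettings": {
    "MaxFullLoadSubTasks": 8
  },
  "TargetMetadata": {
    "ParallelLoadThreads": 16,
    "ParallelLoadBufferSize": 500,
    "ParallelLoadQueuesPerThread": 8
  }
}
```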

## Multithreaded CDC load task settings
<a name="CHAP_Target.Timestream.CDCTaskSettings"></a>

To promote CDC performance, AWS DMS supports these task settings:
+ `ParallelApplyThreads` – Specifies the number of concurrent threads that AWS DMS uses during a CDC load to push data records to a Timestream target endpoint. The default value is 0 and the maximum value is 32.
+ `ParallelApplyBufferSize` – Specifies the maximum number of records to store in each buffer queue for concurrent threads to push to a Timestream target endpoint during a CDC load. The default value is 100 and the maximum value is 1,000. Use this option when `ParallelApplyThreads` specifies more than one thread. 
+ `ParallelApplyQueuesPerThread` – Specifies the number of queues that each thread accesses to take data records out of queues and generate a batch load for a Timestream endpoint during CDC. The default value is 1 and the maximum value is 512.
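
These CDC settings belong to the `TargetMetadata` section of the task-settings JSON (example values only):

```
{
  "TargetMetadata": {
    "ParallelApplyThreads": 8,
    "ParallelApplyBufferSize": 500,
    "ParallelApplyQueuesPerThread": 4
  }
}
```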

## Endpoint settings when using Timestream as a target for AWS DMS
<a name="CHAP_Target.Timestream.ConnectionAttrib"></a>

You can use endpoint settings to configure your Timestream target database similar to using extra connection attributes. You specify the settings when you create the target endpoint using the AWS DMS console, or by using the `create-endpoint` command in the [AWS CLI](https://docs.aws.amazon.com/cli/latest/reference/dms/index.html), with the `--timestream-settings '{"EndpointSetting": "value", ...}'` JSON syntax.

The following table shows the endpoint settings that you can use with Timestream as a target.

[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/dms/latest/userguide/CHAP_Target.Timestream.html)

## Creating and modifying an Amazon Timestream target endpoint
<a name="CHAP_Target.Timestream.CreateModifyEndpoint"></a>

Once you have created an IAM role and established the minimum set of access permissions, you can create an Amazon Timestream target endpoint using the AWS DMS console, or by using the `create-endpoint` command in the [AWS CLI](https://docs.aws.amazon.com/cli/latest/reference/dms/index.html), with the `--timestream-settings '{"EndpointSetting": "value", ...}'` JSON syntax.

The following examples show how to create and modify a Timestream target endpoint using the AWS CLI.

**Create Timestream target endpoint command**

```
aws dms create-endpoint --endpoint-identifier timestream-target-demo \
    --endpoint-type target --engine-name timestream \
    --service-access-role-arn arn:aws:iam::123456789012:role/my-role \
    --timestream-settings '{
        "DatabaseName": "db_name",
        "MemoryDuration": 20,
        "MagneticDuration": 3,
        "CdcInsertsAndUpdates": true,
        "EnableMagneticStoreWrites": true
    }'
```

**Modify Timestream target endpoint command**

```
aws dms modify-endpoint --endpoint-arn arn:aws:dms:us-east-1:123456789012:endpoint:EXAMPLEIDENTIFIER \
    --engine-name timestream \
    --service-access-role-arn arn:aws:iam::123456789012:role/my-role \
    --timestream-settings '{
        "MemoryDuration": 20,
        "MagneticDuration": 3
    }'
```

## Using object mapping to migrate data to a Timestream table
<a name="CHAP_Target.Timestream.ObjectMapping"></a>

AWS DMS uses table-mapping rules to map data from the source to the target Timestream table. To map data to a target table, you use a type of table-mapping rule called object mapping. You use object mapping to define how data records in the source map to the data records written to a Timestream table.

Timestream tables don't have a preset structure other than having a partition key.

**Note**  
You don't have to use object mapping. You can use regular table mapping for various transformations. However, the partition key follows these default behaviors:
+ For full load, the primary key is used as the partition key.
+ For CDC without parallel-apply task settings, `schema.table` is used as the partition key.
+ For CDC with parallel-apply task settings, the primary key is used as the partition key.

To create an object-mapping rule, specify `rule-type` as `object-mapping`. This rule specifies what type of object mapping you want to use. The structure for the rule is as follows.

```
{
    "rules": [
        {
            "rule-type": "object-mapping",
            "rule-id": "id",
            "rule-name": "name",
            "rule-action": "valid object-mapping rule action",
            "object-locator": {
                "schema-name": "case-sensitive schema name",
                "table-name": ""
            }
        }
    ]
}
```

The following example shows an object-mapping rule that includes the `mapping-parameters` available for a Timestream target.

```
{
    "rules": [
        {
            "rule-type": "object-mapping",
            "rule-id": "1",
            "rule-name": "timestream-map",
            "rule-action": "map-record-to-record",
            "target-table-name": "tablename",
            "object-locator": {
                "schema-name": "",
                "table-name": ""
            },
            "mapping-parameters": {
                "timestream-dimensions": [
                    "column_name1",
                     "column_name2"
                ],
                "timestream-timestamp-name": "time_column_name",
                "timestream-multi-measure-name": "column_name1or2",
                "timestream-hash-measure-name":  true or false,
                "timestream-memory-duration": x,
                "timestream-magnetic-duration": y
            }
        }
    ]
}
```

AWS DMS currently supports `map-record-to-record` and `map-record-to-document` as the only valid values for the `rule-action` parameter. The `map-record-to-record` and `map-record-to-document` values specify what AWS DMS does by default to records that aren't excluded as part of the `exclude-columns` attribute list. These values don't affect the attribute mappings in any way. 

Use `map-record-to-record` when migrating from a relational database to a Timestream table. This rule type uses the `taskResourceId.schemaName.tableName` value from the relational database as the partition key in the Timestream table and creates an attribute for each column in the source database. When using `map-record-to-record`, AWS DMS creates a corresponding attribute in the target table for any column in the source table that isn't listed in the `exclude-columns` attribute list. This corresponding attribute is created regardless of whether that source column is used in an attribute mapping. 

One way to understand `map-record-to-record` is to see it in action. For this example, assume that you are starting with a relational database table row with the following structure and data.


| FirstName | LastName | StoreId | HomeAddress | HomePhone | WorkAddress | WorkPhone | DateofBirth | 
| --- | --- | --- | --- | --- | --- | --- | --- | 
| Randy | Marsh | 5 | 221B Baker Street | 1234567890 | 31 Spooner Street, Quahog  | 9876543210 | 02/29/1988 | 

To migrate this information from a schema named `Test` to a Timestream table, you create rules to map the data to the target table. The following rule illustrates the mapping. 

```
{
    "rules": [
        {
            "rule-type": "selection",
            "rule-id": "1",
            "rule-name": "1",
            "rule-action": "include",
            "object-locator": {
                "schema-name": "Test",
                "table-name": "%"
            }
        },
        {
            "rule-type": "object-mapping",
            "rule-id": "2",
            "rule-name": "DefaultMapToTimestream",
            "rule-action": "map-record-to-record",
            "object-locator": {
                "schema-name": "Test",
                "table-name": "Customers"
            }
        }
    ]
}
```

Given a Timestream table and a partition key (in this case, `taskResourceId.schemaName.tableName`), the following shows the resulting record format using our sample data in the Timestream target table: 

```
  {
     "FirstName": "Randy",
     "LastName": "Marsh",
     "StoreId":  "5",
     "HomeAddress": "221B Baker Street",
     "HomePhone": "1234567890",
     "WorkAddress": "31 Spooner Street, Quahog",
     "WorkPhone": "9876543210",
     "DateOfBirth": "02/29/1988"
  }
```

## Limitations when using Amazon Timestream as a target for AWS Database Migration Service
<a name="CHAP_Target.Timestream.Limitations"></a>

The following limitations apply when using Amazon Timestream as a target:
+ **Dimensions and Timestamps:** Timestream uses the dimensions and timestamps in the source data like a composite primary key, and doesn't allow you to upsert these values. If you change the timestamp or the dimensions of a record in the source database, Timestream creates a new record. If the changed dimension or timestamp values match those of another existing record, AWS DMS updates that other record instead of creating a new record or updating the original one.
+ **DDL Commands:** The current release of AWS DMS only supports `CREATE TABLE` and `DROP TABLE` DDL commands.
+ **Record Limitations:** Timestream has limitations for records such as record size and measure size. For more information, see [Quotas](https://docs.aws.amazon.com/timestream/latest/developerguide/what-is-timestream.html) in the [Amazon Timestream Developer Guide](https://docs.aws.amazon.com/).
+ **Deleting Records and Null Values:** Timestream doesn't support deleting records. To support migrating records deleted from the source, AWS DMS clears the corresponding fields of the records in the Timestream target database, setting numeric fields to **0**, text fields to **null**, and Boolean fields to **false**.
+ Timestream as a target doesn't support sources that aren't relational databases (RDBMS).
+ AWS DMS only supports Timestream as a target in the following regions:
  + US East (N. Virginia)
  + US East (Ohio)
  + US West (Oregon)
  + Europe (Ireland)
  + Europe (Frankfurt)
  + Asia Pacific (Sydney)
  + Asia Pacific (Tokyo)
+ Timestream as a target doesn't support setting `TargetTablePrepMode` to `TRUNCATE_BEFORE_LOAD`. We recommend using `DROP_AND_CREATE` for this setting.
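
As an illustration of the delete handling described above, deleting the sample `Customers` row shown earlier in this section would leave a target record like the following, assuming `StoreId` is a numeric column and the remaining columns are text:

```
{
   "FirstName": null,
   "LastName": null,
   "StoreId": 0,
   "HomeAddress": null,
   "HomePhone": null,
   "WorkAddress": null,
   "WorkPhone": null,
   "DateOfBirth": null
}
```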

# Using Amazon RDS for Db2 and IBM Db2 LUW as a target for AWS DMS
<a name="CHAP_Target.DB2"></a>

You can use AWS Database Migration Service (AWS DMS) to migrate data from a Db2 LUW database to an Amazon RDS for Db2 database or an on-premises Db2 database. 

For information about versions of Db2 LUW that AWS DMS supports as a target, see [Targets for AWS DMS](CHAP_Introduction.Targets.md).

You can use Secure Sockets Layer (SSL) to encrypt connections between your Db2 LUW endpoint and the replication instance. For more information about using SSL with a Db2 LUW endpoint, see [Using SSL with AWS Database Migration Service](CHAP_Security.SSL.md).

## Limitations when using Db2 LUW as a target for AWS DMS
<a name="CHAP_Target.DB2.Limitations"></a>

The following limitations apply when using a Db2 LUW database as a target for AWS DMS. For limitations on using Db2 LUW as a source, see [Limitations when using Db2 LUW as a source for AWS DMS](CHAP_Source.DB2.md#CHAP_Source.DB2.Limitations).
+ AWS DMS only supports Db2 LUW as a target when the source is either Db2 LUW or Db2 for z/OS.
+ Using Db2 LUW as a target doesn't support replications with full LOB mode.
+ Using Db2 LUW as a target doesn't support the XML datatype in the full load phase. This is a limitation of the IBM dbload utility. For more information, see [The dbload utility](https://www.ibm.com/docs/en/informix-servers/14.10?topic=utilities-dbload-utility) in the *IBM Informix Servers* documentation.
+ AWS DMS truncates BLOB fields with values corresponding to the double quote character ("). This is a limitation of the IBM dbload utility. 
+ AWS DMS does not support the parallel full load option when migrating to a Db2 LUW target in DMS version 3.5.3. This option is available in DMS version 3.5.4 and later.

## Endpoint settings when using Db2 LUW as a target for AWS DMS
<a name="CHAP_Target.DB2.ConnectionAttrib"></a>

You can use endpoint settings to configure your Db2 LUW target database similar to using extra connection attributes. You specify the settings when you create the target endpoint using the AWS DMS console, or by using the `create-endpoint` command in the [AWS CLI](https://docs.aws.amazon.com/cli/latest/reference/dms/index.html), with the `--ibm-db2-settings '{"EndpointSetting": "value", ...}'` JSON syntax.

The following table shows the endpoint settings that you can use with Db2 LUW as a target.


| Name | Description | 
| --- | --- | 
|  `KeepCsvFiles`  |  If true, AWS DMS retains the .csv files used to replicate data to the Db2 LUW target. DMS uses these files for analysis and troubleshooting.  | 
|  `LoadTimeout`  |  The amount of time (in milliseconds) before AWS DMS times out operations performed on the Db2 target. The default value is 1200 (20 minutes).  | 
|  `MaxFileSize`  |  Specifies the maximum size (in KB) of .csv files used to transfer data to Db2 LUW.  | 
|  `WriteBufferSize`  |  The size (in KB) of the in-memory file write buffer used when generating .csv files on the local disk on the DMS replication instance. The default value is 1024 (1 MB).  | 
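
As an illustration, a `create-endpoint` call for a Db2 LUW target might pass these settings. All identifiers and values here are examples only:

```
aws dms create-endpoint --endpoint-identifier db2-target-demo \
    --endpoint-type target --engine-name db2 \
    --server-name db2-server.example.com --port 50000 \
    --database-name SAMPLE --username admin --password "your-password" \
    --ibm-db2-settings '{"KeepCsvFiles": true, "MaxFileSize": 32000, "WriteBufferSize": 1024}'
```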

# Configuring VPC endpoints for AWS DMS
<a name="CHAP_VPC_Endpoints"></a>

AWS DMS supports Amazon Virtual Private Cloud (Amazon VPC) endpoints as sources and targets. AWS DMS can connect to any AWS source or target database through Amazon VPC endpoints, as long as routes to those source and target databases are explicitly defined in the VPC that AWS DMS uses.

By supporting Amazon VPC endpoints, AWS DMS makes it easier to maintain end-to-end network security for all replication tasks without additional networking configuration and setup. Using VPC endpoints for all source and target endpoints ensures that all your traffic remains within your VPC and under your control.

For an AWS DMS replication instance created in a private subnet, or for AWS DMS Serverless replication, you must set up an Amazon VPC endpoint to connect to the following AWS managed services:
+ Amazon S3
+ Amazon DynamoDB
+ Amazon Kinesis
+ Amazon Redshift
+ Amazon OpenSearch Service

If you are using AWS Secrets Manager to store connection credentials for DMS to use, you also need to set up a VPC endpoint for Secrets Manager.

Starting with AWS DMS version 3.4.7, VPC endpoints are required to establish connections between a DMS replication instance or a Serverless replication and the preceding Amazon services when a private network is used.

## Common AWS DMS prerequisites
<a name="CHAP_VPC_Endpoints.prereq"></a>

Before you configure a VPC endpoint, you must meet the following prerequisites:
+ Locate or create the VPC to use with your AWS DMS replication instance or AWS DMS Serverless replication. If you do not provide this information, DMS attempts to use the default VPC in the AWS Region where it is set up.
+ Ensure that you have IAM permissions to create a VPC endpoint. To connect to Amazon S3 and Amazon DynamoDB, you can create gateway VPC endpoints that provide reliable connectivity without requiring an internet gateway or a NAT device in your VPC. Gateway endpoints do not use AWS PrivateLink, unlike other types of VPC endpoints. For more information, see [Gateway endpoints](https://docs.aws.amazon.com/vpc/latest/privatelink/gateway-endpoints.html) in the *AWS PrivateLink Guide*.
+ Configure IAM permissions to use DMS:
  + Configure the `dms-vpc-role` role. For more information, see [AWS managed policy: AmazonDMSVPCManagementRole](https://docs.aws.amazon.com/dms/latest/userguide/security-iam-awsmanpol.html#security-iam-awsmanpol-AmazonDMSVPCManagementRole).
  + Configure the `dms-cloudwatch-logs-role` role. For more information, see [AWS managed policy: AmazonDMSCloudWatchLogsRole](https://docs.aws.amazon.com/dms/latest/userguide/security-iam-awsmanpol.html#security-iam-awsmanpol-AmazonDMSCloudWatchLogsRole).
  + AWS DMS Serverless requires a service-linked role (SLR) to exist in your account. AWS DMS manages the creation and usage of this role. For more information about making sure that you have the necessary SLR, see [Service-linked role for AWS DMS Serverless](https://docs.aws.amazon.com/dms/latest/userguide/slr-services-sl.html). When you create a replication, AWS DMS Serverless programmatically creates a service-linked role. You can view this role in the IAM console.

## Set up an Amazon VPC endpoint with AWS Secrets Manager
<a name="CHAP_VPC_Endpoints.vpcforsecrets"></a>

You can set up an Amazon VPC endpoint for AWS Secrets Manager to work with AWS DMS. By creating this endpoint, you enable AWS DMS replication instances or serverless replication configurations in private subnets to securely access database credentials stored in Secrets Manager without requiring public internet access.

**Prerequisites**

Before you configure a VPC endpoint with AWS Secrets Manager in AWS DMS, you must meet the following prerequisites:
+ Ensure you configure all the [Common AWS DMS prerequisites](#CHAP_VPC_Endpoints.prereq).
+ Create and configure the [source](https://docs.aws.amazon.com/dms/latest/userguide/CHAP_Introduction.Sources.html) or [target](https://docs.aws.amazon.com/dms/latest/userguide/CHAP_Introduction.Targets.html) database that you want to connect to.
+ Create a secret in AWS Secrets Manager with credentials to access the source and target databases. The secret must be in the same AWS Region as your AWS DMS replication instance or AWS DMS Serverless replication. Depending on the database type, the schema of the secret can vary. For more information, see [Working with AWS DMS endpoints](https://docs.aws.amazon.com/dms/latest/userguide/CHAP_Endpoints.html).
**Important**  
AWS DMS replication instances and AWS DMS Serverless replications don't work with secrets managed by Amazon RDS. These credentials don't include the host and port information that AWS DMS requires to establish connections.
+ Configure IAM permissions to manage the DMS endpoint. This is required for some databases: Amazon S3, Amazon Kinesis, Amazon DynamoDB, Amazon Redshift, Amazon OpenSearch Service, Amazon Neptune, and Amazon Timestream. For more information, see [Working with AWS DMS endpoints](https://docs.aws.amazon.com/dms/latest/userguide/CHAP_Endpoints.html).

**Create VPC endpoint for AWS Secrets Manager**

1. Sign in to the AWS Management Console and open the Amazon VPC console at [https://console.aws.amazon.com/vpc/](https://console.aws.amazon.com/vpc/).

1. On the VPC console menu bar, choose the same AWS Region as your AWS DMS replication instance.

1. In the VPC navigation pane, choose **Endpoints**.

1. In **Endpoints**, choose **Create endpoint**.

1. Configure the VPC endpoint as follows:

   1. Select **Type** as **AWS Services**.

   1. In the **Service Name** textbox, search for **secretsmanager** and select **com.amazonaws.[region].secretsmanager**. Ensure that the **Type** for your selected service is **Interface**.

   1. Under **Network settings**, select the VPC that is in the same Region as your DMS replication instance or the VPC where you created your Serverless replication.

   1. In the **Subnets** section, select the subnets where you want DMS to operate. Ensure that you select only private subnets. You can identify a private subnet by its name, for example: `vpc-xxxxxx-subnet-private1-us-west-2a`.

      If your DMS replication instance is created without public access, you must choose the route tables associated with the private subnets where your replication instance resides.
**Note**  
Note the private subnets, because you must provide them when creating the DMS replication subnet group. To connect DMS with AWS Secrets Manager using VPC endpoints, the subnets specified for the VPC endpoint must be the same as the subnets in the DMS replication subnet group.

   1. Select your desired **Security groups**. The security group rules control the traffic to the endpoint network interface from the resources in your VPC. If you do not specify a security group, the default security group is selected.

1. Select **Full access** under **Policy**. If you want to use a custom policy to specify your own access control, select **Custom**. You can use a trust policy that conforms with the JSON policy document, `dms-vpc-role`. For more information, see [Creating the IAM roles to use with AWS DMS](https://docs.aws.amazon.com/dms/latest/userguide/security-iam.html#CHAP_Security.APIRole).

1. Select **Create endpoint**.

   You must wait until the status becomes `Available`. Your VPC endpoint now has an ID starting with `vpce-xxxx`.

You have now successfully created a VPC endpoint. Next, configure your AWS DMS endpoints and DMS replication subnet group. Depending on the migration option you choose, configure a DMS replication instance or a Serverless replication.
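
The console steps above can also be performed with the AWS CLI. The following sketch creates an interface endpoint for Secrets Manager; all IDs and the Region are placeholders:

```
aws ec2 create-vpc-endpoint \
    --vpc-id vpc-0abc1234example \
    --vpc-endpoint-type Interface \
    --service-name com.amazonaws.us-west-2.secretsmanager \
    --subnet-ids subnet-0aaa1111example subnet-0bbb2222example \
    --security-group-ids sg-0ccc3333example
```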

## Set up an Amazon VPC endpoint with Amazon S3
<a name="CHAP_VPC_Endpoints.vpcfors3"></a>

You can set up an Amazon VPC endpoint for Amazon S3 to work with AWS DMS. By creating this endpoint, you enable AWS DMS replication instances or serverless replication configurations in private subnets to securely access data stored in S3 buckets without requiring public internet access.

**Prerequisites**

Before you configure a VPC endpoint with Amazon S3 in AWS DMS, you must meet the following prerequisites:
+ Ensure you configure all the [Common AWS DMS prerequisites](#CHAP_VPC_Endpoints.prereq).
+ Create an Amazon S3 bucket to use as a [source](https://docs.aws.amazon.com/dms/latest/userguide/CHAP_Introduction.Sources.html) or [target](https://docs.aws.amazon.com/dms/latest/userguide/CHAP_Introduction.Targets.html) database with AWS DMS. Do not enable versioning for the S3 bucket. If you need S3 versioning, use lifecycle policies to actively delete old versions. Otherwise, you can encounter endpoint test connection failures because of an S3 list-object call timeout.
+ Configure IAM permissions to manage the DMS Amazon S3 endpoint. If you are using the AWS DMS console, it can create the IAM role with the necessary permissions, provided that you have permissions to create IAM roles.

**Create a VPC endpoint for Amazon S3**

1. Sign in to the AWS Management Console and open the Amazon VPC console at [https://console.aws.amazon.com/vpc/](https://console.aws.amazon.com/vpc/).

1. On the VPC console menu bar, choose the same AWS Region as your AWS DMS replication instance.

1. In the VPC navigation pane, choose **Endpoints**.

1. In **Endpoints**, choose **Create endpoint**.

1. Configure the VPC endpoint as follows:

   1. Select **Type** as **AWS Services**.

   1. In the **Service Name** textbox, search for **s3** and select **com.amazonaws.[region].s3**. Ensure that the **Type** for your selected service is **Gateway**. You can create a Gateway VPC endpoint when connecting to Amazon S3 and DynamoDB. Gateway endpoints do not use AWS PrivateLink, unlike other types of VPC endpoints.

   1. Under **Network settings**, select the VPC that is in the same Region as your DMS replication instance or the VPC where you created your Serverless replication.

   1. In the **Subnets** section, select the subnets where you want DMS to operate. Ensure that you select only private subnets. You can identify a private subnet by its name, for example: `vpc-xxxxxx-subnet-private1-us-west-2a`.
**Note**  
If you created your DMS replication instance without public access, you must choose the route tables associated with the private subnets that are in the same Region as your DMS instance. Note the private subnets, because you must provide them when creating the DMS replication subnet group. To connect DMS with Amazon S3 using VPC endpoints, the subnets specified for the VPC endpoint must be the same as the subnets in the DMS replication subnet group.

1. Select **Full access** under **Policy**. If you want to use a custom policy to specify your own access control, select **Custom**. You can use a trust policy that conforms with the JSON policy document, `dms-vpc-role`. For more information, see [Creating the IAM roles to use with AWS DMS](https://docs.aws.amazon.com/dms/latest/userguide/security-iam.html#CHAP_Security.APIRole).

1. Select **Create endpoint**.

   You must wait until the status becomes `Available`. Your VPC endpoint now has an ID starting with `vpce-xxxx`.

You have now successfully created a VPC endpoint. Next, configure your AWS DMS endpoints and DMS replication subnet group. Depending on the migration option you choose, configure a DMS replication instance or a Serverless replication.
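
The console steps above can also be performed with the AWS CLI. The following sketch creates a gateway endpoint for Amazon S3; note that a gateway endpoint is associated with route tables rather than subnets, and all IDs and the Region are placeholders:

```
aws ec2 create-vpc-endpoint \
    --vpc-id vpc-0abc1234example \
    --vpc-endpoint-type Gateway \
    --service-name com.amazonaws.us-west-2.s3 \
    --route-table-ids rtb-0ddd4444example
```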

## Set up an Amazon VPC endpoint for Amazon DynamoDB
<a name="CHAP_VPC_Endpoints.vpcfordynamoDB"></a>

When using AWS DMS replication instances in private subnets or AWS DMS serverless replication, you must create a VPC endpoint to establish secure connectivity with Amazon DynamoDB. Without a VPC endpoint configuration, AWS DMS encounters connection errors.

When creating the VPC endpoint, you must select **Endpoint type** as **Gateway** or **Interface** in the DMS Console. For more information, see:
+ [Common AWS DMS prerequisites](#CHAP_VPC_Endpoints.prereq)
+ [Gateway endpoints for Amazon DynamoDB](https://docs.aws.amazon.com/vpc/latest/privatelink/vpc-endpoints-ddb.html)
+ [How do I troubleshoot connectivity issues with my gateway Amazon VPC endpoints?](https://repost.aws/knowledge-center/connect-s3-vpc-endpoint)

## Set up an Amazon VPC endpoint for Amazon Kinesis
<a name="CHAP_VPC_Endpoints.vpcforkinesis"></a>

When using AWS DMS replication instances in private subnets or AWS DMS serverless replication, you must create a VPC endpoint to establish secure connectivity with Amazon Kinesis. Without a VPC endpoint configuration, AWS DMS encounters connection errors. For more information, see:
+ [Common AWS DMS prerequisites](#CHAP_VPC_Endpoints.prereq)
+ [Using Amazon Kinesis Data Streams as a target for AWS Database Migration Service](https://docs.aws.amazon.com/dms/latest/userguide/CHAP_Target.Kinesis.html)

## Set up an Amazon VPC endpoint for Amazon Redshift
<a name="CHAP_VPC_Endpoints.vpcforredshift"></a>

When using AWS DMS replication instances in private subnets or AWS DMS serverless replication, you must create a VPC endpoint to establish secure connectivity with Amazon Redshift. Without a VPC endpoint configuration, AWS DMS encounters connection errors. For more information, see:
+ [Common AWS DMS prerequisites](#CHAP_VPC_Endpoints.prereq)
+ [Redshift-managed VPC endpoints](https://docs.aws.amazon.com/redshift/latest/mgmt/managing-cluster-cross-vpc.html)

## Set up an Amazon VPC endpoint for Amazon OpenSearch Service
<a name="CHAP_VPC_Endpoints.vpcforos"></a>

When using AWS DMS replication instances in private subnets or AWS DMS serverless replication, you must create a VPC endpoint to establish secure connectivity with Amazon OpenSearch Service. Without a VPC endpoint configuration, AWS DMS encounters connection errors. For more information, see:
+ [Common AWS DMS prerequisites](#CHAP_VPC_Endpoints.prereq)
+ [Configuring VPC access for Amazon OpenSearch Ingestion pipelines](https://docs.aws.amazon.com/opensearch-service/latest/developerguide/pipeline-security.html)

## Set up replication instances, DMS subnet groups, and DMS endpoints
<a name="CHAP_VPC_Endpoints.option.123"></a>

You must configure AWS DMS replication resources after creating VPC endpoints. You can set up replication subnet groups for network isolation, replication instances or serverless replications for processing, and endpoints for connecting to source and target databases to enable secure database migration within your VPC.

### Set up an AWS DMS replication instance
<a name="CHAP_VPC_Endpoints.option.123.provisioned"></a>

To configure an AWS DMS provisioned replication instance, you must first set up DMS replication subnet groups.

**Create DMS replication subnet groups**

1. Sign in to the AWS Management Console and open the DMS console.

1. From the left navigation pane, choose **Subnet groups** and select **Create subnet group**.

1. Enter **Name** and **Description**.

1. From the **VPC** dropdown menu, select the VPC that is running in the same region where you want to create your DMS replication instance.

1. From the **Add subnets** dropdown menu, add the private subnets that you have specified when creating your VPC endpoint. You can identify a private subnet with the subnet ID. For example: `vpc-xxxxxx-subnet-private1-us-west-2a`.

1. Click **Create subnet group**.

**Create DMS replication instance (provisioned)**

1. Navigate to the AWS Management Console to create a replication instance. For more information, see [Creating a replication instance](https://docs.aws.amazon.com/dms/latest/userguide/CHAP_ReplicationInstance.Creating.html). To learn more about choosing, sizing, and configuring replication instances, see [Working with an AWS DMS replication instance](https://docs.aws.amazon.com/dms/latest/userguide/CHAP_ReplicationInstance.html).

1. In the **Connectivity and security** section, select the VPC where you want to create the AWS DMS replication instance from the **Virtual private cloud (VPC) for IPv4** or **Dual-stack mode** dropdown menu. For more information, see [Setting up a network for a replication instance](https://docs.aws.amazon.com/dms/latest/userguide/CHAP_ReplicationInstance.VPC.html).

1. From the **Replication subnet group** dropdown menu, choose the subnet group that you created for your replication instance.
**Note**  
Ensure the subnets specified for the VPC endpoint are identical to the subnets in the DMS replication instance subnet group. You must remove any subnets from your subnet group that are not associated with the VPC endpoint.

1. Uncheck the **Publicly accessible** checkbox to disable public access.

1. In the **Advanced settings** section, from the **VPC security groups** dropdown menu, select the VPC security groups to associate with your replication instance. These groups must allow traffic in the subnets you specified when creating the VPC endpoint.

   If you do not specify a subnet group, DMS chooses the default **Replication subnet group** or creates one if it does not exist. For more information, see [Security group configuration for AWS DMS](https://docs.aws.amazon.com/dms/latest/userguide/CHAP_Advanced.Endpoints.securitygroup.html).

1. Complete the replication instance configuration and select **Create replication instance**.

   You must wait until the status becomes `Available`.

**Create AWS DMS source and target endpoints**

1. Sign in to the DMS console.

1. Navigate to **AWS DMS endpoints** and select **Create endpoint**.

1. Create and configure [source](https://docs.aws.amazon.com/dms/latest/userguide/CHAP_Source.html) and [target](https://docs.aws.amazon.com/dms/latest/userguide/CHAP_Target.html) endpoints.

1. In the DMS console, you can choose an existing IAM role or create a new IAM role to access your database credentials stored in AWS Secrets Manager.

1. Click **Run test** to test the endpoint connection from your DMS replication instance. The replication instance must have an `Available` status to run the test.

1. Select **Create endpoint**.

### Set up an AWS DMS serverless replication
<a name="CHAP_VPC_Endpoints.option.123.serverless"></a>

To configure an AWS DMS serverless replication, you must set up DMS replication subnet groups.

**Create AWS DMS source and target endpoints**

1. Sign in to the DMS console.

1. Navigate to **AWS DMS endpoints** and select **Create endpoint**.

1. Create and configure [source](https://docs.aws.amazon.com/dms/latest/userguide/CHAP_Source.html) and [target](https://docs.aws.amazon.com/dms/latest/userguide/CHAP_Target.html) endpoints.

1. In the DMS console, you can choose an existing IAM role or create a new IAM role to access your database credentials stored in AWS Secrets Manager.
**Note**  
For AWS DMS serverless replication, you cannot test the connection for the DMS endpoint or use the `TestConnection` API. The connection test is performed during the serverless replication launch, between the DMS instance and your source and target databases. For more information, see [AWS DMS Serverless components](https://docs.aws.amazon.com/dms/latest/userguide/CHAP_Serverless.Components.html).

1. Select **Create endpoint**.

**Create DMS replication subnet groups**

1. Sign in to the AWS Management Console and open the DMS console.

1. From the left navigation pane, choose **Subnet groups** and select **Create subnet group**.

1. Enter **Name** and **Description**.

1. From the **VPC** dropdown menu, select the VPC that is running in the same region as your DMS serverless instance.

1. From the **Add subnets** dropdown menu, add the private subnets that you have specified when creating your VPC endpoint. You can identify a private subnet with the subnet ID. For example: `vpc-xxxxxx-subnet-private1-us-west-2a`.

1. Click **Create subnet group**.

**Create DMS serverless replication**

1. Navigate to the DMS console to create a serverless replication. For more information, see [Creating a serverless replication](https://docs.aws.amazon.com/dms/latest/userguide/CHAP_Serverless.Components.html#CHAP_Serverless.create). To learn more about choosing, sizing, and configuring serverless replications, see [Working with AWS DMS serverless](https://docs.aws.amazon.com/dms/latest/userguide/CHAP_Serverless.html).

1. In the **Connectivity and security** section, select the VPC where you want to create the AWS DMS serverless replication from the **Virtual private cloud (VPC)** dropdown menu. For more information, see [Setting up a network for a replication instance](https://docs.aws.amazon.com/dms/latest/userguide/CHAP_ReplicationInstance.VPC.html).

1. From the **Subnet group** dropdown menu, choose the subnet group that you created for your serverless instance.
**Note**  
Ensure the subnets specified for the VPC endpoint are identical to the subnets in the DMS serverless replication subnet group. You must remove any subnets from your subnet group that are not associated with the VPC endpoint.

1. Select an **Availability zone**.

1. From the **Maximum DMS capacity units (DCU)** dropdown menu, select the desired DCU capacity.

1. Select **Create task**. This creates a DMS serverless replication configuration that appears in the task list with the status `Start required`.

1. To start serverless replication, choose your task and select **Start** from the **Actions** menu.

## Who is impacted when migrating to AWS DMS versions 3.4.7 and higher?
<a name="CHAP_VPC_Endpoints.Users_Impacted"></a>

You are impacted if you are using one or more of the previously listed AWS DMS endpoints, and these endpoints are not publicly routable or they don’t have VPC endpoints already associated with them.

## Who is not impacted when migrating to AWS DMS versions 3.4.7 and higher?
<a name="CHAP_VPC_Endpoints.Users_Not_Impacted"></a>

You are not impacted if:
+ You aren't using one or more of the previously listed AWS DMS endpoints.
+ You are using any of the previously listed endpoints and they are publicly routable.
+ You are using any of the previously listed endpoints and they have VPC endpoints associated with them.

## Preparing a migration to AWS DMS versions 3.4.7 and higher
<a name="CHAP_VPC_Endpoints.User_Mitigation"></a>

To prevent AWS DMS task failures when you are using any of the endpoints described previously, take one of the steps following prior to upgrading AWS DMS to version 3.4.7 or higher:
+ Make the impacted AWS DMS endpoints publicly routable. For example, add an Internet Gateway (IGW) route to any VPC already used by your AWS DMS replication instance to make all its source and target endpoints publicly routable.
+ Create VPC endpoints to access all source and target endpoints used by AWS DMS as described following.

For any existing VPC endpoints that you use for your AWS DMS source and target endpoints, ensure that they use a trust policy that conforms with the JSON policy document, `dms-vpc-role`. For more information on this JSON policy document, see [Creating the IAM roles to use with AWS DMS](security-iam.md#CHAP_Security.APIRole).

Otherwise, configure your replication instances to use VPC endpoints by adding a VPC endpoint to the VPC containing them. If you configured your replication instances without public endpoints, adding a publicly accessible VPC endpoint to the VPC that contains your replication instances makes them publicly accessible. You don't need to do anything further to specifically associate your replication instances with the VPC endpoint.

**Note**  
Different services might have unique VPC endpoint configurations. For instance, when using AWS Secrets Manager, you typically don't need to adjust the routing table. Always check the specific requirements for each service.

For more information on configuring VPC endpoints for an AWS DMS replication instance, see [Network configurations for database migration](CHAP_ReplicationInstance.VPC.md#CHAP_ReplicationInstance.VPC.Configurations). For more information on creating interface VPC endpoints for accessing AWS services generally, see [Access an AWS service using an interface VPC endpoint](https://docs.aws.amazon.com/vpc/latest/privatelink/create-interface-endpoint.html) in the *AWS PrivateLink Guide*. For information on AWS DMS regional availability for VPC endpoints, see the [AWS Region Table](https://aws.amazon.com/about-aws/global-infrastructure/regional-product-services/).

# DDL statements supported by AWS DMS
<a name="CHAP_Introduction.SupportedDDL"></a>

You can execute data definition language (DDL) statements on the source database during the data migration process. These statements are replicated to the target database by the replication server. 

Supported DDL statements include the following: 
+ Create table
+ Drop table
+ Rename table
+ Truncate table
+ Add column
+ Drop column
+ Rename column
+ Change column data type

DMS doesn’t capture all supported DDL statements for some source engine types. And DMS handles DDL statements differently when applying them to specific target engines. For information about which DDL statements are supported for a specific source, and how they’re applied to a target, see the specific documentation topic for that source and target endpoint.

You can use task settings to configure the way DMS handles DDL behavior during change data capture (CDC). For more information, see [Task settings for change processing DDL handling](CHAP_Tasks.CustomizingTasks.TaskSettings.DDLHandling.md).
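For example, a task-settings fragment that controls change processing DDL handling might look like the following sketch. The attribute names come from the DMS task settings reference; the boolean values shown are illustrative, not recommendations:

```json
{
  "ChangeProcessingDdlHandlingPolicy": {
    "HandleSourceTableDropped": true,
    "HandleSourceTableTruncated": true,
    "HandleSourceTableAltered": true
  }
}
```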

## Limitations and considerations
<a name="CHAP_Introduction.SupportedDDL.Limitations"></a>

Rapid sequences of DDL operations in the source database (such as DDL>DML>DDL) can cause AWS DMS to parse the log incorrectly, leading to data loss or unexpected behavior. To maintain data consistency, wait for AWS DMS to apply each change to the target before performing subsequent operations.

For example, during change data capture (CDC), multiple rapid table rename operations on a source table can trigger errors. If you rename a table and then quickly rename it back to its original name, AWS DMS might report that the table already exists in the target database.

# Advanced endpoint configuration
<a name="CHAP_Advanced.Endpoints"></a>

You can configure advanced settings for your endpoints in AWS Database Migration Service (AWS DMS) to control how source and target endpoints behave during the migration process. As part of the advanced setup, you can configure AWS DMS VPC peering to enable secure communication between VPCs, DMS security groups to control inbound and outbound traffic, Network Access Control Lists (NACLs) as an additional layer of security, and VPC endpoints for AWS Secrets Manager.

You can set these configurations during endpoint creation or modify them later through the AWS DMS console or API to fine-tune the migration process based on specific database engine requirements and performance needs.

Following, you can find out more details about advanced endpoint configuration.

**Topics**
+ [

# VPC peering configuration for AWS DMS
](CHAP_Advanced.Endpoints.vpc.peering.md)
+ [

# Security group configuration for AWS DMS
](CHAP_Advanced.Endpoints.securitygroup.md)
+ [

# Network Access Control List (NACL) configuration for AWS DMS
](CHAP_Advanced.Ednpoints.NACL.md)
+ [

# Configuring an AWS DMS Secrets Manager VPC endpoint
](CHAP_Advanced.Endpoints.secretsmanager.md)
+ [

## Additional considerations
](#CHAP_secretsmanager.additionalconsiderations)

# VPC peering configuration for AWS DMS
<a name="CHAP_Advanced.Endpoints.vpc.peering"></a>

VPC peering enables private network connectivity between two VPCs, allowing AWS DMS replication instances and database endpoints to communicate across different VPCs as if they were in the same network. This is crucial when your DMS replication instance resides in one VPC while source or target databases exist in separate VPCs, enabling direct, secure data migration without traversing the public internet.

When using Amazon RDS, you must configure VPC peering between the DMS and RDS VPCs if your instances are located in different VPCs.

You must perform the following steps:

**Creating a VPC peering connection**

1. Navigate to the [Amazon VPC console](https://console.aws.amazon.com/vpc/).

1. In the navigation pane, select **Peering Connections** under **Virtual private cloud**.

1. Click **Create Peering Connection**.

1. Configure the peering connection:
   + **Name tag** (optional): Enter a name for the peering connection (Example: `DMS-RDS-Peering`).
   + **VPC Requester**: Select the VPC that contains your DMS instance.
   + **VPC accepter**: Select the VPC that contains your RDS instance.
**Note**  
If the accepter VPC is associated with a different AWS account, you must have the Account ID and VPC ID for that account.

1. Click **Create Peering Connection**.

**Accepting the VPC peering connection**

1. In the **Peering Connections** list, find the new peering connection with a **Pending Acceptance** status.

1. Select the appropriate peering connection, click **Actions** and select **Accept Request**.

   The peering connection status changes to **Active**.

**Updating route tables**

To enable traffic between the VPCs, you must update the route table in both your VPCs. To update the route tables in the DMS VPC:

1. Identify CIDR block of the RDS VPC:

   1. Navigate to your VPCs and select your RDS VPC.

   1. Copy the **IPv4 CIDR** value on the **CIDRs** tab.

1. Identify relevant DMS route tables using resource map:

   1. Navigate to your VPCs and select your DMS VPC.

   1. Click the **Resource Map** tab and note the route tables associated with the subnets where your DMS instance is located.

1. Update all route tables in the DMS VPC:

   1. Navigate to the route tables in the [Amazon VPC console](https://console.aws.amazon.com/vpc/).

   1. Select the route tables identified for the DMS VPC. You can open them from the VPC's **Resource map** tab.

   1. Click **Edit routes**.

   1. Click **Add route** and enter the following information:
      + **Destination**: Enter the IPv4 CIDR block of the RDS VPC (Example: `10.1.0.0/16`).
      + **Target**: Select the peering connection ID (Example: `pcx-1234567890abcdef`).

   1. Click **Save routes**.

      Your VPC routes are saved for the DMS VPC. Perform the same steps for your RDS VPC.

**Update Security Groups**

1. Verify the DMS instance Security Group:

   1. You must ensure that the outbound rules allow traffic to the RDS instance:
     + **Type**: Custom TCP or the specific database port (Example: 3306 for MySQL).
     + **Destination**: The CIDR block of the RDS VPC or the security group of the RDS instance.

1. Verify the RDS instance Security Group:

   1. You must ensure that the inbound rules allow traffic from the DMS instance:
     + **Type**: The specific database port.
     + **Source**: The CIDR block of the DMS VPC or the security group of the DMS instance.

**Note**  
You must also ensure the following:  
**Active Peering Connection**: Ensure the VPC peering connection is in the **Active** state before proceeding.
**Resource Map**: Use the **Resource map** tab in the [Amazon VPC console](https://console.aws.amazon.com/vpc/) to identify which route tables need updating.
**No Overlapping CIDR Blocks**: The VPCs must have non-overlapping CIDR blocks.
**Security Best Practices**: Restrict Security Group rules to the necessary ports and sources.  
For more information, see [VPC peering connections](https://docs.aws.amazon.com/vpc/latest/peering/working-with-vpc-peering.html) in the *Amazon Virtual Private Cloud user guide*.
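The non-overlapping CIDR requirement can be verified up front with Python's standard `ipaddress` module. The CIDR blocks below are examples; substitute your own VPC ranges:

```python
import ipaddress

# Example CIDR blocks for the DMS VPC and the RDS VPC; replace with your own.
dms_vpc = ipaddress.ip_network("10.0.0.0/16")
rds_vpc = ipaddress.ip_network("10.1.0.0/16")

# VPC peering requires that the two CIDR ranges do not overlap.
if dms_vpc.overlaps(rds_vpc):
    print("CIDR blocks overlap: VPC peering will fail")
else:
    print("CIDR blocks do not overlap: safe to peer")
```

Running the same check with `10.0.0.0/16` and `10.0.1.0/24` would report an overlap, since the second range sits inside the first.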

# Security group configuration for AWS DMS
<a name="CHAP_Advanced.Endpoints.securitygroup"></a>

Security groups in AWS DMS must allow inbound and outbound connections for your replication instances on the appropriate database port. If you are using Amazon RDS, you must configure the security groups for both your DMS and RDS instances.

You must perform the following steps:

**Configure the RDS instance security group**

1. Navigate to the [Amazon VPC console](https://console.aws.amazon.com/vpc/).

1. In the navigation pane on the left under **Security**, select **Security Groups**.

1. Select the RDS Security Group associated with your RDS instance.

1. Edit the inbound rules:

   1. Click **Actions** and select **Edit inbound rules**.

   1. Click **Add Rule** to create a new rule.

   1. Configure the rule as follows:
      + **Type**: Select your database type (Example: MySQL/Aurora for port 3306, PostgreSQL for port 5432).
      + **Protocol**: This auto-populates based on your database type.
      + **Port Range**: This auto-populates based on your database type.
      + **Source**: Choose **Custom**, and paste the security group ID associated with your DMS instance. This allows traffic from any resource within that security group. You can also specify the IP range (CIDR block) of your DMS instance.

   1. Click **Save rules**.

**Configure the DMS replication instance security group**

1. Navigate to the [Amazon VPC console](https://console.aws.amazon.com/vpc/).

1. In the navigation pane on the left under **Security**, select **Security Groups**.

1. In the **Security Group** list find and select the security group associated with your DMS replication instance.

1. Edit the outbound rules:

   1. Click **Actions** and select **Edit outbound rules**.

   1. Click **Add Rule** to create a new rule.

   1. Configure the rule as follows:
      + **Type**: Select your database type (Example: MySQL/Aurora, PostgreSQL).
      + **Protocol**: This auto-populates based on your database type.
      + **Port Range**: This auto-populates based on your database type.
      + **Destination**: Choose **Custom**, and paste the security group ID associated with your RDS instance. This allows traffic to any resource within that security group. You can also specify the IP range (CIDR block) of your RDS instance.

   1. Click **Save rules**.
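Because traffic must be allowed on both sides, it helps to double-check that the DMS outbound rule and the RDS inbound rule agree on the database port. A minimal sketch of that check, using hypothetical rule data rather than real API output:

```python
# Hypothetical security group rules, modeled as (port, referenced_group) pairs.
# Traffic flows only when the DMS outbound rules and the RDS inbound rules
# both allow the database port.

DB_PORT = 3306  # MySQL/Aurora; PostgreSQL would be 5432

dms_outbound = [(3306, "sg-rds-example")]   # outbound rules on the DMS security group
rds_inbound = [(3306, "sg-dms-example")]    # inbound rules on the RDS security group

def port_open(outbound, inbound, port):
    """True when both rule sets allow the given database port."""
    return any(p == port for p, _ in outbound) and any(p == port for p, _ in inbound)

print(port_open(dms_outbound, rds_inbound, DB_PORT))  # True
```

If either side omits the port, the check returns `False`, which mirrors a connection test failing between the replication instance and the database.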

## Additional Considerations
<a name="CHAP_securitygroup_additional_considerations"></a>

You must consider the following additional configuration information:
+ **Use Security Group References**: Referencing security groups in the source or destination instances allows for dynamic management and is more secure than using IP addresses, as it automatically includes all resources within the group.
+ **Database Ports**: Ensure you are using the correct port for your database.
+ **Security Best Practices**: Only open the necessary ports to minimize security risks. You must also regularly review your security group rules to ensure they meet your security standards and requirements.

# Network Access Control List (NACL) configuration for AWS DMS
<a name="CHAP_Advanced.Ednpoints.NACL"></a>

When using Amazon RDS as a replication source, you should update the Network Access Control Lists (NACLs) for your DMS and RDS instance. Ensure that the NACLs are associated with the subnets where these instances reside. This allows inbound and outbound traffic on the specific database port.

To update the Network Access Control Lists, you must perform the following steps:

**Note**  
If your DMS and RDS instances are in the same subnet, you only need to update that subnet's NACL.

**Identify the relevant NACLs**

1. Navigate to the [Amazon VPC console](https://console.aws.amazon.com/vpc/).

1. In the navigation pane on the left under **Security**, select **Network ACLs**.

1. Select the relevant NACLs associated with the subnets where your DMS and RDS instances reside.

**Update the NACLs for the DMS instance subnet**

1. Identify the NACL associated with your DMS instance's subnet. To do so, you can browse through the subnets in the [Amazon VPC console](https://console.aws.amazon.com/vpc/), find the DMS subnet, and note the associated NACL ID.

1. Edit the inbound rules:

   1. Click the **Inbound Rules** tab for the selected NACL.

   1. Select **Edit inbound rules**.

   1. Add a new rule:
      + **Rule number**: Choose a unique number (Example: 100).
      + **Type**: Select **Custom TCP Rule**.
      + **Protocol**: TCP
      + **Port Range**: Enter your database port (Example: 3306 for MySQL).
      + **Source**: Enter the CIDR block of the RDS subnet (Example: 10.1.0.0/16).
      + **Allow/Deny**: Select **Allow**.

1. Edit the outbound rules:

   1. Click the **Outbound Rules** tab for the selected NACL.

   1. Click **Edit outbound rules**.

   1. Add a new rule:
      + **Rule number**: Use the same number as used in the inbound rules.
      + **Type**: All traffic.
      + **Destination**: 0.0.0.0/0
      + **Allow/Deny**: Select **Allow**.

1. Click **Save changes**.

1. Perform the same steps to update the NACLs associated with the RDS instance's subnet.

## Verify the NACL rules
<a name="CHAP_NACL.verify.NACL.Rules"></a>

You must ensure the following criteria regarding the NACL rules:
+ **Order of rules**: NACLs process rules in ascending order based on the rule number. Ensure that rules set to "**Allow**" have lower rule numbers than any "**Deny**" rules that might otherwise block traffic.
+ **Stateless nature**: NACLs are stateless. You must explicitly allow both inbound and outbound traffic.
+ **CIDR blocks**: You must ensure that the CIDR blocks you use accurately represent the subnets of your DMS and RDS instances.
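The rule-ordering behavior can be illustrated with a small simulation: rules are checked in ascending rule-number order, the first match decides the outcome, and an unmatched packet falls through to the implicit final deny. The rule data here is made up for illustration:

```python
# Simulate NACL evaluation: rules are sorted by rule number and the first
# rule whose port range matches decides ALLOW or DENY.
# These rules are illustrative only.

rules = [
    {"number": 100, "port_from": 3306, "port_to": 3306, "action": "ALLOW"},
    {"number": 200, "port_from": 0, "port_to": 65535, "action": "DENY"},
]

def evaluate(rules, port):
    """Return the action of the first matching rule; deny when nothing matches."""
    for rule in sorted(rules, key=lambda r: r["number"]):
        if rule["port_from"] <= port <= rule["port_to"]:
            return rule["action"]
    return "DENY"  # implicit final deny, like the NACL "*" rule

print(evaluate(rules, 3306))  # ALLOW: rule 100 matches before the broad DENY
print(evaluate(rules, 5432))  # DENY: only rule 200 matches
```

Swapping the two rule numbers would make the broad deny win for every port, which is why allow rules need the lower numbers.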

# Configuring an AWS DMS Secrets Manager VPC endpoint
<a name="CHAP_Advanced.Endpoints.secretsmanager"></a>

You must create a VPC endpoint to access AWS Secrets Manager from a replication instance in a private subnet. This allows the replication instance to access Secrets Manager directly through the private network without sending traffic over the public internet.

To configure the VPC endpoint, perform the following steps:

**Create a security group for the VPC endpoint.**

1. Navigate to the [Amazon VPC console](https://console.aws.amazon.com/vpc/).

1. In the navigation pane on the left, select **Security groups**, and choose **Create security group**.

1. Configure security group details:
   + **Security group name**: Example: `SecretsManagerEndpointSG`
   + **Description**: Enter an appropriate description. (Example: Security group for secrets manager VPC endpoint).
   + **VPC**: Select the VPC where your replication instance and endpoints reside.

1. Click **Add Rule** to set inbound rules and configure the following:
   + **Type**: HTTPS (Secrets Manager uses HTTPS on port 443).
   + **Source**: Choose **Custom**, and enter the security group ID of your replication instance. This ensures that any instance associated with that security group can access the VPC endpoint.

1. Review the changes and click **Create security group**.

**Create a VPC endpoint for secrets manager**
**Note**  
Create an interface VPC endpoint as outlined in the [Creating an Interface Endpoint documentation](https://docs.aws.amazon.com/vpc/latest/privatelink/create-interface-endpoint.html#create-interface-endpoint-aws) topic in the *Amazon Virtual Private Cloud user guide*. When following this procedure, ensure the following:  
For **Service Category**, select **AWS services**.
For **Service name**, search for `secretsmanager` and select the Secrets Manager service.

1. Select **VPC and Subnets** and configure the following:
   + **VPC**: Ensure it is the same VPC as your replication instance.
   + **Subnets**: Select the subnets where your replication instance resides.

1. In the **Additional settings** section, ensure that **Enable DNS name** is selected. It is enabled by default for interface endpoints.

1. Under **Security group**, select the appropriate security group (Example: `SecretsManagerEndpointSG`, as created earlier).

1. Review all the settings and click **Create endpoint**.

**Retrieve the VPC endpoint DNS name**

1. Access the VPC endpoint details:

   1. Navigate to the [Amazon VPC console](https://console.aws.amazon.com/vpc/) and choose **Endpoints**.

   1. Select the appropriate endpoint you created.

1. Copy the DNS name:

   1. Under the **Details** tab, navigate to the **DNS Names** section.

   1. Copy the first DNS name listed. (Example: `vpce-0abc123def456789g-secretsmanager.us-east-1.vpce.amazonaws.com`). This is the regional DNS name.

**Update your DMS endpoint**

1. Navigate to the [AWS DMS](https://console.aws.amazon.com/dms/v2) console.

1. Modify the DMS endpoint:

   1. In the navigation pane on the left, select **Endpoints**.

   1. Choose the appropriate endpoint you want to configure.

   1. Click **Actions** and select **Modify**.

1. Configure endpoint settings:

   1. Navigate to **Endpoint settings** and select **Use endpoint connection attributes** checkbox.

   1. In the **Connection attributes** field, add: `secretsManagerEndpointOverride=<copied DNS name>`.
**Note**  
If you have multiple connection attributes, you can separate them with a semicolon ";". For example: `datePartitionEnabled=false;secretsManagerEndpointOverride=vpce-0abc123def456789g-secretsmanager.us-east-1.vpce.amazonaws.com`

1. Click **Modify endpoint** to save your changes.
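The **Connection attributes** value is a semicolon-separated list of `key=value` pairs. The following Python sketch, with a placeholder DNS value, shows how such a string can be assembled and split back apart:

```python
# Build and parse a DMS connection-attributes string, which is a
# semicolon-separated list of key=value pairs. The DNS value below
# is a placeholder, not a real endpoint.

def build_attributes(attrs):
    """Join key=value pairs with semicolons, the format DMS expects."""
    return ";".join(f"{key}={value}" for key, value in attrs.items())

def parse_attributes(raw):
    """Split a semicolon-separated attribute string back into a dict."""
    return dict(pair.split("=", 1) for pair in raw.split(";") if pair)

attrs = {
    "datePartitionEnabled": "false",
    "secretsManagerEndpointOverride": "vpce-example.hypothetical.dns.name",
}

raw = build_attributes(attrs)
print(raw)
print(parse_attributes(raw) == attrs)  # True: the string round-trips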

## Additional considerations
<a name="CHAP_secretsmanager.additionalconsiderations"></a>

You must consider the following additional configuration information:

**Replication instance security group:**
+ Ensure that the security group associated with your replication instance allows outbound traffic to the VPC endpoint on port 443 (HTTPS).

**VPC DNS settings:**
+ Confirm that **DNS resolution** and **DNS hostnames** are enabled in your VPC. This allows your instances to resolve the VPC endpoint DNS names. You can confirm this by navigating to your VPCs in the [Amazon VPC console](https://console.aws.amazon.com/vpc/) and selecting your VPC to verify that **DNS resolution** and **DNS hostnames** are set to "**Yes**".

**Testing connectivity:**
+ From your replication instance, you can perform a DNS lookup to ensure it resolves the VPC endpoint: `nslookup secretsmanager.<region>.amazonaws.com`. It must return the IP address associated with your VPC endpoint.
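The same lookup can be scripted with Python's standard library. The Secrets Manager regional hostname in the comment is an example and only resolves to the endpoint's private IP from inside the VPC; the demonstration below uses a hostname that resolves on any normally configured host:

```python
import socket

def resolves(hostname):
    """Return True if the hostname resolves to at least one IP address."""
    try:
        return len(socket.getaddrinfo(hostname, 443)) > 0
    except socket.gaierror:
        return False

# Inside the VPC you would test the regional endpoint, for example:
#   resolves("secretsmanager.us-east-1.amazonaws.com")
# Here we only demonstrate the helper itself.
print(resolves("localhost"))  # True on a normally configured host
```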