Migrate an Oracle partitioned table to PostgreSQL by using AWS DMS
Created by Saurav Mishra (AWS) and Eduardo Valentim (AWS)
Environment: PoC or pilot | Source: Oracle database | Target: PostgreSQL 9.0 |
R Type: Re-architect | Workload: Oracle | Technologies: Migration; Databases; Storage & backup |
AWS services: AWS DMS |
Summary
This pattern describes how to speed up loading a partitioned table from Oracle to PostgreSQL by using AWS Database Migration Service (AWS DMS), which doesn't support native partitioning. The target PostgreSQL database can be installed on Amazon Elastic Compute Cloud (Amazon EC2), or it can be an Amazon Relational Database Service (Amazon RDS) for PostgreSQL or Amazon Aurora PostgreSQL-Compatible Edition DB instance.
Uploading a partitioned table includes the following steps:
Create a parent table similar to the Oracle partition table, but don't include any partition.
Create child tables that will inherit from the parent table that you created in step 1.
Create a procedure function and trigger to handle the inserts in the parent table.
However, because the trigger is fired for every insert, the initial load using AWS DMS can be very slow.
To speed up initial loads from Oracle to PostgreSQL 9.0, this pattern creates a separate AWS DMS task for each partition and loads the corresponding child tables. You then create a trigger during cutover.
PostgreSQL version 10 supports native partitioning. However, you might decide to use inherited partitioning in some cases. For more information, see the Additional information section.
Prerequisites and limitations
Prerequisites
An active AWS account
A source Oracle database with a partitioned table
A PostgreSQL database on AWS
Product versions
PostgreSQL 9.0
Architecture
Source technology stack
A partitioned table in Oracle
Target technology stack
A partitioned table in PostgreSQL (on Amazon EC2, Amazon RDS for PostgreSQL, or Aurora PostgreSQL)
Target architecture
Tools
AWS Database Migration Service (AWS DMS) helps you migrate data stores into the AWS Cloud or between combinations of cloud and on-premises setups.
Epics
Task | Description | Skills required |
---|---|---|
Create the tables in PostgreSQL. | Create the parent and corresponding child tables in PostgreSQL with the required check conditions for partitions. | DBA |
Create the AWS DMS task for each partition. | Include the filter condition of the partition in the AWS DMS task. Map the partitions to the corresponding PostgreSQL child tables. | DBA |
Run the AWS DMS tasks using full load and change data capture (CDC). | Make sure that the | DBA |
Task | Description | Skills required |
---|---|---|
Stop the replication tasks. | Before you stop the tasks, confirm that the source and destination are in sync. | DBA |
Create a trigger on the parent table. | Because the parent table will receive all insert and update commands, create a trigger that will route these commands to the respective child tables based on the partitioning condition. | DBA |
Related resources
Additional information
Although PostgreSQL version 10 supports native partitioning, you might decide to use inherited partitioning for the following use cases:
Partitioning enforces a rule that all partitions must have the same set of columns as the parent, but table inheritance supports children having extra columns.
Table inheritance supports multiple inheritances.
Declarative partitioning supports only list and range partitioning. With table inheritance, you can divide the data as you want. However, if the constraint exclusion can't prune partitions effectively, query performance will suffer.
Some operations need a stronger lock when using declarative partitioning than when using table inheritance. For example, adding or removing a partition to or from a partitioned table requires an
ACCESS EXCLUSIVE
lock on the parent table, whereas aSHARE UPDATE EXCLUSIVE
lock is enough for regular inheritance.
When you use separate job partitions, you can also reload partitions if there are any AWS DMS validation issues. For better performance and replication control, run tasks on separate replication instances.