To finish creating a zero-ETL integration that replicates data from Amazon RDS into Amazon Redshift, you must create a destination database in Amazon Redshift.
First, connect to your Amazon Redshift cluster or workgroup and create a database with a reference to your integration identifier. Then, you can add data to your source RDS database and see it replicated in Amazon Redshift.
Topics
Creating a destination database in Amazon Redshift
Before you can start replicating data into Amazon Redshift, after you create an integration, you must create a destination database in your target data warehouse. This destination database must include a reference to the integration identifier. You can use the Amazon Redshift console or the Query editor v2 to create the database.
For instructions to create a destination database, see Create a destination database in Amazon Redshift.
Adding data to the source database
After you configure your integration, you can add some data to the RDS database that you want to replicate into your Amazon Redshift data warehouse.
Note
There are differences between data types in Amazon RDS and Amazon Redshift. For a table of data type mappings, see Data type differences between RDS and Amazon Redshift databases.
First, connect to the source database using the MySQL client of your choice. For instructions, see Connecting to your MySQL DB instance.
Then, create a table and insert a row of sample data.
Important
Make sure that the table has a primary key. Otherwise, it can't be replicated to the target data warehouse.
The following example uses the MySQL Workbench utility
CREATE DATABASE my_db
;
USE my_db
;
CREATE TABLE books_table
(ID int NOT NULL, Title VARCHAR(50) NOT NULL, Author VARCHAR(50) NOT NULL,
Copyright INT NOT NULL, Genre VARCHAR(50) NOT NULL, PRIMARY KEY (ID));
INSERT INTO books_table
VALUES (1, 'The Shining', 'Stephen King', 1977, 'Supernatural fiction');
Querying your Amazon RDS
data in Amazon Redshift
After you add data to the RDS database, it's replicated into Amazon Redshift and is ready to be queried.
To query the replicated data
-
Navigate to the Amazon Redshift console and choose Query editor v2 from the left navigation pane.
-
Connect to your cluster or workgroup and choose your destination database (which you created from the integration) from the dropdown menu (destination_database in this example). For instructions to create a destination database, see Create a destination database in Amazon Redshift.
-
Use a SELECT statement to query your data. In this example, you can run the following command to select all data from the table that you created in the source RDS database:
SELECT * from
my_db
."books_table
";-
is the RDS database schema name.my_db
-
is the RDS table name.books_table
-
You can also query the data using the a command line client. For example:
destination_database=# select * from my_db
."books_table
";
ID | Title | Author | Copyright | Genre | txn_seq | txn_id
----+–------------+---------------+-------------+------------------------+----------+--------+
1 | The Shining | Stephen King | 1977 | Supernatural fiction | 2 | 12192
Note
For case-sensitivity, use double quotes (" ") for schema, table, and column names. For more information, see enable_case_sensitive_identifier.
Data type differences between RDS and
Amazon Redshift databases
The following table shows the mapping of an RDS for MySQL data type to a corresponding Amazon Redshift data type. Amazon RDS currently supports only these data types for zero-ETL integrations.
If a table in your source database includes an unsupported data type, the table goes
out of sync and isn't consumable by the Amazon Redshift target. Streaming from the source to the
target continues, but the table with the unsupported data type isn't available. To fix
the table and make it available in Amazon Redshift, you must manually revert the breaking change
and then refresh the integration by running ALTER DATABASE...INTEGRATION
REFRESH
.
RDS for MySQL
RDS for MySQL data type | Amazon Redshift data type | Description | Limitations |
---|---|---|---|
INT | INTEGER | Signed four-byte integer | None |
SMALLINT | SMALLINT | Signed two-byte integer | None |
TINYINT | SMALLINT | Signed two-byte integer | None |
MEDIUMINT | INTEGER | Signed four-byte integer | None |
BIGINT | BIGINT | Signed eight-byte integer | None |
INT UNSIGNED | BIGINT | Signed eight-byte integer | None |
TINYINT UNSIGNED | SMALLINT | Signed two-byte integer | None |
MEDIUMINT UNSIGNED | INTEGER | Signed four-byte integer | None |
BIGINT UNSIGNED | DECIMAL(20,0) | Exact numeric of selectable precision | None |
DECIMAL(p,s) = NUMERIC(p,s) | DECIMAL(p,s) | Exact numeric of selectable precision |
Precision greater than 38 and scale greater than 37 not supported |
DECIMAL(p,s) UNSIGNED = NUMERIC(p,s) UNSIGNED | DECIMAL(p,s) | Exact numeric of selectable precision |
Precision greater than 38 and scale greater than 37 not supported |
FLOAT4/REAL | REAL | Single precision floating-point number | None |
FLOAT4/REAL UNSIGNED | REAL | Single precision floating-point number | None |
DOUBLE/REAL/FLOAT8 | DOUBLE PRECISION | Double precision floating-point number | None |
DOUBLE/REAL/FLOAT8 UNSIGNED | DOUBLE PRECISION | Double precision floating-point number | None |
BIT(n) | VARBYTE(8) | Variable-length binary value | None |
BINARY(n) | VARBYTE(n) | Variable-length binary value | None |
VARBINARY(n) | VARBYTE(n) | Variable-length binary value | None |
CHAR(n) | VARCHAR(n) | Variable-length string value | None |
VARCHAR(n) | VARCHAR(n) | Variable-length string value | None |
TEXT | VARCHAR(65535) | Variable-length string value up to 65,535 characters | None |
TINYTEXT | VARCHAR(255) | Variable-length string value up to 255 characters | None |
MEDIUMTEXT | VARCHAR(65535) | Variable-length string value up to 65,535 characters | None |
LONGTEXT | VARCHAR(65535) | Variable-length string value up to 65,535 characters | None |
ENUM | VARCHAR(1020) | Variable-length string value up to 1,020 characters | None |
SET | VARCHAR(1020) | Variable-length string value up to 1,020 characters | None |
DATE | DATE | Calendar date (year, month, day) | None |
DATETIME | TIMESTAMP | Date and time (without time zone) | None |
TIMESTAMP(p) | TIMESTAMP | Date and time (without time zone) | None |
TIME | VARCHAR(18) | Variable-length string value up to 18 characters | None |
YEAR | VARCHAR(4) | Variable-length string value up to 4 characters | None |
JSON | SUPER | Semistructured data or documents as values | None |