Data sharing in Amazon Redshift

With Amazon Redshift, you can securely share data across Amazon Redshift clusters or with other AWS services. Data sharing lets you share live data, without having to create a copy or move it. Database administrators and data engineers can use data sharing to provide secure, read-only access to data for analytics purposes, while maintaining control over the data. Data analysts, business intelligence professionals, and data scientists can leverage shared data to gain insights without duplicating or moving data. Common use cases include sharing data with partners, enabling cross-functional analysis, and facilitating data democratization within an organization. The following sections cover the details of configuring and managing data sharing in Amazon Redshift.

With Amazon Redshift data sharing, you can securely share access to live data across Amazon Redshift clusters, workgroups, AWS accounts, and AWS Regions without manually moving or copying the data. Since the data is live, all users can see the most up-to-date and consistent information in Amazon Redshift as soon as it’s updated.

You can share data across provisioned clusters, serverless workgroups, Availability Zones, AWS accounts, and AWS Regions. You can share between different cluster types, and between provisioned clusters and Amazon Redshift Serverless.
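As a rough illustration of what sharing for reads involves, the following Python sketch assembles the SQL statements that a producer administrator and a consumer administrator might run. The helper names, the datashare name `salesshare`, the table `public.sales`, and the namespace placeholders are all hypothetical; only the SQL commands themselves (CREATE DATASHARE, ALTER DATASHARE, GRANT USAGE ON DATASHARE, and CREATE DATABASE ... FROM DATASHARE) are Amazon Redshift commands. The sketch only builds strings and doesn't connect to a cluster.

```python
def producer_statements(share, schema, tables, consumer_namespace):
    """Build the SQL a producer admin might run to publish tables for reads.

    All arguments are hypothetical example values. Identifiers are
    interpolated as-is, so pass only trusted, already-valid names.
    """
    statements = [
        f"CREATE DATASHARE {share};",
        f"ALTER DATASHARE {share} ADD SCHEMA {schema};",
    ]
    # Add each table to the datashare, qualified by its schema.
    statements += [
        f"ALTER DATASHARE {share} ADD TABLE {schema}.{table};" for table in tables
    ]
    # Let the consumer namespace see and use the datashare.
    statements.append(
        f"GRANT USAGE ON DATASHARE {share} TO NAMESPACE '{consumer_namespace}';"
    )
    return statements


def consumer_statements(share, local_db, producer_namespace):
    """Build the SQL a consumer admin might run to expose the datashare locally."""
    return [
        f"CREATE DATABASE {local_db} FROM DATASHARE {share} "
        f"OF NAMESPACE '{producer_namespace}';"
    ]


if __name__ == "__main__":
    # Placeholder namespace IDs; replace them with real namespace GUIDs.
    for sql in producer_statements(
        "salesshare", "public", ["sales"], "<consumer-namespace-id>"
    ):
        print(sql)
    for sql in consumer_statements(
        "salesshare", "sales_db", "<producer-namespace-id>"
    ):
        print(sql)
```

The statements still have to be run by a user with the required permissions on each side, and cross-account sharing involves additional authorization and association steps that this sketch doesn't cover.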

You can share database objects for both reads and writes across different Amazon Redshift clusters or Amazon Redshift Serverless workgroups within the same AWS account, or from one AWS account to another. You can write data across Regions as well. You can grant permissions such as SELECT, INSERT, and UPDATE for different tables and USAGE and CREATE for different schemas. The data is live and available to all warehouses as soon as a write transaction is committed.
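The permission model described above can be sketched as a small Python helper that builds GRANT statements targeting a datashare. This is a minimal sketch under stated assumptions: the function name, the datashare and object names, and the restriction to only the privileges named in this topic are illustrative choices, and the `GRANT ... TO DATASHARE` form should be checked against the GRANT reference for the track you're using.

```python
# Privileges named in this topic. Treating these as the complete allowed set
# is an assumption made for illustration only.
PRIVILEGES_BY_OBJECT = {
    "TABLE": {"SELECT", "INSERT", "UPDATE"},
    "SCHEMA": {"USAGE", "CREATE"},
}


def grant_to_datashare(share, object_type, object_name, privileges):
    """Return a GRANT statement that gives a datashare privileges on an object.

    object_type must be "TABLE" or "SCHEMA". Names are interpolated as-is,
    so pass only trusted identifiers.
    """
    object_type = object_type.upper()
    if object_type not in PRIVILEGES_BY_OBJECT:
        raise ValueError(f"unsupported object type: {object_type}")

    requested = [p.upper() for p in privileges]
    unknown = [p for p in requested if p not in PRIVILEGES_BY_OBJECT[object_type]]
    if unknown:
        raise ValueError(f"privileges not valid for {object_type}: {unknown}")

    return (
        f"GRANT {', '.join(requested)} ON {object_type} {object_name} "
        f"TO DATASHARE {share};"
    )


if __name__ == "__main__":
    # Hypothetical names: a schema "sales_schema" and a table inside it.
    print(grant_to_datashare("salesshare", "SCHEMA", "sales_schema", ["USAGE", "CREATE"]))
    print(grant_to_datashare("salesshare", "TABLE", "sales_schema.orders", ["SELECT", "INSERT"]))
```

Validating the privilege list before building the statement keeps a schema-level privilege such as CREATE from being requested on a table by mistake.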

For more information about configuring capabilities for data sharing in the PREVIEW_2023 track, see Sharing write access to data (Preview).

Note

Multi-warehouse writes through data sharing aren't currently available on ra3.xlplus clusters. To use this feature, create ra3.4xl clusters, ra3.16xl clusters, or Amazon Redshift Serverless workgroups.

Considerations when using data sharing in Amazon Redshift

The following are considerations for working with Amazon Redshift data sharing. For information on data sharing limitations, see Limitations for data sharing.

  • Cross-Region data sharing incurs additional cross-Region data-transfer charges. These charges apply only when the producer and consumer are in different Regions, not when they are in the same Region. For more information, see Managing cost control for cross-Region data sharing.

  • When you read data from a datashare, you remain connected to your local cluster database. For more information about setting up and reading from a database created from a datashare, see Querying datashare objects and Materialized views on external data lake tables in Amazon Redshift Spectrum.

  • The consumer is charged for all compute and cross-Region data transfer fees required to query the producer's data. The producer is charged for the underlying storage of data in their provisioned cluster or serverless namespace.

  • The performance of the queries on shared data depends on the compute capacity of the consumer clusters.
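The cost considerations in the list above can be summarized in a short Python sketch that reports which side is charged for what. The function and class names are hypothetical, and the sketch encodes only the rules stated in this topic; it doesn't compute prices.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Charges:
    """Charge categories borne by each side of a datashare."""
    producer: tuple
    consumer: tuple


def attribute_charges(producer_region, consumer_region):
    """Return who pays for what when a consumer queries a producer's data."""
    # The consumer always pays for the compute used by its own queries.
    consumer = ["query compute"]
    # Data-transfer charges apply only when the two Regions differ.
    if producer_region != consumer_region:
        consumer.append("cross-Region data transfer")
    # The producer pays for storing the shared data.
    return Charges(producer=("storage",), consumer=tuple(consumer))


if __name__ == "__main__":
    print(attribute_charges("us-east-1", "us-east-1"))
    print(attribute_charges("us-east-1", "eu-west-1"))
```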

Cluster encryption management for data sharing

To share data across AWS accounts, both the producer and consumer clusters must be encrypted.

In Amazon Redshift, you can turn on database encryption for your clusters to help protect data at rest. When you turn on encryption for a cluster, the data blocks and system metadata are encrypted for the cluster and its snapshots. You can turn on encryption when you launch your cluster, or you can modify an unencrypted cluster to use AWS Key Management Service (AWS KMS) encryption. For more information about Amazon Redshift database encryption, see Amazon Redshift database encryption in the Amazon Redshift Management Guide.

To protect data in transit, all data is encrypted in transit through the encryption scheme of the producer cluster. The consumer cluster adopts this encryption scheme when data is loaded. The consumer cluster then operates as a normal encrypted cluster. Communications between the producer and consumer are also encrypted using a shared key scheme. For more information about encryption in transit, see Encryption in transit.

Limitations for data sharing

The following are limitations when working with datashares in Amazon Redshift:

  • Data sharing is supported for all provisioned RA3 cluster types and Amazon Redshift Serverless. It isn't supported for other cluster types.

  • If both the producer and consumer clusters and serverless namespaces are in the same account, they must have the same encryption type (either both unencrypted, or both encrypted). In every other case, including Lake Formation managed datashares, both the consumer and producer must be encrypted. This is for security purposes. However, they don't need to share the same encryption key.

  • You can only share SQL UDFs through datashares. Python and Lambda UDFs aren't supported.

  • If the producer database has specific collation, use the same collation settings for the consumer database.

  • Amazon Redshift doesn't support nested SQL user-defined functions on producer clusters.

  • Amazon Redshift doesn't support sharing tables with interleaved sort keys and views that refer to tables with interleaved sort keys.

  • Consumers can't add datashare objects to another datashare. Additionally, consumers can't add views referencing datashare objects to another datashare.

  • Amazon Redshift doesn't support accessing a datashare object if a concurrent DDL operation on that object occurred between the Prepare and Execute phases of the access.

  • Amazon Redshift doesn't support sharing stored procedures through datashares.

  • Amazon Redshift doesn't support sharing metadata system views and system tables.
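Several of the limitations above are simple rules that can be checked before you set up a datashare. The following Python sketch encodes three of them: the encryption requirement, the node type requirement, and the UDF language restriction. All function and parameter names are hypothetical, and the sketch assumes that a Lake Formation managed datashare always falls under the "both encrypted" rule, even within a single account, which is one reading of the limitation above.

```python
def encryption_compatible(same_account, producer_encrypted, consumer_encrypted,
                          lake_formation_managed=False):
    """Check the encryption rule for a producer and consumer pair."""
    if same_account and not lake_formation_managed:
        # Same account: both unencrypted or both encrypted. The keys may differ.
        return producer_encrypted == consumer_encrypted
    # Every other case: both sides must be encrypted.
    return producer_encrypted and consumer_encrypted


def supports_data_sharing(node_type=None, serverless=False):
    """Return True for Amazon Redshift Serverless or a provisioned RA3 node type."""
    if serverless:
        return True
    return node_type is not None and node_type.lower().startswith("ra3.")


def udf_is_shareable(language):
    """Only SQL UDFs can be shared; Python and Lambda UDFs can't."""
    return language.lower() == "sql"


if __name__ == "__main__":
    print(encryption_compatible(True, False, False))   # same account, both unencrypted
    print(encryption_compatible(False, True, False))   # cross-account, consumer unencrypted
    print(supports_data_sharing(node_type="ra3.xlplus"))
    print(udf_is_shareable("python"))
```

Checks like these can catch a misconfiguration early, but they aren't a substitute for the errors that Amazon Redshift itself returns.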

Regions where data sharing is available

The following table lists availability for data-sharing capabilities.

Region | Same-region data sharing | Cross-region data sharing | AWS Lake Formation governed data shares
US East (N. Virginia) (us-east-1) | Yes | Yes | Yes
US East (Ohio) (us-east-2) | Yes | Yes | Yes
US West (N. California) (us-west-1) | Yes | Yes | Yes
US West (Oregon) (us-west-2) | Yes | Yes | Yes
Asia Pacific (Hong Kong) (ap-east-1) | Yes | No | No
Asia Pacific (Mumbai) (ap-south-1) | Yes | Yes | Yes
Asia Pacific (Hyderabad) (ap-south-2) | Yes | No | No
Asia Pacific (Tokyo) (ap-northeast-1) | Yes | Yes | Yes
Asia Pacific (Singapore) (ap-southeast-1) | Yes | Yes | Yes
Asia Pacific (Sydney) (ap-southeast-2) | Yes | Yes | Yes
Asia Pacific (Jakarta) (ap-southeast-3) | Yes | No | No
Asia Pacific (Melbourne) (ap-southeast-4) | Yes | No | No
Asia Pacific (Seoul) (ap-northeast-2) | Yes | Yes | Yes
Asia Pacific (Osaka) (ap-northeast-3) | Yes | No | No
China (Beijing) (cn-north-1) | Yes | No | No
China (Ningxia) (cn-northwest-1) | Yes | No | No
Africa (Cape Town) (af-south-1) | Yes | No | No
Canada West (Calgary) (ca-west-1) | Yes | No | No
Canada (Central) (ca-central-1) | Yes | Yes | Yes
Europe (Frankfurt) (eu-central-1) | Yes | Yes | Yes
Europe (Zurich) (eu-central-2) | Yes | No | No
Europe (Ireland) (eu-west-1) | Yes | Yes | Yes
Europe (London) (eu-west-2) | Yes | Yes | Yes
Europe (Paris) (eu-west-3) | Yes | Yes | Yes
Europe (Milan) (eu-south-1) | Yes | No | No
Europe (Spain) (eu-south-2) | Yes | No | No
Europe (Stockholm) (eu-north-1) | Yes | Yes | Yes
Middle East (UAE) (me-central-1) | Yes | No | No
Middle East (Bahrain) (me-south-1) | Yes | No | No
Israel (Tel Aviv) (il-central-1) | Yes | No | No
South America (São Paulo) (sa-east-1) | Yes | Yes | Yes
AWS GovCloud (US-East) (us-gov-east-1) | Yes | No | Yes
AWS GovCloud (US-West) (us-gov-west-1) | Yes | No | Yes

Regional availability for multi-warehouse writes for data sharing

In the PREVIEW_2023 track, data sharing has the capability for write operations and more granular sharing capabilities. For more information about how to configure these, see Sharing write access to data (Preview). For information about regions where preview capabilities are available, see Regions where data sharing is available (preview).