
Authenticating with Amazon Redshift integration for Apache Spark

Using AWS Secrets Manager to retrieve credentials and connect to Amazon Redshift

The following code sample shows how to use AWS Secrets Manager to retrieve credentials and connect to an Amazon Redshift cluster through PySpark, the Python interface for Apache Spark.


import json

import boto3
from pyspark import SparkContext
from pyspark.sql import SQLContext

sc = SparkContext.getOrCreate()  # reuse the existing SparkContext
sql_context = SQLContext(sc)

secretsmanager_client = boto3.client('secretsmanager')
secret_manager_response = secretsmanager_client.get_secret_value(
    SecretId='string',
    VersionId='string',
    VersionStage='string'
)
# Assumption: the secret is stored as a JSON string with "username" and
# "password" keys; change the key names to match your own secret.
secret = json.loads(secret_manager_response['SecretString'])
username = secret['username']
password = secret['password']
url = "jdbc:redshift://redshifthost:5439/database?user=" + username + "&password=" + password

# Read data from a table
df = sql_context.read \
    .format("io.github.spark_redshift_community.spark.redshift") \
    .option("url", url) \
    .option("dbtable", "my_table") \
    .option("tempdir", "s3://path/for/temp/data") \
    .load()
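
Concatenating the user name and password into the JDBC URL, as above, can break when the password contains characters such as `&` or `=`. The sketch below is one possible alternative, not the documented method: it assumes the secret is a JSON object with `username` and `password` keys, and that your connector version accepts separate `user` and `password` options (the spark-redshift community connector documents them). The function name `connection_options` is hypothetical.

```python
import json


def connection_options(secret_string, host, port, database):
    """Turn a Secrets Manager SecretString into connector options.

    Assumes the secret is JSON with "username" and "password" keys.
    The credentials are kept out of the URL so that special characters
    in the password cannot corrupt the query string.
    """
    secret = json.loads(secret_string)
    return {
        "url": f"jdbc:redshift://{host}:{port}/{database}",
        "user": secret["username"],
        "password": secret["password"],
    }


# Example with a made-up secret payload (not real credentials).
example = connection_options(
    '{"username": "spark_user", "password": "p&ss=word"}',
    "redshifthost", 5439, "database",
)
```

The resulting dictionary can be passed to the reader with `sql_context.read.format(...).options(**example)` in place of the single `url` option.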

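Using IAM to retrieve credentials and connect to Amazon Redshift

You can use the Amazon Redshift-provided JDBC driver version 2 to connect to Amazon Redshift with the Spark connector. To use AWS Identity and Access Management (IAM), configure your JDBC URL to use IAM authentication. To connect to a Redshift cluster from Amazon EMR, you must give your IAM role permission to retrieve temporary IAM credentials. Assign the following permissions to your IAM role so that it can retrieve credentials and run Amazon S3 operations.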

Redshift:GetClusterCredentials (for provisioned Amazon Redshift clusters)
Redshift:DescribeClusters (for provisioned Amazon Redshift clusters)
Redshift:GetWorkgroup (for Amazon Redshift Serverless workgroups)
Redshift:GetCredentials (for Amazon Redshift Serverless workgroups)
s3:ListBucket
s3:GetBucketLocation
s3:GetObject
s3:PutObject
s3:GetBucketLifecycleConfiguration

For more information about GetClusterCredentials, see Resource policies for GetClusterCredentials.

You must also make sure that Amazon Redshift can assume the IAM role during COPY and UNLOAD operations.
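
One way to check that a role can retrieve temporary credentials for a provisioned cluster is to call GetClusterCredentials directly through boto3. The following is a minimal sketch under stated assumptions: the function name `fetch_temporary_credentials` and the argument values in the usage comment are hypothetical, and the client is passed in so that the function can be exercised without AWS access.

```python
def fetch_temporary_credentials(redshift_client, cluster_id, db_user, db_name,
                                duration_seconds=900):
    """Request temporary database credentials for a provisioned cluster.

    redshift_client is expected to be boto3.client("redshift") running
    under the IAM role that holds the GetClusterCredentials permission.
    """
    response = redshift_client.get_cluster_credentials(
        ClusterIdentifier=cluster_id,
        DbUser=db_user,
        DbName=db_name,
        DurationSeconds=duration_seconds,
    )
    # The response carries the temporary user name, password, and expiry.
    return response["DbUser"], response["DbPassword"], response["Expiration"]


# Usage (placeholder values, requires AWS credentials):
#   import boto3
#   user, password, expires = fetch_temporary_credentials(
#       boto3.client("redshift"), "my-cluster", "spark_user", "dev")
```

If the call fails with an access-denied error, the role is likely missing one of the permissions listed above.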


{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "Service": "redshift.amazonaws.com"
            },
            "Action": "sts:AssumeRole"
        }
    ]
}
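
As an illustration of how this trust policy might be attached when creating the role, the sketch below passes it to the IAM `create_role` call in boto3. The helper names and the role name in the usage comment are hypothetical, and the role still needs its own permissions policy for Amazon S3 access.

```python
import json


def redshift_trust_policy():
    """Return the trust policy that lets Amazon Redshift assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "redshift.amazonaws.com"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def create_redshift_role(iam_client, role_name):
    """Create an IAM role whose trust policy allows Amazon Redshift."""
    # create_role expects the policy document as a JSON string.
    return iam_client.create_role(
        RoleName=role_name,
        AssumeRolePolicyDocument=json.dumps(redshift_trust_policy()),
    )


# Usage (placeholder role name, requires AWS credentials):
#   import boto3
#   create_redshift_role(boto3.client("iam"), "role-name")
```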

The following example uses IAM authentication between Spark and Amazon Redshift.


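from pyspark import SparkContext
from pyspark.sql import SQLContext

sc = SparkContext.getOrCreate()  # reuse the existing SparkContext
sql_context = SQLContext(sc)

# IAM authentication is selected by the "jdbc:redshift:iam://" prefix
url = "jdbc:redshift:iam://redshift-host:redshift-port/db-name"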
iam_role_arn = "arn:aws:iam::account-id:role/role-name"

# Read data from a table
df = sql_context.read \
    .format("io.github.spark_redshift_community.spark.redshift") \
    .option("url", url) \
    .option("aws_iam_role", iam_role_arn) \
    .option("dbtable", "my_table") \
    .option("tempdir", "s3a://path/for/temp/data") \
    .load()
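
The IAM URL prefix is easy to mistype, so it can help to build the reader options in one place. The following is a rough sketch rather than part of the connector: the function name `iam_reader_options` is hypothetical, and the ARN check is only a basic sanity test, not full validation.

```python
def iam_reader_options(host, port, database, role_arn, table, tempdir):
    """Build connector options for IAM authentication.

    The URL uses the "jdbc:redshift:iam://" prefix, which tells the
    Redshift JDBC driver to authenticate with IAM.
    """
    if not role_arn.startswith("arn:"):
        raise ValueError("role_arn does not look like an ARN: " + role_arn)
    return {
        "url": f"jdbc:redshift:iam://{host}:{port}/{database}",
        "aws_iam_role": role_arn,
        "dbtable": table,
        "tempdir": tempdir,
    }


# Placeholder values taken from the example above.
options = iam_reader_options(
    "redshift-host", 5439, "db-name",
    "arn:aws:iam::account-id:role/role-name",
    "my_table", "s3a://path/for/temp/data",
)
```

The dictionary can then be applied with `sql_context.read.format("io.github.spark_redshift_community.spark.redshift").options(**options).load()`.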
