Enable requester pays Amazon S3 buckets in Athena for Spark
When an Amazon S3 bucket is configured as requester pays, the account of the user running the query is charged for data access and data transfer fees associated with the query. For more information, see Using Requester Pays buckets for storage transfers and usage in the Amazon S3 User Guide.
In Athena for Spark, requester pays buckets are enabled per session, not per workgroup. At a high level, enabling requester pays buckets includes the following steps:
-
In the Amazon S3 console, enable requester pays on the properties for the bucket and add a bucket policy to specify access.
-
In the IAM console, create an IAM policy to allow access to the bucket, and then attach the policy to the IAM role that will be used to access the requester pays bucket.
-
In Athena for Spark, add a session property to enable the requester pays feature.
Step 1: Enable requester pays on an Amazon S3 bucket and add a bucket policy
To enable requester pays on an Amazon S3 bucket
Open the Amazon S3 console at https://console.aws.amazon.com/s3/
. -
In the list of buckets, choose the link for the bucket that you want to enable requester pays for.
-
On the bucket page, choose the Properties tab.
-
Scroll down to the Requester pays section, and then choose Edit.
-
On the Edit requester pays page, choose Enable, and then choose Save changes.
-
Choose the Permissions tab.
-
In the Bucket policy section, choose Edit.
-
On the Edit bucket policy page, apply the bucket policy that you want to the source bucket. The following example policy gives access to all AWS principals (
"AWS": "*"
), but your access can be more granular. For example, you might want to specify only a specific IAM role in another account.{ "Version": "2012-10-17", "Statement": [ { "Sid": "Statement1", "Effect": "Allow", "Principal": { "AWS": "*" }, "Action": "s3:*", "Resource": [ "arn:aws:s3:::
account_number
-us-east-1-my-s3-requester-pays-bucket", "arn:aws:s3:::account_number
-us-east-1-my-s3-requester-pays-bucket/*" ] } ] }
Step 2: Create an IAM policy and attach it to an IAM role
Next, you create an IAM policy to allow access to the bucket. Then you attach the policy to the role that will be used to access the requester pays bucket.
To create an IAM policy for the requester pays bucket and attach the policy to a role
Open the IAM console at https://console.aws.amazon.com/iam/
. -
In the IAM console navigation pane, choose Policies.
-
Choose Create policy.
-
Choose JSON.
-
In the Policy editor, add a policy like the following:
{ "Version": "2012-10-17", "Statement": [ { "Action": [ "s3:*" ], "Effect": "Allow", "Resource": [ "arn:aws:s3:::
account_number
-us-east-1-my-s3-requester-pays-bucket", "arn:aws:s3:::account_number
-us-east-1-my-s3-requester-pays-bucket/*" ] } ] } -
Choose Next.
-
On the Review and create page, enter a name for the policy and an optional description, and then choose Create policy.
-
In the navigation pane, choose Roles.
-
On the Roles page, find the role that you want to use, and then choose the role name link.
-
In the Permissions policies section, choose Add permissions, Attach policies.
-
In the Other permissions policies section, select the check box for the policy that you created, and then choose Add permissions.
Step 3: Add an Athena for Spark session property
After you have configured the Amazon S3 bucket and associated permissions for requester pays, you can enable the feature in an Athena for Spark session.
To enable requester pays buckets in an Athena for Spark session
-
In the notebook editor, from the Session menu on the upper right, choose Edit session.
-
Expand Spark properties.
-
Choose Edit in JSON.
-
In the JSON text editor, enter the following:
{ "spark.hadoop.fs.s3.useRequesterPaysHeader":"true" }
-
Choose Save.