To connect to Google BigQuery from AWS Glue, you will need to create and store your Google Cloud Platform credentials in an AWS Secrets Manager secret, then associate that secret with a Google BigQuery AWS Glue connection.
To configure a connection to BigQuery:
In Google Cloud Platform, create and identify relevant resources:
Create or identify a GCP project containing BigQuery tables you would like to connect to.
Enable the BigQuery API. For more information, see Use the BigQuery Storage Read API to read table data.
In Google Cloud Platform, create and export service account credentials:
You can use the BigQuery credentials wizard to expedite this step: Create credentials.
To create a service account in GCP, follow the tutorial available in Create service accounts.
- When selecting a project, select the project containing your BigQuery table.
- When selecting GCP IAM roles for your service account, add or create a role that grants the permissions needed to run BigQuery jobs that read, write, or create BigQuery tables.
To create credentials for your service account, follow the tutorial available in Create a service account key.
- When selecting key type, select JSON.
You should now have downloaded a JSON file with credentials for your service account. It should look similar to the following:
{
  "type": "service_account",
  "project_id": "*****",
  "private_key_id": "*****",
  "private_key": "*****",
  "client_email": "*****",
  "client_id": "*****",
  "auth_uri": "https://accounts.google.com/o/oauth2/auth",
  "token_uri": "https://oauth2.googleapis.com/token",
  "auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
  "client_x509_cert_url": "*****",
  "universe_domain": "googleapis.com"
}
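Before encoding the file, it can help to confirm that the download has the expected shape. The following is a minimal sketch, not part of any AWS or Google tooling: the function name `check_service_account_key` is hypothetical, and the list of required fields is an assumption drawn from the sample above.

```python
import json
from typing import List

# Fields taken from the sample key above. Treating them as required is an
# assumption made for this sketch, not a rule stated by Google or AWS.
REQUIRED_FIELDS = (
    "type",
    "project_id",
    "private_key_id",
    "private_key",
    "client_email",
    "token_uri",
)


def check_service_account_key(raw_json: str) -> List[str]:
    """Return a list of problems found in the key file text (empty if none)."""
    try:
        key = json.loads(raw_json)
    except json.JSONDecodeError as err:
        return [f"not valid JSON: {err}"]
    if not isinstance(key, dict):
        return ["top-level JSON value is not an object"]

    # Report every expected field that is absent from the file.
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in key]
    # A service account key file identifies itself through its "type" field.
    if key.get("type") != "service_account":
        problems.append('"type" is not "service_account"')
    return problems
```

Pass the text of your downloaded file to `check_service_account_key`; an empty list means the sketch found nothing obviously wrong.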
Base64 encode your downloaded credentials file. On an AWS CloudShell session or similar, you can do this from the command line by running cat credentialsFile.json | base64 -w 0. Retain the output of this command, credentialString.
In AWS Secrets Manager, create a secret using your Google Cloud Platform credentials. To create a secret in Secrets Manager, follow the tutorial available in Create an AWS Secrets Manager secret in the AWS Secrets Manager documentation. After creating the secret, keep the Secret name, secretName, for the next step.
- When selecting Key/value pairs, create a pair for the key credentials with the value credentialString.
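The encoding step and the key/value pair can also be sketched in Python. This is a minimal sketch using only the standard library, and the function names `encode_credentials` and `build_secret_string` are hypothetical. `base64.b64encode` returns unwrapped output, which corresponds to what `base64 -w 0` prints. The JSON object built here mirrors the single key/value pair described above; creating the secret itself is left to the console or a tool of your choice.

```python
import base64
import json


def encode_credentials(key_file_bytes: bytes) -> str:
    """Base64 encode the key file contents as one unwrapped line (credentialString)."""
    return base64.b64encode(key_file_bytes).decode("ascii")


def build_secret_string(credential_string: str) -> str:
    """Build the key/value JSON for the secret: key "credentials", value credentialString."""
    return json.dumps({"credentials": credential_string})
```

To use it, read credentialsFile.json in binary mode, pass the bytes to `encode_credentials`, and pass the result to `build_secret_string`.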
In the AWS Glue Data Catalog, create a connection by following the steps in https://docs.aws.amazon.com/glue/latest/dg/console-connections.html. After creating the connection, keep the connection name, connectionName, for the next step.
- When selecting a Connection type, select Google BigQuery.
- When selecting an AWS Secret, provide secretName.
Grant the IAM role associated with your AWS Glue job permission to read secretName.
In your AWS Glue job configuration, provide connectionName as an Additional network connection.
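The last two steps can be sketched as plain data structures. This is a minimal sketch under stated assumptions: the helper names are hypothetical, `secretsmanager:GetSecretValue` is assumed to be the permission the job role needs to read the secret, the secret ARN is a placeholder you supply, and the `Connections` structure follows the shape used by the AWS Glue CreateJob API, which is assumed here to correspond to the console's Additional network connection setting.

```python
def secret_read_policy(secret_arn: str) -> dict:
    """IAM policy document allowing a role to read one Secrets Manager secret."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # Assumption: reading the secret value is the only access needed.
                "Action": ["secretsmanager:GetSecretValue"],
                "Resource": [secret_arn],
            }
        ],
    }


def job_connections(connection_name: str) -> dict:
    """Connections block for a Glue job definition that references connectionName."""
    return {"Connections": [connection_name]}
```

Scoping `Resource` to the single secret ARN, rather than a wildcard, keeps the job role limited to the credentials it actually uses.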