Connecting to data

An AWS Glue connection is a Data Catalog object that stores login credentials, URI strings, virtual private cloud (VPC) information, and more for a particular data store. AWS Glue crawlers, jobs, and development endpoints use connections to access certain types of data stores. You can use connections for both sources and targets, and reuse the same connection across multiple crawlers or extract, transform, and load (ETL) jobs.

The latest version of the AWS Glue connections schema provides a unified way to manage data connections across AWS services and applications, such as AWS Glue, Amazon Athena, and Amazon SageMaker AI Unified Studio.

Overview of using connectors and connections

A connection contains the properties that are required to connect to a particular data store. When you create a connection, it is stored in the AWS Glue Data Catalog. You choose a connector, and then create a connection based on that connector.
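As a rough illustration of what "the properties that are required to connect" can look like, the following sketch builds the input for a JDBC connection and passes it to the boto3 Glue client's create_connection call. The helper names, the connection name, the JDBC URL, the secret name, and the subnet and security group IDs are all hypothetical placeholders, and the sketch assumes the credentials are kept in AWS Secrets Manager rather than stored inline.

```python
from typing import Any, Dict, List, Optional


def build_connection_input(
    name: str,
    jdbc_url: str,
    secret_id: str,
    subnet_id: Optional[str] = None,
    security_group_ids: Optional[List[str]] = None,
) -> Dict[str, Any]:
    """Build the ConnectionInput structure for a JDBC connection (hypothetical helper)."""
    connection_input: Dict[str, Any] = {
        "Name": name,
        "ConnectionType": "JDBC",
        "ConnectionProperties": {
            "JDBC_CONNECTION_URL": jdbc_url,
            # Assumption: credentials live in a Secrets Manager secret.
            "SECRET_ID": secret_id,
        },
    }
    # VPC information is optional; include it only when the data store is in a VPC.
    if subnet_id is not None:
        connection_input["PhysicalConnectionRequirements"] = {
            "SubnetId": subnet_id,
            "SecurityGroupIdList": list(security_group_ids or []),
        }
    return connection_input


def create_connection(connection_input: Dict[str, Any], region_name: str) -> None:
    """Store the connection in the AWS Glue Data Catalog (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the builder above works without boto3 installed

    glue = boto3.client("glue", region_name=region_name)
    glue.create_connection(ConnectionInput=connection_input)


# Placeholder values for illustration only.
example_input = build_connection_input(
    name="example-postgres-connection",
    jdbc_url="jdbc:postgresql://db.example.internal:5432/sales",
    secret_id="example/postgres/credentials",
    subnet_id="subnet-0123456789abcdef0",
    security_group_ids=["sg-0123456789abcdef0"],
)
```

Calling create_connection(example_input, region_name="us-east-1") would then register the connection, assuming valid AWS credentials and real resource identifiers. Connections based on custom or AWS Marketplace connectors are created through AWS Glue Studio and may need different properties than this sketch shows.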

You can subscribe to connectors for non-natively supported data stores in AWS Marketplace, and then use those connectors when you're creating connections. Developers can also create their own connectors, and you can use them when creating connections.

Note

Connections created using custom or AWS Marketplace connectors in AWS Glue Studio appear in the AWS Glue console with type set to UNKNOWN.

The following steps describe the overall process of using connectors in AWS Glue Studio:

1. Subscribe to a connector in AWS Marketplace, or develop your own connector and upload it to AWS Glue Studio. For more information, see Adding connectors to AWS Glue Studio.
2. Review the connector usage information. You can find this information on the Usage tab of the connector product page. For example, the Usage tab on the AWS Glue Connector for Google BigQuery product page includes, in its Additional Resources section, a link to a blog post about using this connector.
3. Create a connection. You choose which connector to use and provide additional information for the connection, such as login credentials, URI strings, and virtual private cloud (VPC) information. For more information, see Creating connections for connectors.
4. Create an IAM role for your job. The job assumes the permissions of the IAM role that you specify when you create the job. This IAM role must have the necessary permissions to authenticate with, extract data from, and write data to your data stores.
5. Create an ETL job and configure the data source properties for your ETL job. Provide the connection options and authentication information as instructed by the custom connector provider. For more information, see Authoring jobs with custom connectors.
6. Customize your ETL job by adding transforms or additional data stores, as described in Starting visual ETL jobs in AWS Glue Studio.
7. If you use a connector for the data target, configure the data target properties for your ETL job. Provide the connection options and authentication information as instructed by the custom connector provider. For more information, see Authoring jobs with custom connectors.
8. Customize the job run environment by configuring job properties, as described in Modify the job properties.
9. Run the job.
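
The steps above are described for the AWS Glue Studio console. As a hedged sketch of how the IAM role, job creation, and job run steps might look through the boto3 API instead, the following example builds a trust policy that lets AWS Glue assume the job role, builds a job definition that references an existing connection by name, and then creates and starts the job. The helper names, job name, role ARN, script location, connection name, and Glue version are hypothetical placeholders, not values from this guide.

```python
import json
from typing import Any, Dict


def glue_trust_policy() -> str:
    """Trust policy document that allows the AWS Glue service to assume the job role."""
    return json.dumps(
        {
            "Version": "2012-10-17",
            "Statement": [
                {
                    "Effect": "Allow",
                    "Principal": {"Service": "glue.amazonaws.com"},
                    "Action": "sts:AssumeRole",
                }
            ],
        }
    )


def build_job_definition(
    job_name: str, role_arn: str, script_location: str, connection_name: str
) -> Dict[str, Any]:
    """Build keyword arguments for the Glue create_job call (hypothetical helper)."""
    return {
        "Name": job_name,
        "Role": role_arn,
        "Command": {
            "Name": "glueetl",  # Spark ETL job
            "ScriptLocation": script_location,
            "PythonVersion": "3",
        },
        # The job can use any connection listed here.
        "Connections": {"Connections": [connection_name]},
        "GlueVersion": "4.0",  # assumption: adjust to the version you use
    }


def create_and_run_job(job_definition: Dict[str, Any], region_name: str) -> str:
    """Create the job and start one run (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the builders above work without boto3 installed

    glue = boto3.client("glue", region_name=region_name)
    glue.create_job(**job_definition)
    run = glue.start_job_run(JobName=job_definition["Name"])
    return run["JobRunId"]


# Placeholder values for illustration only.
example_job = build_job_definition(
    job_name="example-connector-job",
    role_arn="arn:aws:iam::123456789012:role/ExampleGlueJobRole",
    script_location="s3://example-bucket/scripts/example_job.py",
    connection_name="example-postgres-connection",
)
```

The trust policy only lets AWS Glue assume the role. The permissions to read from and write to your data stores still have to be attached to the role separately, as described in step 4.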
