Running PySpark jobs



As the member who can query, you can run a PySpark job on a configured table by using an approved PySpark analysis template.

Prerequisites

Before you run a PySpark job, you must have:

An active membership in an AWS Clean Rooms collaboration
Access to at least one analysis template in the collaboration
Access to at least one configured table in the collaboration
Permissions to write the results of a PySpark job to a specified S3 bucket. For information about creating the required service role, see Create a service role to write results of a PySpark job.
Confirmation that the member who is responsible for paying compute costs has joined the collaboration as an active member

For information about how to run a PySpark job by calling the AWS Clean Rooms StartProtectedJob API operation directly or by using the AWS SDKs, see the AWS Clean Rooms API Reference.
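
As a rough illustration of the SDK route, the following sketch uses the AWS SDK for Python (boto3) to call StartProtectedJob. The request field names reflect our understanding of the API shape and should be checked against the AWS Clean Rooms API Reference before use. The helper names build_start_protected_job_request and start_pyspark_job, and all identifiers passed to them, are hypothetical placeholders.

```python
from typing import Any, Dict


def build_start_protected_job_request(
    membership_id: str,
    analysis_template_arn: str,
    result_account_id: str,
) -> Dict[str, Any]:
    """Assemble keyword arguments for StartProtectedJob.

    The field names below are assumed from the API reference; verify them
    against the current AWS Clean Rooms API Reference.
    """
    return {
        "type": "PYSPARK",
        "membershipIdentifier": membership_id,
        # The approved PySpark analysis template that the job runs.
        "jobParameters": {"analysisTemplateArn": analysis_template_arn},
        # The member account that receives the job results.
        "resultConfiguration": {
            "outputConfiguration": {"member": {"accountId": result_account_id}}
        },
    }


def start_pyspark_job(request: Dict[str, Any]) -> Dict[str, Any]:
    """Submit the job. Requires boto3 and valid AWS credentials."""
    # Imported here so the request builder works without boto3 installed.
    import boto3

    client = boto3.client("cleanrooms")
    return client.start_protected_job(**request)


if __name__ == "__main__":
    # Placeholder identifiers for illustration only.
    request = build_start_protected_job_request(
        membership_id="example-membership-id",
        analysis_template_arn="arn:aws:cleanrooms:us-east-1:111122223333:membership/example/analysistemplate/example",
        result_account_id="111122223333",
    )
    print(request)
```

Building the request separately from the network call makes it possible to review or log the parameters before a job is submitted. Calling start_pyspark_job(request) would submit the job, provided the prerequisites above are met.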

For information about job logging, see Analysis logging in AWS Clean Rooms.

For information about receiving job results, see Receiving and using analysis results.

The following topics explain how to run a PySpark job on a configured table in a collaboration using the AWS Clean Rooms console.

Topics



Run PySpark job using an analysis template
