Create and manage Amazon EMR Serverless applications with Step Functions

Learn how to use Step Functions to create, start, stop, and delete applications on EMR Serverless. This page lists the supported APIs and provides example Task states for common use cases.

To learn about integrating with AWS services in Step Functions, see Integrating services and Passing parameters to a service API in Step Functions.

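Key features of optimized EMR Serverless integration
  • The optimized EMR Serverless service integration has a customized set of APIs that wrap the underlying EMR Serverless APIs. Because of this customization, the optimized EMR Serverless integration differs from the AWS SDK service integration.

  • In addition, the optimized EMR Serverless integration supports the Run a Job (.sync) integration pattern.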

  • The Wait for a Callback with Task Token integration pattern is not supported.
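
In the example Task states on this page, the integration pattern is selected by the suffix of the Resource ARN: Run a Job appends .sync to the API name, and Request Response uses the bare API name. The following Python sketch builds either form. The helper name emr_serverless_resource is hypothetical and is not part of any AWS SDK.

```python
def emr_serverless_resource(api: str, sync: bool = False) -> str:
    """Return the Task state Resource ARN for an EMR Serverless integration API.

    Hypothetical helper for illustration; not part of any AWS SDK.
    """
    # Request Response uses the bare API name; Run a Job (.sync) appends ".sync".
    suffix = ".sync" if sync else ""
    return f"arn:aws:states:::emr-serverless:{api}{suffix}"


print(emr_serverless_resource("startJobRun", sync=True))
print(emr_serverless_resource("createApplication"))
```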

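EMR Serverless service integration APIs

To integrate AWS Step Functions with EMR Serverless, you can use the following six EMR Serverless service integration APIs. These service integration APIs are similar to the corresponding EMR Serverless APIs, with some differences in the fields that are passed and in the responses that are returned.

The following table describes the differences between each EMR Serverless service integration API and its corresponding EMR Serverless API.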

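createApplication
  Description: Creates an application. EMR Serverless is linked to a unique type of IAM role known as a service-linked role. For createApplication and createApplication.sync to work, you must have configured the necessary permissions to create the service-linked role AWSServiceRoleForAmazonEMRServerless. For more information, including a statement that you can add to your IAM permissions policy, see Using service-linked roles for EMR Serverless.
  Corresponding EMR Serverless API: CreateApplication
  Differences: None.

createApplication.sync
  Description: Creates an application.
  Corresponding EMR Serverless API: CreateApplication
  Differences: The requests and responses of the EMR Serverless API and the EMR Serverless service integration API are the same. However, createApplication.sync waits for the application to reach the CREATED state.

startApplication
  Description: Starts the specified application and initializes the initial capacity of the application, if configured.
  Corresponding EMR Serverless API: StartApplication
  Differences: The EMR Serverless API response doesn't contain any data, but the EMR Serverless service integration API response includes the following data: { "ApplicationId": "string" }

startApplication.sync
  Description: Starts the specified application and initializes the initial capacity, if configured.
  Corresponding EMR Serverless API: StartApplication
  Differences: The EMR Serverless API response doesn't contain any data, but the EMR Serverless service integration API response includes the following data: { "ApplicationId": "string" }
  In addition, startApplication.sync waits for the application to reach the STARTED state.

stopApplication
  Description: Stops the specified application and releases the initial capacity, if configured. All scheduled and running jobs must be completed or cancelled before you stop an application.
  Corresponding EMR Serverless API: StopApplication
  Differences: The EMR Serverless API response doesn't contain any data, but the EMR Serverless service integration API response includes the following data: { "ApplicationId": "string" }

stopApplication.sync
  Description: Stops the specified application and releases the initial capacity, if configured. All scheduled and running jobs must be completed or cancelled before you stop an application.
  Corresponding EMR Serverless API: StopApplication
  Differences: The EMR Serverless API response doesn't contain any data, but the EMR Serverless service integration API response includes the following data: { "ApplicationId": "string" }
  In addition, stopApplication.sync waits for the application to reach the STOPPED state.

deleteApplication
  Description: Deletes an application. An application must be in the STOPPED or CREATED state to be deleted.
  Corresponding EMR Serverless API: DeleteApplication
  Differences: The EMR Serverless API response doesn't contain any data, but the EMR Serverless service integration API response includes the following data: { "ApplicationId": "string" }

deleteApplication.sync
  Description: Deletes an application. An application must be in the STOPPED or CREATED state to be deleted.
  Corresponding EMR Serverless API: DeleteApplication
  Differences: The EMR Serverless API response doesn't contain any data, but the EMR Serverless service integration API response includes the following data: { "ApplicationId": "string" }
  In addition, deleteApplication.sync waits for the application to reach the TERMINATED state.

startJobRun
  Description: Starts a job run.
  Corresponding EMR Serverless API: StartJobRun
  Differences: None.

startJobRun.sync
  Description: Starts a job run.
  Corresponding EMR Serverless API: StartJobRun
  Differences: The requests and responses of the EMR Serverless API and the EMR Serverless service integration API are the same. However, startJobRun.sync waits for the job run to reach the SUCCESS state.

cancelJobRun
  Description: Cancels a job run.
  Corresponding EMR Serverless API: CancelJobRun
  Differences: None.

cancelJobRun.sync
  Description: Cancels a job run.
  Corresponding EMR Serverless API: CancelJobRun
  Differences: The requests and responses of the EMR Serverless API and the EMR Serverless service integration API are the same. However, cancelJobRun.sync waits for the job run to reach the CANCELLED state.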
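
The status that each .sync API waits for, as listed in the table above, can be collected into a small lookup. The following Python sketch is only a summary of that table. The names SYNC_TARGET_STATUS and awaited_status are hypothetical.

```python
# Status that each .sync service integration API waits for, taken from the table above.
SYNC_TARGET_STATUS = {
    "createApplication.sync": "CREATED",
    "startApplication.sync": "STARTED",
    "stopApplication.sync": "STOPPED",
    "deleteApplication.sync": "TERMINATED",
    "startJobRun.sync": "SUCCESS",
    "cancelJobRun.sync": "CANCELLED",
}


def awaited_status(api: str):
    """Return the status a .sync API waits for, or None for a Request Response API."""
    return SYNC_TARGET_STATUS.get(api)


print(awaited_status("startJobRun.sync"))
print(awaited_status("startJobRun"))
```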

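EMR Serverless integration use cases

For the optimized EMR Serverless service integration, we recommend that you create a single application, and then use that application to run multiple jobs. For example, in a single state machine, you can include multiple startJobRun requests, all of which use the same application. The following Task state examples show use cases for integrating EMR Serverless APIs with Step Functions. For information about other use cases of EMR Serverless, see What is Amazon EMR Serverless.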
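
As a rough illustration of this recommendation, the following Python sketch builds a state machine definition with one startJobRun.sync Task state per job, all of which share a single application ID. The function name multi_job_state_machine, the state names, and the script paths first.py and second.py are assumptions made for this example.

```python
import json


def multi_job_state_machine(application_id, execution_role_arn, entry_points):
    """Build a definition that runs one startJobRun.sync Task per entry point,
    all against the same EMR Serverless application (hypothetical helper)."""
    if not entry_points:
        raise ValueError("at least one entry point is required")
    names = [f"Start_Job_{i + 1}" for i in range(len(entry_points))]
    states = {}
    for i, (name, entry_point) in enumerate(zip(names, entry_points)):
        state = {
            "Type": "Task",
            "Resource": "arn:aws:states:::emr-serverless:startJobRun.sync",
            "Parameters": {
                # Every job reuses the same application.
                "ApplicationId": application_id,
                "ExecutionRoleArn": execution_role_arn,
                "JobDriver": {"SparkSubmit": {"EntryPoint": entry_point}},
            },
        }
        # Chain the states in order; the last one ends the execution.
        if i + 1 < len(names):
            state["Next"] = names[i + 1]
        else:
            state["End"] = True
        states[name] = state
    return {"StartAt": names[0], "States": states}


definition = multi_job_state_machine(
    "yourApplicationId",
    "arn:aws:iam::123456789012:role/myEMRServerless-execution-role",
    # Hypothetical script locations.
    ["s3://amzn-s3-demo-bucket/first.py", "s3://amzn-s3-demo-bucket/second.py"],
)
print(json.dumps(definition, indent=2))
```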

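Tip

To deploy an example of a state machine that integrates with EMR Serverless to run multiple jobs in your AWS account, see Run an EMR Serverless job.

To learn about configuring IAM permissions when using Step Functions with other AWS services, see How Step Functions generates IAM policies for integrated services.

In the examples shown in the following use cases, replace the placeholder text with your resource-specific information. For example, replace yourApplicationId with the ID of your EMR Serverless application, such as 00yv7iv71inak893.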

Create an application

The following Task state example creates an application by using the createApplication.sync service integration API.

"Create_Application": { "Type": "Task", "Resource": "arn:aws:states:::emr-serverless:createApplication.sync", "Parameters": { "Name": "MyApplication", "ReleaseLabel": "emr-6.9.0", "Type": "SPARK" }, "End": true }

Start an application

The following Task state example starts an application by using the startApplication.sync service integration API.

"Start_Application": { "Type": "Task", "Resource": "arn:aws:states:::emr-serverless:startApplication.sync", "Parameters": { "ApplicationId": "yourApplicationId" }, "End": true }

Stop an application

The following Task state example stops an application by using the stopApplication.sync service integration API.

"Stop_Application": { "Type": "Task", "Resource": "arn:aws:states:::emr-serverless:stopApplication.sync", "Parameters": { "ApplicationId": "yourApplicationId" }, "End": true }

Delete an application

The following Task state example deletes an application by using the deleteApplication.sync service integration API.

"Delete_Application": { "Type": "Task", "Resource": "arn:aws:states:::emr-serverless:deleteApplication.sync", "Parameters": { "ApplicationId": "yourApplicationId" }, "End": true }

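Because an application must be in the STOPPED or CREATED state before it can be deleted, a teardown workflow would typically stop the application first and then delete it. The following Python sketch chains the two Task states shown above. The function name teardown_definition is hypothetical.

```python
import json


def teardown_definition(application_id):
    """Build a definition that stops and then deletes an application (hypothetical helper)."""

    def task(api, **flow):
        # Both steps use the .sync pattern so that each one waits for its target state.
        return {
            "Type": "Task",
            "Resource": f"arn:aws:states:::emr-serverless:{api}.sync",
            "Parameters": {"ApplicationId": application_id},
            **flow,
        }

    return {
        "StartAt": "Stop_Application",
        "States": {
            "Stop_Application": task("stopApplication", Next="Delete_Application"),
            "Delete_Application": task("deleteApplication", End=True),
        },
    }


teardown = teardown_definition("yourApplicationId")
print(json.dumps(teardown, indent=2))
```
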
Start a job in an application

The following Task state example starts a job in an application by using the startJobRun.sync service integration API.

"Start_Job": { "Type": "Task", "Resource": "arn:aws:states:::emr-serverless:startJobRun.sync", "Parameters": { "ApplicationId": "yourApplicationId", "ExecutionRoleArn": "arn:aws:iam::123456789012:role/myEMRServerless-execution-role", "JobDriver": { "SparkSubmit": { "EntryPoint": "s3://<amzn-s3-demo-bucket>/sample.py", "EntryPointArguments": ["1"], "SparkSubmitParameters": "--conf spark.executor.cores=4 --conf spark.executor.memory=4g --conf spark.driver.cores=2 --conf spark.driver.memory=4g --conf spark.executor.instances=1" } } }, "End": true }

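The SparkSubmitParameters value in the preceding example is a single string of --conf key=value pairs. The following Python sketch assembles such a string from a dictionary of Spark properties. The helper name spark_submit_parameters is hypothetical, and the sketch handles only --conf options.

```python
def spark_submit_parameters(conf):
    """Join Spark properties into a SparkSubmitParameters string (hypothetical helper)."""
    # Dictionaries preserve insertion order, so the options appear in the order given.
    return " ".join(f"--conf {key}={value}" for key, value in conf.items())


params = spark_submit_parameters({
    "spark.executor.cores": 4,
    "spark.executor.memory": "4g",
    "spark.driver.cores": 2,
    "spark.driver.memory": "4g",
    "spark.executor.instances": 1,
})
print(params)
```
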
Cancel a job in an application

The following Task state example cancels a job in an application by using the cancelJobRun.sync service integration API.

"Cancel_Job": { "Type": "Task", "Resource": "arn:aws:states:::emr-serverless:cancelJobRun.sync", "Parameters": { "ApplicationId.$": "$.ApplicationId", "JobRunId.$": "$.JobRunId" }, "End": true }

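In the preceding Cancel_Job state, parameter names that end in .$ take their values from the state input at the given path. The following Python sketch is a simplified emulation of that substitution. It handles only top-level paths of the form $.Name, and the function name resolve_parameters and the job run ID are hypothetical.

```python
def resolve_parameters(parameters, state_input):
    """Resolve keys ending in ".$" against the state input.

    Simplified illustration: only top-level paths such as "$.Name" are handled.
    """
    resolved = {}
    for key, value in parameters.items():
        if key.endswith(".$"):
            if not (isinstance(value, str) and value.startswith("$.")):
                raise ValueError(f"unsupported path: {value!r}")
            # Drop the ".$" suffix from the key and look up the field named by the path.
            resolved[key[:-2]] = state_input[value[2:]]
        else:
            resolved[key] = value
    return resolved


cancel_parameters = {"ApplicationId.$": "$.ApplicationId", "JobRunId.$": "$.JobRunId"}
# Hypothetical state input; the job run ID is a made-up placeholder.
state_input = {"ApplicationId": "00yv7iv71inak893", "JobRunId": "yourJobRunId"}
print(resolve_parameters(cancel_parameters, state_input))
```
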
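IAM policies for calling Amazon EMR Serverless

When you create a state machine using the console, Step Functions automatically creates an execution role for your state machine with the least privileges required. These automatically generated IAM roles are valid for the AWS Region in which you create the state machine.

The following example templates show how AWS Step Functions generates IAM policies based on the resources in your state machine definition. For more information, see How Step Functions generates IAM policies for integrated services and Discovering service integration patterns in Step Functions.

We recommend that you do not include wildcards in the IAM policies that you create. As a security best practice, scope your policies down as much as possible. Use dynamic policies only when certain input parameters are not known until runtime.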
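
One way to act on this recommendation is to scan a policy document for wildcard resources before you attach it. The following Python sketch does only that. The function name wildcard_resources is hypothetical, and the check is a plain substring test rather than a full policy analysis.

```python
def wildcard_resources(policy):
    """Return every Resource entry in a policy document that contains "*"."""
    found = []
    for statement in policy.get("Statement", []):
        resources = statement.get("Resource", [])
        # Resource can be a single string or a list of strings.
        if isinstance(resources, str):
            resources = [resources]
        found.extend(resource for resource in resources if "*" in resource)
    return found


sample_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["emr-serverless:StartApplication"],
            "Resource": ["arn:aws:emr-serverless:us-east-1:123456789012:/applications/*"],
        },
        {
            "Effect": "Allow",
            "Action": ["emr-serverless:StopApplication"],
            "Resource": "arn:aws:emr-serverless:us-east-1:123456789012:/applications/00yv7iv71inak893",
        },
    ],
}
print(wildcard_resources(sample_policy))
```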

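In addition, administrator users should be careful when granting non-administrator users execution roles for running state machines. If you create policies on your own, we recommend that you include passRole policies in the execution roles. We also recommend that you add the aws:SourceARN and aws:SourceAccount context keys in the execution roles.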
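
As a sketch of how these context keys might be applied, the following Python code builds a trust policy for an execution role that only state machines in one account and Region can assume. This assumes that the keys are placed in the role's trust policy. The function name and the Region and account values are hypothetical, and the key is spelled aws:SourceArn here because IAM condition key names are not case sensitive.

```python
import json


def execution_role_trust_policy(region, account_id):
    """Build a trust policy restricted by source ARN and source account (hypothetical helper)."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "states.amazonaws.com"},
                "Action": "sts:AssumeRole",
                "Condition": {
                    # Only state machines in this Region and account can assume the role.
                    "ArnLike": {
                        "aws:SourceArn": f"arn:aws:states:{region}:{account_id}:stateMachine:*"
                    },
                    "StringEquals": {"aws:SourceAccount": account_id},
                },
            }
        ],
    }


trust_policy = execution_role_trust_policy("us-east-1", "123456789012")
print(json.dumps(trust_policy, indent=2))
```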

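IAM policy examples for EMR Serverless integration with Step Functions

IAM policy example for CreateApplication

The following is an IAM policy example for a state machine with a CreateApplication Task state.

Note

You need to specify the CreateServiceLinkedRole permission in your IAM policies when you create the first ever application in your account. After that, you do not need to add this permission. For information about CreateServiceLinkedRole, see CreateServiceLinkedRole in the IAM API Reference at https://docs.aws.amazon.com/IAM/latest/APIReference/.

The static and dynamic resources for the following policies are the same.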

Run a Job (.sync)
{ "Version": "2012-10-17", "Statement": [ { "Effect": "Allow", "Action": [ "emr-serverless:CreateApplication" ], "Resource": [ "arn:aws:emr-serverless:{{region}}:{{accountId}}:/*" ] }, { "Effect": "Allow", "Action": [ "emr-serverless:GetApplication", "emr-serverless:DeleteApplication" ], "Resource": [ "arn:aws:emr-serverless:{{region}}:{{accountId}}:/applications/*" ] }, { "Effect": "Allow", "Action": [ "events:PutTargets", "events:PutRule", "events:DescribeRule" ], "Resource": [ "arn:aws:events:{{region}}:{{accountId}}:rule/StepFunctionsGetEventsForEMRServerlessApplicationRule" ] }, { "Effect": "Allow", "Action": "iam:CreateServiceLinkedRole", "Resource": "arn:aws:iam::{{accountId}}:role/aws-service-role/ops.emr-serverless.amazonaws.com/AWSServiceRoleForAmazonEMRServerless*", "Condition": { "StringLike": { "iam:AWSServiceName": "ops.emr-serverless.amazonaws.com" } } } ] }
Request Response
{ "Version": "2012-10-17", "Statement": [ { "Effect": "Allow", "Action": [ "emr-serverless:CreateApplication" ], "Resource": [ "arn:aws:emr-serverless:{{region}}:{{accountId}}:/*" ] }, { "Effect": "Allow", "Action": "iam:CreateServiceLinkedRole", "Resource": "arn:aws:iam::{{accountId}}:role/aws-service-role/ops.emr-serverless.amazonaws.com/AWSServiceRoleForAmazonEMRServerless*", "Condition": { "StringLike": { "iam:AWSServiceName": "ops.emr-serverless.amazonaws.com" } } } ] }

IAM policy example for StartApplication

Static resources

The following are IAM policy examples for static resources when you use a state machine with a StartApplication Task state.

Run a Job (.sync)
{ "Version": "2012-10-17", "Statement": [ { "Effect": "Allow", "Action": [ "emr-serverless:StartApplication", "emr-serverless:GetApplication", "emr-serverless:StopApplication" ], "Resource": [ "arn:aws:emr-serverless:{{region}}:{{accountId}}:/applications/[[applicationId]]" ] }, { "Effect": "Allow", "Action": [ "events:PutTargets", "events:PutRule", "events:DescribeRule" ], "Resource": [ "arn:aws:events:{{region}}:{{accountId}}:rule/StepFunctionsGetEventsForEMRServerlessApplicationRule" ] } ] }
Request Response
{ "Version": "2012-10-17", "Statement": [ { "Effect": "Allow", "Action": [ "emr-serverless:StartApplication" ], "Resource": [ "arn:aws:emr-serverless:{{region}}:{{accountId}}:/applications/[[applicationId]]" ] } ] }
Dynamic resources

The following are IAM policy examples for dynamic resources when you use a state machine with a StartApplication Task state.

Run a Job (.sync)
{ "Version": "2012-10-17", "Statement": [ { "Effect": "Allow", "Action": [ "emr-serverless:StartApplication", "emr-serverless:GetApplication", "emr-serverless:StopApplication" ], "Resource": [ "arn:aws:emr-serverless:{{region}}:{{accountId}}:/applications/*" ] }, { "Effect": "Allow", "Action": [ "events:PutTargets", "events:PutRule", "events:DescribeRule" ], "Resource": [ "arn:aws:events:{{region}}:{{accountId}}:rule/StepFunctionsGetEventsForEMRServerlessApplicationRule" ] } ] }
Request Response
{ "Version": "2012-10-17", "Statement": [ { "Effect": "Allow", "Action": [ "emr-serverless:StartApplication" ], "Resource": [ "arn:aws:emr-serverless:{{region}}:{{accountId}}:/applications/*" ] } ] }
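
The four StartApplication policies above differ in only two ways: whether the application ARN names a specific application or ends in a wildcard, and whether the extra Run a Job (.sync) permissions are included. The following Python sketch reproduces them from those two choices. The function name start_application_policy and the Region and account values are hypothetical.

```python
import json


def start_application_policy(region, account_id, application_id=None, sync=True):
    """Build a StartApplication policy like the examples above (hypothetical helper).

    Pass application_id for static resources; omit it for dynamic resources.
    """
    target = application_id if application_id is not None else "*"
    application_arn = f"arn:aws:emr-serverless:{region}:{account_id}:/applications/{target}"
    actions = ["emr-serverless:StartApplication"]
    if sync:
        # Run a Job (.sync) also polls the application and can stop it.
        actions += ["emr-serverless:GetApplication", "emr-serverless:StopApplication"]
    statements = [{"Effect": "Allow", "Action": actions, "Resource": [application_arn]}]
    if sync:
        # Run a Job (.sync) relies on a managed EventBridge rule.
        statements.append({
            "Effect": "Allow",
            "Action": ["events:PutTargets", "events:PutRule", "events:DescribeRule"],
            "Resource": [
                f"arn:aws:events:{region}:{account_id}:rule/"
                "StepFunctionsGetEventsForEMRServerlessApplicationRule"
            ],
        })
    return {"Version": "2012-10-17", "Statement": statements}


static_sync = start_application_policy("us-east-1", "123456789012", "00yv7iv71inak893")
dynamic_request_response = start_application_policy("us-east-1", "123456789012", sync=False)
print(json.dumps(static_sync, indent=2))
```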

IAM policy example for StopApplication

Static resources

The following are IAM policy examples for static resources when you use a state machine with a StopApplication Task state.

Run a Job (.sync)
{ "Version": "2012-10-17", "Statement": [ { "Effect": "Allow", "Action": [ "emr-serverless:StopApplication", "emr-serverless:GetApplication" ], "Resource": [ "arn:aws:emr-serverless:{{region}}:{{accountId}}:/applications/[[applicationId]]" ] }, { "Effect": "Allow", "Action": [ "events:PutTargets", "events:PutRule", "events:DescribeRule" ], "Resource": [ "arn:aws:events:{{region}}:{{accountId}}:rule/StepFunctionsGetEventsForEMRServerlessApplicationRule" ] } ] }
Request Response
{ "Version": "2012-10-17", "Statement": [ { "Effect": "Allow", "Action": [ "emr-serverless:StopApplication" ], "Resource": [ "arn:aws:emr-serverless:{{region}}:{{accountId}}:/applications/[[applicationId]]" ] } ] }
Dynamic resources

The following are IAM policy examples for dynamic resources when you use a state machine with a StopApplication Task state.

Run a Job (.sync)
{ "Version": "2012-10-17", "Statement": [ { "Effect": "Allow", "Action": [ "emr-serverless:StopApplication", "emr-serverless:GetApplication" ], "Resource": [ "arn:aws:emr-serverless:{{region}}:{{accountId}}:/applications/*" ] }, { "Effect": "Allow", "Action": [ "events:PutTargets", "events:PutRule", "events:DescribeRule" ], "Resource": [ "arn:aws:events:{{region}}:{{accountId}}:rule/StepFunctionsGetEventsForEMRServerlessApplicationRule" ] } ] }
Request Response
{ "Version": "2012-10-17", "Statement": [ { "Effect": "Allow", "Action": [ "emr-serverless:StopApplication" ], "Resource": [ "arn:aws:emr-serverless:{{region}}:{{accountId}}:/applications/*" ] } ] }

IAM policy example for DeleteApplication

Static resources

The following are IAM policy examples for static resources when you use a state machine with a DeleteApplication Task state.

Run a Job (.sync)
{ "Version": "2012-10-17", "Statement": [ { "Effect": "Allow", "Action": [ "emr-serverless:DeleteApplication", "emr-serverless:GetApplication" ], "Resource": [ "arn:aws:emr-serverless:{{region}}:{{accountId}}:/applications/[[applicationId]]" ] }, { "Effect": "Allow", "Action": [ "events:PutTargets", "events:PutRule", "events:DescribeRule" ], "Resource": [ "arn:aws:events:{{region}}:{{accountId}}:rule/StepFunctionsGetEventsForEMRServerlessApplicationRule" ] } ] }
Request Response
{ "Version": "2012-10-17", "Statement": [ { "Effect": "Allow", "Action": [ "emr-serverless:DeleteApplication" ], "Resource": [ "arn:aws:emr-serverless:{{region}}:{{accountId}}:/applications/[[applicationId]]" ] } ] }
Dynamic resources

The following are IAM policy examples for dynamic resources when you use a state machine with a DeleteApplication Task state.

Run a Job (.sync)
{ "Version": "2012-10-17", "Statement": [ { "Effect": "Allow", "Action": [ "emr-serverless:DeleteApplication", "emr-serverless:GetApplication" ], "Resource": [ "arn:aws:emr-serverless:{{region}}:{{accountId}}:/applications/*" ] }, { "Effect": "Allow", "Action": [ "events:PutTargets", "events:PutRule", "events:DescribeRule" ], "Resource": [ "arn:aws:events:{{region}}:{{accountId}}:rule/StepFunctionsGetEventsForEMRServerlessApplicationRule" ] } ] }
Request Response
{ "Version": "2012-10-17", "Statement": [ { "Effect": "Allow", "Action": [ "emr-serverless:DeleteApplication" ], "Resource": [ "arn:aws:emr-serverless:{{region}}:{{accountId}}:/applications/*" ] } ] }

IAM policy example for StartJobRun

Static resources

The following are IAM policy examples for static resources when you use a state machine with a StartJobRun Task state.

Run a Job (.sync)
{ "Version": "2012-10-17", "Statement": [ { "Effect": "Allow", "Action": [ "emr-serverless:StartJobRun" ], "Resource": [ "arn:aws:emr-serverless:{{region}}:{{accountId}}:/applications/[[applicationId]]" ] }, { "Effect": "Allow", "Action": "iam:PassRole", "Resource": [ "[[jobExecutionRoleArn]]" ], "Condition": { "StringEquals": { "iam:PassedToService": "emr-serverless.amazonaws.com" } } }, { "Effect": "Allow", "Action": [ "emr-serverless:GetJobRun", "emr-serverless:CancelJobRun" ], "Resource": [ "arn:aws:emr-serverless:{{region}}:{{accountId}}:/applications/[[applicationId]]/jobruns/*" ] }, { "Effect": "Allow", "Action": [ "events:PutTargets", "events:PutRule", "events:DescribeRule" ], "Resource": [ "arn:aws:events:{{region}}:{{accountId}}:rule/StepFunctionsGetEventsForEMRServerlessJobRule" ] } ] }
Request Response
{ "Version": "2012-10-17", "Statement": [ { "Effect": "Allow", "Action": [ "emr-serverless:StartJobRun" ], "Resource": [ "arn:aws:emr-serverless:{{region}}:{{accountId}}:/applications/[[applicationId]]" ] }, { "Effect": "Allow", "Action": "iam:PassRole", "Resource": [ "[[jobExecutionRoleArn]]" ], "Condition": { "StringEquals": { "iam:PassedToService": "emr-serverless.amazonaws.com" } } } ] }
Dynamic resources

The following are IAM policy examples for dynamic resources when you use a state machine with a StartJobRun Task state.

Run a Job (.sync)
{ "Version": "2012-10-17", "Statement": [ { "Effect": "Allow", "Action": [ "emr-serverless:StartJobRun", "emr-serverless:GetJobRun", "emr-serverless:CancelJobRun" ], "Resource": [ "arn:aws:emr-serverless:{{region}}:{{accountId}}:/applications/*" ] }, { "Effect": "Allow", "Action": "iam:PassRole", "Resource": [ "[[jobExecutionRoleArn]]" ], "Condition": { "StringEquals": { "iam:PassedToService": "emr-serverless.amazonaws.com" } } }, { "Effect": "Allow", "Action": [ "events:PutTargets", "events:PutRule", "events:DescribeRule" ], "Resource": [ "arn:aws:events:{{region}}:{{accountId}}:rule/StepFunctionsGetEventsForEMRServerlessJobRule" ] } ] }
Request Response
{ "Version": "2012-10-17", "Statement": [ { "Effect": "Allow", "Action": [ "emr-serverless:StartJobRun" ], "Resource": [ "arn:aws:emr-serverless:{{region}}:{{accountId}}:/applications/*" ] }, { "Effect": "Allow", "Action": "iam:PassRole", "Resource": [ "[[jobExecutionRoleArn]]" ], "Condition": { "StringEquals": { "iam:PassedToService": "emr-serverless.amazonaws.com" } } } ] }
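
The StartJobRun policies above add an iam:PassRole statement for the job execution role. In the static Run a Job (.sync) case they also scope GetJobRun and CancelJobRun to the job runs of one application. The following Python sketch reproduces the four variants. The function name start_job_run_policy and the Region and account values are hypothetical.

```python
import json


def start_job_run_policy(region, account_id, job_execution_role_arn,
                         application_id=None, sync=True):
    """Build a StartJobRun policy like the examples above (hypothetical helper).

    Pass application_id for static resources; omit it for dynamic resources.
    """
    base = f"arn:aws:emr-serverless:{region}:{account_id}:/applications"
    polling_actions = ["emr-serverless:GetJobRun", "emr-serverless:CancelJobRun"]
    pass_role = {
        "Effect": "Allow",
        "Action": "iam:PassRole",
        "Resource": [job_execution_role_arn],
        "Condition": {"StringEquals": {"iam:PassedToService": "emr-serverless.amazonaws.com"}},
    }
    if application_id is None:
        # Dynamic resources: every action shares the wildcard application ARN.
        actions = ["emr-serverless:StartJobRun"] + (polling_actions if sync else [])
        statements = [{"Effect": "Allow", "Action": actions, "Resource": [f"{base}/*"]}, pass_role]
    else:
        # Static resources: the start action targets one application.
        statements = [
            {"Effect": "Allow", "Action": ["emr-serverless:StartJobRun"],
             "Resource": [f"{base}/{application_id}"]},
            pass_role,
        ]
        if sync:
            statements.append({"Effect": "Allow", "Action": polling_actions,
                               "Resource": [f"{base}/{application_id}/jobruns/*"]})
    if sync:
        statements.append({
            "Effect": "Allow",
            "Action": ["events:PutTargets", "events:PutRule", "events:DescribeRule"],
            "Resource": [
                f"arn:aws:events:{region}:{account_id}:rule/"
                "StepFunctionsGetEventsForEMRServerlessJobRule"
            ],
        })
    return {"Version": "2012-10-17", "Statement": statements}


role_arn = "arn:aws:iam::123456789012:role/myEMRServerless-execution-role"
static_job_sync = start_job_run_policy("us-east-1", "123456789012", role_arn, "00yv7iv71inak893")
dynamic_job_sync = start_job_run_policy("us-east-1", "123456789012", role_arn)
print(json.dumps(static_job_sync, indent=2))
```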

IAM policy example for CancelJobRun

Static resources

The following are IAM policy examples for static resources when you use a state machine with a CancelJobRun Task state.

Run a Job (.sync)
{ "Version": "2012-10-17", "Statement": [ { "Effect": "Allow", "Action": [ "emr-serverless:CancelJobRun", "emr-serverless:GetJobRun" ], "Resource": [ "arn:aws:emr-serverless:{{region}}:{{accountId}}:/applications/[[applicationId]]/jobruns/[[jobRunId]]" ] }, { "Effect": "Allow", "Action": [ "events:PutTargets", "events:PutRule", "events:DescribeRule" ], "Resource": [ "arn:aws:events:{{region}}:{{accountId}}:rule/StepFunctionsGetEventsForEMRServerlessJobRule" ] } ] }
Request Response
{ "Version": "2012-10-17", "Statement": [ { "Effect": "Allow", "Action": [ "emr-serverless:CancelJobRun" ], "Resource": [ "arn:aws:emr-serverless:{{region}}:{{accountId}}:/applications/[[applicationId]]/jobruns/[[jobRunId]]" ] } ] }
Dynamic resources

The following are IAM policy examples for dynamic resources when you use a state machine with a CancelJobRun Task state.

Run a Job (.sync)
{ "Version": "2012-10-17", "Statement": [ { "Effect": "Allow", "Action": [ "emr-serverless:CancelJobRun", "emr-serverless:GetJobRun" ], "Resource": [ "arn:aws:emr-serverless:{{region}}:{{accountId}}:/applications/*" ] }, { "Effect": "Allow", "Action": [ "events:PutTargets", "events:PutRule", "events:DescribeRule" ], "Resource": [ "arn:aws:events:{{region}}:{{accountId}}:rule/StepFunctionsGetEventsForEMRServerlessJobRule" ] } ] }
Request Response
{ "Version": "2012-10-17", "Statement": [ { "Effect": "Allow", "Action": [ "emr-serverless:CancelJobRun" ], "Resource": [ "arn:aws:emr-serverless:{{region}}:{{accountId}}:/applications/*" ] } ] }