SDK for PHP 3.x

Client: Aws\Emr\EmrClient
Service ID: elasticmapreduce
Version: 2009-03-31

This page describes the parameters and results for the operations of Amazon EMR (API version 2009-03-31) and shows how to use the Aws\Emr\EmrClient object to call them. This documentation is specific to the 2009-03-31 API version of the service.

Operation Summary

Each of the following operations can be created from a client using $client->getCommand('CommandName'), where "CommandName" is the name of one of the following operations. Note: a command is a value that encapsulates an operation and the parameters used to create an HTTP request.

You can also create and send a command immediately using the magic methods available on a client object: $client->commandName(/* parameters */). You can send the command asynchronously (returning a promise) by appending the word "Async" to the operation name: $client->commandNameAsync(/* parameters */).
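
The three calling styles described above can be sketched as follows. This is a minimal illustration, not SDK source: it assumes the SDK is installed through Composer and that credentials resolve from the environment. The helper names (magicMethodNames, listClustersThreeWays), the region, and the ClusterStates values are assumptions chosen for the example.

```php
<?php
// A sketch, not SDK source: assumes the AWS SDK for PHP 3.x is installed
// through Composer and that credentials resolve from the environment.
use Aws\Emr\EmrClient;

/**
 * Illustrative helper (not part of the SDK): derives the magic-method names
 * for an operation, e.g. "ListClusters" -> listClusters / listClustersAsync.
 */
function magicMethodNames(string $operation): array
{
    $sync = lcfirst($operation);
    return [$sync, $sync . 'Async'];
}

/**
 * Sends ListClusters three ways and returns the three results.
 * 'ClusterStates' is a ListClusters parameter; the states are example values.
 */
function listClustersThreeWays(EmrClient $client): array
{
    $params = ['ClusterStates' => ['RUNNING', 'WAITING']];

    // 1. Build a command object, then execute it explicitly.
    $command = $client->getCommand('ListClusters', $params);
    $viaCommand = $client->execute($command);

    // 2. Magic method: creates and sends the command in one call.
    $viaMagic = $client->listClusters($params);

    // 3. Async magic method: returns a promise; wait() blocks until resolved.
    $viaAsync = $client->listClustersAsync($params)->wait();

    return [$viaCommand, $viaMagic, $viaAsync];
}

// Usage (requires the SDK autoloader and valid credentials):
// require 'vendor/autoload.php';
// $client = new EmrClient(['region' => 'us-east-1', 'version' => '2009-03-31']);
// [$a, $b, $c] = listClustersThreeWays($client);
```

Building the command first is mainly useful when you want to inspect or modify it before sending; for a plain call the magic method is shorter.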

AddInstanceFleet ( array $params = [] )
Adds an instance fleet to a running cluster.
AddInstanceGroups ( array $params = [] )
Adds one or more instance groups to a running cluster.
AddJobFlowSteps ( array $params = [] )
AddJobFlowSteps adds new steps to a running cluster.
AddTags ( array $params = [] )
Adds tags to an Amazon EMR resource, such as a cluster or an Amazon EMR Studio.
CancelSteps ( array $params = [] )
Cancels a pending step or steps in a running cluster.
CreateSecurityConfiguration ( array $params = [] )
Creates a security configuration, which is stored in the service and can be specified when a cluster is created.
CreateStudio ( array $params = [] )
Creates a new Amazon EMR Studio.
CreateStudioSessionMapping ( array $params = [] )
Maps a user or group to the Amazon EMR Studio specified by StudioId, and applies a session policy to refine Studio permissions for that user or group.
DeleteSecurityConfiguration ( array $params = [] )
Deletes a security configuration.
DeleteStudio ( array $params = [] )
Removes an Amazon EMR Studio from the Studio metadata store.
DeleteStudioSessionMapping ( array $params = [] )
Removes a user or group from an Amazon EMR Studio.
DescribeCluster ( array $params = [] )
Provides cluster-level details including status, hardware and software configuration, VPC settings, and so on.
DescribeJobFlows ( array $params = [] )
This API is no longer supported and will eventually be removed.
DescribeNotebookExecution ( array $params = [] )
Provides details of a notebook execution.
DescribeReleaseLabel ( array $params = [] )
Provides Amazon EMR release label details, such as the releases available in the Region where the API request is run, and the available applications for a specific Amazon EMR release label.
DescribeSecurityConfiguration ( array $params = [] )
Provides the details of a security configuration by returning the configuration JSON.
DescribeStep ( array $params = [] )
Provides more detail about the cluster step.
DescribeStudio ( array $params = [] )
Returns details for the specified Amazon EMR Studio including ID, Name, VPC, Studio access URL, and so on.
GetAutoTerminationPolicy ( array $params = [] )
Returns the auto-termination policy for an Amazon EMR cluster.
GetBlockPublicAccessConfiguration ( array $params = [] )
Returns the Amazon EMR block public access configuration for your Amazon Web Services account in the current Region.
GetClusterSessionCredentials ( array $params = [] )
Provides temporary, HTTP basic credentials that are associated with a given runtime IAM role and used by a cluster with fine-grained access control activated.
GetManagedScalingPolicy ( array $params = [] )
Fetches the attached managed scaling policy for an Amazon EMR cluster.
GetStudioSessionMapping ( array $params = [] )
Fetches mapping details for the specified Amazon EMR Studio and identity (user or group).
ListBootstrapActions ( array $params = [] )
Provides information about the bootstrap actions associated with a cluster.
ListClusters ( array $params = [] )
Provides the status of all clusters visible to this Amazon Web Services account.
ListInstanceFleets ( array $params = [] )
Lists all available details about the instance fleets in a cluster.
ListInstanceGroups ( array $params = [] )
Provides all available details about the instance groups in a cluster.
ListInstances ( array $params = [] )
Provides information for all active Amazon EC2 instances and Amazon EC2 instances terminated in the last 30 days, up to a maximum of 2,000.
ListNotebookExecutions ( array $params = [] )
Provides summaries of all notebook executions.
ListReleaseLabels ( array $params = [] )
Retrieves release labels of Amazon EMR services in the Region where the API is called.
ListSecurityConfigurations ( array $params = [] )
Lists all the security configurations visible to this account, providing their creation dates and times, and their names.
ListSteps ( array $params = [] )
Provides a list of steps for the cluster in reverse order unless you specify stepIds with the request or filter by StepStates.
ListStudioSessionMappings ( array $params = [] )
Returns a list of all user or group session mappings for the Amazon EMR Studio specified by StudioId.
ListStudios ( array $params = [] )
Returns a list of all Amazon EMR Studios associated with the Amazon Web Services account.
ListSupportedInstanceTypes ( array $params = [] )
Lists the instance types that Amazon EMR supports.
ModifyCluster ( array $params = [] )
Modifies the number of steps that can be executed concurrently for the cluster specified using ClusterID.
ModifyInstanceFleet ( array $params = [] )
Modifies the target On-Demand and target Spot capacities for the instance fleet with the specified InstanceFleetID within the cluster specified using ClusterID.
ModifyInstanceGroups ( array $params = [] )
ModifyInstanceGroups modifies the number of nodes and configuration settings of an instance group.
PutAutoScalingPolicy ( array $params = [] )
Creates or updates an automatic scaling policy for a core instance group or task instance group in an Amazon EMR cluster.
PutAutoTerminationPolicy ( array $params = [] )
Creates or updates an auto-termination policy for an Amazon EMR cluster.
PutBlockPublicAccessConfiguration ( array $params = [] )
Creates or updates an Amazon EMR block public access configuration for your Amazon Web Services account in the current Region.
PutManagedScalingPolicy ( array $params = [] )
Creates or updates a managed scaling policy for an Amazon EMR cluster.
RemoveAutoScalingPolicy ( array $params = [] )
Removes an automatic scaling policy from a specified instance group within an Amazon EMR cluster.
RemoveAutoTerminationPolicy ( array $params = [] )
Removes an auto-termination policy from an Amazon EMR cluster.
RemoveManagedScalingPolicy ( array $params = [] )
Removes a managed scaling policy from a specified Amazon EMR cluster.
RemoveTags ( array $params = [] )
Removes tags from an Amazon EMR resource, such as a cluster or Amazon EMR Studio.
RunJobFlow ( array $params = [] )
RunJobFlow creates and starts running a new cluster (job flow).
SetKeepJobFlowAliveWhenNoSteps ( array $params = [] )
Use SetKeepJobFlowAliveWhenNoSteps to configure whether a cluster (job flow) terminates after step execution.
SetTerminationProtection ( array $params = [] )
SetTerminationProtection locks a cluster (job flow) so the Amazon EC2 instances in the cluster cannot be terminated by user intervention, an API call, or in the event of a job-flow error.
SetUnhealthyNodeReplacement ( array $params = [] )
Specify whether to enable unhealthy node replacement, which lets Amazon EMR gracefully replace core nodes on a cluster if any nodes become unhealthy.
SetVisibleToAllUsers ( array $params = [] )
The SetVisibleToAllUsers parameter is no longer supported.
StartNotebookExecution ( array $params = [] )
Starts a notebook execution.
StopNotebookExecution ( array $params = [] )
Stops a notebook execution.
TerminateJobFlows ( array $params = [] )
TerminateJobFlows shuts a list of clusters (job flows) down.
UpdateStudio ( array $params = [] )
Updates an Amazon EMR Studio configuration, including attributes such as name, description, and subnets.
UpdateStudioSessionMapping ( array $params = [] )
Updates the session policy attached to the user or group for the specified Amazon EMR Studio.

Paginators

Paginators automatically iterate over paginated API results. Paginators are associated with specific API operations, and they accept the parameters that the corresponding API operation accepts. You can get a paginator from a client class using getPaginator($paginatorName, $operationParameters). This client supports the following paginators:

DescribeJobFlows
ListBootstrapActions
ListClusters
ListInstanceFleets
ListInstanceGroups
ListInstances
ListNotebookExecutions
ListReleaseLabels
ListSecurityConfigurations
ListSteps
ListStudioSessionMappings
ListStudios
ListSupportedInstanceTypes
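
As a rough sketch of paginator use, the following collects cluster IDs across every page of ListClusters results. The function names are hypothetical, and the 'Clusters' and 'Id' result fields belong to the ListClusters result shape, which is not documented in this section, so treat them as assumptions to check against that operation's result syntax.

```php
<?php
// A sketch under stated assumptions; needs the AWS SDK for PHP 3.x to run
// collectClusterIds(), while idsFromPage() is plain PHP.
use Aws\Emr\EmrClient;

/** Pure helper: extracts the 'Id' field from one page of cluster summaries. */
function idsFromPage(array $clusters): array
{
    return array_column($clusters, 'Id');
}

/**
 * Collects cluster IDs from all pages of ListClusters.
 * $states is passed as the 'ClusterStates' filter, e.g. ['RUNNING'].
 */
function collectClusterIds(EmrClient $client, array $states): array
{
    $paginator = $client->getPaginator('ListClusters', ['ClusterStates' => $states]);

    $ids = [];
    // Each iteration yields one page of results; the paginator requests the
    // next page for you until no pages remain.
    foreach ($paginator as $page) {
        $ids = array_merge($ids, idsFromPage($page['Clusters'] ?? []));
    }
    return $ids;
}
```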

Waiters

Waiters allow you to poll a resource until it enters a desired state. A waiter has a name used to describe what it does, and is associated with an API operation. When creating a waiter, you can provide the API operation parameters associated with the corresponding operation. Waiters can be accessed using the getWaiter($waiterName, $operationParameters) method of a client object. This client supports the following waiters:

Waiter name        API Operation    Delay (seconds)  Max Attempts
ClusterRunning     DescribeCluster  30               60
StepComplete       DescribeStep     30               60
ClusterTerminated  DescribeCluster  30               60
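
A minimal sketch of waiting on the ClusterRunning waiter is shown below. The helper names are hypothetical. The '@waiter' key is the SDK's mechanism for overriding a waiter's delay and maximum attempts; the defaults used here simply mirror the values listed above.

```php
<?php
// A sketch, assuming the AWS SDK for PHP 3.x; clusterWaiterArgs() is plain PHP.
use Aws\Emr\EmrClient;

/**
 * Builds waiter arguments. 'ClusterId' is the DescribeCluster parameter;
 * '@waiter' overrides the polling delay (seconds) and the attempt limit.
 */
function clusterWaiterArgs(string $clusterId, int $delay = 30, int $maxAttempts = 60): array
{
    return [
        'ClusterId' => $clusterId,
        '@waiter' => ['delay' => $delay, 'maxAttempts' => $maxAttempts],
    ];
}

/**
 * Blocks until the cluster reaches the waiter's success state.
 * An exception is thrown if the waiter fails or runs out of attempts.
 */
function waitForClusterRunning(EmrClient $client, string $clusterId): void
{
    $waiter = $client->getWaiter('ClusterRunning', clusterWaiterArgs($clusterId));
    $waiter->promise()->wait();
}
```

The client also exposes waitUntil($waiterName, $args), which creates the waiter and blocks in a single call.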

Operations

AddInstanceFleet

$result = $client->addInstanceFleet([/* ... */]);
$promise = $client->addInstanceFleetAsync([/* ... */]);

Adds an instance fleet to a running cluster.

The instance fleet configuration is available only in Amazon EMR releases 4.8.0 and later, excluding 5.0.x.

Parameter Syntax

$result = $client->addInstanceFleet([
    'ClusterId' => '<string>', // REQUIRED
    'InstanceFleet' => [ // REQUIRED
        'Context' => '<string>',
        'InstanceFleetType' => 'MASTER|CORE|TASK', // REQUIRED
        'InstanceTypeConfigs' => [
            [
                'BidPrice' => '<string>',
                'BidPriceAsPercentageOfOnDemandPrice' => <float>,
                'Configurations' => [
                    [
                        'Classification' => '<string>',
                        'Configurations' => [...], // RECURSIVE
                        'Properties' => ['<string>', ...],
                    ],
                    // ...
                ],
                'CustomAmiId' => '<string>',
                'EbsConfiguration' => [
                    'EbsBlockDeviceConfigs' => [
                        [
                            'VolumeSpecification' => [ // REQUIRED
                                'Iops' => <integer>,
                                'SizeInGB' => <integer>, // REQUIRED
                                'Throughput' => <integer>,
                                'VolumeType' => '<string>', // REQUIRED
                            ],
                            'VolumesPerInstance' => <integer>,
                        ],
                        // ...
                    ],
                    'EbsOptimized' => true || false,
                ],
                'InstanceType' => '<string>', // REQUIRED
                'Priority' => <float>,
                'WeightedCapacity' => <integer>,
            ],
            // ...
        ],
        'LaunchSpecifications' => [
            'OnDemandSpecification' => [
                'AllocationStrategy' => 'lowest-price|prioritized', // REQUIRED
                'CapacityReservationOptions' => [
                    'CapacityReservationPreference' => 'open|none',
                    'CapacityReservationResourceGroupArn' => '<string>',
                    'UsageStrategy' => 'use-capacity-reservations-first',
                ],
            ],
            'SpotSpecification' => [
                'AllocationStrategy' => 'capacity-optimized|price-capacity-optimized|lowest-price|diversified|capacity-optimized-prioritized',
                'BlockDurationMinutes' => <integer>,
                'TimeoutAction' => 'SWITCH_TO_ON_DEMAND|TERMINATE_CLUSTER', // REQUIRED
                'TimeoutDurationMinutes' => <integer>, // REQUIRED
            ],
        ],
        'Name' => '<string>',
        'ResizeSpecifications' => [
            'OnDemandResizeSpecification' => [
                'AllocationStrategy' => 'lowest-price|prioritized',
                'CapacityReservationOptions' => [
                    'CapacityReservationPreference' => 'open|none',
                    'CapacityReservationResourceGroupArn' => '<string>',
                    'UsageStrategy' => 'use-capacity-reservations-first',
                ],
                'TimeoutDurationMinutes' => <integer>,
            ],
            'SpotResizeSpecification' => [
                'AllocationStrategy' => 'capacity-optimized|price-capacity-optimized|lowest-price|diversified|capacity-optimized-prioritized',
                'TimeoutDurationMinutes' => <integer>,
            ],
        ],
        'TargetOnDemandCapacity' => <integer>,
        'TargetSpotCapacity' => <integer>,
    ],
]);

Parameter Details

Members
ClusterId
Required: Yes
Type: string

The unique identifier of the cluster.

InstanceFleet
Required: Yes
Type: InstanceFleetConfig structure

Specifies the configuration of the instance fleet.

Result Syntax

[
    'ClusterArn' => '<string>',
    'ClusterId' => '<string>',
    'InstanceFleetId' => '<string>',
]

Result Details

Members
ClusterArn
Type: string

The Amazon Resource Name of the cluster.

ClusterId
Type: string

The unique identifier of the cluster.

InstanceFleetId
Type: string

The unique identifier of the instance fleet.

Errors

InternalServerException:

This exception occurs when there is an internal failure in the Amazon EMR service.

InvalidRequestException:

This exception occurs when there is something wrong with user input.
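
For illustration, a small AddInstanceFleet request might be assembled as follows. This is a sketch under assumptions: the fleet name, the 20-minute Spot timeout, and the helper names are invented for the example, and only a subset of the parameters from the syntax above is used.

```php
<?php
// A sketch, assuming the AWS SDK for PHP 3.x; taskFleetParams() is plain PHP.
use Aws\Emr\EmrClient;
use Aws\Exception\AwsException;

/** Builds a minimal request for a Spot-only task fleet (values are examples). */
function taskFleetParams(string $clusterId, string $instanceType, int $spotCapacity): array
{
    return [
        'ClusterId' => $clusterId,
        'InstanceFleet' => [
            'Name' => 'example-task-fleet', // hypothetical name
            'InstanceFleetType' => 'TASK',
            'TargetSpotCapacity' => $spotCapacity,
            'InstanceTypeConfigs' => [
                ['InstanceType' => $instanceType, 'WeightedCapacity' => 1],
            ],
            'LaunchSpecifications' => [
                'SpotSpecification' => [
                    // Both keys are marked REQUIRED in the syntax above.
                    'TimeoutAction' => 'SWITCH_TO_ON_DEMAND',
                    'TimeoutDurationMinutes' => 20,
                ],
            ],
        ],
    ];
}

/** Sends the request; returns the new fleet ID, or null if the call failed. */
function addTaskFleet(EmrClient $client, array $params): ?string
{
    try {
        $result = $client->addInstanceFleet($params);
        return $result['InstanceFleetId'];
    } catch (AwsException $e) {
        // For example InvalidRequestException or InternalServerException.
        error_log($e->getAwsErrorCode() . ': ' . $e->getAwsErrorMessage());
        return null;
    }
}
```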

AddInstanceGroups

$result = $client->addInstanceGroups([/* ... */]);
$promise = $client->addInstanceGroupsAsync([/* ... */]);

Adds one or more instance groups to a running cluster.

Parameter Syntax

$result = $client->addInstanceGroups([
    'InstanceGroups' => [ // REQUIRED
        [
            'AutoScalingPolicy' => [
                'Constraints' => [ // REQUIRED
                    'MaxCapacity' => <integer>, // REQUIRED
                    'MinCapacity' => <integer>, // REQUIRED
                ],
                'Rules' => [ // REQUIRED
                    [
                        'Action' => [ // REQUIRED
                            'Market' => 'ON_DEMAND|SPOT',
                            'SimpleScalingPolicyConfiguration' => [ // REQUIRED
                                'AdjustmentType' => 'CHANGE_IN_CAPACITY|PERCENT_CHANGE_IN_CAPACITY|EXACT_CAPACITY',
                                'CoolDown' => <integer>,
                                'ScalingAdjustment' => <integer>, // REQUIRED
                            ],
                        ],
                        'Description' => '<string>',
                        'Name' => '<string>', // REQUIRED
                        'Trigger' => [ // REQUIRED
                            'CloudWatchAlarmDefinition' => [ // REQUIRED
                                'ComparisonOperator' => 'GREATER_THAN_OR_EQUAL|GREATER_THAN|LESS_THAN|LESS_THAN_OR_EQUAL', // REQUIRED
                                'Dimensions' => [
                                    [
                                        'Key' => '<string>',
                                        'Value' => '<string>',
                                    ],
                                    // ...
                                ],
                                'EvaluationPeriods' => <integer>,
                                'MetricName' => '<string>', // REQUIRED
                                'Namespace' => '<string>',
                                'Period' => <integer>, // REQUIRED
                                'Statistic' => 'SAMPLE_COUNT|AVERAGE|SUM|MINIMUM|MAXIMUM',
                                'Threshold' => <float>, // REQUIRED
                                'Unit' => 'NONE|SECONDS|MICRO_SECONDS|MILLI_SECONDS|BYTES|KILO_BYTES|MEGA_BYTES|GIGA_BYTES|TERA_BYTES|BITS|KILO_BITS|MEGA_BITS|GIGA_BITS|TERA_BITS|PERCENT|COUNT|BYTES_PER_SECOND|KILO_BYTES_PER_SECOND|MEGA_BYTES_PER_SECOND|GIGA_BYTES_PER_SECOND|TERA_BYTES_PER_SECOND|BITS_PER_SECOND|KILO_BITS_PER_SECOND|MEGA_BITS_PER_SECOND|GIGA_BITS_PER_SECOND|TERA_BITS_PER_SECOND|COUNT_PER_SECOND',
                            ],
                        ],
                    ],
                    // ...
                ],
            ],
            'BidPrice' => '<string>',
            'Configurations' => [
                [
                    'Classification' => '<string>',
                    'Configurations' => [...], // RECURSIVE
                    'Properties' => ['<string>', ...],
                ],
                // ...
            ],
            'CustomAmiId' => '<string>',
            'EbsConfiguration' => [
                'EbsBlockDeviceConfigs' => [
                    [
                        'VolumeSpecification' => [ // REQUIRED
                            'Iops' => <integer>,
                            'SizeInGB' => <integer>, // REQUIRED
                            'Throughput' => <integer>,
                            'VolumeType' => '<string>', // REQUIRED
                        ],
                        'VolumesPerInstance' => <integer>,
                    ],
                    // ...
                ],
                'EbsOptimized' => true || false,
            ],
            'InstanceCount' => <integer>, // REQUIRED
            'InstanceRole' => 'MASTER|CORE|TASK', // REQUIRED
            'InstanceType' => '<string>', // REQUIRED
            'Market' => 'ON_DEMAND|SPOT',
            'Name' => '<string>',
        ],
        // ...
    ],
    'JobFlowId' => '<string>', // REQUIRED
]);

Parameter Details

Members
InstanceGroups
Required: Yes
Type: Array of InstanceGroupConfig structures

Instance groups to add.

JobFlowId
Required: Yes
Type: string

Job flow in which to add the instance groups.

Result Syntax

[
    'ClusterArn' => '<string>',
    'InstanceGroupIds' => ['<string>', ...],
    'JobFlowId' => '<string>',
]

Result Details

Members
ClusterArn
Type: string

The Amazon Resource Name of the cluster.

InstanceGroupIds
Type: Array of strings

Instance group IDs of the newly created instance groups.

JobFlowId
Type: string

The job flow ID in which the instance groups are added.

Errors

InternalServerError:

Indicates that an error occurred while processing the request and that the request was not completed.
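
As a hedged example, the following sketch adds a single On-Demand task instance group using only the required members plus Market and Name. The group name and helper names are hypothetical.

```php
<?php
// A sketch, assuming the AWS SDK for PHP 3.x; taskGroupParams() is plain PHP.
use Aws\Emr\EmrClient;

/** Builds a request that adds one On-Demand task group (values are examples). */
function taskGroupParams(string $jobFlowId, string $instanceType, int $count): array
{
    return [
        'JobFlowId' => $jobFlowId,
        'InstanceGroups' => [
            [
                'Name' => 'example-task-group', // hypothetical name
                'InstanceRole' => 'TASK',
                'InstanceType' => $instanceType,
                'InstanceCount' => $count,
                'Market' => 'ON_DEMAND',
            ],
        ],
    ];
}

/** Sends the request and returns the IDs of the newly created groups. */
function addTaskGroup(EmrClient $client, array $params): array
{
    $result = $client->addInstanceGroups($params);
    return $result['InstanceGroupIds'] ?? [];
}
```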

AddJobFlowSteps

$result = $client->addJobFlowSteps([/* ... */]);
$promise = $client->addJobFlowStepsAsync([/* ... */]);

AddJobFlowSteps adds new steps to a running cluster. A maximum of 256 steps are allowed in each job flow.

If your cluster is long-running (such as a Hive data warehouse) or complex, you may require more than 256 steps to process your data. You can bypass the 256-step limitation in various ways, including using SSH to connect to the master node and submitting queries directly to the software running on the master node, such as Hive and Hadoop.

A step specifies the location of a JAR file stored either on the master node of the cluster or in Amazon S3. Each step is performed by the main function of the main class of the JAR file. The main class can be specified either in the manifest of the JAR or by using the MainClass parameter of the step.

Amazon EMR executes each step in the order listed. For a step to be considered complete, the main function must exit with a zero exit code and all Hadoop jobs started while the step was running must have completed and run successfully.

You can only add steps to a cluster that is in one of the following states: STARTING, BOOTSTRAPPING, RUNNING, or WAITING.

The string values passed into the HadoopJarStep object cannot exceed a total of 10240 characters.

Parameter Syntax

$result = $client->addJobFlowSteps([
    'ExecutionRoleArn' => '<string>',
    'JobFlowId' => '<string>', // REQUIRED
    'Steps' => [ // REQUIRED
        [
            'ActionOnFailure' => 'TERMINATE_JOB_FLOW|TERMINATE_CLUSTER|CANCEL_AND_WAIT|CONTINUE',
            'HadoopJarStep' => [ // REQUIRED
                'Args' => ['<string>', ...],
                'Jar' => '<string>', // REQUIRED
                'MainClass' => '<string>',
                'Properties' => [
                    [
                        'Key' => '<string>',
                        'Value' => '<string>',
                    ],
                    // ...
                ],
            ],
            'Name' => '<string>', // REQUIRED
        ],
        // ...
    ],
]);

Parameter Details

Members
ExecutionRoleArn
Type: string

The Amazon Resource Name (ARN) of the runtime role for a step on the cluster. The runtime role can be a cross-account IAM role. The runtime role ARN is a combination of account ID, role name, and role type using the following format: arn:partition:service:region:account:resource.

For example, arn:aws:iam::123456789012:role/ReadOnly is a correctly formatted runtime role ARN.

JobFlowId
Required: Yes
Type: string

A string that uniquely identifies the job flow. This identifier is returned by RunJobFlow and can also be obtained from ListClusters.

Steps
Required: Yes
Type: Array of StepConfig structures

A list of StepConfig to be executed by the job flow.

Result Syntax

[
    'StepIds' => ['<string>', ...],
]

Result Details

Members
StepIds
Type: Array of strings

The identifiers of the list of steps added to the job flow.

Errors

InternalServerError:

Indicates that an error occurred while processing the request and that the request was not completed.
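
The following sketch submits steps and checks the 10240-character limit on HadoopJarStep strings before calling the service. It makes two assumptions worth flagging: that the limit counts Jar, MainClass, Args, and Properties keys and values together, and that the step runs through command-runner.jar. The helper names are hypothetical, and strlen counts bytes, which matches characters only for ASCII input.

```php
<?php
// A sketch, assuming the AWS SDK for PHP 3.x; the two helpers are plain PHP.
use Aws\Emr\EmrClient;

const MAX_HADOOP_JAR_STEP_CHARS = 10240;

/** Builds a StepConfig that runs a command via command-runner.jar. */
function commandRunnerStep(string $name, array $args): array
{
    return [
        'Name' => $name,
        'ActionOnFailure' => 'CONTINUE',
        'HadoopJarStep' => [
            'Jar' => 'command-runner.jar',
            'Args' => $args,
        ],
    ];
}

/** Sums the lengths of the string values in a HadoopJarStep (assumed scope). */
function hadoopJarStepLength(array $hadoopJarStep): int
{
    $total = strlen($hadoopJarStep['Jar'] ?? '') + strlen($hadoopJarStep['MainClass'] ?? '');
    foreach ($hadoopJarStep['Args'] ?? [] as $arg) {
        $total += strlen($arg);
    }
    foreach ($hadoopJarStep['Properties'] ?? [] as $property) {
        $total += strlen($property['Key'] ?? '') + strlen($property['Value'] ?? '');
    }
    return $total;
}

/** Validates each step locally, then submits them and returns the step IDs. */
function submitSteps(EmrClient $client, string $jobFlowId, array $steps): array
{
    foreach ($steps as $step) {
        if (hadoopJarStepLength($step['HadoopJarStep']) > MAX_HADOOP_JAR_STEP_CHARS) {
            throw new InvalidArgumentException("Step '{$step['Name']}' exceeds the string limit.");
        }
    }
    $result = $client->addJobFlowSteps(['JobFlowId' => $jobFlowId, 'Steps' => $steps]);
    return $result['StepIds'];
}
```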

AddTags

$result = $client->addTags([/* ... */]);
$promise = $client->addTagsAsync([/* ... */]);

Adds tags to an Amazon EMR resource, such as a cluster or an Amazon EMR Studio. Tags make it easier to associate resources in various ways, such as grouping clusters to track your Amazon EMR resource allocation costs. For more information, see Tag Clusters.

Parameter Syntax

$result = $client->addTags([
    'ResourceId' => '<string>', // REQUIRED
    'Tags' => [ // REQUIRED
        [
            'Key' => '<string>',
            'Value' => '<string>',
        ],
        // ...
    ],
]);

Parameter Details

Members
ResourceId
Required: Yes
Type: string

The Amazon EMR resource identifier to which tags will be added. For example, a cluster identifier or an Amazon EMR Studio ID.

Tags
Required: Yes
Type: Array of Tag structures

A list of tags to associate with a resource. Tags are user-defined key-value pairs that consist of a required key string with a maximum of 128 characters, and an optional value string with a maximum of 256 characters.

Result Syntax

[]

Result Details

The results for this operation are always empty.

Errors

InternalServerException:

This exception occurs when there is an internal failure in the Amazon EMR service.

InvalidRequestException:

This exception occurs when there is something wrong with user input.
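
As an illustration, the sketch below converts a PHP associative array into the list of Tag structures that AddTags expects and enforces the key and value length limits stated above. The helper names are hypothetical, and strlen counts bytes, so the check is exact only for ASCII tags.

```php
<?php
// A sketch, assuming the AWS SDK for PHP 3.x; toTagList() is plain PHP.
use Aws\Emr\EmrClient;

/** Converts ['team' => 'data'] into [['Key' => 'team', 'Value' => 'data']]. */
function toTagList(array $tags): array
{
    $list = [];
    foreach ($tags as $key => $value) {
        // PHP casts numeric-string array keys to integers; cast back to string.
        $key = (string) $key;
        $value = (string) $value;
        if ($key === '' || strlen($key) > 128) {
            throw new InvalidArgumentException('Tag keys must be 1 to 128 characters.');
        }
        if (strlen($value) > 256) {
            throw new InvalidArgumentException('Tag values must be at most 256 characters.');
        }
        $list[] = ['Key' => $key, 'Value' => $value];
    }
    return $list;
}

/** Tags a cluster or Studio; the result of AddTags is always empty. */
function tagResource(EmrClient $client, string $resourceId, array $tags): void
{
    $client->addTags(['ResourceId' => $resourceId, 'Tags' => toTagList($tags)]);
}
```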

CancelSteps

$result = $client->cancelSteps([/* ... */]);
$promise = $client->cancelStepsAsync([/* ... */]);

Cancels a pending step or steps in a running cluster. Available only in Amazon EMR versions 4.8.0 and later, excluding version 5.0.0. A maximum of 256 steps are allowed in each CancelSteps request. CancelSteps is idempotent but asynchronous; it does not guarantee that a step will be canceled, even if the request is successfully submitted. When you use Amazon EMR releases 5.28.0 and later, you can cancel steps that are in a PENDING or RUNNING state. In earlier versions of Amazon EMR, you can only cancel steps that are in a PENDING state.

Parameter Syntax

$result = $client->cancelSteps([
    'ClusterId' => '<string>', // REQUIRED
    'StepCancellationOption' => 'SEND_INTERRUPT|TERMINATE_PROCESS',
    'StepIds' => ['<string>', ...], // REQUIRED
]);

Parameter Details

Members
ClusterId
Required: Yes
Type: string

The ClusterID for the specified steps that will be canceled. Use RunJobFlow and ListClusters to get ClusterIDs.

StepCancellationOption
Type: string

The option to use when canceling RUNNING steps. By default, the value is SEND_INTERRUPT.

StepIds
Required: Yes
Type: Array of strings

The list of StepIDs to cancel. Use ListSteps to get steps and their states for the specified cluster.

Result Syntax

[
    'CancelStepsInfoList' => [
        [
            'Reason' => '<string>',
            'Status' => 'SUBMITTED|FAILED',
            'StepId' => '<string>',
        ],
        // ...
    ],
]

Result Details

Members
CancelStepsInfoList
Type: Array of CancelStepsInfo structures

A list of CancelStepsInfo structures, one for each specified StepID, showing the status of its cancel request.

Errors

InternalServerError:

Indicates that an error occurred while processing the request and that the request was not completed.

InvalidRequestException:

This exception occurs when there is something wrong with user input.
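
Because each CancelSteps request accepts at most 256 steps, a caller with a longer list has to split it. The sketch below shows one way to do that and to pick out the requests that came back FAILED. The helper names are hypothetical. Keep in mind that a SUBMITTED status does not guarantee the step is actually canceled.

```php
<?php
// A sketch, assuming the AWS SDK for PHP 3.x; the two helpers are plain PHP.
use Aws\Emr\EmrClient;

/** Splits step IDs into batches no larger than the per-request maximum. */
function batchStepIds(array $stepIds, int $batchSize = 256): array
{
    return array_chunk(array_values($stepIds), $batchSize);
}

/** Keeps only the CancelStepsInfo entries whose Status is FAILED. */
function failedCancellations(array $infoList): array
{
    return array_values(array_filter($infoList, function (array $info): bool {
        return ($info['Status'] ?? '') === 'FAILED';
    }));
}

/** Sends one CancelSteps request per batch and merges the returned info. */
function cancelAllSteps(EmrClient $client, string $clusterId, array $stepIds): array
{
    $infos = [];
    foreach (batchStepIds($stepIds) as $batch) {
        $result = $client->cancelSteps([
            'ClusterId' => $clusterId,
            'StepIds' => $batch,
            'StepCancellationOption' => 'SEND_INTERRUPT',
        ]);
        $infos = array_merge($infos, $result['CancelStepsInfoList'] ?? []);
    }
    return $infos;
}
```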

CreateSecurityConfiguration

$result = $client->createSecurityConfiguration([/* ... */]);
$promise = $client->createSecurityConfigurationAsync([/* ... */]);

Creates a security configuration, which is stored in the service and can be specified when a cluster is created.

Parameter Syntax

$result = $client->createSecurityConfiguration([
    'Name' => '<string>', // REQUIRED
    'SecurityConfiguration' => '<string>', // REQUIRED
]);

Parameter Details

Members
Name
Required: Yes
Type: string

The name of the security configuration.

SecurityConfiguration
Required: Yes
Type: string

The security configuration details in JSON format. For JSON parameters and examples, see Use Security Configurations to Set Up Cluster Security in the Amazon EMR Management Guide.

Result Syntax

[
    'CreationDateTime' => <DateTime>,
    'Name' => '<string>',
]

Result Details

Members
CreationDateTime
Required: Yes
Type: timestamp (string|DateTime or anything parsable by strtotime)

The date and time the security configuration was created.

Name
Required: Yes
Type: string

The name of the security configuration.

Errors

InternalServerException:

This exception occurs when there is an internal failure in the Amazon EMR service.

InvalidRequestException:

This exception occurs when there is something wrong with user input.
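
Since SecurityConfiguration is passed as a JSON string, it is convenient to build it as a PHP array and encode it. The sketch below does that. The JSON keys shown (an at-rest configuration using SSE-S3) are an assumption based on the structure described in the Amazon EMR Management Guide and should be verified there. The helper names are hypothetical.

```php
<?php
// A sketch, assuming the AWS SDK for PHP 3.x and PHP 7.3+ (JSON_THROW_ON_ERROR).
use Aws\Emr\EmrClient;

/**
 * Returns an example security configuration as a JSON string.
 * The key names are assumed; check them against the EMR Management Guide.
 */
function atRestSseS3Configuration(): string
{
    $config = [
        'EncryptionConfiguration' => [
            'EnableInTransitEncryption' => false,
            'EnableAtRestEncryption' => true,
            'AtRestEncryptionConfiguration' => [
                'S3EncryptionConfiguration' => ['EncryptionMode' => 'SSE-S3'],
            ],
        ],
    ];
    return json_encode($config, JSON_THROW_ON_ERROR | JSON_UNESCAPED_SLASHES);
}

/** Creates the configuration and returns a short confirmation string. */
function createConfiguration(EmrClient $client, string $name): string
{
    $result = $client->createSecurityConfiguration([
        'Name' => $name,
        'SecurityConfiguration' => atRestSseS3Configuration(),
    ]);
    // CreationDateTime is returned as a DateTime-compatible object.
    return $result['Name'] . ' created at ' . $result['CreationDateTime']->format(DATE_ATOM);
}
```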

CreateStudio

$result = $client->createStudio([/* ... */]);
$promise = $client->createStudioAsync([/* ... */]);

Creates a new Amazon EMR Studio.

Parameter Syntax

$result = $client->createStudio([
    'AuthMode' => 'SSO|IAM', // REQUIRED
    'DefaultS3Location' => '<string>', // REQUIRED
    'Description' => '<string>',
    'EncryptionKeyArn' => '<string>',
    'EngineSecurityGroupId' => '<string>', // REQUIRED
    'IdcInstanceArn' => '<string>',
    'IdcUserAssignment' => 'REQUIRED|OPTIONAL',
    'IdpAuthUrl' => '<string>',
    'IdpRelayStateParameterName' => '<string>',
    'Name' => '<string>', // REQUIRED
    'ServiceRole' => '<string>', // REQUIRED
    'SubnetIds' => ['<string>', ...], // REQUIRED
    'Tags' => [
        [
            'Key' => '<string>',
            'Value' => '<string>',
        ],
        // ...
    ],
    'TrustedIdentityPropagationEnabled' => true || false,
    'UserRole' => '<string>',
    'VpcId' => '<string>', // REQUIRED
    'WorkspaceSecurityGroupId' => '<string>', // REQUIRED
]);

Parameter Details

Members
AuthMode
Required: Yes
Type: string

Specifies whether the Studio authenticates users using IAM or IAM Identity Center.

DefaultS3Location
Required: Yes
Type: string

The Amazon S3 location to back up Amazon EMR Studio Workspaces and notebook files.

Description
Type: string

A detailed description of the Amazon EMR Studio.

EncryptionKeyArn
Type: string

The KMS key identifier (ARN) used to encrypt Amazon EMR Studio workspace and notebook files when backed up to Amazon S3.

EngineSecurityGroupId
Required: Yes
Type: string

The ID of the Amazon EMR Studio Engine security group. The Engine security group allows inbound network traffic from the Workspace security group, and it must be in the same VPC specified by VpcId.

IdcInstanceArn
Type: string

The ARN of the IAM Identity Center instance to create the Studio application.

IdcUserAssignment
Type: string

Specifies whether IAM Identity Center user assignment is REQUIRED or OPTIONAL. If the value is set to REQUIRED, users must be explicitly assigned to the Studio application to access the Studio.

IdpAuthUrl
Type: string

The authentication endpoint of your identity provider (IdP). Specify this value when you use IAM authentication and want to let federated users log in to a Studio with the Studio URL and credentials from your IdP. Amazon EMR Studio redirects users to this endpoint to enter credentials.

IdpRelayStateParameterName
Type: string

The name that your identity provider (IdP) uses for its RelayState parameter. For example, RelayState or TargetSource. Specify this value when you use IAM authentication and want to let federated users log in to a Studio using the Studio URL. The RelayState parameter differs by IdP.

Name
Required: Yes
Type: string

A descriptive name for the Amazon EMR Studio.

ServiceRole
Required: Yes
Type: string

The IAM role that the Amazon EMR Studio assumes. The service role provides a way for Amazon EMR Studio to interoperate with other Amazon Web Services services.

SubnetIds
Required: Yes
Type: Array of strings

A list of subnet IDs to associate with the Amazon EMR Studio. A Studio can have a maximum of 5 subnets. The subnets must belong to the VPC specified by VpcId. Studio users can create a Workspace in any of the specified subnets.

Tags
Type: Array of Tag structures

A list of tags to associate with the Amazon EMR Studio. Tags are user-defined key-value pairs that consist of a required key string with a maximum of 128 characters, and an optional value string with a maximum of 256 characters.

TrustedIdentityPropagationEnabled
Type: boolean

A Boolean indicating whether to enable Trusted identity propagation for the Studio. The default value is false.

UserRole
Type: string

The IAM user role that users and groups assume when logged in to an Amazon EMR Studio. Only specify a UserRole when you use IAM Identity Center authentication. The permissions attached to the UserRole can be scoped down for each user or group using session policies.

VpcId
Required: Yes
Type: string

The ID of the Amazon Virtual Private Cloud (Amazon VPC) to associate with the Studio.

WorkspaceSecurityGroupId
Required: Yes
Type: string

The ID of the Amazon EMR Studio Workspace security group. The Workspace security group allows outbound network traffic to resources in the Engine security group, and it must be in the same VPC specified by VpcId.

Result Syntax

[
    'StudioId' => '<string>',
    'Url' => '<string>',
]

Result Details

Members
StudioId
Type: string

The ID of the Amazon EMR Studio.

Url
Type: string

The unique Studio access URL.

Errors

InternalServerException:

This exception occurs when there is an internal failure in the Amazon EMR service.

InvalidRequestException:

This exception occurs when there is something wrong with user input.
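
CreateStudio has many required members, so a local pre-flight check can catch mistakes before the request is sent. The following is a sketch of such a check, built from the constraints stated above (required keys, at most 5 subnets, UserRole only with IAM Identity Center). The helper names are hypothetical, and the mapping of the SSO AuthMode value to IAM Identity Center is an assumption.

```php
<?php
// A sketch, assuming the AWS SDK for PHP 3.x; studioParamProblems() is plain PHP.
use Aws\Emr\EmrClient;

const STUDIO_REQUIRED_KEYS = [
    'AuthMode', 'DefaultS3Location', 'EngineSecurityGroupId', 'Name',
    'ServiceRole', 'SubnetIds', 'VpcId', 'WorkspaceSecurityGroupId',
];

/** Returns a list of human-readable problems; an empty list means none found. */
function studioParamProblems(array $params): array
{
    $problems = [];
    foreach (STUDIO_REQUIRED_KEYS as $key) {
        if (!array_key_exists($key, $params)) {
            $problems[] = "Missing required parameter: $key";
        }
    }
    if (count($params['SubnetIds'] ?? []) > 5) {
        $problems[] = 'A Studio can have a maximum of 5 subnets.';
    }
    // Assumption: AuthMode 'SSO' denotes IAM Identity Center authentication.
    if (isset($params['UserRole']) && ($params['AuthMode'] ?? '') !== 'SSO') {
        $problems[] = 'UserRole applies only to IAM Identity Center authentication.';
    }
    return $problems;
}

/** Validates locally, then creates the Studio and returns its ID and URL. */
function createStudioChecked(EmrClient $client, array $params): array
{
    $problems = studioParamProblems($params);
    if ($problems !== []) {
        throw new InvalidArgumentException(implode(' ', $problems));
    }
    $result = $client->createStudio($params);
    return ['StudioId' => $result['StudioId'], 'Url' => $result['Url']];
}
```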

CreateStudioSessionMapping

$result = $client->createStudioSessionMapping([/* ... */]);
$promise = $client->createStudioSessionMappingAsync([/* ... */]);

Maps a user or group to the Amazon EMR Studio specified by StudioId, and applies a session policy to refine Studio permissions for that user or group. Use CreateStudioSessionMapping to assign users to a Studio when you use IAM Identity Center authentication. For instructions on how to assign users to a Studio when you use IAM authentication, see Assign a user or group to your EMR Studio.

Parameter Syntax

$result = $client->createStudioSessionMapping([
    'IdentityId' => '<string>',
    'IdentityName' => '<string>',
    'IdentityType' => 'USER|GROUP', // REQUIRED
    'SessionPolicyArn' => '<string>', // REQUIRED
    'StudioId' => '<string>', // REQUIRED
]);

Parameter Details

Members
IdentityId
Type: string

The globally unique identifier (GUID) of the user or group from the IAM Identity Center Identity Store. For more information, see UserId and GroupId in the IAM Identity Center Identity Store API Reference. Either IdentityName or IdentityId must be specified, but not both.

IdentityName
Type: string

The name of the user or group. For more information, see UserName and DisplayName in the IAM Identity Center Identity Store API Reference. Either IdentityName or IdentityId must be specified, but not both.

IdentityType
Required: Yes
Type: string

Specifies whether the identity to map to the Amazon EMR Studio is a user or a group.

SessionPolicyArn
Required: Yes
Type: string

The Amazon Resource Name (ARN) for the session policy that will be applied to the user or group. You should specify the ARN for the session policy that you want to apply, not the ARN of your user role. For more information, see Create an Amazon EMR Studio User Role with Session Policies.

StudioId
Required: Yes
Type: string

The ID of the Amazon EMR Studio to which the user or group will be mapped.
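
The "either IdentityName or IdentityId, but not both" rule above can be checked before a request is sent. The following is a minimal sketch, not part of the SDK: the function name buildSessionMappingParams and every value passed to it are hypothetical.

```php
<?php
// Hypothetical helper (not an SDK function): builds the parameter array for
// createStudioSessionMapping and rejects input that supplies both, or
// neither, of IdentityId and IdentityName.
function buildSessionMappingParams(
    string $studioId,
    string $identityType,
    string $sessionPolicyArn,
    ?string $identityId = null,
    ?string $identityName = null
): array {
    if (!in_array($identityType, ['USER', 'GROUP'], true)) {
        throw new InvalidArgumentException('IdentityType must be USER or GROUP.');
    }
    // True when both identifiers are null or both are set; either case is invalid.
    if (($identityId === null) === ($identityName === null)) {
        throw new InvalidArgumentException('Specify either IdentityId or IdentityName, but not both.');
    }

    $params = [
        'IdentityType' => $identityType,
        'SessionPolicyArn' => $sessionPolicyArn,
        'StudioId' => $studioId,
    ];
    if ($identityId !== null) {
        $params['IdentityId'] = $identityId;
    } else {
        $params['IdentityName'] = $identityName;
    }
    return $params;
}
```

The returned array can be passed directly to $client->createStudioSessionMapping(). The service still performs its own validation, so this check only catches the mistake earlier.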

Result Syntax

[]

Result Details

The results for this operation are always empty.

Errors

InternalServerError:

Indicates that an error occurred while processing the request and that the request was not completed.

InvalidRequestException:

This exception occurs when there is something wrong with user input.

DeleteSecurityConfiguration

$result = $client->deleteSecurityConfiguration([/* ... */]);
$promise = $client->deleteSecurityConfigurationAsync([/* ... */]);

Deletes a security configuration.

Parameter Syntax

$result = $client->deleteSecurityConfiguration([
    'Name' => '<string>', // REQUIRED
]);

Parameter Details

Members
Name
Required: Yes
Type: string

The name of the security configuration.

Result Syntax

[]

Result Details

The results for this operation are always empty.

Errors

InternalServerException:

This exception occurs when there is an internal failure in the Amazon EMR service.

InvalidRequestException:

This exception occurs when there is something wrong with user input.

DeleteStudio

$result = $client->deleteStudio([/* ... */]);
$promise = $client->deleteStudioAsync([/* ... */]);

Removes an Amazon EMR Studio from the Studio metadata store.

Parameter Syntax

$result = $client->deleteStudio([
    'StudioId' => '<string>', // REQUIRED
]);

Parameter Details

Members
StudioId
Required: Yes
Type: string

The ID of the Amazon EMR Studio.

Result Syntax

[]

Result Details

The results for this operation are always empty.

Errors

InternalServerException:

This exception occurs when there is an internal failure in the Amazon EMR service.

InvalidRequestException:

This exception occurs when there is something wrong with user input.

DeleteStudioSessionMapping

$result = $client->deleteStudioSessionMapping([/* ... */]);
$promise = $client->deleteStudioSessionMappingAsync([/* ... */]);

Removes a user or group from an Amazon EMR Studio.

Parameter Syntax

$result = $client->deleteStudioSessionMapping([
    'IdentityId' => '<string>',
    'IdentityName' => '<string>',
    'IdentityType' => 'USER|GROUP', // REQUIRED
    'StudioId' => '<string>', // REQUIRED
]);

Parameter Details

Members
IdentityId
Type: string

The globally unique identifier (GUID) of the user or group to remove from the Amazon EMR Studio. For more information, see UserId and GroupId in the IAM Identity Center Identity Store API Reference. Either IdentityName or IdentityId must be specified.

IdentityName
Type: string

The name of the user or group to remove from the Amazon EMR Studio. For more information, see UserName and DisplayName in the IAM Identity Center Identity Store API Reference. Either IdentityName or IdentityId must be specified.

IdentityType
Required: Yes
Type: string

Specifies whether the identity to delete from the Amazon EMR Studio is a user or a group.

StudioId
Required: Yes
Type: string

The ID of the Amazon EMR Studio.

Result Syntax

[]

Result Details

The results for this operation are always empty.

Errors

InternalServerError:

Indicates that an error occurred while processing the request and that the request was not completed.

InvalidRequestException:

This exception occurs when there is something wrong with user input.

DescribeCluster

$result = $client->describeCluster([/* ... */]);
$promise = $client->describeClusterAsync([/* ... */]);

Provides cluster-level details including status, hardware and software configuration, VPC settings, and so on.

Parameter Syntax

$result = $client->describeCluster([
    'ClusterId' => '<string>', // REQUIRED
]);

Parameter Details

Members
ClusterId
Required: Yes
Type: string

The identifier of the cluster to describe.

Result Syntax

[
    'Cluster' => [
        'Applications' => [
            [
                'AdditionalInfo' => ['<string>', ...],
                'Args' => ['<string>', ...],
                'Name' => '<string>',
                'Version' => '<string>',
            ],
            // ...
        ],
        'AutoScalingRole' => '<string>',
        'AutoTerminate' => true || false,
        'ClusterArn' => '<string>',
        'Configurations' => [
            [
                'Classification' => '<string>',
                'Configurations' => [...], // RECURSIVE
                'Properties' => ['<string>', ...],
            ],
            // ...
        ],
        'CustomAmiId' => '<string>',
        'EbsRootVolumeIops' => <integer>,
        'EbsRootVolumeSize' => <integer>,
        'EbsRootVolumeThroughput' => <integer>,
        'Ec2InstanceAttributes' => [
            'AdditionalMasterSecurityGroups' => ['<string>', ...],
            'AdditionalSlaveSecurityGroups' => ['<string>', ...],
            'Ec2AvailabilityZone' => '<string>',
            'Ec2KeyName' => '<string>',
            'Ec2SubnetId' => '<string>',
            'EmrManagedMasterSecurityGroup' => '<string>',
            'EmrManagedSlaveSecurityGroup' => '<string>',
            'IamInstanceProfile' => '<string>',
            'RequestedEc2AvailabilityZones' => ['<string>', ...],
            'RequestedEc2SubnetIds' => ['<string>', ...],
            'ServiceAccessSecurityGroup' => '<string>',
        ],
        'Id' => '<string>',
        'InstanceCollectionType' => 'INSTANCE_FLEET|INSTANCE_GROUP',
        'KerberosAttributes' => [
            'ADDomainJoinPassword' => '<string>',
            'ADDomainJoinUser' => '<string>',
            'CrossRealmTrustPrincipalPassword' => '<string>',
            'KdcAdminPassword' => '<string>',
            'Realm' => '<string>',
        ],
        'LogEncryptionKmsKeyId' => '<string>',
        'LogUri' => '<string>',
        'MasterPublicDnsName' => '<string>',
        'Name' => '<string>',
        'NormalizedInstanceHours' => <integer>,
        'OSReleaseLabel' => '<string>',
        'OutpostArn' => '<string>',
        'PlacementGroups' => [
            [
                'InstanceRole' => 'MASTER|CORE|TASK',
                'PlacementStrategy' => 'SPREAD|PARTITION|CLUSTER|NONE',
            ],
            // ...
        ],
        'ReleaseLabel' => '<string>',
        'RepoUpgradeOnBoot' => 'SECURITY|NONE',
        'RequestedAmiVersion' => '<string>',
        'RunningAmiVersion' => '<string>',
        'ScaleDownBehavior' => 'TERMINATE_AT_INSTANCE_HOUR|TERMINATE_AT_TASK_COMPLETION',
        'SecurityConfiguration' => '<string>',
        'ServiceRole' => '<string>',
        'Status' => [
            'ErrorDetails' => [
                [
                    'ErrorCode' => '<string>',
                    'ErrorData' => [
                        ['<string>', ...],
                        // ...
                    ],
                    'ErrorMessage' => '<string>',
                ],
                // ...
            ],
            'State' => 'STARTING|BOOTSTRAPPING|RUNNING|WAITING|TERMINATING|TERMINATED|TERMINATED_WITH_ERRORS',
            'StateChangeReason' => [
                'Code' => 'INTERNAL_ERROR|VALIDATION_ERROR|INSTANCE_FAILURE|INSTANCE_FLEET_TIMEOUT|BOOTSTRAP_FAILURE|USER_REQUEST|STEP_FAILURE|ALL_STEPS_COMPLETED',
                'Message' => '<string>',
            ],
            'Timeline' => [
                'CreationDateTime' => <DateTime>,
                'EndDateTime' => <DateTime>,
                'ReadyDateTime' => <DateTime>,
            ],
        ],
        'StepConcurrencyLevel' => <integer>,
        'Tags' => [
            [
                'Key' => '<string>',
                'Value' => '<string>',
            ],
            // ...
        ],
        'TerminationProtected' => true || false,
        'UnhealthyNodeReplacement' => true || false,
        'VisibleToAllUsers' => true || false,
    ],
]

Result Details

Members
Cluster
Type: Cluster structure

This output contains the details for the requested cluster.

Errors

InternalServerException:

This exception occurs when there is an internal failure in the Amazon EMR service.

InvalidRequestException:

This exception occurs when there is something wrong with user input.
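
As an illustration of reading the nested Status fields from the result syntax above, the following sketch condenses a Cluster structure into a few values. The helper name summarizeClusterStatus and the 'UNKNOWN' fallback are assumptions for this example, not SDK behavior.

```php
<?php
// Hypothetical helper: condenses the 'Cluster' structure of a describeCluster
// result. Every key is read defensively because fields may be absent.
function summarizeClusterStatus(array $cluster): array
{
    // 'UNKNOWN' is a placeholder used by this sketch only.
    $state = $cluster['Status']['State'] ?? 'UNKNOWN';

    return [
        'id' => $cluster['Id'] ?? null,
        'state' => $state,
        // The two states after which the cluster no longer exists.
        'ended' => in_array($state, ['TERMINATED', 'TERMINATED_WITH_ERRORS'], true),
        'reason' => $cluster['Status']['StateChangeReason']['Message'] ?? null,
    ];
}
```

With a client, the input would be the 'Cluster' member of the result, for example $client->describeCluster(['ClusterId' => $id])['Cluster'], where $id is your cluster identifier.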

DescribeJobFlows

$result = $client->describeJobFlows([/* ... */]);
$promise = $client->describeJobFlowsAsync([/* ... */]);

This API is no longer supported and will eventually be removed. We recommend you use ListClusters, DescribeCluster, ListSteps, ListInstanceGroups and ListBootstrapActions instead.

DescribeJobFlows returns a list of job flows that match all of the supplied parameters. The parameters can include a list of job flow IDs, job flow states, and restrictions on job flow creation date and time.

Regardless of supplied parameters, only job flows created within the last two months are returned.

If no parameters are supplied, then job flows matching either of the following criteria are returned:

  • Job flows created and completed in the last two weeks

  • Job flows created within the last two months that are in one of the following states: RUNNING, WAITING, SHUTTING_DOWN, STARTING

Amazon EMR can return a maximum of 512 job flow descriptions.

Parameter Syntax

$result = $client->describeJobFlows([
    'CreatedAfter' => <integer || string || DateTime>,
    'CreatedBefore' => <integer || string || DateTime>,
    'JobFlowIds' => ['<string>', ...],
    'JobFlowStates' => ['<string>', ...],
]);

Parameter Details

Members
CreatedAfter
Type: timestamp (string|DateTime or anything parsable by strtotime)

Return only job flows created after this date and time.

CreatedBefore
Type: timestamp (string|DateTime or anything parsable by strtotime)

Return only job flows created before this date and time.

JobFlowIds
Type: Array of strings

Return only job flows whose job flow ID is contained in this list.

JobFlowStates
Type: Array of strings

Return only job flows whose state is contained in this list.

Result Syntax

[
    'JobFlows' => [
        [
            'AmiVersion' => '<string>',
            'AutoScalingRole' => '<string>',
            'BootstrapActions' => [
                [
                    'BootstrapActionConfig' => [
                        'Name' => '<string>',
                        'ScriptBootstrapAction' => [
                            'Args' => ['<string>', ...],
                            'Path' => '<string>',
                        ],
                    ],
                ],
                // ...
            ],
            'ExecutionStatusDetail' => [
                'CreationDateTime' => <DateTime>,
                'EndDateTime' => <DateTime>,
                'LastStateChangeReason' => '<string>',
                'ReadyDateTime' => <DateTime>,
                'StartDateTime' => <DateTime>,
                'State' => 'STARTING|BOOTSTRAPPING|RUNNING|WAITING|SHUTTING_DOWN|TERMINATED|COMPLETED|FAILED',
            ],
            'Instances' => [
                'Ec2KeyName' => '<string>',
                'Ec2SubnetId' => '<string>',
                'HadoopVersion' => '<string>',
                'InstanceCount' => <integer>,
                'InstanceGroups' => [
                    [
                        'BidPrice' => '<string>',
                        'CreationDateTime' => <DateTime>,
                        'CustomAmiId' => '<string>',
                        'EndDateTime' => <DateTime>,
                        'InstanceGroupId' => '<string>',
                        'InstanceRequestCount' => <integer>,
                        'InstanceRole' => 'MASTER|CORE|TASK',
                        'InstanceRunningCount' => <integer>,
                        'InstanceType' => '<string>',
                        'LastStateChangeReason' => '<string>',
                        'Market' => 'ON_DEMAND|SPOT',
                        'Name' => '<string>',
                        'ReadyDateTime' => <DateTime>,
                        'StartDateTime' => <DateTime>,
                        'State' => 'PROVISIONING|BOOTSTRAPPING|RUNNING|RECONFIGURING|RESIZING|SUSPENDED|TERMINATING|TERMINATED|ARRESTED|SHUTTING_DOWN|ENDED',
                    ],
                    // ...
                ],
                'KeepJobFlowAliveWhenNoSteps' => true || false,
                'MasterInstanceId' => '<string>',
                'MasterInstanceType' => '<string>',
                'MasterPublicDnsName' => '<string>',
                'NormalizedInstanceHours' => <integer>,
                'Placement' => [
                    'AvailabilityZone' => '<string>',
                    'AvailabilityZones' => ['<string>', ...],
                ],
                'SlaveInstanceType' => '<string>',
                'TerminationProtected' => true || false,
                'UnhealthyNodeReplacement' => true || false,
            ],
            'JobFlowId' => '<string>',
            'JobFlowRole' => '<string>',
            'LogEncryptionKmsKeyId' => '<string>',
            'LogUri' => '<string>',
            'Name' => '<string>',
            'ScaleDownBehavior' => 'TERMINATE_AT_INSTANCE_HOUR|TERMINATE_AT_TASK_COMPLETION',
            'ServiceRole' => '<string>',
            'Steps' => [
                [
                    'ExecutionStatusDetail' => [
                        'CreationDateTime' => <DateTime>,
                        'EndDateTime' => <DateTime>,
                        'LastStateChangeReason' => '<string>',
                        'StartDateTime' => <DateTime>,
                        'State' => 'PENDING|RUNNING|CONTINUE|COMPLETED|CANCELLED|FAILED|INTERRUPTED',
                    ],
                    'StepConfig' => [
                        'ActionOnFailure' => 'TERMINATE_JOB_FLOW|TERMINATE_CLUSTER|CANCEL_AND_WAIT|CONTINUE',
                        'HadoopJarStep' => [
                            'Args' => ['<string>', ...],
                            'Jar' => '<string>',
                            'MainClass' => '<string>',
                            'Properties' => [
                                [
                                    'Key' => '<string>',
                                    'Value' => '<string>',
                                ],
                                // ...
                            ],
                        ],
                        'Name' => '<string>',
                    ],
                ],
                // ...
            ],
            'SupportedProducts' => ['<string>', ...],
            'VisibleToAllUsers' => true || false,
        ],
        // ...
    ],
]

Result Details

Members
JobFlows
Type: Array of JobFlowDetail structures

A list of job flows matching the parameters supplied.

Errors

InternalServerError:

Indicates that an error occurred while processing the request and that the request was not completed.

DescribeNotebookExecution

$result = $client->describeNotebookExecution([/* ... */]);
$promise = $client->describeNotebookExecutionAsync([/* ... */]);

Provides details of a notebook execution.

Parameter Syntax

$result = $client->describeNotebookExecution([
    'NotebookExecutionId' => '<string>', // REQUIRED
]);

Parameter Details

Members
NotebookExecutionId
Required: Yes
Type: string

The unique identifier of the notebook execution.

Result Syntax

[
    'NotebookExecution' => [
        'Arn' => '<string>',
        'EditorId' => '<string>',
        'EndTime' => <DateTime>,
        'EnvironmentVariables' => ['<string>', ...],
        'ExecutionEngine' => [
            'ExecutionRoleArn' => '<string>',
            'Id' => '<string>',
            'MasterInstanceSecurityGroupId' => '<string>',
            'Type' => 'EMR',
        ],
        'LastStateChangeReason' => '<string>',
        'NotebookExecutionId' => '<string>',
        'NotebookExecutionName' => '<string>',
        'NotebookInstanceSecurityGroupId' => '<string>',
        'NotebookParams' => '<string>',
        'NotebookS3Location' => [
            'Bucket' => '<string>',
            'Key' => '<string>',
        ],
        'OutputNotebookFormat' => 'HTML',
        'OutputNotebookS3Location' => [
            'Bucket' => '<string>',
            'Key' => '<string>',
        ],
        'OutputNotebookURI' => '<string>',
        'StartTime' => <DateTime>,
        'Status' => 'START_PENDING|STARTING|RUNNING|FINISHING|FINISHED|FAILING|FAILED|STOP_PENDING|STOPPING|STOPPED',
        'Tags' => [
            [
                'Key' => '<string>',
                'Value' => '<string>',
            ],
            // ...
        ],
    ],
]

Result Details

Members
NotebookExecution
Type: NotebookExecution structure

Properties of the notebook execution.

Errors

InternalServerError:

Indicates that an error occurred while processing the request and that the request was not completed.

InvalidRequestException:

This exception occurs when there is something wrong with user input.
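
When polling a notebook execution, the Status value decides whether to keep waiting. The sketch below treats FINISHED, FAILED, and STOPPED as final; that grouping is an assumption based on the status names in the result syntax, and the helper name is hypothetical.

```php
<?php
// Hypothetical helper: reports whether a NotebookExecution structure has
// reached a status that this sketch assumes to be final.
function notebookExecutionFinished(array $execution): bool
{
    $finalStatuses = ['FINISHED', 'FAILED', 'STOPPED'];
    return in_array($execution['Status'] ?? '', $finalStatuses, true);
}
```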

DescribeReleaseLabel

$result = $client->describeReleaseLabel([/* ... */]);
$promise = $client->describeReleaseLabelAsync([/* ... */]);

Provides Amazon EMR release label details, such as the releases available in the Region where the API request is run, and the available applications for a specific Amazon EMR release label. Can also list Amazon EMR releases that support a specified version of Spark.

Parameter Syntax

$result = $client->describeReleaseLabel([
    'MaxResults' => <integer>,
    'NextToken' => '<string>',
    'ReleaseLabel' => '<string>',
]);

Parameter Details

Members
MaxResults
Type: int

Reserved for future use. Currently set to null.

NextToken
Type: string

The pagination token. Reserved for future use. Currently set to null.

ReleaseLabel
Type: string

The target release label to be described.

Result Syntax

[
    'Applications' => [
        [
            'Name' => '<string>',
            'Version' => '<string>',
        ],
        // ...
    ],
    'AvailableOSReleases' => [
        [
            'Label' => '<string>',
        ],
        // ...
    ],
    'NextToken' => '<string>',
    'ReleaseLabel' => '<string>',
]

Result Details

Members
Applications
Type: Array of SimplifiedApplication structures

The list of applications available for the target release label. Name is the name of the application. Version is the concise version of the application.

AvailableOSReleases
Type: Array of OSRelease structures

The list of available Amazon Linux release versions for an Amazon EMR release. Contains a Label field that is formatted as shown in Amazon Linux 2 Release Notes. For example, 2.0.20220218.1.

NextToken
Type: string

The pagination token. Reserved for future use. Currently set to null.

ReleaseLabel
Type: string

The target release label described in the response.

Errors

InternalServerException:

This exception occurs when there is an internal failure in the Amazon EMR service.

InvalidRequestException:

This exception occurs when there is something wrong with user input.
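
The Applications list is often easier to query as a name-to-version map. A minimal sketch follows; the function name is hypothetical and the application data in the usage note is illustrative, not a real release listing.

```php
<?php
// Hypothetical helper: turns the 'Applications' list of a describeReleaseLabel
// result into an array keyed by application name.
function applicationVersions(array $applications): array
{
    // array_column() builds ['<Name>' => '<Version>', ...] in one pass.
    return array_column($applications, 'Version', 'Name');
}
```

For example, applicationVersions([['Name' => 'ExampleApp', 'Version' => '1.2.3']]) yields ['ExampleApp' => '1.2.3'].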

DescribeSecurityConfiguration

$result = $client->describeSecurityConfiguration([/* ... */]);
$promise = $client->describeSecurityConfigurationAsync([/* ... */]);

Provides the details of a security configuration by returning the configuration JSON.

Parameter Syntax

$result = $client->describeSecurityConfiguration([
    'Name' => '<string>', // REQUIRED
]);

Parameter Details

Members
Name
Required: Yes
Type: string

The name of the security configuration.

Result Syntax

[
    'CreationDateTime' => <DateTime>,
    'Name' => '<string>',
    'SecurityConfiguration' => '<string>',
]

Result Details

Members
CreationDateTime
Type: timestamp (string|DateTime or anything parsable by strtotime)

The date and time the security configuration was created.

Name
Type: string

The name of the security configuration.

SecurityConfiguration
Type: string

The security configuration details in JSON format.

Errors

InternalServerException:

This exception occurs when there is an internal failure in the Amazon EMR service.

InvalidRequestException:

This exception occurs when there is something wrong with user input.
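
Because SecurityConfiguration is returned as a JSON string, it has to be decoded before individual settings can be inspected. The following sketch uses PHP's json_decode; the helper name is hypothetical.

```php
<?php
// Hypothetical helper: decodes the 'SecurityConfiguration' string of a
// describeSecurityConfiguration result into a nested PHP array.
function decodeSecurityConfiguration(string $json): array
{
    $decoded = json_decode($json, true);
    if (!is_array($decoded)) {
        // json_last_error_msg() describes why decoding failed, if it did.
        throw new UnexpectedValueException(
            'SecurityConfiguration could not be decoded: ' . json_last_error_msg()
        );
    }
    return $decoded;
}
```

With a client, the input would be $result['SecurityConfiguration'] from the describeSecurityConfiguration call.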

DescribeStep

$result = $client->describeStep([/* ... */]);
$promise = $client->describeStepAsync([/* ... */]);

Provides details about the specified cluster step.

Parameter Syntax

$result = $client->describeStep([
    'ClusterId' => '<string>', // REQUIRED
    'StepId' => '<string>', // REQUIRED
]);

Parameter Details

Members
ClusterId
Required: Yes
Type: string

The identifier of the cluster with steps to describe.

StepId
Required: Yes
Type: string

The identifier of the step to describe.

Result Syntax

[
    'Step' => [
        'ActionOnFailure' => 'TERMINATE_JOB_FLOW|TERMINATE_CLUSTER|CANCEL_AND_WAIT|CONTINUE',
        'Config' => [
            'Args' => ['<string>', ...],
            'Jar' => '<string>',
            'MainClass' => '<string>',
            'Properties' => ['<string>', ...],
        ],
        'ExecutionRoleArn' => '<string>',
        'Id' => '<string>',
        'Name' => '<string>',
        'Status' => [
            'FailureDetails' => [
                'LogFile' => '<string>',
                'Message' => '<string>',
                'Reason' => '<string>',
            ],
            'State' => 'PENDING|CANCEL_PENDING|RUNNING|COMPLETED|CANCELLED|FAILED|INTERRUPTED',
            'StateChangeReason' => [
                'Code' => 'NONE',
                'Message' => '<string>',
            ],
            'Timeline' => [
                'CreationDateTime' => <DateTime>,
                'EndDateTime' => <DateTime>,
                'StartDateTime' => <DateTime>,
            ],
        ],
    ],
]

Result Details

Members
Step
Type: Step structure

The step details for the requested step identifier.

Errors

InternalServerException:

This exception occurs when there is an internal failure in the Amazon EMR service.

InvalidRequestException:

This exception occurs when there is something wrong with user input.
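
The Status member carries both the step state and, for failed steps, FailureDetails. The sketch below formats these into one line for logging; the helper name, the output format, and the '?' and 'n/a' fallbacks are assumptions made for this example.

```php
<?php
// Hypothetical helper: renders the 'Step' structure of a describeStep result
// as a single human-readable line.
function describeStepOutcome(array $step): string
{
    $state = $step['Status']['State'] ?? 'UNKNOWN';
    $line = sprintf('%s (%s): %s', $step['Name'] ?? '?', $step['Id'] ?? '?', $state);

    // FailureDetails is only meaningful when the step has failed.
    if ($state === 'FAILED' && isset($step['Status']['FailureDetails'])) {
        $failure = $step['Status']['FailureDetails'];
        $line .= sprintf(
            ' - %s; log: %s',
            $failure['Message'] ?? 'no message',
            $failure['LogFile'] ?? 'n/a'
        );
    }
    return $line;
}
```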

DescribeStudio

$result = $client->describeStudio([/* ... */]);
$promise = $client->describeStudioAsync([/* ... */]);

Returns details for the specified Amazon EMR Studio including ID, Name, VPC, Studio access URL, and so on.

Parameter Syntax

$result = $client->describeStudio([
    'StudioId' => '<string>', // REQUIRED
]);

Parameter Details

Members
StudioId
Required: Yes
Type: string

The Amazon EMR Studio ID.

Result Syntax

[
    'Studio' => [
        'AuthMode' => 'SSO|IAM',
        'CreationTime' => <DateTime>,
        'DefaultS3Location' => '<string>',
        'Description' => '<string>',
        'EncryptionKeyArn' => '<string>',
        'EngineSecurityGroupId' => '<string>',
        'IdcInstanceArn' => '<string>',
        'IdcUserAssignment' => 'REQUIRED|OPTIONAL',
        'IdpAuthUrl' => '<string>',
        'IdpRelayStateParameterName' => '<string>',
        'Name' => '<string>',
        'ServiceRole' => '<string>',
        'StudioArn' => '<string>',
        'StudioId' => '<string>',
        'SubnetIds' => ['<string>', ...],
        'Tags' => [
            [
                'Key' => '<string>',
                'Value' => '<string>',
            ],
            // ...
        ],
        'TrustedIdentityPropagationEnabled' => true || false,
        'Url' => '<string>',
        'UserRole' => '<string>',
        'VpcId' => '<string>',
        'WorkspaceSecurityGroupId' => '<string>',
    ],
]

Result Details

Members
Studio
Type: Studio structure

The Amazon EMR Studio details.

Errors

InternalServerException:

This exception occurs when there is an internal failure in the Amazon EMR service.

InvalidRequestException:

This exception occurs when there is something wrong with user input.

GetAutoTerminationPolicy

$result = $client->getAutoTerminationPolicy([/* ... */]);
$promise = $client->getAutoTerminationPolicyAsync([/* ... */]);

Returns the auto-termination policy for an Amazon EMR cluster.

Parameter Syntax

$result = $client->getAutoTerminationPolicy([
    'ClusterId' => '<string>', // REQUIRED
]);

Parameter Details

Members
ClusterId
Required: Yes
Type: string

Specifies the ID of the Amazon EMR cluster for which the auto-termination policy will be fetched.

Result Syntax

[
    'AutoTerminationPolicy' => [
        'IdleTimeout' => <integer>,
    ],
]

Result Details

Members
AutoTerminationPolicy
Type: AutoTerminationPolicy structure

Specifies the auto-termination policy that is attached to an Amazon EMR cluster.

Errors

There are no errors described for this operation.

GetBlockPublicAccessConfiguration

$result = $client->getBlockPublicAccessConfiguration([/* ... */]);
$promise = $client->getBlockPublicAccessConfigurationAsync([/* ... */]);

Returns the Amazon EMR block public access configuration for your Amazon Web Services account in the current Region. For more information, see Configure Block Public Access for Amazon EMR in the Amazon EMR Management Guide.

Parameter Syntax

$result = $client->getBlockPublicAccessConfiguration([
]);

Parameter Details

This operation has no input members.

Result Syntax

[
    'BlockPublicAccessConfiguration' => [
        'BlockPublicSecurityGroupRules' => true || false,
        'PermittedPublicSecurityGroupRuleRanges' => [
            [
                'MaxRange' => <integer>,
                'MinRange' => <integer>,
            ],
            // ...
        ],
    ],
    'BlockPublicAccessConfigurationMetadata' => [
        'CreatedByArn' => '<string>',
        'CreationDateTime' => <DateTime>,
    ],
]

Result Details

Members
BlockPublicAccessConfiguration
Required: Yes
Type: BlockPublicAccessConfiguration structure

A configuration for Amazon EMR block public access. The configuration applies to all clusters created in your account for the current Region. The configuration specifies whether block public access is enabled. If block public access is enabled, security groups associated with the cluster cannot have rules that allow inbound traffic from 0.0.0.0/0 or ::/0 on a port, unless the port is specified as an exception using PermittedPublicSecurityGroupRuleRanges in the BlockPublicAccessConfiguration. By default, Port 22 (SSH) is an exception, and public access is allowed on this port. You can change this by updating the block public access configuration to remove the exception.

For accounts that created clusters in a Region before November 25, 2019, block public access is disabled by default in that Region. To use this feature, you must manually enable and configure it. For accounts that did not create an Amazon EMR cluster in a Region before this date, block public access is enabled by default in that Region.

BlockPublicAccessConfigurationMetadata
Required: Yes
Type: BlockPublicAccessConfigurationMetadata structure

Properties that describe the Amazon Web Services principal that created the BlockPublicAccessConfiguration using the PutBlockPublicAccessConfiguration action, as well as the date and time that the configuration was created. Each time a configuration for block public access is updated, Amazon EMR updates this metadata.

Errors

InternalServerException:

This exception occurs when there is an internal failure in the Amazon EMR service.

InvalidRequestException:

This exception occurs when there is something wrong with user input.
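
The exception rule described for BlockPublicAccessConfiguration can be expressed as a port check. This is a sketch under stated assumptions: the helper name is hypothetical, and a range without MaxRange is treated as a single port, which the text above does not specify.

```php
<?php
// Hypothetical helper: given the 'BlockPublicAccessConfiguration' structure,
// reports whether a public inbound rule on $port would be permitted.
function isPublicPortPermitted(array $config, int $port): bool
{
    // When block public access is not enabled, no port is restricted.
    if (empty($config['BlockPublicSecurityGroupRules'])) {
        return true;
    }
    foreach ($config['PermittedPublicSecurityGroupRuleRanges'] ?? [] as $range) {
        $min = $range['MinRange'];
        // Assumption: a missing MaxRange means the range covers one port.
        $max = $range['MaxRange'] ?? $min;
        if ($port >= $min && $port <= $max) {
            return true;
        }
    }
    return false;
}
```

With block public access enabled and a single permitted range of 22 to 22, the check returns true for port 22 and false for port 80.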

GetClusterSessionCredentials

$result = $client->getClusterSessionCredentials([/* ... */]);
$promise = $client->getClusterSessionCredentialsAsync([/* ... */]);

Provides temporary, HTTP basic credentials that are associated with a given runtime IAM role and used by a cluster with fine-grained access control activated. You can use these credentials to connect to cluster endpoints that support username and password authentication.

Parameter Syntax

$result = $client->getClusterSessionCredentials([
    'ClusterId' => '<string>', // REQUIRED
    'ExecutionRoleArn' => '<string>',
]);

Parameter Details

Members
ClusterId
Required: Yes
Type: string

The unique identifier of the cluster.

ExecutionRoleArn
Type: string

The Amazon Resource Name (ARN) of the runtime role for interactive workload submission on the cluster. The runtime role can be a cross-account IAM role. The runtime role ARN is a combination of account ID, role name, and role type using the following format: arn:partition:service:region:account:resource.

Result Syntax

[
    'Credentials' => [
        'UsernamePassword' => [
            'Password' => '<string>',
            'Username' => '<string>',
        ],
    ],
    'ExpiresAt' => <DateTime>,
]

Result Details

Members
Credentials
Type: Credentials structure

The credentials that you can use to connect to cluster endpoints that support username and password authentication.

ExpiresAt
Type: timestamp (string|DateTime or anything parsable by strtotime)

The time when the credentials that are returned by the GetClusterSessionCredentials API expire.

Errors

InternalServerError:

Indicates that an error occurred while processing the request and that the request was not completed.

InvalidRequestException:

This exception occurs when there is something wrong with user input.
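
Since the returned credentials are HTTP basic credentials with an expiry time, a caller typically builds an Authorization header and checks ExpiresAt before reuse. The following is a minimal sketch; both function names are hypothetical, and how a given cluster endpoint expects the header is outside the scope of this page.

```php
<?php
// Hypothetical helper: builds an HTTP Basic Authorization header value from
// the 'Credentials' structure of a getClusterSessionCredentials result.
function basicAuthHeader(array $credentials): string
{
    $pair = $credentials['UsernamePassword'];
    return 'Basic ' . base64_encode($pair['Username'] . ':' . $pair['Password']);
}

// Hypothetical helper: reports whether the credentials have expired.
// $now can be injected for testing; it defaults to the current time.
function credentialsExpired(DateTimeInterface $expiresAt, ?DateTimeInterface $now = null): bool
{
    $now = $now ?? new DateTimeImmutable();
    return $now >= $expiresAt;
}
```

With a client, the inputs would be $result['Credentials'] and $result['ExpiresAt'].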

GetManagedScalingPolicy

$result = $client->getManagedScalingPolicy([/* ... */]);
$promise = $client->getManagedScalingPolicyAsync([/* ... */]);

Fetches the attached managed scaling policy for an Amazon EMR cluster.

Parameter Syntax

$result = $client->getManagedScalingPolicy([
    'ClusterId' => '<string>', // REQUIRED
]);

Parameter Details

Members
ClusterId
Required: Yes
Type: string

Specifies the ID of the cluster for which the managed scaling policy will be fetched.

Result Syntax

[
    'ManagedScalingPolicy' => [
        'ComputeLimits' => [
            'MaximumCapacityUnits' => <integer>,
            'MaximumCoreCapacityUnits' => <integer>,
            'MaximumOnDemandCapacityUnits' => <integer>,
            'MinimumCapacityUnits' => <integer>,
            'UnitType' => 'InstanceFleetUnits|Instances|VCPU',
        ],
        'ScalingStrategy' => 'DEFAULT|ADVANCED',
        'UtilizationPerformanceIndex' => <integer>,
    ],
]

Result Details

Members
ManagedScalingPolicy
Type: ManagedScalingPolicy structure

Specifies the managed scaling policy that is attached to an Amazon EMR cluster.

Errors

There are no errors described for this operation.

GetStudioSessionMapping

$result = $client->getStudioSessionMapping([/* ... */]);
$promise = $client->getStudioSessionMappingAsync([/* ... */]);

Fetches mapping details for the specified Amazon EMR Studio and identity (user or group).

Parameter Syntax

$result = $client->getStudioSessionMapping([
    'IdentityId' => '<string>',
    'IdentityName' => '<string>',
    'IdentityType' => 'USER|GROUP', // REQUIRED
    'StudioId' => '<string>', // REQUIRED
]);

Parameter Details

Members
IdentityId
Type: string

The globally unique identifier (GUID) of the user or group. For more information, see UserId and GroupId in the IAM Identity Center Identity Store API Reference. Either IdentityName or IdentityId must be specified.

IdentityName
Type: string

The name of the user or group to fetch. For more information, see UserName and DisplayName in the IAM Identity Center Identity Store API Reference. Either IdentityName or IdentityId must be specified.

IdentityType
Required: Yes
Type: string

Specifies whether the identity to fetch is a user or a group.

StudioId
Required: Yes
Type: string

The ID of the Amazon EMR Studio.

Result Syntax

[
    'SessionMapping' => [
        'CreationTime' => <DateTime>,
        'IdentityId' => '<string>',
        'IdentityName' => '<string>',
        'IdentityType' => 'USER|GROUP',
        'LastModifiedTime' => <DateTime>,
        'SessionPolicyArn' => '<string>',
        'StudioId' => '<string>',
    ],
]

Result Details

Members
SessionMapping
Type: SessionMappingDetail structure

The session mapping details for the specified Amazon EMR Studio and identity, including session policy ARN and creation time.

Errors

InternalServerError:

Indicates that an error occurred while processing the request and that the request was not completed.

InvalidRequestException:

This exception occurs when there is something wrong with user input.
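
Because either IdentityName or IdentityId must be specified, it can help to validate the request locally before sending it. The following is a sketch under that reading; buildSessionMappingRequest is a hypothetical helper, and it only rejects the case where neither value is given, since this page does not say whether supplying both is accepted.

```php
<?php
/**
 * Builds the parameter array for getStudioSessionMapping().
 * Hypothetical helper, not part of the SDK.
 */
function buildSessionMappingRequest(
    string $studioId,
    string $identityType,
    ?string $identityId = null,
    ?string $identityName = null
): array {
    if (!in_array($identityType, ['USER', 'GROUP'], true)) {
        throw new InvalidArgumentException('IdentityType must be USER or GROUP.');
    }
    if ($identityId === null && $identityName === null) {
        throw new InvalidArgumentException('Specify either IdentityId or IdentityName.');
    }

    $params = ['StudioId' => $studioId, 'IdentityType' => $identityType];
    // Only include the identity fields that were actually supplied.
    if ($identityId !== null) {
        $params['IdentityId'] = $identityId;
    }
    if ($identityName !== null) {
        $params['IdentityName'] = $identityName;
    }

    return $params;
}
```

The returned array can be passed directly to $client->getStudioSessionMapping().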

ListBootstrapActions

$result = $client->listBootstrapActions([/* ... */]);
$promise = $client->listBootstrapActionsAsync([/* ... */]);

Provides information about the bootstrap actions associated with a cluster.

Parameter Syntax

$result = $client->listBootstrapActions([
    'ClusterId' => '<string>', // REQUIRED
    'Marker' => '<string>',
]);

Parameter Details

Members
ClusterId
Required: Yes
Type: string

The cluster identifier for the bootstrap actions to list.

Marker
Type: string

The pagination token that indicates the next set of results to retrieve.

Result Syntax

[
    'BootstrapActions' => [
        [
            'Args' => ['<string>', ...],
            'Name' => '<string>',
            'ScriptPath' => '<string>',
        ],
        // ...
    ],
    'Marker' => '<string>',
]

Result Details

Members
BootstrapActions
Type: Array of Command structures

The bootstrap actions associated with the cluster.

Marker
Type: string

The pagination token that indicates the next set of results to retrieve.

Errors

InternalServerException:

This exception occurs when there is an internal failure in the Amazon EMR service.

InvalidRequestException:

This exception occurs when there is something wrong with user input.

ListClusters

$result = $client->listClusters([/* ... */]);
$promise = $client->listClustersAsync([/* ... */]);

Provides the status of all clusters visible to this Amazon Web Services account. Allows you to filter the list of clusters based on certain criteria; for example, filtering by cluster creation date and time or by status. This call returns a maximum of 50 clusters in unsorted order per call, but returns a marker to track the paging of the cluster list across multiple ListClusters calls.

Parameter Syntax

$result = $client->listClusters([
    'ClusterStates' => ['<string>', ...],
    'CreatedAfter' => <integer || string || DateTime>,
    'CreatedBefore' => <integer || string || DateTime>,
    'Marker' => '<string>',
]);

Parameter Details

Members
ClusterStates
Type: Array of strings

The cluster state filters to apply when listing clusters. Clusters that change state while this action runs may not be returned as expected in the list of clusters.

CreatedAfter
Type: timestamp (string|DateTime or anything parsable by strtotime)

The creation date and time beginning value filter for listing clusters.

CreatedBefore
Type: timestamp (string|DateTime or anything parsable by strtotime)

The creation date and time end value filter for listing clusters.

Marker
Type: string

The pagination token that indicates the next set of results to retrieve.

Result Syntax

[
    'Clusters' => [
        [
            'ClusterArn' => '<string>',
            'Id' => '<string>',
            'Name' => '<string>',
            'NormalizedInstanceHours' => <integer>,
            'OutpostArn' => '<string>',
            'Status' => [
                'ErrorDetails' => [
                    [
                        'ErrorCode' => '<string>',
                        'ErrorData' => [
                            ['<string>', ...],
                            // ...
                        ],
                        'ErrorMessage' => '<string>',
                    ],
                    // ...
                ],
                'State' => 'STARTING|BOOTSTRAPPING|RUNNING|WAITING|TERMINATING|TERMINATED|TERMINATED_WITH_ERRORS',
                'StateChangeReason' => [
                    'Code' => 'INTERNAL_ERROR|VALIDATION_ERROR|INSTANCE_FAILURE|INSTANCE_FLEET_TIMEOUT|BOOTSTRAP_FAILURE|USER_REQUEST|STEP_FAILURE|ALL_STEPS_COMPLETED',
                    'Message' => '<string>',
                ],
                'Timeline' => [
                    'CreationDateTime' => <DateTime>,
                    'EndDateTime' => <DateTime>,
                    'ReadyDateTime' => <DateTime>,
                ],
            ],
        ],
        // ...
    ],
    'Marker' => '<string>',
]

Result Details

Members
Clusters
Type: Array of ClusterSummary structures

The list of clusters for the account based on the given filters.

Marker
Type: string

The pagination token that indicates the next set of results to retrieve.

Errors

InternalServerException:

This exception occurs when there is an internal failure in the Amazon EMR service.

InvalidRequestException:

This exception occurs when there is something wrong with user input.
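
Since each call returns at most 50 clusters plus a Marker, collecting the full list means looping until no Marker comes back. The sketch below shows one way to do that; listAllClusters is a hypothetical helper, and treating an absent or empty Marker as the end of the list is an assumption.

```php
<?php
/**
 * Collects every cluster that matches $filters by following Marker across
 * pages. $client is expected to be an Aws\Emr\EmrClient (any object with a
 * listClusters() method works). Hypothetical helper, not part of the SDK.
 */
function listAllClusters(object $client, array $filters = []): array
{
    $clusters = [];
    $marker = null;

    do {
        $params = $filters;
        if ($marker !== null) {
            $params['Marker'] = $marker;
        }

        $result = $client->listClusters($params);
        foreach ($result['Clusters'] ?? [] as $cluster) {
            $clusters[] = $cluster;
        }

        // Assumption: a missing or empty Marker means there are no more pages.
        $marker = $result['Marker'] ?? null;
    } while ($marker !== null && $marker !== '');

    return $clusters;
}
```

For example, listAllClusters($client, ['ClusterStates' => ['RUNNING', 'WAITING'], 'CreatedAfter' => '-7 days']) would gather recently created active clusters, relying on the timestamp parameter accepting strings parsable by strtotime.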

ListInstanceFleets

$result = $client->listInstanceFleets([/* ... */]);
$promise = $client->listInstanceFleetsAsync([/* ... */]);

Lists all available details about the instance fleets in a cluster.

The instance fleet configuration is available only in Amazon EMR releases 4.8.0 and later, excluding 5.0.x versions.

Parameter Syntax

$result = $client->listInstanceFleets([
    'ClusterId' => '<string>', // REQUIRED
    'Marker' => '<string>',
]);

Parameter Details

Members
ClusterId
Required: Yes
Type: string

The unique identifier of the cluster.

Marker
Type: string

The pagination token that indicates the next set of results to retrieve.

Result Syntax

[
    'InstanceFleets' => [
        [
            'Context' => '<string>',
            'Id' => '<string>',
            'InstanceFleetType' => 'MASTER|CORE|TASK',
            'InstanceTypeSpecifications' => [
                [
                    'BidPrice' => '<string>',
                    'BidPriceAsPercentageOfOnDemandPrice' => <float>,
                    'Configurations' => [
                        [
                            'Classification' => '<string>',
                            'Configurations' => [...], // RECURSIVE
                            'Properties' => ['<string>', ...],
                        ],
                        // ...
                    ],
                    'CustomAmiId' => '<string>',
                    'EbsBlockDevices' => [
                        [
                            'Device' => '<string>',
                            'VolumeSpecification' => [
                                'Iops' => <integer>,
                                'SizeInGB' => <integer>,
                                'Throughput' => <integer>,
                                'VolumeType' => '<string>',
                            ],
                        ],
                        // ...
                    ],
                    'EbsOptimized' => true || false,
                    'InstanceType' => '<string>',
                    'Priority' => <float>,
                    'WeightedCapacity' => <integer>,
                ],
                // ...
            ],
            'LaunchSpecifications' => [
                'OnDemandSpecification' => [
                    'AllocationStrategy' => 'lowest-price|prioritized',
                    'CapacityReservationOptions' => [
                        'CapacityReservationPreference' => 'open|none',
                        'CapacityReservationResourceGroupArn' => '<string>',
                        'UsageStrategy' => 'use-capacity-reservations-first',
                    ],
                ],
                'SpotSpecification' => [
                    'AllocationStrategy' => 'capacity-optimized|price-capacity-optimized|lowest-price|diversified|capacity-optimized-prioritized',
                    'BlockDurationMinutes' => <integer>,
                    'TimeoutAction' => 'SWITCH_TO_ON_DEMAND|TERMINATE_CLUSTER',
                    'TimeoutDurationMinutes' => <integer>,
                ],
            ],
            'Name' => '<string>',
            'ProvisionedOnDemandCapacity' => <integer>,
            'ProvisionedSpotCapacity' => <integer>,
            'ResizeSpecifications' => [
                'OnDemandResizeSpecification' => [
                    'AllocationStrategy' => 'lowest-price|prioritized',
                    'CapacityReservationOptions' => [
                        'CapacityReservationPreference' => 'open|none',
                        'CapacityReservationResourceGroupArn' => '<string>',
                        'UsageStrategy' => 'use-capacity-reservations-first',
                    ],
                    'TimeoutDurationMinutes' => <integer>,
                ],
                'SpotResizeSpecification' => [
                    'AllocationStrategy' => 'capacity-optimized|price-capacity-optimized|lowest-price|diversified|capacity-optimized-prioritized',
                    'TimeoutDurationMinutes' => <integer>,
                ],
            ],
            'Status' => [
                'State' => 'PROVISIONING|BOOTSTRAPPING|RUNNING|RESIZING|SUSPENDED|TERMINATING|TERMINATED',
                'StateChangeReason' => [
                    'Code' => 'INTERNAL_ERROR|VALIDATION_ERROR|INSTANCE_FAILURE|CLUSTER_TERMINATED',
                    'Message' => '<string>',
                ],
                'Timeline' => [
                    'CreationDateTime' => <DateTime>,
                    'EndDateTime' => <DateTime>,
                    'ReadyDateTime' => <DateTime>,
                ],
            ],
            'TargetOnDemandCapacity' => <integer>,
            'TargetSpotCapacity' => <integer>,
        ],
        // ...
    ],
    'Marker' => '<string>',
]

Result Details

Members
InstanceFleets
Type: Array of InstanceFleet structures

The list of instance fleets for the cluster and given filters.

Marker
Type: string

The pagination token that indicates the next set of results to retrieve.

Errors

InternalServerException:

This exception occurs when there is an internal failure in the Amazon EMR service.

InvalidRequestException:

This exception occurs when there is something wrong with user input.
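
The Target* and Provisioned* capacity fields in the result can be compared to see how far each fleet is from its target. The following is a rough sketch; summarizeFleetCapacity and its output keys are hypothetical, and clamping the difference at zero is a design choice for cases where provisioned capacity exceeds the target.

```php
<?php
/**
 * Summarizes, per fleet ID, how much On-Demand and Spot capacity is still
 * missing relative to the target. Operates on the 'InstanceFleets' array
 * from a listInstanceFleets() result. Hypothetical helper.
 */
function summarizeFleetCapacity(array $instanceFleets): array
{
    $summary = [];

    foreach ($instanceFleets as $fleet) {
        $onDemandGap = ($fleet['TargetOnDemandCapacity'] ?? 0)
            - ($fleet['ProvisionedOnDemandCapacity'] ?? 0);
        $spotGap = ($fleet['TargetSpotCapacity'] ?? 0)
            - ($fleet['ProvisionedSpotCapacity'] ?? 0);

        $summary[$fleet['Id']] = [
            'type' => $fleet['InstanceFleetType'] ?? null,
            // Clamp at zero so over-provisioned fleets report no shortfall.
            'onDemandShortfall' => max(0, $onDemandGap),
            'spotShortfall' => max(0, $spotGap),
        ];
    }

    return $summary;
}
```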

ListInstanceGroups

$result = $client->listInstanceGroups([/* ... */]);
$promise = $client->listInstanceGroupsAsync([/* ... */]);

Provides all available details about the instance groups in a cluster.

Parameter Syntax

$result = $client->listInstanceGroups([
    'ClusterId' => '<string>', // REQUIRED
    'Marker' => '<string>',
]);

Parameter Details

Members
ClusterId
Required: Yes
Type: string

The identifier of the cluster for which to list the instance groups.

Marker
Type: string

The pagination token that indicates the next set of results to retrieve.

Result Syntax

[
    'InstanceGroups' => [
        [
            'AutoScalingPolicy' => [
                'Constraints' => [
                    'MaxCapacity' => <integer>,
                    'MinCapacity' => <integer>,
                ],
                'Rules' => [
                    [
                        'Action' => [
                            'Market' => 'ON_DEMAND|SPOT',
                            'SimpleScalingPolicyConfiguration' => [
                                'AdjustmentType' => 'CHANGE_IN_CAPACITY|PERCENT_CHANGE_IN_CAPACITY|EXACT_CAPACITY',
                                'CoolDown' => <integer>,
                                'ScalingAdjustment' => <integer>,
                            ],
                        ],
                        'Description' => '<string>',
                        'Name' => '<string>',
                        'Trigger' => [
                            'CloudWatchAlarmDefinition' => [
                                'ComparisonOperator' => 'GREATER_THAN_OR_EQUAL|GREATER_THAN|LESS_THAN|LESS_THAN_OR_EQUAL',
                                'Dimensions' => [
                                    [
                                        'Key' => '<string>',
                                        'Value' => '<string>',
                                    ],
                                    // ...
                                ],
                                'EvaluationPeriods' => <integer>,
                                'MetricName' => '<string>',
                                'Namespace' => '<string>',
                                'Period' => <integer>,
                                'Statistic' => 'SAMPLE_COUNT|AVERAGE|SUM|MINIMUM|MAXIMUM',
                                'Threshold' => <float>,
                                'Unit' => 'NONE|SECONDS|MICRO_SECONDS|MILLI_SECONDS|BYTES|KILO_BYTES|MEGA_BYTES|GIGA_BYTES|TERA_BYTES|BITS|KILO_BITS|MEGA_BITS|GIGA_BITS|TERA_BITS|PERCENT|COUNT|BYTES_PER_SECOND|KILO_BYTES_PER_SECOND|MEGA_BYTES_PER_SECOND|GIGA_BYTES_PER_SECOND|TERA_BYTES_PER_SECOND|BITS_PER_SECOND|KILO_BITS_PER_SECOND|MEGA_BITS_PER_SECOND|GIGA_BITS_PER_SECOND|TERA_BITS_PER_SECOND|COUNT_PER_SECOND',
                            ],
                        ],
                    ],
                    // ...
                ],
                'Status' => [
                    'State' => 'PENDING|ATTACHING|ATTACHED|DETACHING|DETACHED|FAILED',
                    'StateChangeReason' => [
                        'Code' => 'USER_REQUEST|PROVISION_FAILURE|CLEANUP_FAILURE',
                        'Message' => '<string>',
                    ],
                ],
            ],
            'BidPrice' => '<string>',
            'Configurations' => [
                [
                    'Classification' => '<string>',
                    'Configurations' => [...], // RECURSIVE
                    'Properties' => ['<string>', ...],
                ],
                // ...
            ],
            'ConfigurationsVersion' => <integer>,
            'CustomAmiId' => '<string>',
            'EbsBlockDevices' => [
                [
                    'Device' => '<string>',
                    'VolumeSpecification' => [
                        'Iops' => <integer>,
                        'SizeInGB' => <integer>,
                        'Throughput' => <integer>,
                        'VolumeType' => '<string>',
                    ],
                ],
                // ...
            ],
            'EbsOptimized' => true || false,
            'Id' => '<string>',
            'InstanceGroupType' => 'MASTER|CORE|TASK',
            'InstanceType' => '<string>',
            'LastSuccessfullyAppliedConfigurations' => [
                [
                    'Classification' => '<string>',
                    'Configurations' => [...], // RECURSIVE
                    'Properties' => ['<string>', ...],
                ],
                // ...
            ],
            'LastSuccessfullyAppliedConfigurationsVersion' => <integer>,
            'Market' => 'ON_DEMAND|SPOT',
            'Name' => '<string>',
            'RequestedInstanceCount' => <integer>,
            'RunningInstanceCount' => <integer>,
            'ShrinkPolicy' => [
                'DecommissionTimeout' => <integer>,
                'InstanceResizePolicy' => [
                    'InstanceTerminationTimeout' => <integer>,
                    'InstancesToProtect' => ['<string>', ...],
                    'InstancesToTerminate' => ['<string>', ...],
                ],
            ],
            'Status' => [
                'State' => 'PROVISIONING|BOOTSTRAPPING|RUNNING|RECONFIGURING|RESIZING|SUSPENDED|TERMINATING|TERMINATED|ARRESTED|SHUTTING_DOWN|ENDED',
                'StateChangeReason' => [
                    'Code' => 'INTERNAL_ERROR|VALIDATION_ERROR|INSTANCE_FAILURE|CLUSTER_TERMINATED',
                    'Message' => '<string>',
                ],
                'Timeline' => [
                    'CreationDateTime' => <DateTime>,
                    'EndDateTime' => <DateTime>,
                    'ReadyDateTime' => <DateTime>,
                ],
            ],
        ],
        // ...
    ],
    'Marker' => '<string>',
]

Result Details

Members
InstanceGroups
Type: Array of InstanceGroup structures

The list of instance groups for the cluster and given filters.

Marker
Type: string

The pagination token that indicates the next set of results to retrieve.

Errors

InternalServerException:

This exception occurs when there is an internal failure in the Amazon EMR service.

InvalidRequestException:

This exception occurs when there is something wrong with user input.
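
The Configurations member is marked RECURSIVE in the result syntax, so each entry may contain further entries of the same shape. As an illustration, the sketch below walks that structure depth-first and collects every Classification name; collectClassifications is a hypothetical helper.

```php
<?php
/**
 * Returns all Classification names found in a Configurations array,
 * including nested ones, in depth-first order. Hypothetical helper.
 */
function collectClassifications(array $configurations): array
{
    $names = [];

    foreach ($configurations as $configuration) {
        if (isset($configuration['Classification'])) {
            $names[] = $configuration['Classification'];
        }

        // Nested entries share the same shape, so recurse into them.
        $nested = collectClassifications($configuration['Configurations'] ?? []);
        foreach ($nested as $name) {
            $names[] = $name;
        }
    }

    return $names;
}
```

The same function can be applied to an instance group's Configurations or LastSuccessfullyAppliedConfigurations member.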

ListInstances

$result = $client->listInstances([/* ... */]);
$promise = $client->listInstancesAsync([/* ... */]);

Provides information for all active Amazon EC2 instances and Amazon EC2 instances terminated in the last 30 days, up to a maximum of 2,000. Amazon EC2 instances in any of the following states are considered active: AWAITING_FULFILLMENT, PROVISIONING, BOOTSTRAPPING, RUNNING.

Parameter Syntax

$result = $client->listInstances([
    'ClusterId' => '<string>', // REQUIRED
    'InstanceFleetId' => '<string>',
    'InstanceFleetType' => 'MASTER|CORE|TASK',
    'InstanceGroupId' => '<string>',
    'InstanceGroupTypes' => ['<string>', ...],
    'InstanceStates' => ['<string>', ...],
    'Marker' => '<string>',
]);

Parameter Details

Members
ClusterId
Required: Yes
Type: string

The identifier of the cluster for which to list the instances.

InstanceFleetId
Type: string

The unique identifier of the instance fleet.

InstanceFleetType
Type: string

The node type of the instance fleet. For example, MASTER, CORE, or TASK.

InstanceGroupId
Type: string

The identifier of the instance group for which to list the instances.

InstanceGroupTypes
Type: Array of strings

The type of instance group for which to list the instances.

InstanceStates
Type: Array of strings

A list of instance states that will filter the instances returned with this request.

Marker
Type: string

The pagination token that indicates the next set of results to retrieve.

Result Syntax

[
    'Instances' => [
        [
            'EbsVolumes' => [
                [
                    'Device' => '<string>',
                    'VolumeId' => '<string>',
                ],
                // ...
            ],
            'Ec2InstanceId' => '<string>',
            'Id' => '<string>',
            'InstanceFleetId' => '<string>',
            'InstanceGroupId' => '<string>',
            'InstanceType' => '<string>',
            'Market' => 'ON_DEMAND|SPOT',
            'PrivateDnsName' => '<string>',
            'PrivateIpAddress' => '<string>',
            'PublicDnsName' => '<string>',
            'PublicIpAddress' => '<string>',
            'Status' => [
                'State' => 'AWAITING_FULFILLMENT|PROVISIONING|BOOTSTRAPPING|RUNNING|TERMINATED',
                'StateChangeReason' => [
                    'Code' => 'INTERNAL_ERROR|VALIDATION_ERROR|INSTANCE_FAILURE|BOOTSTRAP_FAILURE|CLUSTER_TERMINATED',
                    'Message' => '<string>',
                ],
                'Timeline' => [
                    'CreationDateTime' => <DateTime>,
                    'EndDateTime' => <DateTime>,
                    'ReadyDateTime' => <DateTime>,
                ],
            ],
        ],
        // ...
    ],
    'Marker' => '<string>',
]

Result Details

Members
Instances
Type: Array of Instance structures

The list of instances for the cluster and given filters.

Marker
Type: string

The pagination token that indicates the next set of results to retrieve.

Errors

InternalServerException:

This exception occurs when there is an internal failure in the Amazon EMR service.

InvalidRequestException:

This exception occurs when there is something wrong with user input.
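
With up to 2,000 instances possible, it may be preferable to consume pages lazily rather than build one large array. The sketch below uses a PHP generator for that; iterateInstances and ACTIVE_INSTANCE_STATES are hypothetical names, the state list is taken from the description above, and treating an absent or empty Marker as the last page is an assumption.

```php
<?php
// The states this page describes as "active".
const ACTIVE_INSTANCE_STATES = ['AWAITING_FULFILLMENT', 'PROVISIONING', 'BOOTSTRAPPING', 'RUNNING'];

/**
 * Lazily yields instances of a cluster, requesting the next page only when
 * the caller keeps iterating. $client is expected to be an
 * Aws\Emr\EmrClient. Hypothetical helper, not part of the SDK.
 */
function iterateInstances(object $client, string $clusterId, array $instanceStates = []): Generator
{
    $params = ['ClusterId' => $clusterId];
    if ($instanceStates !== []) {
        $params['InstanceStates'] = $instanceStates;
    }

    do {
        $result = $client->listInstances($params);
        foreach ($result['Instances'] ?? [] as $instance) {
            yield $instance;
        }

        // Assumption: an absent or empty Marker means no further pages.
        $marker = $result['Marker'] ?? '';
        $params['Marker'] = $marker;
    } while ($marker !== '');
}
```

A caller could write foreach (iterateInstances($client, 'j-EXAMPLE', ACTIVE_INSTANCE_STATES) as $instance) and break out early without fetching the remaining pages.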

ListNotebookExecutions

$result = $client->listNotebookExecutions([/* ... */]);
$promise = $client->listNotebookExecutionsAsync([/* ... */]);

Provides summaries of all notebook executions. You can filter the list based on multiple criteria such as status, time range, and editor ID. Returns a maximum of 50 notebook executions and a marker to track the paging of a longer notebook execution list across multiple ListNotebookExecutions calls.

Parameter Syntax

$result = $client->listNotebookExecutions([
    'EditorId' => '<string>',
    'ExecutionEngineId' => '<string>',
    'From' => <integer || string || DateTime>,
    'Marker' => '<string>',
    'Status' => 'START_PENDING|STARTING|RUNNING|FINISHING|FINISHED|FAILING|FAILED|STOP_PENDING|STOPPING|STOPPED',
    'To' => <integer || string || DateTime>,
]);

Parameter Details

Members
EditorId
Type: string

The unique ID of the editor associated with the notebook execution.

ExecutionEngineId
Type: string

The unique ID of the execution engine.

From
Type: timestamp (string|DateTime or anything parsable by strtotime)

The beginning of time range filter for listing notebook executions. The default is the timestamp of 30 days ago.

Marker
Type: string

The pagination token, returned by a previous ListNotebookExecutions call, that indicates the start of the list for this ListNotebookExecutions call.

Status
Type: string

The status filter for listing notebook executions.

  • START_PENDING indicates that the cluster has received the execution request but execution has not begun.

  • STARTING indicates that the execution is starting on the cluster.

  • RUNNING indicates that the execution is being processed by the cluster.

  • FINISHING indicates that execution processing is in the final stages.

  • FINISHED indicates that the execution has completed without error.

  • FAILING indicates that the execution is failing and will not finish successfully.

  • FAILED indicates that the execution failed.

  • STOP_PENDING indicates that the cluster has received a StopNotebookExecution request and the stop is pending.

  • STOPPING indicates that the cluster is in the process of stopping the execution as a result of a StopNotebookExecution request.

  • STOPPED indicates that the execution stopped because of a StopNotebookExecution request.

To
Type: timestamp (string|DateTime or anything parsable by strtotime)

The end of time range filter for listing notebook executions. The default is the current timestamp.

Result Syntax

[
    'Marker' => '<string>',
    'NotebookExecutions' => [
        [
            'EditorId' => '<string>',
            'EndTime' => <DateTime>,
            'ExecutionEngineId' => '<string>',
            'NotebookExecutionId' => '<string>',
            'NotebookExecutionName' => '<string>',
            'NotebookS3Location' => [
                'Bucket' => '<string>',
                'Key' => '<string>',
            ],
            'StartTime' => <DateTime>,
            'Status' => 'START_PENDING|STARTING|RUNNING|FINISHING|FINISHED|FAILING|FAILED|STOP_PENDING|STOPPING|STOPPED',
        ],
        // ...
    ],
]

Result Details

Members
Marker
Type: string

A pagination token that a subsequent ListNotebookExecutions call can use to determine the next set of results to retrieve.

NotebookExecutions
Type: Array of NotebookExecutionSummary structures

A list of notebook executions.

Errors

InternalServerError:

Indicates that an error occurred while processing the request and that the request was not completed.

InvalidRequestException:

This exception occurs when there is something wrong with user input.
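
To narrow the default 30-day window, From and To can be set explicitly, optionally with a Status filter. The sketch below builds such a parameter array; buildNotebookExecutionQuery is a hypothetical helper, and formatting the timestamps as ISO 8601 strings is one choice among the accepted input types (strings parsable by strtotime).

```php
<?php
/**
 * Builds parameters for listNotebookExecutions() covering [$from, $to].
 * Hypothetical helper, not part of the SDK.
 */
function buildNotebookExecutionQuery(
    DateTimeInterface $from,
    DateTimeInterface $to,
    ?string $status = null
): array {
    // Status values as listed in the parameter syntax on this page.
    $allowed = [
        'START_PENDING', 'STARTING', 'RUNNING', 'FINISHING', 'FINISHED',
        'FAILING', 'FAILED', 'STOP_PENDING', 'STOPPING', 'STOPPED',
    ];

    if ($from > $to) {
        throw new InvalidArgumentException('From must not be later than To.');
    }
    if ($status !== null && !in_array($status, $allowed, true)) {
        throw new InvalidArgumentException("Unknown status: {$status}");
    }

    $params = [
        'From' => $from->format(DATE_ATOM),
        'To' => $to->format(DATE_ATOM),
    ];
    if ($status !== null) {
        $params['Status'] = $status;
    }

    return $params;
}
```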

ListReleaseLabels

$result = $client->listReleaseLabels([/* ... */]);
$promise = $client->listReleaseLabelsAsync([/* ... */]);

Retrieves release labels of Amazon EMR services in the Region where the API is called.

Parameter Syntax

$result = $client->listReleaseLabels([
    'Filters' => [
        'Application' => '<string>',
        'Prefix' => '<string>',
    ],
    'MaxResults' => <integer>,
    'NextToken' => '<string>',
]);

Parameter Details

Members
Filters
Type: ReleaseLabelFilter structure

Filters the results of the request. Prefix specifies the prefix of release labels to return. Application specifies the application (with or without a version) of release labels to return.

MaxResults
Type: int

Defines the maximum number of release labels to return in a single response. The default is 100.

NextToken
Type: string

Specifies the next page of results. If NextToken is not specified, which is usually the case for the first request of ListReleaseLabels, the first page of results is determined by other filtering parameters or by the latest version. The ListReleaseLabels request fails if the identity (Amazon Web Services account ID) and all filtering parameters are different from the original request, or if the NextToken is expired or tampered with.

Result Syntax

[
    'NextToken' => '<string>',
    'ReleaseLabels' => ['<string>', ...],
]

Result Details

Members
NextToken
Type: string

Used to paginate the next page of results if specified in the next ListReleaseLabels request.

ReleaseLabels
Type: Array of strings

The returned release labels.

Errors

InternalServerException:

This exception occurs when there is an internal failure in the Amazon EMR service.

InvalidRequestException:

This exception occurs when there is something wrong with user input.
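
This page does not state the order in which release labels are returned, so a caller looking for the most recent release may want to compare versions client-side. The sketch below does that with PHP's version_compare; newestReleaseLabel is a hypothetical helper, and it assumes every label has the form emr-x.x.x.

```php
<?php
/**
 * Picks the highest-versioned label from a 'ReleaseLabels' array.
 * Assumes labels look like "emr-6.10.0"; returns null for an empty list.
 * Hypothetical helper, not part of the SDK.
 */
function newestReleaseLabel(array $releaseLabels): ?string
{
    $newest = null;

    foreach ($releaseLabels as $label) {
        // Strip the 4-character "emr-" prefix before comparing versions.
        if ($newest === null
            || version_compare(substr($label, 4), substr($newest, 4), '>')
        ) {
            $newest = $label;
        }
    }

    return $newest;
}
```

It could be combined with a request such as $client->listReleaseLabels(['Filters' => ['Prefix' => 'emr-6']]), keeping in mind that results beyond the first page require NextToken.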

ListSecurityConfigurations

$result = $client->listSecurityConfigurations([/* ... */]);
$promise = $client->listSecurityConfigurationsAsync([/* ... */]);

Lists all the security configurations visible to this account, providing their creation dates and times, and their names. This call returns a maximum of 50 security configurations per call, but returns a marker to track the paging of the security configuration list across multiple ListSecurityConfigurations calls.

Parameter Syntax

$result = $client->listSecurityConfigurations([
    'Marker' => '<string>',
]);

Parameter Details

Members
Marker
Type: string

The pagination token that indicates the set of results to retrieve.

Result Syntax

[
    'Marker' => '<string>',
    'SecurityConfigurations' => [
        [
            'CreationDateTime' => <DateTime>,
            'Name' => '<string>',
        ],
        // ...
    ],
]

Result Details

Members
Marker
Type: string

A pagination token that indicates the next set of results to retrieve. Include the marker in the next ListSecurityConfigurations call to retrieve the next page of results, if required.

SecurityConfigurations
Type: Array of SecurityConfigurationSummary structures

The creation date and time, and name, of each security configuration.

Errors

InternalServerException:

This exception occurs when there is an internal failure in the Amazon EMR service.

InvalidRequestException:

This exception occurs when there is something wrong with user input.

ListSteps

$result = $client->listSteps([/* ... */]);
$promise = $client->listStepsAsync([/* ... */]);

Provides a list of steps for the cluster in reverse order unless you specify StepIds with the request or filter by StepStates. You can specify a maximum of 10 StepIds. The CLI automatically paginates results to return a list greater than 50 steps. To return more than 50 steps using the CLI, specify a Marker, which is a pagination token that indicates the next set of steps to retrieve.

Parameter Syntax

$result = $client->listSteps([
    'ClusterId' => '<string>', // REQUIRED
    'Marker' => '<string>',
    'StepIds' => ['<string>', ...],
    'StepStates' => ['<string>', ...],
]);

Parameter Details

Members
ClusterId
Required: Yes
Type: string

The identifier of the cluster for which to list the steps.

Marker
Type: string

The maximum number of steps that a single ListSteps action returns is 50. To return a longer list of steps, use multiple ListSteps actions along with the Marker parameter, which is a pagination token that indicates the next set of results to retrieve.

StepIds
Type: Array of strings

The filter to limit the step list based on the identifier of the steps. You can specify a maximum of ten Step IDs. The character constraint applies to the overall length of the array.

StepStates
Type: Array of strings

The filter to limit the step list based on certain states.

Result Syntax

[
    'Marker' => '<string>',
    'Steps' => [
        [
            'ActionOnFailure' => 'TERMINATE_JOB_FLOW|TERMINATE_CLUSTER|CANCEL_AND_WAIT|CONTINUE',
            'Config' => [
                'Args' => ['<string>', ...],
                'Jar' => '<string>',
                'MainClass' => '<string>',
                'Properties' => ['<string>', ...],
            ],
            'Id' => '<string>',
            'Name' => '<string>',
            'Status' => [
                'FailureDetails' => [
                    'LogFile' => '<string>',
                    'Message' => '<string>',
                    'Reason' => '<string>',
                ],
                'State' => 'PENDING|CANCEL_PENDING|RUNNING|COMPLETED|CANCELLED|FAILED|INTERRUPTED',
                'StateChangeReason' => [
                    'Code' => 'NONE',
                    'Message' => '<string>',
                ],
                'Timeline' => [
                    'CreationDateTime' => <DateTime>,
                    'EndDateTime' => <DateTime>,
                    'StartDateTime' => <DateTime>,
                ],
            ],
        ],
        // ...
    ],
]

Result Details

Members
Marker
Type: string

The maximum number of steps that a single ListSteps action returns is 50. To return a longer list of steps, use multiple ListSteps actions along with the Marker parameter, which is a pagination token that indicates the next set of results to retrieve.

Steps
Type: Array of StepSummary structures

The filtered list of steps for the cluster.

Errors

InternalServerException:

This exception occurs when there is an internal failure in the Amazon EMR service.

InvalidRequestException:

This exception occurs when there is something wrong with user input.
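
Because StepIds accepts at most ten IDs per request, looking up a longer list of steps requires splitting it into batches. The sketch below illustrates one way to do so; listStepsByIds is a hypothetical helper, and it assumes each batch of ten fits in a single response page.

```php
<?php
/**
 * Fetches step summaries for an arbitrary number of step IDs by issuing
 * one listSteps() call per batch of up to 10 IDs. $client is expected to
 * be an Aws\Emr\EmrClient. Hypothetical helper, not part of the SDK.
 */
function listStepsByIds(object $client, string $clusterId, array $stepIds): array
{
    $steps = [];

    // array_chunk() splits the IDs into groups of at most 10.
    foreach (array_chunk($stepIds, 10) as $batch) {
        $result = $client->listSteps([
            'ClusterId' => $clusterId,
            'StepIds' => $batch,
        ]);

        foreach ($result['Steps'] ?? [] as $step) {
            $steps[] = $step;
        }
    }

    return $steps;
}
```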

ListStudioSessionMappings

$result = $client->listStudioSessionMappings([/* ... */]);
$promise = $client->listStudioSessionMappingsAsync([/* ... */]);

Returns a list of all user or group session mappings for the Amazon EMR Studio specified by StudioId.

Parameter Syntax

$result = $client->listStudioSessionMappings([
    'IdentityType' => 'USER|GROUP',
    'Marker' => '<string>',
    'StudioId' => '<string>',
]);

Parameter Details

Members
IdentityType
Type: string

Specifies whether to return session mappings for users or groups. If not specified, the results include session mapping details for both users and groups.

Marker
Type: string

The pagination token that indicates the set of results to retrieve.

StudioId
Type: string

The ID of the Amazon EMR Studio.

Result Syntax

[
    'Marker' => '<string>',
    'SessionMappings' => [
        [
            'CreationTime' => <DateTime>,
            'IdentityId' => '<string>',
            'IdentityName' => '<string>',
            'IdentityType' => 'USER|GROUP',
            'SessionPolicyArn' => '<string>',
            'StudioId' => '<string>',
        ],
        // ...
    ],
]

Result Details

Members
Marker
Type: string

The pagination token that indicates the next set of results to retrieve.

SessionMappings
Type: Array of SessionMappingSummary structures

A list of session mapping summary objects. Each object includes session mapping details such as creation time, identity type (user or group), and Amazon EMR Studio ID.

Errors

InternalServerError:

Indicates that an error occurred while processing the request and that the request was not completed.

InvalidRequestException:

This exception occurs when there is something wrong with user input.
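
When IdentityType is omitted, the result mixes user and group mappings. As a small sketch, the helper below separates a SessionMappings array by identity type; groupMappingsByIdentityType is a hypothetical name, and falling back to IdentityId when IdentityName is absent is an illustrative choice.

```php
<?php
/**
 * Groups session mapping summaries by IdentityType ('USER' or 'GROUP'),
 * keeping the identity name (or ID when no name is present).
 * Hypothetical helper, not part of the SDK.
 */
function groupMappingsByIdentityType(array $sessionMappings): array
{
    $grouped = ['USER' => [], 'GROUP' => []];

    foreach ($sessionMappings as $mapping) {
        $label = $mapping['IdentityName'] ?? $mapping['IdentityId'] ?? null;
        $grouped[$mapping['IdentityType']][] = $label;
    }

    return $grouped;
}
```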

ListStudios

$result = $client->listStudios([/* ... */]);
$promise = $client->listStudiosAsync([/* ... */]);

Returns a list of all Amazon EMR Studios associated with the Amazon Web Services account. The list includes details such as ID, Studio Access URL, and creation time for each Studio.

Parameter Syntax

$result = $client->listStudios([
    'Marker' => '<string>',
]);

Parameter Details

Members
Marker
Type: string

The pagination token that indicates the set of results to retrieve.

Result Syntax

[
    'Marker' => '<string>',
    'Studios' => [
        [
            'AuthMode' => 'SSO|IAM',
            'CreationTime' => <DateTime>,
            'Description' => '<string>',
            'Name' => '<string>',
            'StudioId' => '<string>',
            'Url' => '<string>',
            'VpcId' => '<string>',
        ],
        // ...
    ],
]

Result Details

Members
Marker
Type: string

The pagination token that indicates the next set of results to retrieve.

Studios
Type: Array of StudioSummary structures

The list of Studio summary objects.

Errors

InternalServerException:

This exception occurs when there is an internal failure in the Amazon EMR service.

InvalidRequestException:

This exception occurs when there is something wrong with user input.

ListSupportedInstanceTypes

$result = $client->listSupportedInstanceTypes([/* ... */]);
$promise = $client->listSupportedInstanceTypesAsync([/* ... */]);

Returns a list of the instance types that Amazon EMR supports. You can filter the list by Amazon Web Services Region and Amazon EMR release.

Parameter Syntax

$result = $client->listSupportedInstanceTypes([
    'Marker' => '<string>',
    'ReleaseLabel' => '<string>', // REQUIRED
]);

Parameter Details

Members
Marker
Type: string

The pagination token that marks the next set of results to retrieve.

ReleaseLabel
Required: Yes
Type: string

The Amazon EMR release label determines the versions of open-source application packages that Amazon EMR has installed on the cluster. Release labels are in the format emr-x.x.x, where x.x.x is an Amazon EMR release number such as emr-6.10.0. For more information about Amazon EMR releases and their included application versions and features, see the Amazon EMR Release Guide.

Result Syntax

[
    'Marker' => '<string>',
    'SupportedInstanceTypes' => [
        [
            'Architecture' => '<string>',
            'EbsOptimizedAvailable' => true || false,
            'EbsOptimizedByDefault' => true || false,
            'EbsStorageOnly' => true || false,
            'InstanceFamilyId' => '<string>',
            'Is64BitsOnly' => true || false,
            'MemoryGB' => <float>,
            'NumberOfDisks' => <integer>,
            'StorageGB' => <integer>,
            'Type' => '<string>',
            'VCPU' => <integer>,
        ],
        // ...
    ],
]

Result Details

Members
Marker
Type: string

The pagination token that marks the next set of results to retrieve.

SupportedInstanceTypes
Type: Array of SupportedInstanceType structures

The list of instance types that the release specified in ListSupportedInstanceTypesInput$ReleaseLabel supports, filtered by Amazon Web Services Region.

Errors

InternalServerException:

This exception occurs when there is an internal failure in the Amazon EMR service.

InvalidRequestException:

This exception occurs when there is something wrong with user input.
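
As an illustration of ListSupportedInstanceTypes, the sketch below checks a release label against the emr-x.x.x format described above before building the request, and then picks instance types from a result page by memory size. Both function names are hypothetical, and the regular expression is a strict reading of the documented format that may reject labels the service would accept.

```php
<?php
/**
 * Builds ListSupportedInstanceTypes parameters after checking that the
 * release label looks like emr-x.x.x (for example, emr-6.10.0).
 */
function buildSupportedInstanceTypesParams(string $releaseLabel, ?string $marker = null): array
{
    // Assumption: every valid label is "emr-" followed by three dotted numbers.
    if (preg_match('/^emr-\d+\.\d+\.\d+$/', $releaseLabel) !== 1) {
        throw new InvalidArgumentException("Unexpected release label: {$releaseLabel}");
    }
    $params = ['ReleaseLabel' => $releaseLabel];
    if ($marker !== null) {
        $params['Marker'] = $marker;
    }
    return $params;
}

/** Returns the Type of each entry with at least $minMemoryGb of memory. */
function filterByMemory(iterable $instanceTypes, float $minMemoryGb): array
{
    $matches = [];
    foreach ($instanceTypes as $type) {
        if (($type['MemoryGB'] ?? 0) >= $minMemoryGb) {
            $matches[] = $type['Type'];
        }
    }
    return $matches;
}

// Possible usage with a configured client:
// $result = $client->listSupportedInstanceTypes(buildSupportedInstanceTypesParams('emr-6.10.0'));
// $large = filterByMemory($result['SupportedInstanceTypes'], 64.0);
```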

ModifyCluster

$result = $client->modifyCluster([/* ... */]);
$promise = $client->modifyClusterAsync([/* ... */]);

Modifies the number of steps that can be executed concurrently for the cluster specified using ClusterId.

Parameter Syntax

$result = $client->modifyCluster([
    'ClusterId' => '<string>', // REQUIRED
    'StepConcurrencyLevel' => <integer>,
]);

Parameter Details

Members
ClusterId
Required: Yes
Type: string

The unique identifier of the cluster.

StepConcurrencyLevel
Type: int

The number of steps that can be executed concurrently. You can specify a minimum of 1 step and a maximum of 256 steps. We recommend that you do not change this parameter while steps are running; otherwise, the ActionOnFailure setting may not behave as expected. For more information, see Step$ActionOnFailure.

Result Syntax

[
    'StepConcurrencyLevel' => <integer>,
]

Result Details

Members
StepConcurrencyLevel
Type: int

The number of steps that can be executed concurrently.

Errors

InternalServerError:

Indicates that an error occurred while processing the request and that the request was not completed.

InvalidRequestException:

This exception occurs when there is something wrong with user input.
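
Because StepConcurrencyLevel for ModifyCluster is documented as 1 to 256, a caller may want to reject other values before sending the request. The sketch below shows one way to do that. The function name buildModifyClusterParams and the cluster ID j-EXAMPLE are placeholders, not part of the SDK.

```php
<?php
/**
 * Builds ModifyCluster parameters and enforces the documented
 * StepConcurrencyLevel range of 1 to 256 on the client side.
 */
function buildModifyClusterParams(string $clusterId, int $stepConcurrencyLevel): array
{
    if ($stepConcurrencyLevel < 1 || $stepConcurrencyLevel > 256) {
        throw new InvalidArgumentException('StepConcurrencyLevel must be between 1 and 256.');
    }
    return [
        'ClusterId' => $clusterId,
        'StepConcurrencyLevel' => $stepConcurrencyLevel,
    ];
}

// Possible usage with a configured client:
// $result = $client->modifyCluster(buildModifyClusterParams('j-EXAMPLE', 5));
// echo $result['StepConcurrencyLevel'];
```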

ModifyInstanceFleet

$result = $client->modifyInstanceFleet([/* ... */]);
$promise = $client->modifyInstanceFleetAsync([/* ... */]);

Modifies the target On-Demand and target Spot capacities for the instance fleet with the specified InstanceFleetId within the cluster specified using ClusterId. The call either succeeds or fails atomically.

The instance fleet configuration is available only in Amazon EMR releases 4.8.0 and later, excluding 5.0.x versions.

Parameter Syntax

$result = $client->modifyInstanceFleet([
    'ClusterId' => '<string>', // REQUIRED
    'InstanceFleet' => [ // REQUIRED
        'Context' => '<string>',
        'InstanceFleetId' => '<string>', // REQUIRED
        'InstanceTypeConfigs' => [
            [
                'BidPrice' => '<string>',
                'BidPriceAsPercentageOfOnDemandPrice' => <float>,
                'Configurations' => [
                    [
                        'Classification' => '<string>',
                        'Configurations' => [...], // RECURSIVE
                        'Properties' => ['<string>', ...],
                    ],
                    // ...
                ],
                'CustomAmiId' => '<string>',
                'EbsConfiguration' => [
                    'EbsBlockDeviceConfigs' => [
                        [
                            'VolumeSpecification' => [ // REQUIRED
                                'Iops' => <integer>,
                                'SizeInGB' => <integer>, // REQUIRED
                                'Throughput' => <integer>,
                                'VolumeType' => '<string>', // REQUIRED
                            ],
                            'VolumesPerInstance' => <integer>,
                        ],
                        // ...
                    ],
                    'EbsOptimized' => true || false,
                ],
                'InstanceType' => '<string>', // REQUIRED
                'Priority' => <float>,
                'WeightedCapacity' => <integer>,
            ],
            // ...
        ],
        'ResizeSpecifications' => [
            'OnDemandResizeSpecification' => [
                'AllocationStrategy' => 'lowest-price|prioritized',
                'CapacityReservationOptions' => [
                    'CapacityReservationPreference' => 'open|none',
                    'CapacityReservationResourceGroupArn' => '<string>',
                    'UsageStrategy' => 'use-capacity-reservations-first',
                ],
                'TimeoutDurationMinutes' => <integer>,
            ],
            'SpotResizeSpecification' => [
                'AllocationStrategy' => 'capacity-optimized|price-capacity-optimized|lowest-price|diversified|capacity-optimized-prioritized',
                'TimeoutDurationMinutes' => <integer>,
            ],
        ],
        'TargetOnDemandCapacity' => <integer>,
        'TargetSpotCapacity' => <integer>,
    ],
]);

Parameter Details

Members
ClusterId
Required: Yes
Type: string

The unique identifier of the cluster.

InstanceFleet
Required: Yes
Type: InstanceFleetModifyConfig structure

The configuration parameters of the instance fleet.

Result Syntax

[]

Result Details

The results for this operation are always empty.

Errors

InternalServerException:

This exception occurs when there is an internal failure in the Amazon EMR service.

InvalidRequestException:

This exception occurs when there is something wrong with user input.
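
Most members of the ModifyInstanceFleet InstanceFleet structure are optional, so a request that only changes target capacities can be small. The following sketch builds such a request. The function name buildFleetCapacityParams and the IDs shown are placeholders, and the non-negative check is a client-side assumption rather than a documented limit.

```php
<?php
/**
 * Builds ModifyInstanceFleet parameters that change only the target
 * capacities. Pass null to leave a capacity out of the request.
 */
function buildFleetCapacityParams(string $clusterId, string $fleetId, ?int $onDemand, ?int $spot): array
{
    $fleet = ['InstanceFleetId' => $fleetId];
    foreach (['TargetOnDemandCapacity' => $onDemand, 'TargetSpotCapacity' => $spot] as $key => $value) {
        if ($value === null) {
            continue;
        }
        // Assumption: negative capacities are never meaningful.
        if ($value < 0) {
            throw new InvalidArgumentException("{$key} must not be negative.");
        }
        $fleet[$key] = $value;
    }
    return ['ClusterId' => $clusterId, 'InstanceFleet' => $fleet];
}

// Possible usage with a configured client:
// $client->modifyInstanceFleet(buildFleetCapacityParams('j-EXAMPLE', 'if-EXAMPLE', 4, 2));
```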

ModifyInstanceGroups

$result = $client->modifyInstanceGroups([/* ... */]);
$promise = $client->modifyInstanceGroupsAsync([/* ... */]);

ModifyInstanceGroups modifies the number of nodes and configuration settings of an instance group. The input parameters include the new target instance count for the group and the instance group ID. The call will either succeed or fail atomically.

Parameter Syntax

$result = $client->modifyInstanceGroups([
    'ClusterId' => '<string>',
    'InstanceGroups' => [
        [
            'Configurations' => [
                [
                    'Classification' => '<string>',
                    'Configurations' => [...], // RECURSIVE
                    'Properties' => ['<string>', ...],
                ],
                // ...
            ],
            'EC2InstanceIdsToTerminate' => ['<string>', ...],
            'InstanceCount' => <integer>,
            'InstanceGroupId' => '<string>', // REQUIRED
            'ReconfigurationType' => 'OVERWRITE|MERGE',
            'ShrinkPolicy' => [
                'DecommissionTimeout' => <integer>,
                'InstanceResizePolicy' => [
                    'InstanceTerminationTimeout' => <integer>,
                    'InstancesToProtect' => ['<string>', ...],
                    'InstancesToTerminate' => ['<string>', ...],
                ],
            ],
        ],
        // ...
    ],
]);

Parameter Details

Members
ClusterId
Type: string

The ID of the cluster to which the instance group belongs.

InstanceGroups
Type: Array of InstanceGroupModifyConfig structures

Instance groups to change.

Result Syntax

[]

Result Details

The results for this operation are always empty.

Errors

InternalServerError:

Indicates that an error occurred while processing the request and that the request was not completed.
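
A single ModifyInstanceGroups call can carry several instance groups, each with its own target instance count. The sketch below turns a map of instance group ID to target count into the InstanceGroups list. The function name buildInstanceGroupResizeParams and the IDs are placeholders chosen for illustration.

```php
<?php
/**
 * Builds ModifyInstanceGroups parameters from a map of
 * instance group ID => target instance count.
 */
function buildInstanceGroupResizeParams(string $clusterId, array $targetCounts): array
{
    $groups = [];
    foreach ($targetCounts as $groupId => $count) {
        if (!is_int($count) || $count < 0) {
            throw new InvalidArgumentException("Invalid instance count for {$groupId}.");
        }
        $groups[] = [
            'InstanceGroupId' => (string) $groupId, // the only required member
            'InstanceCount' => $count,
        ];
    }
    return ['ClusterId' => $clusterId, 'InstanceGroups' => $groups];
}

// Possible usage with a configured client:
// $client->modifyInstanceGroups(
//     buildInstanceGroupResizeParams('j-EXAMPLE', ['ig-CORE-EXAMPLE' => 4, 'ig-TASK-EXAMPLE' => 0])
// );
```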

PutAutoScalingPolicy

$result = $client->putAutoScalingPolicy([/* ... */]);
$promise = $client->putAutoScalingPolicyAsync([/* ... */]);

Creates or updates an automatic scaling policy for a core instance group or task instance group in an Amazon EMR cluster. The automatic scaling policy defines how an instance group dynamically adds and terminates Amazon EC2 instances in response to the value of a CloudWatch metric.

Parameter Syntax

$result = $client->putAutoScalingPolicy([
    'AutoScalingPolicy' => [ // REQUIRED
        'Constraints' => [ // REQUIRED
            'MaxCapacity' => <integer>, // REQUIRED
            'MinCapacity' => <integer>, // REQUIRED
        ],
        'Rules' => [ // REQUIRED
            [
                'Action' => [ // REQUIRED
                    'Market' => 'ON_DEMAND|SPOT',
                    'SimpleScalingPolicyConfiguration' => [ // REQUIRED
                        'AdjustmentType' => 'CHANGE_IN_CAPACITY|PERCENT_CHANGE_IN_CAPACITY|EXACT_CAPACITY',
                        'CoolDown' => <integer>,
                        'ScalingAdjustment' => <integer>, // REQUIRED
                    ],
                ],
                'Description' => '<string>',
                'Name' => '<string>', // REQUIRED
                'Trigger' => [ // REQUIRED
                    'CloudWatchAlarmDefinition' => [ // REQUIRED
                        'ComparisonOperator' => 'GREATER_THAN_OR_EQUAL|GREATER_THAN|LESS_THAN|LESS_THAN_OR_EQUAL', // REQUIRED
                        'Dimensions' => [
                            [
                                'Key' => '<string>',
                                'Value' => '<string>',
                            ],
                            // ...
                        ],
                        'EvaluationPeriods' => <integer>,
                        'MetricName' => '<string>', // REQUIRED
                        'Namespace' => '<string>',
                        'Period' => <integer>, // REQUIRED
                        'Statistic' => 'SAMPLE_COUNT|AVERAGE|SUM|MINIMUM|MAXIMUM',
                        'Threshold' => <float>, // REQUIRED
                        'Unit' => 'NONE|SECONDS|MICRO_SECONDS|MILLI_SECONDS|BYTES|KILO_BYTES|MEGA_BYTES|GIGA_BYTES|TERA_BYTES|BITS|KILO_BITS|MEGA_BITS|GIGA_BITS|TERA_BITS|PERCENT|COUNT|BYTES_PER_SECOND|KILO_BYTES_PER_SECOND|MEGA_BYTES_PER_SECOND|GIGA_BYTES_PER_SECOND|TERA_BYTES_PER_SECOND|BITS_PER_SECOND|KILO_BITS_PER_SECOND|MEGA_BITS_PER_SECOND|GIGA_BITS_PER_SECOND|TERA_BITS_PER_SECOND|COUNT_PER_SECOND',
                    ],
                ],
            ],
            // ...
        ],
    ],
    'ClusterId' => '<string>', // REQUIRED
    'InstanceGroupId' => '<string>', // REQUIRED
]);

Parameter Details

Members
AutoScalingPolicy
Required: Yes
Type: AutoScalingPolicy structure

Specifies the definition of the automatic scaling policy.

ClusterId
Required: Yes
Type: string

Specifies the ID of a cluster. The instance group to which the automatic scaling policy is applied is within this cluster.

InstanceGroupId
Required: Yes
Type: string

Specifies the ID of the instance group to which the automatic scaling policy is applied.

Result Syntax

[
    'AutoScalingPolicy' => [
        'Constraints' => [
            'MaxCapacity' => <integer>,
            'MinCapacity' => <integer>,
        ],
        'Rules' => [
            [
                'Action' => [
                    'Market' => 'ON_DEMAND|SPOT',
                    'SimpleScalingPolicyConfiguration' => [
                        'AdjustmentType' => 'CHANGE_IN_CAPACITY|PERCENT_CHANGE_IN_CAPACITY|EXACT_CAPACITY',
                        'CoolDown' => <integer>,
                        'ScalingAdjustment' => <integer>,
                    ],
                ],
                'Description' => '<string>',
                'Name' => '<string>',
                'Trigger' => [
                    'CloudWatchAlarmDefinition' => [
                        'ComparisonOperator' => 'GREATER_THAN_OR_EQUAL|GREATER_THAN|LESS_THAN|LESS_THAN_OR_EQUAL',
                        'Dimensions' => [
                            [
                                'Key' => '<string>',
                                'Value' => '<string>',
                            ],
                            // ...
                        ],
                        'EvaluationPeriods' => <integer>,
                        'MetricName' => '<string>',
                        'Namespace' => '<string>',
                        'Period' => <integer>,
                        'Statistic' => 'SAMPLE_COUNT|AVERAGE|SUM|MINIMUM|MAXIMUM',
                        'Threshold' => <float>,
                        'Unit' => 'NONE|SECONDS|MICRO_SECONDS|MILLI_SECONDS|BYTES|KILO_BYTES|MEGA_BYTES|GIGA_BYTES|TERA_BYTES|BITS|KILO_BITS|MEGA_BITS|GIGA_BITS|TERA_BITS|PERCENT|COUNT|BYTES_PER_SECOND|KILO_BYTES_PER_SECOND|MEGA_BYTES_PER_SECOND|GIGA_BYTES_PER_SECOND|TERA_BYTES_PER_SECOND|BITS_PER_SECOND|KILO_BITS_PER_SECOND|MEGA_BITS_PER_SECOND|GIGA_BITS_PER_SECOND|TERA_BITS_PER_SECOND|COUNT_PER_SECOND',
                    ],
                ],
            ],
            // ...
        ],
        'Status' => [
            'State' => 'PENDING|ATTACHING|ATTACHED|DETACHING|DETACHED|FAILED',
            'StateChangeReason' => [
                'Code' => 'USER_REQUEST|PROVISION_FAILURE|CLEANUP_FAILURE',
                'Message' => '<string>',
            ],
        ],
    ],
    'ClusterArn' => '<string>',
    'ClusterId' => '<string>',
    'InstanceGroupId' => '<string>',
]

Result Details

Members
AutoScalingPolicy
Type: AutoScalingPolicyDescription structure

The automatic scaling policy definition.

ClusterArn
Type: string

The Amazon Resource Name (ARN) of the cluster.

ClusterId
Type: string

Specifies the ID of a cluster. The instance group to which the automatic scaling policy is applied is within this cluster.

InstanceGroupId
Type: string

Specifies the ID of the instance group to which the scaling policy is applied.

Errors

There are no errors described for this operation.
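
The PutAutoScalingPolicy request nests several required members, which is easier to see in a filled-in example than in the syntax listing. The sketch below builds a policy with one scale-out rule. The function name, the rule name, and every numeric value are illustrative assumptions, and the metric and namespace shown are one plausible choice rather than a recommendation.

```php
<?php
/**
 * Builds PutAutoScalingPolicy parameters with a single rule that adds
 * one instance when available YARN memory stays low.
 */
function buildScaleOutPolicyParams(string $clusterId, string $instanceGroupId, int $min, int $max): array
{
    if ($min > $max) {
        throw new InvalidArgumentException('MinCapacity must not exceed MaxCapacity.');
    }
    return [
        'ClusterId' => $clusterId,
        'InstanceGroupId' => $instanceGroupId,
        'AutoScalingPolicy' => [
            'Constraints' => ['MinCapacity' => $min, 'MaxCapacity' => $max],
            'Rules' => [
                [
                    'Name' => 'scale-out-on-low-memory', // hypothetical rule name
                    'Action' => [
                        'SimpleScalingPolicyConfiguration' => [
                            'AdjustmentType' => 'CHANGE_IN_CAPACITY',
                            'ScalingAdjustment' => 1,
                            'CoolDown' => 300,
                        ],
                    ],
                    'Trigger' => [
                        'CloudWatchAlarmDefinition' => [
                            'ComparisonOperator' => 'LESS_THAN',
                            'MetricName' => 'YARNMemoryAvailablePercentage',
                            'Namespace' => 'AWS/ElasticMapReduce',
                            'Period' => 300,
                            'Statistic' => 'AVERAGE',
                            'Threshold' => 15.0,
                            'Unit' => 'PERCENT',
                        ],
                    ],
                ],
            ],
        ],
    ];
}

// Possible usage with a configured client:
// $result = $client->putAutoScalingPolicy(buildScaleOutPolicyParams('j-EXAMPLE', 'ig-EXAMPLE', 2, 10));
// echo $result['AutoScalingPolicy']['Status']['State'];
```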

PutAutoTerminationPolicy

$result = $client->putAutoTerminationPolicy([/* ... */]);
$promise = $client->putAutoTerminationPolicyAsync([/* ... */]);

Auto-termination is supported in Amazon EMR releases 5.30.0 and 6.1.0 and later. For more information, see Using an auto-termination policy.

Creates or updates an auto-termination policy for an Amazon EMR cluster. An auto-termination policy defines the amount of idle time in seconds after which a cluster automatically terminates. For alternative cluster termination options, see Control cluster termination.

Parameter Syntax

$result = $client->putAutoTerminationPolicy([
    'AutoTerminationPolicy' => [
        'IdleTimeout' => <integer>,
    ],
    'ClusterId' => '<string>', // REQUIRED
]);

Parameter Details

Members
AutoTerminationPolicy
Type: AutoTerminationPolicy structure

Specifies the auto-termination policy to attach to the cluster.

ClusterId
Required: Yes
Type: string

Specifies the ID of the Amazon EMR cluster to which the auto-termination policy will be attached.

Result Syntax

[]

Result Details

The results for this operation are always empty.

Errors

There are no errors described for this operation.
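
The PutAutoTerminationPolicy description states that the idle time is given in seconds, so a caller who thinks in minutes needs a conversion. The sketch below shows that conversion. The function name buildAutoTerminationParams is hypothetical, and the check for a positive value is a client-side assumption, not a documented service limit.

```php
<?php
/**
 * Builds PutAutoTerminationPolicy parameters from an idle time in
 * minutes. IdleTimeout itself is expressed in seconds.
 */
function buildAutoTerminationParams(string $clusterId, int $idleMinutes): array
{
    // Assumption: only positive idle times make sense.
    if ($idleMinutes <= 0) {
        throw new InvalidArgumentException('Idle time must be a positive number of minutes.');
    }
    return [
        'ClusterId' => $clusterId,
        'AutoTerminationPolicy' => ['IdleTimeout' => $idleMinutes * 60],
    ];
}

// Possible usage with a configured client:
// $client->putAutoTerminationPolicy(buildAutoTerminationParams('j-EXAMPLE', 30));
```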

PutBlockPublicAccessConfiguration

$result = $client->putBlockPublicAccessConfiguration([/* ... */]);
$promise = $client->putBlockPublicAccessConfigurationAsync([/* ... */]);

Creates or updates an Amazon EMR block public access configuration for your Amazon Web Services account in the current Region. For more information, see Configure Block Public Access for Amazon EMR in the Amazon EMR Management Guide.

Parameter Syntax

$result = $client->putBlockPublicAccessConfiguration([
    'BlockPublicAccessConfiguration' => [ // REQUIRED
        'BlockPublicSecurityGroupRules' => true || false, // REQUIRED
        'PermittedPublicSecurityGroupRuleRanges' => [
            [
                'MaxRange' => <integer>,
                'MinRange' => <integer>, // REQUIRED
            ],
            // ...
        ],
    ],
]);

Parameter Details

Members
BlockPublicAccessConfiguration
Required: Yes
Type: BlockPublicAccessConfiguration structure

A configuration for Amazon EMR block public access. The configuration applies to all clusters created in your account for the current Region. The configuration specifies whether block public access is enabled. If block public access is enabled, security groups associated with the cluster cannot have rules that allow inbound traffic from 0.0.0.0/0 or ::/0 on a port, unless the port is specified as an exception using PermittedPublicSecurityGroupRuleRanges in the BlockPublicAccessConfiguration. By default, Port 22 (SSH) is an exception, and public access is allowed on this port. You can change this by updating BlockPublicSecurityGroupRules to remove the exception.

For accounts that created clusters in a Region before November 25, 2019, block public access is disabled by default in that Region. To use this feature, you must manually enable and configure it. For accounts that did not create an Amazon EMR cluster in a Region before this date, block public access is enabled by default in that Region.

Result Syntax

[]

Result Details

The results for this operation are always empty.

Errors

InternalServerException:

This exception occurs when there is an internal failure in the Amazon EMR service.

InvalidRequestException:

This exception occurs when there is something wrong with user input.
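
For PutBlockPublicAccessConfiguration, each permitted port is expressed as a range with MinRange and MaxRange. The sketch below enables block public access and lists single ports as exceptions. The function name buildBlockPublicAccessParams is hypothetical, and the 0 to 65535 port check is a general networking assumption rather than something this page states.

```php
<?php
/**
 * Builds PutBlockPublicAccessConfiguration parameters that enable block
 * public access and permit the given individual ports as exceptions.
 */
function buildBlockPublicAccessParams(array $permittedPorts): array
{
    $ranges = [];
    foreach ($permittedPorts as $port) {
        if (!is_int($port) || $port < 0 || $port > 65535) {
            throw new InvalidArgumentException('Ports must be integers from 0 to 65535.');
        }
        // A single port is a range whose two ends are equal.
        $ranges[] = ['MinRange' => $port, 'MaxRange' => $port];
    }
    return [
        'BlockPublicAccessConfiguration' => [
            'BlockPublicSecurityGroupRules' => true,
            'PermittedPublicSecurityGroupRuleRanges' => $ranges,
        ],
    ];
}

// Possible usage with a configured client (keeps only SSH as an exception):
// $client->putBlockPublicAccessConfiguration(buildBlockPublicAccessParams([22]));
```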

PutManagedScalingPolicy

$result = $client->putManagedScalingPolicy([/* ... */]);
$promise = $client->putManagedScalingPolicyAsync([/* ... */]);

Creates or updates a managed scaling policy for an Amazon EMR cluster. The managed scaling policy defines the limits for resources, such as Amazon EC2 instances, that can be added to or terminated from a cluster. The policy applies only to the core and task nodes. The master node cannot be scaled after initial configuration.

Parameter Syntax

$result = $client->putManagedScalingPolicy([
    'ClusterId' => '<string>', // REQUIRED
    'ManagedScalingPolicy' => [ // REQUIRED
        'ComputeLimits' => [
            'MaximumCapacityUnits' => <integer>, // REQUIRED
            'MaximumCoreCapacityUnits' => <integer>,
            'MaximumOnDemandCapacityUnits' => <integer>,
            'MinimumCapacityUnits' => <integer>, // REQUIRED
            'UnitType' => 'InstanceFleetUnits|Instances|VCPU', // REQUIRED
        ],
        'ScalingStrategy' => 'DEFAULT|ADVANCED',
        'UtilizationPerformanceIndex' => <integer>,
    ],
]);

Parameter Details

Members
ClusterId
Required: Yes
Type: string

Specifies the ID of an Amazon EMR cluster where the managed scaling policy is attached.

ManagedScalingPolicy
Required: Yes
Type: ManagedScalingPolicy structure

Specifies the constraints for the managed scaling policy.

Result Syntax

[]

Result Details

The results for this operation are always empty.

Errors

There are no errors described for this operation.
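
A PutManagedScalingPolicy request needs at least the minimum and maximum capacity units and a unit type. The sketch below builds such a request and checks the unit type against the three values listed in the parameter syntax. The function name buildManagedScalingParams is hypothetical, and the rule that the minimum must not exceed the maximum is an assumption about sensible input.

```php
<?php
/**
 * Builds PutManagedScalingPolicy parameters with only the required
 * ComputeLimits members.
 */
function buildManagedScalingParams(string $clusterId, int $min, int $max, string $unitType = 'Instances'): array
{
    $allowedUnitTypes = ['InstanceFleetUnits', 'Instances', 'VCPU'];
    if (!in_array($unitType, $allowedUnitTypes, true)) {
        throw new InvalidArgumentException("Unknown UnitType: {$unitType}");
    }
    // Assumption: a minimum above the maximum is a caller mistake.
    if ($min > $max) {
        throw new InvalidArgumentException('MinimumCapacityUnits must not exceed MaximumCapacityUnits.');
    }
    return [
        'ClusterId' => $clusterId,
        'ManagedScalingPolicy' => [
            'ComputeLimits' => [
                'MinimumCapacityUnits' => $min,
                'MaximumCapacityUnits' => $max,
                'UnitType' => $unitType,
            ],
        ],
    ];
}

// Possible usage with a configured client:
// $client->putManagedScalingPolicy(buildManagedScalingParams('j-EXAMPLE', 2, 20));
```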

RemoveAutoScalingPolicy

$result = $client->removeAutoScalingPolicy([/* ... */]);
$promise = $client->removeAutoScalingPolicyAsync([/* ... */]);

Removes an automatic scaling policy from a specified instance group within an Amazon EMR cluster.

Parameter Syntax

$result = $client->removeAutoScalingPolicy([
    'ClusterId' => '<string>', // REQUIRED
    'InstanceGroupId' => '<string>', // REQUIRED
]);

Parameter Details

Members
ClusterId
Required: Yes
Type: string

Specifies the ID of a cluster. The instance group to which the automatic scaling policy is applied is within this cluster.

InstanceGroupId
Required: Yes
Type: string

Specifies the ID of the instance group to which the scaling policy is applied.

Result Syntax

[]

Result Details

The results for this operation are always empty.

Errors

There are no errors described for this operation.

RemoveAutoTerminationPolicy

$result = $client->removeAutoTerminationPolicy([/* ... */]);
$promise = $client->removeAutoTerminationPolicyAsync([/* ... */]);

Removes an auto-termination policy from an Amazon EMR cluster.

Parameter Syntax

$result = $client->removeAutoTerminationPolicy([
    'ClusterId' => '<string>', // REQUIRED
]);

Parameter Details

Members
ClusterId
Required: Yes
Type: string

Specifies the ID of the Amazon EMR cluster from which the auto-termination policy will be removed.

Result Syntax

[]

Result Details

The results for this operation are always empty.

Errors

There are no errors described for this operation.

RemoveManagedScalingPolicy

$result = $client->removeManagedScalingPolicy([/* ... */]);
$promise = $client->removeManagedScalingPolicyAsync([/* ... */]);

Removes a managed scaling policy from a specified Amazon EMR cluster.

Parameter Syntax

$result = $client->removeManagedScalingPolicy([
    'ClusterId' => '<string>', // REQUIRED
]);

Parameter Details

Members
ClusterId
Required: Yes
Type: string

Specifies the ID of the cluster from which the managed scaling policy will be removed.

Result Syntax

[]

Result Details

The results for this operation are always empty.

Errors

There are no errors described for this operation.

RemoveTags

$result = $client->removeTags([/* ... */]);
$promise = $client->removeTagsAsync([/* ... */]);

Removes tags from an Amazon EMR resource, such as a cluster or Amazon EMR Studio. Tags make it easier to associate resources in various ways, such as grouping clusters to track your Amazon EMR resource allocation costs. For more information, see Tag Clusters.

The following example removes the stack tag with value Prod from a cluster:
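
A minimal sketch of such a call with this client, assuming a cluster whose ID is the placeholder j-EXAMPLE. RemoveTags takes only tag keys, so the value Prod does not appear in the request. The helper buildRemoveTagsParams is hypothetical and is not part of the SDK.

```php
<?php
/**
 * Builds RemoveTags parameters, dropping duplicate keys and refusing
 * an empty key list.
 */
function buildRemoveTagsParams(string $resourceId, array $tagKeys): array
{
    $keys = array_values(array_unique($tagKeys));
    if ($keys === []) {
        throw new InvalidArgumentException('At least one tag key is required.');
    }
    return ['ResourceId' => $resourceId, 'TagKeys' => $keys];
}

// Removes the "stack" tag from the cluster, whatever its value is.
$params = buildRemoveTagsParams('j-EXAMPLE', ['stack']);

// Possible usage with a configured client:
// $client->removeTags($params);
```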

Parameter Syntax

$result = $client->removeTags([
    'ResourceId' => '<string>', // REQUIRED
    'TagKeys' => ['<string>', ...], // REQUIRED
]);

Parameter Details

Members
ResourceId
Required: Yes
Type: string

The Amazon EMR resource identifier from which tags will be removed. For example, a cluster identifier or an Amazon EMR Studio ID.

TagKeys
Required: Yes
Type: Array of strings

A list of tag keys to remove from the resource.

Result Syntax

[]

Result Details

The results for this operation are always empty.

Errors

InternalServerException:

This exception occurs when there is an internal failure in the Amazon EMR service.

InvalidRequestException:

This exception occurs when there is something wrong with user input.

RunJobFlow

$result = $client->runJobFlow([/* ... */]);
$promise = $client->runJobFlowAsync([/* ... */]);

RunJobFlow creates and starts running a new cluster (job flow). The cluster runs the steps specified. After the steps complete, the cluster stops and the HDFS partition is lost. To prevent loss of data, configure the last step of the job flow to store results in Amazon S3. If the JobFlowInstancesConfig KeepJobFlowAliveWhenNoSteps parameter is set to TRUE, the cluster transitions to the WAITING state rather than shutting down after the steps have completed.

For additional protection, you can set the JobFlowInstancesConfig TerminationProtected parameter to TRUE to lock the cluster and prevent it from being terminated by API call, user intervention, or in the event of a job flow error.

A maximum of 256 steps are allowed in each job flow.

If your cluster is long-running (such as a Hive data warehouse) or complex, you may require more than 256 steps to process your data. You can bypass the 256-step limitation in various ways, including using the SSH shell to connect to the master node and submitting queries directly to the software running on the master node, such as Hive and Hadoop.

For long-running clusters, we recommend that you periodically store your results.

The instance fleets configuration is available only in Amazon EMR releases 4.8.0 and later, excluding 5.0.x versions. The RunJobFlow request can contain InstanceFleets parameters or InstanceGroups parameters, but not both.
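
Two of the constraints described above can be checked before a RunJobFlow request is sent: the limit of 256 steps, and the rule that InstanceFleets and InstanceGroups cannot both be present. The sketch below performs those checks on a parameter array. The function name validateRunJobFlowParams is hypothetical, the sketch assumes the steps are passed under the Steps key, and the service remains the authority on what it accepts.

```php
<?php
/**
 * Returns a list of problems found in RunJobFlow parameters.
 * An empty list means none of these client-side checks failed.
 */
function validateRunJobFlowParams(array $params): array
{
    $problems = [];

    $instances = $params['Instances'] ?? null;
    if (!is_array($instances)) {
        $problems[] = 'Instances is required.';
        $instances = [];
    }

    // A request may use instance fleets or instance groups, but not both.
    if (!empty($instances['InstanceFleets']) && !empty($instances['InstanceGroups'])) {
        $problems[] = 'Specify InstanceFleets or InstanceGroups, not both.';
    }

    // Assumption: steps are supplied under the Steps key.
    if (count($params['Steps'] ?? []) > 256) {
        $problems[] = 'A job flow allows a maximum of 256 steps.';
    }

    return $problems;
}

// Possible usage with a configured client:
// if (validateRunJobFlowParams($params) === []) {
//     $result = $client->runJobFlow($params);
// }
```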

Parameter Syntax

$result = $client->runJobFlow([
    'AdditionalInfo' => '<string>',
    'AmiVersion' => '<string>',
    'Applications' => [
        [
            'AdditionalInfo' => ['<string>', ...],
            'Args' => ['<string>', ...],
            'Name' => '<string>',
            'Version' => '<string>',
        ],
        // ...
    ],
    'AutoScalingRole' => '<string>',
    'AutoTerminationPolicy' => [
        'IdleTimeout' => <integer>,
    ],
    'BootstrapActions' => [
        [
            'Name' => '<string>', // REQUIRED
            'ScriptBootstrapAction' => [ // REQUIRED
                'Args' => ['<string>', ...],
                'Path' => '<string>', // REQUIRED
            ],
        ],
        // ...
    ],
    'Configurations' => [
        [
            'Classification' => '<string>',
            'Configurations' => [...], // RECURSIVE
            'Properties' => ['<string>', ...],
        ],
        // ...
    ],
    'CustomAmiId' => '<string>',
    'EbsRootVolumeIops' => <integer>,
    'EbsRootVolumeSize' => <integer>,
    'EbsRootVolumeThroughput' => <integer>,
    'Instances' => [ // REQUIRED
        'AdditionalMasterSecurityGroups' => ['<string>', ...],
        'AdditionalSlaveSecurityGroups' => ['<string>', ...],
        'Ec2KeyName' => '<string>',
        'Ec2SubnetId' => '<string>',
        'Ec2SubnetIds' => ['<string>', ...],
        'EmrManagedMasterSecurityGroup' => '<string>',
        'EmrManagedSlaveSecurityGroup' => '<string>',
        'HadoopVersion' => '<string>',
        'InstanceCount' => <integer>,
        'InstanceFleets' => [
            [
                'Context' => '<string>',
                'InstanceFleetType' => 'MASTER|CORE|TASK', // REQUIRED
                'InstanceTypeConfigs' => [
                    [
                        'BidPrice' => '<string>',
                        'BidPriceAsPercentageOfOnDemandPrice' => <float>,
                        'Configurations' => [
                            [
                                'Classification' => '<string>',
                                'Configurations' => [...], // RECURSIVE
                                'Properties' => ['<string>', ...],
                            ],
                            // ...
                        ],
                        'CustomAmiId' => '<string>',
                        'EbsConfiguration' => [
                            'EbsBlockDeviceConfigs' => [
                                [
                                    'VolumeSpecification' => [ // REQUIRED
                                        'Iops' => <integer>,
                                        'SizeInGB' => <integer>, // REQUIRED
                                        'Throughput' => <integer>,
                                        'VolumeType' => '<string>', // REQUIRED
                                    ],
                                    'VolumesPerInstance' => <integer>,
                                ],
                                // ...
                            ],
                            'EbsOptimized' => true || false,
                        ],
                        'InstanceType' => '<string>', // REQUIRED
                        'Priority' => <float>,
                        'WeightedCapacity' => <integer>,
                    ],
                    // ...
                ],
                'LaunchSpecifications' => [
                    'OnDemandSpecification' => [
                        'AllocationStrategy' => 'lowest-price|prioritized', // REQUIRED
                        'CapacityReservationOptions' => [
                            'CapacityReservationPreference' => 'open|none',
                            'CapacityReservationResourceGroupArn' => '<string>',
                            'UsageStrategy' => 'use-capacity-reservations-first',
                        ],
                    ],
                    'SpotSpecification' => [
                        'AllocationStrategy' => 'capacity-optimized|price-capacity-optimized|lowest-price|diversified|capacity-optimized-prioritized',
                        'BlockDurationMinutes' => <integer>,
                        'TimeoutAction' => 'SWITCH_TO_ON_DEMAND|TERMINATE_CLUSTER', // REQUIRED
                        'TimeoutDurationMinutes' => <integer>, // REQUIRED
                    ],
                ],
                'Name' => '<string>',
                'ResizeSpecifications' => [
                    'OnDemandResizeSpecification' => [
                        'AllocationStrategy' => 'lowest-price|prioritized',
                        'CapacityReservationOptions' => [
                            'CapacityReservationPreference' => 'open|none',
                            'CapacityReservationResourceGroupArn' => '<string>',
                            'UsageStrategy' => 'use-capacity-reservations-first',
                        ],
                        'TimeoutDurationMinutes' => <integer>,
                    ],
                    'SpotResizeSpecification' => [
                        'AllocationStrategy' => 'capacity-optimized|price-capacity-optimized|lowest-price|diversified|capacity-optimized-prioritized',
                        'TimeoutDurationMinutes' => <integer>,
                    ],
                ],
                'TargetOnDemandCapacity' => <integer>,
                'TargetSpotCapacity' => <integer>,
            ],
            // ...
        ],
        'InstanceGroups' => [
            [
                'AutoScalingPolicy' => [
                    'Constraints' => [ // REQUIRED
                        'MaxCapacity' => <integer>, // REQUIRED
                        'MinCapacity' => <integer>, // REQUIRED
                    ],
                    'Rules' => [ // REQUIRED
                        [
                            'Action' => [ // REQUIRED
                                'Market' => 'ON_DEMAND|SPOT',
                                'SimpleScalingPolicyConfiguration' => [ // REQUIRED
                                    'AdjustmentType' => 'CHANGE_IN_CAPACITY|PERCENT_CHANGE_IN_CAPACITY|EXACT_CAPACITY',
                                    'CoolDown' => <integer>,
                                    'ScalingAdjustment' => <integer>, // REQUIRED
                                ],
                            ],
                            'Description' => '<string>',
                            'Name' => '<string>', // REQUIRED
                            'Trigger' => [ // REQUIRED
                                'CloudWatchAlarmDefinition' => [ // REQUIRED
                                    'ComparisonOperator' => 'GREATER_THAN_OR_EQUAL|GREATER_THAN|LESS_THAN|LESS_THAN_OR_EQUAL', // REQUIRED
                                    'Dimensions' => [
                                        [
                                            'Key' => '<string>',
                                            'Value' => '<string>',
                                        ],
                                        // ...
                                    ],
                                    'EvaluationPeriods' => <integer>,
                                    'MetricName' => '<string>', // REQUIRED
                                    'Namespace' => '<string>',
                                    'Period' => <integer>, // REQUIRED
                                    'Statistic' => 'SAMPLE_COUNT|AVERAGE|SUM|MINIMUM|MAXIMUM',
                                    'Threshold' => <float>, // REQUIRED
                                    'Unit' => 'NONE|SECONDS|MICRO_SECONDS|MILLI_SECONDS|BYTES|KILO_BYTES|MEGA_BYTES|GIGA_BYTES|TERA_BYTES|BITS|KILO_BITS|MEGA_BITS|GIGA_BITS|TERA_BITS|PERCENT|COUNT|BYTES_PER_SECOND|KILO_BYTES_PER_SECOND|MEGA_BYTES_PER_SECOND|GIGA_BYTES_PER_SECOND|TERA_BYTES_PER_SECOND|BITS_PER_SECOND|KILO_BITS_PER_SECOND|MEGA_BITS_PER_SECOND|GIGA_BITS_PER_SECOND|TERA_BITS_PER_SECOND|COUNT_PER_SECOND',
                                ],
                            ],
                        ],
                        // ...
                    ],
                ],
                'BidPrice' => '<string>',
                'Configurations' => [
                    [
                        'Classification' => '<string>',
                        'Configurations' => [...], // RECURSIVE
                        'Properties' => ['<string>', ...],
                    ],
                    // ...
                ],
                'CustomAmiId' => '<string>',
                'EbsConfiguration' => [
                    'EbsBlockDeviceConfigs' => [
                        [
                            'VolumeSpecification' => [ // REQUIRED
                                'Iops' => <integer>,
                                'SizeInGB' => <integer>, // REQUIRED
                                'Throughput' => <integer>,
                                'VolumeType' => '<string>', // REQUIRED
                            ],
                            'VolumesPerInstance' => <integer>,
                        ],
                        // ...
                    ],
                    'EbsOptimized' => true || false,
                ],
                'InstanceCount' => <integer>, // REQUIRED
                'InstanceRole' => 'MASTER|CORE|TASK', // REQUIRED
                'InstanceType' => '<string>', // REQUIRED
                'Market' => 'ON_DEMAND|SPOT',
                'Name' => '<string>',
            ],
            // ...
        ],
        'KeepJobFlowAliveWhenNoSteps' => true || false,
        'MasterInstanceType' => '<string>',
        'Placement' => [
            'AvailabilityZone' => '<string>',
            'AvailabilityZones' => ['<string>', ...],
        ],
        'ServiceAccessSecurityGroup' => '<string>',
        'SlaveInstanceType' => '<string>',
        'TerminationProtected' => true || false,
        'UnhealthyNodeReplacement' => true || false,
    ],
    'JobFlowRole' => '<string>',
    'KerberosAttributes' => [
        'ADDomainJoinPassword' => '<string>',
        'ADDomainJoinUser' => '<string>',
        'CrossRealmTrustPrincipalPassword' => '<string>',
        'KdcAdminPassword' => '<string>', // REQUIRED
        'Realm' => '<string>', // REQUIRED
    ],
    'LogEncryptionKmsKeyId' => '<string>',
    'LogUri' => '<string>',
    'ManagedScalingPolicy' => [
        'ComputeLimits' => [
            'MaximumCapacityUnits' => <integer>, // REQUIRED
            'MaximumCoreCapacityUnits' => <integer>,
            'MaximumOnDemandCapacityUnits' => <integer>,
            'MinimumCapacityUnits' => <integer>, // REQUIRED
            'UnitType' => 'InstanceFleetUnits|Instances|VCPU', // REQUIRED
        ],
        'ScalingStrategy' => 'DEFAULT|ADVANCED',
        'UtilizationPerformanceIndex' => <integer>,
    ],
    'Name' => '<string>', // REQUIRED
    'NewSupportedProducts' => [
        [
            'Args' => ['<string>', ...],
            'Name' => '<string>',
        ],
        // ...
    ],
    'OSReleaseLabel' => '<string>',
    'PlacementGroupConfigs' => [
        [
            'InstanceRole' => 'MASTER|CORE|TASK', // REQUIRED
            'PlacementStrategy' => 'SPREAD|PARTITION|CLUSTER|NONE',
        ],
        // ...
    ],
    'ReleaseLabel' => '<string>',
    'RepoUpgradeOnBoot' => 'SECURITY|NONE',
    'ScaleDownBehavior' => 'TERMINATE_AT_INSTANCE_HOUR|TERMINATE_AT_TASK_COMPLETION',
    'SecurityConfiguration' => '<string>',
    'ServiceRole' => '<string>',
    'StepConcurrencyLevel' => <integer>,
    'Steps' => [
        [
            'ActionOnFailure' => 'TERMINATE_JOB_FLOW|TERMINATE_CLUSTER|CANCEL_AND_WAIT|CONTINUE',
            'HadoopJarStep' => [ // REQUIRED
                'Args' => ['<string>', ...],
                'Jar' => '<string>', // REQUIRED
                'MainClass' => '<string>',
                'Properties' => [
                    [
                        'Key' => '<string>',
                        'Value' => '<string>',
                    ],
                    // ...
                ],
            ],
            'Name' => '<string>', // REQUIRED
        ],
        // ...
    ],
    'SupportedProducts' => ['<string>', ...],
    'Tags' => [
        [
            'Key' => '<string>',
            'Value' => '<string>',
        ],
        // ...
    ],
    'VisibleToAllUsers' => true || false,
]);

Parameter Details

Members
AdditionalInfo
Type: string

A JSON string for selecting additional features.

AmiVersion
Type: string

Applies only to Amazon EMR AMI versions 3.x and 2.x. For Amazon EMR releases 4.0 and later, ReleaseLabel is used. To specify a custom AMI, use CustomAmiId.

Applications
Type: Array of Application structures

Applies to Amazon EMR releases 4.0 and later. A case-insensitive list of applications for Amazon EMR to install and configure when launching the cluster. For a list of applications available for each Amazon EMR release version, see the Amazon EMR Release Guide.

AutoScalingRole
Type: string

An IAM role for automatic scaling policies. The default role is EMR_AutoScaling_DefaultRole. The IAM role provides permissions that the automatic scaling feature requires to launch and terminate Amazon EC2 instances in an instance group.

AutoTerminationPolicy
Type: AutoTerminationPolicy structure

An auto-termination policy for an Amazon EMR cluster. An auto-termination policy defines the amount of idle time in seconds after which a cluster automatically terminates. For alternative cluster termination options, see Control cluster termination.

BootstrapActions
Type: Array of BootstrapActionConfig structures

A list of bootstrap actions to run before Hadoop starts on the cluster nodes.

Configurations
Type: Array of Configuration structures

For Amazon EMR releases 4.0 and later. The list of configurations supplied for the Amazon EMR cluster that you are creating.

CustomAmiId
Type: string

Available only in Amazon EMR releases 5.7.0 and later. The ID of a custom Amazon EBS-backed Linux AMI. If specified, Amazon EMR uses this AMI when it launches cluster Amazon EC2 instances. For more information about custom AMIs in Amazon EMR, see Using a Custom AMI in the Amazon EMR Management Guide. If omitted, the cluster uses the base Linux AMI for the ReleaseLabel specified. For Amazon EMR releases 2.x and 3.x, use AmiVersion instead.

For information about creating a custom AMI, see Creating an Amazon EBS-Backed Linux AMI in the Amazon Elastic Compute Cloud User Guide for Linux Instances. For information about finding an AMI ID, see Finding a Linux AMI.

EbsRootVolumeIops
Type: int

The IOPS of the Amazon EBS root device volume of the Linux AMI that is used for each Amazon EC2 instance. Available in Amazon EMR releases 6.15.0 and later.

EbsRootVolumeSize
Type: int

The size, in GiB, of the Amazon EBS root device volume of the Linux AMI that is used for each Amazon EC2 instance. Available in Amazon EMR releases 4.x and later.

EbsRootVolumeThroughput
Type: int

The throughput, in MiB/s, of the Amazon EBS root device volume of the Linux AMI that is used for each Amazon EC2 instance. Available in Amazon EMR releases 6.15.0 and later.

Instances
Required: Yes
Type: JobFlowInstancesConfig structure

A specification of the number and type of Amazon EC2 instances.

JobFlowRole
Type: string

Also called instance profile and Amazon EC2 role. An IAM role for an Amazon EMR cluster. The Amazon EC2 instances of the cluster assume this role. The default role is EMR_EC2_DefaultRole. In order to use the default role, you must have already created it using the CLI or console.

KerberosAttributes
Type: KerberosAttributes structure

Attributes for Kerberos configuration when Kerberos authentication is enabled using a security configuration. For more information, see Use Kerberos Authentication in the Amazon EMR Management Guide.

LogEncryptionKmsKeyId
Type: string

The KMS key used for encrypting log files. If a value is not provided, the logs remain encrypted by AES-256. This attribute is only available with Amazon EMR releases 5.30.0 and later, excluding Amazon EMR 6.0.0.

LogUri
Type: string

The location in Amazon S3 to write the log files of the job flow. If a value is not provided, logs are not created.

ManagedScalingPolicy
Type: ManagedScalingPolicy structure

The specified managed scaling policy for an Amazon EMR cluster.

Name
Required: Yes
Type: string

The name of the job flow.

NewSupportedProducts
Type: Array of SupportedProductConfig structures

For Amazon EMR releases 3.x and 2.x. For Amazon EMR releases 4.x and later, use Applications.

A list of strings that indicates third-party software to use with the job flow that accepts a user argument list. Amazon EMR accepts and forwards the argument list to the corresponding installation script as bootstrap action arguments. For more information, see "Launch a Job Flow on the MapR Distribution for Hadoop" in the Amazon EMR Developer Guide. Supported values are:

  • "mapr-m3" - launch the cluster using MapR M3 Edition.

  • "mapr-m5" - launch the cluster using MapR M5 Edition.

  • "mapr" with the user arguments specifying "--edition,m3" or "--edition,m5" - launch the job flow using MapR M3 or M5 Edition respectively.

  • "mapr-m7" - launch the cluster using MapR M7 Edition.

  • "hunk" - launch the cluster with the Hunk Big Data Analytics Platform.

  • "hue" - launch the cluster with Hue installed.

  • "spark" - launch the cluster with Apache Spark installed.

  • "ganglia" - launch the cluster with the Ganglia Monitoring System installed.

OSReleaseLabel
Type: string

Specifies a particular Amazon Linux release for all nodes in a cluster launched by a RunJobFlow request. If a release is not specified, Amazon EMR uses the latest validated Amazon Linux release for cluster launch.

PlacementGroupConfigs
Type: Array of PlacementGroupConfig structures

The specified placement group configuration for an Amazon EMR cluster.

ReleaseLabel
Type: string

The Amazon EMR release label, which determines the version of open-source application packages installed on the cluster. Release labels are in the form emr-x.x.x, where x.x.x is an Amazon EMR release version such as emr-5.14.0. For more information about Amazon EMR release versions and included application versions and features, see https://docs.aws.amazon.com/emr/latest/ReleaseGuide/. The release label applies only to Amazon EMR releases version 4.0 and later. Earlier versions use AmiVersion.
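
The label format described above can be checked before a request is built. The following is a minimal sketch, not part of the SDK: parseEmrReleaseLabel and usesReleaseLabel are hypothetical helper names, and the regular expression assumes labels consist of exactly three numeric components after the emr- prefix.

```php
<?php
// Hypothetical helper (not part of the SDK): parses a label of the form
// emr-x.x.x and returns [major, minor, patch], or null if it does not match.
function parseEmrReleaseLabel(string $label): ?array
{
    if (preg_match('/^emr-(\d+)\.(\d+)\.(\d+)$/', $label, $m) !== 1) {
        return null;
    }
    return [(int) $m[1], (int) $m[2], (int) $m[3]];
}

// Hypothetical helper: ReleaseLabel applies to releases 4.0 and later;
// earlier versions use AmiVersion instead.
function usesReleaseLabel(string $label): bool
{
    $version = parseEmrReleaseLabel($label);
    return $version !== null && $version[0] >= 4;
}

$parsed = parseEmrReleaseLabel('emr-5.14.0'); // [5, 14, 0]
```

A caller could use usesReleaseLabel to decide whether to send ReleaseLabel or fall back to AmiVersion.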

RepoUpgradeOnBoot
Type: string

Applies only when CustomAmiId is used. Specifies which updates from the Amazon Linux AMI package repositories to apply automatically when the instance boots using the AMI. If omitted, the default is SECURITY, which indicates that only security updates are applied. If NONE is specified, no updates are applied, and all updates must be applied manually.

ScaleDownBehavior
Type: string

Specifies the way that individual Amazon EC2 instances terminate when an automatic scale-in activity occurs or an instance group is resized. TERMINATE_AT_INSTANCE_HOUR indicates that Amazon EMR terminates nodes at the instance-hour boundary, regardless of when the request to terminate the instance was submitted. This option is only available with Amazon EMR 5.1.0 and later and is the default for clusters created using that version. TERMINATE_AT_TASK_COMPLETION indicates that Amazon EMR adds nodes to a deny list and drains tasks from nodes before terminating the Amazon EC2 instances, regardless of the instance-hour boundary. With either behavior, Amazon EMR removes the least active nodes first and blocks instance termination if it could lead to HDFS corruption. TERMINATE_AT_TASK_COMPLETION is available only in Amazon EMR releases 4.1.0 and later, and is the default for releases of Amazon EMR earlier than 5.1.0.

SecurityConfiguration
Type: string

The name of a security configuration to apply to the cluster.

ServiceRole
Type: string

The IAM role that Amazon EMR assumes in order to access Amazon Web Services resources on your behalf. If you've created a custom service role path, you must specify it for the service role when you launch your cluster.

StepConcurrencyLevel
Type: int

Specifies the number of steps that can be executed concurrently. The default value is 1. The maximum value is 256.

Steps
Type: Array of StepConfig structures

A list of steps to run.

SupportedProducts
Type: Array of strings

For Amazon EMR releases 3.x and 2.x. For Amazon EMR releases 4.x and later, use Applications.

A list of strings that indicates third-party software to use. For more information, see the Amazon EMR Developer Guide. Currently supported values are:

  • "mapr-m3" - launch the job flow using MapR M3 Edition.

  • "mapr-m5" - launch the job flow using MapR M5 Edition.

Tags
Type: Array of Tag structures

A list of tags to associate with a cluster and propagate to Amazon EC2 instances.

VisibleToAllUsers
Type: boolean

The VisibleToAllUsers parameter is no longer supported. By default, the value is set to true. Setting it to false now has no effect.

Set this value to true so that IAM principals in the Amazon Web Services account associated with the cluster can perform Amazon EMR actions on the cluster that their IAM policies allow. This value defaults to true for clusters created using the Amazon EMR API or the CLI create-cluster command.

When set to false, only the IAM principal that created the cluster and the Amazon Web Services account root user can perform Amazon EMR actions for the cluster, regardless of the IAM permissions policies attached to other IAM principals. For more information, see Understanding the Amazon EMR cluster VisibleToAllUsers setting in the Amazon EMR Management Guide.
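
As an illustration of how the members above fit together, the following sketch builds a small RunJobFlow parameter array. It is one possible shape, not a recommended configuration: buildRunJobFlowParams is a hypothetical helper, and the cluster name, instance type, and instance counts are placeholder assumptions.

```php
<?php
// Hypothetical helper (not part of the SDK): builds a small RunJobFlow
// parameter array. All concrete values passed in are placeholders.
function buildRunJobFlowParams(
    string $name,
    string $releaseLabel,
    string $instanceType,
    int $coreCount
): array {
    return [
        'Name' => $name, // REQUIRED
        'ReleaseLabel' => $releaseLabel,
        'Instances' => [ // REQUIRED
            'InstanceGroups' => [
                [
                    'InstanceRole' => 'MASTER',
                    'InstanceType' => $instanceType,
                    'InstanceCount' => 1,
                ],
                [
                    'InstanceRole' => 'CORE',
                    'InstanceType' => $instanceType,
                    'InstanceCount' => $coreCount,
                ],
            ],
            // false gives a transient cluster that ends when its steps finish.
            'KeepJobFlowAliveWhenNoSteps' => false,
        ],
        'JobFlowRole' => 'EMR_EC2_DefaultRole', // default role named above
        'StepConcurrencyLevel' => 1,
    ];
}

// 'm5.xlarge' and the cluster name are assumptions for illustration only.
$params = buildRunJobFlowParams('example-cluster', 'emr-5.14.0', 'm5.xlarge', 2);

// With a configured Aws\Emr\EmrClient in $client:
// $result = $client->runJobFlow($params);
```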

Result Syntax

[
    'ClusterArn' => '<string>',
    'JobFlowId' => '<string>',
]

Result Details

Members
ClusterArn
Type: string

The Amazon Resource Name (ARN) of the cluster.

JobFlowId
Type: string

A unique identifier for the job flow.
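
The two result members can be read with array syntax. The sketch below assumes the result behaves like an array (it accepts a plain array or any ArrayAccess object); describeRunJobFlowResult is a hypothetical helper and the sample values are placeholders.

```php
<?php
// Hypothetical helper: pulls the two documented members out of a RunJobFlow
// result. Accepts a plain array or any ArrayAccess object.
function describeRunJobFlowResult(array|ArrayAccess $result): string
{
    $id = $result['JobFlowId'] ?? '(unknown)';
    $arn = $result['ClusterArn'] ?? '(unknown)';
    return sprintf('Started job flow %s (%s)', $id, $arn);
}

// Placeholder data standing in for a real response.
$summary = describeRunJobFlowResult([
    'JobFlowId' => 'j-EXAMPLE1',
    'ClusterArn' => 'arn:example',
]);
```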

Errors

InternalServerError:

Indicates that an error occurred while processing the request and that the request was not completed.

SetKeepJobFlowAliveWhenNoSteps

$result = $client->setKeepJobFlowAliveWhenNoSteps([/* ... */]);
$promise = $client->setKeepJobFlowAliveWhenNoStepsAsync([/* ... */]);

Use SetKeepJobFlowAliveWhenNoSteps to configure whether a cluster (job flow) terminates after all of its steps have run. For a transient cluster that shuts down after the last of the currently executing steps completes, set KeepJobFlowAliveWhenNoSteps to false. For a long-running cluster, set KeepJobFlowAliveWhenNoSteps to true.

For more information, see Managing Cluster Termination in the Amazon EMR Management Guide.

Parameter Syntax

$result = $client->setKeepJobFlowAliveWhenNoSteps([
    'JobFlowIds' => ['<string>', ...], // REQUIRED
    'KeepJobFlowAliveWhenNoSteps' => true || false, // REQUIRED
]);

Parameter Details

Members
JobFlowIds
Required: Yes
Type: Array of strings

A list of strings that uniquely identify the clusters to update. This identifier is returned by RunJobFlow and can also be obtained from DescribeJobFlows.

KeepJobFlowAliveWhenNoSteps
Required: Yes
Type: boolean

A Boolean that indicates whether to keep the cluster running after all steps are executed. Set it to false to terminate the cluster after all steps are executed.
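
A minimal sketch of the two required members follows. buildKeepAliveParams is a hypothetical helper and the cluster ID is a placeholder.

```php
<?php
// Hypothetical helper: builds the SetKeepJobFlowAliveWhenNoSteps parameters.
// Pass true for a long-running cluster, false for a transient one.
function buildKeepAliveParams(array $jobFlowIds, bool $longRunning): array
{
    return [
        'JobFlowIds' => array_values($jobFlowIds),      // REQUIRED
        'KeepJobFlowAliveWhenNoSteps' => $longRunning,  // REQUIRED
    ];
}

// Placeholder cluster ID; false asks for termination once steps finish.
$transient = buildKeepAliveParams(['j-EXAMPLE1'], false);

// With a configured Aws\Emr\EmrClient in $client:
// $client->setKeepJobFlowAliveWhenNoSteps($transient);
```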

Result Syntax

[]

Result Details

The results for this operation are always empty.

Errors

InternalServerError:

Indicates that an error occurred while processing the request and that the request was not completed.

SetTerminationProtection

$result = $client->setTerminationProtection([/* ... */]);
$promise = $client->setTerminationProtectionAsync([/* ... */]);

SetTerminationProtection locks a cluster (job flow) so the Amazon EC2 instances in the cluster cannot be terminated by user intervention, an API call, or in the event of a job-flow error. The cluster still terminates upon successful completion of the job flow. Calling SetTerminationProtection on a cluster is similar to calling the Amazon EC2 DisableAPITermination API on all Amazon EC2 instances in a cluster.

SetTerminationProtection is used to prevent accidental termination of a cluster and to ensure that in the event of an error, the instances persist so that you can recover any data stored in their ephemeral instance storage.

To terminate a cluster that has been locked by setting SetTerminationProtection to true, you must first unlock the job flow by a subsequent call to SetTerminationProtection in which you set the value to false.

For more information, see Managing Cluster Termination in the Amazon EMR Management Guide.

Parameter Syntax

$result = $client->setTerminationProtection([
    'JobFlowIds' => ['<string>', ...], // REQUIRED
    'TerminationProtected' => true || false, // REQUIRED
]);

Parameter Details

Members
JobFlowIds
Required: Yes
Type: Array of strings

A list of strings that uniquely identify the clusters to protect. This identifier is returned by RunJobFlow and can also be obtained from DescribeJobFlows.

TerminationProtected
Required: Yes
Type: boolean

A Boolean that indicates whether to protect the cluster and prevent the Amazon EC2 instances in the cluster from shutting down due to API calls, user intervention, or job-flow error.
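
The unlock-then-terminate sequence described for this operation can be sketched as follows. unlockAndTerminate is a hypothetical helper, and the anonymous class is only a stand-in that records calls so the sketch runs without the SDK; in real use, a configured Aws\Emr\EmrClient would be passed instead.

```php
<?php
// Hypothetical helper: $client is expected to expose the magic methods
// setTerminationProtection() and terminateJobFlows().
function unlockAndTerminate(object $client, array $jobFlowIds): void
{
    // A protected cluster must be unlocked before it can be terminated.
    $client->setTerminationProtection([
        'JobFlowIds' => $jobFlowIds,
        'TerminationProtected' => false,
    ]);
    $client->terminateJobFlows(['JobFlowIds' => $jobFlowIds]);
}

// Stand-in for Aws\Emr\EmrClient that records each call it receives.
$recorder = new class {
    public array $calls = [];

    public function __call(string $name, array $args)
    {
        $this->calls[] = [$name, $args[0] ?? []];
        return [];
    }
};

unlockAndTerminate($recorder, ['j-EXAMPLE1']); // placeholder cluster ID
```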

Result Syntax

[]

Result Details

The results for this operation are always empty.

Errors

InternalServerError:

Indicates that an error occurred while processing the request and that the request was not completed.

SetUnhealthyNodeReplacement

$result = $client->setUnhealthyNodeReplacement([/* ... */]);
$promise = $client->setUnhealthyNodeReplacementAsync([/* ... */]);

Specify whether to enable unhealthy node replacement, which lets Amazon EMR gracefully replace core nodes on a cluster if any nodes become unhealthy. For example, a node becomes unhealthy if disk usage is above 90%. If unhealthy node replacement is on and TerminationProtected is off, Amazon EMR immediately terminates the unhealthy core nodes. To use unhealthy node replacement and retain unhealthy core nodes, use SetTerminationProtection to turn on termination protection. In such cases, Amazon EMR adds the unhealthy nodes to a denylist, reducing job interruptions and failures.

If unhealthy node replacement is on, Amazon EMR notifies YARN and other applications on the cluster to stop scheduling tasks with these nodes, moves the data, and then terminates the nodes.

For more information, see graceful node replacement in the Amazon EMR Management Guide.

Parameter Syntax

$result = $client->setUnhealthyNodeReplacement([
    'JobFlowIds' => ['<string>', ...], // REQUIRED
    'UnhealthyNodeReplacement' => true || false, // REQUIRED
]);

Parameter Details

Members
JobFlowIds
Required: Yes
Type: Array of strings

The list of strings that uniquely identify the clusters for which to turn on unhealthy node replacement. You can get these identifiers by running the RunJobFlow or the DescribeJobFlows operations.

UnhealthyNodeReplacement
Required: Yes
Type: boolean

Indicates whether to turn on or turn off graceful unhealthy node replacement.
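
To retain unhealthy core nodes while still using replacement, termination protection is turned on alongside this setting. The sketch below only builds the two parameter arrays; buildRetainUnhealthyNodesRequests is a hypothetical helper and the cluster ID is a placeholder.

```php
<?php
// Hypothetical helper: returns parameters for the two calls, keyed by the
// client method name that would send them.
function buildRetainUnhealthyNodesRequests(array $jobFlowIds): array
{
    return [
        'setTerminationProtection' => [
            'JobFlowIds' => $jobFlowIds,
            'TerminationProtected' => true,
        ],
        'setUnhealthyNodeReplacement' => [
            'JobFlowIds' => $jobFlowIds,
            'UnhealthyNodeReplacement' => true,
        ],
    ];
}

$requests = buildRetainUnhealthyNodesRequests(['j-EXAMPLE1']);

// With a configured Aws\Emr\EmrClient in $client:
// foreach ($requests as $operation => $params) {
//     $client->{$operation}($params);
// }
```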

Result Syntax

[]

Result Details

The results for this operation are always empty.

Errors

InternalServerError:

Indicates that an error occurred while processing the request and that the request was not completed.

SetVisibleToAllUsers

$result = $client->setVisibleToAllUsers([/* ... */]);
$promise = $client->setVisibleToAllUsersAsync([/* ... */]);

The SetVisibleToAllUsers parameter is no longer supported. Your cluster may be visible to all users in your account. To restrict cluster access using an IAM policy, see Identity and Access Management for Amazon EMR.

Sets the Cluster$VisibleToAllUsers value for an Amazon EMR cluster. When true, IAM principals in the Amazon Web Services account can perform Amazon EMR cluster actions that their IAM policies allow. When false, only the IAM principal that created the cluster and the Amazon Web Services account root user can perform Amazon EMR actions on the cluster, regardless of IAM permissions policies attached to other IAM principals.

This action works on running clusters. When you create a cluster, use the RunJobFlowInput$VisibleToAllUsers parameter.

For more information, see Understanding the Amazon EMR Cluster VisibleToAllUsers Setting in the Amazon EMR Management Guide.

Parameter Syntax

$result = $client->setVisibleToAllUsers([
    'JobFlowIds' => ['<string>', ...], // REQUIRED
    'VisibleToAllUsers' => true || false, // REQUIRED
]);

Parameter Details

Members
JobFlowIds
Required: Yes
Type: Array of strings

The unique identifiers of the job flows (clusters).

VisibleToAllUsers
Required: Yes
Type: boolean

A value of true indicates that an IAM principal in the Amazon Web Services account can perform Amazon EMR actions on the cluster that the IAM policies attached to the principal allow. A value of false indicates that only the IAM principal that created the cluster and the Amazon Web Services root user can perform Amazon EMR actions on the cluster.

Result Syntax

[]

Result Details

The results for this operation are always empty.

Errors

InternalServerError:

Indicates that an error occurred while processing the request and that the request was not completed.

StartNotebookExecution

$result = $client->startNotebookExecution([/* ... */]);
$promise = $client->startNotebookExecutionAsync([/* ... */]);

Starts a notebook execution.

Parameter Syntax

$result = $client->startNotebookExecution([
    'EditorId' => '<string>',
    'EnvironmentVariables' => ['<string>', ...],
    'ExecutionEngine' => [ // REQUIRED
        'ExecutionRoleArn' => '<string>',
        'Id' => '<string>', // REQUIRED
        'MasterInstanceSecurityGroupId' => '<string>',
        'Type' => 'EMR',
    ],
    'NotebookExecutionName' => '<string>',
    'NotebookInstanceSecurityGroupId' => '<string>',
    'NotebookParams' => '<string>',
    'NotebookS3Location' => [
        'Bucket' => '<string>',
        'Key' => '<string>',
    ],
    'OutputNotebookFormat' => 'HTML',
    'OutputNotebookS3Location' => [
        'Bucket' => '<string>',
        'Key' => '<string>',
    ],
    'RelativePath' => '<string>',
    'ServiceRole' => '<string>', // REQUIRED
    'Tags' => [
        [
            'Key' => '<string>',
            'Value' => '<string>',
        ],
        // ...
    ],
]);

Parameter Details

Members
EditorId
Type: string

The unique identifier of the Amazon EMR Notebook to use for notebook execution.

EnvironmentVariables
Type: Associative array of custom strings keys (XmlStringMaxLen256) to strings

The environment variables associated with the notebook execution.

ExecutionEngine
Required: Yes
Type: ExecutionEngineConfig structure

Specifies the execution engine (cluster) that runs the notebook execution.

NotebookExecutionName
Type: string

An optional name for the notebook execution.

NotebookInstanceSecurityGroupId
Type: string

The unique identifier of the Amazon EC2 security group to associate with the Amazon EMR Notebook for this notebook execution.

NotebookParams
Type: string

Input parameters in JSON format passed to the Amazon EMR Notebook at runtime for execution.

NotebookS3Location
Type: NotebookS3LocationFromInput structure

The Amazon S3 location for the notebook execution input.

OutputNotebookFormat
Type: string

The output format for the notebook execution.

OutputNotebookS3Location
Type: OutputNotebookS3LocationFromInput structure

The Amazon S3 location for the notebook execution output.

RelativePath
Type: string

The path and file name of the notebook file for this execution, relative to the path specified for the Amazon EMR Notebook. For example, if you specify a path of s3://MyBucket/MyNotebooks when you create an Amazon EMR Notebook for a notebook with an ID of e-ABCDEFGHIJK1234567890ABCD (the EditorID of this request), and you specify a RelativePath of my_notebook_executions/notebook_execution.ipynb, the location of the file for the notebook execution is s3://MyBucket/MyNotebooks/e-ABCDEFGHIJK1234567890ABCD/my_notebook_executions/notebook_execution.ipynb.
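
The path composition in the example above can be sketched as a small function. notebookExecutionFileLocation is a hypothetical helper; trimming slashes at the joins is an assumption made here for tidiness, not documented service behavior.

```php
<?php
// Hypothetical helper: joins the notebook's base path, the EditorId, and the
// RelativePath in the order shown in the example above.
function notebookExecutionFileLocation(
    string $notebookPath,
    string $editorId,
    string $relativePath
): string {
    return rtrim($notebookPath, '/') . '/' . $editorId . '/' . ltrim($relativePath, '/');
}

$location = notebookExecutionFileLocation(
    's3://MyBucket/MyNotebooks',
    'e-ABCDEFGHIJK1234567890ABCD',
    'my_notebook_executions/notebook_execution.ipynb'
);
echo $location, PHP_EOL;
// prints s3://MyBucket/MyNotebooks/e-ABCDEFGHIJK1234567890ABCD/my_notebook_executions/notebook_execution.ipynb
```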

ServiceRole
Required: Yes
Type: string

The name or ARN of the IAM role that is used as the service role for Amazon EMR (the Amazon EMR role) for the notebook execution.

Tags
Type: Array of Tag structures

A list of tags associated with a notebook execution. Tags are user-defined key-value pairs that consist of a required key string with a maximum of 128 characters and an optional value string with a maximum of 256 characters.
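
The tag limits above can be checked on the client side before sending a request. This is a sketch only: validateNotebookTags and tagLength are hypothetical helpers, and counting characters with mb_strlen (falling back to strlen when the mbstring extension is absent) is an assumption about how the limits are measured.

```php
<?php
// Hypothetical helper: character count, falling back to byte count when the
// mbstring extension is not available.
function tagLength(string $s): int
{
    return function_exists('mb_strlen') ? mb_strlen($s) : strlen($s);
}

// Hypothetical helper: returns a list of problems found; empty means valid.
function validateNotebookTags(array $tags): array
{
    $errors = [];
    foreach ($tags as $i => $tag) {
        $key = $tag['Key'] ?? '';
        if ($key === '') {
            $errors[] = "Tag $i: Key is required";
        } elseif (tagLength($key) > 128) {
            $errors[] = "Tag $i: Key exceeds 128 characters";
        }
        // Value is optional, so only its length is checked when present.
        if (isset($tag['Value']) && tagLength($tag['Value']) > 256) {
            $errors[] = "Tag $i: Value exceeds 256 characters";
        }
    }
    return $errors;
}

$errors = validateNotebookTags([['Key' => 'team', 'Value' => 'data']]);
```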

Result Syntax

[
    'NotebookExecutionId' => '<string>',
]

Result Details

Members
NotebookExecutionId
Type: string

The unique identifier of the notebook execution.
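
A possible way to assemble a StartNotebookExecution request is sketched below. buildStartNotebookExecutionParams is a hypothetical helper, all argument values are placeholders, and encoding NotebookParams with json_encode is one way to produce the JSON string the parameter expects.

```php
<?php
// Hypothetical helper: builds StartNotebookExecution parameters from an
// EditorId and RelativePath, with optional notebook input parameters.
function buildStartNotebookExecutionParams(
    string $clusterId,
    string $serviceRole,
    string $editorId,
    string $relativePath,
    array $notebookParams = []
): array {
    $params = [
        'EditorId' => $editorId,
        'ExecutionEngine' => [ // REQUIRED
            'Id' => $clusterId, // REQUIRED
            'Type' => 'EMR',
        ],
        'RelativePath' => $relativePath,
        'ServiceRole' => $serviceRole, // REQUIRED
    ];
    if ($notebookParams !== []) {
        // NotebookParams is a JSON-formatted string, not a nested array.
        $params['NotebookParams'] = json_encode($notebookParams, JSON_THROW_ON_ERROR);
    }
    return $params;
}

// Placeholder identifiers and role name, for illustration only.
$params = buildStartNotebookExecutionParams(
    'j-EXAMPLE1',
    'ExampleNotebookServiceRole',
    'e-ABCDEFGHIJK1234567890ABCD',
    'my_notebook_executions/notebook_execution.ipynb',
    ['input_date' => '2024-01-01']
);

// With a configured Aws\Emr\EmrClient in $client:
// $result = $client->startNotebookExecution($params);
// $executionId = $result['NotebookExecutionId'];
```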

Errors

InternalServerException:

This exception occurs when there is an internal failure in the Amazon EMR service.

InvalidRequestException:

This exception occurs when there is something wrong with user input.

StopNotebookExecution

$result = $client->stopNotebookExecution([/* ... */]);
$promise = $client->stopNotebookExecutionAsync([/* ... */]);

Stops a notebook execution.

Parameter Syntax

$result = $client->stopNotebookExecution([
    'NotebookExecutionId' => '<string>', // REQUIRED
]);

Parameter Details

Members
NotebookExecutionId
Required: Yes
Type: string

The unique identifier of the notebook execution.

Result Syntax

[]

Result Details

The results for this operation are always empty.

Errors

InternalServerError:

Indicates that an error occurred while processing the request and that the request was not completed.

InvalidRequestException:

This exception occurs when there is something wrong with user input.

TerminateJobFlows

$result = $client->terminateJobFlows([/* ... */]);
$promise = $client->terminateJobFlowsAsync([/* ... */]);

TerminateJobFlows shuts a list of clusters (job flows) down. When a job flow is shut down, any step not yet completed is canceled and the Amazon EC2 instances on which the cluster is running are stopped. Any log files not already saved are uploaded to Amazon S3 if a LogUri was specified when the cluster was created.

The maximum number of clusters allowed is 10. The call to TerminateJobFlows is asynchronous. Depending on the configuration of the cluster, it may take 1 to 5 minutes for the cluster to completely terminate and release allocated resources, such as Amazon EC2 instances.

Parameter Syntax

$result = $client->terminateJobFlows([
    'JobFlowIds' => ['<string>', ...], // REQUIRED
]);

Parameter Details

Members
JobFlowIds
Required: Yes
Type: Array of strings

A list of job flows to be shut down.
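
Because a single call accepts at most 10 clusters, a longer list has to be split across several calls. The following sketch shows one way to do that; batchTerminateRequests and the constant name are hypothetical, and the cluster IDs are generated placeholders.

```php
<?php
// The limit of 10 clusters per call comes from the operation description.
const MAX_CLUSTERS_PER_TERMINATE_CALL = 10;

// Hypothetical helper: de-duplicates the IDs and splits them into parameter
// arrays of at most 10 IDs each.
function batchTerminateRequests(array $jobFlowIds): array
{
    $ids = array_values(array_unique($jobFlowIds));
    $requests = [];
    foreach (array_chunk($ids, MAX_CLUSTERS_PER_TERMINATE_CALL) as $chunk) {
        $requests[] = ['JobFlowIds' => $chunk];
    }
    return $requests;
}

// 23 placeholder IDs produce three requests: 10, 10, and 3 IDs.
$ids = array_map(fn (int $i): string => "j-EXAMPLE$i", range(1, 23));
$requests = batchTerminateRequests($ids);

// With a configured Aws\Emr\EmrClient in $client:
// foreach ($requests as $params) {
//     $client->terminateJobFlows($params);
// }
```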

Result Syntax

[]

Result Details

The results for this operation are always empty.

Errors

InternalServerError:

Indicates that an error occurred while processing the request and that the request was not completed.

UpdateStudio

$result = $client->updateStudio([/* ... */]);
$promise = $client->updateStudioAsync([/* ... */]);

Updates an Amazon EMR Studio configuration, including attributes such as name, description, and subnets.

Parameter Syntax

$result = $client->updateStudio([
    'DefaultS3Location' => '<string>',
    'Description' => '<string>',
    'EncryptionKeyArn' => '<string>',
    'Name' => '<string>',
    'StudioId' => '<string>', // REQUIRED
    'SubnetIds' => ['<string>', ...],
]);

Parameter Details

Members
DefaultS3Location
Type: string

The Amazon S3 location to back up Workspaces and notebook files for the Amazon EMR Studio.

Description
Type: string

A detailed description to assign to the Amazon EMR Studio.

EncryptionKeyArn
Type: string

The KMS key identifier (ARN) used to encrypt Amazon EMR Studio workspace and notebook files when backed up to Amazon S3.

Name
Type: string

A descriptive name for the Amazon EMR Studio.

StudioId
Required: Yes
Type: string

The ID of the Amazon EMR Studio to update.

SubnetIds
Type: Array of strings

A list of subnet IDs to associate with the Amazon EMR Studio. The list can include new subnet IDs, but must also include all of the subnet IDs previously associated with the Studio. The list order does not matter. A Studio can have a maximum of 5 subnets. The subnets must belong to the same VPC as the Studio.
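
Since the list must contain every previously associated subnet plus any new ones, and at most 5 in total, the merge can be sketched as follows. mergeStudioSubnetIds and the constant name are hypothetical, and the subnet IDs are placeholders; how the existing IDs are obtained is left to the caller.

```php
<?php
// The limit of 5 subnets per Studio comes from the description above.
const MAX_STUDIO_SUBNETS = 5;

// Hypothetical helper: combines the existing and additional subnet IDs,
// removes duplicates, and rejects lists longer than the limit.
function mergeStudioSubnetIds(array $existing, array $additional): array
{
    $merged = array_values(array_unique(array_merge($existing, $additional)));
    if (count($merged) > MAX_STUDIO_SUBNETS) {
        throw new InvalidArgumentException(sprintf(
            'A Studio can have a maximum of %d subnets; got %d.',
            MAX_STUDIO_SUBNETS,
            count($merged)
        ));
    }
    return $merged;
}

$subnetIds = mergeStudioSubnetIds(
    ['subnet-aaa', 'subnet-bbb'],
    ['subnet-bbb', 'subnet-ccc']
);

// With a configured Aws\Emr\EmrClient in $client:
// $client->updateStudio(['StudioId' => '<string>', 'SubnetIds' => $subnetIds]);
```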

Result Syntax

[]

Result Details

The results for this operation are always empty.

Errors

InternalServerException:

This exception occurs when there is an internal failure in the Amazon EMR service.

InvalidRequestException:

This exception occurs when there is something wrong with user input.

UpdateStudioSessionMapping

$result = $client->updateStudioSessionMapping([/* ... */]);
$promise = $client->updateStudioSessionMappingAsync([/* ... */]);

Updates the session policy attached to the user or group for the specified Amazon EMR Studio.

Parameter Syntax

$result = $client->updateStudioSessionMapping([
    'IdentityId' => '<string>',
    'IdentityName' => '<string>',
    'IdentityType' => 'USER|GROUP', // REQUIRED
    'SessionPolicyArn' => '<string>', // REQUIRED
    'StudioId' => '<string>', // REQUIRED
]);

Parameter Details

Members
IdentityId
Type: string

The globally unique identifier (GUID) of the user or group. For more information, see UserId and GroupId in the IAM Identity Center Identity Store API Reference. Either IdentityName or IdentityId must be specified.

IdentityName
Type: string

The name of the user or group to update. For more information, see UserName and DisplayName in the IAM Identity Center Identity Store API Reference. Either IdentityName or IdentityId must be specified.

IdentityType
Required: Yes
Type: string

Specifies whether the identity to update is a user or a group.

SessionPolicyArn
Required: Yes
Type: string

The Amazon Resource Name (ARN) of the session policy to associate with the specified user or group.

StudioId
Required: Yes
Type: string

The ID of the Amazon EMR Studio.
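
The rule that either IdentityName or IdentityId must be specified can be enforced when the parameters are assembled. This is a sketch under that reading; buildUpdateSessionMappingParams is a hypothetical helper and the identifiers in the usage line are placeholders.

```php
<?php
// Hypothetical helper: builds UpdateStudioSessionMapping parameters and
// rejects input that names neither an IdentityId nor an IdentityName.
function buildUpdateSessionMappingParams(
    string $studioId,
    string $identityType,
    string $sessionPolicyArn,
    ?string $identityId = null,
    ?string $identityName = null
): array {
    if (!in_array($identityType, ['USER', 'GROUP'], true)) {
        throw new InvalidArgumentException('IdentityType must be USER or GROUP.');
    }
    if ($identityId === null && $identityName === null) {
        throw new InvalidArgumentException('Either IdentityName or IdentityId must be specified.');
    }

    $params = [
        'IdentityType' => $identityType,         // REQUIRED
        'SessionPolicyArn' => $sessionPolicyArn, // REQUIRED
        'StudioId' => $studioId,                 // REQUIRED
    ];
    if ($identityId !== null) {
        $params['IdentityId'] = $identityId;
    }
    if ($identityName !== null) {
        $params['IdentityName'] = $identityName;
    }
    return $params;
}

// Placeholder Studio ID, policy ARN, and user name.
$params = buildUpdateSessionMappingParams('es-EXAMPLE', 'USER', 'arn:example', null, 'example-user');

// With a configured Aws\Emr\EmrClient in $client:
// $client->updateStudioSessionMapping($params);
```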

Result Syntax

[]

Result Details

The results for this operation are always empty.

Errors

InternalServerError:

Indicates that an error occurred while processing the request and that the request was not completed.

InvalidRequestException:

This exception occurs when there is something wrong with user input.

Shapes

Application

Description

With Amazon EMR release version 4.0 and later, the only accepted parameter is the application name. To pass arguments to applications, you use configuration classifications specified using configuration JSON objects. For more information, see Configuring Applications.

With earlier Amazon EMR releases, the application is any Amazon or third-party software that you can add to the cluster. This structure contains a list of strings that indicates the software to use with the cluster and accepts a user argument list. Amazon EMR accepts and forwards the argument list to the corresponding installation script as bootstrap action argument.

Members
AdditionalInfo
Type: Associative array of custom strings keys (String) to strings

This option is for advanced users only. This is meta information about third-party applications that third-party vendors use for testing purposes.

Args
Type: Array of strings

Arguments for Amazon EMR to pass to the application.

Name
Type: string

The name of the application.

Version
Type: string

The version of the application.

AutoScalingPolicy

Description

An automatic scaling policy for a core instance group or task instance group in an Amazon EMR cluster. An automatic scaling policy defines how an instance group dynamically adds and terminates Amazon EC2 instances in response to the value of a CloudWatch metric. See PutAutoScalingPolicy.

Members
Constraints
Required: Yes
Type: ScalingConstraints structure

The upper and lower Amazon EC2 instance limits for an automatic scaling policy. Automatic scaling activity will not cause an instance group to grow above or below these limits.

Rules
Required: Yes
Type: Array of ScalingRule structures

The scale-in and scale-out rules that comprise the automatic scaling policy.

AutoScalingPolicyDescription

Description

An automatic scaling policy for a core instance group or task instance group in an Amazon EMR cluster. The automatic scaling policy defines how an instance group dynamically adds and terminates Amazon EC2 instances in response to the value of a CloudWatch metric. See PutAutoScalingPolicy.

Members
Constraints
Type: ScalingConstraints structure

The upper and lower Amazon EC2 instance limits for an automatic scaling policy. Automatic scaling activity will not cause an instance group to grow above or below these limits.

Rules
Type: Array of ScalingRule structures

The scale-in and scale-out rules that comprise the automatic scaling policy.

Status
Type: AutoScalingPolicyStatus structure

The status of an automatic scaling policy.

AutoScalingPolicyStateChangeReason

Description

The reason for an AutoScalingPolicyStatus change.

Members
Code
Type: string

The code indicating the reason for the change in status. USER_REQUEST indicates that the scaling policy status was changed by a user. PROVISION_FAILURE indicates that the status change was because the policy failed to provision. CLEANUP_FAILURE indicates an error.

Message
Type: string

A friendly, more verbose message that accompanies an automatic scaling policy state change.

AutoScalingPolicyStatus

Description

The status of an automatic scaling policy.

Members
State
Type: string

Indicates the status of the automatic scaling policy.

StateChangeReason
Type: AutoScalingPolicyStateChangeReason structure

The reason for a change in status.

AutoTerminationPolicy

Description

An auto-termination policy for an Amazon EMR cluster. An auto-termination policy defines the amount of idle time in seconds after which a cluster automatically terminates. For alternative cluster termination options, see Control cluster termination.

Members
IdleTimeout
Type: long (int|float)

Specifies the amount of idle time in seconds after which the cluster automatically terminates. You can specify a minimum of 60 seconds and a maximum of 604800 seconds (seven days).
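The documented bounds for IdleTimeout can be validated locally before the policy is sent. This is a minimal sketch; the helper name buildAutoTerminationPolicy is hypothetical and not part of the SDK.

```php
// Hypothetical helper: validates IdleTimeout against the documented bounds
// (60 seconds to 604800 seconds) and returns an AutoTerminationPolicy array.
function buildAutoTerminationPolicy(int $idleTimeoutSeconds): array
{
    $minimum = 60;
    $maximum = 604800; // 7 days * 24 hours * 3600 seconds

    if ($idleTimeoutSeconds < $minimum || $idleTimeoutSeconds > $maximum) {
        throw new InvalidArgumentException(
            "IdleTimeout must be between {$minimum} and {$maximum} seconds."
        );
    }
    return ['IdleTimeout' => $idleTimeoutSeconds];
}

// Terminate the cluster after two hours of idle time.
$policy = buildAutoTerminationPolicy(2 * 60 * 60);
```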

BlockPublicAccessConfiguration

Description

A configuration for Amazon EMR block public access. When BlockPublicSecurityGroupRules is set to true, Amazon EMR prevents cluster creation if one of the cluster's security groups has a rule that allows inbound traffic from 0.0.0.0/0 or ::/0 on a port, unless the port is specified as an exception using PermittedPublicSecurityGroupRuleRanges.

Members
BlockPublicSecurityGroupRules
Required: Yes
Type: boolean

Indicates whether Amazon EMR block public access is enabled (true) or disabled (false). By default, the value is false for accounts that have created Amazon EMR clusters before July 2019. For accounts created after this, the default is true.

PermittedPublicSecurityGroupRuleRanges
Type: Array of PortRange structures

Specifies ports and port ranges that are permitted to have security group rules that allow inbound traffic from all public sources. For example, if Port 23 (Telnet) is specified for PermittedPublicSecurityGroupRuleRanges, Amazon EMR allows cluster creation if a security group associated with the cluster has a rule that allows inbound traffic on Port 23 from IPv4 0.0.0.0/0 or IPv6 ::/0 as the source.

By default, Port 22, which is used for SSH access to the cluster Amazon EC2 instances, is in the list of PermittedPublicSecurityGroupRuleRanges.
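To illustrate how the permitted ranges act as exceptions, the sketch below checks whether a port falls inside any permitted range. It assumes each PortRange structure has a MinRange member and an optional MaxRange member (the PortRange shape is not described in this section), and the helper isPortPermitted is hypothetical. The actual enforcement happens in the service, not in client code.

```php
// Hypothetical helper: returns true when $port falls inside any permitted range.
// Assumes each PortRange has MinRange and an optional MaxRange.
function isPortPermitted(int $port, array $permittedRanges): bool
{
    foreach ($permittedRanges as $range) {
        $min = $range['MinRange'];
        // Assumption: a range without MaxRange covers the single port MinRange.
        $max = $range['MaxRange'] ?? $min;
        if ($port >= $min && $port <= $max) {
            return true;
        }
    }
    return false;
}

$configuration = [
    'BlockPublicSecurityGroupRules' => true,
    'PermittedPublicSecurityGroupRuleRanges' => [
        ['MinRange' => 22, 'MaxRange' => 22],     // SSH
        ['MinRange' => 8000, 'MaxRange' => 8010], // illustrative range
    ],
];

$ranges = $configuration['PermittedPublicSecurityGroupRuleRanges'];
$sshPermitted = isPortPermitted(22, $ranges);
$telnetPermitted = isPortPermitted(23, $ranges);
```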

BlockPublicAccessConfigurationMetadata

Description

Properties that describe the Amazon Web Services principal that created the BlockPublicAccessConfiguration using the PutBlockPublicAccessConfiguration action as well as the date and time that the configuration was created. Each time a configuration for block public access is updated, Amazon EMR updates this metadata.

Members
CreatedByArn
Required: Yes
Type: string

The Amazon Resource Name (ARN) of the principal that created or last modified the configuration.

CreationDateTime
Required: Yes
Type: timestamp (string|DateTime or anything parsable by strtotime)

The date and time that the configuration was created.

BootstrapActionConfig

Description

Configuration of a bootstrap action.

Members
Name
Required: Yes
Type: string

The name of the bootstrap action.

ScriptBootstrapAction
Required: Yes
Type: ScriptBootstrapActionConfig structure

The script run by the bootstrap action.

BootstrapActionDetail

Description

Reports the configuration of a bootstrap action in a cluster (job flow).

Members
BootstrapActionConfig
Type: BootstrapActionConfig structure

A description of the bootstrap action.

CancelStepsInfo

Description

Specification of the status of a CancelSteps request. Available only in Amazon EMR version 4.8.0 and later, excluding version 5.0.0.

Members
Reason
Type: string

The reason for the failure if the CancelSteps request fails.

Status
Type: string

The status of a CancelSteps request. The value may be SUBMITTED or FAILED.

StepId
Type: string

The encrypted StepId of a step.
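A list of CancelStepsInfo structures can be scanned for failed cancellations. The sketch below uses a hypothetical helper, failedCancellations, and sample data with placeholder step IDs.

```php
// Hypothetical helper: maps the StepId of each failed cancellation to its Reason.
function failedCancellations(array $cancelStepsInfoList): array
{
    $failed = [];
    foreach ($cancelStepsInfoList as $info) {
        if (($info['Status'] ?? null) === 'FAILED') {
            $failed[$info['StepId']] = $info['Reason'] ?? 'No reason reported';
        }
    }
    return $failed;
}

// Sample data with placeholder values, shaped like CancelStepsInfo structures.
$sample = [
    ['StepId' => 's-EXAMPLE1', 'Status' => 'SUBMITTED'],
    ['StepId' => 's-EXAMPLE2', 'Status' => 'FAILED', 'Reason' => 'Example reason text'],
];

$failed = failedCancellations($sample);
```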

CloudWatchAlarmDefinition

Description

The definition of a CloudWatch metric alarm, which determines when an automatic scaling activity is triggered. When the defined alarm conditions are satisfied, scaling activity begins.

Members
ComparisonOperator
Required: Yes
Type: string

Determines how the metric specified by MetricName is compared to the value specified by Threshold.

Dimensions
Type: Array of MetricDimension structures

A CloudWatch metric dimension.

EvaluationPeriods
Type: int

The number of periods, in five-minute increments, during which the alarm condition must exist before the alarm triggers automatic scaling activity. The default value is 1.

MetricName
Required: Yes
Type: string

The name of the CloudWatch metric that is watched to determine an alarm condition.

Namespace
Type: string

The namespace for the CloudWatch metric. The default is AWS/ElasticMapReduce.

Period
Required: Yes
Type: int

The period, in seconds, over which the statistic is applied. CloudWatch metrics for Amazon EMR are emitted every five minutes (300 seconds), so if you specify a CloudWatch metric, specify 300.

Statistic
Type: string

The statistic to apply to the metric associated with the alarm. The default is AVERAGE.

Threshold
Required: Yes
Type: double

The value against which the specified statistic is compared.

Unit
Type: string

The unit of measure associated with the CloudWatch metric being watched. The value specified for Unit must correspond to the units specified in the CloudWatch metric.
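The required members and documented defaults of this shape can be made explicit with a small builder. This is a sketch under stated assumptions: buildAlarmDefinition is a hypothetical helper, and the metric name, comparison operator, and unit values are illustrative choices that are not listed in this section.

```php
// Hypothetical helper: checks the required members and fills in the
// documented defaults for members the caller did not set.
function buildAlarmDefinition(array $members): array
{
    $required = ['ComparisonOperator', 'MetricName', 'Period', 'Threshold'];
    foreach ($required as $name) {
        if (!array_key_exists($name, $members)) {
            throw new InvalidArgumentException("Missing required member: {$name}");
        }
    }

    $defaults = [
        'EvaluationPeriods' => 1,
        'Namespace' => 'AWS/ElasticMapReduce',
        'Statistic' => 'AVERAGE',
    ];
    // Array union keeps the caller's values and adds only the missing defaults.
    return $members + $defaults;
}

$alarm = buildAlarmDefinition([
    'ComparisonOperator' => 'LESS_THAN',            // illustrative value
    'MetricName' => 'YARNMemoryAvailablePercentage', // illustrative metric
    'Period' => 300,                                 // matches the five-minute emission interval
    'Threshold' => 15.0,
    'Unit' => 'PERCENT',                             // illustrative value
]);
```

Spelling out the defaults is optional, since the service applies them when the members are omitted; doing so only makes the resulting definition easier to read.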

Cluster

Description

The detailed description of the cluster.

Members
Applications
Type: Array of Application structures

The applications installed on this cluster.

AutoScalingRole
Type: string

An IAM role for automatic scaling policies. The default role is EMR_AutoScaling_DefaultRole. The IAM role provides permissions that the automatic scaling feature requires to launch and terminate Amazon EC2 instances in an instance group.

AutoTerminate
Type: boolean

Specifies whether the cluster should terminate after completing all steps.

ClusterArn
Type: string

The Amazon Resource Name of the cluster.

Configurations
Type: Array of Configuration structures

Applies only to Amazon EMR releases 4.x and later. The list of configurations that are supplied to the Amazon EMR cluster.

CustomAmiId
Type: string

Available only in Amazon EMR releases 5.7.0 and later. The ID of a custom Amazon EBS-backed Linux AMI if the cluster uses a custom AMI.

EbsRootVolumeIops
Type: int

The IOPS of the Amazon EBS root device volume of the Linux AMI that is used for each Amazon EC2 instance. Available in Amazon EMR releases 6.15.0 and later.

EbsRootVolumeSize
Type: int

The size, in GiB, of the Amazon EBS root device volume of the Linux AMI that is used for each Amazon EC2 instance. Available in Amazon EMR releases 4.x and later.

EbsRootVolumeThroughput
Type: int

The throughput, in MiB/s, of the Amazon EBS root device volume of the Linux AMI that is used for each Amazon EC2 instance. Available in Amazon EMR releases 6.15.0 and later.

Ec2InstanceAttributes
Type: Ec2InstanceAttributes structure

Provides information about the Amazon EC2 instances in a cluster grouped by category. For example, key name, subnet ID, IAM instance profile, and so on.

Id
Type: string

The unique identifier for the cluster.

InstanceCollectionType
Type: string

The instance fleet configuration is available only in Amazon EMR releases 4.8.0 and later, excluding 5.0.x versions.

The instance group configuration of the cluster. A value of INSTANCE_GROUP indicates a uniform instance group configuration. A value of INSTANCE_FLEET indicates an instance fleets configuration.

KerberosAttributes
Type: KerberosAttributes structure

Attributes for Kerberos configuration when Kerberos authentication is enabled using a security configuration. For more information, see Use Kerberos Authentication in the Amazon EMR Management Guide.

LogEncryptionKmsKeyId
Type: string

The KMS key used for encrypting log files. This attribute is only available with Amazon EMR 5.30.0 and later, excluding Amazon EMR 6.0.0.

LogUri
Type: string

The path to the Amazon S3 location where logs for this cluster are stored.

MasterPublicDnsName
Type: string

The DNS name of the master node. If the cluster is on a private subnet, this is the private DNS name. On a public subnet, this is the public DNS name.

Name
Type: string

The name of the cluster. This parameter can't contain the characters <, >, $, |, or ` (backtick).

NormalizedInstanceHours
Type: int

An approximation of the cost of the cluster, represented in m1.small/hours. This value is incremented one time for every hour an m1.small instance runs. Larger instances are weighted more, so an Amazon EC2 instance that is roughly four times more expensive would result in the normalized instance hours being incremented by four. This result is only an approximation and does not reflect the actual billing rate.

OSReleaseLabel
Type: string

The Amazon Linux release specified in a cluster launch RunJobFlow request. If no Amazon Linux release was specified, the default Amazon Linux release is shown in the response.

OutpostArn
Type: string

The Amazon Resource Name (ARN) of the Outpost where the cluster is launched.

PlacementGroups
Type: Array of PlacementGroupConfig structures

Placement group configured for an Amazon EMR cluster.

ReleaseLabel
Type: string

The Amazon EMR release label, which determines the version of open-source application packages installed on the cluster. Release labels are in the form emr-x.x.x, where x.x.x is an Amazon EMR release version such as emr-5.14.0. For more information about Amazon EMR release versions and included application versions and features, see https://docs.aws.amazon.com/emr/latest/ReleaseGuide/. The release label applies only to Amazon EMR releases version 4.0 and later. Earlier versions use AmiVersion.

RepoUpgradeOnBoot
Type: string

Applies only when CustomAmiId is used. Specifies the type of updates that the Amazon Linux AMI package repositories apply when an instance boots using the AMI.

RequestedAmiVersion
Type: string

The AMI version requested for this cluster.

RunningAmiVersion
Type: string

The AMI version running on this cluster.

ScaleDownBehavior
Type: string

The way that individual Amazon EC2 instances terminate when an automatic scale-in activity occurs or an instance group is resized. TERMINATE_AT_INSTANCE_HOUR indicates that Amazon EMR terminates nodes at the instance-hour boundary, regardless of when the request to terminate the instance was submitted. This option is only available with Amazon EMR 5.1.0 and later and is the default for clusters created using that version. TERMINATE_AT_TASK_COMPLETION indicates that Amazon EMR adds nodes to a deny list and drains tasks from nodes before terminating the Amazon EC2 instances, regardless of the instance-hour boundary. With either behavior, Amazon EMR removes the least active nodes first and blocks instance termination if it could lead to HDFS corruption. TERMINATE_AT_TASK_COMPLETION is available only in Amazon EMR releases 4.1.0 and later, and is the default for versions of Amazon EMR earlier than 5.1.0.

SecurityConfiguration
Type: string

The name of the security configuration applied to the cluster.

ServiceRole
Type: string

The IAM role that Amazon EMR assumes in order to access Amazon Web Services resources on your behalf.

Status
Type: ClusterStatus structure

The current status details about the cluster.

StepConcurrencyLevel
Type: int

Specifies the number of steps that can be executed concurrently.

Tags
Type: Array of Tag structures

A list of tags associated with a cluster.

TerminationProtected
Type: boolean

Indicates whether Amazon EMR will lock the cluster to prevent the Amazon EC2 instances from being terminated by an API call or user intervention, or in the event of a cluster error.

UnhealthyNodeReplacement
Type: boolean

Indicates whether Amazon EMR should gracefully replace Amazon EC2 core instances that have degraded within the cluster.

VisibleToAllUsers
Type: boolean

Indicates whether the cluster is visible to IAM principals in the Amazon Web Services account associated with the cluster. When true, IAM principals in the Amazon Web Services account can perform Amazon EMR cluster actions on the cluster that their IAM policies allow. When false, only the IAM principal that created the cluster and the Amazon Web Services account root user can perform Amazon EMR actions, regardless of IAM permissions policies attached to other IAM principals.

The default value is true if a value is not provided when creating a cluster using the Amazon EMR API RunJobFlow command, the CLI create-cluster command, or the Amazon Web Services Management Console.
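The version-dependent defaults described for ScaleDownBehavior can be expressed as a version comparison. The sketch below is one reading of that description, not service logic: defaultScaleDownBehavior is a hypothetical helper, and it returns null for releases earlier than 4.1.0, for which no default is documented here.

```php
// Hypothetical helper: derives the documented default ScaleDownBehavior
// from a release label such as "emr-5.14.0" or a bare version such as "4.8.0".
function defaultScaleDownBehavior(string $releaseLabel): ?string
{
    // Strip the "emr-" prefix, if present, to get a comparable version string.
    $version = preg_replace('/^emr-/', '', $releaseLabel);

    if (version_compare($version, '5.1.0', '>=')) {
        return 'TERMINATE_AT_INSTANCE_HOUR';
    }
    if (version_compare($version, '4.1.0', '>=')) {
        return 'TERMINATE_AT_TASK_COMPLETION';
    }
    // No default is described for earlier releases.
    return null;
}

$newer = defaultScaleDownBehavior('emr-5.14.0');
$older = defaultScaleDownBehavior('emr-4.8.0');
```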

ClusterStateChangeReason

Description

The reason that the cluster changed to its current state.

Members
Code
Type: string

The programmatic code for the state change reason.

Message
Type: string

The descriptive message for the state change reason.

ClusterStatus

Description

The detailed status of the cluster.

Members
ErrorDetails
Type: Array of ErrorDetail structures

A list of tuples that provides information about the errors that caused a cluster to terminate. This structure can contain up to 10 different ErrorDetail tuples.

State
Type: string

The current state of the cluster.

StateChangeReason
Type: ClusterStateChangeReason structure

The reason for the cluster status change.

Timeline
Type: ClusterTimeline structure

A timeline that represents the status of a cluster over the lifetime of the cluster.

ClusterSummary

Description

The summary description of the cluster.

Members
ClusterArn
Type: string

The Amazon Resource Name of the cluster.

Id
Type: string

The unique identifier for the cluster.

Name
Type: string

The name of the cluster.

NormalizedInstanceHours
Type: int

An approximation of the cost of the cluster, represented in m1.small/hours. This value is incremented one time for every hour an m1.small instance runs. Larger instances are weighted more, so an Amazon EC2 instance that is roughly four times more expensive would result in the normalized instance hours being incremented by four. This result is only an approximation and does not reflect the actual billing rate.

OutpostArn
Type: string

The Amazon Resource Name (ARN) of the Outpost where the cluster is launched.

Status
Type: ClusterStatus structure

The details about the current status of the cluster.

ClusterTimeline

Description

Represents the timeline of the cluster's lifecycle.

Members
CreationDateTime
Type: timestamp (string|DateTime or anything parsable by strtotime)

The creation date and time of the cluster.

EndDateTime
Type: timestamp (string|DateTime or anything parsable by strtotime)

The date and time when the cluster was terminated.

ReadyDateTime
Type: timestamp (string|DateTime or anything parsable by strtotime)

The date and time when the cluster was ready to run steps.
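The timeline members can be combined to measure how long a cluster took to become ready. The sketch below assumes the timestamps arrive as DateTimeInterface objects, which is how the SDK represents timestamp members in results; the helper secondsUntilReady and the sample times are hypothetical.

```php
// Hypothetical helper: seconds between cluster creation and readiness,
// or null when either timestamp is not yet available.
function secondsUntilReady(array $timeline): ?int
{
    if (!isset($timeline['CreationDateTime'], $timeline['ReadyDateTime'])) {
        return null;
    }
    return $timeline['ReadyDateTime']->getTimestamp()
        - $timeline['CreationDateTime']->getTimestamp();
}

// Sample data shaped like a ClusterTimeline structure.
$timeline = [
    'CreationDateTime' => new DateTimeImmutable('2024-01-01T00:00:00Z'),
    'ReadyDateTime' => new DateTimeImmutable('2024-01-01T00:07:30Z'),
];

$startupSeconds = secondsUntilReady($timeline);
```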

Command

Description

An entity describing an executable that runs on a cluster.

Members
Args
Type: Array of strings

Arguments for Amazon EMR to pass to the command for execution.

Name
Type: string

The name of the command.

ScriptPath
Type: string

The Amazon S3 location of the command script.

ComputeLimits

Description

The Amazon EC2 unit limits for a managed scaling policy. The managed scaling activity of a cluster cannot be above or below these limits. The limit only applies to the core and task nodes. The master node cannot be scaled after initial configuration.

Members
MaximumCapacityUnits
Required: Yes
Type: int

The upper boundary of Amazon EC2 units. It is measured through vCPU cores or instances for instance groups and measured through units for instance fleets. Managed scaling activities are not allowed beyond this boundary. The limit only applies to the core and task nodes. The master node cannot be scaled after initial configuration.

MaximumCoreCapacityUnits
Type: int

The upper boundary of Amazon EC2 units for core node type in a cluster. It is measured through vCPU cores or instances for instance groups and measured through units for instance fleets. The core units are not allowed to scale beyond this boundary. The parameter is used to split capacity allocation between core and task nodes.

MaximumOnDemandCapacityUnits
Type: int

The upper boundary of On-Demand Amazon EC2 units. It is measured through vCPU cores or instances for instance groups and measured through units for instance fleets. The On-Demand units are not allowed to scale beyond this boundary. The parameter is used to split capacity allocation between On-Demand and Spot Instances.

MinimumCapacityUnits
Required: Yes
Type: int

The lower boundary of Amazon EC2 units. It is measured through vCPU cores or instances for instance groups and measured through units for instance fleets. Managed scaling activities are not allowed beyond this boundary. The limit only applies to the core and task nodes. The master node cannot be scaled after initial configuration.

UnitType
Required: Yes
Type: string

The unit type used for specifying a managed scaling policy.
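A ComputeLimits array can be assembled with a basic sanity check on the boundaries. This is a sketch: buildComputeLimits is a hypothetical helper, the check that the minimum does not exceed the maximum is an assumption implied by "lower" and "upper" boundary, and the UnitType value shown is illustrative because valid values are not listed in this section.

```php
// Hypothetical helper: assembles a ComputeLimits array.
function buildComputeLimits(
    string $unitType,
    int $minimumUnits,
    int $maximumUnits,
    ?int $maximumOnDemandUnits = null,
    ?int $maximumCoreUnits = null
): array {
    // Assumption: the lower boundary should not exceed the upper boundary.
    if ($minimumUnits > $maximumUnits) {
        throw new InvalidArgumentException('MinimumCapacityUnits exceeds MaximumCapacityUnits.');
    }

    $limits = [
        'UnitType' => $unitType,
        'MinimumCapacityUnits' => $minimumUnits,
        'MaximumCapacityUnits' => $maximumUnits,
    ];
    // Optional members are included only when set.
    if ($maximumOnDemandUnits !== null) {
        $limits['MaximumOnDemandCapacityUnits'] = $maximumOnDemandUnits;
    }
    if ($maximumCoreUnits !== null) {
        $limits['MaximumCoreCapacityUnits'] = $maximumCoreUnits;
    }
    return $limits;
}

// Scale between 2 and 10 units, with at most 4 On-Demand and 3 core units.
$limits = buildComputeLimits('Instances', 2, 10, 4, 3);
```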

Configuration

Description

Applies only to Amazon EMR releases 4.x and later.

An optional configuration specification to be used when provisioning cluster instances, which can include configurations for applications and software bundled with Amazon EMR. A configuration consists of a classification, properties, and optional nested configurations. A classification refers to an application-specific configuration file. Properties are the settings you want to change in that file. For more information, see Configuring Applications.

Members
Classification
Type: string

The classification within a configuration.

Configurations
Type: Array of Configuration structures

A list of additional configurations to apply within a configuration object.

Properties
Type: Associative array of custom strings keys (String) to strings

A set of properties specified within a configuration classification.
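The nesting of classification, properties, and nested configurations can be sketched as follows. The helper configuration is hypothetical, and the classification names and property values are illustrative only.

```php
// Hypothetical helper: builds one Configuration array, omitting empty members.
function configuration(string $classification, array $properties = [], array $nested = []): array
{
    $config = ['Classification' => $classification];
    if ($properties !== []) {
        $config['Properties'] = $properties;
    }
    if ($nested !== []) {
        $config['Configurations'] = $nested;
    }
    return $config;
}

// Illustrative classifications and settings.
$configurations = [
    configuration('spark-defaults', ['spark.executor.memory' => '4g']),
    configuration('hadoop-env', [], [
        // A nested configuration inside the hadoop-env classification.
        configuration('export', ['HADOOP_DATANODE_HEAPSIZE' => '2048']),
    ]),
];
```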

Credentials

Description

The credentials that you can use to connect to cluster endpoints. Credentials consist of a username and a password.

Members
UsernamePassword
Type: UsernamePassword structure

The username and password that you use to connect to cluster endpoints.

EbsBlockDevice

Description

Configuration of requested EBS block device associated with the instance group.

Members
Device
Type: string

The device name that is exposed to the instance, such as /dev/sdh.

VolumeSpecification
Type: VolumeSpecification structure

EBS volume specifications such as volume type, IOPS, size (GiB) and throughput (MiB/s) that are requested for the EBS volume attached to an Amazon EC2 instance in the cluster.

EbsBlockDeviceConfig

Description

Configuration of a requested EBS block device associated with the instance group, with the count of volumes that are associated with every instance.

Members
VolumeSpecification
Required: Yes
Type: VolumeSpecification structure

EBS volume specifications such as volume type, IOPS, size (GiB) and throughput (MiB/s) that are requested for the EBS volume attached to an Amazon EC2 instance in the cluster.

VolumesPerInstance
Type: int

Number of EBS volumes with a specific volume configuration that are associated with every instance in the instance group.

EbsConfiguration

Description

The Amazon EBS configuration of a cluster instance.

Members
EbsBlockDeviceConfigs
Type: Array of EbsBlockDeviceConfig structures

An array of Amazon EBS volume specifications attached to a cluster instance.

EbsOptimized
Type: boolean

Indicates whether an Amazon EBS volume is EBS-optimized.
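To show how VolumesPerInstance multiplies a volume specification, the sketch below totals the EBS storage requested per instance. It assumes the VolumeSpecification structure carries a SizeInGB member (that shape is not described in this section) and treats a missing VolumesPerInstance as 1; the helper totalEbsGiBPerInstance and all values are hypothetical.

```php
// Hypothetical helper: sums requested EBS capacity (GiB) for one instance.
function totalEbsGiBPerInstance(array $ebsConfiguration): int
{
    $total = 0;
    foreach ($ebsConfiguration['EbsBlockDeviceConfigs'] ?? [] as $deviceConfig) {
        // Assumption: VolumeSpecification has a SizeInGB member.
        $size = $deviceConfig['VolumeSpecification']['SizeInGB'];
        // Assumption: an omitted VolumesPerInstance means one volume.
        $count = $deviceConfig['VolumesPerInstance'] ?? 1;
        $total += $size * $count;
    }
    return $total;
}

$ebsConfiguration = [
    'EbsOptimized' => true,
    'EbsBlockDeviceConfigs' => [
        [
            'VolumeSpecification' => ['VolumeType' => 'gp3', 'SizeInGB' => 100],
            'VolumesPerInstance' => 2,
        ],
        [
            'VolumeSpecification' => ['VolumeType' => 'gp3', 'SizeInGB' => 500],
        ],
    ],
];

$totalGiB = totalEbsGiBPerInstance($ebsConfiguration); // 100 * 2 + 500 * 1
```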

EbsVolume

Description

EBS block device that's attached to an Amazon EC2 instance.

Members
Device
Type: string

The device name that is exposed to the instance, such as /dev/sdh.

VolumeId
Type: string

The volume identifier of the EBS volume.

Ec2InstanceAttributes

Description

Provides information about the Amazon EC2 instances in a cluster grouped by category. For example, key name, subnet ID, IAM instance profile, and so on.

Members
AdditionalMasterSecurityGroups
Type: Array of strings

A list of additional Amazon EC2 security group IDs for the master node.

AdditionalSlaveSecurityGroups
Type: Array of strings

A list of additional Amazon EC2 security group IDs for the core and task nodes.

Ec2AvailabilityZone
Type: string

The Availability Zone in which the cluster will run.

Ec2KeyName
Type: string

The name of the Amazon EC2 key pair to use when connecting with SSH into the master node as a user named "hadoop".

Ec2SubnetId
Type: string

Set this parameter to the identifier of the Amazon VPC subnet where you want the cluster to launch. If you do not specify this value, and your account supports EC2-Classic, the cluster launches in EC2-Classic.

EmrManagedMasterSecurityGroup
Type: string

The identifier of the Amazon EC2 security group for the master node.

EmrManagedSlaveSecurityGroup
Type: string

The identifier of the Amazon EC2 security group for the core and task nodes.

IamInstanceProfile
Type: string

The IAM role that was specified when the cluster was launched. The Amazon EC2 instances of the cluster assume this role.

RequestedEc2AvailabilityZones
Type: Array of strings

Applies to clusters configured with the instance fleets option. Specifies one or more Availability Zones in which to launch Amazon EC2 cluster instances when the EC2-Classic network configuration is supported. Amazon EMR chooses the Availability Zone with the best fit from among the list of RequestedEc2AvailabilityZones, and then launches all cluster instances within that Availability Zone. If you do not specify this value, Amazon EMR chooses the Availability Zone for you. RequestedEc2SubnetIds and RequestedEc2AvailabilityZones cannot be specified together.

RequestedEc2SubnetIds
Type: Array of strings

Applies to clusters configured with the instance fleets option. Specifies the unique identifier of one or more Amazon EC2 subnets in which to launch Amazon EC2 cluster instances. Subnets must exist within the same VPC. Amazon EMR chooses the Amazon EC2 subnet with the best fit from among the list of RequestedEc2SubnetIds, and then launches all cluster instances within that subnet. If this value is not specified, and the account and Region support EC2-Classic networks, the cluster launches instances in the EC2-Classic network and uses RequestedEc2AvailabilityZones instead of this setting. If EC2-Classic is not supported, and no subnet is specified, Amazon EMR chooses the subnet for you. RequestedEc2SubnetIds and RequestedEc2AvailabilityZones cannot be specified together.

ServiceAccessSecurityGroup
Type: string

The identifier of the Amazon EC2 security group for the Amazon EMR service to access clusters in VPC private subnets.
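The rule that RequestedEc2SubnetIds and RequestedEc2AvailabilityZones cannot be specified together can be checked locally. This is a minimal sketch with a hypothetical helper, assertPlacementIsExclusive, and placeholder identifiers.

```php
// Hypothetical helper: rejects attribute sets that specify both
// subnets and Availability Zones.
function assertPlacementIsExclusive(array $attributes): void
{
    $hasSubnets = !empty($attributes['RequestedEc2SubnetIds']);
    $hasZones = !empty($attributes['RequestedEc2AvailabilityZones']);

    if ($hasSubnets && $hasZones) {
        throw new InvalidArgumentException(
            'RequestedEc2SubnetIds and RequestedEc2AvailabilityZones cannot be specified together.'
        );
    }
}

// Placeholder subnet identifiers.
$attributes = [
    'RequestedEc2SubnetIds' => ['subnet-EXAMPLE1', 'subnet-EXAMPLE2'],
];
assertPlacementIsExclusive($attributes); // passes: only subnets are set
```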

ErrorDetail

Description

A tuple that provides information about an error that caused a cluster to terminate.

Members
ErrorCode
Type: string

The name or code associated with the error.

ErrorData
Type: Array of associative arrays of custom strings keys (String) to strings

A list of key-value pairs that provides contextual information about why an error occurred.

ErrorMessage
Type: string

A message that describes the error.

ExecutionEngineConfig

Description

Specifies the execution engine (cluster) to run the notebook and perform the notebook execution, for example, an Amazon EMR cluster.

Members
ExecutionRoleArn
Type: string

The execution role ARN required for the notebook execution.

Id
Required: Yes
Type: string

The unique identifier of the execution engine. For an Amazon EMR cluster, this is the cluster ID.

MasterInstanceSecurityGroupId
Type: string

An optional unique ID of an Amazon EC2 security group to associate with the master instance of the Amazon EMR cluster for this notebook execution. For more information, see Specifying Amazon EC2 Security Groups for Amazon EMR Notebooks in the EMR Management Guide.

Type
Type: string

The type of execution engine. A value of EMR specifies an Amazon EMR cluster.

FailureDetails

Description

The details of the step failure. The service attempts to detect the root cause for many common failures.

Members
LogFile
Type: string

The path to the log file where the step failure root cause was originally recorded.

Message
Type: string

The descriptive message including the error the Amazon EMR service has identified as the cause of step failure. This is text from an error log that describes the root cause of the failure.

Reason
Type: string

The reason for the step failure. In the case where the service cannot successfully determine the root cause of the failure, it returns "Unknown Error" as a reason.

HadoopJarStepConfig

Description

A job flow step consisting of a JAR file whose main function will be executed. The main function submits a job for Hadoop to execute and waits for the job to finish or fail.

Members
Args
Type: Array of strings

A list of command line arguments passed to the JAR file's main function when executed.

Jar
Required: Yes
Type: string

A path to a JAR file run during the step.

MainClass
Type: string

The name of the main class in the specified Java file. If not specified, the JAR file should specify a Main-Class in its manifest file.

Properties
Type: Array of KeyValue structures

A list of Java properties that are set when the step runs. You can use these properties to pass key-value pairs to your main function.
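Because Properties is an array of KeyValue structures rather than an associative array, a small conversion step is often convenient. The sketch below assumes each KeyValue structure has Key and Value members (that shape is not described in this section); the helper toKeyValueList, the JAR name, the arguments, and the S3 path are illustrative placeholders.

```php
// Hypothetical helper: converts an associative array into a list of
// KeyValue structures (assumed members: Key and Value).
function toKeyValueList(array $properties): array
{
    $list = [];
    foreach ($properties as $key => $value) {
        $list[] = ['Key' => (string) $key, 'Value' => (string) $value];
    }
    return $list;
}

// Illustrative step configuration; the JAR, arguments, and path are placeholders.
$hadoopJarStep = [
    'Jar' => 'command-runner.jar',
    'Args' => ['spark-submit', '--deploy-mode', 'cluster', 's3://amzn-s3-demo-bucket/app.py'],
    'Properties' => toKeyValueList(['example.property' => 'example-value']),
];
```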

HadoopStepConfig

Description

A cluster step consisting of a JAR file whose main function will be executed. The main function submits a job for Hadoop to execute and waits for the job to finish or fail.

Members
Args
Type: Array of strings

The list of command line arguments to pass to the JAR file's main function for execution.

Jar
Type: string

The path to the JAR file that runs during the step.

MainClass
Type: string

The name of the main class in the specified Java file. If not specified, the JAR file should specify a main class in its manifest file.

Properties
Type: Associative array of custom strings keys (String) to strings

The list of Java properties that are set when the step runs. You can use these properties to pass key-value pairs to your main function.

Instance

Description

Represents an Amazon EC2 instance provisioned as part of a cluster.

Members
EbsVolumes
Type: Array of EbsVolume structures

The list of Amazon EBS volumes that are attached to this instance.

Ec2InstanceId
Type: string

The unique identifier of the instance in Amazon EC2.

Id
Type: string

The unique identifier for the instance in Amazon EMR.

InstanceFleetId
Type: string

The unique identifier of the instance fleet to which an Amazon EC2 instance belongs.

InstanceGroupId
Type: string

The identifier of the instance group to which this instance belongs.

InstanceType
Type: string

The Amazon EC2 instance type, for example m3.xlarge.

Market
Type: string

The instance purchasing option. Valid values are ON_DEMAND or SPOT.

PrivateDnsName
Type: string

The private DNS name of the instance.

PrivateIpAddress
Type: string

The private IP address of the instance.

PublicDnsName
Type: string

The public DNS name of the instance.

PublicIpAddress
Type: string

The public IP address of the instance.

Status
Type: InstanceStatus structure

The current status of the instance.

InstanceFleet

Description

Describes an instance fleet, which is a group of Amazon EC2 instances that host a particular node type (master, core, or task) in an Amazon EMR cluster. Instance fleets can consist of a mix of instance types and On-Demand and Spot Instances, which are provisioned to meet a defined target capacity.

The instance fleet configuration is available only in Amazon EMR releases 4.8.0 and later, excluding 5.0.x versions.

Members
Context
Type: string

Reserved.

Id
Type: string

The unique identifier of the instance fleet.

InstanceFleetType
Type: string

The node type that the instance fleet hosts. Valid values are MASTER, CORE, or TASK.

InstanceTypeSpecifications
Type: Array of InstanceTypeSpecification structures

An array of specifications for the instance types that comprise an instance fleet.

LaunchSpecifications
Type: InstanceFleetProvisioningSpecifications structure

Describes the launch specification for an instance fleet.

Name
Type: string

A friendly name for the instance fleet.

ProvisionedOnDemandCapacity
Type: int

The number of On-Demand units that have been provisioned for the instance fleet to fulfill TargetOnDemandCapacity. This provisioned capacity might be less than or greater than TargetOnDemandCapacity.

ProvisionedSpotCapacity
Type: int

The number of Spot units that have been provisioned for this instance fleet to fulfill TargetSpotCapacity. This provisioned capacity might be less than or greater than TargetSpotCapacity.

ResizeSpecifications
Type: InstanceFleetResizingSpecifications structure

The resize specification for the instance fleet.

Status
Type: InstanceFleetStatus structure

The current status of the instance fleet.

TargetOnDemandCapacity
Type: int

The target capacity of On-Demand units for the instance fleet, which determines how many On-Demand Instances to provision. When the instance fleet launches, Amazon EMR tries to provision On-Demand Instances as specified by InstanceTypeConfig. Each instance configuration has a specified WeightedCapacity. When an On-Demand Instance is provisioned, the WeightedCapacity units count toward the target capacity. Amazon EMR provisions instances until the target capacity is totally fulfilled, even if this results in an overage. For example, if there are 2 units remaining to fulfill capacity, and Amazon EMR can only provision an instance with a WeightedCapacity of 5 units, the instance is provisioned, and the target capacity is exceeded by 3 units. You can use InstanceFleet$ProvisionedOnDemandCapacity to determine the On-Demand capacity units that have been provisioned for the instance fleet.

If not specified or set to 0, only Spot Instances are provisioned for the instance fleet using TargetSpotCapacity. At least one of TargetSpotCapacity and TargetOnDemandCapacity should be greater than 0. For a master instance fleet, only one of TargetSpotCapacity and TargetOnDemandCapacity can be specified, and its value must be 1.

TargetSpotCapacity
Type: int

The target capacity of Spot units for the instance fleet, which determines how many Spot Instances to provision. When the instance fleet launches, Amazon EMR tries to provision Spot Instances as specified by InstanceTypeConfig. Each instance configuration has a specified WeightedCapacity. When a Spot Instance is provisioned, the WeightedCapacity units count toward the target capacity. Amazon EMR provisions instances until the target capacity is totally fulfilled, even if this results in an overage. For example, if there are 2 units remaining to fulfill capacity, and Amazon EMR can only provision an instance with a WeightedCapacity of 5 units, the instance is provisioned, and the target capacity is exceeded by 3 units. You can use InstanceFleet$ProvisionedSpotCapacity to determine the Spot capacity units that have been provisioned for the instance fleet.

If not specified or set to 0, only On-Demand Instances are provisioned for the instance fleet. At least one of TargetSpotCapacity and TargetOnDemandCapacity should be greater than 0. For a master instance fleet, only one of TargetSpotCapacity and TargetOnDemandCapacity can be specified, and its value must be 1.
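The overage arithmetic described for TargetOnDemandCapacity and TargetSpotCapacity can be worked through in a few lines. The following is a simplified sketch, not how Amazon EMR is implemented: it assumes a fleet with a single instance type, and the function name provisionUntilFulfilled is hypothetical.

```php
<?php
// Simplified model with a single instance type; the function name is
// hypothetical. Real fleets can mix several instance types.
function provisionUntilFulfilled(int $targetCapacity, int $weightedCapacity): array
{
    if ($weightedCapacity < 1) {
        throw new InvalidArgumentException('WeightedCapacity must be 1 or greater.');
    }

    $provisioned = 0;
    $instances = 0;

    // Keep adding instances until the target is fully covered, even if
    // the last instance overshoots it.
    while ($provisioned < $targetCapacity) {
        $provisioned += $weightedCapacity;
        $instances++;
    }

    return [
        'Instances'   => $instances,
        'Provisioned' => $provisioned,
        'Overage'     => $provisioned - $targetCapacity,
    ];
}

// 12 units requested, 5 units per instance: after two instances 2 units
// remain, so a third instance is launched and the target is exceeded by 3.
$result = provisionUntilFulfilled(12, 5);
```

In this model, Provisioned plays the role of ProvisionedOnDemandCapacity or ProvisionedSpotCapacity, which is why those values can be greater than the corresponding target.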

InstanceFleetConfig

Description

The configuration that defines an instance fleet.

The instance fleet configuration is available only in Amazon EMR releases 4.8.0 and later, excluding 5.0.x versions.
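The release constraint above can be checked before a fleet request is built. The following is a minimal sketch, assuming release labels of the form emr-5.12.1; the helper supportsInstanceFleets is hypothetical and not part of the SDK.

```php
<?php
// Hypothetical helper, not part of the AWS SDK for PHP.
// Returns true when the release supports instance fleets:
// 4.8.0 and later, excluding 5.0.x.
function supportsInstanceFleets(string $releaseLabel): bool
{
    // Strip the assumed "emr-" prefix, leaving a dotted version such as "5.12.1".
    $version = preg_replace('/^emr-/', '', $releaseLabel);

    if (version_compare($version, '4.8.0', '<')) {
        return false;
    }

    // Exclude every 5.0.x release.
    return !(version_compare($version, '5.0.0', '>=')
        && version_compare($version, '5.1.0', '<'));
}
```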

Members
Context
Type: string

Reserved.

InstanceFleetType
Required: Yes
Type: string

The node type that the instance fleet hosts. Valid values are MASTER, CORE, and TASK.

InstanceTypeConfigs
Type: Array of InstanceTypeConfig structures

The instance type configurations that define the Amazon EC2 instances in the instance fleet.

LaunchSpecifications
Type: InstanceFleetProvisioningSpecifications structure

The launch specification for the instance fleet.

Name
Type: string

The friendly name of the instance fleet.

ResizeSpecifications
Type: InstanceFleetResizingSpecifications structure

The resize specification for the instance fleet.

TargetOnDemandCapacity
Type: int

The target capacity of On-Demand units for the instance fleet, which determines how many On-Demand Instances to provision. When the instance fleet launches, Amazon EMR tries to provision On-Demand Instances as specified by InstanceTypeConfig. Each instance configuration has a specified WeightedCapacity. When an On-Demand Instance is provisioned, the WeightedCapacity units count toward the target capacity. Amazon EMR provisions instances until the target capacity is totally fulfilled, even if this results in an overage. For example, if there are 2 units remaining to fulfill capacity, and Amazon EMR can only provision an instance with a WeightedCapacity of 5 units, the instance is provisioned, and the target capacity is exceeded by 3 units.

If not specified or set to 0, only Spot Instances are provisioned for the instance fleet using TargetSpotCapacity. At least one of TargetSpotCapacity and TargetOnDemandCapacity should be greater than 0. For a master instance fleet, only one of TargetSpotCapacity and TargetOnDemandCapacity can be specified, and its value must be 1.

TargetSpotCapacity
Type: int

The target capacity of Spot units for the instance fleet, which determines how many Spot Instances to provision. When the instance fleet launches, Amazon EMR tries to provision Spot Instances as specified by InstanceTypeConfig. Each instance configuration has a specified WeightedCapacity. When a Spot Instance is provisioned, the WeightedCapacity units count toward the target capacity. Amazon EMR provisions instances until the target capacity is totally fulfilled, even if this results in an overage. For example, if there are 2 units remaining to fulfill capacity, and Amazon EMR can only provision an instance with a WeightedCapacity of 5 units, the instance is provisioned, and the target capacity is exceeded by 3 units.

If not specified or set to 0, only On-Demand Instances are provisioned for the instance fleet. At least one of TargetSpotCapacity and TargetOnDemandCapacity should be greater than 0. For a master instance fleet, only one of TargetSpotCapacity and TargetOnDemandCapacity can be specified, and its value must be 1.
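The capacity rules stated for TargetOnDemandCapacity and TargetSpotCapacity lend themselves to a client-side check before the request is sent. The following sketch is an illustration only: validateFleetTargets is a hypothetical helper, and it treats a value of 0 the same as an unspecified value, as the descriptions above do.

```php
<?php
// Hypothetical helper, not part of the AWS SDK for PHP.
// Returns a list of rule violations for an InstanceFleetConfig array.
function validateFleetTargets(array $fleet): array
{
    $errors = [];
    $onDemand = $fleet['TargetOnDemandCapacity'] ?? 0;
    $spot = $fleet['TargetSpotCapacity'] ?? 0;

    // At least one of the two targets should be greater than 0.
    if ($onDemand <= 0 && $spot <= 0) {
        $errors[] = 'At least one of TargetSpotCapacity and TargetOnDemandCapacity should be greater than 0.';
    }

    // A master fleet takes exactly one target, and its value must be 1.
    if (($fleet['InstanceFleetType'] ?? null) === 'MASTER') {
        if ($onDemand > 0 && $spot > 0) {
            $errors[] = 'A master instance fleet can specify only one target capacity.';
        } elseif (max($onDemand, $spot) > 1) {
            $errors[] = 'The target capacity of a master instance fleet must be 1.';
        }
    }

    return $errors;
}
```

An empty array means that none of the rules modeled here were violated; the service may apply further validation.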

InstanceFleetModifyConfig

Description

Configuration parameters for an instance fleet modification request.

The instance fleet configuration is available only in Amazon EMR releases 4.8.0 and later, excluding 5.0.x versions.

Members
Context
Type: string

Reserved.

InstanceFleetId
Required: Yes
Type: string

A unique identifier for the instance fleet.

InstanceTypeConfigs
Type: Array of InstanceTypeConfig structures

An array of InstanceTypeConfig objects that specify how Amazon EMR provisions Amazon EC2 instances when it fulfills On-Demand and Spot capacities. For more information, see InstanceTypeConfig.

ResizeSpecifications
Type: InstanceFleetResizingSpecifications structure

The resize specification for the instance fleet.

TargetOnDemandCapacity
Type: int

The target capacity of On-Demand units for the instance fleet. For more information, see InstanceFleetConfig$TargetOnDemandCapacity.

TargetSpotCapacity
Type: int

The target capacity of Spot units for the instance fleet. For more information, see InstanceFleetConfig$TargetSpotCapacity.
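An InstanceFleetModifyConfig is typically passed as the InstanceFleet parameter of the ModifyInstanceFleet operation, together with a ClusterId. The following sketch only assembles the parameter array; the helper buildModifyFleetParams and the identifiers j-EXAMPLE and if-EXAMPLE are placeholders introduced for illustration.

```php
<?php
// Hypothetical helper, not part of the AWS SDK for PHP.
// Builds the parameter array for a fleet resize, leaving out any
// target capacity that the caller did not supply.
function buildModifyFleetParams(
    string $clusterId,
    string $fleetId,
    ?int $onDemandCapacity = null,
    ?int $spotCapacity = null
): array {
    $fleet = ['InstanceFleetId' => $fleetId];

    if ($onDemandCapacity !== null) {
        $fleet['TargetOnDemandCapacity'] = $onDemandCapacity;
    }
    if ($spotCapacity !== null) {
        $fleet['TargetSpotCapacity'] = $spotCapacity;
    }

    return ['ClusterId' => $clusterId, 'InstanceFleet' => $fleet];
}

// Placeholder identifiers; only the Spot target is changed here.
$params = buildModifyFleetParams('j-EXAMPLE', 'if-EXAMPLE', null, 4);

// With a configured Aws\Emr\EmrClient instance in $client:
// $client->modifyInstanceFleet($params);
```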

InstanceFleetProvisioningSpecifications

Description

The launch specification for On-Demand and Spot Instances in the fleet.

The instance fleet configuration is available only in Amazon EMR releases 4.8.0 and later, excluding 5.0.x versions. On-Demand and Spot instance allocation strategies are available in Amazon EMR releases 5.12.1 and later.

Members
OnDemandSpecification
Type: OnDemandProvisioningSpecification structure

The launch specification for On-Demand Instances in the instance fleet, which determines the allocation strategy and capacity reservation options.

The instance fleet configuration is available only in Amazon EMR releases 4.8.0 and later, excluding 5.0.x versions. On-Demand Instances allocation strategy is available in Amazon EMR releases 5.12.1 and later.

SpotSpecification
Type: SpotProvisioningSpecification structure

The launch specification for Spot Instances in the fleet, which determines the allocation strategy, defined duration, and provisioning timeout behavior.

InstanceFleetResizingSpecifications

Description

The resize specification for On-Demand and Spot Instances in the fleet.

Members
OnDemandResizeSpecification
Type: OnDemandResizingSpecification structure

The resize specification for On-Demand Instances in the instance fleet, which contains the allocation strategy, capacity reservation options, and the resize timeout period.

SpotResizeSpecification
Type: SpotResizingSpecification structure

The resize specification for Spot Instances in the instance fleet, which contains the allocation strategy and the resize timeout period.

InstanceFleetStateChangeReason

Description

Provides status change reason details for the instance fleet.

The instance fleet configuration is available only in Amazon EMR releases 4.8.0 and later, excluding 5.0.x versions.

Members
Code
Type: string

A code corresponding to the reason the state change occurred.

Message
Type: string

An explanatory message.

InstanceFleetStatus

Description

The status of the instance fleet.

The instance fleet configuration is available only in Amazon EMR releases 4.8.0 and later, excluding 5.0.x versions.

Members
State
Type: string

A code representing the instance fleet status.

  • PROVISIONING—The instance fleet is provisioning Amazon EC2 resources and is not yet ready to run jobs.

  • BOOTSTRAPPING—Amazon EC2 instances and other resources have been provisioned and the bootstrap actions specified for the instances are underway.

  • RUNNING—Amazon EC2 instances and other resources are running. They are either executing jobs or waiting to execute jobs.

  • RESIZING—A resize operation is underway. Amazon EC2 instances are either being added or removed.

  • SUSPENDED—A resize operation could not complete. Existing Amazon EC2 instances are running, but instances can't be added or removed.

  • TERMINATING—The instance fleet is terminating Amazon EC2 instances.

  • TERMINATED—The instance fleet is no longer active, and all Amazon EC2 instances have been terminated.

StateChangeReason
Type: InstanceFleetStateChangeReason structure

Provides status change reason details for the instance fleet.

Timeline
Type: InstanceFleetTimeline structure

Provides historical timestamps for the instance fleet, including the time of creation, the time it became ready to run jobs, and the time of termination.
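Code that polls a fleet often needs to collapse the State values listed above into a coarser phase. The grouping below is one possible reading of those descriptions, not something defined by the service, and describeFleetState is a hypothetical helper.

```php
<?php
// Hypothetical helper, not part of the AWS SDK for PHP.
// Maps an InstanceFleetStatus State value to a coarse lifecycle phase.
function describeFleetState(string $state): string
{
    switch ($state) {
        case 'PROVISIONING':
        case 'BOOTSTRAPPING':
            // Resources are still being prepared.
            return 'starting';
        case 'RUNNING':
        case 'RESIZING':
        case 'SUSPENDED':
            // Existing instances are running, whether or not a resize succeeded.
            return 'active';
        case 'TERMINATING':
        case 'TERMINATED':
            return 'ending';
        default:
            throw new UnexpectedValueException("Unknown instance fleet state: $state");
    }
}
```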

InstanceFleetTimeline

Description

Provides historical timestamps for the instance fleet, including the time of creation, the time it became ready to run jobs, and the time of termination.

The instance fleet configuration is available only in Amazon EMR releases 4.8.0 and later, excluding 5.0.x versions.

Members
CreationDateTime
Type: timestamp (string|DateTime or anything parsable by strtotime)

The time and date the instance fleet was created.

EndDateTime
Type: timestamp (string|DateTime or anything parsable by strtotime)

The time and date the instance fleet terminated.

ReadyDateTime
Type: timestamp (string|DateTime or anything parsable by strtotime)

The time and date the instance fleet was ready to run jobs.
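The timeline members can be combined to measure how long a fleet took to become ready. The following sketch assumes the timeline is available as an array whose values are either DateTimeInterface objects or strings that strtotime can parse; the helper names toUnixTime and secondsUntilReady are hypothetical.

```php
<?php
// Hypothetical helpers, not part of the AWS SDK for PHP.

// Accepts a DateTimeInterface or any string that strtotime() can parse
// and returns a Unix timestamp.
function toUnixTime($value): int
{
    if ($value instanceof DateTimeInterface) {
        return $value->getTimestamp();
    }

    $timestamp = strtotime((string) $value);
    if ($timestamp === false) {
        throw new InvalidArgumentException('Unparsable timestamp: ' . $value);
    }

    return $timestamp;
}

// Seconds between creation and readiness, or null when either
// timestamp is absent (for example, while the fleet is not ready yet).
function secondsUntilReady(array $timeline): ?int
{
    if (!isset($timeline['CreationDateTime'], $timeline['ReadyDateTime'])) {
        return null;
    }

    return toUnixTime($timeline['ReadyDateTime']) - toUnixTime($timeline['CreationDateTime']);
}
```

The same approach applies to the other timeline structures on this page, which share the CreationDateTime, ReadyDateTime, and EndDateTime members.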

InstanceGroup

Description

This entity represents an instance group, which is a group of instances that have a common purpose. For example, the CORE instance group is used for HDFS.

Members
AutoScalingPolicy
Type: AutoScalingPolicyDescription structure

An automatic scaling policy for a core instance group or task instance group in an Amazon EMR cluster. The automatic scaling policy defines how an instance group dynamically adds and terminates Amazon EC2 instances in response to the value of a CloudWatch metric. See PutAutoScalingPolicy.

BidPrice
Type: string

If specified, indicates that the instance group uses Spot Instances. This is the maximum price you are willing to pay for Spot Instances. Specify OnDemandPrice to set the amount equal to the On-Demand price, or specify an amount in USD.

Configurations
Type: Array of Configuration structures

Applies to Amazon EMR releases 4.x and later.

The list of configurations supplied for an Amazon EMR cluster instance group. You can specify a separate configuration for each instance group (master, core, and task).

ConfigurationsVersion
Type: long (int|float)

The version number of the requested configuration specification for this instance group.

CustomAmiId
Type: string

The custom AMI ID to use for the provisioned instance group.

EbsBlockDevices
Type: Array of EbsBlockDevice structures

The EBS block devices that are mapped to this instance group.

EbsOptimized
Type: boolean

Indicates whether the instance group is EBS-optimized. An Amazon EBS-optimized instance uses an optimized configuration stack and provides additional, dedicated capacity for Amazon EBS I/O.

Id
Type: string

The identifier of the instance group.

InstanceGroupType
Type: string

The type of the instance group. Valid values are MASTER, CORE, or TASK.

InstanceType
Type: string

The Amazon EC2 instance type for all instances in the instance group.

LastSuccessfullyAppliedConfigurations
Type: Array of Configuration structures

The list of configurations that were most recently applied successfully to the instance group.

LastSuccessfullyAppliedConfigurationsVersion
Type: long (int|float)

The version number of the configuration specification that was most recently applied successfully to the instance group.

Market
Type: string

The marketplace in which instances for this group are provisioned. Valid values are ON_DEMAND or SPOT.

Name
Type: string

The name of the instance group.

RequestedInstanceCount
Type: int

The target number of instances for the instance group.

RunningInstanceCount
Type: int

The number of instances currently running in this instance group.

ShrinkPolicy
Type: ShrinkPolicy structure

Policy for customizing shrink operations.

Status
Type: InstanceGroupStatus structure

The current status of the instance group.

InstanceGroupConfig

Description

Configuration defining a new instance group.

Members
AutoScalingPolicy
Type: AutoScalingPolicy structure

An automatic scaling policy for a core instance group or task instance group in an Amazon EMR cluster. The automatic scaling policy defines how an instance group dynamically adds and terminates Amazon EC2 instances in response to the value of a CloudWatch metric. See PutAutoScalingPolicy.

BidPrice
Type: string

If specified, indicates that the instance group uses Spot Instances. This is the maximum price you are willing to pay for Spot Instances. Specify OnDemandPrice to set the amount equal to the On-Demand price, or specify an amount in USD.

Configurations
Type: Array of Configuration structures

Applies to Amazon EMR releases 4.x and later.

The list of configurations supplied for an Amazon EMR cluster instance group. You can specify a separate configuration for each instance group (master, core, and task).

CustomAmiId
Type: string

The custom AMI ID to use for the provisioned instance group.

EbsConfiguration
Type: EbsConfiguration structure

EBS configurations that will be attached to each Amazon EC2 instance in the instance group.

InstanceCount
Required: Yes
Type: int

Target number of instances for the instance group.

InstanceRole
Required: Yes
Type: string

The role of the instance group in the cluster.

InstanceType
Required: Yes
Type: string

The Amazon EC2 instance type for all instances in the instance group.

Market
Type: string

Market type of the Amazon EC2 instances used to create a cluster node.

Name
Type: string

Friendly name given to the instance group.
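An InstanceGroupConfig is passed as a plain associative array, so the three required members (InstanceCount, InstanceRole, and InstanceType) can be checked locally before a request is made. The following is a minimal sketch; missingRequiredMembers is a hypothetical helper, and the group name and instance type are example values.

```php
<?php
// Hypothetical helper, not part of the AWS SDK for PHP.
// Returns the names of required InstanceGroupConfig members that are absent.
function missingRequiredMembers(array $group): array
{
    $required = ['InstanceCount', 'InstanceRole', 'InstanceType'];

    return array_values(array_diff($required, array_keys($group)));
}

// Example core group that uses Spot Instances capped at the On-Demand price.
$coreGroup = [
    'Name'          => 'core-nodes',
    'InstanceRole'  => 'CORE',
    'InstanceType'  => 'm3.xlarge',
    'InstanceCount' => 2,
    'Market'        => 'SPOT',
    'BidPrice'      => 'OnDemandPrice',
];

$missing = missingRequiredMembers($coreGroup);
```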

InstanceGroupDetail

Description

Detailed information about an instance group.

Members
BidPrice
Type: string

If specified, indicates that the instance group uses Spot Instances. This is the maximum price you are willing to pay for Spot Instances. Specify OnDemandPrice to set the amount equal to the On-Demand price, or specify an amount in USD.

CreationDateTime
Required: Yes
Type: timestamp (string|DateTime or anything parsable by strtotime)

The date/time the instance group was created.

CustomAmiId
Type: string

The custom AMI ID to use for the provisioned instance group.

EndDateTime
Type: timestamp (string|DateTime or anything parsable by strtotime)

The date/time the instance group was terminated.

InstanceGroupId
Type: string

Unique identifier for the instance group.

InstanceRequestCount
Required: Yes
Type: int

Target number of instances to run in the instance group.

InstanceRole
Required: Yes
Type: string

The role of the instance group in the cluster.

InstanceRunningCount
Required: Yes
Type: int

Actual count of running instances.

InstanceType
Required: Yes
Type: string

Amazon EC2 instance type.

LastStateChangeReason
Type: string

Details regarding the state of the instance group.

Market
Required: Yes
Type: string

Market type of the Amazon EC2 instances used to create a cluster node.

Name
Type: string

Friendly name for the instance group.

ReadyDateTime
Type: timestamp (string|DateTime or anything parsable by strtotime)

The date/time the instance group was available to the cluster.

StartDateTime
Type: timestamp (string|DateTime or anything parsable by strtotime)

The date/time the instance group was started.

State
Required: Yes
Type: string

State of instance group. The following values are no longer supported: STARTING, TERMINATED, and FAILED.

InstanceGroupModifyConfig

Description

Modifies the size or configurations of an instance group.

Members
Configurations
Type: Array of Configuration structures

A list of new or modified configurations to apply for an instance group.

EC2InstanceIdsToTerminate
Type: Array of strings

The Amazon EC2 InstanceIds to terminate. After you terminate the instances, the instance group will not return to its original requested size.

InstanceCount
Type: int

Target size for the instance group.

InstanceGroupId
Required: Yes
Type: string

Unique ID of the instance group to modify.

ReconfigurationType
Type: string

Type of reconfiguration requested. Valid values are MERGE and OVERWRITE.

ShrinkPolicy
Type: ShrinkPolicy structure

Policy for customizing shrink operations.
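One or more InstanceGroupModifyConfig entries are typically sent in the InstanceGroups parameter of the ModifyInstanceGroups operation. The following sketch assembles a reconfiguration request and rejects ReconfigurationType values other than MERGE and OVERWRITE. The helper buildGroupReconfiguration, the placeholder identifiers, and the yarn-site property are illustrative assumptions.

```php
<?php
// Hypothetical helper, not part of the AWS SDK for PHP.
// Builds one InstanceGroupModifyConfig entry for a reconfiguration.
function buildGroupReconfiguration(
    string $groupId,
    array $configurations,
    string $reconfigurationType
): array {
    if (!in_array($reconfigurationType, ['MERGE', 'OVERWRITE'], true)) {
        throw new InvalidArgumentException('ReconfigurationType must be MERGE or OVERWRITE.');
    }

    return [
        'InstanceGroupId'     => $groupId,
        'Configurations'      => $configurations,
        'ReconfigurationType' => $reconfigurationType,
    ];
}

// Placeholder identifiers and an example configuration classification.
$params = [
    'ClusterId'      => 'j-EXAMPLE',
    'InstanceGroups' => [
        buildGroupReconfiguration('ig-EXAMPLE', [
            [
                'Classification' => 'yarn-site',
                'Properties'     => ['yarn.nodemanager.vmem-check-enabled' => 'false'],
            ],
        ], 'MERGE'),
    ],
];

// With a configured Aws\Emr\EmrClient instance in $client:
// $client->modifyInstanceGroups($params);
```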

InstanceGroupStateChangeReason

Description

The status change reason details for the instance group.

Members
Code
Type: string

The programmable code for the state change reason.

Message
Type: string

The status change reason description.

InstanceGroupStatus

Description

The details of the instance group status.

Members
State
Type: string

The current state of the instance group.

StateChangeReason
Type: InstanceGroupStateChangeReason structure

The status change reason details for the instance group.

Timeline
Type: InstanceGroupTimeline structure

The timeline of the instance group status.

InstanceGroupTimeline

Description

The timeline of the instance group lifecycle.

Members
CreationDateTime
Type: timestamp (string|DateTime or anything parsable by strtotime)

The creation date and time of the instance group.

EndDateTime
Type: timestamp (string|DateTime or anything parsable by strtotime)

The date and time when the instance group terminated.

ReadyDateTime
Type: timestamp (string|DateTime or anything parsable by strtotime)

The date and time when the instance group became ready to perform tasks.

InstanceResizePolicy

Description

Custom policy for requesting termination protection or termination of specific instances when shrinking an instance group.

Members
InstanceTerminationTimeout
Type: int

Decommissioning timeout override for the specific list of instances to be terminated.

InstancesToProtect
Type: Array of strings

Specific list of instances to be protected when shrinking an instance group.

InstancesToTerminate
Type: Array of strings

Specific list of instances to be terminated when shrinking an instance group.

InstanceStateChangeReason

Description

The details of the status change reason for the instance.

Members
Code
Type: string

The programmable code for the state change reason.

Message
Type: string

The status change reason description.

InstanceStatus

Description

The instance status details.

Members
State
Type: string

The current state of the instance.

StateChangeReason
Type: InstanceStateChangeReason structure

The details of the status change reason for the instance.

Timeline
Type: InstanceTimeline structure

The timeline of the instance status.

InstanceTimeline

Description

The timeline of the instance lifecycle.

Members
CreationDateTime
Type: timestamp (string|DateTime or anything parsable by strtotime)

The creation date and time of the instance.

EndDateTime
Type: timestamp (string|DateTime or anything parsable by strtotime)

The date and time when the instance was terminated.

ReadyDateTime
Type: timestamp (string|DateTime or anything parsable by strtotime)

The date and time when the instance was ready to perform tasks.

InstanceTypeConfig

Description

An instance type configuration for each instance type in an instance fleet, which determines the Amazon EC2 instances Amazon EMR attempts to provision to fulfill On-Demand and Spot target capacities. When you use an allocation strategy, you can include a maximum of 30 instance type configurations for a fleet. For more information about how to use an allocation strategy, see Configure Instance Fleets. Without an allocation strategy, you may specify a maximum of five instance type configurations for a fleet.

The instance fleet configuration is available only in Amazon EMR releases 4.8.0 and later, excluding 5.0.x versions.

Members
BidPrice
Type: string

The bid price for each Amazon EC2 Spot Instance type as defined by InstanceType. Expressed in USD. If neither BidPrice nor BidPriceAsPercentageOfOnDemandPrice is provided, BidPriceAsPercentageOfOnDemandPrice defaults to 100%.

BidPriceAsPercentageOfOnDemandPrice
Type: double

The bid price, as a percentage of On-Demand price, for each Amazon EC2 Spot Instance as defined by InstanceType. Expressed as a number (for example, 20 specifies 20%). If neither BidPrice nor BidPriceAsPercentageOfOnDemandPrice is provided, BidPriceAsPercentageOfOnDemandPrice defaults to 100%.

Configurations
Type: Array of Configuration structures

A configuration classification that applies when provisioning cluster instances, which can include configurations for applications and software that run on the cluster.

CustomAmiId
Type: string

The custom AMI ID to use for the instance type.

EbsConfiguration
Type: EbsConfiguration structure

The configuration of Amazon Elastic Block Store (Amazon EBS) attached to each instance as defined by InstanceType.

InstanceType
Required: Yes
Type: string

An Amazon EC2 instance type, such as m3.xlarge.

Priority
Type: double

The priority at which Amazon EMR launches the Amazon EC2 instances with this instance type. Priority starts at 0, which is the highest priority. Amazon EMR considers the highest priority first.

WeightedCapacity
Type: int

The number of units that a provisioned instance of this type provides toward fulfilling the target capacities defined in InstanceFleetConfig. This value is 1 for a master instance fleet, and must be 1 or greater for core and task instance fleets. Defaults to 1 if not specified.
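Several rules from this structure can be applied locally: the limit of 30 configurations with an allocation strategy or five without, the required InstanceType member, and the WeightedCapacity default of 1. The following sketch shows one way to do so; normalizeInstanceTypeConfigs is a hypothetical helper, and filling in WeightedCapacity is only for local calculations, because the service applies the same default itself.

```php
<?php
// Hypothetical helper, not part of the AWS SDK for PHP.
// Checks a list of InstanceTypeConfig arrays and fills in the
// documented WeightedCapacity default.
function normalizeInstanceTypeConfigs(array $configs, bool $usesAllocationStrategy): array
{
    // 30 configurations with an allocation strategy, five without.
    $limit = $usesAllocationStrategy ? 30 : 5;
    if (count($configs) > $limit) {
        throw new LengthException("At most $limit instance type configurations are allowed.");
    }

    $normalized = [];
    foreach ($configs as $config) {
        if (empty($config['InstanceType'])) {
            throw new InvalidArgumentException('InstanceType is required.');
        }
        // WeightedCapacity defaults to 1 if not specified.
        $config['WeightedCapacity'] = $config['WeightedCapacity'] ?? 1;
        $normalized[] = $config;
    }

    return $normalized;
}
```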

InstanceTypeSpecification

Description

The configuration specification for each instance type in an instance fleet.

The instance fleet configuration is available only in Amazon EMR releases 4.8.0 and later, excluding 5.0.x versions.

Members
BidPrice
Type: string

The bid price for each Amazon EC2 Spot Instance type as defined by InstanceType. Expressed in USD.

BidPriceAsPercentageOfOnDemandPrice
Type: double

The bid price, as a percentage of On-Demand price, for each Amazon EC2 Spot Instance as defined by InstanceType. Expressed as a number (for example, 20 specifies 20%).

Configurations
Type: Array of Configuration structures

A configuration classification that applies when provisioning cluster instances, which can include configurations for applications and software bundled with Amazon EMR.

CustomAmiId
Type: string

The custom AMI ID to use for the instance type.

EbsBlockDevices
Type: Array of EbsBlockDevice structures

The configuration of Amazon Elastic Block Store (Amazon EBS) attached to each instance as defined by InstanceType.

EbsOptimized
Type: boolean

Evaluates to TRUE when the specified InstanceType is EBS-optimized.

InstanceType
Type: string

The Amazon EC2 instance type, for example m3.xlarge.

Priority
Type: double

The priority at which Amazon EMR launches the Amazon EC2 instances with this instance type. Priority starts at 0, which is the highest priority. Amazon EMR considers the highest priority first.

WeightedCapacity
Type: int

The number of units that a provisioned instance of this type provides toward fulfilling the target capacities defined in InstanceFleetConfig. Capacity values represent performance characteristics such as vCPUs, memory, or I/O. If not specified, the default value is 1.

InternalServerError

Description

Indicates that an error occurred while processing the request and that the request was not completed.

Members

InternalServerException

Description

This exception occurs when there is an internal failure in the Amazon EMR service.

Members
Message
Type: string

The message associated with the exception.

InvalidRequestException

Description

This exception occurs when there is something wrong with user input.

Members
ErrorCode
Type: string

The error code associated with the exception.

Message
Type: string

The message associated with the exception.

JobFlowDetail

Description

A description of a cluster (job flow).

Members
AmiVersion
Type: string

Applies only to Amazon EMR AMI versions 3.x and 2.x. For Amazon EMR releases 4.0 and later, ReleaseLabel is used. To specify a custom AMI, use CustomAmiId.

AutoScalingRole
Type: string

An IAM role for automatic scaling policies. The default role is EMR_AutoScaling_DefaultRole. The IAM role provides a way for the automatic scaling feature to get the permissions it needs to launch and terminate Amazon EC2 instances in an instance group.

BootstrapActions
Type: Array of BootstrapActionDetail structures

A list of the bootstrap actions run by the job flow.

ExecutionStatusDetail
Required: Yes
Type: JobFlowExecutionStatusDetail structure

Describes the execution status of the job flow.

Instances
Required: Yes
Type: JobFlowInstancesDetail structure

Describes the Amazon EC2 instances of the job flow.

JobFlowId
Required: Yes
Type: string

The job flow identifier.

JobFlowRole
Type: string

The IAM role that was specified when the job flow was launched. The Amazon EC2 instances of the job flow assume this role.

LogEncryptionKmsKeyId
Type: string

The KMS key used for encrypting log files. This attribute is only available with Amazon EMR 5.30.0 and later, excluding 6.0.0.

LogUri
Type: string

The location in Amazon S3 where log files for the job are stored.

Name
Required: Yes
Type: string

The name of the job flow.

ScaleDownBehavior
Type: string

The way that individual Amazon EC2 instances terminate when an automatic scale-in activity occurs or an instance group is resized. TERMINATE_AT_INSTANCE_HOUR indicates that Amazon EMR terminates nodes at the instance-hour boundary, regardless of when the request to terminate the instance was submitted. This option is only available with Amazon EMR 5.1.0 and later and is the default for clusters created using that version. TERMINATE_AT_TASK_COMPLETION indicates that Amazon EMR adds nodes to a deny list and drains tasks from nodes before terminating the Amazon EC2 instances, regardless of the instance-hour boundary. With either behavior, Amazon EMR removes the least active nodes first and blocks instance termination if it could lead to HDFS corruption. TERMINATE_AT_TASK_COMPLETION is available only in Amazon EMR releases 4.1.0 and later, and is the default for releases of Amazon EMR earlier than 5.1.0.

ServiceRole
Type: string

The IAM role that is assumed by the Amazon EMR service to access Amazon Web Services resources on your behalf.

Steps
Type: Array of StepDetail structures

A list of steps run by the job flow.

SupportedProducts
Type: Array of strings

A list of strings set by third-party software when the job flow is launched. If you are not using third-party software to manage the job flow, this value is empty.

VisibleToAllUsers
Type: boolean

Indicates whether the cluster is visible to IAM principals in the Amazon Web Services account associated with the cluster. When true, IAM principals in the Amazon Web Services account can perform Amazon EMR cluster actions that their IAM policies allow. When false, only the IAM principal that created the cluster and the Amazon Web Services account root user can perform Amazon EMR actions, regardless of IAM permissions policies attached to other IAM principals.

The default value is true if a value is not provided when creating a cluster using the Amazon EMR API RunJobFlow command, the CLI create-cluster command, or the Amazon Web Services Management Console.
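The version-dependent defaults described for ScaleDownBehavior can be expressed as a small lookup. This is a sketch of the documented rule only: defaultScaleDownBehavior is a hypothetical helper that takes a dotted release version such as 5.30.0, and it returns null for releases earlier than 4.1.0, for which the description gives no default.

```php
<?php
// Hypothetical helper, not part of the AWS SDK for PHP.
// Returns the documented default ScaleDownBehavior for a release version.
function defaultScaleDownBehavior(string $releaseVersion): ?string
{
    // 5.1.0 and later default to terminating at the instance-hour boundary.
    if (version_compare($releaseVersion, '5.1.0', '>=')) {
        return 'TERMINATE_AT_INSTANCE_HOUR';
    }

    // 4.1.0 up to, but not including, 5.1.0 default to draining tasks first.
    if (version_compare($releaseVersion, '4.1.0', '>=')) {
        return 'TERMINATE_AT_TASK_COMPLETION';
    }

    // No default is documented for earlier releases.
    return null;
}
```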

JobFlowExecutionStatusDetail

Description

Describes the status of the cluster (job flow).

Members
CreationDateTime
Required: Yes
Type: timestamp (string|DateTime or anything parsable by strtotime)

The creation date and time of the job flow.

EndDateTime
Type: timestamp (string|DateTime or anything parsable by strtotime)

The completion date and time of the job flow.

LastStateChangeReason
Type: string

A description of the job flow's last state change.

ReadyDateTime
Type: timestamp (string|DateTime or anything parsable by strtotime)

The date and time when the job flow was ready to start running bootstrap actions.

StartDateTime
Type: timestamp (string|DateTime or anything parsable by strtotime)

The start date and time of the job flow.

State
Required: Yes
Type: string

The state of the job flow.

JobFlowInstancesConfig

Description

A description of the Amazon EC2 instance on which the cluster (job flow) runs. A valid JobFlowInstancesConfig must contain either InstanceGroups or InstanceFleets. They cannot be used together. You may also have MasterInstanceType, SlaveInstanceType, and InstanceCount (all three must be present), but we don't recommend this configuration.

Members
AdditionalMasterSecurityGroups
Type: Array of strings

A list of additional Amazon EC2 security group IDs for the master node.

AdditionalSlaveSecurityGroups
Type: Array of strings

A list of additional Amazon EC2 security group IDs for the core and task nodes.

Ec2KeyName
Type: string

The name of the Amazon EC2 key pair that can be used to connect to the master node using SSH as the user called "hadoop".

Ec2SubnetId
Type: string

Applies to clusters that use the uniform instance group configuration. To launch the cluster in Amazon Virtual Private Cloud (Amazon VPC), set this parameter to the identifier of the Amazon VPC subnet where you want the cluster to launch. If you do not specify this value and your account supports EC2-Classic, the cluster launches in EC2-Classic.

Ec2SubnetIds
Type: Array of strings

Applies to clusters that use the instance fleet configuration. When multiple Amazon EC2 subnet IDs are specified, Amazon EMR evaluates them and launches instances in the optimal subnet.

The instance fleet configuration is available only in Amazon EMR releases 4.8.0 and later, excluding 5.0.x versions.

EmrManagedMasterSecurityGroup
Type: string

The identifier of the Amazon EC2 security group for the master node. If you specify EmrManagedMasterSecurityGroup, you must also specify EmrManagedSlaveSecurityGroup.

EmrManagedSlaveSecurityGroup
Type: string

The identifier of the Amazon EC2 security group for the core and task nodes. If you specify EmrManagedSlaveSecurityGroup, you must also specify EmrManagedMasterSecurityGroup.

HadoopVersion
Type: string

Applies only to Amazon EMR release versions earlier than 4.0. The Hadoop version for the cluster. Valid inputs are "0.18" (no longer maintained), "0.20" (no longer maintained), "0.20.205" (no longer maintained), "1.0.3", "2.2.0", or "2.4.0". If you do not set this value, the default of 0.18 is used, unless the AmiVersion parameter is set in the RunJobFlow call, in which case the default version of Hadoop for that AMI version is used.

InstanceCount
Type: int

The number of Amazon EC2 instances in the cluster.

InstanceFleets
Type: Array of InstanceFleetConfig structures

The instance fleet configuration is available only in Amazon EMR releases 4.8.0 and later, excluding 5.0.x versions.

Describes the Amazon EC2 instances and instance configurations for clusters that use the instance fleet configuration.

InstanceGroups
Type: Array of InstanceGroupConfig structures

Configuration for the instance groups in a cluster.

KeepJobFlowAliveWhenNoSteps
Type: boolean

Specifies whether the cluster should remain available after completing all steps. Defaults to false. For more information about configuring cluster termination, see Control Cluster Termination in the EMR Management Guide.

MasterInstanceType
Type: string

The Amazon EC2 instance type of the master node.

Placement
Type: PlacementType structure

The Availability Zone in which the cluster runs.

ServiceAccessSecurityGroup
Type: string

The identifier of the Amazon EC2 security group for the Amazon EMR service to access clusters in VPC private subnets.

SlaveInstanceType
Type: string

The Amazon EC2 instance type of the core and task nodes.

TerminationProtected
Type: boolean

Specifies whether to lock the cluster to prevent the Amazon EC2 instances from being terminated by API call, user intervention, or in the event of a job-flow error.

UnhealthyNodeReplacement
Type: boolean

Indicates whether Amazon EMR should gracefully replace core nodes that have degraded within the cluster.
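The combination rules for JobFlowInstancesConfig can be checked before the structure is passed to RunJobFlow. The following sketch covers the InstanceGroups and InstanceFleets exclusivity rule and the paired security group members. It reads the MasterInstanceType, SlaveInstanceType, and InstanceCount form as an alternative to groups and fleets, which is an interpretation of the description rather than a documented rule, and validateInstancesConfig is a hypothetical helper.

```php
<?php
// Hypothetical helper, not part of the AWS SDK for PHP.
// Returns a list of rule violations for a JobFlowInstancesConfig array.
function validateInstancesConfig(array $instances): array
{
    $errors = [];
    $hasGroups = !empty($instances['InstanceGroups']);
    $hasFleets = !empty($instances['InstanceFleets']);

    // The three-member form counts only when all three members are present.
    $legacyKeys = ['MasterInstanceType', 'SlaveInstanceType', 'InstanceCount'];
    $legacyPresent = array_intersect($legacyKeys, array_keys($instances));

    if ($hasGroups && $hasFleets) {
        $errors[] = 'InstanceGroups and InstanceFleets cannot be used together.';
    } elseif (!$hasGroups && !$hasFleets && count($legacyPresent) !== 3) {
        $errors[] = 'Specify InstanceGroups or InstanceFleets.';
    }

    // The two EMR-managed security groups must be specified together.
    $hasMasterGroup = isset($instances['EmrManagedMasterSecurityGroup']);
    $hasSlaveGroup = isset($instances['EmrManagedSlaveSecurityGroup']);
    if ($hasMasterGroup !== $hasSlaveGroup) {
        $errors[] = 'EmrManagedMasterSecurityGroup and EmrManagedSlaveSecurityGroup must be specified together.';
    }

    return $errors;
}
```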

JobFlowInstancesDetail

Description

Specifies the type of Amazon EC2 instances that the cluster (job flow) runs on.

Members
Ec2KeyName
Type: string

The name of an Amazon EC2 key pair that can be used to connect to the master node using SSH.

Ec2SubnetId
Type: string

For clusters launched within Amazon Virtual Private Cloud, this is the identifier of the subnet where the cluster was launched.

HadoopVersion
Type: string

The Hadoop version for the cluster.

InstanceCount
Required: Yes
Type: int

The number of Amazon EC2 instances in the cluster. If the value is 1, the same instance serves as the master, core, and task node. If the value is greater than 1, one instance is the master node and all others are core and task nodes.

InstanceGroups
Type: Array of InstanceGroupDetail structures

Details about the instance groups in a cluster.

KeepJobFlowAliveWhenNoSteps
Type: boolean

Specifies whether the cluster should remain available after completing all steps.

MasterInstanceId
Type: string

The Amazon EC2 instance identifier of the master node.

MasterInstanceType
Required: Yes
Type: string

The Amazon EC2 master node instance type.

MasterPublicDnsName
Type: string

The DNS name of the master node. If the cluster is on a private subnet, this is the private DNS name. On a public subnet, this is the public DNS name.

NormalizedInstanceHours
Type: int

An approximation of the cost of the cluster, represented in m1.small/hours. This value is increased one time for every hour that an m1.small instance runs. Larger instances are weighted more heavily, so an Amazon EC2 instance that is roughly four times more expensive would result in the normalized instance hours being increased incrementally four times. This result is only an approximation and does not reflect the actual billing rate.
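
The weighting arithmetic can be sketched roughly as follows. The weights and the example.4x instance type are hypothetical values chosen for illustration (the description only states that larger instances are weighted more heavily), and normalizedInstanceHours() is a local helper, not an SDK function.

```php
<?php
// Hypothetical weights relative to m1.small; illustrative only, not values published by Amazon EMR.
$weights = ['m1.small' => 1, 'example.4x' => 4];

// Sum weight * hours over all instances to approximate normalized instance hours.
function normalizedInstanceHours(array $usage, array $weights): int
{
    $total = 0;
    foreach ($usage as $entry) {
        $total += $weights[$entry['type']] * $entry['hours'];
    }
    return $total;
}

$usage = [
    ['type' => 'm1.small', 'hours' => 3],
    ['type' => 'example.4x', 'hours' => 2],
];
echo normalizedInstanceHours($usage, $weights), PHP_EOL; // 3*1 + 2*4 = 11
```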

Placement
Type: PlacementType structure

The Amazon EC2 Availability Zone for the cluster.

SlaveInstanceType
Required: Yes
Type: string

The Amazon EC2 core and task node instance type.

TerminationProtected
Type: boolean

Specifies whether the Amazon EC2 instances in the cluster are protected from termination by API calls, user intervention, or in the event of a job-flow error.

UnhealthyNodeReplacement
Type: boolean

Indicates whether Amazon EMR should gracefully replace core nodes that have degraded within the cluster.

KerberosAttributes

Description

Attributes for Kerberos configuration when Kerberos authentication is enabled using a security configuration. For more information, see Use Kerberos Authentication in the Amazon EMR Management Guide.

Members
ADDomainJoinPassword
Type: string

The Active Directory password for ADDomainJoinUser.

ADDomainJoinUser
Type: string

Required only when establishing a cross-realm trust with an Active Directory domain. A user with sufficient privileges to join resources to the domain.

CrossRealmTrustPrincipalPassword
Type: string

Required only when establishing a cross-realm trust with a KDC in a different realm. The cross-realm principal password, which must be identical across realms.

KdcAdminPassword
Required: Yes
Type: string

The password used within the cluster for the kadmin service on the cluster-dedicated KDC, which maintains Kerberos principals, password policies, and keytabs for the cluster.

Realm
Required: Yes
Type: string

The name of the Kerberos realm to which all nodes in a cluster belong. For example, EC2.INTERNAL.
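
As a minimal sketch, a KerberosAttributes array could be assembled with a local check for the two required members. The kerberosAttributes() helper and the KDC_ADMIN_PASSWORD environment variable are assumptions made for this example; they are not part of the SDK.

```php
<?php
// Return the attributes unchanged after confirming that the required members are present.
function kerberosAttributes(array $attrs): array
{
    foreach (['Realm', 'KdcAdminPassword'] as $required) {
        if (!isset($attrs[$required]) || $attrs[$required] === '') {
            throw new InvalidArgumentException("$required is required");
        }
    }
    return $attrs;
}

$kerberos = kerberosAttributes([
    'Realm' => 'EC2.INTERNAL',
    // Hypothetical environment variable; the fallback is a placeholder, not a real secret.
    'KdcAdminPassword' => getenv('KDC_ADMIN_PASSWORD') ?: 'placeholder-only',
]);
```

The cross-realm members (ADDomainJoinUser, ADDomainJoinPassword, CrossRealmTrustPrincipalPassword) would be added to the same array only when a cross-realm trust is being established.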

KeyValue

Description

A key-value pair.

Members
Key
Type: string

The unique identifier of a key-value pair.

Value
Type: string

The value part of the identified key.

ManagedScalingPolicy

Description

Managed scaling policy for an Amazon EMR cluster. The policy specifies the limits for resources that can be added or terminated from a cluster. The policy only applies to the core and task nodes. The master node cannot be scaled after initial configuration.

Members
ComputeLimits
Type: ComputeLimits structure

The Amazon EC2 unit limits for a managed scaling policy. The managed scaling activity of a cluster is not allowed to go above or below these limits. The limit only applies to the core and task nodes. The master node cannot be scaled after initial configuration.

ScalingStrategy
Type: string

Determines whether a custom scaling utilization performance index can be set. Possible values include ADVANCED or DEFAULT.

UtilizationPerformanceIndex
Type: int

An integer value that represents an advanced scaling strategy. Setting a higher value optimizes for performance. Setting a lower value optimizes for resource conservation. Setting the value to 50 balances performance and resource conservation. Possible values are 1, 25, 50, 75, and 100.
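
A small local check of the documented values might look like the sketch below. scalingStrategyMembers() is a hypothetical helper, and the returned array covers only the two members described here; ComputeLimits would be supplied separately.

```php
<?php
// Build the ScalingStrategy and UtilizationPerformanceIndex members after validating them locally.
function scalingStrategyMembers(string $strategy, ?int $index = null): array
{
    if (!in_array($strategy, ['ADVANCED', 'DEFAULT'], true)) {
        throw new InvalidArgumentException("Unknown ScalingStrategy: $strategy");
    }
    $members = ['ScalingStrategy' => $strategy];
    if ($index !== null) {
        if (!in_array($index, [1, 25, 50, 75, 100], true)) {
            throw new InvalidArgumentException("Unsupported UtilizationPerformanceIndex: $index");
        }
        $members['UtilizationPerformanceIndex'] = $index;
    }
    return $members;
}

// 50 balances performance and resource conservation.
$policyPart = scalingStrategyMembers('ADVANCED', 50);
```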

MetricDimension

Description

A CloudWatch dimension, which is specified using a Key (known as a Name in CloudWatch), Value pair. By default, Amazon EMR uses one dimension whose Key is JobFlowID and Value is a variable representing the cluster ID, which is ${emr.clusterId}. This enables the rule to bootstrap when the cluster ID becomes available.

Members
Key
Type: string

The dimension name.

Value
Type: string

The dimension value.
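
In PHP, the default dimension described above is easiest to express with single-quoted strings, as in this short sketch. The variable name is illustrative only.

```php
<?php
// Single quotes stop PHP from treating ${emr.clusterId} as variable interpolation,
// so the literal placeholder text is what gets sent to the service.
$dimension = [
    'Key'   => 'JobFlowID',
    'Value' => '${emr.clusterId}',
];
echo $dimension['Value'], PHP_EOL;
```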

NotebookExecution

Description

A notebook execution. An execution is a specific instance in which an Amazon EMR Notebook is run using the StartNotebookExecution action.

Members
Arn
Type: string

The Amazon Resource Name (ARN) of the notebook execution.

EditorId
Type: string

The unique identifier of the Amazon EMR Notebook that is used for the notebook execution.

EndTime
Type: timestamp (string|DateTime or anything parsable by strtotime)

The timestamp when notebook execution ended.

EnvironmentVariables
Type: Associative array of custom strings keys (XmlStringMaxLen256) to strings

The environment variables associated with the notebook execution.

ExecutionEngine
Type: ExecutionEngineConfig structure

The execution engine, such as an Amazon EMR cluster, used to run the Amazon EMR notebook and perform the notebook execution.

LastStateChangeReason
Type: string

The reason for the latest status change of the notebook execution.

NotebookExecutionId
Type: string

The unique identifier of a notebook execution.

NotebookExecutionName
Type: string

A name for the notebook execution.

NotebookInstanceSecurityGroupId
Type: string

The unique identifier of the Amazon EC2 security group associated with the Amazon EMR Notebook instance. For more information, see Specifying Amazon EC2 Security Groups for Amazon EMR Notebooks in the Amazon EMR Management Guide.

NotebookParams
Type: string

Input parameters in JSON format passed to the Amazon EMR Notebook at runtime for execution.

NotebookS3Location
Type: NotebookS3LocationForOutput structure

The Amazon S3 location that stores the notebook execution input.

OutputNotebookFormat
Type: string

The output format for the notebook execution.

OutputNotebookS3Location
Type: OutputNotebookS3LocationForOutput structure

The Amazon S3 location for the notebook execution output.

OutputNotebookURI
Type: string

The location of the notebook execution's output file in Amazon S3.

StartTime
Type: timestamp (string|DateTime or anything parsable by strtotime)

The timestamp when notebook execution started.

Status
Type: string

The status of the notebook execution.

  • START_PENDING indicates that the cluster has received the execution request but execution has not begun.

  • STARTING indicates that the execution is starting on the cluster.

  • RUNNING indicates that the execution is being processed by the cluster.

  • FINISHING indicates that execution processing is in the final stages.

  • FINISHED indicates that the execution has completed without error.

  • FAILING indicates that the execution is failing and will not finish successfully.

  • FAILED indicates that the execution failed.

  • STOP_PENDING indicates that the cluster has received a StopNotebookExecution request and the stop is pending.

  • STOPPING indicates that the cluster is in the process of stopping the execution as a result of a StopNotebookExecution request.

  • STOPPED indicates that the execution stopped because of a StopNotebookExecution request.

Tags
Type: Array of Tag structures

A list of tags associated with a notebook execution. Tags are user-defined key-value pairs that consist of a required key string with a maximum of 128 characters and an optional value string with a maximum of 256 characters.
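
The tag limits above can be checked locally before a request is sent. This is a rough sketch: isValidTag() is a hypothetical helper, and strlen() counts bytes, so multibyte keys or values would need a character-aware length function instead.

```php
<?php
// A tag needs a non-empty Key of at most 128 characters; Value is optional and capped at 256.
function isValidTag(array $tag): bool
{
    $key = $tag['Key'] ?? '';
    $value = $tag['Value'] ?? '';
    return $key !== '' && strlen($key) <= 128 && strlen($value) <= 256;
}

var_dump(isValidTag(['Key' => 'team', 'Value' => 'analytics'])); // bool(true)
var_dump(isValidTag(['Value' => 'no-key']));                     // bool(false)
```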

NotebookExecutionSummary

Description

Details for a notebook execution. The details include information such as the unique ID and status of the notebook execution.

Members
EditorId
Type: string

The unique identifier of the editor associated with the notebook execution.

EndTime
Type: timestamp (string|DateTime or anything parsable by strtotime)

The timestamp when notebook execution ended.

ExecutionEngineId
Type: string

The unique ID of the execution engine for the notebook execution.

NotebookExecutionId
Type: string

The unique identifier of the notebook execution.

NotebookExecutionName
Type: string

The name of the notebook execution.

NotebookS3Location
Type: NotebookS3LocationForOutput structure

The Amazon S3 location that stores the notebook execution input.

StartTime
Type: timestamp (string|DateTime or anything parsable by strtotime)

The timestamp when notebook execution started.

Status
Type: string

The status of the notebook execution.

  • START_PENDING indicates that the cluster has received the execution request but execution has not begun.

  • STARTING indicates that the execution is starting on the cluster.

  • RUNNING indicates that the execution is being processed by the cluster.

  • FINISHING indicates that execution processing is in the final stages.

  • FINISHED indicates that the execution has completed without error.

  • FAILING indicates that the execution is failing and will not finish successfully.

  • FAILED indicates that the execution failed.

  • STOP_PENDING indicates that the cluster has received a StopNotebookExecution request and the stop is pending.

  • STOPPING indicates that the cluster is in the process of stopping the execution as a result of a StopNotebookExecution request.

  • STOPPED indicates that the execution stopped because of a StopNotebookExecution request.
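
When polling an execution, it can help to separate statuses that are still in progress from those that are final. The sketch below assumes that FINISHED, FAILED, and STOPPED are the final statuses, which is an inference from the descriptions above rather than something the reference states outright; the helper name is hypothetical.

```php
<?php
// Statuses treated as final in this sketch (an assumption drawn from the list above).
const TERMINAL_NOTEBOOK_STATUSES = ['FINISHED', 'FAILED', 'STOPPED'];

// True when no further status change is expected for the execution.
function isTerminalNotebookStatus(string $status): bool
{
    return in_array($status, TERMINAL_NOTEBOOK_STATUSES, true);
}

var_dump(isTerminalNotebookStatus('RUNNING'));  // bool(false)
var_dump(isTerminalNotebookStatus('FINISHED')); // bool(true)
```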

NotebookS3LocationForOutput

Description

The Amazon S3 location that stores the notebook execution input.

Members
Bucket
Type: string

The Amazon S3 bucket that stores the notebook execution input.

Key
Type: string

The key to the Amazon S3 location that stores the notebook execution input.

NotebookS3LocationFromInput

Description

The Amazon S3 location that stores the notebook execution input.

Members
Bucket
Type: string

The Amazon S3 bucket that stores the notebook execution input.

Key
Type: string

The key to the Amazon S3 location that stores the notebook execution input.

OSRelease

Description

The Amazon Linux release specified for a cluster in the RunJobFlow request.

Members
Label
Type: string

The Amazon Linux release specified for a cluster in the RunJobFlow request. The format is as shown in Amazon Linux 2 Release Notes. For example, 2.0.20220218.1.

OnDemandCapacityReservationOptions

Description

Describes the strategy for using unused Capacity Reservations for fulfilling On-Demand capacity.

Members
CapacityReservationPreference
Type: string

Indicates the instance's Capacity Reservation preferences. Possible preferences include:

  • open - The instance can run in any open Capacity Reservation that has matching attributes (instance type, platform, Availability Zone).

  • none - The instance avoids running in a Capacity Reservation even if one is available. The instance runs as an On-Demand Instance.

CapacityReservationResourceGroupArn
Type: string

The ARN of the Capacity Reservation resource group in which to run the instance.

UsageStrategy
Type: string

Indicates whether to use unused Capacity Reservations for fulfilling On-Demand capacity.

If you specify use-capacity-reservations-first, the fleet uses unused Capacity Reservations to fulfill On-Demand capacity up to the target On-Demand capacity. If multiple instance pools have unused Capacity Reservations, the On-Demand allocation strategy (lowest-price) is applied. If the number of unused Capacity Reservations is less than the On-Demand target capacity, the remaining On-Demand target capacity is launched according to the On-Demand allocation strategy (lowest-price).

If you do not specify a value, the fleet fulfills the On-Demand capacity according to the chosen On-Demand allocation strategy.

OnDemandProvisioningSpecification

Description

The launch specification for On-Demand Instances in the instance fleet, which determines the allocation strategy.

The instance fleet configuration is available only in Amazon EMR releases 4.8.0 and later, excluding 5.0.x versions. On-Demand Instances allocation strategy is available in Amazon EMR releases 5.12.1 and later.

Members
AllocationStrategy
Required: Yes
Type: string

Specifies the strategy to use in launching On-Demand instance fleets. Available options are lowest-price and prioritized. lowest-price specifies to launch the instances with the lowest price first, and prioritized specifies that Amazon EMR should launch the instances with the highest priority first. The default is lowest-price.

CapacityReservationOptions
Type: OnDemandCapacityReservationOptions structure

The launch specification for On-Demand instances in the instance fleet, which determines the allocation strategy.
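
As an illustration, an OnDemandProvisioningSpecification array might be written as below. The variable name is arbitrary, and the nested values are simply the options named in this reference; whether they suit a given fleet is not something this sketch can decide.

```php
<?php
// Sketch of an On-Demand launch specification for an instance fleet.
$onDemandSpecification = [
    'AllocationStrategy' => 'lowest-price', // required; 'prioritized' is the other documented option
    'CapacityReservationOptions' => [
        'UsageStrategy' => 'use-capacity-reservations-first',
        'CapacityReservationPreference' => 'open',
    ],
];
```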

OnDemandResizingSpecification

Description

The resize specification for On-Demand Instances in the instance fleet, which contains the resize timeout period.

Members
AllocationStrategy
Type: string

Specifies the allocation strategy to use to launch On-Demand instances during a resize. The default is lowest-price.

CapacityReservationOptions
Type: OnDemandCapacityReservationOptions structure

Describes the strategy for using unused Capacity Reservations for fulfilling On-Demand capacity.

TimeoutDurationMinutes
Type: int

On-Demand resize timeout in minutes. If On-Demand Instances are not provisioned within this time, the resize workflow stops. The minimum value is 5 minutes, and the maximum value is 10,080 minutes (7 days). The timeout applies to all resize workflows on the Instance Fleet. The resize could be triggered by Amazon EMR Managed Scaling or by the customer (via Amazon EMR Console, Amazon EMR CLI modify-instance-fleet or Amazon EMR SDK ModifyInstanceFleet API) or by Amazon EMR due to Amazon EC2 Spot Reclamation.
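
The documented bounds can be enforced locally before calling the service. The sketch below uses a hypothetical helper, resizeTimeoutMinutes(), which is not part of the SDK.

```php
<?php
// Local check of the documented range: 5 to 10,080 minutes (7 days).
function resizeTimeoutMinutes(int $minutes): int
{
    if ($minutes < 5 || $minutes > 10080) {
        throw new OutOfRangeException("TimeoutDurationMinutes must be between 5 and 10080, got $minutes");
    }
    return $minutes;
}

$onDemandResize = [
    'AllocationStrategy' => 'lowest-price',
    'TimeoutDurationMinutes' => resizeTimeoutMinutes(7 * 24 * 60), // 7 days = 10080 minutes
];
```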

OutputNotebookS3LocationForOutput

Description

The Amazon S3 location that stores the notebook execution output.

Members
Bucket
Type: string

The Amazon S3 bucket that stores the notebook execution output.

Key
Type: string

The key to the Amazon S3 location that stores the notebook execution output.

OutputNotebookS3LocationFromInput

Description

The Amazon S3 location that stores the notebook execution output.

Members
Bucket
Type: string

The Amazon S3 bucket that stores the notebook execution output.

Key
Type: string

The key to the Amazon S3 location that stores the notebook execution output.

PlacementGroupConfig

Description

Placement group configuration for an Amazon EMR cluster. The configuration specifies the placement strategy that can be applied to instance roles during cluster creation.

To use this configuration, consider attaching managed policy AmazonElasticMapReducePlacementGroupPolicy to the Amazon EMR role.

Members
InstanceRole
Required: Yes
Type: string

Role of the instance in the cluster.

Starting with Amazon EMR release 5.23.0, the only supported instance role is MASTER.

PlacementStrategy
Type: string

Amazon EC2 Placement Group strategy associated with instance role.

Starting with Amazon EMR release 5.23.0, the only supported placement strategy is SPREAD for the MASTER instance role.

PlacementType

Description

The Amazon EC2 Availability Zone configuration of the cluster (job flow).

Members
AvailabilityZone
Type: string

The Amazon EC2 Availability Zone for the cluster. AvailabilityZone is used for uniform instance groups, while AvailabilityZones (plural) is used for instance fleets.

AvailabilityZones
Type: Array of strings

When multiple Availability Zones are specified, Amazon EMR evaluates them and launches instances in the optimal Availability Zone. AvailabilityZones is used for instance fleets, while AvailabilityZone (singular) is used for uniform instance groups.

The instance fleet configuration is available only in Amazon EMR releases 4.8.0 and later, excluding 5.0.x versions.
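
The choice between the singular and plural member can be sketched with a small helper. placementFor() is hypothetical, and the zone names are example values.

```php
<?php
// Pick AvailabilityZones for instance fleets and AvailabilityZone for uniform instance groups.
function placementFor(array $zones, bool $usesInstanceFleets): array
{
    if ($usesInstanceFleets) {
        return ['AvailabilityZones' => array_values($zones)];
    }
    if (count($zones) !== 1) {
        throw new InvalidArgumentException('Uniform instance groups take exactly one Availability Zone');
    }
    return ['AvailabilityZone' => array_values($zones)[0]];
}

$fleetPlacement = placementFor(['us-east-1a', 'us-east-1b'], true);
$groupPlacement = placementFor(['us-east-1a'], false);
```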

PortRange

Description

A list of port ranges that are permitted to allow inbound traffic from all public IP addresses. To specify a single port, use the same value for MinRange and MaxRange.

Members
MaxRange
Type: int

The largest port number in a specified range of port numbers.

MinRange
Required: Yes
Type: int

The smallest port number in a specified range of port numbers.
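
A single port and a true range can both be produced by one helper, as in this sketch. portRange() is a hypothetical function, and the 0 to 65535 bound is the general TCP/UDP port range rather than a limit stated in this reference.

```php
<?php
// Build a PortRange; when only one port is given, MinRange and MaxRange share the same value.
function portRange(int $min, ?int $max = null): array
{
    $max = $max ?? $min;
    if ($min < 0 || $max > 65535 || $min > $max) {
        throw new InvalidArgumentException("Invalid port range: $min-$max");
    }
    return ['MinRange' => $min, 'MaxRange' => $max];
}

$ssh = portRange(22);         // single port
$web = portRange(8000, 8080); // inclusive range
```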

ReleaseLabelFilter

Description

The release label filters by application or version prefix.

Members
Application
Type: string

Optional release label application filter. For example, spark@2.1.0.

Prefix
Type: string

Optional release label version prefix filter. For example, emr-5.

ScalingAction

Description

The type of adjustment the automatic scaling activity makes when triggered, and the periodicity of the adjustment.

Members
Market
Type: string

Not available for instance groups. Instance groups use the market type specified for the group.

SimpleScalingPolicyConfiguration
Required: Yes
Type: SimpleScalingPolicyConfiguration structure

The type of adjustment the automatic scaling activity makes when triggered, and the periodicity of the adjustment.

ScalingConstraints

Description

The upper and lower Amazon EC2 instance limits for an automatic scaling policy. Automatic scaling activities triggered by automatic scaling rules will not cause an instance group to grow above or shrink below these limits.

Members
MaxCapacity
Required: Yes
Type: int

The upper boundary of Amazon EC2 instances in an instance group beyond which scaling activities are not allowed to grow. Scale-out activities will not add instances beyond this boundary.

MinCapacity
Required: Yes
Type: int

The lower boundary of Amazon EC2 instances in an instance group below which scaling activities are not allowed to shrink. Scale-in activities will not terminate instances below this boundary.
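
The effect of the two boundaries can be pictured as clamping a requested instance count. This is only a local illustration with a hypothetical helper; the service applies the limits itself.

```php
<?php
// Keep a desired instance count within MinCapacity and MaxCapacity.
function clampToConstraints(int $desired, array $constraints): int
{
    return max($constraints['MinCapacity'], min($constraints['MaxCapacity'], $desired));
}

$constraints = ['MinCapacity' => 2, 'MaxCapacity' => 10];
echo clampToConstraints(15, $constraints), PHP_EOL; // 10
echo clampToConstraints(1, $constraints), PHP_EOL;  // 2
```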

ScalingRule

Description

A scale-in or scale-out rule that defines scaling activity, including the CloudWatch metric alarm that triggers activity, how Amazon EC2 instances are added or removed, and the periodicity of adjustments. The automatic scaling policy for an instance group can comprise one or more automatic scaling rules.

Members
Action
Required: Yes
Type: ScalingAction structure

The adjustment that the automatic scaling activity makes when the rule is triggered.

Description
Type: string

A friendly, more verbose description of the automatic scaling rule.

Name
Required: Yes
Type: string

The name used to identify an automatic scaling rule. Rule names must be unique within a scaling policy.

Trigger
Required: Yes
Type: ScalingTrigger structure

The CloudWatch alarm definition that determines when automatic scaling activity is triggered.

ScalingTrigger

Description

The conditions that trigger an automatic scaling activity.

Members
CloudWatchAlarmDefinition
Required: Yes
Type: CloudWatchAlarmDefinition structure

The definition of a CloudWatch metric alarm. When the defined alarm conditions are met along with other trigger parameters, scaling activity begins.

ScriptBootstrapActionConfig

Description

Configuration of the script to run during a bootstrap action.

Members
Args
Type: Array of strings

A list of command line arguments to pass to the bootstrap action script.

Path
Required: Yes
Type: string

Location in Amazon S3 of the script to run during a bootstrap action.

SecurityConfigurationSummary

Description

The creation date and time, and name, of a security configuration.

Members
CreationDateTime
Type: timestamp (string|DateTime or anything parsable by strtotime)

The date and time the security configuration was created.

Name
Type: string

The name of the security configuration.

SessionMappingDetail

Description

Details for an Amazon EMR Studio session mapping including creation time, user or group ID, Studio ID, and so on.

Members
CreationTime
Type: timestamp (string|DateTime or anything parsable by strtotime)

The time the session mapping was created.

IdentityId
Type: string

The globally unique identifier (GUID) of the user or group.

IdentityName
Type: string

The name of the user or group. For more information, see UserName and DisplayName in the IAM Identity Center Identity Store API Reference.

IdentityType
Type: string

Specifies whether the identity mapped to the Amazon EMR Studio is a user or a group.

LastModifiedTime
Type: timestamp (string|DateTime or anything parsable by strtotime)

The time the session mapping was last modified.

SessionPolicyArn
Type: string

The Amazon Resource Name (ARN) of the session policy associated with the user or group.

StudioId
Type: string

The ID of the Amazon EMR Studio.

SessionMappingSummary

Description

Details for an Amazon EMR Studio session mapping. The details do not include the time the session mapping was last modified.

Members
CreationTime
Type: timestamp (string|DateTime or anything parsable by strtotime)

The time the session mapping was created.

IdentityId
Type: string

The globally unique identifier (GUID) of the user or group from the IAM Identity Center Identity Store.

IdentityName
Type: string

The name of the user or group. For more information, see UserName and DisplayName in the IAM Identity Center Identity Store API Reference.

IdentityType
Type: string

Specifies whether the identity mapped to the Amazon EMR Studio is a user or a group.

SessionPolicyArn
Type: string

The Amazon Resource Name (ARN) of the session policy associated with the user or group.

StudioId
Type: string

The ID of the Amazon EMR Studio.

ShrinkPolicy

Description

Policy for customizing shrink operations. Allows configuration of decommissioning timeout and targeted instance shrinking.

Members
DecommissionTimeout
Type: int

The desired timeout for decommissioning an instance. Overrides the default YARN decommissioning timeout.

InstanceResizePolicy
Type: InstanceResizePolicy structure

Custom policy for requesting termination protection or termination of specific instances when shrinking an instance group.

SimpleScalingPolicyConfiguration

Description

An automatic scaling configuration, which describes how the policy adds or removes instances, the cooldown period, and the number of Amazon EC2 instances that will be added each time the CloudWatch metric alarm condition is satisfied.

Members
AdjustmentType
Type: string

The way in which Amazon EC2 instances are added (if ScalingAdjustment is a positive number) or terminated (if ScalingAdjustment is a negative number) each time the scaling activity is triggered. CHANGE_IN_CAPACITY is the default.

  • CHANGE_IN_CAPACITY indicates that the Amazon EC2 instance count increments or decrements by ScalingAdjustment, which should be expressed as an integer.

  • PERCENT_CHANGE_IN_CAPACITY indicates the instance count increments or decrements by the percentage specified by ScalingAdjustment, which should be expressed as an integer. For example, 20 indicates an increase in 20% increments of cluster capacity.

  • EXACT_CAPACITY indicates the scaling activity results in an instance group with the number of Amazon EC2 instances specified by ScalingAdjustment, which should be expressed as a positive integer.

CoolDown
Type: int

The amount of time, in seconds, after a scaling activity completes before any further trigger-related scaling activities can start. The default value is 0.

ScalingAdjustment
Required: Yes
Type: int

The amount by which to scale in or scale out, based on the specified AdjustmentType. A positive value adds to the instance group's Amazon EC2 instance count while a negative number removes instances. If AdjustmentType is set to EXACT_CAPACITY, the number should only be a positive integer. If AdjustmentType is set to PERCENT_CHANGE_IN_CAPACITY, the value should express the percentage as an integer. For example, -20 indicates a decrease in 20% increments of cluster capacity.
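
The three adjustment types can be compared with a small local calculation. This is a sketch under stated assumptions: applyAdjustment() is a hypothetical helper, and the rounding used for PERCENT_CHANGE_IN_CAPACITY is a choice made for the example, since the reference does not say how fractional results are handled.

```php
<?php
// Compute the instance count that results from one scaling activity.
function applyAdjustment(int $current, array $config): int
{
    $type = $config['AdjustmentType'] ?? 'CHANGE_IN_CAPACITY'; // documented default
    $adjustment = $config['ScalingAdjustment'];
    switch ($type) {
        case 'CHANGE_IN_CAPACITY':
            return $current + $adjustment;
        case 'PERCENT_CHANGE_IN_CAPACITY':
            // Rounding to the nearest integer is an assumption of this sketch.
            return $current + (int) round($current * $adjustment / 100);
        case 'EXACT_CAPACITY':
            return $adjustment;
        default:
            throw new InvalidArgumentException("Unknown AdjustmentType: $type");
    }
}

echo applyAdjustment(10, ['ScalingAdjustment' => 2]), PHP_EOL; // 12
echo applyAdjustment(10, ['AdjustmentType' => 'PERCENT_CHANGE_IN_CAPACITY', 'ScalingAdjustment' => -20]), PHP_EOL; // 8
```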

SimplifiedApplication

Description

The returned release label application names or versions.

Members
Name
Type: string

The returned release label application name. For example, hadoop.

Version
Type: string

The returned release label application version. For example, 3.2.1.

SpotProvisioningSpecification

Description

The launch specification for Spot Instances in the instance fleet, which determines the defined duration, provisioning timeout behavior, and allocation strategy.

The instance fleet configuration is available only in Amazon EMR releases 4.8.0 and later, excluding 5.0.x versions. Spot Instance allocation strategy is available in Amazon EMR releases 5.12.1 and later.

Spot Instances with a defined duration (also known as Spot blocks) are no longer available to new customers from July 1, 2021. For customers who have previously used the feature, we will continue to support Spot Instances with a defined duration until December 31, 2022.

Members
AllocationStrategy
Type: string

Specifies one of the following strategies to launch Spot Instance fleets: capacity-optimized, price-capacity-optimized, lowest-price, diversified, or capacity-optimized-prioritized. For more information on the provisioning strategies, see Allocation strategies for Spot Instances in the Amazon EC2 User Guide for Linux Instances.

When you launch a Spot Instance fleet with the old console, it automatically launches with the capacity-optimized strategy. You can't change the allocation strategy from the old console.

BlockDurationMinutes
Type: int

The defined duration for Spot Instances (also known as Spot blocks) in minutes. When specified, the Spot Instance does not terminate before the defined duration expires, and defined duration pricing for Spot Instances applies. Valid values are 60, 120, 180, 240, 300, or 360. The duration period starts as soon as a Spot Instance receives its instance ID. At the end of the duration, Amazon EC2 marks the Spot Instance for termination and provides a Spot Instance termination notice, which gives the instance a two-minute warning before it terminates.

Spot Instances with a defined duration (also known as Spot blocks) are no longer available to new customers from July 1, 2021. For customers who have previously used the feature, we will continue to support Spot Instances with a defined duration until December 31, 2022.

TimeoutAction
Required: Yes
Type: string

The action to take when TargetSpotCapacity has not been fulfilled when the TimeoutDurationMinutes has expired; that is, when all Spot Instances could not be provisioned within the Spot provisioning timeout. Valid values are TERMINATE_CLUSTER and SWITCH_TO_ON_DEMAND. SWITCH_TO_ON_DEMAND specifies that if no Spot Instances are available, On-Demand Instances should be provisioned to fulfill any remaining Spot capacity.

TimeoutDurationMinutes
Required: Yes
Type: int

The Spot provisioning timeout period in minutes. If Spot Instances are not provisioned within this time period, the TimeoutAction is taken. The minimum value is 5 and the maximum value is 1440. The timeout applies only during initial provisioning, when the cluster is first created.
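
The two required members can be validated together before a request is built. The helper below is hypothetical and covers only the timeout members; AllocationStrategy and BlockDurationMinutes are left out of this sketch.

```php
<?php
// Build the required members of a SpotProvisioningSpecification after local validation.
function spotProvisioningSpecification(string $timeoutAction, int $timeoutMinutes): array
{
    if (!in_array($timeoutAction, ['TERMINATE_CLUSTER', 'SWITCH_TO_ON_DEMAND'], true)) {
        throw new InvalidArgumentException("Unknown TimeoutAction: $timeoutAction");
    }
    if ($timeoutMinutes < 5 || $timeoutMinutes > 1440) {
        throw new OutOfRangeException("TimeoutDurationMinutes must be between 5 and 1440, got $timeoutMinutes");
    }
    return [
        'TimeoutAction' => $timeoutAction,
        'TimeoutDurationMinutes' => $timeoutMinutes,
    ];
}

// Fall back to On-Demand Instances if Spot capacity is not provisioned within an hour.
$spotSpecification = spotProvisioningSpecification('SWITCH_TO_ON_DEMAND', 60);
```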

SpotResizingSpecification

Description

The resize specification for Spot Instances in the instance fleet, which contains the resize timeout period.

Members
AllocationStrategy
Type: string

Specifies the allocation strategy to use to launch Spot instances during a resize. If you run Amazon EMR releases 6.9.0 or higher, the default is price-capacity-optimized. If you run Amazon EMR releases 6.8.0 or lower, the default is capacity-optimized.

TimeoutDurationMinutes
Type: int

Spot resize timeout in minutes. If Spot Instances are not provisioned within this time, the resize workflow will stop provisioning of Spot instances. Minimum value is 5 minutes and maximum value is 10,080 minutes (7 days). The timeout applies to all resize workflows on the Instance Fleet. The resize could be triggered by Amazon EMR Managed Scaling or by the customer (via Amazon EMR Console, Amazon EMR CLI modify-instance-fleet or Amazon EMR SDK ModifyInstanceFleet API) or by Amazon EMR due to Amazon EC2 Spot Reclamation.

Step

Description

This represents a step in a cluster.

Members
ActionOnFailure
Type: string

The action to take when the cluster step fails. Possible values are TERMINATE_CLUSTER, CANCEL_AND_WAIT, and CONTINUE. TERMINATE_JOB_FLOW is provided for backward compatibility. We recommend using TERMINATE_CLUSTER instead.

If a cluster's StepConcurrencyLevel is greater than 1, do not use AddJobFlowSteps to submit a step with this parameter set to CANCEL_AND_WAIT or TERMINATE_CLUSTER. The step is not submitted and the action fails with a message that the ActionOnFailure setting is not valid.

If you change a cluster's StepConcurrencyLevel to be greater than 1 while a step is running, the ActionOnFailure parameter may not behave as you expect. In this case, for a step that fails with this parameter set to CANCEL_AND_WAIT, pending steps and the running step are not canceled; for a step that fails with this parameter set to TERMINATE_CLUSTER, the cluster does not terminate.

Config
Type: HadoopStepConfig structure

The Hadoop job configuration of the cluster step.

ExecutionRoleArn
Type: string

The Amazon Resource Name (ARN) of the runtime role for a step on the cluster. The runtime role can be a cross-account IAM role. The runtime role ARN is a combination of account ID, role name, and role type using the following format: arn:partition:service:region:account:resource.

For example, arn:aws:iam::1234567890:role/ReadOnly is a correctly formatted runtime role ARN.
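
The colon-separated format can be split into its named parts with a few lines of PHP. parseArn() is a hypothetical helper written for this sketch, not an SDK function, and it checks only the overall shape of the string.

```php
<?php
// Split an ARN into the six fields of arn:partition:service:region:account:resource.
function parseArn(string $arn): array
{
    // Limit of 6 keeps any colons inside the resource part intact.
    $parts = explode(':', $arn, 6);
    if (count($parts) !== 6 || $parts[0] !== 'arn') {
        throw new InvalidArgumentException("Not an ARN: $arn");
    }
    return array_combine(
        ['prefix', 'partition', 'service', 'region', 'account', 'resource'],
        $parts
    );
}

$arn = parseArn('arn:aws:iam::1234567890:role/ReadOnly');
echo $arn['resource'], PHP_EOL; // role/ReadOnly
```

The region field is empty for this example, which is why two colons appear side by side in the ARN.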

Id
Type: string

The identifier of the cluster step.

Name
Type: string

The name of the cluster step.

Status
Type: StepStatus structure

The current execution status details of the cluster step.

StepConfig

Description

Specification for a cluster (job flow) step.

Members
ActionOnFailure
Type: string

The action to take when the step fails. Use one of the following values:

  • TERMINATE_CLUSTER - Shuts down the cluster.

  • CANCEL_AND_WAIT - Cancels any pending steps and returns the cluster to the WAITING state.

  • CONTINUE - Continues to the next step in the queue.

  • TERMINATE_JOB_FLOW - Shuts down the cluster. TERMINATE_JOB_FLOW is provided for backward compatibility. We recommend using TERMINATE_CLUSTER instead.

If a cluster's StepConcurrencyLevel is greater than 1, do not use AddJobFlowSteps to submit a step with this parameter set to CANCEL_AND_WAIT or TERMINATE_CLUSTER. The step is not submitted and the action fails with a message that the ActionOnFailure setting is not valid.

If you change a cluster's StepConcurrencyLevel to be greater than 1 while a step is running, the ActionOnFailure parameter may not behave as you expect. In this case, for a step that fails with this parameter set to CANCEL_AND_WAIT, pending steps and the running step are not canceled; for a step that fails with this parameter set to TERMINATE_CLUSTER, the cluster does not terminate.

HadoopJarStep
Required: Yes
Type: HadoopJarStepConfig structure

The JAR file used for the step.

Name
Required: Yes
Type: string

The name of the step.
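
The StepConcurrencyLevel restriction on ActionOnFailure can be caught locally before a step is submitted. In this sketch, stepConfig() is a hypothetical helper, and the Jar and Args members of HadoopJarStep, along with the command-runner.jar value, are assumptions taken from the HadoopJarStepConfig shape described elsewhere in this reference.

```php
<?php
// Build a StepConfig, rejecting ActionOnFailure values that are not accepted when steps run concurrently.
function stepConfig(string $name, array $hadoopJarStep, string $actionOnFailure, int $stepConcurrencyLevel = 1): array
{
    $blockedWhenConcurrent = ['CANCEL_AND_WAIT', 'TERMINATE_CLUSTER'];
    if ($stepConcurrencyLevel > 1 && in_array($actionOnFailure, $blockedWhenConcurrent, true)) {
        throw new InvalidArgumentException(
            "ActionOnFailure $actionOnFailure is not valid when StepConcurrencyLevel is greater than 1"
        );
    }
    return [
        'Name' => $name,
        'ActionOnFailure' => $actionOnFailure,
        'HadoopJarStep' => $hadoopJarStep,
    ];
}

$step = stepConfig(
    'example-step',
    ['Jar' => 'command-runner.jar', 'Args' => ['spark-submit', '--version']],
    'CONTINUE',
    2
);
```

An array like $step would typically be one element of the Steps list passed to AddJobFlowSteps.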

StepDetail

Description

Combines the execution state and configuration of a step.

Members
ExecutionStatusDetail
Required: Yes
Type: StepExecutionStatusDetail structure

The description of the step status.

StepConfig
Required: Yes
Type: StepConfig structure

The step configuration.

StepExecutionStatusDetail

Description

The execution state of a step.

Members
CreationDateTime
Required: Yes
Type: timestamp (string|DateTime or anything parsable by strtotime)

The creation date and time of the step.

EndDateTime
Type: timestamp (string|DateTime or anything parsable by strtotime)

The completion date and time of the step.

LastStateChangeReason
Type: string

A description of the step's current state.

StartDateTime
Type: timestamp (string|DateTime or anything parsable by strtotime)

The start date and time of the step.

State
Required: Yes
Type: string

The state of the step.

StepStateChangeReason

Description

The details of the step state change reason.

Members
Code
Type: string

The programmable code for the state change reason. Note: Currently, the service provides no code for the state change.

Message
Type: string

The descriptive message for the state change reason.

StepStatus

Description

The execution status details of the cluster step.

Members
FailureDetails
Type: FailureDetails structure

The details of the step failure, including the reason, the message, and the path of the log file where the root cause was identified.

State
Type: string

The execution state of the cluster step.

StateChangeReason
Type: StepStateChangeReason structure

The reason for the step execution status change.

Timeline
Type: StepTimeline structure

The timeline of the cluster step status over time.

StepSummary

Description

The summary of the cluster step.

Members
ActionOnFailure
Type: string

The action to take when the cluster step fails. Possible values are TERMINATE_CLUSTER, CANCEL_AND_WAIT, and CONTINUE. TERMINATE_JOB_FLOW is available for backward compatibility.

Config
Type: HadoopStepConfig structure

The Hadoop job configuration of the cluster step.

Id
Type: string

The identifier of the cluster step.

Name
Type: string

The name of the cluster step.

Status
Type: StepStatus structure

The current execution status details of the cluster step.
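
The following sketch shows one way to pull failed steps out of a list of StepSummary arrays, such as the Steps key of a ListSteps result. The function name summarizeFailedSteps is hypothetical. The FAILED state value and the Message and LogFile keys of FailureDetails are taken from the corresponding structures documented elsewhere on this page, not from this section.

```php
<?php
// Hypothetical helper (not part of the SDK): returns one compact record per
// step whose Status.State is FAILED.
function summarizeFailedSteps(array $steps): array
{
    $failed = [];
    foreach ($steps as $step) {
        $status = $step['Status'] ?? [];
        if (($status['State'] ?? null) !== 'FAILED') {
            continue;
        }
        $failed[] = [
            'Id' => $step['Id'] ?? null,
            'Name' => $step['Name'] ?? null,
            // Prefer the failure message; fall back to the state change reason.
            'Message' => $status['FailureDetails']['Message']
                ?? ($status['StateChangeReason']['Message'] ?? null),
            'LogFile' => $status['FailureDetails']['LogFile'] ?? null,
        ];
    }
    return $failed;
}
```

With a configured client, the input could come from something like $client->listSteps(['ClusterId' => $clusterId])['Steps'], where $clusterId is a placeholder.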

StepTimeline

Description

The timeline of the cluster step lifecycle.

Members
CreationDateTime
Type: timestamp (string|DateTime or anything parsable by strtotime)

The date and time when the cluster step was created.

EndDateTime
Type: timestamp (string|DateTime or anything parsable by strtotime)

The date and time when the cluster step execution completed or failed.

StartDateTime
Type: timestamp (string|DateTime or anything parsable by strtotime)

The date and time when the cluster step execution started.
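
Because each StepTimeline member is a timestamp (a DateTime object or a string that strtotime can parse), durations can be derived with plain PHP. The sketch below is one possible approach; toTimestamp, stepRunSeconds, and stepQueueSeconds are hypothetical helper names, not SDK functions.

```php
<?php
// Hypothetical helper: normalizes a timestamp member to Unix seconds.
function toTimestamp($value): ?int
{
    if ($value === null) {
        return null;
    }
    if ($value instanceof DateTimeInterface) {
        return $value->getTimestamp();
    }
    $ts = strtotime((string) $value);
    if ($ts === false) {
        throw new InvalidArgumentException('Unparsable timestamp: ' . (string) $value);
    }
    return $ts;
}

// Seconds between StartDateTime and EndDateTime, or null if either is missing
// (for example, while the step is still pending or running).
function stepRunSeconds(array $timeline): ?int
{
    $start = toTimestamp($timeline['StartDateTime'] ?? null);
    $end = toTimestamp($timeline['EndDateTime'] ?? null);
    return ($start === null || $end === null) ? null : $end - $start;
}

// Seconds the step waited between CreationDateTime and StartDateTime.
function stepQueueSeconds(array $timeline): ?int
{
    $created = toTimestamp($timeline['CreationDateTime'] ?? null);
    $start = toTimestamp($timeline['StartDateTime'] ?? null);
    return ($created === null || $start === null) ? null : $start - $created;
}
```

Returning null rather than 0 for an unfinished step keeps "not finished yet" distinguishable from "finished instantly".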

Studio

Description

Details for an Amazon EMR Studio, including its ID, creation time, name, and other attributes.

Members
AuthMode
Type: string

Specifies whether the Amazon EMR Studio authenticates users with IAM or IAM Identity Center.

CreationTime
Type: timestamp (string|DateTime or anything parsable by strtotime)

The time the Amazon EMR Studio was created.

DefaultS3Location
Type: string

The Amazon S3 location to back up Amazon EMR Studio Workspaces and notebook files.

Description
Type: string

The detailed description of the Amazon EMR Studio.

EncryptionKeyArn
Type: string

The KMS key identifier (ARN) used to encrypt Amazon EMR Studio Workspace and notebook files when they are backed up to Amazon S3.

EngineSecurityGroupId
Type: string

The ID of the Engine security group associated with the Amazon EMR Studio. The Engine security group allows inbound network traffic from resources in the Workspace security group.

IdcInstanceArn
Type: string

The ARN of the IAM Identity Center instance the Studio application belongs to.

IdcUserAssignment
Type: string

Indicates whether the Studio has REQUIRED or OPTIONAL IAM Identity Center user assignment. If the value is set to REQUIRED, users must be explicitly assigned to the Studio application to access the Studio.

IdpAuthUrl
Type: string

Your identity provider's authentication endpoint. Amazon EMR Studio redirects federated users to this endpoint for authentication when logging in to a Studio with the Studio URL.

IdpRelayStateParameterName
Type: string

The name of your identity provider's RelayState parameter.

Name
Type: string

The name of the Amazon EMR Studio.

ServiceRole
Type: string

The name of the IAM role assumed by the Amazon EMR Studio.

StudioArn
Type: string

The Amazon Resource Name (ARN) of the Amazon EMR Studio.

StudioId
Type: string

The ID of the Amazon EMR Studio.

SubnetIds
Type: Array of strings

The list of IDs of the subnets associated with the Amazon EMR Studio.

Tags
Type: Array of Tag structures

A list of tags associated with the Amazon EMR Studio.

TrustedIdentityPropagationEnabled
Type: boolean

Indicates whether the Studio has Trusted identity propagation enabled. The default value is false.

Url
Type: string

The unique access URL of the Amazon EMR Studio.

UserRole
Type: string

The name of the IAM role assumed by users logged in to the Amazon EMR Studio. A Studio only requires a UserRole when you use IAM authentication.

VpcId
Type: string

The ID of the VPC associated with the Amazon EMR Studio.

WorkspaceSecurityGroupId
Type: string

The ID of the Workspace security group associated with the Amazon EMR Studio. The Workspace security group allows outbound network traffic to resources in the Engine security group and to the internet.

StudioSummary

Description

Details for an Amazon EMR Studio, including ID, Name, VPC, and Description. To fetch additional details such as subnets, IAM roles, security groups, and tags for the Studio, use the DescribeStudio API.

Members
AuthMode
Type: string

Specifies whether the Studio authenticates users using IAM or IAM Identity Center.

CreationTime
Type: timestamp (string|DateTime or anything parsable by strtotime)

The time when the Amazon EMR Studio was created.

Description
Type: string

The detailed description of the Amazon EMR Studio.

Name
Type: string

The name of the Amazon EMR Studio.

StudioId
Type: string

The ID of the Amazon EMR Studio.

Url
Type: string

The unique access URL of the Amazon EMR Studio.

VpcId
Type: string

The ID of the Virtual Private Cloud (Amazon VPC) associated with the Amazon EMR Studio.
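
As the StudioSummary description notes, the full Studio details require a DescribeStudio call per Studio. The sketch below shows that fan-out in a form that does not depend on a live client: describeStudios is a hypothetical helper, and the $describe callable stands in for whatever performs the actual lookup.

```php
<?php
// Hypothetical helper (not part of the SDK): looks up full details for each
// StudioSummary, keyed by StudioId. $describe receives a StudioId and returns
// the detail array for that Studio.
function describeStudios(array $summaries, callable $describe): array
{
    $details = [];
    foreach ($summaries as $summary) {
        if (!isset($summary['StudioId'])) {
            continue; // skip entries without an ID
        }
        $id = $summary['StudioId'];
        $details[$id] = $describe($id);
    }
    return $details;
}

// With a configured client, the callable could wrap DescribeStudio, for example:
// $details = describeStudios(
//     $client->listStudios()['Studios'],
//     function (string $id) use ($client) {
//         return $client->describeStudio(['StudioId' => $id])['Studio'];
//     }
// );
```

Passing the lookup as a callable keeps the helper easy to test with stub data.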

SupportedInstanceType

Description

An instance type that the specified Amazon EMR release supports.

Members
Architecture
Type: string

The CPU architecture, for example X86_64 or AARCH64.

EbsOptimizedAvailable
Type: boolean

Indicates whether the SupportedInstanceType supports Amazon EBS optimization.

EbsOptimizedByDefault
Type: boolean

Indicates whether the SupportedInstanceType uses Amazon EBS optimization by default.

EbsStorageOnly
Type: boolean

Indicates whether the SupportedInstanceType only supports Amazon EBS.

InstanceFamilyId
Type: string

The Amazon EC2 family and generation for the SupportedInstanceType.

Is64BitsOnly
Type: boolean

Indicates whether the SupportedInstanceType only supports 64-bit architecture.

MemoryGB
Type: float

The amount of memory that is available to Amazon EMR from the SupportedInstanceType. The kernel and hypervisor software consume some memory, so this value might be lower than the overall memory for the instance type.

NumberOfDisks
Type: int

The number of disks for the SupportedInstanceType. This value is 0 for Amazon EBS-only instance types.

StorageGB
Type: int

StorageGB represents the storage capacity of the SupportedInstanceType. This value is 0 for Amazon EBS-only instance types.

Type
Type: string

The Amazon EC2 instance type, for example m5.xlarge, of the SupportedInstanceType.

VCPU
Type: int

The number of vCPUs available for the SupportedInstanceType.
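
A list of SupportedInstanceType arrays can be narrowed with ordinary array functions. The sketch below filters by Architecture and a minimum MemoryGB, then orders the matches by VCPU. The function name filterInstanceTypes is hypothetical, and the sample values in the usage comment are made up for illustration.

```php
<?php
// Hypothetical helper (not part of the SDK): returns the Type names of the
// instance types that match the architecture and memory floor, smallest
// vCPU count first.
function filterInstanceTypes(array $types, string $architecture, float $minMemoryGB): array
{
    $matches = array_filter(
        $types,
        function (array $t) use ($architecture, $minMemoryGB): bool {
            return ($t['Architecture'] ?? null) === $architecture
                && (float) ($t['MemoryGB'] ?? 0) >= $minMemoryGB;
        }
    );

    // usort reindexes the array, so the result is a plain list.
    usort($matches, function (array $a, array $b): int {
        return ($a['VCPU'] ?? 0) <=> ($b['VCPU'] ?? 0);
    });

    return array_column($matches, 'Type');
}

// Example call with illustrative (not authoritative) values:
// filterInstanceTypes($supportedTypes, 'X86_64', 16.0);
```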

SupportedProductConfig

Description

The list of supported product configurations that allow user-supplied arguments. Amazon EMR accepts these arguments and forwards them to the corresponding installation script as bootstrap action arguments.

Members
Args
Type: Array of strings

The list of user-supplied arguments.

Name
Type: string

The name of the product configuration.

Tag

Description

A key-value pair containing user-defined metadata that you can associate with an Amazon EMR resource. Tags make it easier to associate clusters in various ways, such as grouping clusters to track your Amazon EMR resource allocation costs. For more information, see Tag Clusters.

Members
Key
Type: string

A user-defined key, which is the minimum required information for a valid tag. For more information, see Tag.

Value
Type: string

A user-defined value, which is optional in a tag. For more information, see Tag Clusters.
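
Since Key is the minimum required information and Value is optional, a list of Tag structures can be generated from a simple associative array. This is a sketch under that reading; tagsFromMap is a hypothetical helper name, and the sample keys are placeholders.

```php
<?php
// Hypothetical helper (not part of the SDK): converts ['key' => 'value'] pairs
// into a list of Tag structures. A null value yields a tag with only a Key.
function tagsFromMap(array $map): array
{
    $tags = [];
    foreach ($map as $key => $value) {
        $tag = ['Key' => (string) $key]; // PHP may store numeric keys as ints
        if ($value !== null) {
            $tag['Value'] = (string) $value;
        }
        $tags[] = $tag;
    }
    return $tags;
}

$tags = tagsFromMap(['team' => 'data-platform', 'temporary' => null]);

// With a configured client, the tags could then be attached roughly like this
// (the resource ID is a placeholder):
// $client->addTags(['ResourceId' => 'j-EXAMPLE', 'Tags' => $tags]);
```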

UsernamePassword

Description

The username and password that you use to connect to cluster endpoints.

Members
Password
Type: string

The password associated with the temporary credentials that you use to connect to cluster endpoints.

Username
Type: string

The username associated with the temporary credentials that you use to connect to cluster endpoints.

VolumeSpecification

Description

EBS volume specifications, such as volume type, IOPS, size (GiB), and throughput (MiB/s), that are requested for the EBS volume attached to an Amazon EC2 instance in the cluster.

Members
Iops
Type: int

The number of I/O operations per second (IOPS) that the volume supports.

SizeInGB
Required: Yes
Type: int

The volume size, in gibibytes (GiB). This can be a number from 1 to 1024. If the volume type is EBS-optimized, the minimum value is 10.

Throughput
Type: int

The throughput, in mebibytes per second (MiB/s). This optional parameter can be a number from 125 to 1000 and is valid only for gp3 volumes.

VolumeType
Required: Yes
Type: string

The volume type. Volume types supported are gp3, gp2, io1, st1, sc1, and standard.
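
The constraints listed for VolumeSpecification can be checked before a request is sent. The sketch below encodes only the limits stated above (the supported volume types, the 1 to 1024 GiB size range, and the gp3-only 125 to 1000 MiB/s throughput range). It does not attempt the EBS-optimized minimum of 10 or any Iops limits, because this section does not describe how to detect or bound them. The function name buildVolumeSpecification is hypothetical.

```php
<?php
// Hypothetical helper (not part of the SDK): validates inputs against the
// limits stated in this section and returns a VolumeSpecification array.
function buildVolumeSpecification(
    string $volumeType,
    int $sizeInGB,
    ?int $iops = null,
    ?int $throughput = null
): array {
    $supported = ['gp3', 'gp2', 'io1', 'st1', 'sc1', 'standard'];
    if (!in_array($volumeType, $supported, true)) {
        throw new InvalidArgumentException("Unsupported VolumeType: $volumeType");
    }
    if ($sizeInGB < 1 || $sizeInGB > 1024) {
        throw new InvalidArgumentException('SizeInGB must be between 1 and 1024');
    }
    if ($throughput !== null) {
        if ($volumeType !== 'gp3') {
            throw new InvalidArgumentException('Throughput is valid only for gp3 volumes');
        }
        if ($throughput < 125 || $throughput > 1000) {
            throw new InvalidArgumentException('Throughput must be between 125 and 1000');
        }
    }

    $spec = ['VolumeType' => $volumeType, 'SizeInGB' => $sizeInGB]; // required members
    if ($iops !== null) {
        $spec['Iops'] = $iops; // passed through without range checks
    }
    if ($throughput !== null) {
        $spec['Throughput'] = $throughput;
    }
    return $spec;
}
```

The service still performs its own validation, so this check only catches the documented cases early.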