Amazon EMR examples using AWS CLI

The following code examples show how to perform actions and implement common scenarios by using the AWS Command Line Interface with Amazon EMR.

Actions are code excerpts from larger programs and must be run in context. While actions show you how to call individual service functions, you can see actions in context in their related scenarios.

Each example includes a link to the complete source code, where you can find instructions on how to set up and run the code in context.

Actions

The following code example shows how to use add-instance-fleet.

AWS CLI

To add a task instance fleet to a cluster

This example adds a new task instance fleet to the specified cluster.

Command:


aws emr add-instance-fleet --cluster-id 'j-12ABCDEFGHI34JK' --instance-fleet  InstanceFleetType=TASK,TargetSpotCapacity=1,LaunchSpecifications={SpotSpecification='{TimeoutDurationMinutes=20,TimeoutAction=TERMINATE_CLUSTER}'},InstanceTypeConfigs=['{InstanceType=m3.xlarge,BidPrice=0.5}']

Output:


{
   "ClusterId": "j-12ABCDEFGHI34JK",
   "InstanceFleetId": "if-23ABCDEFGHI45JJ"
}

For API details, see AddInstanceFleet in AWS CLI Command Reference.

The following code example shows how to use add-steps.

AWS CLI

1. To add Custom JAR steps to a cluster

Command:


aws emr add-steps --cluster-id j-XXXXXXXX --steps Type=CUSTOM_JAR,Name=CustomJAR,ActionOnFailure=CONTINUE,Jar=s3://amzn-s3-demo-bucket/mytest.jar,Args=arg1,arg2,arg3 Type=CUSTOM_JAR,Name=CustomJAR,ActionOnFailure=CONTINUE,Jar=s3://amzn-s3-demo-bucket/mytest.jar,MainClass=mymainclass,Args=arg1,arg2,arg3

Required parameters:

Jar

Optional parameters:


Type, Name, ActionOnFailure, Args

Output:


{
    "StepIds":[
        "s-XXXXXXXX",
        "s-YYYYYYYY"
    ]
}
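The shorthand step definitions above follow a fixed pattern, so they can be assembled by a small helper. The following is a minimal sketch, not part of the AWS CLI: the function name custom_jar_step and its argument order are assumptions made for illustration. It prints the command instead of running it, so no AWS credentials are needed.

```shell
# Hypothetical helper (not an AWS CLI feature): prints one shorthand step
# definition suitable for --steps.
# Usage: custom_jar_step NAME JAR_URI ARGS [MAIN_CLASS]
#   ARGS is a comma-separated string such as "arg1,arg2,arg3".
custom_jar_step() {
    step="Type=CUSTOM_JAR,Name=$1,ActionOnFailure=CONTINUE,Jar=$2"
    # MainClass is optional; add it only when a fourth argument is given.
    if [ -n "${4:-}" ]; then
        step="$step,MainClass=$4"
    fi
    printf '%s,Args=%s\n' "$step" "$3"
}

step_one=$(custom_jar_step CustomJAR s3://amzn-s3-demo-bucket/mytest.jar arg1,arg2,arg3)
step_two=$(custom_jar_step CustomJAR s3://amzn-s3-demo-bucket/mytest.jar arg1,arg2,arg3 mymainclass)

# Dry run: show the resulting command rather than executing it.
echo "aws emr add-steps --cluster-id j-XXXXXXXX --steps $step_one $step_two"
```

The two generated definitions match the two steps passed to --steps in the command above, one without MainClass and one with it.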

2. To add Streaming steps to a cluster

Command:


aws emr add-steps --cluster-id j-XXXXXXXX --steps Type=STREAMING,Name='Streaming Program',ActionOnFailure=CONTINUE,Args=[-files,s3://elasticmapreduce/samples/wordcount/wordSplitter.py,-mapper,wordSplitter.py,-reducer,aggregate,-input,s3://elasticmapreduce/samples/wordcount/input,-output,s3://amzn-s3-demo-bucket/wordcount/output]

Required parameters:


Type, Args

Optional parameters:


Name, ActionOnFailure

JSON equivalent (contents of step.json):


 [
  {
    "Name": "JSON Streaming Step",
    "Args": ["-files","s3://elasticmapreduce/samples/wordcount/wordSplitter.py","-mapper","wordSplitter.py","-reducer","aggregate","-input","s3://elasticmapreduce/samples/wordcount/input","-output","s3://amzn-s3-demo-bucket/wordcount/output"],
    "ActionOnFailure": "CONTINUE",
    "Type": "STREAMING"
  }
]

NOTE: JSON arguments must include options and values as their own items in the list.

Command (using step.json):


aws emr add-steps --cluster-id j-XXXXXXXX --steps file://./step.json

Output:


{
    "StepIds":[
        "s-XXXXXXXX",
        "s-YYYYYYYY"
    ]
}
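When the output location changes between runs, the step file can be generated rather than edited by hand. The sketch below assumes a hypothetical helper named write_streaming_step and writes the file under the temporary directory; both choices are illustrative only. Each option and each value is written as its own element of Args, as the note above requires.

```shell
# Hypothetical helper: writes a streaming step definition to the file named
# in $1, with the S3 output URI taken from $2.
write_streaming_step() {
    cat > "$1" <<EOF
[
  {
    "Name": "JSON Streaming Step",
    "Type": "STREAMING",
    "ActionOnFailure": "CONTINUE",
    "Args": [
      "-files", "s3://elasticmapreduce/samples/wordcount/wordSplitter.py",
      "-mapper", "wordSplitter.py",
      "-reducer", "aggregate",
      "-input", "s3://elasticmapreduce/samples/wordcount/input",
      "-output", "$2"
    ]
  }
]
EOF
}

# Assumed location for the generated file.
step_file="${TMPDIR:-/tmp}/step.json"
write_streaming_step "$step_file" s3://amzn-s3-demo-bucket/wordcount/output

# Dry run: show the command rather than executing it.
echo "aws emr add-steps --cluster-id j-XXXXXXXX --steps file://$step_file"
```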

3. To add a Streaming step with multiple files to a cluster (JSON only)

JSON (multiplefiles.json):


[
  {
     "Name": "JSON Streaming Step",
     "Type": "STREAMING",
     "ActionOnFailure": "CONTINUE",
     "Args": [
         "-files",
         "s3://amzn-s3-demo-bucket/mapper.py,s3://amzn-s3-demo-bucket/reducer.py",
         "-mapper",
         "mapper.py",
         "-reducer",
         "reducer.py",
         "-input",
         "s3://amzn-s3-demo-bucket/input",
         "-output",
         "s3://amzn-s3-demo-bucket/output"]
  }
]

Command:


aws emr add-steps --cluster-id j-XXXXXXXX  --steps file://./multiplefiles.json

Required parameters:


Type, Args

Optional parameters:


Name, ActionOnFailure

Output:


{
    "StepIds":[
        "s-XXXXXXXX"
    ]
}

4. To add Hive steps to a cluster

Command:


aws emr add-steps --cluster-id j-XXXXXXXX --steps Type=HIVE,Name='Hive program',ActionOnFailure=CONTINUE,Args=[-f,s3://amzn-s3-demo-bucket/myhivescript.q,-d,INPUT=s3://amzn-s3-demo-bucket/myhiveinput,-d,OUTPUT=s3://amzn-s3-demo-bucket/myhiveoutput,arg1,arg2] Type=HIVE,Name='Hive steps',ActionOnFailure=TERMINATE_CLUSTER,Args=[-f,s3://elasticmapreduce/samples/hive-ads/libs/model-build.q,-d,INPUT=s3://elasticmapreduce/samples/hive-ads/tables,-d,OUTPUT=s3://amzn-s3-demo-bucket/hive-ads/output/2014-04-18/11-07-32,-d,LIBS=s3://elasticmapreduce/samples/hive-ads/libs]

Required parameters:


Type, Args

Optional parameters:


Name, ActionOnFailure

Output:


{
    "StepIds":[
        "s-XXXXXXXX",
        "s-YYYYYYYY"
    ]
}

5. To add Pig steps to a cluster

Command:


aws emr add-steps --cluster-id j-XXXXXXXX --steps Type=PIG,Name='Pig program',ActionOnFailure=CONTINUE,Args=[-f,s3://amzn-s3-demo-bucket/mypigscript.pig,-p,INPUT=s3://amzn-s3-demo-bucket/mypiginput,-p,OUTPUT=s3://amzn-s3-demo-bucket/mypigoutput,arg1,arg2] Type=PIG,Name='Pig program',Args=[-f,s3://elasticmapreduce/samples/pig-apache/do-reports2.pig,-p,INPUT=s3://elasticmapreduce/samples/pig-apache/input,-p,OUTPUT=s3://amzn-s3-demo-bucket/pig-apache/output,arg1,arg2]

Required parameters:


Type, Args

Optional parameters:


Name, ActionOnFailure

Output:


{
    "StepIds":[
        "s-XXXXXXXX",
        "s-YYYYYYYY"
    ]
}

6. To add Impala steps to a cluster

Command:


aws emr add-steps --cluster-id j-XXXXXXXX --steps Type=IMPALA,Name='Impala program',ActionOnFailure=CONTINUE,Args=--impala-script,s3://myimpala/input,--console-output-path,s3://myimpala/output

Required parameters:


Type, Args

Optional parameters:


Name, ActionOnFailure

Output:


{
    "StepIds":[
        "s-XXXXXXXX",
        "s-YYYYYYYY"
    ]
}

For API details, see AddSteps in AWS CLI Command Reference.

The following code example shows how to use add-tags.

AWS CLI

1. To add tags to a cluster

Command:


aws emr add-tags --resource-id j-xxxxxxx --tags name="John Doe" age=29 sex=male address="123 East NW Seattle"

Output:


None

2. To list tags of a cluster

Command:


aws emr describe-cluster --cluster-id j-XXXXXXYY --query Cluster.Tags

Output:


[
    {
        "Value": "male",
        "Key": "sex"
    },
    {
        "Value": "123 East NW Seattle",
        "Key": "address"
    },
    {
        "Value": "John Doe",
        "Key": "name"
    },
    {
        "Value": "29",
        "Key": "age"
    }
]
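Tag values that contain spaces, such as "John Doe", reach the CLI intact only because the shell quoting keeps each key=value pair together as one argument. The sketch below illustrates this with the positional parameters of a POSIX shell; it is a dry run and does not call AWS.

```shell
# Hold each tag as one positional parameter; the quotes keep values that
# contain spaces together as a single argument.
set -- 'name=John Doe' 'age=29' 'sex=male' 'address=123 East NW Seattle'
tag_count=$#

# Dry run: print each argument exactly as the CLI would receive it.
for tag in "$@"; do
    printf '[%s]\n' "$tag"
done

# With credentials configured, the same list could be passed along unchanged:
# aws emr add-tags --resource-id j-xxxxxxx --tags "$@"
```

Writing "$@" in double quotes is what preserves the argument boundaries; an unquoted $@ would split "John Doe" into two words.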

For API details, see AddTags in AWS CLI Command Reference.

The following code example shows how to use create-cluster-examples.

AWS CLI

Most of the following examples assume that you specified your Amazon EMR service role and Amazon EC2 instance profile. If you have not done this, you must specify each required IAM role or use the --use-default-roles parameter when creating your cluster. For more information about specifying IAM roles, see Configure IAM Roles for Amazon EMR Permissions to AWS Services in the Amazon EMR Management Guide.

Example 1: To create a cluster

The following create-cluster example creates a simple EMR cluster.


aws emr create-cluster \
    --release-label emr-5.14.0 \
    --instance-type m4.large \
    --instance-count 2

This command produces no output.

Example 2: To create an Amazon EMR cluster with default ServiceRole and InstanceProfile roles

The following create-cluster example creates an Amazon EMR cluster that uses the --instance-groups configuration.


aws emr create-cluster \
    --release-label emr-5.14.0 \
    --service-role EMR_DefaultRole \
    --ec2-attributes InstanceProfile=EMR_EC2_DefaultRole \
    --instance-groups InstanceGroupType=MASTER,InstanceCount=1,InstanceType=m4.large InstanceGroupType=CORE,InstanceCount=2,InstanceType=m4.large

Example 3: To create an Amazon EMR cluster that uses an instance fleet

The following create-cluster example creates an Amazon EMR cluster that uses the --instance-fleets configuration, specifying two instance types for each fleet and two EC2 subnets.


aws emr create-cluster \
    --release-label emr-5.14.0 \
    --service-role EMR_DefaultRole \
    --ec2-attributes InstanceProfile=EMR_EC2_DefaultRole,SubnetIds=['subnet-ab12345c','subnet-de67890f'] \
    --instance-fleets InstanceFleetType=MASTER,TargetOnDemandCapacity=1,InstanceTypeConfigs=['{InstanceType=m4.large}'] InstanceFleetType=CORE,TargetSpotCapacity=11,InstanceTypeConfigs=['{InstanceType=m4.large,BidPrice=0.5,WeightedCapacity=3}','{InstanceType=m4.2xlarge,BidPrice=0.9,WeightedCapacity=5}'],LaunchSpecifications={SpotSpecification='{TimeoutDurationMinutes=120,TimeoutAction=SWITCH_TO_ON_DEMAND}'}
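To get a feel for how WeightedCapacity relates to TargetSpotCapacity in the CORE fleet above, the arithmetic can be sketched as follows. This is an illustration under stated assumptions, not a description of how Amazon EMR chooses instances: it assumes the target is met from a single instance type and that capacity is added in whole instances until the target is reached or exceeded. The helper name instances_needed is hypothetical.

```shell
# Hypothetical helper: smallest number of instances of one type whose combined
# weight reaches the target capacity (ceiling division).
instances_needed() {
    target=$1
    weight=$2
    echo $(( (target + weight - 1) / weight ))
}

# CORE fleet in the command above: TargetSpotCapacity=11
large=$(instances_needed 11 3)     # m4.large, WeightedCapacity=3
xlarge=$(instances_needed 11 5)    # m4.2xlarge, WeightedCapacity=5

echo "m4.large only:   $large instances, $(( large * 3 )) units"
echo "m4.2xlarge only: $xlarge instances, $(( xlarge * 5 )) units"
```

Under these assumptions, either choice overshoots the target of 11 units, because the weights do not divide it evenly.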

Example 4: To create a cluster with default roles

The following create-cluster example uses the --use-default-roles parameter to specify the default service role and instance profile.


aws emr create-cluster \
    --release-label emr-5.9.0 \
    --use-default-roles \
    --instance-groups InstanceGroupType=MASTER,InstanceCount=1,InstanceType=m4.large InstanceGroupType=CORE,InstanceCount=2,InstanceType=m4.large \
    --auto-terminate

Example 5: To create a cluster and specify the applications to install

The following create-cluster example uses the --applications parameter to specify the applications that Amazon EMR installs. This example installs Hadoop, Hive, and Pig.


aws emr create-cluster \
    --applications Name=Hadoop Name=Hive Name=Pig \
    --release-label emr-5.9.0 \
    --instance-groups InstanceGroupType=MASTER,InstanceCount=1,InstanceType=m4.large InstanceGroupType=CORE,InstanceCount=2,InstanceType=m4.large \
    --auto-terminate

Example 6: To create a cluster that includes Spark

The following example installs Spark.


aws emr create-cluster \
    --release-label emr-5.9.0 \
    --applications Name=Spark \
    --ec2-attributes KeyName=myKey \
    --instance-groups InstanceGroupType=MASTER,InstanceCount=1,InstanceType=m4.large InstanceGroupType=CORE,InstanceCount=2,InstanceType=m4.large \
    --auto-terminate

Example 7: To specify a custom AMI to use for cluster instances

The following create-cluster example creates a cluster instance based on the Amazon Linux AMI with ID ami-a518e6df.


aws emr create-cluster \
    --name "Cluster with My Custom AMI" \
    --custom-ami-id ami-a518e6df \
    --ebs-root-volume-size 20 \
    --release-label emr-5.9.0 \
    --use-default-roles \
    --instance-count 2 \
    --instance-type m4.large

Example 8: To customize application configurations

The following examples use the --configurations parameter to specify a JSON configuration file that contains application customizations for Hadoop. For more information, see Configuring Applications in the Amazon EMR Release Guide.

Contents of configurations.json.


[
    {
       "Classification": "mapred-site",
       "Properties": {
           "mapred.tasktracker.map.tasks.maximum": 2
       }
    },
    {
        "Classification": "hadoop-env",
        "Properties": {},
        "Configurations": [
            {
                "Classification": "export",
                "Properties": {
                    "HADOOP_DATANODE_HEAPSIZE": 2048,
                    "HADOOP_NAMENODE_OPTS": "-XX:GCTimeRatio=19"
                }
            }
        ]
    }
]

The following example references configurations.json as a local file.


aws emr create-cluster \
    --configurations file://configurations.json \
    --release-label emr-5.9.0 \
    --instance-groups InstanceGroupType=MASTER,InstanceCount=1,InstanceType=m4.large InstanceGroupType=CORE,InstanceCount=2,InstanceType=m4.large \
    --auto-terminate

The following example references configurations.json as a file in Amazon S3.


aws emr create-cluster \
    --configurations https://s3.amazonaws.com/amzn-s3-demo-bucket/configurations.json \
    --release-label emr-5.9.0 \
    --instance-groups InstanceGroupType=MASTER,InstanceCount=1,InstanceType=m4.large InstanceGroupType=CORE,InstanceCount=2,InstanceType=m4.large \
    --auto-terminate

Example 9: To create a cluster with master, core, and task instance groups

The following create-cluster example uses --instance-groups to specify the type and number of EC2 instances to use for the master, core, and task instance groups.


aws emr create-cluster \
    --release-label emr-5.9.0 \
    --instance-groups Name=Master,InstanceGroupType=MASTER,InstanceType=m4.large,InstanceCount=1 Name=Core,InstanceGroupType=CORE,InstanceType=m4.large,InstanceCount=2 Name=Task,InstanceGroupType=TASK,InstanceType=m4.large,InstanceCount=2

Example 10: To specify that a cluster should terminate after completing all steps

The following create-cluster example uses --auto-terminate to specify that the cluster should shut down automatically after completing all steps.


aws emr create-cluster \
    --release-label emr-5.9.0 \
    --instance-groups InstanceGroupType=MASTER,InstanceCount=1,InstanceType=m4.large  InstanceGroupType=CORE,InstanceCount=2,InstanceType=m4.large \
    --auto-terminate

Example 11: To specify cluster configuration details such as the Amazon EC2 key pair, network configuration, and security groups

The following create-cluster example creates a cluster with the Amazon EC2 key pair named myKey and a customized instance profile named myProfile. Key pairs are used to authorize SSH connections to cluster nodes, most often the master node. For more information, see Use an Amazon EC2 Key Pair for SSH Credentials in the Amazon EMR Management Guide.


aws emr create-cluster \
    --ec2-attributes KeyName=myKey,InstanceProfile=myProfile \
    --release-label emr-5.9.0 \
    --instance-groups InstanceGroupType=MASTER,InstanceCount=1,InstanceType=m4.large InstanceGroupType=CORE,InstanceCount=2,InstanceType=m4.large \
    --auto-terminate

The following example creates a cluster in an Amazon VPC subnet.


aws emr create-cluster \
    --ec2-attributes SubnetId=subnet-xxxxx \
    --release-label emr-5.9.0 \
    --instance-groups InstanceGroupType=MASTER,InstanceCount=1,InstanceType=m4.large InstanceGroupType=CORE,InstanceCount=2,InstanceType=m4.large \
    --auto-terminate

The following example creates a cluster in the us-east-1b availability zone.


aws emr create-cluster \
    --ec2-attributes AvailabilityZone=us-east-1b \
    --release-label emr-5.9.0 \
    --instance-groups InstanceGroupType=MASTER,InstanceCount=1,InstanceType=m4.large InstanceGroupType=CORE,InstanceCount=2,InstanceType=m4.large

The following example creates a cluster and specifies only the Amazon EMR-managed security groups.


aws emr create-cluster \
    --release-label emr-5.9.0 \
    --service-role myServiceRole \
    --ec2-attributes InstanceProfile=myRole,EmrManagedMasterSecurityGroup=sg-master1,EmrManagedSlaveSecurityGroup=sg-slave1 \
    --instance-groups InstanceGroupType=MASTER,InstanceCount=1,InstanceType=m4.large InstanceGroupType=CORE,InstanceCount=2,InstanceType=m4.large

The following example creates a cluster and specifies only additional Amazon EC2 security groups.


aws emr create-cluster \
    --release-label emr-5.9.0 \
    --service-role myServiceRole \
    --ec2-attributes InstanceProfile=myRole,AdditionalMasterSecurityGroups=[sg-addMaster1,sg-addMaster2,sg-addMaster3,sg-addMaster4],AdditionalSlaveSecurityGroups=[sg-addSlave1,sg-addSlave2,sg-addSlave3,sg-addSlave4] \
    --instance-groups InstanceGroupType=MASTER,InstanceCount=1,InstanceType=m4.large InstanceGroupType=CORE,InstanceCount=2,InstanceType=m4.large

The following example creates a cluster and specifies the EMR-managed security groups, as well as additional security groups.


aws emr create-cluster \
    --release-label emr-5.9.0 \
    --service-role myServiceRole \
    --ec2-attributes InstanceProfile=myRole,EmrManagedMasterSecurityGroup=sg-master1,EmrManagedSlaveSecurityGroup=sg-slave1,AdditionalMasterSecurityGroups=[sg-addMaster1,sg-addMaster2,sg-addMaster3,sg-addMaster4],AdditionalSlaveSecurityGroups=[sg-addSlave1,sg-addSlave2,sg-addSlave3,sg-addSlave4] \
    --instance-groups InstanceGroupType=MASTER,InstanceCount=1,InstanceType=m4.large InstanceGroupType=CORE,InstanceCount=2,InstanceType=m4.large

The following example creates a cluster in a VPC private subnet and uses a specific Amazon EC2 security group to enable Amazon EMR service access, which is required for clusters in private subnets.


aws emr create-cluster \
    --release-label emr-5.9.0 \
    --service-role myServiceRole \
    --ec2-attributes InstanceProfile=myRole,ServiceAccessSecurityGroup=sg-service-access,EmrManagedMasterSecurityGroup=sg-master,EmrManagedSlaveSecurityGroup=sg-slave \
    --instance-groups InstanceGroupType=MASTER,InstanceCount=1,InstanceType=m4.large InstanceGroupType=CORE,InstanceCount=2,InstanceType=m4.large

The following example specifies security group configuration parameters using a JSON file named ec2_attributes.json that is stored locally. NOTE: JSON arguments must include options and values as their own items in the list.


aws emr create-cluster \
    --release-label emr-5.9.0 \
    --service-role myServiceRole \
    --ec2-attributes file://ec2_attributes.json  \
    --instance-groups InstanceGroupType=MASTER,InstanceCount=1,InstanceType=m4.large InstanceGroupType=CORE,InstanceCount=2,InstanceType=m4.large

Contents of ec2_attributes.json.


[
    {
        "SubnetId": "subnet-xxxxx",
        "KeyName": "myKey",
        "InstanceProfile":"myRole",
        "EmrManagedMasterSecurityGroup": "sg-master1",
        "EmrManagedSlaveSecurityGroup": "sg-slave1",
        "ServiceAccessSecurityGroup": "sg-service-access",
        "AdditionalMasterSecurityGroups": ["sg-addMaster1","sg-addMaster2","sg-addMaster3","sg-addMaster4"],
        "AdditionalSlaveSecurityGroups": ["sg-addSlave1","sg-addSlave2","sg-addSlave3","sg-addSlave4"]
    }
]

Example 12: To enable debugging and specify a log URI

The following create-cluster example uses the --enable-debugging parameter, which allows you to view log files more easily using the debugging tool in the Amazon EMR console. The --log-uri parameter is required with --enable-debugging.


aws emr create-cluster \
    --enable-debugging \
    --log-uri s3://amzn-s3-demo-bucket/myLog \
    --release-label emr-5.9.0 \
    --instance-groups InstanceGroupType=MASTER,InstanceCount=1,InstanceType=m4.large InstanceGroupType=CORE,InstanceCount=2,InstanceType=m4.large \
    --auto-terminate

Example 13: To add tags when creating a cluster

Tags are key-value pairs that help you identify and manage clusters. The following create-cluster example uses the --tags parameter to create three tags for a cluster: one with the key name name and the value Shirley Rodriguez, a second with the key name age and the value 29, and a third with the key name department and the value Analytics.


aws emr create-cluster \
    --tags name="Shirley Rodriguez" age=29 department="Analytics" \
    --release-label emr-5.32.0 \
    --instance-type m5.xlarge \
    --instance-count 3 \
    --use-default-roles

The following example lists the tags applied to a cluster.


aws emr describe-cluster \
    --cluster-id j-XXXXXXYY \
    --query Cluster.Tags

Example 14: To use a security configuration that enables encryption and other security features

The following create-cluster example uses the --security-configuration parameter to specify a security configuration for an EMR cluster. You can use security configurations with Amazon EMR version 4.8.0 or later.


aws emr create-cluster \
    --instance-type m4.large \
    --release-label emr-5.9.0 \
    --security-configuration mySecurityConfiguration

Example 15: To create a cluster with additional EBS storage volumes configured for the instance groups

When specifying additional EBS volumes, the following arguments are required if EbsBlockDeviceConfigs is specified: VolumeType and SizeInGB.

The following create-cluster example creates a cluster with multiple EBS volumes attached to EC2 instances in the core instance group.


aws emr create-cluster \
    --release-label emr-5.9.0  \
    --use-default-roles \
    --instance-groups InstanceGroupType=MASTER,InstanceCount=1,InstanceType=d2.xlarge 'InstanceGroupType=CORE,InstanceCount=2,InstanceType=d2.xlarge,EbsConfiguration={EbsOptimized=true,EbsBlockDeviceConfigs=[{VolumeSpecification={VolumeType=gp2,SizeInGB=100}},{VolumeSpecification={VolumeType=io1,SizeInGB=100,Iops=100},VolumesPerInstance=4}]}' \
    --auto-terminate

The following example creates a cluster with multiple EBS volumes attached to EC2 instances in the master instance group.


aws emr create-cluster \
    --release-label emr-5.9.0 \
    --use-default-roles \
    --instance-groups 'InstanceGroupType=MASTER, InstanceCount=1, InstanceType=d2.xlarge, EbsConfiguration={EbsOptimized=true, EbsBlockDeviceConfigs=[{VolumeSpecification={VolumeType=io1, SizeInGB=100, Iops=100}},{VolumeSpecification={VolumeType=standard,SizeInGB=50},VolumesPerInstance=3}]}' InstanceGroupType=CORE,InstanceCount=2,InstanceType=d2.xlarge \
    --auto-terminate
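The additional EBS storage that each instance receives in the two commands above can be estimated by multiplying SizeInGB by VolumesPerInstance for every entry in EbsBlockDeviceConfigs. The sketch below does that arithmetic; the helper name ebs_total_gb and its SIZE:COUNT argument format are hypothetical, and it assumes that an entry without VolumesPerInstance contributes a single volume.

```shell
# Hypothetical helper: sums SIZE * COUNT over "SIZE:COUNT" pairs, one pair per
# EbsBlockDeviceConfigs entry.
ebs_total_gb() {
    total=0
    for pair in "$@"; do
        size=${pair%%:*}     # text before the colon: SizeInGB
        count=${pair##*:}    # text after the colon: VolumesPerInstance
        total=$(( total + size * count ))
    done
    echo "$total"
}

# Core group, first command: gp2 100 GB (count assumed 1), io1 100 GB x 4
core_gb=$(ebs_total_gb 100:1 100:4)
# Master group, second command: io1 100 GB (count assumed 1), standard 50 GB x 3
master_gb=$(ebs_total_gb 100:1 50:3)

echo "Additional EBS storage per core instance:   ${core_gb} GB"
echo "Additional EBS storage per master instance: ${master_gb} GB"
```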

Example 16: To create a cluster with an automatic scaling policy

You can attach automatic scaling policies to core and task instance groups using Amazon EMR version 4.0 and later. The automatic scaling policy dynamically adds and removes EC2 instances in response to an Amazon CloudWatch metric. For more information, see Using Automatic Scaling in Amazon EMR (https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-automatic-scaling.html) in the Amazon EMR Management Guide.

When attaching an automatic scaling policy, you must also specify the default role for automatic scaling using --auto-scaling-role EMR_AutoScaling_DefaultRole.

The following create-cluster example specifies the automatic scaling policy for the CORE instance group using the AutoScalingPolicy argument with an embedded JSON structure, which specifies the scaling policy configuration. Instance groups with an embedded JSON structure must have the entire collection of arguments enclosed in single quotes. Using single quotes is optional for instance groups without an embedded JSON structure.


aws emr create-cluster \
    --release-label emr-5.9.0 \
    --use-default-roles --auto-scaling-role EMR_AutoScaling_DefaultRole \
    --instance-groups InstanceGroupType=MASTER,InstanceType=d2.xlarge,InstanceCount=1 'InstanceGroupType=CORE,InstanceType=d2.xlarge,InstanceCount=2,AutoScalingPolicy={Constraints={MinCapacity=1,MaxCapacity=5},Rules=[{Name=TestRule,Description=TestDescription,Action={Market=ON_DEMAND,SimpleScalingPolicyConfiguration={AdjustmentType=EXACT_CAPACITY,ScalingAdjustment=2}},Trigger={CloudWatchAlarmDefinition={ComparisonOperator=GREATER_THAN,EvaluationPeriods=5,MetricName=TestMetric,Namespace=EMR,Period=3,Statistic=MAXIMUM,Threshold=4.5,Unit=NONE,Dimensions=[{Key=TestKey,Value=TestValue}]}}}]}'

The following example uses a JSON file, instancegroupconfig.json, to specify the configuration of all instance groups in a cluster. The JSON file specifies the automatic scaling policy configuration for the core instance group.


aws emr create-cluster \
    --release-label emr-5.9.0 \
    --service-role EMR_DefaultRole \
    --ec2-attributes InstanceProfile=EMR_EC2_DefaultRole \
    --instance-groups file://myfolder/instancegroupconfig.json \
    --auto-scaling-role EMR_AutoScaling_DefaultRole

Contents of instancegroupconfig.json.


[
    {
        "InstanceCount": 1,
        "Name": "MyMasterIG",
        "InstanceGroupType": "MASTER",
        "InstanceType": "m4.large"
    },
    {
        "InstanceCount": 2,
        "Name": "MyCoreIG",
        "InstanceGroupType": "CORE",
        "InstanceType": "m4.large",
        "AutoScalingPolicy": {
            "Constraints": {
                "MinCapacity": 2,
                "MaxCapacity": 10
            },
            "Rules": [
                {
                    "Name": "Default-scale-out",
                    "Description": "Replicates the default scale-out rule in the console for YARN memory.",
                    "Action": {
                        "SimpleScalingPolicyConfiguration": {
                            "AdjustmentType": "CHANGE_IN_CAPACITY",
                            "ScalingAdjustment": 1,
                            "CoolDown": 300
                        }
                    },
                    "Trigger": {
                        "CloudWatchAlarmDefinition": {
                            "ComparisonOperator": "LESS_THAN",
                            "EvaluationPeriods": 1,
                            "MetricName": "YARNMemoryAvailablePercentage",
                            "Namespace": "AWS/ElasticMapReduce",
                            "Period": 300,
                            "Threshold": 15,
                            "Statistic": "AVERAGE",
                            "Unit": "PERCENT",
                            "Dimensions": [
                                {
                                    "Key": "JobFlowId",
                                    "Value": "${emr.clusterId}"
                                }
                            ]
                        }
                    }
                }
            ]
        }
    }
]
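The rule in the file above adds one instance (CHANGE_IN_CAPACITY with ScalingAdjustment 1) when YARNMemoryAvailablePercentage averages below 15 percent over one 300-second period, and the Constraints block bounds the group between 2 and 10 instances. The sketch below models only that bounding arithmetic. It is an illustration, not Amazon EMR's implementation, and the helper name next_capacity is hypothetical.

```shell
# Hypothetical helper: applies a CHANGE_IN_CAPACITY adjustment and keeps the
# result within MinCapacity..MaxCapacity.
# Usage: next_capacity CURRENT ADJUSTMENT MIN MAX
next_capacity() {
    next=$(( $1 + $2 ))
    if [ "$next" -lt "$3" ]; then
        next=$3
    fi
    if [ "$next" -gt "$4" ]; then
        next=$4
    fi
    echo "$next"
}

# Core group starts at InstanceCount 2 with MinCapacity=2 and MaxCapacity=10.
after_first_trigger=$(next_capacity 2 1 2 10)
at_ceiling=$(next_capacity 10 1 2 10)

echo "After one scale-out from 2 instances: $after_first_trigger"
echo "Scale-out attempted at 10 instances:  $at_ceiling"
```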

Example 17: To add custom JAR steps when creating a cluster

The following create-cluster example adds steps by specifying a JAR file stored in Amazon S3. Steps submit work to a cluster. The main function defined in the JAR file runs after the EC2 instances are provisioned, any bootstrap actions have run, and the applications are installed. The steps are specified using Type=CUSTOM_JAR.

Custom JAR steps require the Jar= parameter, which specifies the path and file name of the JAR. Optional parameters are Type, Name, ActionOnFailure, Args, and MainClass. If the main class is not specified, the JAR file must specify Main-Class in its manifest file.


aws emr create-cluster \
    --steps Type=CUSTOM_JAR,Name=CustomJAR,ActionOnFailure=CONTINUE,Jar=s3://amzn-s3-demo-bucket/mytest.jar,Args=arg1,arg2,arg3 Type=CUSTOM_JAR,Name=CustomJAR,ActionOnFailure=CONTINUE,Jar=s3://amzn-s3-demo-bucket/mytest.jar,MainClass=mymainclass,Args=arg1,arg2,arg3  \
    --release-label emr-5.3.1 \
    --instance-groups InstanceGroupType=MASTER,InstanceCount=1,InstanceType=m4.large InstanceGroupType=CORE,InstanceCount=2,InstanceType=m4.large \
    --auto-terminate
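
As a rough illustration of how the two step definitions above differ, the following shell sketch assembles the --steps shorthand from variables and prints the resulting command instead of running it. The function build_custom_jar_step and the variable names are hypothetical helpers, not part of the AWS CLI. The bucket, JAR, and class names are the placeholders from the example, and Args is omitted for brevity.

```shell
#!/bin/sh
# Hypothetical helper (not part of the AWS CLI): builds the shorthand for one
# CUSTOM_JAR step. Jar= is required; MainClass= is appended only when given,
# because the JAR manifest may already declare Main-Class.
build_custom_jar_step() {
    # $1 = step name, $2 = S3 path of the JAR, $3 = optional main class
    step="Type=CUSTOM_JAR,Name=$1,ActionOnFailure=CONTINUE,Jar=$2"
    if [ -n "$3" ]; then
        step="$step,MainClass=$3"
    fi
    printf '%s\n' "$step"
}

# Placeholder values taken from the example above.
JAR_PATH="s3://amzn-s3-demo-bucket/mytest.jar"
STEP_FROM_MANIFEST=$(build_custom_jar_step CustomJAR "$JAR_PATH" "")
STEP_WITH_CLASS=$(build_custom_jar_step CustomJAR "$JAR_PATH" mymainclass)

# Dry run: print the command for review instead of executing it.
echo aws emr create-cluster \
    --steps "$STEP_FROM_MANIFEST" "$STEP_WITH_CLASS" \
    --release-label emr-5.3.1 \
    --instance-type m4.large \
    --instance-count 2 \
    --auto-terminate
```

Printing the command first makes it easier to confirm that each step is passed as a separate, space-delimited argument before any cluster is launched.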

Example 18: To add streaming steps when creating a cluster

The following create-cluster examples add a streaming step to a cluster that terminates after all steps run. Streaming steps require the Type and Args parameters. The optional parameters for streaming steps are Name and ActionOnFailure.

The following example specifies the step inline.


aws emr create-cluster \
    --steps Type=STREAMING,Name='Streaming Program',ActionOnFailure=CONTINUE,Args=[-files,s3://elasticmapreduce/samples/wordcount/wordSplitter.py,-mapper,wordSplitter.py,-reducer,aggregate,-input,s3://elasticmapreduce/samples/wordcount/input,-output,s3://amzn-s3-demo-bucket/wordcount/output] \
    --release-label emr-5.3.1 \
    --instance-groups InstanceGroupType=MASTER,InstanceCount=1,InstanceType=m4.large InstanceGroupType=CORE,InstanceCount=2,InstanceType=m4.large \
    --auto-terminate

The following example uses a locally stored JSON configuration file named multiplefiles.json. The JSON configuration specifies multiple files. To specify multiple files within a step, you must use a JSON configuration file to specify the step. JSON arguments must include options and values as separate items in the list.


aws emr create-cluster \
    --steps file://./multiplefiles.json \
    --release-label emr-5.9.0  \
    --instance-groups InstanceGroupType=MASTER,InstanceCount=1,InstanceType=m4.large InstanceGroupType=CORE,InstanceCount=2,InstanceType=m4.large \
    --auto-terminate

Contents of multiplefiles.json:


[
    {
        "Name": "JSON Streaming Step",
        "Args": [
            "-files",
            "s3://elasticmapreduce/samples/wordcount/wordSplitter.py",
            "-mapper",
            "wordSplitter.py",
            "-reducer",
            "aggregate",
            "-input",
            "s3://elasticmapreduce/samples/wordcount/input",
            "-output",
            "s3://amzn-s3-demo-bucket/wordcount/output"
        ],
        "ActionOnFailure": "CONTINUE",
        "Type": "STREAMING"
    }
]
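
To make the "separate items" rule concrete, the sketch below turns a list of shell arguments into a JSON array in which every option and every value is its own element. The function args_to_json_array is a hypothetical helper, not an AWS CLI feature, and it assumes the arguments contain no double quotes or backslashes (which would need JSON escaping).

```shell
#!/bin/sh
# Hypothetical helper: prints its arguments as a JSON array, one element per
# argument, matching the layout of "Args" in multiplefiles.json.
args_to_json_array() {
    first=1
    printf '['
    for arg in "$@"; do
        if [ "$first" -eq 1 ]; then
            first=0
        else
            printf ', '
        fi
        printf '"%s"' "$arg"
    done
    printf ']\n'
}

# The option and its value are passed, and emitted, as two separate items.
args_to_json_array -mapper wordSplitter.py
```

The output can be pasted into the "Args" field of a step definition such as the one in multiplefiles.json.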

Example 19: To add Hive steps when creating a cluster

The following example adds Hive steps when creating a cluster. Hive steps require the Type and Args parameters. The optional parameters for Hive steps are Name and ActionOnFailure.


aws emr create-cluster \
    --steps Type=HIVE,Name='Hive program',ActionOnFailure=CONTINUE,Args=[-f,s3://elasticmapreduce/samples/hive-ads/libs/model-build.q,-d,INPUT=s3://elasticmapreduce/samples/hive-ads/tables,-d,OUTPUT=s3://amzn-s3-demo-bucket/hive-ads/output/2014-04-18/11-07-32,-d,LIBS=s3://elasticmapreduce/samples/hive-ads/libs] \
    --applications Name=Hive \
    --release-label emr-5.3.1 \
    --instance-groups InstanceGroupType=MASTER,InstanceCount=1,InstanceType=m4.large InstanceGroupType=CORE,InstanceCount=2,InstanceType=m4.large

Example 20: To add Pig steps when creating a cluster

The following example adds Pig steps when creating a cluster. Pig steps require the Type and Args parameters. The optional parameters for Pig steps are Name and ActionOnFailure.


aws emr create-cluster \
    --steps Type=PIG,Name='Pig program',ActionOnFailure=CONTINUE,Args=[-f,s3://elasticmapreduce/samples/pig-apache/do-reports2.pig,-p,INPUT=s3://elasticmapreduce/samples/pig-apache/input,-p,OUTPUT=s3://amzn-s3-demo-bucket/pig-apache/output] \
    --applications Name=Pig \
    --release-label emr-5.3.1 \
    --instance-groups InstanceGroupType=MASTER,InstanceCount=1,InstanceType=m4.large InstanceGroupType=CORE,InstanceCount=2,InstanceType=m4.large

Example 21: To add bootstrap actions

The following create-cluster example runs two bootstrap actions defined as scripts that are stored in Amazon S3.


aws emr create-cluster \
    --bootstrap-actions Path=s3://amzn-s3-demo-bucket/myscript1,Name=BootstrapAction1,Args=[arg1,arg2] Path=s3://amzn-s3-demo-bucket/myscript2,Name=BootstrapAction2,Args=[arg1,arg2] \
    --release-label emr-5.3.1 \
    --instance-groups InstanceGroupType=MASTER,InstanceCount=1,InstanceType=m4.large InstanceGroupType=CORE,InstanceCount=2,InstanceType=m4.large \
    --auto-terminate

Example 22: To enable EMRFS consistent view and customize the RetryCount and RetryPeriod settings

The following create-cluster example specifies the retry count and retry period for EMRFS consistent view. The Consistent=true argument is required.


aws emr create-cluster \
    --instance-type m4.large \
    --release-label emr-5.9.0 \
    --emrfs Consistent=true,RetryCount=6,RetryPeriod=30

The following example specifies the same EMRFS configuration as the previous example, using a locally stored JSON configuration file named emrfsconfig.json.


aws emr create-cluster \
    --instance-type m4.large \
    --release-label emr-5.9.0 \
    --emrfs file://emrfsconfig.json

Contents of emrfsconfig.json:


{
    "Consistent": true,
    "RetryCount": 6,
    "RetryPeriod": 30
}

Example 23: To create a cluster with Kerberos configured

The following create-cluster examples create a cluster using a security configuration with Kerberos enabled, and establish the Kerberos parameters for the cluster using --kerberos-attributes.

The following command specifies the Kerberos attributes for the cluster inline.


aws emr create-cluster \
    --instance-type m3.xlarge \
    --release-label emr-5.10.0 \
    --service-role EMR_DefaultRole \
    --ec2-attributes InstanceProfile=EMR_EC2_DefaultRole \
    --security-configuration mySecurityConfiguration \
    --kerberos-attributes Realm=EC2.INTERNAL,KdcAdminPassword=123,CrossRealmTrustPrincipalPassword=123

The following command specifies the same attributes, but references a locally stored JSON file named kerberos_attributes.json. In this example, the file is saved in the same directory where you run the command. You can also reference a configuration file saved in Amazon S3.


aws emr create-cluster \
    --instance-type m3.xlarge \
    --release-label emr-5.10.0 \
    --service-role EMR_DefaultRole \
    --ec2-attributes InstanceProfile=EMR_EC2_DefaultRole \
    --security-configuration mySecurityConfiguration \
    --kerberos-attributes file://kerberos_attributes.json

Contents of kerberos_attributes.json:


{
    "Realm": "EC2.INTERNAL",
    "KdcAdminPassword": "123",
    "CrossRealmTrustPrincipalPassword": "123"
}
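
The passwords in these examples are placeholders. One possible way to avoid typing real secrets on the command line is to generate kerberos_attributes.json from environment variables, as in the sketch below. The variable names KDC_ADMIN_PASSWORD and CROSS_REALM_PASSWORD are hypothetical, and the sketch assumes that the passwords contain no double quotes or backslashes.

```shell
#!/bin/sh
# Hypothetical variable names; export real values in your environment.
# The fallbacks are placeholders so that the sketch runs as written.
KDC_ADMIN_PASSWORD="${KDC_ADMIN_PASSWORD:-replace-me}"
CROSS_REALM_PASSWORD="${CROSS_REALM_PASSWORD:-replace-me}"

# Files created from here on are readable only by the current user,
# because the generated file holds secrets.
umask 077

# Assumption: the passwords need no JSON escaping (no " or \ characters).
printf '{\n    "Realm": "EC2.INTERNAL",\n    "KdcAdminPassword": "%s",\n    "CrossRealmTrustPrincipalPassword": "%s"\n}\n' \
    "$KDC_ADMIN_PASSWORD" "$CROSS_REALM_PASSWORD" > kerberos_attributes.json
```

The generated file can then be passed with --kerberos-attributes file://kerberos_attributes.json, as in the command above, and deleted once the cluster has been created.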

The following create-cluster example creates an Amazon EMR cluster that uses the --instance-groups configuration and has a managed scaling policy.


aws emr create-cluster \
    --release-label emr-5.30.0 \
    --service-role EMR_DefaultRole \
    --ec2-attributes InstanceProfile=EMR_EC2_DefaultRole \
    --instance-groups InstanceGroupType=MASTER,InstanceCount=1,InstanceType=m4.large InstanceGroupType=CORE,InstanceCount=2,InstanceType=m4.large \
    --managed-scaling-policy ComputeLimits='{MinimumCapacityUnits=2,MaximumCapacityUnits=4,UnitType=Instances}'

The following create-cluster example creates an Amazon EMR cluster that uses --log-encryption-kms-key-id to define the KMS key ID used for log encryption.


aws emr create-cluster \
    --release-label emr-5.30.0 \
    --log-uri s3://amzn-s3-demo-bucket/myLog \
    --log-encryption-kms-key-id arn:aws:kms:us-east-1:110302272565:key/dd559181-283e-45d7-99d1-66da348c4d33 \
    --instance-groups InstanceGroupType=MASTER,InstanceCount=1,InstanceType=m4.large InstanceGroupType=CORE,InstanceCount=2,InstanceType=m4.large

The following create-cluster example creates an Amazon EMR cluster that uses the --placement-group-configs configuration to place the master nodes of a high-availability (HA) cluster within an EC2 placement group using the SPREAD placement strategy.


aws emr create-cluster \
    --release-label emr-5.30.0 \
    --service-role EMR_DefaultRole \
    --ec2-attributes InstanceProfile=EMR_EC2_DefaultRole \
    --instance-groups InstanceGroupType=MASTER,InstanceCount=3,InstanceType=m4.large InstanceGroupType=CORE,InstanceCount=1,InstanceType=m4.large \
    --placement-group-configs InstanceRole=MASTER

The following create-cluster example creates an Amazon EMR cluster that uses the --auto-termination-policy configuration to set an automatic idle-termination threshold for the cluster.


aws emr create-cluster \
    --release-label emr-5.34.0 \
    --service-role EMR_DefaultRole \
    --ec2-attributes InstanceProfile=EMR_EC2_DefaultRole \
    --instance-groups InstanceGroupType=MASTER,InstanceCount=1,InstanceType=m4.large InstanceGroupType=CORE,InstanceCount=1,InstanceType=m4.large \
    --auto-termination-policy IdleTimeout=100
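
IdleTimeout is expressed in seconds, and the service accepts values from 60 to 604800 (seven days). The sketch below converts a value in minutes to seconds and checks it against that range before it is used in the policy. The function idle_timeout_seconds is a hypothetical helper, not part of the AWS CLI.

```shell
#!/bin/sh
# Hypothetical helper: converts whole minutes to seconds and rejects values
# outside the 60 to 604800 second range accepted for IdleTimeout.
idle_timeout_seconds() {
    # $1 = idle time in whole minutes
    seconds=$(( $1 * 60 ))
    if [ "$seconds" -lt 60 ] || [ "$seconds" -gt 604800 ]; then
        echo "IdleTimeout must be between 60 and 604800 seconds" >&2
        return 1
    fi
    echo "$seconds"
}

# Dry run: print the option for a five-minute idle threshold.
echo --auto-termination-policy "IdleTimeout=$(idle_timeout_seconds 5)"
```

With this helper, the IdleTimeout=100 used in the example corresponds to 100 seconds, not 100 minutes.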

The following create-cluster example creates an Amazon EMR cluster that uses --os-release-label to define an Amazon Linux release for the cluster launch.


aws emr create-cluster \
    --release-label emr-6.6.0 \
    --os-release-label 2.0.20220406.1 \
    --service-role EMR_DefaultRole \
    --ec2-attributes InstanceProfile=EMR_EC2_DefaultRole \
    --instance-groups InstanceGroupType=MASTER,InstanceCount=1,InstanceType=m4.large InstanceGroupType=CORE,InstanceCount=1,InstanceType=m4.large

Example 24: To specify EBS root volume attributes (size, IOPS, and throughput) for cluster instances created with EMR releases 6.15.0 and later

The following create-cluster example creates an Amazon EMR cluster that uses root volume attributes to configure the root volume specifications for the EC2 instances.


aws emr create-cluster \
    --name "Cluster with My Custom AMI" \
    --custom-ami-id ami-a518e6df \
    --ebs-root-volume-size 20 \
    --ebs-root-volume-iops 3000 \
    --ebs-root-volume-throughput 125 \
    --release-label emr-6.15.0 \
    --use-default-roles \
    --instance-count 2 \
    --instance-type m4.large

For API details, see CreateClusterExamples in AWS CLI Command Reference.

create-cluster-examples

The following code example shows how to use create-cluster-examples.

AWS CLI

Example 1: To create a cluster

The following create-cluster example creates a simple EMR cluster.


aws emr create-cluster \
    --release-label emr-5.14.0 \
    --instance-type m4.large \
    --instance-count 2

This command produces no output.

Example 2: To create an Amazon EMR cluster with the default ServiceRole and InstanceProfile roles

The following create-cluster example creates an Amazon EMR cluster that uses the --instance-groups configuration.


aws emr create-cluster \
    --release-label emr-5.14.0 \
    --service-role EMR_DefaultRole \
    --ec2-attributes InstanceProfile=EMR_EC2_DefaultRole \
    --instance-groups InstanceGroupType=MASTER,InstanceCount=1,InstanceType=m4.large InstanceGroupType=CORE,InstanceCount=2,InstanceType=m4.large

Example 3: To create an Amazon EMR cluster that uses an instance fleet

The following create-cluster example creates an Amazon EMR cluster that uses the --instance-fleets configuration, specifying two instance types for each fleet and two EC2 subnets.


aws emr create-cluster \
    --release-label emr-5.14.0 \
    --service-role EMR_DefaultRole \
    --ec2-attributes InstanceProfile=EMR_EC2_DefaultRole,SubnetIds=['subnet-ab12345c','subnet-de67890f'] \
    --instance-fleets InstanceFleetType=MASTER,TargetOnDemandCapacity=1,InstanceTypeConfigs=['{InstanceType=m4.large}'] InstanceFleetType=CORE,TargetSpotCapacity=11,InstanceTypeConfigs=['{InstanceType=m4.large,BidPrice=0.5,WeightedCapacity=3}','{InstanceType=m4.2xlarge,BidPrice=0.9,WeightedCapacity=5}'],LaunchSpecifications={SpotSpecification='{TimeoutDurationMinutes=120,TimeoutAction=SWITCH_TO_ON_DEMAND}'}
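
The core fleet above targets 11 units of Spot capacity, with each m4.large counting as 3 units and each m4.2xlarge as 5. As a back-of-the-envelope sketch, and assuming the fleet were filled with a single instance type and that provisioning continues until the target is met or exceeded, the number of instances needed can be estimated with a ceiling division. The function instances_needed is a hypothetical helper.

```shell
#!/bin/sh
# Hypothetical helper: ceiling division of the target capacity by the
# WeightedCapacity of one instance.
instances_needed() {
    # $1 = target capacity in units, $2 = WeightedCapacity per instance
    echo $(( ($1 + $2 - 1) / $2 ))
}

instances_needed 11 3   # m4.large only: prints 4 (12 units, 1 over target)
instances_needed 11 5   # m4.2xlarge only: prints 3 (15 units, 4 over target)
```

In practice the fleet may mix both instance types, so these figures are only upper bounds for each type taken alone.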

Example 4: To create a cluster with default roles

The following create-cluster example uses the --use-default-roles parameter to specify the default service role and instance profile.


aws emr create-cluster \
    --release-label emr-5.9.0 \
    --use-default-roles \
    --instance-groups InstanceGroupType=MASTER,InstanceCount=1,InstanceType=m4.large InstanceGroupType=CORE,InstanceCount=2,InstanceType=m4.large \
    --auto-terminate

Example 5: To create a cluster and specify the applications to install

The following create-cluster example uses the --applications parameter to specify the applications that Amazon EMR installs. This example installs Hadoop, Hive, and Pig.


aws emr create-cluster \
    --applications Name=Hadoop Name=Hive Name=Pig \
    --release-label emr-5.9.0 \
    --instance-groups InstanceGroupType=MASTER,InstanceCount=1,InstanceType=m4.large InstanceGroupType=CORE,InstanceCount=2,InstanceType=m4.large \
    --auto-terminate

Example 6: To create a cluster that includes Spark

The following example installs Spark.


aws emr create-cluster \
    --release-label emr-5.9.0 \
    --applications Name=Spark \
    --ec2-attributes KeyName=myKey \
    --instance-groups InstanceGroupType=MASTER,InstanceCount=1,InstanceType=m4.large InstanceGroupType=CORE,InstanceCount=2,InstanceType=m4.large \
    --auto-terminate

Example 7: To specify a custom AMI to use for cluster instances

The following create-cluster example creates a cluster instance based on the Amazon Linux AMI with ID ami-a518e6df.


aws emr create-cluster \
    --name "Cluster with My Custom AMI" \
    --custom-ami-id ami-a518e6df \
    --ebs-root-volume-size 20 \
    --release-label emr-5.9.0 \
    --use-default-roles \
    --instance-count 2 \
    --instance-type m4.large

Example 8: To customize application configurations

Contents of configurations.json:


[
    {
       "Classification": "mapred-site",
       "Properties": {
           "mapred.tasktracker.map.tasks.maximum": 2
       }
    },
    {
        "Classification": "hadoop-env",
        "Properties": {},
        "Configurations": [
            {
                "Classification": "export",
                "Properties": {
                    "HADOOP_DATANODE_HEAPSIZE": 2048,
                    "HADOOP_NAMENODE_OPTS": "-XX:GCTimeRatio=19"
                }
            }
        ]
    }
]

The following example references configurations.json as a local file.


aws emr create-cluster \
    --configurations file://configurations.json \
    --release-label emr-5.9.0 \
    --instance-groups InstanceGroupType=MASTER,InstanceCount=1,InstanceType=m4.large InstanceGroupType=CORE,InstanceCount=2,InstanceType=m4.large \
    --auto-terminate

The following example references configurations.json as a file in Amazon S3.


aws emr create-cluster \
    --configurations https://s3.amazonaws.com/amzn-s3-demo-bucket/configurations.json \
    --release-label emr-5.9.0 \
    --instance-groups InstanceGroupType=MASTER,InstanceCount=1,InstanceType=m4.large InstanceGroupType=CORE,InstanceCount=2,InstanceType=m4.large \
    --auto-terminate

Example 9: To create a cluster with master, core, and task instance groups

The following create-cluster example uses --instance-groups to specify the type and number of EC2 instances to use for the master, core, and task instance groups.


aws emr create-cluster \
    --release-label emr-5.9.0 \
    --instance-groups Name=Master,InstanceGroupType=MASTER,InstanceType=m4.large,InstanceCount=1 Name=Core,InstanceGroupType=CORE,InstanceType=m4.large,InstanceCount=2 Name=Task,InstanceGroupType=TASK,InstanceType=m4.large,InstanceCount=2

Example 10: To specify that a cluster should terminate after completing all steps

The following create-cluster example uses --auto-terminate to specify that the cluster should shut down automatically after completing all steps.


aws emr create-cluster \
    --release-label emr-5.9.0 \
    --instance-groups InstanceGroupType=MASTER,InstanceCount=1,InstanceType=m4.large  InstanceGroupType=CORE,InstanceCount=2,InstanceType=m4.large \
    --auto-terminate

Example 11: To specify cluster configuration details such as the Amazon EC2 key pair, network configuration, and security groups


aws emr create-cluster \
    --ec2-attributes KeyName=myKey,InstanceProfile=myProfile \
    --release-label emr-5.9.0 \
    --instance-groups InstanceGroupType=MASTER,InstanceCount=1,InstanceType=m4.large InstanceGroupType=CORE,InstanceCount=2,InstanceType=m4.large \
    --auto-terminate

The following example creates a cluster in an Amazon VPC subnet.


aws emr create-cluster \
    --ec2-attributes SubnetId=subnet-xxxxx \
    --release-label emr-5.9.0 \
    --instance-groups InstanceGroupType=MASTER,InstanceCount=1,InstanceType=m4.large InstanceGroupType=CORE,InstanceCount=2,InstanceType=m4.large \
    --auto-terminate

The following example creates a cluster in the us-east-1b Availability Zone.


aws emr create-cluster \
    --ec2-attributes AvailabilityZone=us-east-1b \
    --release-label emr-5.9.0 \
    --instance-groups InstanceGroupType=MASTER,InstanceCount=1,InstanceType=m4.large InstanceGroupType=CORE,InstanceCount=2,InstanceType=m4.large

The following example creates a cluster and specifies only the Amazon EMR-managed security groups.


aws emr create-cluster \
    --release-label emr-5.9.0 \
    --service-role myServiceRole \
    --ec2-attributes InstanceProfile=myRole,EmrManagedMasterSecurityGroup=sg-master1,EmrManagedSlaveSecurityGroup=sg-slave1 \
    --instance-groups InstanceGroupType=MASTER,InstanceCount=1,InstanceType=m4.large InstanceGroupType=CORE,InstanceCount=2,InstanceType=m4.large

The following example creates a cluster and specifies only additional Amazon EC2 security groups.


aws emr create-cluster \
    --release-label emr-5.9.0 \
    --service-role myServiceRole \
    --ec2-attributes InstanceProfile=myRole,AdditionalMasterSecurityGroups=[sg-addMaster1,sg-addMaster2,sg-addMaster3,sg-addMaster4],AdditionalSlaveSecurityGroups=[sg-addSlave1,sg-addSlave2,sg-addSlave3,sg-addSlave4] \
    --instance-groups InstanceGroupType=MASTER,InstanceCount=1,InstanceType=m4.large InstanceGroupType=CORE,InstanceCount=2,InstanceType=m4.large

The following example creates a cluster and specifies the EMR-managed security groups, as well as additional security groups.


aws emr create-cluster \
    --release-label emr-5.9.0 \
    --service-role myServiceRole \
    --ec2-attributes InstanceProfile=myRole,EmrManagedMasterSecurityGroup=sg-master1,EmrManagedSlaveSecurityGroup=sg-slave1,AdditionalMasterSecurityGroups=[sg-addMaster1,sg-addMaster2,sg-addMaster3,sg-addMaster4],AdditionalSlaveSecurityGroups=[sg-addSlave1,sg-addSlave2,sg-addSlave3,sg-addSlave4] \
    --instance-groups InstanceGroupType=MASTER,InstanceCount=1,InstanceType=m4.large InstanceGroupType=CORE,InstanceCount=2,InstanceType=m4.large


aws emr create-cluster \
    --release-label emr-5.9.0 \
    --service-role myServiceRole \
    --ec2-attributes InstanceProfile=myRole,ServiceAccessSecurityGroup=sg-service-access,EmrManagedMasterSecurityGroup=sg-master,EmrManagedSlaveSecurityGroup=sg-slave \
    --instance-groups InstanceGroupType=MASTER,InstanceCount=1,InstanceType=m4.large InstanceGroupType=CORE,InstanceCount=2,InstanceType=m4.large


aws emr create-cluster \
    --release-label emr-5.9.0 \
    --service-role myServiceRole \
    --ec2-attributes file://ec2_attributes.json  \
    --instance-groups InstanceGroupType=MASTER,InstanceCount=1,InstanceType=m4.large InstanceGroupType=CORE,InstanceCount=2,InstanceType=m4.large

Contents of ec2_attributes.json:


[
    {
        "SubnetId": "subnet-xxxxx",
        "KeyName": "myKey",
        "InstanceProfile":"myRole",
        "EmrManagedMasterSecurityGroup": "sg-master1",
        "EmrManagedSlaveSecurityGroup": "sg-slave1",
        "ServiceAccessSecurityGroup": "sg-service-access",
        "AdditionalMasterSecurityGroups": ["sg-addMaster1","sg-addMaster2","sg-addMaster3","sg-addMaster4"],
        "AdditionalSlaveSecurityGroups": ["sg-addSlave1","sg-addSlave2","sg-addSlave3","sg-addSlave4"]
    }
]

Example 12: To enable debugging and specify a log URI


aws emr create-cluster \
    --enable-debugging \
    --log-uri s3://amzn-s3-demo-bucket/myLog \
    --release-label emr-5.9.0 \
    --instance-groups InstanceGroupType=MASTER,InstanceCount=1,InstanceType=m4.large InstanceGroupType=CORE,InstanceCount=2,InstanceType=m4.large \
    --auto-terminate

Example 13: To add tags when creating a cluster


aws emr create-cluster \
    --tags name="Shirley Rodriguez" age=29 department="Analytics" \
    --release-label emr-5.32.0 \
    --instance-type m5.xlarge \
    --instance-count 3 \
    --use-default-roles

The following example lists the tags applied to a cluster.


aws emr describe-cluster \
    --cluster-id j-XXXXXXYY \
    --query Cluster.Tags

Example 14: To use a security configuration that enables encryption and other security features


aws emr create-cluster \
    --instance-type m4.large \
    --release-label emr-5.9.0 \
    --security-configuration mySecurityConfiguration

Example 15: To create a cluster with additional EBS storage volumes configured for the instance groups

When specifying additional EBS volumes, the following arguments are required if EbsBlockDeviceConfigs is specified: VolumeType and SizeInGB.

The following create-cluster example creates a cluster with multiple EBS volumes attached to the EC2 instances in the core instance group.


aws emr create-cluster \
    --release-label emr-5.9.0  \
    --use-default-roles \
    --instance-groups InstanceGroupType=MASTER,InstanceCount=1,InstanceType=d2.xlarge 'InstanceGroupType=CORE,InstanceCount=2,InstanceType=d2.xlarge,EbsConfiguration={EbsOptimized=true,EbsBlockDeviceConfigs=[{VolumeSpecification={VolumeType=gp2,SizeInGB=100}},{VolumeSpecification={VolumeType=io1,SizeInGB=100,Iops=100},VolumesPerInstance=4}]}' \
    --auto-terminate

The following example creates a cluster with multiple EBS volumes attached to the EC2 instances in the master instance group.


aws emr create-cluster \
    --release-label emr-5.9.0 \
    --use-default-roles \
    --instance-groups 'InstanceGroupType=MASTER, InstanceCount=1, InstanceType=d2.xlarge, EbsConfiguration={EbsOptimized=true, EbsBlockDeviceConfigs=[{VolumeSpecification={VolumeType=io1, SizeInGB=100, Iops=100}},{VolumeSpecification={VolumeType=standard,SizeInGB=50},VolumesPerInstance=3}]}' InstanceGroupType=CORE,InstanceCount=2,InstanceType=d2.xlarge \
    --auto-terminate

Example 16: To create a cluster with an automatic scaling policy


aws emr create-cluster \
    --release-label emr-5.9.0 \
    --use-default-roles --auto-scaling-role EMR_AutoScaling_DefaultRole \
    --instance-groups InstanceGroupType=MASTER,InstanceType=d2.xlarge,InstanceCount=1 'InstanceGroupType=CORE,InstanceType=d2.xlarge,InstanceCount=2,AutoScalingPolicy={Constraints={MinCapacity=1,MaxCapacity=5},Rules=[{Name=TestRule,Description=TestDescription,Action={Market=ON_DEMAND,SimpleScalingPolicyConfiguration={AdjustmentType=EXACT_CAPACITY,ScalingAdjustment=2}},Trigger={CloudWatchAlarmDefinition={ComparisonOperator=GREATER_THAN,EvaluationPeriods=5,MetricName=TestMetric,Namespace=EMR,Period=3,Statistic=MAXIMUM,Threshold=4.5,Unit=NONE,Dimensions=[{Key=TestKey,Value=TestValue}]}}}]}'


aws emr create-cluster \
    --release-label emr-5.9.0 \
    --service-role EMR_DefaultRole \
    --ec2-attributes InstanceProfile=EMR_EC2_DefaultRole \
    --instance-groups file://myfolder/instancegroupconfig.json \
    --auto-scaling-role EMR_AutoScaling_DefaultRole

Contents of instancegroupconfig.json:


[
    {
        "InstanceCount": 1,
        "Name": "MyMasterIG",
        "InstanceGroupType": "MASTER",
        "InstanceType": "m4.large"
    },
    {
        "InstanceCount": 2,
        "Name": "MyCoreIG",
        "InstanceGroupType": "CORE",
        "InstanceType": "m4.large",
        "AutoScalingPolicy": {
            "Constraints": {
                "MinCapacity": 2,
                "MaxCapacity": 10
            },
            "Rules": [
                {
                    "Name": "Default-scale-out",
                    "Description": "Replicates the default scale-out rule in the console for YARN memory.",
                    "Action": {
                        "SimpleScalingPolicyConfiguration": {
                            "AdjustmentType": "CHANGE_IN_CAPACITY",
                            "ScalingAdjustment": 1,
                            "CoolDown": 300
                        }
                    },
                    "Trigger": {
                        "CloudWatchAlarmDefinition": {
                            "ComparisonOperator": "LESS_THAN",
                            "EvaluationPeriods": 1,
                            "MetricName": "YARNMemoryAvailablePercentage",
                            "Namespace": "AWS/ElasticMapReduce",
                            "Period": 300,
                            "Threshold": 15,
                            "Statistic": "AVERAGE",
                            "Unit": "PERCENT",
                            "Dimensions": [
                                {
                                    "Key": "JobFlowId",
                                    "Value": "${emr.clusterId}"
                                }
                            ]
                        }
                    }
                }
            ]
        }
    }
]
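
The rule in instancegroupconfig.json adds one instance (CHANGE_IN_CAPACITY with ScalingAdjustment 1) when the alarm fires, and the Constraints block keeps the group between 2 and 10 instances. The sketch below models that arithmetic under the assumption that an adjustment is simply clamped to the MinCapacity and MaxCapacity limits. The function next_capacity is a hypothetical illustration and does not call any AWS service.

```shell
#!/bin/sh
# Hypothetical model of a CHANGE_IN_CAPACITY adjustment bounded by the
# policy constraints; not an AWS CLI or EMR function.
next_capacity() {
    # $1 = current instance count, $2 = ScalingAdjustment,
    # $3 = MinCapacity, $4 = MaxCapacity
    new=$(( $1 + $2 ))
    if [ "$new" -lt "$3" ]; then
        new=$3
    fi
    if [ "$new" -gt "$4" ]; then
        new=$4
    fi
    echo "$new"
}

next_capacity 2 1 2 10    # prints 3: one instance is added
next_capacity 10 1 2 10   # prints 10: already at MaxCapacity
```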

Example 17: To add custom JAR steps when creating a cluster


aws emr create-cluster \
    --steps Type=CUSTOM_JAR,Name=CustomJAR,ActionOnFailure=CONTINUE,Jar=s3://amzn-s3-demo-bucket/mytest.jar,Args=arg1,arg2,arg3 Type=CUSTOM_JAR,Name=CustomJAR,ActionOnFailure=CONTINUE,Jar=s3://amzn-s3-demo-bucket/mytest.jar,MainClass=mymainclass,Args=arg1,arg2,arg3  \
    --release-label emr-5.3.1 \
    --instance-groups InstanceGroupType=MASTER,InstanceCount=1,InstanceType=m4.large InstanceGroupType=CORE,InstanceCount=2,InstanceType=m4.large \
    --auto-terminate

Example 18: To add streaming steps when creating a cluster

The following example specifies the step inline.


aws emr create-cluster \
    --steps Type=STREAMING,Name='Streaming Program',ActionOnFailure=CONTINUE,Args=[-files,s3://elasticmapreduce/samples/wordcount/wordSplitter.py,-mapper,wordSplitter.py,-reducer,aggregate,-input,s3://elasticmapreduce/samples/wordcount/input,-output,s3://amzn-s3-demo-bucket/wordcount/output] \
    --release-label emr-5.3.1 \
    --instance-groups InstanceGroupType=MASTER,InstanceCount=1,InstanceType=m4.large InstanceGroupType=CORE,InstanceCount=2,InstanceType=m4.large \
    --auto-terminate


aws emr create-cluster \
    --steps file://./multiplefiles.json \
    --release-label emr-5.9.0  \
    --instance-groups InstanceGroupType=MASTER,InstanceCount=1,InstanceType=m4.large InstanceGroupType=CORE,InstanceCount=2,InstanceType=m4.large \
    --auto-terminate

Contents of multiplefiles.json:


[
    {
        "Name": "JSON Streaming Step",
        "Args": [
            "-files",
            "s3://elasticmapreduce/samples/wordcount/wordSplitter.py",
            "-mapper",
            "wordSplitter.py",
            "-reducer",
            "aggregate",
            "-input",
            "s3://elasticmapreduce/samples/wordcount/input",
            "-output",
            "s3://amzn-s3-demo-bucket/wordcount/output"
        ],
        "ActionOnFailure": "CONTINUE",
        "Type": "STREAMING"
    }
]

Example 19: To add Hive steps when creating a cluster


aws emr create-cluster \
    --steps Type=HIVE,Name='Hive program',ActionOnFailure=CONTINUE,Args=[-f,s3://elasticmapreduce/samples/hive-ads/libs/model-build.q,-d,INPUT=s3://elasticmapreduce/samples/hive-ads/tables,-d,OUTPUT=s3://amzn-s3-demo-bucket/hive-ads/output/2014-04-18/11-07-32,-d,LIBS=s3://elasticmapreduce/samples/hive-ads/libs] \
    --applications Name=Hive \
    --release-label emr-5.3.1 \
    --instance-groups InstanceGroupType=MASTER,InstanceCount=1,InstanceType=m4.large InstanceGroupType=CORE,InstanceCount=2,InstanceType=m4.large

Example 20: To add Pig steps when creating a cluster


aws emr create-cluster \
    --steps Type=PIG,Name='Pig program',ActionOnFailure=CONTINUE,Args=[-f,s3://elasticmapreduce/samples/pig-apache/do-reports2.pig,-p,INPUT=s3://elasticmapreduce/samples/pig-apache/input,-p,OUTPUT=s3://amzn-s3-demo-bucket/pig-apache/output] \
    --applications Name=Pig \
    --release-label emr-5.3.1 \
    --instance-groups InstanceGroupType=MASTER,InstanceCount=1,InstanceType=m4.large InstanceGroupType=CORE,InstanceCount=2,InstanceType=m4.large

Example 21: To add bootstrap actions

The following create-cluster example runs two bootstrap actions defined as scripts that are stored in Amazon S3.


aws emr create-cluster \
    --bootstrap-actions Path=s3://amzn-s3-demo-bucket/myscript1,Name=BootstrapAction1,Args=[arg1,arg2] Path=s3://amzn-s3-demo-bucket/myscript2,Name=BootstrapAction2,Args=[arg1,arg2] \
    --release-label emr-5.3.1 \
    --instance-groups InstanceGroupType=MASTER,InstanceCount=1,InstanceType=m4.large InstanceGroupType=CORE,InstanceCount=2,InstanceType=m4.large \
    --auto-terminate

Example 22: To enable EMRFS consistent view and customize the RetryCount and RetryPeriod settings

The following create-cluster example specifies the retry count and retry period for EMRFS consistent view. The Consistent=true argument is required.


aws emr create-cluster \
    --instance-type m4.large \
    --release-label emr-5.9.0 \
    --emrfs Consistent=true,RetryCount=6,RetryPeriod=30

The following example specifies the same EMRFS configuration as the previous example, using a locally stored JSON configuration file named emrfsconfig.json.


aws emr create-cluster \
    --instance-type m4.large \
    --release-label emr-5.9.0 \
    --emrfs file://emrfsconfig.json

Contents of emrfsconfig.json:


{
    "Consistent": true,
    "RetryCount": 6,
    "RetryPeriod": 30
}

Example 23: To create a cluster with Kerberos configured

The following command specifies the Kerberos attributes for the cluster inline.


aws emr create-cluster \
    --instance-type m3.xlarge \
    --release-label emr-5.10.0 \
    --service-role EMR_DefaultRole \
    --ec2-attributes InstanceProfile=EMR_EC2_DefaultRole \
    --security-configuration mySecurityConfiguration \
    --kerberos-attributes Realm=EC2.INTERNAL,KdcAdminPassword=123,CrossRealmTrustPrincipalPassword=123


aws emr create-cluster \
    --instance-type m3.xlarge \
    --release-label emr-5.10.0 \
    --service-role EMR_DefaultRole \
    --ec2-attributes InstanceProfile=EMR_EC2_DefaultRole \
    --security-configuration mySecurityConfiguration \
    --kerberos-attributes file://kerberos_attributes.json

Contents of kerberos_attributes.json:


{
    "Realm": "EC2.INTERNAL",
    "KdcAdminPassword": "123",
    "CrossRealmTrustPrincipalPassword": "123"
}

The following create-cluster example creates an Amazon EMR cluster that uses the --instance-groups configuration and has a managed scaling policy.


aws emr create-cluster \
    --release-label emr-5.30.0 \
    --service-role EMR_DefaultRole \
    --ec2-attributes InstanceProfile=EMR_EC2_DefaultRole \
    --instance-groups InstanceGroupType=MASTER,InstanceCount=1,InstanceType=m4.large InstanceGroupType=CORE,InstanceCount=2,InstanceType=m4.large \
    --managed-scaling-policy ComputeLimits='{MinimumCapacityUnits=2,MaximumCapacityUnits=4,UnitType=Instances}'

The following create-cluster example creates an Amazon EMR cluster that uses --log-encryption-kms-key-id to define the ID of the KMS key used for log encryption.


aws emr create-cluster \
    --release-label emr-5.30.0 \
    --log-uri s3://amzn-s3-demo-bucket/myLog \
    --log-encryption-kms-key-id arn:aws:kms:us-east-1:110302272565:key/dd559181-283e-45d7-99d1-66da348c4d33 \
    --instance-groups InstanceGroupType=MASTER,InstanceCount=1,InstanceType=m4.large InstanceGroupType=CORE,InstanceCount=2,InstanceType=m4.large

The following create-cluster example creates an Amazon EMR cluster that uses --placement-group-configs to apply a placement group configuration to the master nodes.


aws emr create-cluster \
    --release-label emr-5.30.0 \
    --service-role EMR_DefaultRole \
    --ec2-attributes InstanceProfile=EMR_EC2_DefaultRole \
    --instance-groups InstanceGroupType=MASTER,InstanceCount=3,InstanceType=m4.large InstanceGroupType=CORE,InstanceCount=1,InstanceType=m4.large \
    --placement-group-configs InstanceRole=MASTER

The following create-cluster example creates an Amazon EMR cluster that uses --auto-termination-policy to set an idle timeout after which the cluster terminates automatically.


aws emr create-cluster \
    --release-label emr-5.34.0 \
    --service-role EMR_DefaultRole \
    --ec2-attributes InstanceProfile=EMR_EC2_DefaultRole \
    --instance-groups InstanceGroupType=MASTER,InstanceCount=1,InstanceType=m4.large InstanceGroupType=CORE,InstanceCount=1,InstanceType=m4.large \
    --auto-termination-policy IdleTimeout=100

The following create-cluster example creates an Amazon EMR cluster that uses --os-release-label to define the Amazon Linux release used to launch the cluster.


aws emr create-cluster \
    --release-label emr-6.6.0 \
    --os-release-label 2.0.20220406.1 \
    --service-role EMR_DefaultRole \
    --ec2-attributes InstanceProfile=EMR_EC2_DefaultRole \
    --instance-groups InstanceGroupType=MASTER,InstanceCount=1,InstanceType=m4.large InstanceGroupType=CORE,InstanceCount=1,InstanceType=m4.large

Example 24: To specify EBS root volume attributes (size, IOPS, and throughput) for cluster instances created with EMR releases 6.15.0 and later

The following create-cluster example creates an Amazon EMR cluster that uses root volume attributes to configure the root volume specifications for the EC2 instances.


aws emr create-cluster \
    --name "Cluster with My Custom AMI" \
    --custom-ami-id ami-a518e6df \
    --ebs-root-volume-size 20 \
    --ebs-root-volume-iops 3000 \
    --ebs-root-volume-throughput 125 \
    --release-label emr-6.15.0 \
    --use-default-roles \
    --instance-count 2 \
    --instance-type m4.large
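A create-cluster call returns the ID of the new cluster in its output. The sketch below is an illustration rather than part of the original examples: a hypothetical launch_and_wait helper forwards its arguments to create-cluster, reduces the response to the cluster ID with the global --query and --output options, and then blocks on the cluster-running waiter.

```shell
set -eu

# launch_and_wait is a hypothetical helper, not an AWS CLI command.
# All arguments are passed through to "aws emr create-cluster".
launch_and_wait() {
    # Reduce the JSON response to the bare cluster ID.
    cluster_id="$(aws emr create-cluster "$@" --query 'ClusterId' --output text)"
    echo "Created cluster: ${cluster_id}"

    # Block until the waiter reports that the cluster is running,
    # or fail if the waiter gives up.
    aws emr wait cluster-running --cluster-id "$cluster_id"
    echo "Cluster is up: ${cluster_id}"
}

# Example call (commented out so that nothing is launched by accident):
# launch_and_wait --release-label emr-6.15.0 --use-default-roles \
#     --instance-count 2 --instance-type m4.large
```

Because of set -e, a failed create-cluster call stops the script before the waiter is started.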

For API details, see CreateCluster in AWS CLI Command Reference.

The following code example shows how to use create-default-roles.

AWS CLI

1. To create the default IAM role for EC2

Command:


aws emr create-default-roles

Output:


If the role already exists then the command returns nothing.

If the role does not exist then the output will be:

[
    {
        "RolePolicy": {
            "Version": "2012-10-17",
            "Statement": [
                {
                    "Action": [
                        "cloudwatch:*",
                        "dynamodb:*",
                        "ec2:Describe*",
                        "elasticmapreduce:Describe*",
                        "elasticmapreduce:ListBootstrapActions",
                        "elasticmapreduce:ListClusters",
                        "elasticmapreduce:ListInstanceGroups",
                        "elasticmapreduce:ListInstances",
                        "elasticmapreduce:ListSteps",
                        "kinesis:CreateStream",
                        "kinesis:DeleteStream",
                        "kinesis:DescribeStream",
                        "kinesis:GetRecords",
                        "kinesis:GetShardIterator",
                        "kinesis:MergeShards",
                        "kinesis:PutRecord",
                        "kinesis:SplitShard",
                        "rds:Describe*",
                        "s3:*",
                        "sdb:*",
                        "sns:*",
                        "sqs:*"
                    ],
                    "Resource": "*",
                    "Effect": "Allow"
                }
            ]
        },
        "Role": {
            "AssumeRolePolicyDocument": {
                "Version": "2008-10-17",
                "Statement": [
                    {
                        "Action": "sts:AssumeRole",
                        "Sid": "",
                        "Effect": "Allow",
                        "Principal": {
                            "Service": "ec2.amazonaws.com"
                        }
                    }
                ]
            },
            "RoleId": "AROAIQ5SIQUGL5KMYBJX6",
            "CreateDate": "2015-06-09T17:09:04.602Z",
            "RoleName": "EMR_EC2_DefaultRole",
            "Path": "/",
            "Arn": "arn:aws:iam::176430881729:role/EMR_EC2_DefaultRole"
        }
    },
    {
        "RolePolicy": {
            "Version": "2012-10-17",
            "Statement": [
                {
                    "Action": [
                        "ec2:AuthorizeSecurityGroupIngress",
                        "ec2:CancelSpotInstanceRequests",
                        "ec2:CreateSecurityGroup",
                        "ec2:CreateTags",
                        "ec2:DeleteTags",
                        "ec2:DescribeAvailabilityZones",
                        "ec2:DescribeAccountAttributes",
                        "ec2:DescribeInstances",
                        "ec2:DescribeInstanceStatus",
                        "ec2:DescribeKeyPairs",
                        "ec2:DescribePrefixLists",
                        "ec2:DescribeRouteTables",
                        "ec2:DescribeSecurityGroups",
                        "ec2:DescribeSpotInstanceRequests",
                        "ec2:DescribeSpotPriceHistory",
                        "ec2:DescribeSubnets",
                        "ec2:DescribeVpcAttribute",
                        "ec2:DescribeVpcEndpoints",
                        "ec2:DescribeVpcEndpointServices",
                        "ec2:DescribeVpcs",
                        "ec2:ModifyImageAttribute",
                        "ec2:ModifyInstanceAttribute",
                        "ec2:RequestSpotInstances",
                        "ec2:RunInstances",
                        "ec2:TerminateInstances",
                        "iam:GetRole",
                        "iam:GetRolePolicy",
                        "iam:ListInstanceProfiles",
                        "iam:ListRolePolicies",
                        "iam:PassRole",
                        "s3:CreateBucket",
                        "s3:Get*",
                        "s3:List*",
                        "sdb:BatchPutAttributes",
                        "sdb:Select",
                        "sqs:CreateQueue",
                        "sqs:Delete*",
                        "sqs:GetQueue*",
                        "sqs:ReceiveMessage"
                    ],
                    "Resource": "*",
                    "Effect": "Allow"
                }
            ]
        },
        "Role": {
            "AssumeRolePolicyDocument": {
                "Version": "2008-10-17",
                "Statement": [
                    {
                        "Action": "sts:AssumeRole",
                        "Sid": "",
                        "Effect": "Allow",
                        "Principal": {
                            "Service": "elasticmapreduce.amazonaws.com"
                        }
                    }
                ]
            },
            "RoleId": "AROAI3SRVPPVSRDLARBPY",
            "CreateDate": "2015-06-09T17:09:10.401Z",
            "RoleName": "EMR_DefaultRole",
            "Path": "/",
            "Arn": "arn:aws:iam::176430881729:role/EMR_DefaultRole"
        }
    }
]
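Because the command prints nothing when the roles already exist, it can help to check for them explicitly. The following is a minimal sketch under stated assumptions: ensure_emr_default_roles is a hypothetical helper name, and it relies on aws iam get-role exiting with a non-zero status when a role is not found.

```shell
set -eu

# ensure_emr_default_roles is a hypothetical helper, not an AWS CLI command.
# It reports which of the two default roles exist and calls
# "aws emr create-default-roles" only when at least one is missing.
ensure_emr_default_roles() {
    missing=0
    for role in EMR_DefaultRole EMR_EC2_DefaultRole; do
        if aws iam get-role --role-name "$role" >/dev/null 2>&1; then
            echo "found: $role"
        else
            echo "missing: $role"
            missing=1
        fi
    done

    if [ "$missing" -eq 1 ]; then
        aws emr create-default-roles
    fi
}

# Example call (commented out; requires IAM permissions):
# ensure_emr_default_roles
```

Any get-role failure, including a permissions error, is treated as "missing" in this sketch, so the final create-default-roles call may still fail for reasons unrelated to the roles themselves.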

For API details, see CreateDefaultRoles in AWS CLI Command Reference.

The following code example shows how to use create-security-configuration.

AWS CLI

1. To create a security configuration with in-transit encryption enabled with PEM for certificate provider, and at-rest encryption enabled with SSE-S3 for S3 encryption and AWS-KMS for local disk key provider

Command:


 aws emr create-security-configuration --name MySecurityConfig --security-configuration '{
        "EncryptionConfiguration": {
                "EnableInTransitEncryption" : true,
                "EnableAtRestEncryption" : true,
                "InTransitEncryptionConfiguration" : {
                        "TLSCertificateConfiguration" : {
                                "CertificateProviderType" : "PEM",
                                "S3Object" : "s3://mycertstore/artifacts/MyCerts.zip"
                        }
                },
                "AtRestEncryptionConfiguration" : {
                        "S3EncryptionConfiguration" : {
                                "EncryptionMode" : "SSE-S3"
                        },
                        "LocalDiskEncryptionConfiguration" : {
                                "EncryptionKeyProviderType" : "AwsKms",
                                "AwsKmsKey" : "arn:aws:kms:us-east-1:123456789012:key/12345678-1234-1234-1234-123456789012"
                        }
                }
        }
}'

Output:


{
    "CreationDateTime": 1474070889.129,
    "Name": "MySecurityConfig"
}

JSON equivalent (contents of security_configuration.json):


{
    "EncryptionConfiguration": {
        "EnableInTransitEncryption": true,
        "EnableAtRestEncryption": true,
        "InTransitEncryptionConfiguration": {
            "TLSCertificateConfiguration": {
                "CertificateProviderType": "PEM",
                "S3Object": "s3://mycertstore/artifacts/MyCerts.zip"
            }
        },
        "AtRestEncryptionConfiguration": {
            "S3EncryptionConfiguration": {
                "EncryptionMode": "SSE-S3"
            },
            "LocalDiskEncryptionConfiguration": {
                "EncryptionKeyProviderType": "AwsKms",
                "AwsKmsKey": "arn:aws:kms:us-east-1:123456789012:key/12345678-1234-1234-1234-123456789012"
            }
        }
    }
}

Command (using security_configuration.json):


aws emr create-security-configuration --name "MySecurityConfig" --security-configuration file://./security_configuration.json

Output:


{
    "CreationDateTime": 1474070889.129,
    "Name": "MySecurityConfig"
}
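The CreationDateTime value in this output is a Unix epoch timestamp with fractional seconds. As a small sketch, the hypothetical helper below converts such a value to a readable UTC time. It assumes GNU date; the -d @SECONDS form is not available in BSD or macOS date.

```shell
set -eu

# epoch_to_utc is a hypothetical helper name used for this sketch.
# It drops the fractional part and formats the whole seconds as ISO 8601 UTC.
epoch_to_utc() {
    seconds="${1%.*}"
    date -u -d "@${seconds}" +%Y-%m-%dT%H:%M:%SZ
}

# Convert the timestamp from the example output above.
epoch_to_utc 1474070889.129
```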

2. To create a security configuration with Kerberos enabled using cluster-dedicated KDC and cross-realm trust

Command:


 aws emr create-security-configuration --name MySecurityConfig --security-configuration '{
     "AuthenticationConfiguration": {
         "KerberosConfiguration": {
             "Provider": "ClusterDedicatedKdc",
             "ClusterDedicatedKdcConfiguration": {
                 "TicketLifetimeInHours": 24,
                 "CrossRealmTrustConfiguration": {
                   "Realm": "AD.DOMAIN.COM",
                   "Domain": "ad.domain.com",
                   "AdminServer": "ad.domain.com",
                   "KdcServer": "ad.domain.com"
                 }
             }
         }
     }
}'

Output:


{
    "CreationDateTime": 1490225558.982,
    "Name": "MySecurityConfig"
}

JSON equivalent (contents of security_configuration.json):


{
    "AuthenticationConfiguration": {
        "KerberosConfiguration": {
            "Provider": "ClusterDedicatedKdc",
            "ClusterDedicatedKdcConfiguration": {
                "TicketLifetimeInHours": 24,
                "CrossRealmTrustConfiguration": {
                    "Realm": "AD.DOMAIN.COM",
                    "Domain": "ad.domain.com",
                    "AdminServer": "ad.domain.com",
                    "KdcServer": "ad.domain.com"
                }
            }
        }
    }
}

Command (using security_configuration.json):


aws emr create-security-configuration --name "MySecurityConfig" --security-configuration file://./security_configuration.json

Output:


{
    "CreationDateTime": 1490225558.982,
    "Name": "MySecurityConfig"
}

For API details, see CreateSecurityConfiguration in AWS CLI Command Reference.

The following code example shows how to use delete-security-configuration.

AWS CLI

To delete a security configuration in the current region

Command:


aws emr delete-security-configuration --name MySecurityConfig

Output:


None
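Since the command returns no output, one way to confirm the result is to list the remaining security configurations. The sketch below is an assumption-labelled illustration: security_config_exists is a hypothetical helper that counts matching names with a JMESPath length() expression. It requests JSON output so that the query is evaluated once over the complete listing rather than once per page.

```shell
set -eu

# security_config_exists is a hypothetical helper, not an AWS CLI command.
# It succeeds when a security configuration with the given name is listed.
security_config_exists() {
    name="$1"
    count="$(aws emr list-security-configurations \
        --query "length(SecurityConfigurations[?Name=='${name}'])" \
        --output json)"
    [ "$count" -gt 0 ]
}

# Example flow (commented out so that nothing is deleted by accident):
# aws emr delete-security-configuration --name MySecurityConfig
# if security_config_exists MySecurityConfig; then
#     echo "still present"
# else
#     echo "deleted"
# fi
```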

For API details, see DeleteSecurityConfiguration in AWS CLI Command Reference.

The following code example shows how to use describe-cluster.

AWS CLI

Command:


aws emr describe-cluster --cluster-id j-XXXXXXXX

Output:


For release-label based uniform instance groups cluster:

        {
            "Cluster": {
                "Status": {
                    "Timeline": {
                        "ReadyDateTime": 1436475075.199,
                        "CreationDateTime": 1436474656.563
                    },
                    "State": "WAITING",
                    "StateChangeReason": {
                        "Message": "Waiting for steps to run"
                    }
                },
                "Ec2InstanceAttributes": {
                    "ServiceAccessSecurityGroup": "sg-xxxxxxxx",
                    "EmrManagedMasterSecurityGroup": "sg-xxxxxxxx",
                    "IamInstanceProfile": "EMR_EC2_DefaultRole",
                    "Ec2KeyName": "myKey",
                    "Ec2AvailabilityZone": "us-east-1c",
                    "EmrManagedSlaveSecurityGroup": "sg-yyyyyyyyy"
                },
                "Name": "My Cluster",
                "ServiceRole": "EMR_DefaultRole",
                "Tags": [],
                "TerminationProtected": true,
                "UnhealthyNodeReplacement": true,
                "ReleaseLabel": "emr-4.0.0",
                "NormalizedInstanceHours": 96,
                "InstanceGroups": [
                    {
                        "RequestedInstanceCount": 2,
                        "Status": {
                            "Timeline": {
                                "ReadyDateTime": 1436475074.245,
                                "CreationDateTime": 1436474656.564,
                                "EndDateTime": 1436638158.387
                            },
                            "State": "RUNNING",
                            "StateChangeReason": {
                                "Message": ""
                            }
                        },
                        "Name": "CORE",
                        "InstanceGroupType": "CORE",
                        "Id": "ig-YYYYYYY",
                        "Configurations": [],
                        "InstanceType": "m3.large",
                        "Market": "ON_DEMAND",
                        "RunningInstanceCount": 2
                    },
                    {
                        "RequestedInstanceCount": 1,
                        "Status": {
                            "Timeline": {
                                "ReadyDateTime": 1436475074.245,
                                "CreationDateTime": 1436474656.564,
                                "EndDateTime": 1436638158.387
                            },
                            "State": "RUNNING",
                            "StateChangeReason": {
                                "Message": ""
                            }
                        },
                        "Name": "MASTER",
                        "InstanceGroupType": "MASTER",
                        "Id": "ig-XXXXXXXXX",
                        "Configurations": [],
                        "InstanceType": "m3.large",
                        "Market": "ON_DEMAND",
                        "RunningInstanceCount": 1
                    }
                ],
                "Applications": [
                    {
                        "Name": "Hadoop"
                    }
                ],
                "VisibleToAllUsers": true,
                "BootstrapActions": [],
                "MasterPublicDnsName": "ec2-54-147-144-78.compute-1.amazonaws.com",
                "AutoTerminate": false,
                "Id": "j-XXXXXXXX",
                "Configurations": [
                    {
                        "Properties": {
                            "fs.s3.consistent.retryPeriodSeconds": "20",
                            "fs.s3.enableServerSideEncryption": "true",
                            "fs.s3.consistent": "false",
                            "fs.s3.consistent.retryCount": "2"
                        },
                        "Classification": "emrfs-site"
                    }
                ]
            }
        }


For release-label based instance fleet cluster:
{
    "Cluster": {
        "Status": {
            "Timeline": {
                "ReadyDateTime": 1487897289.705,
                "CreationDateTime": 1487896933.942
            },
            "State": "WAITING",
            "StateChangeReason": {
                "Message": "Waiting for steps to run"
            }
        },
        "Ec2InstanceAttributes": {
            "EmrManagedMasterSecurityGroup": "sg-xxxxx",
            "RequestedEc2AvailabilityZones": [],
            "RequestedEc2SubnetIds": [],
            "IamInstanceProfile": "EMR_EC2_DefaultRole",
            "Ec2AvailabilityZone": "us-east-1a",
            "EmrManagedSlaveSecurityGroup": "sg-xxxxx"
        },
        "Name": "My Cluster",
        "ServiceRole": "EMR_DefaultRole",
        "Tags": [],
        "TerminationProtected": false,
        "UnhealthyNodeReplacement": false,
        "ReleaseLabel": "emr-5.2.0",
        "NormalizedInstanceHours": 472,
        "InstanceCollectionType": "INSTANCE_FLEET",
        "InstanceFleets": [
            {
                "Status": {
                    "Timeline": {
                        "ReadyDateTime": 1487897212.74,
                        "CreationDateTime": 1487896933.948
                    },
                    "State": "RUNNING",
                    "StateChangeReason": {
                        "Message": ""
                    }
                },
                "ProvisionedSpotCapacity": 1,
                "Name": "MASTER",
                "InstanceFleetType": "MASTER",
                "LaunchSpecifications": {
                    "SpotSpecification": {
                        "TimeoutDurationMinutes": 60,
                        "TimeoutAction": "TERMINATE_CLUSTER"
                    }
                },
                "TargetSpotCapacity": 1,
                "ProvisionedOnDemandCapacity": 0,
                "InstanceTypeSpecifications": [
                    {
                        "BidPrice": "0.5",
                        "InstanceType": "m3.xlarge",
                        "WeightedCapacity": 1
                    }
                ],
                "Id": "if-xxxxxxx",
                "TargetOnDemandCapacity": 0
            }
        ],
        "Applications": [
            {
                "Version": "2.7.3",
                "Name": "Hadoop"
            }
        ],
        "ScaleDownBehavior": "TERMINATE_AT_INSTANCE_HOUR",
        "VisibleToAllUsers": true,
        "BootstrapActions": [],
        "MasterPublicDnsName": "ec2-xxx-xx-xxx-xx.compute-1.amazonaws.com",
        "AutoTerminate": false,
        "Id": "j-xxxxx",
        "Configurations": []
    }
}

For ami based uniform instance group cluster:

    {
        "Cluster": {
            "Status": {
                "Timeline": {
                    "ReadyDateTime": 1399400564.432,
                    "CreationDateTime": 1399400268.62
                },
                "State": "WAITING",
                "StateChangeReason": {
                    "Message": "Waiting for steps to run"
                }
            },
            "Ec2InstanceAttributes": {
                "IamInstanceProfile": "EMR_EC2_DefaultRole",
                "Ec2AvailabilityZone": "us-east-1c"
            },
            "Name": "My Cluster",
            "Tags": [],
            "TerminationProtected": true,
            "UnhealthyNodeReplacement": true,
            "RunningAmiVersion": "2.5.4",
            "InstanceGroups": [
                {
                    "RequestedInstanceCount": 1,
                    "Status": {
                        "Timeline": {
                            "ReadyDateTime": 1399400558.848,
                            "CreationDateTime": 1399400268.621
                        },
                        "State": "RUNNING",
                        "StateChangeReason": {
                            "Message": ""
                        }
                    },
                    "Name": "Master instance group",
                    "InstanceGroupType": "MASTER",
                    "InstanceType": "m1.small",
                    "Id": "ig-ABCD",
                    "Market": "ON_DEMAND",
                    "RunningInstanceCount": 1
                },
                {
                    "RequestedInstanceCount": 2,
                    "Status": {
                        "Timeline": {
                            "ReadyDateTime": 1399400564.439,
                            "CreationDateTime": 1399400268.621
                        },
                        "State": "RUNNING",
                        "StateChangeReason": {
                            "Message": ""
                        }
                    },
                    "Name": "Core instance group",
                    "InstanceGroupType": "CORE",
                    "InstanceType": "m1.small",
                    "Id": "ig-DEF",
                    "Market": "ON_DEMAND",
                    "RunningInstanceCount": 2
                }
            ],
            "Applications": [
                {
                    "Version": "1.0.3",
                    "Name": "hadoop"
                }
            ],
            "BootstrapActions": [],
            "VisibleToAllUsers": false,
            "RequestedAmiVersion": "2.4.2",
            "LogUri": "s3://myLogUri/",
            "AutoTerminate": false,
            "Id": "j-XXXXXXXX"
        }
    }

For API details, see DescribeCluster in the AWS CLI Command Reference.
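As an illustration of how this output might be consumed, the following sketch checks whether each instance group has reached its requested size. It assumes the JSON printed by `aws emr describe-cluster` has already been captured as a string. The helper name `groups_at_capacity` and the trimmed sample are hypothetical, and only fields that appear in the output above are read.

```python
import json

# Trimmed copy of the AMI-based sample output above (illustrative only).
SAMPLE_OUTPUT = """
{
    "Cluster": {
        "Id": "j-XXXXXXXX",
        "InstanceGroups": [
            {"Name": "Master instance group", "InstanceGroupType": "MASTER",
             "RequestedInstanceCount": 1, "RunningInstanceCount": 1},
            {"Name": "Core instance group", "InstanceGroupType": "CORE",
             "RequestedInstanceCount": 2, "RunningInstanceCount": 2}
        ]
    }
}
"""


def groups_at_capacity(describe_cluster_json):
    """Map each instance group type to True if running == requested."""
    cluster = json.loads(describe_cluster_json)["Cluster"]
    return {
        group["InstanceGroupType"]:
            group["RunningInstanceCount"] == group["RequestedInstanceCount"]
        for group in cluster.get("InstanceGroups", [])
    }


if __name__ == "__main__":
    print(groups_at_capacity(SAMPLE_OUTPUT))
```

Clusters that use instance fleets report `InstanceFleets` instead of `InstanceGroups`, so this helper would return an empty mapping for them.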

The following code example shows how to use describe-step.

AWS CLI

The following command describes a step with the step ID s-3LZC0QUT43AM in a cluster with the cluster ID j-3SD91U2E1L2QX:


aws emr describe-step --cluster-id j-3SD91U2E1L2QX --step-id s-3LZC0QUT43AM

Output:


{
    "Step": {
        "Status": {
            "Timeline": {
                "EndDateTime": 1433200470.481,
                "CreationDateTime": 1433199926.597,
                "StartDateTime": 1433200404.959
            },
            "State": "COMPLETED",
            "StateChangeReason": {}
        },
        "Config": {
            "Args": [
                "s3://us-west-2.elasticmapreduce/libs/hive/hive-script",
                "--base-path",
                "s3://us-west-2.elasticmapreduce/libs/hive/",
                "--install-hive",
                "--hive-versions",
                "0.13.1"
            ],
            "Jar": "s3://us-west-2.elasticmapreduce/libs/script-runner/script-runner.jar",
            "Properties": {}
        },
        "Id": "s-3LZC0QUT43AM",
        "ActionOnFailure": "TERMINATE_CLUSTER",
        "Name": "Setup hive"
    }
}

For API details, see DescribeStep in the AWS CLI Command Reference.
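The `Timeline` values in this output appear to be Unix epoch seconds, so the run time of a step can be derived by subtraction. The sketch below is a minimal example under that assumption; the function name `step_run_seconds` and the trimmed sample are hypothetical.

```python
import json

# Trimmed copy of the describe-step sample output above (illustrative only).
SAMPLE_OUTPUT = """
{
    "Step": {
        "Status": {
            "Timeline": {
                "EndDateTime": 1433200470.481,
                "CreationDateTime": 1433199926.597,
                "StartDateTime": 1433200404.959
            },
            "State": "COMPLETED"
        },
        "Id": "s-3LZC0QUT43AM",
        "Name": "Setup hive"
    }
}
"""


def step_run_seconds(describe_step_json):
    """Return EndDateTime - StartDateTime in seconds, or None if unfinished."""
    timeline = json.loads(describe_step_json)["Step"]["Status"]["Timeline"]
    start = timeline.get("StartDateTime")
    end = timeline.get("EndDateTime")
    if start is None or end is None:
        # The step has not started or has not finished yet.
        return None
    return end - start


if __name__ == "__main__":
    print(f"{step_run_seconds(SAMPLE_OUTPUT):.3f} seconds")
```

For the sample above, the step ran for roughly 65.5 seconds.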

The following code example shows how to use get.

AWS CLI

The following command downloads the hadoop-examples.jar archive from the master instance in a cluster with the cluster ID j-3SD91U2E1L2QX:


aws emr get --cluster-id j-3SD91U2E1L2QX --key-pair-file ~/.ssh/mykey.pem --src /home/hadoop-examples.jar --dest ~

For API details, see Get in the AWS CLI Command Reference.

The following code example shows how to use list-clusters.

AWS CLI

The following command lists all active EMR clusters in the current Region:


aws emr list-clusters --active

Output:


{
    "Clusters": [
        {
            "Status": {
                "Timeline": {
                    "ReadyDateTime": 1433200405.353,
                    "CreationDateTime": 1433199926.596
                },
                "State": "WAITING",
                "StateChangeReason": {
                    "Message": "Waiting after step completed"
                }
            },
            "NormalizedInstanceHours": 6,
            "Id": "j-3SD91U2E1L2QX",
            "Name": "my-cluster"
        }
    ]
}

For API details, see ListClusters in the AWS CLI Command Reference.
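A common follow-up is to pull out the cluster IDs so they can be passed to other commands. The following sketch filters the output by cluster state; the helper name `cluster_ids_in_state` is hypothetical and the sample is the output shown above.

```python
import json

# Copy of the list-clusters sample output above.
SAMPLE_OUTPUT = """
{
    "Clusters": [
        {
            "Status": {
                "Timeline": {
                    "ReadyDateTime": 1433200405.353,
                    "CreationDateTime": 1433199926.596
                },
                "State": "WAITING",
                "StateChangeReason": {
                    "Message": "Waiting after step completed"
                }
            },
            "NormalizedInstanceHours": 6,
            "Id": "j-3SD91U2E1L2QX",
            "Name": "my-cluster"
        }
    ]
}
"""


def cluster_ids_in_state(list_clusters_json, state):
    """Return the IDs of all listed clusters whose State equals `state`."""
    clusters = json.loads(list_clusters_json).get("Clusters", [])
    return [c["Id"] for c in clusters if c["Status"]["State"] == state]


if __name__ == "__main__":
    print(cluster_ids_in_state(SAMPLE_OUTPUT, "WAITING"))
```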

The following code example shows how to use list-instance-fleets.

AWS CLI

To get configuration details of instance fleets in a cluster

This example lists the details of instance fleets in the specified cluster.

Command:


aws emr list-instance-fleets --cluster-id 'j-12ABCDEFGHI34JK'

Output:


{
  "InstanceFleets": [
      {
          "Status": {
              "Timeline": {
                  "ReadyDateTime": 1488759094.637,
                  "CreationDateTime": 1488758719.817
              },
              "State": "RUNNING",
              "StateChangeReason": {
                  "Message": ""
              }
          },
          "ProvisionedSpotCapacity": 6,
          "Name": "CORE",
          "InstanceFleetType": "CORE",
          "LaunchSpecifications": {
              "SpotSpecification": {
                  "TimeoutDurationMinutes": 60,
                  "TimeoutAction": "TERMINATE_CLUSTER"
              }
          },
          "ProvisionedOnDemandCapacity": 2,
          "InstanceTypeSpecifications": [
              {
                  "BidPrice": "0.5",
                  "InstanceType": "m3.xlarge",
                  "WeightedCapacity": 2
              }
          ],
          "Id": "if-1ABC2DEFGHIJ3"
      },
      {
          "Status": {
              "Timeline": {
                  "ReadyDateTime": 1488759058.598,
                  "CreationDateTime": 1488758719.811
              },
              "State": "RUNNING",
              "StateChangeReason": {
                  "Message": ""
              }
          },
          "ProvisionedSpotCapacity": 0,
          "Name": "MASTER",
          "InstanceFleetType": "MASTER",
          "ProvisionedOnDemandCapacity": 1,
          "InstanceTypeSpecifications": [
              {
                  "BidPriceAsPercentageOfOnDemandPrice": 100.0,
                  "InstanceType": "m3.xlarge",
                  "WeightedCapacity": 1
              }
          ],
          "Id": "if-2ABC4DEFGHIJ4"
      }
  ]
}

For API details, see ListInstanceFleets in the AWS CLI Command Reference.
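To see at a glance how much capacity each fleet currently holds, the On-Demand and Spot figures can be added together. The sketch below does this for a trimmed copy of the output above; the helper name `provisioned_capacity` is hypothetical, and treating a missing field as zero is an assumption.

```python
import json

# Trimmed copy of the list-instance-fleets sample output above.
SAMPLE_OUTPUT = """
{
    "InstanceFleets": [
        {
            "Name": "CORE",
            "InstanceFleetType": "CORE",
            "ProvisionedSpotCapacity": 6,
            "ProvisionedOnDemandCapacity": 2,
            "Id": "if-1ABC2DEFGHIJ3"
        },
        {
            "Name": "MASTER",
            "InstanceFleetType": "MASTER",
            "ProvisionedSpotCapacity": 0,
            "ProvisionedOnDemandCapacity": 1,
            "Id": "if-2ABC4DEFGHIJ4"
        }
    ]
}
"""


def provisioned_capacity(list_instance_fleets_json):
    """Map each fleet type to its On-Demand plus Spot provisioned capacity."""
    fleets = json.loads(list_instance_fleets_json).get("InstanceFleets", [])
    return {
        fleet["InstanceFleetType"]:
            fleet.get("ProvisionedOnDemandCapacity", 0)
            + fleet.get("ProvisionedSpotCapacity", 0)
        for fleet in fleets
    }


if __name__ == "__main__":
    print(provisioned_capacity(SAMPLE_OUTPUT))
```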

The following code example shows how to use list-instances.

AWS CLI

The following command lists all instances in a cluster with the cluster ID j-3C6XNQ39VR9WL:


aws emr list-instances --cluster-id j-3C6XNQ39VR9WL

Output:


For a uniform instance group based cluster:
  {
    "Instances": [
        {
            "Status": {
                "Timeline": {
                    "ReadyDateTime": 1433200400.03,
                    "CreationDateTime": 1433199960.152
                },
                "State": "RUNNING",
                "StateChangeReason": {}
            },
            "Ec2InstanceId": "i-f19ecfee",
            "PublicDnsName": "ec2-52-52-41-150.us-west-2.compute.amazonaws.com",
            "PrivateDnsName": "ip-172-21-11-216.us-west-2.compute.internal",
            "PublicIpAddress": "52.52.41.150",
            "Id": "ci-3NNHQUQ2TWB6Y",
            "PrivateIpAddress": "172.21.11.216"
        },
        {
            "Status": {
                "Timeline": {
                    "ReadyDateTime": 1433200400.031,
                    "CreationDateTime": 1433199949.102
                },
                "State": "RUNNING",
                "StateChangeReason": {}
            },
            "Ec2InstanceId": "i-1feee4c2",
            "PublicDnsName": "ec2-52-63-246-32.us-west-2.compute.amazonaws.com",
            "PrivateDnsName": "ip-172-31-24-130.us-west-2.compute.internal",
            "PublicIpAddress": "52.63.246.32",
            "Id": "ci-GAOCMKNKDCV7",
            "PrivateIpAddress": "172.21.11.215"
        },
        {
            "Status": {
                "Timeline": {
                    "ReadyDateTime": 1433200400.031,
                    "CreationDateTime": 1433199949.102
                },
                "State": "RUNNING",
                "StateChangeReason": {}
            },
            "Ec2InstanceId": "i-15cfeee3",
            "PublicDnsName": "ec2-52-25-246-63.us-west-2.compute.amazonaws.com",
            "PrivateDnsName": "ip-172-31-24-129.us-west-2.compute.internal",
            "PublicIpAddress": "52.25.246.63",
            "Id": "ci-2W3TDFFB47UAD",
            "PrivateIpAddress": "172.21.11.214"
        }
    ]
  }


For a fleet based cluster:
   {
      "Instances": [
          {
              "Status": {
                  "Timeline": {
                      "ReadyDateTime": 1487810810.878,
                      "CreationDateTime": 1487810588.367,
                      "EndDateTime": 1488022990.924
                  },
                  "State": "TERMINATED",
                  "StateChangeReason": {
                      "Message": "Instance was terminated."
                  }
              },
              "Ec2InstanceId": "i-xxxxx",
              "InstanceFleetId": "if-xxxxx",
              "EbsVolumes": [],
              "PublicDnsName": "ec2-xx-xxx-xxx-xxx.compute-1.amazonaws.com",
              "InstanceType": "m3.xlarge",
              "PrivateDnsName": "ip-xx-xx-xxx-xx.ec2.internal",
              "Market": "SPOT",
              "PublicIpAddress": "xx.xx.xxx.xxx",
              "Id": "ci-xxxxx",
              "PrivateIpAddress": "10.47.191.80"
          }
      ]
   }

For API details, see ListInstances in the AWS CLI Command Reference.
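Both output shapes share the `Status`, `Ec2InstanceId`, and `PrivateIpAddress` fields, so one helper can serve either kind of cluster. The following sketch collects the private IP addresses of running instances. The function name `running_private_ips` is hypothetical, and both samples are trimmed copies of the outputs above.

```python
import json

# Trimmed copy of the uniform instance group sample above.
UNIFORM_SAMPLE = """
{
    "Instances": [
        {
            "Status": {"State": "RUNNING"},
            "Ec2InstanceId": "i-f19ecfee",
            "PrivateIpAddress": "172.21.11.216"
        },
        {
            "Status": {"State": "RUNNING"},
            "Ec2InstanceId": "i-1feee4c2",
            "PrivateIpAddress": "172.21.11.215"
        }
    ]
}
"""

# Trimmed copy of the fleet based sample above.
FLEET_SAMPLE = """
{
    "Instances": [
        {
            "Status": {"State": "TERMINATED"},
            "Ec2InstanceId": "i-xxxxx",
            "PrivateIpAddress": "10.47.191.80"
        }
    ]
}
"""


def running_private_ips(list_instances_json):
    """Map EC2 instance ID to private IP for instances in state RUNNING."""
    instances = json.loads(list_instances_json).get("Instances", [])
    return {
        inst["Ec2InstanceId"]: inst["PrivateIpAddress"]
        for inst in instances
        if inst["Status"]["State"] == "RUNNING"
    }


if __name__ == "__main__":
    print(running_private_ips(UNIFORM_SAMPLE))
    print(running_private_ips(FLEET_SAMPLE))
```

The fleet sample contains only a terminated instance, so the helper returns an empty mapping for it.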

The following code example shows how to use list-security-configurations.

AWS CLI

To list security configurations in the current Region

Command:


aws emr list-security-configurations

Output:


{
    "SecurityConfigurations": [
        {
            "CreationDateTime": 1473889697.417,
            "Name": "MySecurityConfig-1"
        },
        {
            "CreationDateTime": 1473889697.417,
            "Name": "MySecurityConfig-2"
        }
    ]
}

For API details, see ListSecurityConfigurations in the AWS CLI Command Reference.
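The `CreationDateTime` values here appear to be Unix epoch seconds, which are hard to read directly. The sketch below converts them to UTC timestamps truncated to whole seconds. The helper names `epoch_to_utc` and `creation_times` are hypothetical, and the epoch interpretation is an assumption based on the sample output.

```python
import json
from datetime import datetime, timezone

# Copy of the list-security-configurations sample output above.
SAMPLE_OUTPUT = """
{
    "SecurityConfigurations": [
        {"CreationDateTime": 1473889697.417, "Name": "MySecurityConfig-1"},
        {"CreationDateTime": 1473889697.417, "Name": "MySecurityConfig-2"}
    ]
}
"""


def epoch_to_utc(epoch_seconds):
    """Format epoch seconds as YYYY-MM-DDTHH:MM:SSZ (fraction dropped)."""
    moment = datetime.fromtimestamp(int(epoch_seconds), tz=timezone.utc)
    return moment.strftime("%Y-%m-%dT%H:%M:%SZ")


def creation_times(list_security_configurations_json):
    """Map each security configuration name to its creation time in UTC."""
    parsed = json.loads(list_security_configurations_json)
    configs = parsed.get("SecurityConfigurations", [])
    return {c["Name"]: epoch_to_utc(c["CreationDateTime"]) for c in configs}


if __name__ == "__main__":
    for name, created in creation_times(SAMPLE_OUTPUT).items():
        print(name, created)
```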

The following code example shows how to use list-steps.

AWS CLI

The following command lists all of the steps in a cluster with the cluster ID j-3SD91U2E1L2QX:


aws emr list-steps --cluster-id j-3SD91U2E1L2QX

For API details, see ListSteps in the AWS CLI Command Reference.

The following code example shows how to use modify-cluster-attributes.

AWS CLI

The following command makes an EMR cluster with the ID j-301CDNY0J5XM4 visible to all users:


aws emr modify-cluster-attributes --cluster-id j-301CDNY0J5XM4 --visible-to-all-users

For API details, see ModifyClusterAttributes in the AWS CLI Command Reference.

The following code example shows how to use modify-instance-fleet.

AWS CLI

To modify the target capacities of an instance fleet

This example changes the On-Demand and Spot target capacities to 1 for the specified instance fleet.

Command:


aws emr modify-instance-fleet --cluster-id 'j-12ABCDEFGHI34JK' --instance-fleet InstanceFleetId='if-2ABC4DEFGHIJ4',TargetOnDemandCapacity=1,TargetSpotCapacity=1

For API details, see ModifyInstanceFleet in the AWS CLI Command Reference.
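When the command is issued from a script, the shorthand value for `--instance-fleet` can be assembled from variables. The sketch below builds the same argument list as the command above. The helper names are hypothetical, and only the options shown in this example are used.

```python
def instance_fleet_shorthand(fleet_id, on_demand, spot):
    """Build the shorthand value passed to --instance-fleet."""
    for label, value in (("on_demand", on_demand), ("spot", spot)):
        if not isinstance(value, int) or value < 0:
            raise ValueError(f"{label} must be a non-negative integer")
    return (
        f"InstanceFleetId={fleet_id},"
        f"TargetOnDemandCapacity={on_demand},"
        f"TargetSpotCapacity={spot}"
    )


def modify_instance_fleet_argv(cluster_id, fleet_id, on_demand, spot):
    """Return the argument list for the modify-instance-fleet command."""
    return [
        "aws", "emr", "modify-instance-fleet",
        "--cluster-id", cluster_id,
        "--instance-fleet",
        instance_fleet_shorthand(fleet_id, on_demand, spot),
    ]


if __name__ == "__main__":
    argv = modify_instance_fleet_argv(
        "j-12ABCDEFGHI34JK", "if-2ABC4DEFGHIJ4", 1, 1
    )
    # Print only; nothing is sent to AWS by this sketch.
    print(" ".join(argv))
```

The list can be handed to `subprocess.run(argv, check=True)` to execute the command. Because no shell is involved in that case, the single quotes used in the command above are not needed.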

The following code example shows how to use put.

AWS CLI

The following command uploads a file named healthcheck.sh to the master instance in a cluster with the cluster ID j-3SD91U2E1L2QX:


aws emr put --cluster-id j-3SD91U2E1L2QX --key-pair-file ~/.ssh/mykey.pem --src ~/scripts/healthcheck.sh --dest /home/hadoop/bin/healthcheck.sh

For API details, see Put in the AWS CLI Command Reference.

The following code example shows how to use remove-tags.

AWS CLI

The following command removes a tag with the key prod from a cluster with the cluster ID j-3SD91U2E1L2QX:


aws emr remove-tags --resource-id j-3SD91U2E1L2QX --tag-keys prod

For API details, see RemoveTags in the AWS CLI Command Reference.

The following code example shows how to use schedule-hbase-backup.

AWS CLI

Note: This command can only be used with HBase on AMI versions 2.x and 3.x.

1. To schedule a full HBase backup

Command:


aws emr schedule-hbase-backup --cluster-id j-XXXXXXYY --type full --dir s3://amzn-s3-demo-bucket/backup --interval 10 --unit hours --start-time 2014-04-21T05:26:10Z --consistent

Output:


None

2. To schedule an incremental HBase backup

Command:


aws emr schedule-hbase-backup --cluster-id j-XXXXXXYY --type incremental --dir s3://amzn-s3-demo-bucket/backup --interval 30 --unit minutes --start-time 2014-04-21T05:26:10Z --consistent

Output:


None

For API details, see ScheduleHbaseBackup in the AWS CLI Command Reference.
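The `--start-time` values in the commands above are UTC timestamps of the form `2014-04-21T05:26:10Z`. A script that schedules backups might produce that string from a `datetime` object, as in the following sketch. The helper name `hbase_backup_start_time` is hypothetical, and the format is inferred from the examples rather than from a full specification of the option.

```python
from datetime import datetime, timedelta, timezone


def hbase_backup_start_time(moment):
    """Format a timezone-aware datetime as YYYY-MM-DDTHH:MM:SSZ in UTC."""
    if moment.tzinfo is None:
        # A naive datetime has no defined offset, so refuse to guess.
        raise ValueError("moment must be timezone-aware")
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


if __name__ == "__main__":
    # The same instant expressed in UTC and in a UTC+02:00 offset.
    in_utc = datetime(2014, 4, 21, 5, 26, 10, tzinfo=timezone.utc)
    in_plus_two = datetime(
        2014, 4, 21, 7, 26, 10, tzinfo=timezone(timedelta(hours=2))
    )
    print(hbase_backup_start_time(in_utc))
    print(hbase_backup_start_time(in_plus_two))
```

Converting to UTC before formatting keeps the trailing `Z` accurate even when the input carries a different offset.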

schedule-hbase-backup

Il seguente esempio di codice mostra come utilizzareschedule-hbase-backup.

AWS CLI

Nota: questo comando può essere utilizzato solo con le HBase versioni AMI 2.x e 3.x

1. Per pianificare un HBase backup completo >>>>>> 06ab6d6e13564b5733d75abaf3b599f93cf39a23

Comando:


aws emr schedule-hbase-backup --cluster-id j-XXXXXXYY --type full --dir
s3://amzn-s3-demo-bucket/backup --interval 10 --unit hours --start-time
2014-04-21T05:26:10Z --consistent

Output:


None

2. Per pianificare un backup incrementale HBase

Comando:


aws emr schedule-hbase-backup --cluster-id j-XXXXXXYY --type incremental \
    --dir s3://amzn-s3-demo-bucket/backup --interval 30 --unit minutes \
    --start-time 2014-04-21T05:26:10Z --consistent

Output:


None

For API details, see ScheduleHbaseBackup in AWS CLI Command Reference.
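
The --start-time values in the commands above are UTC timestamps of the form 2014-04-21T05:26:10Z. The following is a minimal sketch of one way to produce a value in that shape from the current time. The function name utc_start_time is a hypothetical helper, and the final line only prints the command (via echo) rather than running it.

```shell
#!/bin/sh
# Hypothetical helper: print the current UTC time in the same format as
# the --start-time values used above (for example 2014-04-21T05:26:10Z).
utc_start_time() {
    date -u +%Y-%m-%dT%H:%M:%SZ
}

start_time=$(utc_start_time)

# Only print the command; remove "echo" to actually run it.
echo aws emr schedule-hbase-backup --cluster-id j-XXXXXXYY --type full \
    --dir s3://amzn-s3-demo-bucket/backup --interval 10 --unit hours \
    --start-time "$start_time" --consistent
```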

socks

The following code example shows how to use socks.

AWS CLI

The following command opens a SOCKS connection to the master instance of the cluster with the cluster ID j-3SD91U2E1L2QX:


aws emr socks --cluster-id j-3SD91U2E1L2QX --key-pair-file ~/.ssh/mykey.pem

The --key-pair-file option requires a local path to a private key file.

For API details, see Socks in AWS CLI Command Reference.
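
Because the key pair file must be a local private key file, a quick pre-flight check can catch a wrong path before the connection is attempted. The sketch below is an assumption-laden illustration: check_key_file is a hypothetical helper, and the permission test reflects the common OpenSSH behavior of rejecting private keys that are accessible to group or others. It uses a temporary file in place of a real key.

```shell
#!/bin/sh
# Hypothetical pre-flight check for the file passed to --key-pair-file.
# Returns 0 if the file exists and grants no permissions to group or others.
check_key_file() {
    key=$1
    if [ ! -f "$key" ]; then
        echo "key file not found: $key" >&2
        return 1
    fi
    # Characters 5-10 of the mode string are the group and other bits.
    group_other=$(ls -lL "$key" | cut -c5-10)
    if [ "$group_other" != "------" ]; then
        echo "permissions too open on $key (try: chmod 400 $key)" >&2
        return 1
    fi
    return 0
}

# Demonstration on a temporary file standing in for a real key.
demo_key=$(mktemp)
chmod 644 "$demo_key"
check_key_file "$demo_key" || echo "rejected"
chmod 400 "$demo_key"
check_key_file "$demo_key" && echo "accepted"
rm -f "$demo_key"
```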

ssh

The following code example shows how to use ssh.

AWS CLI

The following command opens an SSH connection to the master instance of the cluster with the cluster ID j-3SD91U2E1L2QX:


aws emr ssh --cluster-id j-3SD91U2E1L2QX --key-pair-file ~/.ssh/mykey.pem

The --key-pair-file option requires a local path to a private key file.

Output:


ssh -o StrictHostKeyChecking=no -o ServerAliveInterval=10 -i /home/local/user/.ssh/mykey.pem hadoop@ec2-52-52-41-150.us-west-2.compute.amazonaws.com
Warning: Permanently added 'ec2-52-52-41-150.us-west-2.compute.amazonaws.com,52.52.41.150' (ECDSA) to the list of known hosts.
Last login: Mon Jun  1 23:15:38 2015

      __|  __|_  )
       _|  (     /   Amazon Linux AMI
      ___|\___|___|

https://aws.amazon.com/amazon-linux-ami/2015.03-release-notes/
26 package(s) needed for security, out of 39 available
Run "sudo yum update" to apply all updates.

--------------------------------------------------------------------------------

Welcome to Amazon Elastic MapReduce running Hadoop and Amazon Linux.

Hadoop is installed in /home/hadoop. Log files are in /mnt/var/log/hadoop. Check
/mnt/var/log/hadoop/steps for diagnosing step failures.

The Hadoop UI can be accessed via the following commands:

  ResourceManager    lynx http://ip-172-21-11-216:9026/
  NameNode           lynx http://ip-172-21-11-216:9101/

--------------------------------------------------------------------------------

[hadoop@ip-172-31-16-216 ~]$

For API details, see Ssh in AWS CLI Command Reference.
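
The first line of the output above is the underlying ssh command, which ends with the user and public DNS name of the master instance. As a minimal sketch, that host name can be extracted from such a line with standard text tools. The function name host_from_ssh_line is a hypothetical helper, and the sketch assumes the user@host argument is the last field on the line, as it is in the sample output.

```shell
#!/bin/sh
# Hypothetical helper: extract the host from an "ssh ... user@host" line,
# such as the first line of the output shown above.
host_from_ssh_line() {
    # Keep the last whitespace-separated field, then strip "user@".
    printf '%s\n' "$1" | awk '{ print $NF }' | sed 's/^[^@]*@//'
}

line='ssh -o StrictHostKeyChecking=no -o ServerAliveInterval=10 -i /home/local/user/.ssh/mykey.pem hadoop@ec2-52-52-41-150.us-west-2.compute.amazonaws.com'
host_from_ssh_line "$line"
# prints: ec2-52-52-41-150.us-west-2.compute.amazonaws.com
```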
