public class StepFactory extends Object
Example usage: create an interactive Hive job flow with debugging enabled:
```java
AWSCredentials credentials = new BasicAWSCredentials(accessKey, secretKey);
AmazonElasticMapReduce emr = new AmazonElasticMapReduceClient(credentials);

StepFactory stepFactory = new StepFactory();

StepConfig enableDebugging = new StepConfig()
    .withName("Enable Debugging")
    .withActionOnFailure("TERMINATE_JOB_FLOW")
    .withHadoopJarStep(stepFactory.newEnableDebuggingStep());

StepConfig installHive = new StepConfig()
    .withName("Install Hive")
    .withActionOnFailure("TERMINATE_JOB_FLOW")
    .withHadoopJarStep(stepFactory.newInstallHiveStep());

RunJobFlowRequest request = new RunJobFlowRequest()
    .withName("Hive Interactive")
    .withSteps(enableDebugging, installHive)
    .withLogUri("s3://log-bucket/")
    .withInstances(new JobFlowInstancesConfig()
        .withEc2KeyName("keypair")
        .withHadoopVersion("0.20")
        .withInstanceCount(5)
        .withKeepJobFlowAliveWhenNoSteps(true)
        .withMasterInstanceType("m1.small")
        .withSlaveInstanceType("m1.small"));

RunJobFlowResult result = emr.runJobFlow(request);
```
| Modifier and Type | Class and Description |
|---|---|
| static class | StepFactory.HiveVersion: The available Hive versions. |
| Constructor and Description |
|---|
| StepFactory(): Creates a new StepFactory using the default Elastic Map Reduce bucket (us-east-1.elasticmapreduce) for the default (us-east-1) region. |
| StepFactory(String bucket): Creates a new StepFactory using the specified Amazon S3 bucket to load resources. |
| Modifier and Type | Method and Description |
|---|---|
| HadoopJarStepConfig | newEnableDebuggingStep(): When run as the first step in your job flow, enables the Hadoop debugging UI in the AWS Management Console. |
| HadoopJarStepConfig | newInstallHiveStep(): Step that installs the default version of Hive on your job flow. |
| HadoopJarStepConfig | newInstallHiveStep(StepFactory.HiveVersion... hiveVersions): Step that installs the specified versions of Hive on your job flow. |
| HadoopJarStepConfig | newInstallHiveStep(String... hiveVersions): Step that installs the specified versions of Hive on your job flow. |
| HadoopJarStepConfig | newInstallPigStep(): Step that installs the default version of Pig on your job flow. |
| HadoopJarStepConfig | newInstallPigStep(String... pigVersions): Step that installs Pig on your job flow. |
| HadoopJarStepConfig | newRunHiveScriptStep(String script, String... args): Step that runs a Hive script on your job flow using the default Hive version. |
| HadoopJarStepConfig | newRunHiveScriptStepVersioned(String script, String hiveVersion, String... scriptArgs): Step that runs a Hive script on your job flow using the specified Hive version. |
| HadoopJarStepConfig | newRunPigScriptStep(String script, String... scriptArgs): Step that runs a Pig script on your job flow using the default Pig version. |
| HadoopJarStepConfig | newRunPigScriptStep(String script, String pigVersion, String... scriptArgs): Step that runs a Pig script on your job flow using the specified Pig version. |
| HadoopJarStepConfig | newScriptRunnerStep(String script, String... args): Runs a specified script on the master node of your cluster. |
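Each factory method returns a HadoopJarStepConfig, which still has to be wrapped in a StepConfig and submitted, either in a RunJobFlowRequest as in the example above or in an AddJobFlowStepsRequest against a running job flow. A minimal sketch of the latter, reusing the emr client from the example above; the job flow id is a placeholder:

```java
// Add an install-Hive step to a job flow that is already running.
// "j-XXXXXXXXXXXXX" is a placeholder for a real job flow id.
StepFactory stepFactory = new StepFactory();

StepConfig installHive = new StepConfig()
    .withName("Install Hive")
    .withActionOnFailure("CANCEL_AND_WAIT")
    .withHadoopJarStep(stepFactory.newInstallHiveStep());

emr.addJobFlowSteps(new AddJobFlowStepsRequest()
    .withJobFlowId("j-XXXXXXXXXXXXX")
    .withSteps(installHive));
```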
public StepFactory()

Creates a new StepFactory using the default Elastic Map Reduce bucket (us-east-1.elasticmapreduce) for the default (us-east-1) region.
public StepFactory(String bucket)

Creates a new StepFactory using the specified Amazon S3 bucket to load resources. The official bucket format is "<region>.elasticmapreduce", so if you're using the us-east-1 region, you should use the bucket "us-east-1.elasticmapreduce".

Parameters:
bucket - The Amazon S3 bucket from which to load resources.
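For example, a minimal sketch of both constructors; the eu-west-1 bucket name below follows the documented format but is otherwise an assumption:

```java
// Loads resources from the default bucket, us-east-1.elasticmapreduce.
StepFactory defaultFactory = new StepFactory();

// Loads resources from a region-specific bucket, following the
// "<region>.elasticmapreduce" naming format described above.
StepFactory euFactory = new StepFactory("eu-west-1.elasticmapreduce");
```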
public HadoopJarStepConfig newScriptRunnerStep(String script, String... args)

Runs a specified script on the master node of your cluster.

Parameters:
script - The script to run.
args - Arguments that get passed to the script.
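A minimal sketch of wrapping the returned HadoopJarStepConfig in a StepConfig; the S3 script location and argument are hypothetical placeholders:

```java
StepFactory stepFactory = new StepFactory();

// Run a shell script stored in S3 on the master node.
// Bucket, key, and the "--verbose" argument are hypothetical.
StepConfig runSetupScript = new StepConfig()
    .withName("Run Setup Script")
    .withActionOnFailure("CANCEL_AND_WAIT")
    .withHadoopJarStep(stepFactory.newScriptRunnerStep(
        "s3://my-bucket/scripts/setup.sh", // script to run
        "--verbose"));                     // argument passed to the script
```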
public HadoopJarStepConfig newEnableDebuggingStep()

When run as the first step in your job flow, enables the Hadoop debugging UI in the AWS Management Console.

public HadoopJarStepConfig newInstallHiveStep(StepFactory.HiveVersion... hiveVersions)

Step that installs the specified versions of Hive on your job flow.

Parameters:
hiveVersions - the versions of Hive to install

public HadoopJarStepConfig newInstallHiveStep(String... hiveVersions)

Step that installs the specified versions of Hive on your job flow.

Parameters:
hiveVersions - the versions of Hive to install

public HadoopJarStepConfig newInstallHiveStep()

Step that installs the default version of Hive on your job flow.
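A minimal sketch of the String overload; "0.7.1" is illustrative and must correspond to a Hive version actually available to your job flow:

```java
StepFactory stepFactory = new StepFactory();

// Install a specific Hive version instead of the default.
// The version string is illustrative, not a guaranteed-available version.
StepConfig installHive = new StepConfig()
    .withName("Install Hive 0.7.1")
    .withActionOnFailure("TERMINATE_JOB_FLOW")
    .withHadoopJarStep(stepFactory.newInstallHiveStep("0.7.1"));
```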
public HadoopJarStepConfig newRunHiveScriptStepVersioned(String script, String hiveVersion, String... scriptArgs)

Step that runs a Hive script on your job flow using the specified Hive version.

Parameters:
script - The script to run.
hiveVersion - The Hive version to use.
scriptArgs - Arguments that get passed to the script.

public HadoopJarStepConfig newRunHiveScriptStep(String script, String... args)

Step that runs a Hive script on your job flow using the default Hive version.

Parameters:
script - The script to run.
args - Arguments that get passed to the script.
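A minimal sketch pinning the Hive version for a script step; the script location, version string, and the -d variable argument are all hypothetical placeholders:

```java
StepFactory stepFactory = new StepFactory();

// Run a Hive script under a specific Hive version.
// The S3 paths, version, and -d argument are hypothetical.
StepConfig runHiveScript = new StepConfig()
    .withName("Run Hive Script")
    .withActionOnFailure("CANCEL_AND_WAIT")
    .withHadoopJarStep(stepFactory.newRunHiveScriptStepVersioned(
        "s3://my-bucket/scripts/daily-report.hql", // Hive script to run
        "0.7.1",                                   // Hive version to use
        "-d", "INPUT=s3://my-bucket/input/"));     // arguments passed to the script
```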
public HadoopJarStepConfig newInstallPigStep()

Step that installs the default version of Pig on your job flow.

public HadoopJarStepConfig newInstallPigStep(String... pigVersions)

Step that installs Pig on your job flow.

Parameters:
pigVersions - the versions of Pig to install.

public HadoopJarStepConfig newRunPigScriptStep(String script, String pigVersion, String... scriptArgs)

Step that runs a Pig script on your job flow using the specified Pig version.

Parameters:
script - The script to run.
pigVersion - The Pig version to use.
scriptArgs - Arguments that get passed to the script.

public HadoopJarStepConfig newRunPigScriptStep(String script, String... scriptArgs)

Step that runs a Pig script on your job flow using the default Pig version.

Parameters:
script - The script to run.
scriptArgs - Arguments that get passed to the script.
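A minimal sketch combining an install step with a script step, since Pig has to be installed on the job flow before a Pig script can run; the script location is a hypothetical placeholder:

```java
StepFactory stepFactory = new StepFactory();

// Install the default Pig version first.
StepConfig installPig = new StepConfig()
    .withName("Install Pig")
    .withActionOnFailure("TERMINATE_JOB_FLOW")
    .withHadoopJarStep(stepFactory.newInstallPigStep());

// Then run a Pig script using the default Pig version.
// The S3 path is a hypothetical placeholder.
StepConfig runPigScript = new StepConfig()
    .withName("Run Pig Script")
    .withActionOnFailure("CANCEL_AND_WAIT")
    .withHadoopJarStep(stepFactory.newRunPigScriptStep(
        "s3://my-bucket/scripts/wordcount.pig"));
```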