

# Tune Multiple Algorithms with Hyperparameter Optimization to Find the Best Model
<a name="multiple-algorithm-hpo"></a>

To create a new hyperparameter optimization (HPO) job with Amazon SageMaker AI that tunes multiple algorithms, you must provide job settings that apply to all of the algorithms to be tested and a training definition for each of these algorithms. You must also specify the resources you want to use for the tuning job.
+ The **job settings** to configure include warm starting, early stopping, and the tuning strategy. Warm starting and early stopping are available only when tuning a single algorithm.
+ The **training job definition** specifies the name, algorithm source, objective metric, and the range of values, when required, to configure the set of hyperparameter values for each training job. It configures the channels for data inputs, data output locations, and any checkpoint storage locations for each training job. In addition, the definition configures the resources to deploy for each training job, including instance types and counts, managed spot training, and stopping conditions.
+ The **tuning job resources** to deploy, including the maximum number of training jobs that a hyperparameter tuning job can run concurrently and the maximum number of training jobs that it can run in total.

## Get Started
<a name="multiple-algorithm-hpo-get-started"></a>

You can create a new hyperparameter tuning job, clone a job, and add or edit tags for a job from the console. You can also use the search feature to find jobs by their name, creation time, or status. Alternatively, you can create hyperparameter tuning jobs with the SageMaker AI API.
+ **In the console**: To create a new job, open the Amazon SageMaker AI console at [https://console.aws.amazon.com/sagemaker/](https://console.aws.amazon.com/sagemaker/), choose **Hyperparameter tuning jobs** from the **Training** menu, and then choose **Create hyperparameter tuning job**. Then follow the configuration steps to create a training job definition for each algorithm that you want to use. These steps are documented in the [Create a Hyperparameter Optimization Tuning Job for One or More Algorithms (Console)](multiple-algorithm-hpo-create-tuning-jobs.md) topic.
**Note**  
When you start the configuration steps, note that the warm start and early stopping features are not available to use with multi-algorithm HPO. If you want to use these features, you can only tune a single algorithm at a time. 
+ **With the API**: For instructions on using the SageMaker AI API to create a hyperparameter tuning job, see [Example: Hyperparameter Tuning Job](automatic-model-tuning-ex.html). When you call [CreateHyperParameterTuningJob](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateHyperParameterTuningJob.html) to tune multiple algorithms, you must provide a list of training definitions using [TrainingJobDefinitions](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateHyperParameterTuningJob.html#sagemaker-CreateHyperParameterTuningJob-request-TrainingJobDefinitions) instead of specifying a single [TrainingJobDefinition](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateHyperParameterTuningJob.html#sagemaker-CreateHyperParameterTuningJob-request-TrainingJobDefinition). Choose only one of these definition types, depending on the number of algorithms that are being tuned. You must also provide job settings that apply to all of the algorithms to be tested, and specify the resources that you want to use for the tuning job.
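The choice between the two definition fields can be sketched as a small request builder. This is a minimal illustration rather than a complete request: the helper name `build_tuning_request`, the job name, and the two abbreviated definitions are assumptions made for this example, and a real definition needs further fields such as `AlgorithmSpecification`, `RoleArn`, and `OutputDataConfig`.

```python
# Abbreviated, hypothetical definitions. A real definition also needs
# AlgorithmSpecification, RoleArn, InputDataConfig, OutputDataConfig,
# ResourceConfig, and StoppingCondition.
xgb_definition = {
    "DefinitionName": "estimator-1",
    "TuningObjective": {"Type": "Maximize", "MetricName": "validation:auc"},
}
ll_definition = {
    "DefinitionName": "estimator-2",
    "TuningObjective": {
        "Type": "Maximize",
        "MetricName": "test:binary_classification_accuracy",
    },
}


def build_tuning_request(job_name, definitions, max_jobs, max_parallel_jobs):
    """Assemble keyword arguments for CreateHyperParameterTuningJob."""
    if not definitions:
        raise ValueError("at least one training job definition is required")
    request = {
        "HyperParameterTuningJobName": job_name,
        # Job settings shared by every algorithm in the tuning job.
        "HyperParameterTuningJobConfig": {
            "Strategy": "Bayesian",
            "ResourceLimits": {
                "MaxNumberOfTrainingJobs": max_jobs,
                "MaxParallelTrainingJobs": max_parallel_jobs,
            },
        },
    }
    # Use exactly one of the two definition fields.
    if len(definitions) == 1:
        request["TrainingJobDefinition"] = definitions[0]
    else:
        request["TrainingJobDefinitions"] = list(definitions)
    return request


multi_request = build_tuning_request(
    "multi-algo-hpo-demo", [xgb_definition, ll_definition], 10, 3
)
```

Once the definitions are complete, a request like this could be passed to a boto3 SageMaker client as `client.create_hyper_parameter_tuning_job(**multi_request)`.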

**Topics**
+ [Get Started](#multiple-algorithm-hpo-get-started)
+ [Create a Hyperparameter Optimization Tuning Job for One or More Algorithms (Console)](multiple-algorithm-hpo-create-tuning-jobs.md)
+ [Manage Hyperparameter Tuning and Training Jobs](multiple-algorithm-hpo-manage-tuning-jobs.md)

# Create a Hyperparameter Optimization Tuning Job for One or More Algorithms (Console)
<a name="multiple-algorithm-hpo-create-tuning-jobs"></a>

This guide shows you how to create a new hyperparameter optimization (HPO) tuning job for one or more algorithms. To create an HPO job, define the settings for the tuning job, and create training job definitions for each algorithm being tuned. Next, configure the resources for and create the tuning job. The following sections provide details about how to complete each step. An example of how to tune multiple algorithms using the SageMaker AI Python SDK is provided at the end of this guide.

## Components of a tuning job
<a name="multiple-algorithm-hpo-create-tuning-jobs-define-settings"></a>

An HPO tuning job contains the following three components:
+ Tuning job settings
+ Training job definitions
+ Tuning job configuration

The way that these components are included in your HPO tuning job depends on whether your tuning job contains one or multiple training algorithms. The following guide describes each of the components and gives an example of both types of tuning jobs.

### Tuning job settings
<a name="multiple-algorithm-hpo-create-tuning-jobs-components-tuning-settings"></a>

Your tuning job settings are applied across all of the algorithms in the HPO tuning job. Warm start and early stopping are available only when you're tuning a single algorithm. After you define the job settings, you can create individual training definitions for each algorithm or variation that you want to tune. 

**Warm start**  
If you cloned this job, you can use the results from a previous tuning job to improve the performance of this new tuning job. This is the warm start feature, and it's only available when tuning a single algorithm. With the warm start option, you can choose up to five previous hyperparameter tuning jobs to use. Alternatively, you can use transfer learning to add additional data to the parent tuning job. When you select this option, you choose one previous tuning job as the parent. 

**Note**  
Warm start is compatible only with tuning jobs that were created after October 1, 2018. For more information, see [Run a warm start job](automatic-model-tuning-considerations.html).

**Early stopping**  
To reduce compute time and avoid overfitting your model, you can stop training jobs early. Early stopping is helpful when the training job is unlikely to improve the current best objective metric of the hyperparameter tuning job. Like warm start, this feature is only available when tuning a single algorithm. This is an automatic feature without configuration options, and it’s disabled by default. For more information about how early stopping works, the algorithms that support it, and how to use it with your own algorithms, see [Stop Training Jobs Early](automatic-model-tuning-early-stopping.html).

**Tuning strategy**  
The tuning strategy can be random, Bayesian, or Hyperband. These selections specify how automatic tuning algorithms search the hyperparameter ranges that you select in a later step. Random search chooses random combinations of values from the specified ranges and can be run sequentially or in parallel. Bayesian optimization chooses values based on what is likely to get the best result according to the known history of previous selections. Hyperband uses a multi-fidelity strategy that dynamically allocates resources toward well-utilized jobs and automatically stops those that underperform. The new configuration that starts after stopping other configurations is chosen randomly.

Hyperband can only be used with iterative algorithms, or algorithms that run steps in iterations, such as [XGBoost](https://docs.aws.amazon.com/sagemaker/latest/dg/xgboost.html) or [Random Cut Forest](https://docs.aws.amazon.com/sagemaker/latest/dg/randomcutforest.html). Hyperband can't be used with non-iterative algorithms, such as decision trees or [k-Nearest Neighbors](https://docs.aws.amazon.com/sagemaker/latest/dg/k-nearest-neighbors.html). For more information about search strategies, see [How Hyperparameter Tuning Works](automatic-model-tuning-how-it-works.html).

**Note**  
Hyperband uses an advanced internal mechanism to apply early stopping. Therefore, when you use the Hyperband internal early stopping feature, the `TrainingJobEarlyStoppingType` parameter in `HyperParameterTuningJobConfig` must be set to `Off`.
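As a rough sketch of the note above, the following helper builds a `HyperParameterTuningJobConfig` fragment for the Hyperband strategy with early stopping turned off. The helper name `hyperband_job_config` and the resource values are assumptions for illustration. `MinResource` and `MaxResource` are the fields of `HyperbandStrategyConfig`.

```python
def hyperband_job_config(max_jobs, max_parallel_jobs, min_resource, max_resource):
    """Build a tuning job config that uses the Hyperband strategy."""
    if min_resource > max_resource:
        raise ValueError("min_resource must not exceed max_resource")
    return {
        "Strategy": "Hyperband",
        "StrategyConfig": {
            "HyperbandStrategyConfig": {
                "MinResource": min_resource,
                "MaxResource": max_resource,
            }
        },
        # Hyperband applies its own early stopping, so the separate
        # early stopping setting stays off.
        "TrainingJobEarlyStoppingType": "Off",
        "ResourceLimits": {
            "MaxNumberOfTrainingJobs": max_jobs,
            "MaxParallelTrainingJobs": max_parallel_jobs,
        },
    }


# Illustrative values only.
hyperband_config = hyperband_job_config(
    max_jobs=20, max_parallel_jobs=4, min_resource=1, max_resource=30
)
```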

**Tags**  
To help you manage tuning jobs, you can enter tags as key-value pairs to assign metadata to tuning jobs. Values in the key-value pair are not required. You can use the key without values. To see the keys associated with a job, choose the **Tags** tab on the details page for the tuning job. For more information about using tags for tuning jobs, see [Manage Hyperparameter Tuning and Training Jobs](multiple-algorithm-hpo-manage-tuning-jobs.md).

### Training job definitions
<a name="multiple-algorithm-hpo-create-tuning-jobs-training-definitions"></a>

To create a training job definition, you must configure the algorithm and parameters, define the data input and output, and configure resources. Provide at least one [https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_TrainingJobDefinition.html](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_TrainingJobDefinition.html) for each HPO tuning job. Each training definition specifies the configuration for an algorithm.

To create several definitions for your training job, you can clone a job definition. Cloning a job can save time because it copies all of the job settings, including data channels and Amazon S3 storage locations for output artifacts. You can edit a cloned job to change what you need for your use case.

**Topics**
+ [Configure algorithm and parameters](#multiple-algorithm-hpo-algorithm-configuration)
+ [Define data input and output](#multiple-algorithm-hpo-data)
+ [Configure training job resources](#multiple-algorithm-hpo-training-job-definition-resources)
+ [Add or clone a training job](#multiple-algorithm-hpo-add-training-job)

#### Configure algorithm and parameters
<a name="multiple-algorithm-hpo-algorithm-configuration"></a>

The following list describes what you need to configure the set of hyperparameter values for each training job.
+ A name for your training job definition
+ Permission to access services
+ Parameters for any algorithm options
+ An objective metric
+ The range of hyperparameter values, when required

**Name**  
 Provide a unique name for the training definition. 

**Permissions**  
 Amazon SageMaker AI requires permissions to call other services on your behalf. Choose an AWS Identity and Access Management (IAM) role, or let AWS create a role with the `AmazonSageMakerFullAccess` IAM policy attached. 

**Optional security settings**  
 The network isolation setting prevents the container from making any outbound network calls. This is required for AWS Marketplace machine learning offerings. 

 You can also choose to use a virtual private cloud (VPC).

**Note**  
 Inter-container encryption is only available when you create a job definition from the API. 

**Algorithm options**  
You can choose built-in algorithms, your own algorithm, your own container with an algorithm, or you can subscribe to an algorithm from AWS Marketplace. 
+ If you choose a built-in algorithm, it has the Amazon Elastic Container Registry (Amazon ECR) image information pre-populated.
+ If you choose your own container, you must specify the Amazon ECR image information. You can select the input mode for the algorithm as file or pipe.
+ If you plan to supply your data using a CSV file from Amazon S3, select the file input mode.

**Metrics**  
When you choose a built-in algorithm, metrics are provided for you. If you choose your own algorithm, you must define your metrics. You can define up to 20 metrics for your tuning job to monitor. You must choose one metric as the objective metric. For more information about how to define a metric for a tuning job, see [Define metrics](automatic-model-tuning-define-metrics-variables.md#automatic-model-tuning-define-metrics).
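To illustrate how a custom metric definition works, the following sketch applies `Name` and `Regex` pairs to a log line with Python's `re` module. The log format, the metric names, and the `extract_metrics` helper are assumptions made for this example. Your own algorithm's log output determines the regular expressions you need.

```python
import re

# Hypothetical metric definitions for a custom algorithm. Each regex has
# one capture group that isolates the numeric value in the log output.
metric_definitions = [
    {"Name": "train:loss", "Regex": r"train_loss=([0-9\.]+)"},
    {"Name": "validation:accuracy", "Regex": r"val_accuracy=([0-9\.]+)"},
]

# Hypothetical line written by the training script.
sample_log = "epoch=3 train_loss=0.412 val_accuracy=0.873"


def extract_metrics(log_line, definitions):
    """Return the metric values that the regexes capture from one log line."""
    found = {}
    for definition in definitions:
        match = re.search(definition["Regex"], log_line)
        if match:
            found[definition["Name"]] = float(match.group(1))
    return found


metrics = extract_metrics(sample_log, metric_definitions)
```

Testing a regex locally like this before you start a tuning job can catch patterns that never match your logs.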

**Objective metric**  
To find the best training job, set an objective metric and whether to maximize or minimize it. After the tuning job is complete, you can view the tuning job detail page. The detail page provides a summary of the best training job that is found using this objective metric.

**Hyperparameter configuration**  
When you choose a built-in algorithm, the default values for its hyperparameters are set for you, using ranges that are optimized for the algorithm that's being tuned. You can change these values as you see fit. For example, instead of a range, you can set a fixed value for a hyperparameter by setting the parameter’s type to **static**. Each algorithm has different required and optional parameters. For more information, see [Best Practices for Hyperparameter Tuning](automatic-model-tuning-considerations.html) and [Define Hyperparameter Ranges](automatic-model-tuning-define-ranges.html). 
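Outside the console, the same choice between a tunable range and a static value can be sketched with the low-level API structures. In this illustration, the `numeric_range` helper is an assumption, the hyperparameter names are taken from the XGBoost example later in this guide, and the API expects range bounds and static values as strings.

```python
def numeric_range(name, low, high, scaling="Auto"):
    """Describe an integer or continuous range in the API's string format."""
    if low > high:
        raise ValueError("low must not exceed high for {}".format(name))
    return {
        "Name": name,
        "MinValue": str(low),
        "MaxValue": str(high),
        "ScalingType": scaling,
    }


# Ranges that the tuning job searches.
parameter_ranges = {
    "IntegerParameterRanges": [numeric_range("max_depth", 1, 10)],
    "ContinuousParameterRanges": [numeric_range("eta", 0.1, 0.3)],
    "CategoricalParameterRanges": [],
}

# Fixed values, equivalent to setting a parameter's type to static.
static_hyperparameters = {
    "objective": "binary:logistic",
    "num_round": "100",
}
```

In a training job definition, these two dictionaries would go under `HyperParameterRanges` and `StaticHyperParameters`.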

#### Define data input and output
<a name="multiple-algorithm-hpo-data"></a>

Each training job definition for a tuning job must configure the channels for data inputs, data output locations, and optionally, any checkpoint storage locations for each training job. 

**Input data configuration**  
Input data is defined by channels. Each channel has its own source location (Amazon S3 or Amazon Elastic File System), compression, and format options. You can define up to 20 channels of input sources. If the algorithm that you choose supports multiple input channels, you can specify those, too. For example, when you use the [XGBoost churn prediction notebook](https://sagemaker-examples.readthedocs.io/en/latest/introduction_to_applying_machine_learning/xgboost_customer_churn/xgboost_customer_churn.html), you can add two channels: train and validation.

**Checkpoint configuration**  
Checkpoints are periodically generated during training. For the checkpoints to be saved, you must choose an Amazon S3 location. Checkpoints are used in metrics reporting, and are also used to resume managed spot training jobs. For more information, see [Checkpoints in Amazon SageMaker AI](model-checkpoints.md).

**Output data configuration**  
Define an Amazon S3 location for the artifacts of the training job to be stored. You have the option of adding encryption to the output using an AWS Key Management Service (AWS KMS) key. 

#### Configure training job resources
<a name="multiple-algorithm-hpo-training-job-definition-resources"></a>

Each training job definition for a tuning job must configure the resources to deploy, including instance types and counts, managed spot training, and stopping conditions.

**Resource configuration**  
Each training definition can have a different resource configuration. You choose the instance type and number of nodes. 

**Managed spot training**  
If you have flexibility in start and end times, you can save compute costs by allowing SageMaker AI to use spare capacity to run jobs. For more information, see [Managed Spot Training in Amazon SageMaker AI](model-managed-spot-training.md).

**Stopping condition**  
The stopping condition specifies the maximum duration that's allowed for each training job. 
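The resource settings described in this section map to a handful of fields in a training job definition. The following is a minimal sketch under stated assumptions: the `resource_settings` helper, the 30 GB volume size, and the S3 URI are illustrative. For managed spot training, the maximum wait time must be at least as long as the maximum runtime.

```python
def resource_settings(instance_type, instance_count, max_runtime_s,
                      use_spot=False, max_wait_s=None, checkpoint_s3_uri=None):
    """Build the resource-related part of a training job definition."""
    settings = {
        "ResourceConfig": {
            "InstanceType": instance_type,
            "InstanceCount": instance_count,
            "VolumeSizeInGB": 30,  # illustrative size
        },
        "StoppingCondition": {"MaxRuntimeInSeconds": max_runtime_s},
        "EnableManagedSpotTraining": use_spot,
    }
    if use_spot:
        # The wait time covers both waiting for capacity and training.
        if max_wait_s is None or max_wait_s < max_runtime_s:
            raise ValueError("max_wait_s must be >= max_runtime_s for spot")
        settings["StoppingCondition"]["MaxWaitTimeInSeconds"] = max_wait_s
        if checkpoint_s3_uri:
            # Checkpoints let an interrupted spot job resume.
            settings["CheckpointConfig"] = {"S3Uri": checkpoint_s3_uri}
    return settings


spot_settings = resource_settings(
    "ml.m4.xlarge", 1, max_runtime_s=3600, use_spot=True, max_wait_s=7200,
    checkpoint_s3_uri="s3://amzn-s3-demo-bucket/checkpoints",  # placeholder
)
```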

#### Add or clone a training job
<a name="multiple-algorithm-hpo-add-training-job"></a>

After you create a training job definition for a tuning job, you return to the **Training Job Definition(s)** panel. This panel is where you can create additional training job definitions to train additional algorithms. Select **Add training job definition** and work through the steps to define a training job again.

Alternatively, to replicate an existing training job definition and edit it for the new algorithm, choose **Clone** from the **Action** menu. The clone option can save time because it copies all of the job’s settings, including the data channels and Amazon S3 storage locations. For more information about cloning, see [Manage Hyperparameter Tuning and Training Jobs](multiple-algorithm-hpo-manage-tuning-jobs.md).

### Tuning job configuration
<a name="multiple-algorithm-hpo-resource-config"></a>

**Resource Limits**  
You can specify the maximum number of training jobs that a hyperparameter tuning job can run concurrently (10 at most). You can also specify the maximum number of training jobs that the hyperparameter tuning job can run in total (500 at most). The number of parallel jobs should not exceed the number of nodes that you have requested across all of your training definitions. The total number of jobs can’t exceed the number of jobs that your definitions are expected to run.

Review the job settings, the training job definitions, and the resource limits. Then select **Create hyperparameter tuning job**.

## HPO tuning job example
<a name="multiple-algorithm-hpo-create-tuning-jobs-define-example"></a>

To run a hyperparameter optimization (HPO) tuning job, first create a training job definition for each algorithm that's being tuned. Next, define the tuning job settings and configure the resources for the tuning job. Finally, run the tuning job.

If your HPO tuning job contains a single training algorithm, you instantiate the `HyperparameterTuner` class directly and pass in your parameters. If your HPO tuning job contains multiple training algorithms, you call the `create` method of the `HyperparameterTuner` class instead. The `create` method expects a dictionary containing one or more estimators.

In the following section, code examples show how to tune a job containing either a single training algorithm or multiple algorithms using the SageMaker AI Python SDK.

### Create training job definitions
<a name="multiple-algorithm-hpo-create-tuning-jobs-define-example-train"></a>

When you create a tuning job that includes multiple training algorithms, your tuning job configuration will include the estimators and metrics and other parameters for your training jobs. Therefore, you need to create the training job definition first, and then configure your tuning job. 

The following code example shows how to retrieve two SageMaker AI containers containing the built-in algorithms [XGBoost](https://docs.aws.amazon.com/sagemaker/latest/dg/xgboost.html) and [Linear Learner](https://docs.aws.amazon.com/sagemaker/latest/dg/linear-learner.html). If your tuning job contains only one training algorithm, omit one of the containers and one of the estimators.

```
import sagemaker
from sagemaker import image_uris

from sagemaker.estimator import Estimator

sess = sagemaker.Session()
region = sess.boto_region_name
role = sagemaker.get_execution_role()

bucket = sess.default_bucket()
prefix = "sagemaker/multi-algo-hpo"

# Define the training containers and initialize the estimators
xgb_container = image_uris.retrieve("xgboost", region, "latest")
ll_container = image_uris.retrieve("linear-learner", region, "latest")

xgb_estimator = Estimator(
    xgb_container,
    role=role,
    instance_count=1,
    instance_type="ml.m4.xlarge",
    output_path="s3://{}/{}/xgb_output".format(bucket, prefix),
    sagemaker_session=sess,
)

ll_estimator = Estimator(
    ll_container,
    role,
    instance_count=1,
    instance_type="ml.c4.xlarge",
    output_path="s3://{}/{}/ll_output".format(bucket, prefix),
    sagemaker_session=sess,
)

# Set static hyperparameters
ll_estimator.set_hyperparameters(predictor_type="binary_classifier")
xgb_estimator.set_hyperparameters(
    eval_metric="auc",
    objective="binary:logistic",
    num_round=100,
    rate_drop=0.3,
    tweedie_variance_power=1.4,
)
```

Next, define your input data by specifying the training, validation, and testing datasets, as shown in the following code example. This example shows how to tune multiple training algorithms.

```
training_data = sagemaker.inputs.TrainingInput(
    s3_data="s3://{}/{}/train".format(bucket, prefix), content_type="csv"
)
validation_data = sagemaker.inputs.TrainingInput(
    s3_data="s3://{}/{}/validate".format(bucket, prefix), content_type="csv"
)
test_data = sagemaker.inputs.TrainingInput(
    s3_data="s3://{}/{}/test".format(bucket, prefix), content_type="csv"
)

train_inputs = {
    "estimator-1": {
        "train": training_data,
        "validation": validation_data,
        "test": test_data,
    },
    "estimator-2": {
        "train": training_data,
        "validation": validation_data,
        "test": test_data,
    },
}
```

If your tuning job contains only one training algorithm, your `train_inputs` should contain only one estimator.

You must upload the inputs for the training, validation, and test datasets to your Amazon S3 bucket before you use them in an HPO tuning job.
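One way to perform that upload is the `upload_data` method of a SageMaker AI Python SDK session. The sketch below assumes hypothetical local file names and an `upload_splits` helper that is not part of the SDK. It stores each file under a key prefix that matches the `train`, `validate`, and `test` locations used in the preceding example.

```python
def upload_splits(session, local_files, bucket, prefix):
    """Upload one local file per dataset split and return the S3 URIs.

    `session` is expected to behave like sagemaker.Session, whose
    upload_data method returns the S3 URI of the uploaded file.
    """
    uris = {}
    for split, path in local_files.items():
        uris[split] = session.upload_data(
            path=path,
            bucket=bucket,
            key_prefix="{}/{}".format(prefix, split),
        )
    return uris


# Hypothetical local files, one per split.
local_files = {
    "train": "train.csv",
    "validate": "validate.csv",
    "test": "test.csv",
}
```

With the session from the earlier example, a call such as `upload_splits(sess, local_files, bucket, prefix)` would place the files where the `TrainingInput` objects expect them.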

### Define resources and settings for your tuning job
<a name="multiple-algorithm-hpo-create-tuning-jobs-define-example-resources"></a>

This section shows how to initialize a tuner, define resources, and specify job settings for your tuning job. If your tuning job contains multiple training algorithms, these settings are applied to all of the algorithms that are contained inside your tuning job. This section provides two code examples to define a tuner. The code examples show you how to optimize a single training algorithm followed by an example of how to tune multiple training algorithms.

#### Tune a single training algorithm
<a name="multiple-algorithm-hpo-create-tuning-jobs-define-example-resources-single"></a>

The following code example shows how to initialize a tuner and set hyperparameter ranges for one SageMaker AI built-in algorithm, XGBoost.

```
from sagemaker.tuner import HyperparameterTuner
from sagemaker.parameter import ContinuousParameter, IntegerParameter

hyperparameter_ranges = {
    "max_depth": IntegerParameter(1, 10),
    "eta": ContinuousParameter(0.1, 0.3),
}

objective_metric_name = "validation:accuracy"

tuner = HyperparameterTuner(
    xgb_estimator,
    objective_metric_name,
    hyperparameter_ranges,
    objective_type="Maximize",
    max_jobs=5,
    max_parallel_jobs=2,
)
```

#### Tune multiple training algorithms
<a name="multiple-algorithm-hpo-create-tuning-jobs-define-example-resources-multiple"></a>

Each training job requires different configurations, and these are specified using a dictionary. The following code example shows how to initialize a tuner with configurations for two SageMaker AI built-in algorithms, XGBoost and Linear Learner. The code example also shows how to set a tuning strategy and other job settings, such as the compute resources for the tuning job. The following code example uses `metric_definitions_dict`, which is optional.

```
from sagemaker.tuner import HyperparameterTuner
from sagemaker.parameter import ContinuousParameter, IntegerParameter

# Initialize your tuner
tuner = HyperparameterTuner.create(
    estimator_dict={
        "estimator-1": xgb_estimator,
        "estimator-2": ll_estimator,
    },
    objective_metric_name_dict={
        "estimator-1": "validation:auc",
        "estimator-2": "test:binary_classification_accuracy",
    },
    hyperparameter_ranges_dict={
        "estimator-1": {"eta": ContinuousParameter(0.1, 0.3)},
        "estimator-2": {"learning_rate": ContinuousParameter(0.1, 0.3)},
    },
    metric_definitions_dict={
        "estimator-1": [
            {"Name": "validation:auc", "Regex": "Overall test accuracy: (.*?);"}
        ],
        "estimator-2": [
            {
                "Name": "test:binary_classification_accuracy",
                "Regex": "Overall test accuracy: (.*?);",
            }
        ],
    },
    strategy="Bayesian",
    max_jobs=10,
    max_parallel_jobs=3,
)
```

### Run your HPO tuning job
<a name="multiple-algorithm-hpo-create-tuning-jobs-define-example-run"></a>

Now you can run your tuning job by passing your training inputs to the `fit` function of the `HyperparameterTuner` class. The following code example shows how to pass the `train_inputs` parameter, which was defined in a previous code example, to your tuner.

```
tuner.fit(inputs=train_inputs, include_cls_metadata={}, estimator_kwargs={})
```
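After the tuning job finishes, you can inspect which algorithm and hyperparameter values performed best. The following sketch summarizes the `BestTrainingJob` entry of a `DescribeHyperParameterTuningJob` response. The `summarize_best_training_job` helper and the sample response are assumptions for illustration, and the values shown are made up rather than real results.

```python
def summarize_best_training_job(description):
    """Pick out the key facts about the best training job, if there is one."""
    best = description.get("BestTrainingJob")
    if best is None:
        return None  # no training job has reported the objective metric yet
    metric = best.get("FinalHyperParameterTuningJobObjectiveMetric", {})
    return {
        "training_job": best["TrainingJobName"],
        # Identifies which algorithm's definition produced the job.
        "definition": best.get("TrainingJobDefinitionName"),
        "metric": metric.get("MetricName"),
        "value": metric.get("Value"),
        "hyperparameters": best.get("TunedHyperParameters", {}),
    }


# Hand-written sample in the shape of the API response (not real output).
sample_description = {
    "HyperParameterTuningJobStatus": "Completed",
    "BestTrainingJob": {
        "TrainingJobName": "multi-algo-hpo-demo-004",
        "TrainingJobDefinitionName": "estimator-1",
        "TunedHyperParameters": {"eta": "0.21"},
        "FinalHyperParameterTuningJobObjectiveMetric": {
            "MetricName": "validation:auc",
            "Value": 0.93,
        },
    },
}

best_summary = summarize_best_training_job(sample_description)
```

A real response could come from a boto3 SageMaker client call such as `client.describe_hyper_parameter_tuning_job(HyperParameterTuningJobName=...)`.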

# Manage Hyperparameter Tuning and Training Jobs
<a name="multiple-algorithm-hpo-manage-tuning-jobs"></a>

A tuning job can contain many training jobs, and creating and managing these jobs and their definitions can become complex and onerous. SageMaker AI provides tools to help you manage these jobs. You can access the tuning jobs that you have run from the Amazon SageMaker AI console at [https://console.aws.amazon.com/sagemaker/](https://console.aws.amazon.com/sagemaker/). Choose **Hyperparameter tuning jobs** from the **Training** menu to see the list. This page is also where you start the procedure to create a new tuning job by choosing **Create hyperparameter tuning job**.

To see the training jobs that were run as part of a tuning job, select one of the hyperparameter tuning jobs from the list. The tabs on the tuning job page allow you to inspect the training jobs, their definitions, the tags and configuration used for the tuning job, and the best training job found during tuning. You can select the best training job or any of the other training jobs that belong to the tuning job to see all of their settings. From here, you can create a model that uses the hyperparameter values found by a training job by selecting **Create Model**, or you can clone the training job by selecting **Clone**.

**Cloning**  
You can save time by cloning a training job that belongs to a hyperparameter tuning job. Cloning copies all of the job’s settings, including data channels and Amazon S3 storage locations for output artifacts. You can do this for training jobs you have already run from the tuning job page, as just described, or when you are creating additional training job definitions while creating a hyperparameter tuning job, as described in the [Add or clone a training job](multiple-algorithm-hpo-create-tuning-jobs.md#multiple-algorithm-hpo-add-training-job) step of that procedure.

**Tagging**  
Automatic Model Tuning launches multiple training jobs within a single parent tuning job to discover the ideal weighting of model hyperparameters. You can add tags to the parent tuning job as described in the [Components of a tuning job](multiple-algorithm-hpo-create-tuning-jobs.md#multiple-algorithm-hpo-create-tuning-jobs-define-settings) section, and these tags are then propagated to the individual training jobs underneath. You can use these tags for purposes such as cost allocation or access control. To add tags using the SageMaker AI SDK, use the [AddTags](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_AddTags.html) API. For more information about using tagging for AWS resources, see [Tagging AWS resources](https://docs.aws.amazon.com/general/latest/gr/aws_tagging.html).
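As a brief sketch of tagging through the API, the following helpers convert a plain dictionary into the `Key` and `Value` list that `AddTags` expects and then send it with a client. The helper names and the tag keys are assumptions for illustration, and `sm_client` is expected to be a boto3 SageMaker client created elsewhere.

```python
def to_tag_list(tags):
    """Convert {"key": "value"} pairs into the AddTags list format.

    A key without a value is sent with an empty string as its value.
    """
    return [
        {"Key": key, "Value": "" if value is None else str(value)}
        for key, value in tags.items()
    ]


def tag_tuning_job(sm_client, tuning_job_arn, tags):
    """Attach tags to a tuning job identified by its ARN."""
    return sm_client.add_tags(ResourceArn=tuning_job_arn, Tags=to_tag_list(tags))


# Hypothetical tags for cost allocation.
example_tags = to_tag_list({"team": "ml-platform", "experiment": None})
```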