

# Flywheels
<a name="flywheels"></a>

An Amazon Comprehend *flywheel* simplifies the process of improving a custom model over time. You can use a flywheel to orchestrate the tasks associated with training and evaluating new custom model versions. Flywheels support plain text custom models for custom classification and custom entity recognition.

**Topics**
+ [Flywheel overview](flywheels-about.md)
+ [Flywheel data lakes](flywheels-datalake.md)
+ [IAM policies and permissions](flywheels-permissions.md)
+ [Configuring flywheels using the console](flywheels-config-console.md)
+ [Configuring flywheels using the API](flywheels-config-api.md)
+ [Configuring datasets](datasets-config.md)
+ [Flywheel iterations](flywheels-iterate.md)
+ [Using flywheels for analysis](flywheels-inference.md)

# Flywheel overview
<a name="flywheels-about"></a>

A *flywheel* is an Amazon Comprehend resource that orchestrates the training and evaluation of new versions of a custom model. You can create a flywheel to use an existing trained model, or Amazon Comprehend can create and train a new model for the flywheel. Use flywheels with plain-text custom models for custom classification or custom entity recognition.

You can configure and manage flywheels using the Amazon Comprehend console or API. You can also configure flywheels using CloudFormation.

When you create a flywheel, Amazon Comprehend creates a *data lake* in your account. The [data lake](flywheels-datalake.md) stores and manages all the flywheel data, such as the training data and test data for all versions of the model.

You set the *active model version* to be the version of the flywheel model that you want to use for inference jobs or Amazon Comprehend endpoints. Initially, the flywheel contains one version of the model. Over time, as you train new model versions, you select the best-performing version to be the active model version. When a user specifies the flywheel ARN to run an inference job, Amazon Comprehend runs the job using the flywheel's active model version. 

Periodically, you obtain new labeled data (training data or test data) for the model. You make new data available to the flywheel by creating one or more *datasets*. A dataset contains input data for training or testing the custom model associated with a flywheel. Amazon Comprehend uploads the input data to the flywheel's data lake.

To incorporate the new datasets into your custom model, you create and run a flywheel *iteration*. A flywheel iteration is a workflow that uses the new datasets to evaluate the active model version and to train a new model version. Based on the metrics for the existing and new model versions, you can decide whether to promote the new model version to be the active version.

You can use the flywheel active model version to run custom analysis (real time or asynchronous jobs). To use the flywheel model for real-time analysis, you must create an [endpoint](https://docs.aws.amazon.com/comprehend/latest/dg/manage-endpoints.html) for the flywheel.

There is no additional charge for using flywheels. However, when you run a flywheel iteration, you incur the standard charges for training a new model version and storing the model data. For detailed pricing information, see [Amazon Comprehend Pricing](https://aws.amazon.com/comprehend/pricing).

**Topics**
+ [Flywheel datasets](#flywheels-datasets)
+ [Flywheel creation](#flywheels-about-create)
+ [Flywheel states](#flywheels-about-states)
+ [Flywheel iterations](#flywheels-about-iterations)

## Flywheel datasets
<a name="flywheels-datasets"></a>

To add new labeled data to a flywheel, you create a dataset. You configure each dataset as training data or test data. You associate the dataset with a specific flywheel and custom model. 

After you create a dataset, Amazon Comprehend uploads the data to the flywheel's data lake. For more information, see [Flywheel data lakes](flywheels-datalake.md). 

## Flywheel creation
<a name="flywheels-about-create"></a>

When you create a flywheel, you can associate the flywheel with an existing trained model, or the flywheel can create a new model. 

When you create a flywheel with an existing model, you specify the active model version. Amazon Comprehend copies the model's training data and test data into the flywheel's data lake. Make sure that the model training and test data exist in the same Amazon S3 location as when you created the model. 

To create a flywheel for a new model, you provide a dataset for training data (and an optional dataset for test data) when you create the flywheel. When you run the flywheel to create the first flywheel iteration, the flywheel trains the new model.

 When you train a custom model, you specify a list of custom labels (custom classification) or custom entities (custom entity recognition) for the model to recognize. Note the following important points about custom labels/entities:
+ When you create a flywheel for a new model, the list of labels/entities that you provide during flywheel creation is the final list for the flywheel.
+ When you create a flywheel from an existing model, the list of labels/entities associated with that model becomes the final list for the flywheel.
+ If you associate a new dataset with the flywheel, and that dataset contains additional labels/entities, Amazon Comprehend ignores the new labels/entities.
+ You can review a flywheel's label/entity list using the [DescribeFlywheel](https://docs.aws.amazon.com/comprehend/latest/APIReference/API_DescribeFlywheel.html) API operation.
**Note**  
For custom classification, Amazon Comprehend populates the label list after the flywheel status becomes ACTIVE. Wait until the flywheel is active before calling the DescribeFlywheel API operation. 
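
The wait described in the note can be automated with a small polling helper. The following is a minimal sketch, not an official utility: `wait_until_active` is a hypothetical name, and `client` is assumed to be a Comprehend client such as the one returned by `boto3.client("comprehend")`. The polling interval and retry count are arbitrary illustration values.

```python
import time


def wait_until_active(client, flywheel_arn, poll_seconds=30, max_polls=40, sleep=time.sleep):
    """Poll DescribeFlywheel until the flywheel is ACTIVE, then return its properties.

    `client` is assumed to expose describe_flywheel(FlywheelArn=...), as a
    boto3 Comprehend client does. The function name and defaults are hypothetical.
    """
    for _ in range(max_polls):
        props = client.describe_flywheel(FlywheelArn=flywheel_arn)["FlywheelProperties"]
        status = props["Status"]
        if status == "ACTIVE":
            # The label list is populated once the flywheel is active.
            return props
        if status == "FAILED":
            raise RuntimeError(f"Flywheel creation failed: {props.get('Message', 'no message')}")
        sleep(poll_seconds)
    raise TimeoutError(f"Flywheel {flywheel_arn} did not become ACTIVE in time")
```

After the helper returns, the `TaskConfig` field of the returned properties is where you would look for the label or entity list.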

## Flywheel states
<a name="flywheels-about-states"></a>

A flywheel transitions between the following states: 
+ CREATING - Amazon Comprehend is creating the flywheel resources. You can perform read operations on the flywheel, such as `DescribeFlywheel`.
+ ACTIVE - The flywheel is active. You can determine if a flywheel iteration is in progress and view the status of the iteration. You can perform read actions on the flywheel and actions such as `DeleteFlywheel` and `UpdateFlywheel`.
+ UPDATING - Amazon Comprehend is updating the flywheel. You can perform read operations on the flywheel.
+ DELETING - Amazon Comprehend is deleting the flywheel. You can perform read operations on the flywheel.
+ FAILED - The flywheel create operation failed.

After Amazon Comprehend deletes a flywheel, you retain access to all the model data in the flywheel data lake. Amazon Comprehend deletes all the internal metadata required for managing the flywheel resources. Amazon Comprehend also deletes the datasets associated with this flywheel (the model data is saved in the data lake).

## Flywheel iterations
<a name="flywheels-about-iterations"></a>

When you obtain new training or test data for a flywheel model, you create one or more new datasets to upload the new data to the flywheel's data lake. 

You then run the flywheel to create a new flywheel iteration. The flywheel iteration evaluates the current active model version using the new data and stores the results in the data lake. The flywheel also creates and trains a new model version.

If the new model exhibits better performance than the current active model version, you can promote the new model version to be the active model version. You can use the [console](flywheels-iterate.md#flywheels-iterate-console-promote) or the [UpdateFlywheel](https://docs.aws.amazon.com/comprehend/latest/APIReference/API_UpdateFlywheel.html) API operation to update the active model version.

# Flywheel data lakes
<a name="flywheels-datalake"></a>

When you create a flywheel, Amazon Comprehend creates a data lake in your account to contain all the flywheel data, such as the input and output data required for the model versions. 

Amazon Comprehend creates the data lake in the Amazon S3 location that you specify when you create the flywheel. You can specify the location as an Amazon S3 bucket or as a new folder in an Amazon S3 bucket. 

## Data lake folder structure
<a name="flywheels-datalake-folders"></a>

When Amazon Comprehend creates the data lake, it sets up the following folder structure in the Amazon S3 location.

**Warning**  
Amazon Comprehend manages the data lake folder organization and contents. Always use the Amazon Comprehend API operations to modify the data lake folders, or your flywheel may not operate correctly.

```
  Document Pool
  Annotations Pool
  Staging
  Model Datasets
    (data for each version of the model)
    VersionID-1
      Training
      Test
      ModelStats
    VersionID-2
      Training
      Test
      ModelStats
```

To view the training assessment of a model version, perform these steps: 

1. Open the folder named **Model Datasets** at the root level of the data lake. This folder contains a subfolder for each version of the model. 

1. Open the folder for the model version of interest.

1. Open the folder named **ModelStats** to view the statistics for the model.

## Data lake management
<a name="flywheels-datalake-mgmt"></a>

Amazon Comprehend performs the following tasks to manage the data lake on your behalf:
+ Defines the folder structure of the data lake and ingests datasets into the appropriate folders.
+ Manages the input documents (such as text files and annotation files) required to train the model.
+ Manages the training and evaluation output data associated with each version of the model.
+ Manages encryption for files stored in the data lake.

Amazon Comprehend performs all the data creation and update operations for the data lake. You retain full access to the data in the data lake. For example:
+ You have full access to the contents of the data lake.
+ The data lake remains available after you delete the flywheel.
+ You can configure access logs for the Amazon S3 bucket that contains the data lake.
+ You can provide encryption keys for the data. You specify these when you create the flywheel.

 We recommend the following best practices:
+ Don't manually add your own folders or files into the data lake. Don't modify or delete any files in the data lake.
+ Always use the Amazon Comprehend creation and update operations to add or modify data in the data lake. For example, use `CreateDataset` to provide training or test data and `StartFlywheelIteration` to generate evaluation data for model versions.
+ The data lake structure may evolve over time. Don't create downstream scripts or programs that rely explicitly on the data lake structure. 
+ When you provide a data lake location for the flywheel, we recommend creating a common prefix for data related to all flywheels or using a different prefix for each flywheel. We don't recommend using the complete data lake path of one flywheel as the prefix for another flywheel.
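
The last recommendation can be checked before you create a flywheel. The following sketch flags any data lake location that sits inside another one. The function names are hypothetical and the check is plain string comparison on S3 URIs, so it makes no calls to AWS.

```python
def _normalize(uri):
    # Treat "s3://bucket/a" and "s3://bucket/a/" as the same location.
    return uri.rstrip("/") + "/"


def find_nested_data_lakes(data_lake_uris):
    """Return (outer, inner) pairs where one data lake path is a prefix of another."""
    normalized = [_normalize(uri) for uri in data_lake_uris]
    conflicts = []
    for i, outer in enumerate(normalized):
        for j, inner in enumerate(normalized):
            if i != j and inner.startswith(outer):
                conflicts.append((data_lake_uris[i], data_lake_uris[j]))
    return conflicts
```

An empty result means that no flywheel's data lake path is used as the prefix for another flywheel. Sibling prefixes under a common parent, such as `flywheels/a` and `flywheels/b`, do not conflict.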

# IAM policies and permissions
<a name="flywheels-permissions"></a>

You configure the following policies and permissions to use flywheels: 
+ [Configure IAM user permissions](#flywheels-permissions-iam) for users to access flywheel operations.
+ (Optional) [Configure permissions for AWS KMS keys](#flywheels-permissions-kms) for the data lake.
+ [Create a data access role](#flywheels-permissions-service) that authorizes Amazon Comprehend to access the data lake.

## Configure IAM user permissions
<a name="flywheels-permissions-iam"></a>

To use flywheel capabilities, add appropriate permissions policies to your AWS Identity and Access Management (IAM) identities (users, groups, and roles). 

The following example shows a permissions policy to create datasets, to create and manage flywheels, and to run the flywheel.

**Example IAM policy to manage flywheels**  

```
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "comprehend:CreateFlywheel",
        "comprehend:DeleteFlywheel",
        "comprehend:UpdateFlywheel",
        "comprehend:ListFlywheels",
        "comprehend:DescribeFlywheel",
        "comprehend:CreateDataset",
        "comprehend:DescribeDataset",
        "comprehend:ListDatasets",
        "comprehend:StartFlywheelIteration",
        "comprehend:DescribeFlywheelIteration",
        "comprehend:ListFlywheelIterationHistory"
      ],
      "Resource": "*"
    }
  ]
}
```

For information about creating IAM policies for Amazon Comprehend, see [How Amazon Comprehend works with IAM](security_iam_service-with-iam.md). 

## Configure permissions for AWS KMS keys
<a name="flywheels-permissions-kms"></a>

If you are using AWS KMS keys for your data in the data lake, set up the required permissions. For information, see [Permissions required to use KMS encryption](security_iam_id-based-policy-examples.md#auth-kms-permissions).

## Create a data access role
<a name="flywheels-permissions-service"></a>

You create a data access role in IAM for Amazon Comprehend to access flywheel data in the data lake. If you use the console to create a flywheel, the system can optionally create a new role for this purpose. For more information, see [Role-based permissions required for asynchronous operations](security_iam_id-based-policy-examples.md#auth-role-permissions).

# Configuring flywheels using the console
<a name="flywheels-config-console"></a>

You can use the Amazon Comprehend console to create, update, and delete flywheels. 

When you create a flywheel, Amazon Comprehend creates a data lake to hold all the data that the flywheel needs, such as the training data and test data for each version of the model.

When you delete a flywheel, Amazon Comprehend doesn't delete the data lake or the model associated with the flywheel. 

Review the information in section [Flywheel creation](flywheels-about.md#flywheels-about-create) before you create a new flywheel.

**Topics**
+ [Create a flywheel](#flywheels-config-console-create)
+ [Update a flywheel](#flywheels-update-console)
+ [Delete a flywheel](#flywheels-delete-console)

## Create a flywheel
<a name="flywheels-config-console-create"></a>

When you create a flywheel, the required configuration fields depend on whether the flywheel is for an existing custom model or a new model.

**To create a flywheel**

1. Sign in to the AWS Management Console and open the [Amazon Comprehend console](https://console.aws.amazon.com/comprehend/).

1. From the left menu, choose **Flywheels**.

1. From the **Flywheels** table, choose **Create new flywheel**. 

1. Under **Flywheel name**, enter a name for the flywheel. 

1. (Optional) To create a flywheel for an existing model, configure the fields under **Active model version**.

   1. From the **Model** drop-down list, select a model.

   1. From the **Version** drop-down list, select the model version.

1. (Optional) To create a new classifier model for the flywheel, under **Custom model type**, choose **Custom classification** and configure the parameters in the following steps.

   1. Under **Language**, select the language for the model.

   1. Under **Classifier mode**, choose single-label mode or multi-label mode.

   1. Under **Custom labels**, enter one or more custom labels to use for training the model. Each label must match one of the classes in your input training data. 

1. (Optional) To create a new entity recognition model for the flywheel, under **Custom model type**, choose **Custom entity recognition** and configure the parameters in the following steps.

   1. Under **Language**, select the language for the model.

   1. Under **Custom entity type**, enter up to 25 custom entities to use for training the model. Each label must match one of the entity types in your input training data.

      To create more than one label, perform the following steps multiple times.

      1. Enter a custom label. The label must be all uppercase. Use an underscore as a separator between words in the label.

      1. Choose **Add type**.

      To remove one of the labels that you've added, choose **X** to the right of the label name.

1. Configure your choices for volume encryption, model encryption, and data lake encryption. For each of these, choose whether to use an AWS owned KMS key or a key that you have permission to use.
   + If you are using an AWS owned KMS key, there are no additional parameters. 
   + If you are using another existing key, for **KMS key ARN** enter the ARN for the key ID.
   + If you want to create a new key, choose **Create an AWS KMS key**.

   For more information on creating and using KMS keys and the associated encryption, see [AWS Key Management Service](https://docs.aws.amazon.com/kms/latest/developerguide/overview.html).

   1. Configure the **Volume encryption** key. Amazon Comprehend uses this key to encrypt the data in the storage volume while your job is being processed.

   1. Configure the **Model encryption** key. Amazon Comprehend uses this key to encrypt the model data for this model version. 

1. Configure the **Data lake location**. For more information, see [Data lake management](flywheels-datalake.md#flywheels-datalake-mgmt).

1. (Optional) Configure **Data lake encryption** key. Amazon Comprehend uses this key to encrypt all files in the data lake.

1. (Optional) Configure **VPC settings**. Enter the VPC ID under **VPC** or choose the ID from the drop-down list. 

   1. Choose the subnet under **Subnets(s)**. After you select the first subnet, you can choose additional ones.

   1. Under **Security Group(s)**, choose the security group to use if you specified one. After you select the first security group, you can choose additional ones.

1. Configure the **Service access** permissions. 

   1. If you select **Use an existing IAM role**, select the role name in the drop-down list.

   1. If you select **Create an IAM role**, Amazon Comprehend creates a new role. The console displays the permissions that Amazon Comprehend configures for the role. Under **Role name**, enter a descriptive name for the role.

1. (Optional) Configure **Tags** settings. To add a tag, enter a key-value pair under **Tags**. Choose **Add tag**. To remove this pair before creating the flywheel, choose **Remove tag**. For more information, see [Tagging your resources](tagging.md).

1. Choose **Create**. 

## Update a flywheel
<a name="flywheels-update-console"></a>

You can configure the flywheel name, data lake location, model type, and model configuration only when you create the flywheel. 

When you update a flywheel, you can specify a different model if the model type and configuration options are the same as the current model. You can configure a new active model version. You can also update encryption details, service access permissions, and VPC settings. 

**To update a flywheel**

1. Sign in to the AWS Management Console and open the [Amazon Comprehend console](https://console.aws.amazon.com/comprehend/).

1. From the left menu, choose **Flywheels**.

1. From the **Flywheels** table, choose the flywheel to update. 

1. Under **Active model version**, choose a model from the **Model** drop-down list and choose a model version. 

   The form populates the model type and model configuration.

1. (Optional) Configure **Volume encryption** and **Model encryption** settings.

1. (Optional) Configure **Data lake encryption** settings.

1. Configure the **Service access** permissions. 

1. (Optional) Configure **VPC settings**.

1. (Optional) Configure **Tags** settings.

1. Choose **Save**. 

## Delete a flywheel
<a name="flywheels-delete-console"></a>

**To delete a flywheel**

1. Sign in to the AWS Management Console and open the [Amazon Comprehend console](https://console.aws.amazon.com/comprehend/).

1. From the left menu, choose **Flywheels**.

1. From the **Flywheels** table, choose the flywheel to delete. 

1. Choose **Delete**. 

# Configuring flywheels using the API
<a name="flywheels-config-api"></a>

You can use the Amazon Comprehend API to create, update, and delete flywheels. 

When you create a flywheel, Amazon Comprehend creates a data lake to hold all the data that the flywheel needs, such as the training data and test data for each version of the model.

When you delete a flywheel, Amazon Comprehend doesn't delete the data lake or the model associated with the flywheel. 

The flywheel delete operation fails if the flywheel is running an iteration or creating a dataset.

Review the information in section [Flywheel creation](flywheels-about.md#flywheels-about-create) before you create a new flywheel.

## Create a flywheel for an existing model
<a name="flywheels-config-api-create-existing"></a>

Use the [CreateFlywheel](https://docs.aws.amazon.com/comprehend/latest/APIReference/API_CreateFlywheel.html) operation to create a flywheel for an existing model. 

**Example**  

```
aws comprehend create-flywheel \
    --flywheel-name "myFlywheel2" \
    --active-model-arn "modelArn" \
    --data-access-role-arn arn:aws:iam::111122223333:role/testFlywheelDataAccess \
    --data-lake-s3-uri "s3Uri"
```
If the operation is successful, the response includes the flywheel ARN.  

```
{
  "FlywheelArn": "arn:aws:comprehend:aws-region:111122223333:flywheel/name",
  "ActiveModelArn": "modelArn"
}
```

## Create a flywheel for a new model
<a name="flywheels-config-api-create-new"></a>

Use the [CreateFlywheel](https://docs.aws.amazon.com/comprehend/latest/APIReference/API_CreateFlywheel.html) operation to create a flywheel for a new custom classification model. 

**Example**  

```
aws comprehend create-flywheel \
    --flywheel-name "myFlywheel2" \
    --data-access-role-arn  arn:aws:iam::111122223333:role/testFlywheelDataAccess \
    --model-type "DOCUMENT_CLASSIFIER" \
    --data-lake-s3-uri  "s3Uri"  \
    --task-config  file://taskConfig.json
```
The taskConfig.json file contains the following content.  

```
{
    "LanguageCode": "en",
    "DocumentClassificationConfig": {
        "Mode": "MULTI_LABEL",
        "Labels": ["optimism", "anger"]
    } 
}
```
The API response body includes the following content.  

```
{
  "FlywheelArn": "arn:aws:comprehend:aws-region:111122223333:flywheel/name",
  "ActiveModelArn": "modelArn"
}
```
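
The two creation paths differ only in which parameters you send. The following sketch builds the keyword arguments for either path, under the assumption that the request sets either `ActiveModelArn` or the `ModelType` and `TaskConfig` pair, but not both. The helper name `build_create_flywheel_request` is hypothetical. The result is meant to be passed to the `create_flywheel` method of a boto3 Comprehend client.

```python
def build_create_flywheel_request(name, role_arn, data_lake_s3_uri,
                                  active_model_arn=None, model_type=None, task_config=None):
    """Build CreateFlywheel parameters for an existing model or a new model."""
    request = {
        "FlywheelName": name,
        "DataAccessRoleArn": role_arn,
        "DataLakeS3Uri": data_lake_s3_uri,
    }
    if active_model_arn is not None:
        # Existing model: the flywheel takes its type and configuration from the model.
        if model_type or task_config:
            raise ValueError("Specify ActiveModelArn or ModelType/TaskConfig, not both")
        request["ActiveModelArn"] = active_model_arn
    else:
        # New model: both the model type and the task configuration are needed.
        if not (model_type and task_config):
            raise ValueError("A new model needs both ModelType and TaskConfig")
        request["ModelType"] = model_type
        request["TaskConfig"] = task_config
    return request


# Usage (requires boto3 and AWS credentials, so it is shown as a comment):
#   import boto3
#   boto3.client("comprehend").create_flywheel(**request)
```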

## Describe a flywheel
<a name="flywheels-config-api-desc"></a>

Use the Amazon Comprehend [DescribeFlywheel](https://docs.aws.amazon.com/comprehend/latest/APIReference/API_DescribeFlywheel.html) operation to retrieve configuration information about a flywheel. 

```
aws comprehend describe-flywheel \
    --flywheel-arn  "flywheelArn"
```

The API response body includes the following content.

```
{
  "FlywheelProperties": {
      "FlywheelArn": "arn:aws:comprehend:aws-region:111122223333:flywheel/myTestFlywheel",
      "DataAccessRoleArn": "arn:aws:iam::111122223333:role/Admin",
      "TaskConfig": {
          "LanguageCode": "en",
          "DocumentClassificationConfig": {
              "Mode": "MULTI_LABEL"
          }
      },
      "DataLakeS3Uri": "s3://my-test-datalake/flywheelbasictest/myTestFlywheel/schemaVersion=1/20220801T014326Z",
      "Status": "ACTIVE",
      "ModelType":  "DOCUMENT_CLASSIFIER",
      "CreationTime": 1659318206.102,
      "LastModifiedTime": 1659318249.05
  }
}
```

## Update a flywheel
<a name="flywheels-config-api-update"></a>

Use the [UpdateFlywheel](https://docs.aws.amazon.com/comprehend/latest/APIReference/API_UpdateFlywheel.html) operation to update the modifiable configuration values of the flywheel. 

Some configuration fields are JSON structures with subfields. To update one or more subfields, provide values for all the subfields (Amazon Comprehend sets the value to null for any subfield missing in the request). 

If you omit a top-level parameter in the `UpdateFlywheel` request, Amazon Comprehend doesn't change the values of the parameter or any of its subfields in the flywheel.
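
Because a missing subfield is set to null, one cautious pattern is to start from the current values and override only what you intend to change. The following sketch assumes that the updatable subfields of `DataSecurityConfig` are `ModelKmsKeyId`, `VolumeKmsKeyId`, and `VpcConfig`. Verify these names against the `UpdateFlywheel` API reference before relying on them. The helper name is hypothetical.

```python
# Assumed set of subfields that UpdateFlywheel accepts in DataSecurityConfig.
UPDATABLE_KEYS = ("ModelKmsKeyId", "VolumeKmsKeyId", "VpcConfig")


def merged_security_config(current, changes):
    """Combine the current DataSecurityConfig with the requested changes.

    `current` is the DataSecurityConfig returned by DescribeFlywheel.
    Subfields that are not being changed are carried over, so the update
    request does not reset them to null.
    """
    unknown = set(changes) - set(UPDATABLE_KEYS)
    if unknown:
        raise ValueError(f"Not updatable: {sorted(unknown)}")
    merged = {key: current[key] for key in UPDATABLE_KEYS if key in current}
    merged.update(changes)
    return merged
```

The merged dictionary would then be sent as the `DataSecurityConfig` parameter of the update request.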

To add or remove tags on the flywheel, use the [TagResource](https://docs.aws.amazon.com/comprehend/latest/APIReference/API_TagResource.html) and [UntagResource](https://docs.aws.amazon.com/comprehend/latest/APIReference/API_UntagResource.html) operations.

You can promote a model version by setting the `ActiveModelArn` parameter, as shown in the following example. 

```
aws comprehend update-flywheel \
    --region aws-region \
    --flywheel-arn  "flywheelArn" \
    --active-model-arn  "modelArn"
```

The API response body includes the following content.

```
{
  "FlywheelArn": "arn:aws:comprehend:aws-region:111122223333:flywheel/name",
  "ActiveModelArn": "modelArn"
}
```

## Delete a flywheel
<a name="flywheels-config-api-delete"></a>

Use the Amazon Comprehend [DeleteFlywheel](https://docs.aws.amazon.com/comprehend/latest/APIReference/API_DeleteFlywheel.html) operation to delete flywheels. 

```
aws comprehend delete-flywheel \
    --flywheel-arn  "flywheelArn"
```

A successful API response contains an empty response message body.

## List the flywheels
<a name="flywheels-config-api-list"></a>

Use the Amazon Comprehend [ListFlywheels](https://docs.aws.amazon.com/comprehend/latest/APIReference/API_ListFlywheels.html) operation to retrieve a list of flywheels in the current region. 

```
aws comprehend list-flywheels \
    --region aws-region \
    --endpoint-url  "uri"
```

The API response body includes the following content.

```
{
    "FlywheelSummaryList": [
        {
            "FlywheelArn": "arn:aws:comprehend:aws-region:111122223333:flywheel/myTestFlywheel",
            "DataLakeS3Uri": "s3://my-test-datalake/flywheelbasictest/myTestFlywheel/schemaVersion=1/20220801T014326Z",
            "Status": "ACTIVE",
            "ModelType":  "DOCUMENT_CLASSIFIER",
            "CreationTime": 1659318206.102,
            "LastModifiedTime": 1659318249.05
        }
    ]
}
```
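
If an account has many flywheels, the list can span several pages of results. The following sketch follows the `NextToken` value from page to page. `iter_flywheels` is a hypothetical helper name, and `client` is assumed to be a boto3 Comprehend client or any object with a compatible `list_flywheels` method.

```python
def iter_flywheels(client, page_size=100):
    """Yield every flywheel summary, following NextToken across pages."""
    token = None
    while True:
        kwargs = {"MaxResults": page_size}
        if token:
            kwargs["NextToken"] = token
        page = client.list_flywheels(**kwargs)
        yield from page.get("FlywheelSummaryList", [])
        token = page.get("NextToken")
        if not token:
            # No token means this was the last page.
            break
```

For example, `[f["FlywheelArn"] for f in iter_flywheels(client) if f["Status"] == "ACTIVE"]` collects the ARNs of all active flywheels.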

# Configuring datasets
<a name="datasets-config"></a>

To add labeled training or test data to a flywheel, use the Amazon Comprehend console or API to create a dataset. 

You configure each dataset as training data or test data. You associate the dataset with a specific flywheel and custom model. When you create a dataset, Amazon Comprehend uploads the data to the flywheel's data lake. For details about file formats for the training data, see [Preparing classifier training data](prep-classifier-data.md) or [Preparing entity recognizer training data](prep-training-data-cer.md). 

When you delete the flywheel, Amazon Comprehend deletes the datasets. The uploaded data remains available in the data lake.

## Creating a dataset (console)
<a name="datasets-create-console"></a>

**Create a dataset**

1. Sign in to the AWS Management Console and open the [Amazon Comprehend console](https://console.aws.amazon.com/comprehend/).

1. From the left menu, choose **Flywheels** and choose the flywheel where you want to add the data.

1. Choose the **Datasets** tab.

1. In the **Training datasets** or **Test datasets** table, choose **Create dataset**. 

1. Under **Dataset details**, enter a name for the dataset and an optional description. 

1. Under **Data specifications**, choose the **Data format** and the **Dataset type** configuration fields.

1. (Optional) Under **Input format**, choose the format of the input documents. 

1. Under **Annotation location on S3**, enter the Amazon S3 location of the annotations file. 

1. Under **Training data location on S3**, enter the Amazon S3 location of the document files.

1. Choose **Create**. 

## Creating a dataset (API)
<a name="datasets-api-create"></a>

You can use the [CreateDataset](https://docs.aws.amazon.com/comprehend/latest/APIReference/API_CreateDataset.html) operation to create a dataset. 

**Example**  

```
aws comprehend create-dataset \
    --flywheel-arn "flywheelArn" \
    --dataset-name "my-training-dataset" \
    --dataset-type "TRAIN" \
    --description "my training dataset" \
    --input-data-config file://inputConfig.json
```
The `inputConfig.json` file contains the following content.  

```
{
    "DataFormat": "COMPREHEND_CSV",
    "DocumentClassifierInputDataConfig": {
        "S3Uri": "s3://my-comprehend-datasets/multilabel_train.csv"
    }
}
```
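
The same request can be assembled in code. The following sketch builds the parameters for a classifier dataset in CSV format, mirroring the input configuration shown above. The helper name is hypothetical, and the result is meant to be passed to the `create_dataset` method of a boto3 Comprehend client.

```python
def build_create_dataset_request(flywheel_arn, dataset_name, s3_uri,
                                 dataset_type="TRAIN", description=None):
    """Build CreateDataset parameters for a custom classification CSV dataset."""
    if dataset_type not in ("TRAIN", "TEST"):
        raise ValueError("dataset_type must be TRAIN or TEST")
    request = {
        "FlywheelArn": flywheel_arn,
        "DatasetName": dataset_name,
        "DatasetType": dataset_type,
        "InputDataConfig": {
            "DataFormat": "COMPREHEND_CSV",
            "DocumentClassifierInputDataConfig": {"S3Uri": s3_uri},
        },
    }
    if description:
        request["Description"] = description
    return request


# Usage (requires boto3 and AWS credentials, so it is shown as a comment):
#   import boto3
#   boto3.client("comprehend").create_dataset(**request)
```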

To add or remove tags on the dataset, use the [TagResource](https://docs.aws.amazon.com/comprehend/latest/APIReference/API_TagResource.html) and [UntagResource](https://docs.aws.amazon.com/comprehend/latest/APIReference/API_UntagResource.html) operations.

## Describe a dataset
<a name="datasets-api-desc"></a>

Use the Amazon Comprehend [DescribeDataset](https://docs.aws.amazon.com/comprehend/latest/APIReference/API_DescribeDataset.html) operation to retrieve configuration information about a dataset. 

```
aws comprehend describe-dataset \
    --dataset-arn  "datasetARN"
```

The response contains the following content.

```
{
   "DatasetProperties": {
      "DatasetArn": "arn:aws:comprehend:aws-region:111122223333:flywheel/myTestFlywheel/dataset/train-dataset",
      "DatasetName": "train-dataset",
      "DatasetType": "TRAIN",
      "DatasetS3Uri": "s3://my-test-datalake/flywheelbasictest/myTestFlywheel/schemaVersion=1/20220801T014326Z/datasets/train-dataset/20220801T194844Z",
      "Description": "Good Dataset",
      "Status": "COMPLETED",
      "NumberOfDocuments": 90,
      "CreationTime": 1659383324.297
  }
}
```

# Flywheel iterations
<a name="flywheels-iterate"></a>

Use flywheel iterations to help you create and manage new model versions. 

**Topics**
+ [Iteration workflow](#flywheels-iterate-flow)
+ [Managing iterations (console)](#flywheels-iterate-console)
+ [Managing iterations (API)](#flywheels-iterate-api)

## Iteration workflow
<a name="flywheels-iterate-flow"></a>

A flywheel starts out with a trained model version or uses an initial dataset to train a model version.

Over time, as you obtain new labeled data, you train new model versions to improve the performance of your flywheel model. When you run the flywheel, it creates a new iteration that trains and evaluates a new model version. You can promote the new model version if its performance is superior to the existing active model version.

The flywheel iteration workflow includes the following steps:

1. You create datasets for the new labeled data.

1. You run the flywheel to create a new iteration. The iteration follows these steps to train and evaluate a new model version: 

   1. Evaluates the active model version using the new data.

   1. Trains a new model version using the new data. 

   1. Stores the evaluation and training results in the data lake.

   1. Returns the F1 scores for both models.

1. After the iteration completes, you can compare the F1 scores for the existing active model and the new model. 

1. If the new model version has superior performance, you promote it to be the active model version. You can use the [console](#flywheels-iterate-console-promote) or the [API](#flywheels-iterate-api-promote) to promote the new model version.
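
The comparison in the last two steps can be reduced to a simple rule. The following sketch promotes the new version only when its F1 score beats the active version by more than a chosen margin. The function name and the `min_gain` threshold are assumptions for illustration, and the right margin depends on your data and use case.

```python
def should_promote(active_f1, new_f1, min_gain=0.0):
    """Return True if the new model's F1 score exceeds the active model's by more than min_gain."""
    return (new_f1 - active_f1) > min_gain
```

With the default margin, any strict improvement counts. A nonzero margin guards against promoting a version whose gain is too small to matter.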

## Managing iterations (console)
<a name="flywheels-iterate-console"></a>

You can use the console to start a new iteration and query the status of an in-progress iteration. You can also view the results of completed iterations.

### Start a flywheel iteration (console)
<a name="flywheels-iterate-console-start"></a>

Before you can start a new iteration, create one or more new training or test datasets. See [Configuring datasets](datasets-config.md).

**Start a flywheel iteration (console)**

1. Sign in to the AWS Management Console and open the [Amazon Comprehend console](https://console.aws.amazon.com/comprehend/).

1. From the left menu, choose **Flywheels**.

1. From the **Flywheels** table, choose a flywheel. 

1. Choose **Run flywheel**. 

### Analyze iteration results (console)
<a name="flywheels-iterate-console-analyze"></a>

After it runs the flywheel iteration, the console displays the results in the **Flywheels iterations** table.

### Promote new model version (console)
<a name="flywheels-iterate-console-promote"></a>

From the model details page in the console, you can promote a new model version to be the active model version. 

**Promote a flywheel model version to active model version (console)**

1. Sign in to the AWS Management Console and open the [Amazon Comprehend console](https://console.aws.amazon.com/comprehend/).

1. From the left menu, choose **Flywheels**.

1. From the **Flywheels** table, choose a flywheel. 

1. On the flywheel details page, choose the version to promote from the **Flywheels iterations** table. 

1. Choose **Make active model**. 

## Managing iterations (API)
<a name="flywheels-iterate-api"></a>

You can use the Amazon Comprehend API to start a new iteration and query the status of an in-progress iteration. You can also view the results of completed iterations.

### Start flywheel iteration (API)
<a name="flywheels-iterate-api-start"></a>

Use the Amazon Comprehend [StartFlywheelIteration](https://docs.aws.amazon.com/comprehend/latest/APIReference/API_StartFlywheelIteration.html) operation to start a flywheel iteration. 

```
aws comprehend start-flywheel-iteration \
    --flywheel-arn "flywheelArn"
```

The response contains the following content.

```
{
  "FlywheelArn": "arn:aws:comprehend:aws-region:111122223333:flywheel/name",
  "FlywheelIterationId": "flywheelIterationId"
}
```
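
The same call can be made from Python. The following is a minimal sketch that assumes `comprehend` is a boto3 Amazon Comprehend client (for example, `boto3.client("comprehend")`) and that the response carries a `FlywheelIterationId` field, as described in the StartFlywheelIteration API reference. The function name is hypothetical.

```python
from typing import Any


def start_iteration(comprehend: Any, flywheel_arn: str) -> str:
    """Start a flywheel iteration and return its identifier."""
    response = comprehend.start_flywheel_iteration(FlywheelArn=flywheel_arn)
    iteration_id = response.get("FlywheelIterationId")
    if not iteration_id:
        # Fail loudly rather than return an ID that cannot be queried later.
        raise RuntimeError(f"No iteration ID in response: {response!r}")
    return iteration_id
```

Keep the returned ID. You need it together with the flywheel ARN to describe the iteration.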

### Promote new model version (API)
<a name="flywheels-iterate-api-promote"></a>

Use the [UpdateFlywheel](https://docs.aws.amazon.com/comprehend/latest/APIReference/API_UpdateFlywheel.html) operation to promote a model version to be the active model version. 

Send the `UpdateFlywheel` request with the `ActiveModelArn` parameter set to the ARN of the new active model version.

```
aws comprehend update-flywheel \
    --flywheel-arn "flywheelArn" \
    --active-model-arn "modelArn"

The response contains the following content.

```
{
  "FlywheelArn": "arn:aws:comprehend:aws-region:111122223333:flywheel/name",
  "ActiveModelArn": "modelArn"
}
```
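
In practice, the model ARN to promote usually comes from a completed iteration. The sketch below shows one way to connect the two calls, assuming `comprehend` is a boto3 Amazon Comprehend client. The function name is hypothetical, and the sketch does not check whether the new version actually performs better.

```python
from typing import Any


def promote_trained_model(comprehend: Any, flywheel_arn: str, iteration_id: str) -> str:
    """Make the model version trained by one iteration the active model version."""
    props = comprehend.describe_flywheel_iteration(
        FlywheelArn=flywheel_arn, FlywheelIterationId=iteration_id
    )["FlywheelIterationProperties"]
    model_arn = props.get("TrainedModelArn")
    if not model_arn:
        raise ValueError(f"Iteration {iteration_id} has no trained model to promote")
    # UpdateFlywheel identifies the flywheel by ARN and takes the new active model ARN.
    comprehend.update_flywheel(FlywheelArn=flywheel_arn, ActiveModelArn=model_arn)
    return model_arn
```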

### Describe flywheel iteration results (API)
<a name="flywheels-iterate-api-analyze"></a>

The Amazon Comprehend [DescribeFlywheelIteration](https://docs.aws.amazon.com/comprehend/latest/APIReference/API_DescribeFlywheelIteration.html) operation returns information about an iteration, including its status and, after the iteration runs to completion, the metrics for the evaluated and trained models. 

```
aws comprehend describe-flywheel-iteration \
    --flywheel-arn "flywheelArn" \
    --flywheel-iteration-id "flywheelIterationId" \
    --region aws-region
```

The response contains the following content.

```
{
    "FlywheelIterationProperties": {
        "FlywheelArn": "flywheelArn",
        "FlywheelIterationId": "iterationId",
        "CreationTime": <createdAt>,
        "EndTime": <endedAt>,
        "Status": <status>,
        "Message": <message>,
        "EvaluatedModelArn": "modelArn",
        "EvaluatedModelMetrics": {
            "AverageF1Score": <value>,
            "AveragePrecision": <value>,
            "AverageRecall": <value>,
            "AverageAccuracy": <value>
        },
        "TrainedModelArn": "modelArn",
        "TrainedModelMetrics": {
            "AverageF1Score": <value>,
            "AveragePrecision": <value>,
            "AverageRecall": <value>,
            "AverageAccuracy": <value>
        }
    }
}
```
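
Because an iteration can take a while to finish, a caller typically polls this operation until the status stops changing. The following is a minimal polling sketch, assuming `comprehend` is a boto3 Amazon Comprehend client. The set of terminal status values is an assumption; check the API reference for the authoritative list. The function name, polling interval, and poll limit are hypothetical.

```python
import time
from typing import Any, Callable, Dict

# Assumed terminal values for the Status field; verify against the API reference.
TERMINAL_STATUSES = {"COMPLETED", "FAILED", "STOPPED"}


def wait_for_iteration(
    comprehend: Any,
    flywheel_arn: str,
    iteration_id: str,
    poll_seconds: float = 60.0,
    max_polls: int = 120,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll DescribeFlywheelIteration until the iteration reaches a terminal status."""
    for _ in range(max_polls):
        props = comprehend.describe_flywheel_iteration(
            FlywheelArn=flywheel_arn, FlywheelIterationId=iteration_id
        )["FlywheelIterationProperties"]
        if props.get("Status") in TERMINAL_STATUSES:
            return props
        sleep(poll_seconds)  # wait before asking again
    raise TimeoutError(f"Iteration {iteration_id} did not finish after {max_polls} polls")
```

The `sleep` parameter exists only so that the wait can be replaced in tests. In normal use, leave it at the default.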

### Get iteration history (API)
<a name="flywheels-iterate-api-history"></a>

Use the [ListFlywheelIterationHistory](https://docs.aws.amazon.com/comprehend/latest/APIReference/API_ListFlywheelIterationHistory.html) operation to get information about iteration history. 

```
aws comprehend list-flywheel-iteration-history \
    --flywheel-arn "flywheelArn"
```

The response contains the following content.

```
{
    "FlywheelIterationPropertiesList": [
        {
            "FlywheelArn": "<flywheelArn>",
            "FlywheelIterationId": "20220907T214613Z",
            "CreationTime": 1662587173.224,
            "EndTime": 1662592043.02,
            "Status": "<status>",
            "Message": "<message>",
            "EvaluatedModelArn": "modelArn",
            "EvaluatedModelMetrics": {
                "AverageF1Score": 0.8333333333333333,
                "AveragePrecision": 0.75,
                "AverageRecall": 0.9375,
                "AverageAccuracy": 0.8125
            },
            "TrainedModelArn": "modelArn",
            "TrainedModelMetrics": {
                "AverageF1Score": 0.865497076023392,
                "AveragePrecision": 0.7636363636363637,
                "AverageRecall": 1.0,
                "AverageAccuracy": 0.84375
            }
        }
    ]
}
```
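
A flywheel with many iterations can return its history across several pages. The sketch below collects all pages and picks the iteration whose trained model has the highest average F1 score. It assumes `comprehend` is a boto3 Amazon Comprehend client and that the operation paginates with a `NextToken` field. Both function names are hypothetical.

```python
from typing import Any, Dict, List, Optional


def list_iterations(comprehend: Any, flywheel_arn: str) -> List[Dict[str, Any]]:
    """Collect every entry of the iteration history, following NextToken."""
    iterations: List[Dict[str, Any]] = []
    kwargs: Dict[str, Any] = {"FlywheelArn": flywheel_arn}
    while True:
        page = comprehend.list_flywheel_iteration_history(**kwargs)
        iterations.extend(page.get("FlywheelIterationPropertiesList", []))
        token = page.get("NextToken")
        if not token:
            return iterations
        kwargs["NextToken"] = token


def best_trained_iteration(iterations: List[Dict[str, Any]]) -> Optional[Dict[str, Any]]:
    """Return the iteration whose trained model has the highest average F1, if any."""
    scored = [
        it for it in iterations
        if it.get("TrainedModelMetrics", {}).get("AverageF1Score") is not None
    ]
    return max(scored, key=lambda it: it["TrainedModelMetrics"]["AverageF1Score"], default=None)
```

Scores from different iterations come from different evaluation data, so treat this ranking as a starting point rather than a final decision.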

# Using flywheels for analysis
<a name="flywheels-inference"></a>

You can use the flywheel's active model version to run analysis for custom classification or custom entity recognition. The active model version is configurable. You can use the [console](flywheels-iterate.md#flywheels-iterate-console-promote) or the [UpdateFlywheel](https://docs.aws.amazon.com/comprehend/latest/APIReference/API_UpdateFlywheel.html) API operation to set a new version of the model to be the active model version. 

To use the flywheel, specify the flywheel ARN instead of a custom model ARN when you configure the analysis task. Amazon Comprehend runs the analysis using the flywheel's active model version. 

## Real-time analysis
<a name="flywheels-inference-console"></a>

You use an endpoint to run real-time analysis. When you create or update an endpoint, you can configure it with the flywheel ARN instead of a model ARN. When you run the real-time analysis, select the endpoint associated with the flywheel. Amazon Comprehend runs the analysis using the active model version of the flywheel.

When you use [UpdateFlywheel](https://docs.aws.amazon.com/comprehend/latest/APIReference/API_UpdateFlywheel.html) to set a new active model version for the flywheel, the endpoint updates automatically to start using the new active model version. If you don't want the endpoint to update automatically, configure the endpoint (using [UpdateEndpoint](https://docs.aws.amazon.com/comprehend/latest/APIReference/API_UpdateEndpoint.html)) to use the model version ARN directly. The endpoint continues to use this model version if the flywheel active model version changes.
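
The two endpoint configurations described above can be sketched as follows, assuming `comprehend` is a boto3 Amazon Comprehend client. The function names and the default of one inference unit are hypothetical choices for illustration.

```python
from typing import Any


def create_flywheel_endpoint(
    comprehend: Any, endpoint_name: str, flywheel_arn: str, inference_units: int = 1
) -> str:
    """Create an endpoint that follows the flywheel's active model version."""
    response = comprehend.create_endpoint(
        EndpointName=endpoint_name,
        FlywheelArn=flywheel_arn,  # used in place of ModelArn
        DesiredInferenceUnits=inference_units,
    )
    return response["EndpointArn"]


def pin_endpoint_to_model(comprehend: Any, endpoint_arn: str, model_version_arn: str) -> None:
    """Point the endpoint at one model version so later promotions do not change it."""
    comprehend.update_endpoint(EndpointArn=endpoint_arn, DesiredModelArn=model_version_arn)
```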

For custom classification, use the [ClassifyDocument](https://docs.aws.amazon.com/comprehend/latest/APIReference/API_ClassifyDocument.html) API operation. For custom entity recognition, use the [DetectEntities](https://docs.aws.amazon.com/comprehend/latest/APIReference/API_DetectEntities.html) API operation. Provide the ARN of the endpoint associated with the flywheel in the **EndpointArn** parameter.

You can also use the console to run real-time analysis for [custom classification](custom-sync.md) or [custom entity recognition](detecting-cer-real-time.md).
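
As a rough illustration of the real-time calls, the following sketch sends plain text to an endpoint that is associated with a flywheel. It assumes `comprehend` is a boto3 Amazon Comprehend client, and the function names are hypothetical.

```python
from typing import Any, Dict, List


def classify_with_endpoint(comprehend: Any, endpoint_arn: str, text: str) -> List[Dict[str, Any]]:
    """Classify one document in real time through the flywheel's endpoint."""
    response = comprehend.classify_document(Text=text, EndpointArn=endpoint_arn)
    # Multi-class models populate "Classes"; multi-label models populate "Labels".
    return response.get("Classes") or response.get("Labels") or []


def detect_entities_with_endpoint(comprehend: Any, endpoint_arn: str, text: str) -> List[Dict[str, Any]]:
    """Detect custom entities in real time through the flywheel's endpoint."""
    response = comprehend.detect_entities(Text=text, EndpointArn=endpoint_arn)
    return response.get("Entities", [])
```

Neither call names a model version. The endpoint resolves to whichever version it is currently configured to use.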

## Asynchronous jobs
<a name="flywheels-inference-api"></a>

For custom classification, use the [StartDocumentClassificationJob](https://docs.aws.amazon.com/comprehend/latest/APIReference/API_StartDocumentClassificationJob.html) API request to start an asynchronous job. Provide the **FlywheelArn** parameter instead of the **DocumentClassifierArn**.

For custom entity recognition, use the [StartEntitiesDetectionJob](https://docs.aws.amazon.com/comprehend/latest/APIReference/API_StartEntitiesDetectionJob.html) API request. Provide the **FlywheelArn** parameter instead of the **EntityRecognizerArn**.

You can use the console to run asynchronous analysis jobs for [custom classification](analysis-jobs-custom-classifier.md) or [custom entity recognition](detecting-cer.md). When you create the job, enter the flywheel ARN in the **Recognizer model** or **Classifier model** field.
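
The API requests for both job types can be sketched in Python as follows, assuming `comprehend` is a boto3 Amazon Comprehend client. The function names, the `ONE_DOC_PER_LINE` input format, and the default language code are illustrative assumptions. The S3 locations and the data access role are values that you supply.

```python
from typing import Any, Dict, Optional


def build_job_request(
    flywheel_arn: str,
    input_s3_uri: str,
    output_s3_uri: str,
    data_access_role_arn: str,
    job_name: Optional[str] = None,
) -> Dict[str, Any]:
    """Build the request fields shared by both asynchronous job types."""
    request: Dict[str, Any] = {
        "FlywheelArn": flywheel_arn,  # replaces DocumentClassifierArn / EntityRecognizerArn
        "InputDataConfig": {"S3Uri": input_s3_uri, "InputFormat": "ONE_DOC_PER_LINE"},
        "OutputDataConfig": {"S3Uri": output_s3_uri},
        "DataAccessRoleArn": data_access_role_arn,
    }
    if job_name:
        request["JobName"] = job_name
    return request


def start_classification_job(comprehend: Any, **job_args: Any) -> str:
    """Start a custom classification job that uses the flywheel's active model version."""
    return comprehend.start_document_classification_job(**build_job_request(**job_args))["JobId"]


def start_entities_job(comprehend: Any, language_code: str = "en", **job_args: Any) -> str:
    """Start a custom entity recognition job that uses the flywheel's active model version."""
    request = build_job_request(**job_args)
    request["LanguageCode"] = language_code  # this operation also requires a language code
    return comprehend.start_entities_detection_job(**request)["JobId"]
```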