Create an Amazon MWAA environment

Amazon Managed Workflows for Apache Airflow (Amazon MWAA) sets up Apache Airflow in an environment, in the version you choose, using the same open-source Apache Airflow and user interface available from Apache. This guide describes the steps to create an Amazon MWAA environment.

Before you begin

  • The VPC network you specify for your environment cannot be modified after the environment is created.

  • You need an Amazon S3 bucket configured to Block all public access, with Bucket Versioning enabled.

  • You need an AWS account with permissions to use Amazon MWAA, and permission in AWS Identity and Access Management (IAM) to create IAM roles. If you choose the Private network access mode for the Apache Airflow web server, which limits Apache Airflow access within your Amazon VPC, you'll need permission in IAM to create Amazon VPC endpoints.
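The bucket prerequisites above can also be applied from code. The following is a minimal sketch, not a definitive implementation. It assumes the bucket already exists and that `s3_client` is a boto3 S3 client (for example, `boto3.client("s3")`). The function name `prepare_mwaa_bucket` is hypothetical.

```python
def prepare_mwaa_bucket(s3_client, bucket_name):
    """Block all public access and enable versioning on an existing bucket.

    `s3_client` is assumed to be a boto3 S3 client; it is passed in so the
    function can be exercised without AWS credentials.
    """
    # Turn on all four Block Public Access settings for the bucket.
    s3_client.put_public_access_block(
        Bucket=bucket_name,
        PublicAccessBlockConfiguration={
            "BlockPublicAcls": True,
            "IgnorePublicAcls": True,
            "BlockPublicPolicy": True,
            "RestrictPublicBuckets": True,
        },
    )
    # Enable Bucket Versioning.
    s3_client.put_bucket_versioning(
        Bucket=bucket_name,
        VersioningConfiguration={"Status": "Enabled"},
    )
```

The caller needs Amazon S3 permissions for both operations in addition to the Amazon MWAA and IAM permissions listed above.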

Apache Airflow versions

The following Apache Airflow versions are supported on Amazon Managed Workflows for Apache Airflow.

Note
  • Beginning with Apache Airflow v2.2.2, Amazon MWAA supports installing Python requirements, provider packages, and custom plugins directly on the Apache Airflow web server.

  • Beginning with Apache Airflow v2.7.2, your requirements file must include a --constraint statement. If you do not provide a constraint, Amazon MWAA will specify one for you to ensure the packages listed in your requirements are compatible with the version of Apache Airflow you are using.

    For more information on setting up constraints in your requirements file, see Installing Python dependencies.
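As a rough illustration of the constraint requirement, the sketch below checks whether the text of a requirements file already contains a constraint statement. The helper name `has_constraint`, the sample package, and the constraint URL (including its Python version) are assumptions for illustration; use the constraints file that matches your Apache Airflow and Python versions.

```python
def has_constraint(requirements_text):
    """Return True if any non-comment line is a pip constraint statement."""
    for raw_line in requirements_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        option = line.split(None, 1)[0]
        # pip accepts the long form (--constraint) and the short form (-c).
        if option in ("--constraint", "-c") or option.startswith("--constraint="):
            return True
    return False


# Hypothetical requirements.txt content; the URL is an assumed example.
SAMPLE_REQUIREMENTS = """\
--constraint "https://raw.githubusercontent.com/apache/airflow/constraints-2.7.2/constraints-3.11.txt"
apache-airflow-providers-snowflake
"""

if __name__ == "__main__":
    print(has_constraint(SAMPLE_REQUIREMENTS))
```

A check like this could run before you upload requirements.txt to your Amazon S3 bucket, so that the constraint in use is one you chose rather than one specified for you.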

For more information about migrating your self-managed Apache Airflow deployments, or migrating an existing Amazon MWAA environment, including instructions for backing up your metadata database, see the Amazon MWAA Migration Guide.

Create an environment

The following section describes the steps to create an Amazon MWAA environment.

Step one: Specify details

To specify details for the environment
  1. Open the Amazon MWAA console.

  2. Use the AWS Region selector to select your region.

  3. Choose Create environment.

  4. On the Specify details page, under Environment details:

    1. Type a unique name for your environment in Name.

    2. Choose the Apache Airflow version in Airflow version.

      Note

      If you don't specify a value, Amazon MWAA defaults to the latest Apache Airflow version. The latest version available is Apache Airflow v2.10.3.

  5. Under DAG code in Amazon S3 specify the following:

    1. S3 Bucket. Choose Browse S3 and select your Amazon S3 bucket, or enter the Amazon S3 URI.

    2. DAGs folder. Choose Browse S3 and select the dags folder in your Amazon S3 bucket, or enter the Amazon S3 URI.

    3. Plugins file - optional. Choose Browse S3 and select the plugins.zip file on your Amazon S3 bucket, or enter the Amazon S3 URI.

    4. Requirements file - optional. Choose Browse S3 and select the requirements.txt file on your Amazon S3 bucket, or enter the Amazon S3 URI.

    5. Startup script file - optional. Choose Browse S3 and select the script file on your Amazon S3 bucket, or enter the Amazon S3 URI.

  6. Choose Next.
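For readers who script environment creation instead of using the console, the fields in step one correspond roughly to request parameters of the Amazon MWAA `CreateEnvironment` API. The sketch below only builds that part of the request as a dictionary. The helper name `build_source_settings` and the example values are hypothetical, and the bucket ARN assumes the standard `aws` partition.

```python
def build_source_settings(name, bucket_name, dags_path,
                          airflow_version=None, plugins_path=None,
                          requirements_path=None, startup_script_path=None):
    """Build the step-one portion of a CreateEnvironment request.

    Paths are relative to the bucket, e.g. "dags" or "requirements.txt".
    """
    params = {
        "Name": name,
        # Assumes the standard "aws" partition for the bucket ARN.
        "SourceBucketArn": f"arn:aws:s3:::{bucket_name}",
        "DagS3Path": dags_path,
    }
    optional = {
        "AirflowVersion": airflow_version,
        "PluginsS3Path": plugins_path,
        "RequirementsS3Path": requirements_path,
        "StartupScriptS3Path": startup_script_path,
    }
    # Leave out optional fields that were not provided.
    params.update({k: v for k, v in optional.items() if v is not None})
    return params


if __name__ == "__main__":
    print(build_source_settings(
        "example-environment", "example-bucket", "dags",
        requirements_path="requirements.txt",
    ))
```

Omitting `AirflowVersion` mirrors leaving the Airflow version field unset in the console.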

Step two: Configure advanced settings

To configure advanced settings
  1. On the Configure advanced settings page, under Networking:

    1. Choose your Amazon VPC.

      Choosing an Amazon VPC automatically populates the subnet fields with two of the private subnets in that VPC.

  2. Under Web server access, select your preferred Apache Airflow access mode:

    1. Private network. This limits access to the Apache Airflow UI to users within your Amazon VPC who have been granted access to the IAM policy for your environment. You need permission to create Amazon VPC endpoints for this step.

      Note

      Choose the Private network option if your Apache Airflow UI is only accessed within a corporate network, and you do not require access to public repositories for web server requirements installation. If you choose this access mode option, you need to create a mechanism to access your Apache Airflow Web server in your Amazon VPC. For more information, see Accessing the VPC endpoint for your Apache Airflow Web server (private network access).

    2. Public network. This allows the Apache Airflow UI to be accessed over the Internet by users granted access to the IAM policy for your environment.

  3. Under Security group(s), choose the security group used to secure your Amazon VPC:

    1. By default, Create new security group is selected, and Amazon MWAA creates a security group in your Amazon VPC with specific inbound and outbound rules.

    2. Optional. Deselect the check box in Create new security group to select up to 5 security groups.

      Note

      An existing Amazon VPC security group must be configured with specific inbound and outbound rules to allow network traffic. To learn more, see Security in your VPC on Amazon MWAA.

  4. Under Environment class, choose an environment class.

    We recommend choosing the smallest size necessary to support your workload. You can change the environment class at any time.

  5. For Maximum worker count, specify the maximum number of Apache Airflow workers to run in the environment.

    For more information, see Example high performance use case.

  6. Specify the Maximum web server count and Minimum web server count to configure how Amazon MWAA scales the Apache Airflow web servers in your environment.

    For more information about web server automatic scaling, see Configuring Amazon MWAA web server automatic scaling.

  7. Under Encryption, choose a data encryption option:

    1. By default, Amazon MWAA uses an AWS owned key to encrypt your data.

    2. Optional. Choose Customize encryption settings (advanced) to choose a different AWS KMS key. If you choose to specify a Customer managed key in this step, you must specify an AWS KMS key ID or ARN. Amazon MWAA does not support AWS KMS aliases or multi-Region keys. If you specified an Amazon S3 key for server-side encryption on your Amazon S3 bucket, you must specify the same key for your Amazon MWAA environment.

      Note

      You must have permissions to the key to select it on the Amazon MWAA console. You must also grant permissions for Amazon MWAA to use the key by attaching the policy described in Attach key policy.

  8. Recommended. Under Monitoring, choose one or more log categories for Airflow logging configuration to send Apache Airflow logs to CloudWatch Logs:

    1. Airflow task logs. Choose the type of Apache Airflow task logs to send to CloudWatch Logs in Log level.

    2. Airflow web server logs. Choose the type of Apache Airflow web server logs to send to CloudWatch Logs in Log level.

    3. Airflow scheduler logs. Choose the type of Apache Airflow scheduler logs to send to CloudWatch Logs in Log level.

    4. Airflow worker logs. Choose the type of Apache Airflow worker logs to send to CloudWatch Logs in Log level.

    5. Airflow DAG processing logs. Choose the type of Apache Airflow DAG processing logs to send to CloudWatch Logs in Log level.

  9. Optional. For Airflow configuration options, choose Add custom configuration option.

    You can choose from the suggested dropdown list of Apache Airflow configuration options for your Apache Airflow version, or specify custom configuration options. For example, core.default_task_retries : 3.

  10. Optional. Under Tags, choose Add new tag to associate tags to your environment. For example, Environment: Staging.

  11. Under Permissions, choose an execution role:

    1. By default, Create a new role is selected, and Amazon MWAA creates an execution role for you. You must have permission to create IAM roles to use this option.

    2. Optional. Choose Enter role ARN to enter the Amazon Resource Name (ARN) of an existing execution role.

  12. Choose Next.
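The advanced settings in step two can be sketched as the matching portion of a `CreateEnvironment` request. This is an illustration under stated assumptions, not the console's own logic. The helper name `build_advanced_settings` is hypothetical, it assumes you pass existing security groups and an existing execution role rather than having Amazon MWAA create them, and it applies one log level to all five log categories for brevity. The checks reflect the limits described above: two subnets, up to 5 security groups, and no AWS KMS aliases or multi-Region keys.

```python
VALID_LOG_LEVELS = {"CRITICAL", "ERROR", "WARNING", "INFO", "DEBUG"}
LOG_COMPONENTS = ("TaskLogs", "WebserverLogs", "SchedulerLogs",
                  "WorkerLogs", "DagProcessingLogs")


def build_advanced_settings(subnet_ids, security_group_ids, execution_role_arn,
                            *, private_web_server, environment_class,
                            max_workers, min_webservers, max_webservers,
                            kms_key=None, log_level="INFO",
                            airflow_options=None, tags=None):
    """Build the step-two portion of a CreateEnvironment request."""
    if len(subnet_ids) != 2:
        raise ValueError("expected exactly two private subnet IDs")
    if not 1 <= len(security_group_ids) <= 5:
        raise ValueError("expected between one and five security group IDs")
    if min_webservers > max_webservers:
        raise ValueError("minimum web server count exceeds the maximum")
    if log_level not in VALID_LOG_LEVELS:
        raise ValueError(f"unsupported log level: {log_level}")
    if kms_key is not None:
        key_id = kms_key.rsplit("/", 1)[-1]
        # Aliases contain "alias/"; multi-Region key IDs begin with "mrk-".
        if "alias/" in kms_key or key_id.startswith("mrk-"):
            raise ValueError("aliases and multi-Region keys are not supported")

    settings = {
        "NetworkConfiguration": {
            "SubnetIds": list(subnet_ids),
            "SecurityGroupIds": list(security_group_ids),
        },
        "WebserverAccessMode": "PRIVATE_ONLY" if private_web_server else "PUBLIC_ONLY",
        "EnvironmentClass": environment_class,
        "MaxWorkers": max_workers,
        "MinWebservers": min_webservers,
        "MaxWebservers": max_webservers,
        # Same level for every category here; the console lets you set each one.
        "LoggingConfiguration": {
            component: {"Enabled": True, "LogLevel": log_level}
            for component in LOG_COMPONENTS
        },
        "ExecutionRoleArn": execution_role_arn,
    }
    if kms_key is not None:
        settings["KmsKey"] = kms_key
    if airflow_options:
        # Configuration option values are sent as strings.
        settings["AirflowConfigurationOptions"] = {
            key: str(value) for key, value in airflow_options.items()
        }
    if tags:
        settings["Tags"] = dict(tags)
    return settings
```

Merged with the step-one parameters, the resulting dictionary could be passed as keyword arguments to the boto3 MWAA client's `create_environment` call. For example, `airflow_options={"core.default_task_retries": 3}` and `tags={"Environment": "Staging"}` reproduce the examples given in the steps above.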

Step three: Review and create

To review an environment summary
  • Review the environment summary, then choose Create environment.

    Note

    It takes about twenty to thirty minutes to create an environment.
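Because creation takes a while, a script that creates an environment typically polls until the environment is ready. The following is a minimal sketch, assuming `mwaa_client` is a boto3 MWAA client (for example, `boto3.client("mwaa")`). The function name `wait_for_environment`, the polling interval, and the 45-minute timeout are illustrative assumptions.

```python
import time


def wait_for_environment(mwaa_client, name, poll_seconds=60,
                         timeout_seconds=45 * 60, sleep=time.sleep):
    """Poll GetEnvironment until the environment is AVAILABLE.

    `sleep` is injectable so the loop can be tested without waiting.
    """
    waited = 0
    while True:
        status = mwaa_client.get_environment(Name=name)["Environment"]["Status"]
        if status == "AVAILABLE":
            return status
        if status == "CREATE_FAILED":
            raise RuntimeError(f"environment {name} failed to create")
        if waited >= timeout_seconds:
            raise TimeoutError(f"environment {name} still {status} after {waited}s")
        sleep(poll_seconds)
        waited += poll_seconds
```

The timeout is set above the expected twenty to thirty minutes so that a slow but successful creation is not reported as a failure.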