Changing a DAG's timezone on Amazon MWAA

Apache Airflow schedules your directed acyclic graph (DAG) in UTC+0 by default. The following steps show how you can change the timezone in which Amazon MWAA runs your DAGs with Pendulum. Optionally, this topic demonstrates how you can create a custom plugin to change the timezone for your environment's Apache Airflow logs.
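
The default comes from Apache Airflow's core.default_timezone configuration option, which the scheduler uses to interpret naive dates and cron schedules. The following minimal sketch, which you could run wherever Apache Airflow is installed, only illustrates how to read that setting; on a default installation it prints utc.

    from airflow.configuration import conf

    # Apache Airflow interprets naive dates and cron schedules in this timezone.
    # On a default installation this prints "utc".
    print(conf.get("core", "default_timezone"))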

Version

  • You can use the code example on this page with Apache Airflow v2 on Python 3.10.

Prerequisites

To use the sample code on this page, you need the following:

  • An Amazon MWAA environment.

Permissions

  • No additional permissions are required to use the code example on this page.

Create a plugin to change the timezone in Airflow logs

Apache Airflow will run the Python files in the plugins directory at start-up. With the following plugin, you can override the executor's timezone, which modifies the timezone in which Apache Airflow writes logs.

  1. Create a directory named plugins for your custom plugin, and navigate to the directory. For example:

    $ mkdir plugins
    $ cd plugins
  2. Copy the contents of the following code sample and save locally as dag-timezone-plugin.py in the plugins folder.

    import time
    import os

    # Set the TZ environment variable and apply it so that this process writes
    # timestamps, including Apache Airflow log timestamps, in the new timezone.
    os.environ['TZ'] = 'America/Los_Angeles'
    time.tzset()
  3. In the plugins directory, create an empty Python file named __init__.py. Your plugins directory should be similar to the following:

    plugins/
    |-- __init__.py
    |-- dag-timezone-plugin.py
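
The plugin works by setting the TZ environment variable for the Apache Airflow process and calling time.tzset(). As a quick local check, the following minimal sketch runs the same two calls in a plain Python process and prints the process timezone before and after; it is only an illustration and is not part of the plugin.

    import os
    import time

    # Before the override, the process reports the machine's current timezone.
    print(time.strftime("%Z %z"))

    # Apply the same override the plugin uses.
    os.environ['TZ'] = 'America/Los_Angeles'
    time.tzset()

    # After tzset(), timestamps produced by this process use the new timezone,
    # for example: PDT -0700.
    print(time.strftime("%Z %z"))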

Create a plugins.zip

The following steps show how to create plugins.zip. The content of this example can be combined with other plugins and binaries into a single plugins.zip file.

  1. In your command prompt, navigate to the plugins directory from the previous step. For example:

    $ cd plugins
  2. Zip the contents within your plugins directory.

    $ zip -r ../plugins.zip ./
  3. Upload plugins.zip to your Amazon S3 bucket (a sketch after these steps shows one way to associate the new file with your environment).

    $ aws s3 cp plugins.zip s3://your-mwaa-bucket/
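
Uploading a new plugins.zip does not by itself update a running environment; you also associate the file (and, on a versioned bucket, a specific object version) with the environment and wait for the update to complete. The following boto3 sketch shows one way to do that; the environment name, path, and version ID are placeholders, not values from this example.

    import boto3

    mwaa = boto3.client("mwaa")

    # Point the environment at the uploaded plugins.zip. On a versioned bucket,
    # you can pin the exact object version that you just uploaded.
    mwaa.update_environment(
        Name="your-mwaa-environment",
        PluginsS3Path="plugins.zip",
        PluginsS3ObjectVersion="example-object-version-id",
    )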

Code sample

To change the default timezone (UTC+0) in which the DAG runs, we'll use Pendulum, a Python library for working with timezone-aware datetimes.
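
Before the steps below, the following minimal sketch illustrates the difference Pendulum makes: a naive datetime carries no timezone and is treated as UTC by default, while attaching a Pendulum timezone makes the start date timezone-aware, so the cron schedule is interpreted in local time.

    import pendulum
    from datetime import datetime

    # A named timezone from the IANA database.
    local_tz = pendulum.timezone("America/Los_Angeles")

    # A naive datetime has no timezone and is interpreted as UTC by default.
    naive_start = datetime(2022, 1, 1)
    print(naive_start.tzinfo)        # None

    # Attaching the Pendulum timezone makes the datetime timezone-aware, so a
    # schedule such as "0 12 * * *" means 12:00 in Los Angeles, not 12:00 UTC.
    aware_start = datetime(2022, 1, 1, tzinfo=local_tz)
    print(aware_start.tzinfo)        # the America/Los_Angeles timezone object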

  1. In your command prompt, navigate to the directory where your DAGs are stored. For example:

    $ cd dags
  2. Copy the content of the following example and save as tz-aware-dag.py.

    from airflow import DAG
    from airflow.operators.bash_operator import BashOperator
    from datetime import datetime, timedelta

    # Import the Pendulum library.
    import pendulum

    # Instantiate Pendulum and set your timezone.
    local_tz = pendulum.timezone("America/Los_Angeles")

    with DAG(
        dag_id="tz_test",
        schedule_interval="0 12 * * *",
        catchup=False,
        start_date=datetime(2022, 1, 1, tzinfo=local_tz)
    ) as dag:
        bash_operator_task = BashOperator(
            task_id="tz_aware_task",
            dag=dag,
            bash_command="date"
        )
  3. Run the following AWS CLI command to copy the DAG to your environment's bucket, then trigger the DAG using the Apache Airflow UI.

    $ aws s3 cp tz-aware-dag.py s3://your-environment-bucket/dags/
  4. If successful, you'll see output similar to the following in the task logs for the tz_aware_task in the tz_test DAG:

    [2022-08-01, 12:00:00 PDT] {{subprocess.py:74}} INFO - Running command: ['bash', '-c', 'date']
    [2022-08-01, 12:00:00 PDT] {{subprocess.py:85}} INFO - Output:
    [2022-08-01, 12:00:00 PDT] {{subprocess.py:89}} INFO - Mon Aug  1 12:00:00 PDT 2022
    [2022-08-01, 12:00:00 PDT] {{subprocess.py:93}} INFO - Command exited with return code 0
    [2022-08-01, 12:00:00 PDT] {{taskinstance.py:1280}} INFO - Marking task as SUCCESS. dag_id=tz_test, task_id=tz_aware_task, execution_date=20220801T190033, start_date=20220801T190035, end_date=20220801T190035
    [2022-08-01, 12:00:00 PDT] {{local_task_job.py:154}} INFO - Task exited with return code 0
    [2022-08-01, 12:00:00 PDT] {{local_task_job.py:264}} INFO - 0 downstream tasks scheduled from follow-on schedule check
                    

What's next?