Scheduling and running visual flows with workflows - Amazon SageMaker Unified Studio

Scheduling and running visual flows with workflows

You can use Workflows to run the Visual ETL flows you have authored on a schedule. The following example shows how:

  1. Create a Visual ETL flow and name it "mwaa-test".

  2. Save your draft flow (“mwaa-test.vetl”) to your project.

    The Amazon SageMaker Unified Studio UI showing the option to clone to Notebook.
  3. Navigate to the Build → Workflows menu and click “Create workflow in editor”.

    The Amazon SageMaker Unified Studio UI showing the option to "Create workflow in editor".
  4. You will now see an example DAG template in JupyterLab.

    The Amazon SageMaker Unified Studio JupyterLab UI showing the DAG template.
  5. Modify the lines of Python code as shown below, then save the file as “mwaa_test_dag.py”. This example runs the dataflow at 8 AM every day. By default, the dataflow’s notebook file is under the path “src/dataflows”.

    WORKFLOW_SCHEDULE = '0 8 * * *'
    NOTEBOOK_PATH = 'src/dataflows/mwaa-test.vetl'
    dag_id = "workflow-mwaa-test"  # optional, set to give your workflow a meaningful name

    The Amazon SageMaker Unified Studio JupyterLab UI showing the notebook path and workflow schedule variables modified.
  6. Pull the file “dataflows/mwaa-test.vetl” from the project’s source code repository to JupyterLab.

    The Amazon SageMaker Unified Studio UI showing the "VETL" file in the source code repo for JupyterLab.
    The Amazon SageMaker Unified Studio UI showing a successful pull from the source repo.
  7. Navigate back to the Workflows console, where the DAG now appears. You can open the Airflow UI from the “Actions” dropdown list.

    The Amazon SageMaker Unified Studio UI showing the option to "Open Airflow UI" in the Workflow section.
  8. Manually trigger the DAG.

    The Airflow UI showing the option to Trigger DAG.
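
The schedule set in step 5, '0 8 * * *', is a five-field cron expression (minute, hour, day of month, month, day of week) that reads as "minute 0 of hour 8, every day". The following is a minimal sketch of how that expression maps to run times. It uses only the Python standard library, the helper `next_daily_runs` is hypothetical and not part of the DAG template or of Airflow, and the sample timestamp assumes the Airflow environment uses its default UTC timezone.

```python
from datetime import datetime, timedelta, timezone

WORKFLOW_SCHEDULE = '0 8 * * *'  # same value as in step 5


def next_daily_runs(cron: str, after: datetime, count: int = 3) -> list[datetime]:
    """Return the next `count` run times for a cron of the form 'M H * * *'.

    Hypothetical helper for illustration only: it handles just the daily
    pattern used in this walkthrough, not general cron syntax.
    """
    minute, hour, day, month, weekday = cron.split()
    if (day, month, weekday) != ('*', '*', '*'):
        raise ValueError('only daily schedules of the form "M H * * *" are supported')

    # First candidate: today's date at the scheduled hour and minute.
    candidate = after.replace(hour=int(hour), minute=int(minute), second=0, microsecond=0)
    if candidate <= after:
        # Today's slot has already passed, so start from tomorrow.
        candidate += timedelta(days=1)
    return [candidate + timedelta(days=i) for i in range(count)]


# Assumed reference time (UTC), chosen only for the example.
now = datetime(2025, 1, 15, 9, 30, tzinfo=timezone.utc)
for run in next_daily_runs(WORKFLOW_SCHEDULE, now):
    print(run.isoformat())
```

Because the assumed reference time (09:30) is already past 08:00, the first run the sketch reports falls on the following day. If the run times need to follow a local timezone rather than UTC, adjust the hour field of `WORKFLOW_SCHEDULE` accordingly.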