為 Apache Airflow PythonVirtualenvOperator 建立自訂外掛程式 - Amazon Managed Workflows for Apache Airflow

本文為英文版的機器翻譯版本,如內容有任何歧義或不一致之處,概以英文版為準。

為 Apache Airflow PythonVirtualenvOperator 建立自訂外掛程式

下列範例示範如何將 Apache Airflow PythonVirtualenvOperator 修補為 Amazon Managed Workflows for Apache Airflow 上的自訂外掛程式。

版本

  • 您可以在 Python 3.10 中使用此頁面上的程式碼範例搭配 Apache Airflow v2

先決條件

若要使用此頁面上的範例程式碼,您需要下列項目:

許可

  • 使用此頁面上的程式碼範例不需要額外的許可。

要求

若要使用此頁面上的範例程式碼,請將下列相依性新增至您的 requirements.txt。如需進一步了解,請參閱 安裝 Python 相依性

virtualenv

自訂外掛程式範例程式碼

Apache Airflow 會在啟動時執行外掛程式資料夾中 Python 檔案的內容。此外掛程式會在PythonVirtualenvOperator該啟動程序中修補內建 ,使其與 Amazon MWAA 相容。下列步驟顯示自訂外掛程式的範例程式碼。

Apache Airflow v2
  1. 在命令提示中,導覽至上述plugins目錄。例如:

    cd plugins
  2. 複製下列程式碼範例的內容,並在本機儲存為 virtual_python_plugin.py

    """ Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved. Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so. THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. """ from airflow.plugins_manager import AirflowPlugin import airflow.utils.python_virtualenv from typing import List def _generate_virtualenv_cmd(tmp_dir: str, python_bin: str, system_site_packages: bool) -> List[str]: cmd = ['python3','/usr/local/airflow/.local/lib/python3.7/site-packages/virtualenv', tmp_dir] if system_site_packages: cmd.append('--system-site-packages') if python_bin is not None: cmd.append(f'--python={python_bin}') return cmd airflow.utils.python_virtualenv._generate_virtualenv_cmd=_generate_virtualenv_cmd class VirtualPythonPlugin(AirflowPlugin): name = 'virtual_python_plugin'
Apache Airflow v1
  1. 在命令提示中,導覽至上述plugins目錄。例如:

    cd plugins
  2. 複製下列程式碼範例的內容,並在本機儲存為 virtual_python_plugin.py

    from airflow.plugins_manager import AirflowPlugin from airflow.operators.python_operator import PythonVirtualenvOperator def _generate_virtualenv_cmd(self, tmp_dir): cmd = ['python3','/usr/local/airflow/.local/lib/python3.7/site-packages/virtualenv', tmp_dir] if self.system_site_packages: cmd.append('--system-site-packages') if self.python_version is not None: cmd.append('--python=python{}'.format(self.python_version)) return cmd PythonVirtualenvOperator._generate_virtualenv_cmd=_generate_virtualenv_cmd class EnvVarPlugin(AirflowPlugin): name = 'virtual_python_plugin'

Plugins.zip

下列步驟說明如何建立 plugins.zip

  1. 在命令提示中,導覽至包含virtual_python_plugin.py上述內容的目錄。例如:

    cd plugins
  2. 壓縮plugins資料夾中的內容。

    zip plugins.zip virtual_python_plugin.py

範例程式碼

下列步驟說明如何建立自訂外掛程式的 DAG 程式碼。

Apache Airflow v2
  1. 在命令提示中,導覽至存放 DAG 程式碼的目錄。例如:

    cd dags
  2. 複製下列程式碼範例的內容,並在本機儲存為 virtualenv_test.py

    """ Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved. Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so. THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. """ from airflow import DAG from airflow.operators.python import PythonVirtualenvOperator from airflow.utils.dates import days_ago import os os.environ["PATH"] = os.getenv("PATH") + ":/usr/local/airflow/.local/bin" def virtualenv_fn(): import boto3 print("boto3 version ",boto3.__version__) with DAG(dag_id="virtualenv_test", schedule_interval=None, catchup=False, start_date=days_ago(1)) as dag: virtualenv_task = PythonVirtualenvOperator( task_id="virtualenv_task", python_callable=virtualenv_fn, requirements=["boto3>=1.17.43"], system_site_packages=False, dag=dag, )
Apache Airflow v1
  1. 在命令提示中,導覽至存放 DAG 程式碼的目錄。例如:

    cd dags
  2. 複製下列程式碼範例的內容,並在本機儲存為 virtualenv_test.py

    """ Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved. Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so. THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. """ from airflow import DAG from airflow.operators.python_operator import PythonVirtualenvOperator from airflow.utils.dates import days_ago import os os.environ["PATH"] = os.getenv("PATH") + ":/usr/local/airflow/.local/bin" def virtualenv_fn(): import boto3 print("boto3 version ",boto3.__version__) with DAG(dag_id="virtualenv_test", schedule_interval=None, catchup=False, start_date=days_ago(1)) as dag: virtualenv_task = PythonVirtualenvOperator( task_id="virtualenv_task", python_callable=virtualenv_fn, requirements=["boto3>=1.17.43"], system_site_packages=False, dag=dag, )

氣流組態選項

如果您使用的是 Apache Airflow v2,請將 新增core.lazy_load_plugins : False為 Apache Airflow 組態選項。若要進一步了解,請參閱使用組態選項載入 2 中的外掛程式

後續步驟?

  • 了解如何在此範例中將requirements.txt檔案上傳至 中的 Amazon S3 儲存貯體安裝 Python 相依性

  • 了解如何在此範例中將 DAG 程式碼上傳至 Amazon S3 儲存貯體中的 dags 資料夾新增或更新 DAGs

  • 進一步了解如何在此範例中將plugins.zip檔案上傳至 中的 Amazon S3 儲存貯體安裝自訂外掛程式