為 Apache 氣流創建一個自定義插件 PythonVirtualenvOperator - Amazon Managed Workflows for Apache Airflow

本文為英文版的機器翻譯版本,如內容有任何歧義或不一致之處,概以英文版為準。

為 Apache 氣流創建一個自定義插件 PythonVirtualenvOperator

下列範例顯示如何在適用於 Apache 氣流 PythonVirtualenvOperator 的 Amazon 受管工作流程上使用自訂外掛程式修補 Apache 氣流。

版本

  • 此頁面上的示例代碼可以與 Python 3.7 中的阿帕奇氣流 V1 一起使用。

  • 您可以使用此頁面上的代碼示例與 Python 3.10 中的阿帕奇氣流 V2

必要條件

若要使用此頁面上的範例程式碼,您需要下列項目:

許可

  • 使用此頁面上的程式碼範例不需要其他權限。

要求

若要使用此頁面上的範例程式碼,請將下列相依性新增至requirements.txt. 如需進一步了解,請參閱 安裝 Python 相依性

virtualenv

自定義插件示例代碼

阿帕奇氣流將在啟動時執行插件文件夾中的 Python 文件的內容。該插件將PythonVirtualenvOperator在該啟動過程中對內置進行修補,以使其與 Amazon 兼容MWAA。下列步驟顯示自訂外掛程式的範例程式碼。

Apache Airflow v2
  1. 在命令提示符中,導航到上面的plugins目錄。例如:

    cd plugins
  2. 複製下列程式碼範例的內容,並在本機儲存為virtual_python_plugin.py

    """ Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved. Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so. THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. """ from airflow.plugins_manager import AirflowPlugin import airflow.utils.python_virtualenv from typing import List def _generate_virtualenv_cmd(tmp_dir: str, python_bin: str, system_site_packages: bool) -> List[str]: cmd = ['python3','/usr/local/airflow/.local/lib/python3.7/site-packages/virtualenv', tmp_dir] if system_site_packages: cmd.append('--system-site-packages') if python_bin is not None: cmd.append(f'--python={python_bin}') return cmd airflow.utils.python_virtualenv._generate_virtualenv_cmd=_generate_virtualenv_cmd class VirtualPythonPlugin(AirflowPlugin): name = 'virtual_python_plugin'
Apache Airflow v1
  1. 在命令提示符中,導航到上面的plugins目錄。例如:

    cd plugins
  2. 複製下列程式碼範例的內容,並在本機儲存為virtual_python_plugin.py

    from airflow.plugins_manager import AirflowPlugin from airflow.operators.python_operator import PythonVirtualenvOperator def _generate_virtualenv_cmd(self, tmp_dir): cmd = ['python3','/usr/local/airflow/.local/lib/python3.7/site-packages/virtualenv', tmp_dir] if self.system_site_packages: cmd.append('--system-site-packages') if self.python_version is not None: cmd.append('--python=python{}'.format(self.python_version)) return cmd PythonVirtualenvOperator._generate_virtualenv_cmd=_generate_virtualenv_cmd class EnvVarPlugin(AirflowPlugin): name = 'virtual_python_plugin'

Plugins.zip

下列步驟顯示如何建立plugins.zip.

  1. 在命令提示符中,導航到virtual_python_plugin.py上面包含的目錄。例如:

    cd plugins
  2. 壓縮文plugins件夾中的內容。

    zip plugins.zip virtual_python_plugin.py

範例程式碼

下列步驟說明如何建立自訂外掛程式的程式DAG碼。

Apache Airflow v2
  1. 在命令提示符中,導航到存儲DAG代碼的目錄。例如:

    cd dags
  2. 複製下列程式碼範例的內容,並在本機儲存為virtualenv_test.py

    """ Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved. Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so. THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. """ from airflow import DAG from airflow.operators.python import PythonVirtualenvOperator from airflow.utils.dates import days_ago import os os.environ["PATH"] = os.getenv("PATH") + ":/usr/local/airflow/.local/bin" def virtualenv_fn(): import boto3 print("boto3 version ",boto3.__version__) with DAG(dag_id="virtualenv_test", schedule_interval=None, catchup=False, start_date=days_ago(1)) as dag: virtualenv_task = PythonVirtualenvOperator( task_id="virtualenv_task", python_callable=virtualenv_fn, requirements=["boto3>=1.17.43"], system_site_packages=False, dag=dag, )
Apache Airflow v1
  1. 在命令提示符中,導航到存儲DAG代碼的目錄。例如:

    cd dags
  2. 複製下列程式碼範例的內容,並在本機儲存為virtualenv_test.py

    """ Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved. Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so. THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. """ from airflow import DAG from airflow.operators.python_operator import PythonVirtualenvOperator from airflow.utils.dates import days_ago import os os.environ["PATH"] = os.getenv("PATH") + ":/usr/local/airflow/.local/bin" def virtualenv_fn(): import boto3 print("boto3 version ",boto3.__version__) with DAG(dag_id="virtualenv_test", schedule_interval=None, catchup=False, start_date=days_ago(1)) as dag: virtualenv_task = PythonVirtualenvOperator( task_id="virtualenv_task", python_callable=virtualenv_fn, requirements=["boto3>=1.17.43"], system_site_packages=False, dag=dag, )

氣流組態選項

如果您使用的是 Apache 氣流 v2,請添加core.lazy_load_plugins : False為 Apache 氣流配置選項。若要深入瞭解,請參閱使用設定選項載入外掛程式 2

後續步驟?

  • 在中了解如何將此範例中的requirements.txt檔案上傳到您的 Amazon S3 儲存貯體安裝 Python 相依性

  • 了解如何將此範例中的DAG程式碼上傳到的 Amazon S3 儲存貯體中的dags資料夾新增或更新 DAGs

  • 在本範例中進一步了解如何將plugins.zip檔案上傳到中的 Amazon S3 儲存貯體安裝自定義插件