EMRStudio 中的笔记本CLI命令示例 - Amazon EMR

本文属于机器翻译版本。若本译文内容与英语原文存在差异,则一律以英文原文为准。

EMRStudio 中的笔记本CLI命令示例

本主题显示EMR笔记本的CLI命令示例。该示例使用笔记本控制台中的演示EMR笔记本。要查找笔记本,请使用相对于主目录的文件路径。在此示例中,您可以运行两个笔记本文件:demo_pyspark.ipynbmy_folder/python3.ipynb

注意

EMR笔记本电脑可在控制台中作为 EMR Studio 工作区使用。控制台中的 “创建工作区” 按钮允许您创建新的笔记本。要访问或创建工作区,EMRNotebook 用户需要额外的IAM角色权限。有关更多信息,请参阅 Amazon EMR Notebook 是控制台和亚马逊控制台中的 Ama z EMR on EMR Studio 工作空间。

文件 demo_pyspark.ipynb 的相对路径是 demo_pyspark.ipynb,如下所示。

Jupyter notebook interface showing a file explorer and code editor with PySpark content.

python3.ipynb 的相对路径是 my_folder/python3.ipynb,如下所示。

File explorer showing python3.ipynb in my_folder, and Jupyter notebook interface with code.

有关亚马逊EMRAPINotebookExecution操作的信息,请参阅亚马逊EMRAPI操作。

运行笔记本

您可以使用 AWS CLI 来运行带有start-notebook-execution操作的笔记本,如以下示例所示。

例 — 在带有亚马逊EMR(在亚马逊上运行EC2)集群的 EMR Studio 工作区中执行EMR笔记本
aws emr --region us-east-1 \ start-notebook-execution \ --editor-id e-ABCDEFG123456 \ --notebook-params '{"input_param":"my-value", "good_superhero":["superman", "batman"]}' \ --relative-path test.ipynb \ --notebook-execution-name my-execution \ --execution-engine '{"Id" : "j-1234ABCD123"}' \ --service-role EMR_Notebooks_DefaultRole { "NotebookExecutionId": "ex-ABCDEFGHIJ1234ABCD" }
例 — 在带有EMR笔记本集群的 EMR Studio 工作区中执行EMR笔记本
aws emr start-notebook-execution \ --region us-east-1 \ --service-role EMR_Notebooks_DefaultRole \ --environment-variables '{"KERNEL_EXTRA_SPARK_OPTS": "--conf spark.executor.instances=1", "KERNEL_LAUNCH_TIMEOUT": "350"}' \ --output-notebook-format HTML \ --execution-engine Id=arn:aws:emr-containers:us-west-2:account-id:/virtualclusters/ABCDEFG/endpoints/ABCDEF,Type=EMR_ON_EKS,ExecutionRoleArn=arn:aws:iam::account-id:role/execution-role \ --editor-id e-ABCDEFG \ --relative-path EMRonEKS-spark_python.ipynb
例 — 执行指定其 Amazon S3 位置的EMR笔记本电脑
aws emr start-notebook-execution \ --region us-east-1 \ --notebook-execution-name my-execution-on-emr-on-eks-cluster \ --service-role EMR_Notebooks_DefaultRole \ --environment-variables '{"KERNEL_EXTRA_SPARK_OPTS": "--conf spark.executor.instances=1", "KERNEL_LAUNCH_TIMEOUT": "350"}' \ --output-notebook-format HTML \ --execution-engine Id=arn:aws:emr-containers:us-west-2:account-id:/virtualclusters/ABCDEF/endpoints/ABCDEF,Type=EMR_ON_EKS,ExecutionRoleArn=arn:aws:iam::account-id:role/execution-role \ --notebook-s3-location '{"Bucket": "amzn-s3-demo-bucket","Key": "s3-prefix-to-notebook-location/EMRonEKS-spark_python.ipynb"}' \ --output-notebook-s3-location '{"Bucket": "amzn-s3-demo-bucket","Key": "s3-prefix-for-storing-output-notebook"}'

笔记本输出

下面是示例笔记本的输出内容。单元格 3 显示新注入的参数值。

Jupyter notebook cells showing Python code and output for parameter injection and manipulation.

描述笔记本

您可以使用 describe-notebook-execution 操作以访问关于特定笔记本执行的信息。

aws emr --region us-east-1 \ describe-notebook-execution --notebook-execution-id ex-IZWZZVR9DKQ9WQ7VZWXJZR29UGHTE { "NotebookExecution": { "NotebookExecutionId": "ex-IZWZZVR9DKQ9WQ7VZWXJZR29UGHTE", "EditorId": "e-BKTM2DIHXBEDRU44ANWRKIU8N", "ExecutionEngine": { "Id": "j-2QMOV6JAX1TS2", "Type": "EMR", "MasterInstanceSecurityGroupId": "sg-05ce12e58cd4f715e" }, "NotebookExecutionName": "my-execution", "NotebookParams": "{\"input_param\":\"my-value\", \"good_superhero\":[\"superman\", \"batman\"]}", "Status": "FINISHED", "StartTime": 1593490857.009, "Arn": "arn:aws:elasticmapreduce:us-east-1:123456789012:notebook-execution/ex-IZWZZVR9DKQ9WQ7VZWXJZR29UGHTE", "LastStateChangeReason": "Execution is finished for cluster j-2QMOV6JAX1TS2.", "NotebookInstanceSecurityGroupId": "sg-0683b0a39966d4a6a", "Tags": [] } }

停止运行笔记本

如果您想要停止笔记本正在运行的执行,您可以借助 stop-notebook-execution 命令。

# stop a running execution aws emr --region us-east-1 \ stop-notebook-execution --notebook-execution-id ex-IZWZX78UVPAATC8LHJR129B1RBN4T # describe it aws emr --region us-east-1 \ describe-notebook-execution --notebook-execution-id ex-IZWZX78UVPAATC8LHJR129B1RBN4T { "NotebookExecution": { "NotebookExecutionId": "ex-IZWZX78UVPAATC8LHJR129B1RBN4T", "EditorId": "e-BKTM2DIHXBEDRU44ANWRKIU8N", "ExecutionEngine": { "Id": "j-2QMOV6JAX1TS2", "Type": "EMR" }, "NotebookExecutionName": "my-execution", "NotebookParams": "{\"input_param\":\"my-value\", \"good_superhero\":[\"superman\", \"batman\"]}", "Status": "STOPPED", "StartTime": 1593490876.241, "Arn": "arn:aws:elasticmapreduce:us-east-1:123456789012:editor-execution/ex-IZWZX78UVPAATC8LHJR129B1RBN4T", "LastStateChangeReason": "Execution is stopped for cluster j-2QMOV6JAX1TS2. Internal error", "Tags": [] } }

按开启时间列出笔记本的执行情况

您可以将 --from 参数输入至 list-notebook-executions,按开启时间列出笔记本的执行情况。

# filter by start time aws emr --region us-east-1 \ list-notebook-executions --from 1593400000.000 { "NotebookExecutions": [ { "NotebookExecutionId": "ex-IZWZX78UVPAATC8LHJR129B1RBN4T", "EditorId": "e-BKTM2DIHXBEDRU44ANWRKIU8N", "NotebookExecutionName": "my-execution", "Status": "STOPPED", "StartTime": 1593490876.241 }, { "NotebookExecutionId": "ex-IZWZZVR9DKQ9WQ7VZWXJZR29UGHTE", "EditorId": "e-BKTM2DIHXBEDRU44ANWRKIU8N", "NotebookExecutionName": "my-execution", "Status": "RUNNING", "StartTime": 1593490857.009 }, { "NotebookExecutionId": "ex-IZWZYRS0M14L5V95WZ9OQ399SKMNW", "EditorId": "e-BKTM2DIHXBEDRU44ANWRKIU8N", "NotebookExecutionName": "my-execution", "Status": "STOPPED", "StartTime": 1593490292.995 }, { "NotebookExecutionId": "ex-IZX009ZK83IVY5E33VH8MDMELVK8K", "EditorId": "e-BKTM2DIHXBEDRU44ANWRKIU8N", "NotebookExecutionName": "my-execution", "Status": "FINISHED", "StartTime": 1593489834.765 }, { "NotebookExecutionId": "ex-IZWZXOZF88JWDF9J09GJ91R57VI0N", "EditorId": "e-BKTM2DIHXBEDRU44ANWRKIU8N", "NotebookExecutionName": "my-execution", "Status": "FAILED", "StartTime": 1593488934.688 } ] }

按开启时间和状态列出笔记本的执行情况

list-notebook-executions 命令也可以采用 --status 参数来筛选结果。

# filter by start time and status aws emr --region us-east-1 \ list-notebook-executions --from 1593400000.000 --status FINISHED { "NotebookExecutions": [ { "NotebookExecutionId": "ex-IZWZZVR9DKQ9WQ7VZWXJZR29UGHTE", "EditorId": "e-BKTM2DIHXBEDRU44ANWRKIU8N", "NotebookExecutionName": "my-execution", "Status": "FINISHED", "StartTime": 1593490857.009 }, { "NotebookExecutionId": "ex-IZX009ZK83IVY5E33VH8MDMELVK8K", "EditorId": "e-BKTM2DIHXBEDRU44ANWRKIU8N", "NotebookExecutionName": "my-execution", "Status": "FINISHED", "StartTime": 1593489834.765 } ] }