Configure data input mode using the SageMaker Python SDK

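The SageMaker Python SDK provides the generic Estimator class and its variations for ML frameworks, which you use to launch training jobs. You can specify one of the data input modes when configuring the SageMaker Estimator class or when calling the Estimator.fit method. The following code templates show the two ways to specify the input mode.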

To specify the input mode using the Estimator class


from sagemaker.estimator import Estimator
from sagemaker.inputs import TrainingInput

estimator = Estimator(
    checkpoint_s3_uri='s3://amzn-s3-demo-bucket/checkpoint-destination/',
    output_path='s3://amzn-s3-demo-bucket/output-path/',
    base_job_name='job-name',
    input_mode='File',  # Available options: File | Pipe | FastFile
    ...
)

# Run the training job
estimator.fit(
    inputs=TrainingInput(s3_data="s3://amzn-s3-demo-bucket/my-data/train")
)

For more information, see the sagemaker.estimator.Estimator class in the SageMaker Python SDK documentation.

To specify the input mode through the estimator.fit() method


from sagemaker.estimator import Estimator
from sagemaker.inputs import TrainingInput

estimator = Estimator(
    checkpoint_s3_uri='s3://amzn-s3-demo-bucket/checkpoint-destination/',
    output_path='s3://amzn-s3-demo-bucket/output-path/',
    base_job_name='job-name',
    ...
)

# Run the training job
estimator.fit(
    inputs=TrainingInput(
        s3_data="s3://amzn-s3-demo-bucket/my-data/train",
        input_mode='File'  # Available options: File | Pipe | FastFile
    )
)

For more information, see the sagemaker.estimator.Estimator.fit class method and the sagemaker.inputs.TrainingInput class in the SageMaker Python SDK documentation.

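Tip

To learn more about how to configure Amazon FSx for Lustre or Amazon EFS with your VPC configuration using the SageMaker Python SDK estimators, see Use File Systems as Training Inputs in the SageMaker Python SDK documentation.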

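Tip

We recommend the data input mode integrations with Amazon S3, Amazon EFS, and Amazon FSx for Lustre as the best-practice way to configure your data source. The SageMaker managed storage options and input modes can improve data loading performance, but you are not strictly limited to them. You can write your own data reading logic directly in your training container. For example, you can read from a different data source, write your own S3 data loader class, or use a third-party framework's data loading functions in your training script. However, you must make sure that you specify the correct paths that SageMaker can recognize.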
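As a rough illustration of custom data reading logic, the following sketch walks a channel directory inside the training container and yields the files it finds. It assumes File or FastFile mode and the conventional mount point /opt/ml/input/data/<channel_name>; the function name iter_channel_files and the root parameter are hypothetical helpers for this example, not part of any SageMaker API.

```python
from pathlib import Path
from typing import Iterator

# Assumption: input channels are mounted under this directory in the container.
DEFAULT_INPUT_ROOT = "/opt/ml/input/data"


def iter_channel_files(channel: str, root: str = DEFAULT_INPUT_ROOT) -> Iterator[Path]:
    """Yield every regular file under <root>/<channel>, in sorted path order."""
    channel_dir = Path(root) / channel
    if not channel_dir.is_dir():
        # Fail early with the path that was tried, so a wrong channel name is obvious.
        raise FileNotFoundError(f"Channel directory not found: {channel_dir}")
    for path in sorted(channel_dir.rglob("*")):
        if path.is_file():
            yield path
```

A training script could then call iter_channel_files("train") and hand each path to its own parser. Passing a different root makes the same logic easy to exercise outside a container.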

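Tip

If you use a custom training container, make sure that you install the SageMaker training toolkit, which helps set up the environment for SageMaker training jobs. Otherwise, you must specify the environment variables explicitly in your Dockerfile. For more information, see Create a container with your own algorithms and models.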
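One way a training script can cope with either case is to read its paths from environment variables and fall back to fixed defaults. The sketch below assumes the toolkit's conventional variable names SM_CHANNEL_TRAIN and SM_MODEL_DIR and a channel named train; the fallback paths and the parse_args helper are assumptions made for this example.

```python
import argparse
import os


def parse_args(argv=None):
    """Parse data and model paths, preferring SM_* environment variables when set."""
    parser = argparse.ArgumentParser()
    # Assumed variable name for a channel called "train"; the default is the
    # conventional channel mount point.
    parser.add_argument(
        "--train",
        default=os.environ.get("SM_CHANNEL_TRAIN", "/opt/ml/input/data/train"),
    )
    # Assumed variable name for the directory where model artifacts are written.
    parser.add_argument(
        "--model-dir",
        default=os.environ.get("SM_MODEL_DIR", "/opt/ml/model"),
    )
    return parser.parse_args(argv)
```

Because the environment is read inside parse_args, an explicit command-line argument still overrides whatever the container sets.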

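For more information about how to set the data input modes using the low-level SageMaker APIs, see How Amazon SageMaker Provides Training Information, the CreateTrainingJob API, and the TrainingInputMode in AlgorithmSpecification.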
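For orientation, the sketch below builds only the input-mode portion of a CreateTrainingJob request as a plain dictionary, with no AWS call. It sets the job-wide TrainingInputMode in AlgorithmSpecification and an optional per-channel InputMode in InputDataConfig. The helper name build_input_mode_fragment and the shape of its channels argument are hypothetical; a real request needs further required fields (for example a training image, role, and resource configuration) that are omitted here.

```python
from typing import Dict

VALID_INPUT_MODES = ("File", "Pipe", "FastFile")


def build_input_mode_fragment(job_mode: str, channels: Dict[str, dict]) -> dict:
    """Return the input-mode fields of a CreateTrainingJob request (partial request only)."""
    if job_mode not in VALID_INPUT_MODES:
        raise ValueError(f"Unsupported input mode: {job_mode}")
    input_data_config = []
    for name, spec in channels.items():
        channel = {
            "ChannelName": name,
            "DataSource": {
                "S3DataSource": {
                    "S3DataType": "S3Prefix",
                    "S3Uri": spec["s3_uri"],
                }
            },
        }
        mode = spec.get("input_mode")
        if mode is not None:
            if mode not in VALID_INPUT_MODES:
                raise ValueError(f"Unsupported input mode for {name}: {mode}")
            # A channel-level InputMode applies to this channel instead of the job-wide mode.
            channel["InputMode"] = mode
        input_data_config.append(channel)
    return {
        "AlgorithmSpecification": {"TrainingInputMode": job_mode},
        "InputDataConfig": input_data_config,
    }


# The validation prefix is a made-up path for illustration.
fragment = build_input_mode_fragment(
    "File",
    {
        "train": {"s3_uri": "s3://amzn-s3-demo-bucket/my-data/train", "input_mode": "FastFile"},
        "validation": {"s3_uri": "s3://amzn-s3-demo-bucket/my-data/validation"},
    },
)
```

In this example the train channel uses FastFile, while the validation channel carries no InputMode and therefore falls back to the job-wide File mode.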
