Amazon SageMaker AI data parallelism library examples

This page provides blogs and Jupyter notebooks that show how to use the SageMaker AI distributed data parallelism (SMDDP) library to run distributed training jobs on SageMaker AI.
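As a rough orientation before opening the notebooks, the sketch below shows the two places where SMDDP usually appears in a PyTorch job: the `distribution` setting passed to the SageMaker Python SDK estimator, and the `smddp` process-group backend selected inside the training script. It is a minimal sketch, not the notebooks' own code. The helper names `select_backend` and `init_distributed` are hypothetical, and the `gloo` fallback is an assumption added only so the snippet can run outside a SageMaker training container.

```python
# Launcher side: value for the `distribution` argument of a SageMaker
# PyTorch estimator, which turns on the SMDDP library for the job.
SMDDP_DISTRIBUTION = {"smdistributed": {"dataparallel": {"enabled": True}}}


def select_backend() -> str:
    """Return "smddp" when the SMDDP PyTorch module imports, else a fallback.

    Importing torch_smddp registers the "smddp" backend with
    torch.distributed. The "gloo" fallback is an assumption of this
    sketch, intended for local CPU testing only.
    """
    try:
        import smdistributed.dataparallel.torch.torch_smddp  # noqa: F401
        return "smddp"
    except ImportError:
        return "gloo"


def init_distributed() -> None:
    """Hypothetical helper: initialize the process group in a training script.

    Assumes the launcher has already set the usual rendezvous environment
    variables (RANK, WORLD_SIZE, MASTER_ADDR, MASTER_PORT).
    """
    import torch.distributed as dist

    dist.init_process_group(backend=select_backend())


if __name__ == "__main__":
    print("selected backend:", select_backend())
```

In a training script, `init_distributed()` would be called once before wrapping the model in `torch.nn.parallel.DistributedDataParallel`; the example notebooks show complete, tested versions of this setup.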

Blogs and case studies

The following blogs discuss case studies about using the SMDDP library.

SMDDP v2 blogs

SMDDP v1 blogs

Example notebooks

Example notebooks are provided in the SageMaker AI examples GitHub repository. To download the examples, run the following commands to clone the repository and change to the training/distributed_training/pytorch/data_parallel directory.

Note

Clone and run the example notebooks in the following SageMaker AI ML IDEs.

git clone https://github.com/aws/amazon-sagemaker-examples.git
cd amazon-sagemaker-examples/training/distributed_training/pytorch/data_parallel

SMDDP v2 examples

SMDDP v1 examples