Amazon SageMaker data parallelism library examples

This page provides Jupyter notebooks that show how to use the SageMaker distributed data parallelism (SMDDP) library to run distributed training jobs on SageMaker.
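As a rough orientation before opening the notebooks, the sketch below shows one way a training job might be configured to use the SMDDP library through the SageMaker Python SDK. The helper name smddp_estimator_kwargs, the entry point train.py, and the instance type and count are illustrative assumptions, not values taken from the notebooks. The distribution setting is the part that turns on the library.

```python
# Minimal sketch, not the notebooks' actual code.
# smddp_estimator_kwargs is a hypothetical helper; "train.py" and the
# instance settings are placeholder assumptions.
def smddp_estimator_kwargs(instance_type="ml.p4d.24xlarge", instance_count=2):
    """Build keyword arguments for a SageMaker PyTorch estimator with SMDDP enabled."""
    return {
        "entry_point": "train.py",          # hypothetical training script
        "instance_type": instance_type,     # assumed multi-GPU instance type
        "instance_count": instance_count,   # assumed number of instances
        # This setting enables the SageMaker data parallelism library.
        "distribution": {"smdistributed": {"dataparallel": {"enabled": True}}},
    }


kwargs = smddp_estimator_kwargs()
print(kwargs["distribution"])

# With the SageMaker Python SDK installed, the arguments could be passed on
# like this (role, framework_version, and py_version must be supplied):
# from sagemaker.pytorch import PyTorch
# estimator = PyTorch(role=..., framework_version=..., py_version=..., **kwargs)
# estimator.fit(...)
```

The estimator call is left commented out so that the sketch runs without AWS credentials. The example notebooks contain complete, tested configurations.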

Blogs and Case Studies

The following blogs discuss case studies about using the SMDDP library.

SMDDP v2 blogs

SMDDP v1 blogs

Example notebooks

Example notebooks are provided in the SageMaker examples GitHub repository. To download the examples, run the following commands to clone the repository and change to the training/distributed_training/pytorch/data_parallel directory.

Note

Clone and run the example notebooks in the following SageMaker ML IDEs.

git clone https://github.com/aws/amazon-sagemaker-examples.git
cd amazon-sagemaker-examples/training/distributed_training/pytorch/data_parallel

SMDDP v2 examples

SMDDP v1 examples