
SageMaker model parallelism library v2

Note

Since the release of the SageMaker model parallelism (SMP) library v2.0.0 on December 19, 2023, this documentation has been updated for the SMP library v2. For previous versions of the SMP library, see (Archived) SageMaker model parallelism library v1.x.

The Amazon SageMaker AI model parallelism library is a capability of SageMaker AI that enables high-performance, optimized large-scale training on SageMaker AI accelerated compute instances. The core features of the SageMaker model parallelism library v2 include techniques and optimizations to accelerate and simplify large model training, such as hybrid sharded data parallelism, tensor parallelism, activation checkpointing, and activation offloading. You can use the SMP library to accelerate the training and fine-tuning of large language models (LLMs), large vision models (LVMs), and foundation models (FMs) with hundreds of billions of parameters.

The SageMaker model parallelism library v2 (SMP v2) aligns the library’s APIs and methods with open source PyTorch Fully Sharded Data Parallel (FSDP), which gives you the benefit of SMP performance optimizations with minimal code changes. With SMP v2, you can bring your PyTorch FSDP training scripts to SageMaker AI and improve the computational performance of training state-of-the-art large models.
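The following is a minimal sketch of what such a script can look like: a standard PyTorch FSDP training step launched with torchrun. The torch.sagemaker import and tsm.init() call follow the SMP v2 usage pattern, and the toy model, data, and hyperparameters are illustrative placeholders rather than a complete training script.

import torch
import torch.distributed as dist
import torch.nn as nn
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

def main():
    # Process group setup is normally handled by the launcher (for example, torchrun).
    dist.init_process_group(backend="nccl")
    torch.cuda.set_device(dist.get_rank() % torch.cuda.device_count())

    # Assumption: the SMP v2 training image provides the torch.sagemaker package.
    import torch.sagemaker as tsm
    tsm.init()

    # Toy model standing in for a large model; wrap it with the same FSDP API
    # used in open source PyTorch.
    model = nn.Sequential(nn.Linear(1024, 4096), nn.GELU(), nn.Linear(4096, 1024)).cuda()
    model = FSDP(model)

    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
    inputs = torch.randn(8, 1024, device="cuda")
    targets = torch.randn(8, 1024, device="cuda")

    loss = nn.functional.mse_loss(model(inputs), targets)
    loss.backward()
    optimizer.step()

if __name__ == "__main__":
    main()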

You can use SMP v2 for general SageMaker Training jobs and for distributed training workloads on Amazon SageMaker HyperPod clusters.
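The following is a minimal sketch of launching an SMP v2 training job with the SageMaker Python SDK PyTorch estimator. The entry point script, IAM role, instance type, framework version, and the distribution parameters (such as hybrid_shard_degree and tensor_parallel_degree) are illustrative assumptions; verify them against the SMP v2 configuration reference and the instance types and framework versions currently supported by the library.

from sagemaker.pytorch import PyTorch

estimator = PyTorch(
    entry_point="train.py",  # your FSDP training script (placeholder name)
    role="arn:aws:iam::111122223333:role/SageMakerExecutionRole",  # placeholder role
    instance_type="ml.p4d.24xlarge",
    instance_count=2,
    framework_version="2.2.0",  # example version; use one supported by SMP v2
    py_version="py310",
    distribution={
        "torch_distributed": {"enabled": True},
        "smdistributed": {
            "modelparallel": {
                "enabled": True,
                "parameters": {
                    "hybrid_shard_degree": 8,     # example sharding degree
                    "tensor_parallel_degree": 2,  # example tensor parallel degree
                },
            }
        },
    },
)
estimator.fit()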
