Distributed Training with Amazon SageMaker RL - Amazon SageMaker

Distributed Training with Amazon SageMaker RL

Amazon SageMaker RL supports multi-core and multi-instance distributed training. Depending on your use case, training and/or environment rollout can be distributed. For example, SageMaker RL works for the following distributed scenarios:

  • Single training instance and multiple rollout instances of the same instance type. For an example, see the Neural Network Compression example in the SageMaker examples repository.

  • Single trainer instance and multiple rollout instances, where different instance types for training and rollouts. For an example, see the AWS DeepRacer / AWS RoboMaker example in the SageMaker examples repository.

  • Single trainer instance that uses multiple cores for rollout. For an example, see the Roboschool example in the SageMaker examples repository. This is useful if the simulation environment is light-weight and can run on a single thread.

  • Multiple instances for training and rollouts. For an example, see the Roboschool example in the SageMaker examples repository.