SageMaker Studio Administration Best Practices

Publication date: April 25, 2023 (Document revisions)

Abstract

Amazon SageMaker AI Studio provides a single, web-based visual interface where you can perform all machine learning (ML) development steps, which improves data science team productivity. SageMaker AI Studio gives you complete access to, control over, and visibility into each step required to build, train, and evaluate models.

In this whitepaper, we discuss best practices for subjects including operating model, domain management, identity management, permissions management, network management, logging, monitoring, and customization. The best practices discussed here are intended for enterprise SageMaker AI Studio deployment, including multi-tenant deployments. This document is intended for ML platform administrators, ML engineers, and ML architects.

Are you Well-Architected?

The AWS Well-Architected Framework helps you understand the pros and cons of the decisions you make when building systems in the cloud. The six pillars of the Framework allow you to learn architectural best practices for designing and operating reliable, secure, efficient, cost-effective, and sustainable systems. Using the AWS Well-Architected Tool, available at no charge in the AWS Management Console, you can review your workloads against these best practices by answering a set of questions for each pillar.

In the Machine Learning Lens, we focus on how to design, deploy, and architect your machine learning workloads in the AWS Cloud. This lens adds to the best practices described in the Well-Architected Framework.

Introduction

When you administer SageMaker AI Studio as your ML platform, you need best-practice guidance to make informed decisions that help you scale the platform as your workloads grow. For provisioning, operationalizing, and scaling your ML platform, consider the following:

  • Choose the right operating model and organize your ML environments to meet your business objectives.

  • Choose how to set up SageMaker AI Studio domain authentication for user identities, and consider the domain-level limitations.

  • Decide how to federate your users’ identity and authorization to the ML platform for fine-grained access controls and auditing.

  • Consider setting up permissions and guardrails for various roles of your ML personas.

  • Plan your virtual private cloud (VPC) network topology, considering your ML workload’s sensitivity, number of users, instance types, apps, and jobs launched.

  • Classify and protect your data at rest and in transit with encryption.

  • Consider how to log and monitor application programming interface (API) calls and user activities for compliance.

  • Customize the SageMaker AI Studio notebook experience with your own images and lifecycle configuration scripts.
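
Several of the considerations above (authentication mode, execution role, VPC topology, and encryption) eventually surface as parameters of the SageMaker CreateDomain API. The following is a minimal sketch, not a recommended configuration, of how those decisions could be collected into one request. The helper name build_domain_request and every ARN, ID, and name passed to it are hypothetical placeholders, and the VpcOnly network setting is an assumption chosen for illustration.

```python
from __future__ import annotations

from typing import Any


def build_domain_request(
    domain_name: str,
    auth_mode: str,
    execution_role_arn: str,
    vpc_id: str,
    subnet_ids: list[str],
    security_group_ids: list[str],
    kms_key_id: str,
) -> dict[str, Any]:
    """Assemble keyword arguments for the SageMaker CreateDomain API.

    This only builds a dictionary; it does not call AWS.
    """
    # CreateDomain accepts two authentication modes.
    if auth_mode not in ("IAM", "SSO"):
        raise ValueError("auth_mode must be 'IAM' or 'SSO'")
    if not subnet_ids:
        raise ValueError("at least one subnet is required")

    return {
        "DomainName": domain_name,
        "AuthMode": auth_mode,
        "DefaultUserSettings": {
            # Default role assumed by user profiles in the domain.
            "ExecutionRole": execution_role_arn,
            "SecurityGroups": list(security_group_ids),
        },
        "VpcId": vpc_id,
        "SubnetIds": list(subnet_ids),
        # Assumption for this sketch: route Studio traffic through the VPC.
        "AppNetworkAccessType": "VpcOnly",
        # Customer managed key for storage attached to the domain.
        "KmsKeyId": kms_key_id,
    }


# All values below are hypothetical placeholders.
request = build_domain_request(
    domain_name="example-ml-domain",
    auth_mode="IAM",
    execution_role_arn="arn:aws:iam::111122223333:role/ExampleStudioRole",
    vpc_id="vpc-0example",
    subnet_ids=["subnet-0example-a", "subnet-0example-b"],
    security_group_ids=["sg-0example"],
    kms_key_id="example-key-id",
)
```

With real values in place, the dictionary could be passed to boto3 as boto3.client("sagemaker").create_domain(**request). Keeping request construction separate from the API call makes it easier to review and validate the settings before any resource is created.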