Configure Amazon SageMaker Canvas in a VPC without internet access
The Amazon SageMaker Canvas application runs in a container in an AWS managed Amazon Virtual Private Cloud (VPC). If you want to further control access to your resources or run SageMaker Canvas without public internet access, you can configure your Amazon SageMaker AI domain and VPC settings. Within your own VPC, you can configure settings such as security groups (virtual firewalls that control inbound and outbound traffic from Amazon EC2 instances) and subnets (ranges of IP addresses in your VPC). To learn more about VPCs, see How Amazon VPC works.
When the SageMaker Canvas application is running in the AWS managed VPC, it can interact with other AWS services either over an internet connection or through VPC endpoints created in a customer-managed VPC (without public internet access). SageMaker Canvas applications can access these VPC endpoints through a Studio Classic-created network interface that provides connectivity to the customer-managed VPC. The default behavior of the SageMaker Canvas application is to have internet access. When using an internet connection, the containers for SageMaker Canvas jobs access AWS resources over the internet, such as the Amazon S3 buckets where you store training data and model artifacts.
However, if you have security requirements to control access to your data and job containers, we recommend that you configure SageMaker Canvas and your VPC so that your data and containers aren’t accessible over the internet. SageMaker AI uses the VPC configuration settings you specify when setting up your domain for SageMaker Canvas.
If you want to configure your SageMaker Canvas application without internet access, you must configure your VPC settings when you onboard to Amazon SageMaker AI domain, set up VPC endpoints, and grant the necessary AWS Identity and Access Management permissions. For information about configuring a VPC in Amazon SageMaker AI, see Choose an Amazon VPC. The following sections describe how to run SageMaker Canvas in a VPC without public internet access.
Configure Amazon SageMaker Canvas in a VPC without internet access
You can send traffic from SageMaker Canvas to other AWS services through your own VPC. If your own VPC doesn't have public internet access and you've set up your domain in VPC only mode, then SageMaker Canvas won't have public internet access either. This applies to all requests, such as accessing datasets in Amazon S3 or running training jobs for standard builds: the requests go through VPC endpoints in your VPC instead of the public internet. When you onboard to the domain and choose an Amazon VPC, you can specify your own VPC as the default VPC for the domain, along with your desired security group and subnet settings. Then, SageMaker AI creates a network interface in your VPC that SageMaker Canvas uses to access VPC endpoints in your VPC.
Make sure that you set up one or more security groups in your VPC with inbound and outbound rules that allow TCP traffic within the security group. This is required for connectivity between the Jupyter Server application and the Kernel Gateway applications. You must allow access to at least ports in the range 8192-65535. Also, make sure to create a distinct security group for each user profile and add inbound access from that same security group. We do not recommend reusing a domain level security group for user profiles. If the domain level security group allows inbound access to itself, all applications in the domain have access to all other applications in the domain. Note that the security group and subnet settings are set after you finish onboarding to domain.
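The self-referencing rule described above can be sketched with the AWS SDK for Python. This is a minimal illustration, not the documented procedure: the helper names and the example security group ID are hypothetical, and the function that calls AWS assumes boto3 is installed and credentials are configured.

```python
from typing import Any, Dict


def self_referencing_tcp_rule(group_id: str,
                              from_port: int = 8192,
                              to_port: int = 65535) -> Dict[str, Any]:
    """Build one IpPermissions entry that allows TCP traffic from the
    security group to itself on the given port range."""
    if not 0 <= from_port <= to_port <= 65535:
        raise ValueError("invalid port range")
    return {
        "IpProtocol": "tcp",
        "FromPort": from_port,
        "ToPort": to_port,
        # Referencing the group's own ID keeps traffic within the group.
        "UserIdGroupPairs": [{"GroupId": group_id}],
    }


def allow_traffic_within_group(group_id: str) -> None:
    """Add the rule as both an inbound and an outbound rule.

    Hypothetical helper: needs boto3 and AWS credentials at call time.
    """
    import boto3  # imported lazily so the builder above works offline

    ec2 = boto3.client("ec2")
    rule = self_referencing_tcp_rule(group_id)
    ec2.authorize_security_group_ingress(GroupId=group_id, IpPermissions=[rule])
    ec2.authorize_security_group_egress(GroupId=group_id, IpPermissions=[rule])
```

Calling `allow_traffic_within_group("sg-0123456789abcdef0")` once per user-profile security group would match the recommendation to give each user profile its own group rather than reusing the domain level group.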
When onboarding to domain, if you choose Public internet only as the network access type, the VPC is SageMaker AI managed and allows internet access.
You can change this behavior by choosing VPC only so that SageMaker AI sends all traffic to a network interface that SageMaker AI creates in your specified VPC. When you choose this option, you must provide the subnets, security groups, and VPC endpoints that are necessary to communicate with the SageMaker API and SageMaker AI Runtime, and various AWS services, such as Amazon S3 and Amazon CloudWatch, that are used by SageMaker Canvas. Note that you can only import data from Amazon S3 buckets located in the same Region as your VPC.
The following procedures show how you can configure these settings to use SageMaker Canvas without the internet.
Step 1: Onboard to Amazon SageMaker AI domain
To send SageMaker Canvas traffic to a network interface in your own VPC instead of over the internet, specify the VPC you want to use when onboarding to Amazon SageMaker AI domain. You must also specify at least two subnets in your VPC that SageMaker AI can use. Choose Standard setup and do the following procedure when configuring the Network and Storage Section for the domain.
Select your desired VPC.
Choose two or more Subnets. If you don’t specify the subnets, SageMaker AI uses all of the subnets in the VPC.
Choose one or more Security group(s).
Choose VPC Only to turn off direct internet access in the AWS managed VPC where SageMaker Canvas is hosted.
After disabling internet access, finish the onboarding process to set up your domain. For more information about the VPC settings for Amazon SageMaker AI domain, see Choose an Amazon VPC.
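For teams that onboard programmatically instead of through the console, the same choices map onto the SageMaker `CreateDomain` API. The sketch below only illustrates how the settings fit together: the helper names, the example identifiers, and the choice of `IAM` as the authentication mode are assumptions, and the two-subnet check mirrors the requirement stated in this step.

```python
from typing import Any, Dict, List


def build_vpc_only_domain_request(domain_name: str,
                                  execution_role_arn: str,
                                  vpc_id: str,
                                  subnet_ids: List[str],
                                  security_group_ids: List[str]) -> Dict[str, Any]:
    """Assemble CreateDomain parameters for a domain in VPC only mode."""
    if len(subnet_ids) < 2:
        raise ValueError("specify at least two subnets in the VPC")
    if not security_group_ids:
        raise ValueError("specify at least one security group")
    return {
        "DomainName": domain_name,
        "AuthMode": "IAM",  # assumption: IAM authentication, not SSO
        "DefaultUserSettings": {
            "ExecutionRole": execution_role_arn,
            "SecurityGroups": list(security_group_ids),
        },
        "VpcId": vpc_id,
        "SubnetIds": list(subnet_ids),
        # "VpcOnly" turns off direct internet access for the domain.
        "AppNetworkAccessType": "VpcOnly",
    }


def create_vpc_only_domain(**kwargs: Any) -> str:
    """Hypothetical wrapper: needs boto3 and AWS credentials at call time."""
    import boto3  # lazy import keeps the builder usable offline

    sagemaker = boto3.client("sagemaker")
    response = sagemaker.create_domain(**build_vpc_only_domain_request(**kwargs))
    return response["DomainArn"]
```

Keeping the request builder separate from the API call makes it possible to review or unit test the network settings before anything is created.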
Step 2: Configure VPC endpoints and access
Note
In order to configure Canvas in your own VPC, you must enable private DNS hostnames for your VPC endpoints. For more information, see Connect to SageMaker AI Through a VPC Interface Endpoint.
SageMaker Canvas only accesses other AWS services to manage and store data for its functionality. For example, it connects to Amazon Redshift if your users access an Amazon Redshift database. It can connect to an AWS service such as Amazon Redshift using an internet connection or a VPC endpoint. Use VPC endpoints if you want to set up connections from your VPC to AWS services that don't use the public internet.
A VPC endpoint creates a private connection to an AWS service that uses a networking path that is isolated from the public internet. For example, if you set up access to Amazon S3 using a VPC endpoint from your own VPC, then the SageMaker Canvas application can access Amazon S3 by going through the network interface in your VPC and then through the VPC endpoint that connects to Amazon S3. The communication between SageMaker Canvas and Amazon S3 is private.
For more information about configuring VPC endpoints for your VPC, see AWS PrivateLink. If you are using Amazon Bedrock models in Canvas with a VPC, for more information about controlling access to your data, see Protect jobs using a VPC in the Amazon Bedrock User Guide.
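As a rough sketch of what creating these endpoints can look like with the AWS SDK for Python, the helper below builds parameters for the EC2 `CreateVpcEndpoint` API. The function names, the example IDs, and the `s3` and `sts` service suffixes used in the usage note are illustrative assumptions; take the exact service names from the AWS documentation for each service. Interface endpoints are created with private DNS enabled, in line with the note at the start of this step.

```python
from typing import Any, Dict, List, Optional


def build_endpoint_request(vpc_id: str,
                           region: str,
                           service_suffix: str,
                           endpoint_type: str,
                           subnet_ids: Optional[List[str]] = None,
                           security_group_ids: Optional[List[str]] = None,
                           route_table_ids: Optional[List[str]] = None) -> Dict[str, Any]:
    """Assemble CreateVpcEndpoint parameters for one service."""
    request: Dict[str, Any] = {
        "VpcId": vpc_id,
        "ServiceName": f"com.amazonaws.{region}.{service_suffix}",
        "VpcEndpointType": endpoint_type,
    }
    if endpoint_type == "Interface":
        if not subnet_ids:
            raise ValueError("interface endpoints need at least one subnet")
        request["SubnetIds"] = list(subnet_ids)
        request["SecurityGroupIds"] = list(security_group_ids or [])
        request["PrivateDnsEnabled"] = True  # required for Canvas in your own VPC
    elif endpoint_type == "Gateway":
        if not route_table_ids:
            raise ValueError("gateway endpoints need at least one route table")
        request["RouteTableIds"] = list(route_table_ids)
    else:
        raise ValueError(f"unsupported endpoint type: {endpoint_type}")
    return request


def create_endpoint(**kwargs: Any) -> str:
    """Hypothetical wrapper: needs boto3 and AWS credentials at call time."""
    import boto3  # lazy import keeps the builder usable offline

    ec2 = boto3.client("ec2")
    response = ec2.create_vpc_endpoint(**build_endpoint_request(**kwargs))
    return response["VpcEndpoint"]["VpcEndpointId"]
```

For example, `build_endpoint_request("vpc-0abc", "us-east-1", "s3", "Gateway", route_table_ids=["rtb-1"])` describes a gateway endpoint, while passing `"sts"` with `"Interface"`, subnets, and security groups describes an interface endpoint.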
The following are the VPC endpoints for each service you can use with SageMaker Canvas:
Service | Endpoint | Endpoint type |
---|---|---|
AWS Application Auto Scaling | com.amazonaws. | Interface |
Amazon Athena | com.amazonaws. | Interface |
Amazon SageMaker AI | com.amazonaws. com.amazonaws. com.amazonaws. | Interface |
Amazon SageMaker AI Data Science Assistant | com.amazonaws. | Interface |
AWS Security Token Service | com.amazonaws. | Interface |
Amazon Elastic Container Registry (Amazon ECR) | com.amazonaws. com.amazonaws. | Interface |
Amazon Elastic Compute Cloud (Amazon EC2) | com.amazonaws. | Interface |
Amazon Simple Storage Service (Amazon S3) | com.amazonaws. | Gateway |
Amazon Redshift | com.amazonaws. | Interface |
AWS Secrets Manager | com.amazonaws. | Interface |
AWS Systems Manager | com.amazonaws. | Interface |
Amazon CloudWatch | com.amazonaws. | Interface |
Amazon CloudWatch Logs | com.amazonaws. | Interface |
Amazon Forecast | com.amazonaws. com.amazonaws. | Interface |
Amazon Textract | com.amazonaws. | Interface |
Amazon Comprehend | com.amazonaws. | Interface |
Amazon Rekognition | com.amazonaws. | Interface |
AWS Glue | com.amazonaws. | Interface |
AWS Application Auto Scaling | com.amazonaws. | Interface |
Amazon Relational Database Service (Amazon RDS) | com.amazonaws. | Interface |
Amazon Bedrock (see note after table) | com.amazonaws. | Interface |
Amazon Kendra | com.amazonaws. | Interface |
Amazon EMR Serverless | com.amazonaws. | Interface |
Amazon Q Developer (see note after table) | com.amazonaws. | Interface |
Note
The Amazon Q Developer VPC endpoint is currently available only in the US East (N. Virginia) Region. To connect to it from other Regions, you can choose one of the following options based on your security and infrastructure preferences:
Set up a NAT Gateway. Configure a NAT Gateway in your VPC's private subnet to enable internet connectivity for the Q Developer endpoint. For more information, see Setting up a NAT Gateway in a VPC Private Subnet.
Enable cross-region VPC endpoint access. Set up cross-region VPC endpoint access for Q Developer. Use this option to connect securely without requiring internet access. For more information, see Configuring Cross-Region VPC Endpoint Access.
Note
For Amazon Bedrock, the interface endpoint service name com.amazonaws.Region.bedrock has been deprecated. Create a new VPC endpoint with the service name listed in the preceding table.
Additionally, you can't fine-tune foundation models from Canvas VPCs with no internet access. This is because Amazon Bedrock doesn't support VPC endpoints for model customization APIs. To learn more about fine-tuning foundation models in Canvas, see Fine-tune foundation models.
You must also add an endpoint policy for Amazon S3 to control AWS principal access to your VPC endpoint. For information about how to update your VPC endpoint policy, see Control access to VPC endpoints using endpoint policies.
The following are two VPC endpoint policies that you can use. Use the first policy if you only want to grant access to the basic functionality of Canvas, such as importing data and creating models. Use the second policy if you want to grant access to the additional generative AI features in Canvas.
Step 3: Grant IAM permissions
The SageMaker Canvas user must have the necessary AWS Identity and Access Management permissions to allow connection to the VPC endpoints. The IAM role to which you give permissions must be the same one you used when onboarding to Amazon SageMaker AI domain. You can attach the SageMaker AI managed AmazonSageMakerFullAccess policy to the IAM role for the user to give the user the required permissions. If you require more restrictive IAM permissions and use custom policies instead, then give the user’s role the ec2:DescribeVpcEndpointServices permission. SageMaker Canvas requires these permissions to verify the existence of the required VPC endpoints for standard build jobs. If it detects these VPC endpoints, then standard build jobs run by default in your VPC. Otherwise, they will run in the default AWS managed VPC.
For instructions on how to attach the AmazonSageMakerFullAccess IAM policy to your user’s IAM role, see Adding and removing IAM identity permissions.
To grant your user’s IAM role the granular ec2:DescribeVpcEndpointServices permission, use the following procedure.
Sign in to the AWS Management Console and open the IAM console.
In the navigation pane, choose Roles.
In the list, choose the name of the role to which you want to grant permissions.
Choose the Permissions tab.
Choose Add permissions and then choose Create inline policy.
Choose the JSON tab and enter the following policy, which grants the ec2:DescribeVpcEndpointServices permission:
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": "ec2:DescribeVpcEndpointServices",
            "Resource": "*"
        }
    ]
}
Choose Review policy, and then enter a Name for the policy (for example, VPCEndpointPermissions).
Choose Create policy.
The user’s IAM role should now have permissions to access the VPC endpoints configured in your VPC.
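The console procedure above can also be approximated in code. The following sketch attaches the same policy as an inline policy through the IAM `PutRolePolicy` API. The function names and the example role name are hypothetical, the policy name reuses the example from the procedure, and the call that reaches AWS assumes boto3 is installed and credentials are configured.

```python
import json
from typing import Any, Dict


def vpc_endpoint_permissions_policy() -> Dict[str, Any]:
    """Return the inline policy from the procedure as a Python dict."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "VisualEditor0",
                "Effect": "Allow",
                "Action": "ec2:DescribeVpcEndpointServices",
                "Resource": "*",
            }
        ],
    }


def attach_vpc_endpoint_permissions(role_name: str,
                                    policy_name: str = "VPCEndpointPermissions") -> None:
    """Hypothetical helper: needs boto3 and AWS credentials at call time."""
    import boto3  # lazy import keeps the policy builder usable offline

    iam = boto3.client("iam")
    iam.put_role_policy(
        RoleName=role_name,
        PolicyName=policy_name,
        # IAM expects the policy document as a JSON string.
        PolicyDocument=json.dumps(vpc_endpoint_permissions_policy()),
    )
```

Calling `attach_vpc_endpoint_permissions("CanvasExecutionRole")`, where the role name is a placeholder, would target the same role that was used when onboarding to the domain.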
(Optional) Step 4: Override security group settings for specific users
If you are an administrator, you might want different users to have different VPC settings, or user-specific VPC settings. When you override the default VPC’s security group settings for a specific user, these settings are passed on to the SageMaker Canvas application for that user.
You can override the security groups that a specific user has access to in your VPC when you set up a new user profile in Studio Classic. You can use the CreateUserProfile SageMaker API call (or create_user_profile), and then, in the UserSettings, you can specify the SecurityGroups for the user.
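A per-user override might look like the following sketch, which builds parameters for the SageMaker `CreateUserProfile` API with the security groups set inside `UserSettings`. The helper names and all example identifiers are hypothetical, and the wrapper that calls AWS assumes boto3 is installed and credentials are configured.

```python
from typing import Any, Dict, List


def build_user_profile_request(domain_id: str,
                               user_profile_name: str,
                               security_group_ids: List[str]) -> Dict[str, Any]:
    """Assemble CreateUserProfile parameters with user-specific security groups."""
    if not security_group_ids:
        raise ValueError("specify at least one security group for the user")
    return {
        "DomainId": domain_id,
        "UserProfileName": user_profile_name,
        # These settings apply to this user in place of the domain defaults.
        "UserSettings": {"SecurityGroups": list(security_group_ids)},
    }


def create_user_profile_with_groups(**kwargs: Any) -> str:
    """Hypothetical wrapper: needs boto3 and AWS credentials at call time."""
    import boto3  # lazy import keeps the builder usable offline

    sagemaker = boto3.client("sagemaker")
    response = sagemaker.create_user_profile(**build_user_profile_request(**kwargs))
    return response["UserProfileArn"]
```

Giving each profile its own group here fits the earlier recommendation to avoid reusing a domain level security group across user profiles.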