Available options

The following table displays all available options you can use to customize your notebook job, whether you run your notebook job in Studio, in a local Jupyter environment, or with the SageMaker Python SDK. The table includes the type of custom option, a description, additional guidelines about how to use the option, the field name for the option in Studio (if available), and the parameter name for the notebook job step in the SageMaker Python SDK (if available).

For some options, you can also preset custom default values so you don’t have to specify them every time you set up a notebook job. For Studio, these options are Role, Input folder, Output folder, and KMS Key ID, and are specified in the following table. If you preset custom defaults for these options, these fields are prepopulated in the Create Job form when you create your notebook job. For details about how to create custom defaults in Studio and local Jupyter environments, see Set up default options for local notebooks.

The SageMaker SDK also gives you the option to set intelligent defaults so that you don’t have to specify these parameters when you create a NotebookJobStep. These parameters are role, s3_root_uri, s3_kms_key, volume_kms_key, subnets, and security_group_ids, as indicated in the following table. For information about how to set intelligent defaults, see Set up default options.
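For example, the following is a minimal sketch of what a NotebookJobStep can look like when intelligent defaults are configured, so that role, s3_root_uri, and the encryption and networking parameters are omitted and resolved from your administrator-managed configuration. The notebook path, image URI, kernel name, and pipeline name are hypothetical placeholders, not values required by SageMaker.

# Minimal sketch: intelligent defaults supply role, s3_root_uri, s3_kms_key,
# volume_kms_key, subnets, and security_group_ids, so they are omitted here.
# The notebook path, image URI, kernel name, and pipeline name are hypothetical.
from sagemaker.workflow.notebook_job_step import NotebookJobStep
from sagemaker.workflow.pipeline import Pipeline

notebook_job_step = NotebookJobStep(
    input_notebook="my-analysis.ipynb",          # notebook to run noninteractively
    image_uri="<ecr-uri-of-a-supported-image>",  # see the Image option in the table
    kernel_name="python3",                       # a kernel present in the image
)

pipeline = Pipeline(name="MyNotebookJobPipeline", steps=[notebook_job_step])

You could then register and run the pipeline, for example with pipeline.create() or pipeline.upsert() followed by pipeline.start(), once the remaining parameters resolve from your configured defaults.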

Custom option Description Studio-specific guideline Local Jupyter environment guideline SageMaker Python SDK guideline
Job name Your job name as it should appear in the Notebook Jobs dashboard. Field Job name. Same as Studio. Parameter notebook_job_name. Defaults to None.
Image The container image used to run the notebook noninteractively on the chosen compute type. Field Image. This field defaults to your notebook’s current image. Change this field from the default to a custom value if needed. If Studio cannot infer this value, the form displays a validation error requiring you to specify it. This image can be a custom, bring-your-own image or an available Amazon SageMaker image. For a list of available SageMaker images supported by the notebook scheduler, see Amazon SageMaker images available for use with Studio Classic. Field Image. This field requires an ECR URI of a Docker image that can run the provided notebook on the selected compute type. By default, the scheduler extension uses a pre-built SageMaker AI Docker image, Base Python 2.0. This is the official Python 3.8 image from DockerHub with boto3, the AWS CLI, and the Python 3 kernel. You can also provide any ECR URI that meets the notebook custom image specification. For details, see Custom SageMaker image specifications. This image should have all the kernels and libraries needed for the notebook run. Required. Parameter image_uri. URI location of a Docker image on ECR. You can use specific SageMaker Distribution Images, a custom image based on those images, or your own image pre-installed with notebook job dependencies that meets additional requirements. For details, see Image constraints for SageMaker AI Python SDK notebook jobs.
Instance type The EC2 instance type to use to run the notebook job. The notebook job uses a SageMaker Training Job as a computing layer, so the specified instance type should be a SageMaker Training supported instance type. Field Compute type. Defaults to ml.m5.large. Same as Studio. Parameter instance_type. Defaults to ml.m5.large.
Kernel The Jupyter kernel used to run the notebook job. Field Kernel. This field defaults to your notebook’s current kernel. Change this field from the default to a custom value if needed. If Studio cannot infer this value, the form displays a validation error requiring you to specify it. Field Kernel. This kernel should be present in the image and follow the Jupyter kernel specs. This field defaults to the Python3 kernel found in the base Python 2.0 SageMaker image. Change this field to a custom value if needed. Required. Parameter kernel_name. This kernel should be present in the image and follow the Jupyter kernel specs. To see the kernel identifiers for your image, see (LINK).
SageMaker AI session The underlying SageMaker AI session to which SageMaker AI service calls are delegated. N/A N/A Parameter sagemaker_session. If unspecified, one is created using a default configuration chain.
Role ARN The role’s Amazon Resource Name (ARN) used with the notebook job. Field Role ARN. This field defaults to the Studio execution role. Change this field to a custom value if needed.
Note

If Studio cannot infer this value, the Role ARN field is blank. In this case, insert the ARN you want to use.

Field Role ARN. This field defaults to any role prefixed with SagemakerJupyterScheduler. If you have multiple roles with the prefix, the extension chooses one. Change this field to a custom value if needed. For this field, you can set your own user default that pre-populates whenever you create a new job definition. For details, see Set up default options for local notebooks. Parameter role. Defaults to the SageMaker AI default IAM role if the SDK is running in SageMaker Notebooks or SageMaker Studio Notebooks. Otherwise, it throws a ValueError. Allows intelligent defaults.
Input notebook The name of the notebook that you are scheduling to run. Required. Field Input file. Same as Studio. Required. Parameter input_notebook.
Input folder The folder containing your inputs. The job inputs, including the input notebook and any optional start-up or initialization scripts, are put in this folder. Field Input folder. If you don’t provide a folder, the scheduler creates a default Amazon S3 bucket for your inputs. Same as Studio. For this field, you can set your own user default that pre-populates whenever you create a new job definition. For details, see Set up default options for local notebooks. N/A. The input folder is placed inside the location specified by parameter s3_root_uri.
Output folder The folder containing your outputs. The job outputs, including the output notebook and logs, are put in this folder. Field Output folder. If you don’t specify a folder, the scheduler creates a default Amazon S3 bucket for your outputs. Same as Studio. For this field, you can set your own user default that pre-populates whenever you create a new job definition. For details, see Set up default options for local notebooks. N/A. The output folder is placed inside the location specified by parameter s3_root_uri.
Parameters A dictionary of variables and values to pass to your notebook job. Field Parameters. You need to parameterize your notebook to accept parameters. Same as Studio. Parameter parameters. You need to parameterize your notebook to accept parameters.
Additional (file or folder) dependencies A list of file or folder dependencies that the notebook job uploads to a staging folder in Amazon S3. Not supported. Not supported. Parameter additional_dependencies. The notebook job uploads these dependencies to an S3 staging folder so they can be consumed during execution.
S3 root URI The root Amazon S3 folder for your job. The job's input and output folders are created under this location. N/A. Use Input folder and Output folder. Same as Studio. Parameter s3_root_uri. Defaults to a default S3 bucket. Allows intelligent defaults.
Environment variables Any existing environment variables that you want to override, or new environment variables that you want to introduce and use in your notebook. Field Environment variables. Same as Studio. Parameter environment_variables. Defaults to None.
Tags A list of tags attached to the job. N/A N/A Parameter tags. Defaults to None. Your tags control how the Studio UI captures and displays the job created by the pipeline. For details, see View your notebook jobs in the Studio UI dashboard.
Start-up script A script preloaded in the notebook startup menu that you can choose to run before you run the notebook. Field Start-up script. Select a Lifecycle Configuration (LCC) script that runs on the image at start-up.
Note

A start-up script runs in a shell outside of the Studio environment. Therefore, this script cannot depend on the Studio local storage, environment variables, or app metadata (in /opt/ml/metadata). Also, if you use a start-up script and an initialization script, the start-up script runs first.

Not supported. Not supported.
Initialization script A path to a local script you can run when your notebook starts up. Field Initialization script. Enter the EFS file path where a local script or a Lifecycle Configuration (LCC) script is located. If you use a start-up script and an initialization script, the start-up script runs first.
Note

An initialization script is sourced from the same shell as the notebook job. This is not the case for a start-up script described previously. Also, if you use a start-up script and an initialization script, the start-up script runs first.

Field Initialization script. Enter the local file path where a local script or a Lifecycle Configuration (LCC) script is located. Parameter initialization_script. Defaults to None.
Max retry attempts The number of times Studio tries to rerun a failed job run. Field Max retry attempts. Defaults to 1. Same as Studio. Parameter max_retry_attempts. Defaults to 1.
Max run time (in seconds) The maximum length of time, in seconds, that a notebook job can run before it is stopped. If you configure both Max run time and Max retry attempts, the run time applies to each retry. If a job does not complete in this time, its status is set to Failed. Field Max run time (in seconds). Defaults to 172800 seconds (2 days). Same as Studio. Parameter max_runtime_in_seconds. Defaults to 172800 seconds (2 days).
Retry policies A list of retry policies, which govern actions to take in case of failure. Not supported. Not supported. Parameter retry_policies. Defaults to None.
Add Step or StepCollection dependencies A list of Step or StepCollection names or instances on which the job depends. Not supported. Not supported. Parameter depends_on. Defaults to None. Use this to define explicit dependencies between steps in your pipeline graph.
Volume size The size in GB of the storage volume for storing input and output data during training. Not supported. Not supported. Parameter volume_size. Defaults to 30 GB.
Encrypt traffic between containers A flag that specifies whether traffic between training containers is encrypted for the training job. N/A. Enabled by default. N/A. Enabled by default. Parameter encrypt_inter_container_traffic. Defaults to True.
Configure job encryption An indicator that you want to encrypt your notebook job outputs, job instance volume, or both. Field Configure job encryption. Check this box to choose encryption. If left unchecked, the job outputs are encrypted with the account's default KMS key and the job instance volume is not encrypted. Same as Studio. Not supported.
Output encryption KMS key A KMS key to use if you want to customize the encryption key used for your notebook job outputs. This field is only applicable if you checked Configure job encryption. Field Output encryption KMS key. If you do not specify this field, your notebook job outputs are encrypted with SSE-KMS using the default Amazon S3 KMS key. Also, if you create the Amazon S3 bucket yourself and use encryption, your encryption method is preserved. Same as Studio. For this field, you can set your own user default that pre-populates whenever you create a new job definition. For details, see Set up default options for local notebooks. Parameter s3_kms_key. Defaults to None. Allows intelligent defaults.
Job instance volume encryption KMS key A KMS key to use if you want to encrypt your job instance volume. This field is only applicable if you checked Configure job encryption. Field Job instance volume encryption KMS key. Field Job instance volume encryption KMS key. For this field, you can set your own user default that pre-populates whenever you create a new job definition. For details, see Set up default options for local notebooks. Parameter volume_kms_key. Defaults to None. Allows intelligent defaults.
Use a Virtual Private Cloud to run this job (for VPC users) An indicator that you want to run this job in a Virtual Private Cloud (VPC). For better security, we recommend that you use a private VPC. Field Use a Virtual Private Cloud to run this job. Check this box if you want to use a VPC. At minimum, create VPC endpoints so that your notebook job can privately connect to the AWS services it uses.
If you choose to use a VPC, you need to specify at least one private subnet and at least one security group in the following options. If you don’t use any private subnets, you need to consider other configuration options. For details, see Public VPC subnets not supported in Constraints and considerations.
Same as Studio. N/A
Subnet(s) (for VPC users) Your subnets. This field must contain at least one and at most five subnets, and all the subnets you provide should be private. For details, see Public VPC subnets not supported in Constraints and considerations. Field Subnet(s). This field defaults to the subnets associated with the Studio domain, but you can change this field if needed. Field Subnet(s). The scheduler cannot detect your subnets, so you need to enter any subnets you configured for your VPC. Parameter subnets. Defaults to None. Allows intelligent defaults.
Security group(s) (for VPC users) Your security groups. This field must contain at least one and at most 15 security groups. For details, see Public VPC subnets not supported in Constraints and considerations. Field Security groups. This field defaults to the security groups associated with the domain VPC, but you can change this field if needed. Field Security groups. The scheduler cannot detect your security groups, so you need to enter any security groups you configured for your VPC. Parameter security_group_ids. Defaults to None. Allows intelligent defaults.
Name The name of the notebook job step. N/A N/A Parameter name. If unspecified, it is derived from the notebook file name.
Display name Your job name as it should appear in your list of pipeline executions. N/A N/A Parameter display_name. Defaults to None.
Description A description of your job. N/A N/A Parameter description.
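To illustrate how the options above map onto SDK parameters, the following is a hedged sketch of a NotebookJobStep that sets many of them explicitly and then runs the step in a pipeline. All ARNs, URIs, bucket names, subnet and security group IDs, names, and tag values are hypothetical placeholders for your own account's values, and the tag shown is an illustrative convention rather than a value required by SageMaker.

# Fuller sketch exercising many options from the table above. Every ARN, URI,
# subnet ID, security group ID, bucket, and name is a hypothetical placeholder.
from sagemaker.workflow.notebook_job_step import NotebookJobStep
from sagemaker.workflow.pipeline import Pipeline

nb_step = NotebookJobStep(
    name="nightly-report-step",                      # Name (step name)
    notebook_job_name="nightly-report",              # Job name in the dashboard
    input_notebook="notebooks/report.ipynb",         # Input notebook
    additional_dependencies=["notebooks/helpers/"],  # uploaded to the S3 staging folder
    image_uri="<ecr-uri-of-a-supported-image>",      # Image
    kernel_name="python3",                           # Kernel present in the image
    role="arn:aws:iam::111122223333:role/MyNotebookJobRole",
    instance_type="ml.m5.xlarge",                    # SageMaker Training supported type
    volume_size=50,                                  # Volume size in GB
    s3_root_uri="s3://amzn-s3-demo-bucket/notebook-jobs",  # S3 root URI
    parameters={"report_date": "2024-01-01"},        # requires a parameterized notebook
    environment_variables={"LOG_LEVEL": "INFO"},     # Environment variables
    max_retry_attempts=2,                            # Max retry attempts
    max_runtime_in_seconds=7200,                     # Max run time
    subnets=["subnet-0123456789abcdef0"],            # private subnets only
    security_group_ids=["sg-0123456789abcdef0"],     # Security group(s)
    tags=[{"Key": "team", "Value": "analytics"}],    # Tags
)

pipeline = Pipeline(name="NightlyReportPipeline", steps=[nb_step])
pipeline.create(role_arn="arn:aws:iam::111122223333:role/MyPipelineExecutionRole")
execution = pipeline.start()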