Release notes for debugging capabilities of Amazon SageMaker

See the following release notes to track the latest updates for debugging capabilities of Amazon SageMaker.

December 21, 2023

New features

Released remote debugging, a new SageMaker debugging capability that gives you shell-level access to training containers. With this release, you can debug training jobs by logging in to the job containers running on SageMaker ML instances. To learn more, see Access a training container through AWS Systems Manager for remote debugging.

September 7, 2023

New features

Added a new utility class, sagemaker.interactive_apps.tensorboard.TensorBoardApp, that provides a method called get_app_url(). The get_app_url() method generates unsigned or presigned URLs to open the TensorBoard application from any environment in SageMaker or Amazon EC2, which gives Studio Classic and non-Studio Classic users a unified experience. In the Studio Classic environment, you can open TensorBoard by running get_app_url() as it is, or you can specify a job name to start tracking as the TensorBoard application opens. In non-Studio Classic environments, you can open TensorBoard by providing your domain information to get_app_url(). Regardless of where or how you run training code and launch training jobs, you can access TensorBoard directly by running get_app_url() in your Jupyter notebook or terminal. This functionality is available in the SageMaker Python SDK v2.184.0 and later. For more information, see How to access TensorBoard on SageMaker.
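As a minimal sketch of the two cases described above (Studio Classic versus other environments), the following Python snippet assembles the keyword arguments for get_app_url(). The helper names build_get_app_url_kwargs and tensorboard_url are hypothetical and not part of the SDK. The parameter names (training_job_name, create_presigned_domain_url, domain_id, user_profile_name, open_in_default_web_browser) are assumed to match the SageMaker Python SDK and should be checked against the version you have installed. The actual call also assumes that the SDK is installed and AWS credentials are configured.

```python
from typing import Optional


def build_get_app_url_kwargs(
    training_job_name: Optional[str] = None,
    domain_id: Optional[str] = None,
    user_profile_name: Optional[str] = None,
) -> dict:
    """Assemble keyword arguments for TensorBoardApp.get_app_url() (hypothetical helper)."""
    # Ask for the URL to be returned rather than opened in a browser (assumed parameter).
    kwargs = {"open_in_default_web_browser": False}
    if training_job_name:
        # Optional: start tracking this job as the TensorBoard application opens.
        kwargs["training_job_name"] = training_job_name
    if domain_id and user_profile_name:
        # Outside Studio Classic: supply domain information and request a presigned URL.
        kwargs["create_presigned_domain_url"] = True
        kwargs["domain_id"] = domain_id
        kwargs["user_profile_name"] = user_profile_name
    return kwargs


def tensorboard_url(region: str, **options) -> str:
    """Call get_app_url() with the assembled arguments (hypothetical helper)."""
    # Imported lazily so that build_get_app_url_kwargs works without the SDK installed.
    from sagemaker.interactive_apps import tensorboard

    app = tensorboard.TensorBoardApp(region)
    return app.get_app_url(**build_get_app_url_kwargs(**options))
```

In Studio Classic, a call such as tensorboard_url("us-west-2", training_job_name="my-job") would be enough under these assumptions. Elsewhere, you would also pass domain_id and user_profile_name. All of these values are placeholders.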

April 4, 2023

New features

Released SageMaker with TensorBoard, a capability that hosts TensorBoard on SageMaker. TensorBoard is available as an application through a SageMaker domain, and the SageMaker Training platform supports collecting TensorBoard output data in Amazon S3 and loading it automatically into the hosted TensorBoard on SageMaker. With this capability, you can run training jobs set up with TensorBoard summary writers in SageMaker, save the TensorBoard output files in Amazon S3, open the TensorBoard application directly from the SageMaker console, and load the output files using the SageMaker Data Manager plugin built into the hosted TensorBoard interface. You don't need to install TensorBoard manually or host it locally on SageMaker IDEs or your local machine. To learn more, see Use TensorBoard to debug and analyze training jobs in Amazon SageMaker.

March 16, 2023

Deprecation notes

SageMaker Debugger deprecates the framework profiling feature starting from TensorFlow 2.11 and PyTorch 2.0. You can still use the feature with the following earlier versions of the frameworks and the SDK.

  • SageMaker Python SDK <= v2.130.0

  • PyTorch >= v1.6.0, < v2.0

  • TensorFlow >= v2.3.1, < v2.11
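The version ranges above can be expressed as a small check. The following Python sketch is illustrative only: the function names parse_version and framework_profiling_supported are hypothetical, and the parser handles plain dotted numeric versions (optionally with a local suffix such as "+cu117"), not pre-release tags.

```python
def parse_version(text: str) -> tuple:
    """Turn a dotted version such as '2.11' into a tuple padded to three parts: (2, 11, 0)."""
    core = text.lstrip("v").split("+")[0]        # drop a leading 'v' and any local suffix
    parts = [int(part) for part in core.split(".")]
    parts += [0] * (3 - len(parts))              # pad '2.11' to (2, 11, 0)
    return tuple(parts)


def framework_profiling_supported(sdk: str, framework: str, framework_version: str) -> bool:
    """Return True if the versions fall within the ranges listed in the deprecation note."""
    # SageMaker Python SDK <= v2.130.0
    if parse_version(sdk) > (2, 130, 0):
        return False
    version = parse_version(framework_version)
    if framework == "pytorch":
        # PyTorch >= v1.6.0, < v2.0
        return (1, 6, 0) <= version < (2, 0, 0)
    if framework == "tensorflow":
        # TensorFlow >= v2.3.1, < v2.11
        return (2, 3, 1) <= version < (2, 11, 0)
    raise ValueError(f"unknown framework: {framework!r}")
```

For example, framework_profiling_supported("2.130.0", "pytorch", "1.13.1") returns True, while the same call with PyTorch "2.0.0" returns False.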

With the deprecation, SageMaker Debugger also discontinues support for the following three ProfilerRules for framework profiling.

February 21, 2023

Other changes

  • The XGBoost report tab has been removed from the SageMaker Debugger profiler dashboard. You can still access the XGBoost report by downloading it as a Jupyter notebook or an HTML file. For more information, see SageMaker Debugger XGBoost Training Report.

  • Starting from this release, the built-in profiler rules are not activated by default. To use the SageMaker Debugger profiler rules to detect certain computational problems, you need to add the rules when you configure a SageMaker training job launcher.
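Because the profiler rules are no longer activated by default, they have to be listed explicitly. The following is a minimal sketch of one way to do this with the SageMaker Python SDK, assuming the ProfilerRule class and rule_configs module in sagemaker.debugger. The function name build_profiler_rules and the choice of rules are illustrative assumptions, not a recommended set.

```python
def build_profiler_rules() -> list:
    """Return built-in profiler rules to pass to an estimator (hypothetical helper)."""
    # Imported lazily: this requires the SageMaker Python SDK to be installed.
    from sagemaker.debugger import ProfilerRule, rule_configs

    return [
        # Aggregated profiling report for the training job.
        ProfilerRule.sagemaker(rule_configs.ProfilerReport()),
        # Example of a single system-metrics rule; add others as needed.
        ProfilerRule.sagemaker(rule_configs.LowGPUUtilization()),
    ]
```

The returned list would then be passed through the rules argument when you construct a SageMaker estimator, for example rules=build_profiler_rules().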

December 1, 2020

Amazon SageMaker Debugger launched deep profiling features at re:Invent 2020.

December 3, 2019

Amazon SageMaker Debugger initially launched at re:Invent 2019.