
Prepare and run a training job with SageMaker Profiler

Setting up a training job to run with SageMaker Profiler takes two steps: adapting the training script and configuring the SageMaker training job launcher.

Topics

Step 1: Adapt your training script using the SageMaker Profiler Python modules
Step 2: Create a SageMaker AI framework estimator and activate SageMaker Profiler
(Optional) Install the SageMaker Profiler Python package

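Step 1: Adapt your training script using the SageMaker Profiler Python modules

To start capturing kernel runs on GPUs while the training job is running, modify your training script using the SageMaker Profiler Python modules. Import the library and add the start_profiling() and stop_profiling() methods to define where profiling begins and ends. You can also use optional custom annotations to add markers to the training script, so that you can visualize hardware activity during particular operations in each step.

Note that the annotators extract operations from GPUs. To profile operations on CPUs, you don't need to add any annotations. CPU profiling is also activated when you specify the profiling configuration, which you do in Step 2: Create a SageMaker AI framework estimator and activate SageMaker Profiler.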

Note

Profiling an entire training job is not the most efficient use of resources. We recommend profiling at most 300 steps of a training job.
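One way to follow this recommendation is to start and stop the profiler from inside the training loop, so that only a bounded window of steps is captured. The sketch below is a minimal illustration and not part of the SageMaker Profiler API: `ProfilingWindow` and its parameters are hypothetical names, and the start and stop callables are assumed to be the start_profiling() and stop_profiling() methods in a real script.

```python
class ProfilingWindow:
    """Calls start_fn before the first profiled step and stop_fn after the last.

    Hypothetical helper: start_fn and stop_fn are any zero-argument callables,
    for example the profiler's start_profiling and stop_profiling methods.
    """

    def __init__(self, start_fn, stop_fn, first_step=0, num_steps=300):
        if num_steps <= 0:
            raise ValueError("num_steps must be positive")
        self.start_fn = start_fn
        self.stop_fn = stop_fn
        self.first_step = first_step
        self.last_step = first_step + num_steps - 1
        self.active = False

    def step_begin(self, global_step):
        # Start exactly once, when the window opens.
        if global_step == self.first_step and not self.active:
            self.start_fn()
            self.active = True

    def step_end(self, global_step):
        # Stop exactly once, after the last step in the window.
        if self.active and global_step >= self.last_step:
            self.stop_fn()
            self.active = False
```

In a training loop, call `step_begin(global_step)` at the top of each iteration and `step_end(global_step)` at the bottom, where `global_step` counts iterations across epochs.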

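Important

The release on December 14, 2023 includes a breaking change. The SageMaker Profiler Python package name changed from smppy to smprof. This applies to the SageMaker AI framework containers for TensorFlow v2.12 and later.

If you use one of the earlier versions of the SageMaker AI framework containers, such as TensorFlow v2.11.0, the SageMaker Profiler Python package is still available as smppy. If you are unsure which version or package name to use, replace the import statement for the SageMaker Profiler package with the following code snippet.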


try:
    import smprof 
except ImportError:
    # Backward compatibility for TF 2.11 and PT 1.13.1 images
    import smppy as smprof
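The same fallback idea can be extended so that a training script still runs on a machine where neither package is installed, for example during local debugging. The following is a sketch under stated assumptions: `load_profiler` is a hypothetical helper, and the no-op stand-in it returns is not part of SageMaker Profiler. The stand-in only mimics the annotation calls used on this page (annotate, annotation_begin, annotation_end), so guard the profiler setup and start/stop calls with the returned flag.

```python
import contextlib
import importlib
import types


def load_profiler(candidates=("smprof", "smppy")):
    """Return (module, is_real) for the first importable candidate.

    If none can be imported, return a hypothetical no-op stand-in that only
    covers annotate, annotation_begin, and annotation_end.
    """
    for name in candidates:
        try:
            return importlib.import_module(name), True
        except ImportError:
            continue
    stub = types.SimpleNamespace(
        annotate=lambda label: contextlib.nullcontext(),
        annotation_begin=lambda label: label,
        annotation_end=lambda handle: None,
    )
    return stub, False
```

With `smprof, profiler_available = load_profiler()`, the annotation code in the approaches below stays unchanged, and the profiler is configured and started only when `profiler_available` is true.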

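Approach 1. Use the context manager smprof.annotate to annotate full functions

You can wrap full functions with the smprof.annotate() context manager. This wrapper is recommended if you want to profile by function rather than by code line. The following example script shows how to use the context manager to wrap the training loop and full functions in each iteration.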


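import time

import smprof

# args, world_size, sampler, trainloader, net, criterion, and optimizer
# are assumed to be defined earlier in the training script.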

SMProf = smprof.SMProfiler.instance()
config = smprof.Config()
config.profiler = {
    "EnableCuda": "1",
}
SMProf.configure(config)
SMProf.start_profiling()

for epoch in range(args.epochs):
    if world_size > 1:
        sampler.set_epoch(epoch)
    tstart = time.perf_counter()
    for i, data in enumerate(trainloader, 0):
        with smprof.annotate("step_"+str(i)):
            inputs, labels = data
            inputs = inputs.to("cuda", non_blocking=True)
            labels = labels.to("cuda", non_blocking=True)
    
            optimizer.zero_grad()
    
            with smprof.annotate("Forward"):
                outputs = net(inputs)
            with smprof.annotate("Loss"):
                loss = criterion(outputs, labels)
            with smprof.annotate("Backward"):
                loss.backward()
            with smprof.annotate("Optimizer"):
                optimizer.step()

SMProf.stop_profiling()
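When the unit you want to profile is already a function, the context manager can be applied once at the definition instead of at every call site. The decorator below is a minimal sketch: `annotated` is a hypothetical name, not something smprof provides, and the `annotate` argument is assumed to be smprof.annotate (any callable that returns a context manager works).

```python
import functools


def annotated(label, annotate):
    """Decorator that runs the wrapped function inside annotate(label).

    `annotate` is any callable returning a context manager; in a training
    script this is assumed to be smprof.annotate.
    """
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            with annotate(label):
                return func(*args, **kwargs)
        return wrapper
    return decorator
```

For example, decorating a forward-pass helper with `@annotated("Forward", smprof.annotate)` would mark every call to that helper with the "Forward" annotation.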

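Approach 2. Use smprof.annotation_begin() and smprof.annotation_end() to annotate specific code lines in functions

You can also define annotations that profile specific code lines. This lets you set the exact start and end points of profiling at the level of individual code lines rather than whole functions. For example, in the following script, step_annotator begins at the start of each iteration and ends when the iteration finishes. Within each iteration, finer-grained annotators are defined for individual operations and wrap only those operations.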


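import time

import smprof

# args, world_size, sampler, trainloader, net, criterion, and optimizer
# are assumed to be defined earlier in the training script.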

SMProf = smprof.SMProfiler.instance()
config = smprof.Config()
config.profiler = {
    "EnableCuda": "1",
}
SMProf.configure(config)
SMProf.start_profiling()

for epoch in range(args.epochs):
    if world_size > 1:
        sampler.set_epoch(epoch)
    tstart = time.perf_counter()
    for i, data in enumerate(trainloader, 0):
        step_annotator = smprof.annotation_begin("step_" + str(i))

        inputs, labels = data
        inputs = inputs.to("cuda", non_blocking=True)
        labels = labels.to("cuda", non_blocking=True)
        optimizer.zero_grad()

        forward_annotator = smprof.annotation_begin("Forward")
        outputs = net(inputs)
        smprof.annotation_end(forward_annotator)

        loss_annotator = smprof.annotation_begin("Loss")
        loss = criterion(outputs, labels)
        smprof.annotation_end(loss_annotator)

        backward_annotator = smprof.annotation_begin("Backward")
        loss.backward()
        smprof.annotation_end(backward_annotator)

        optimizer_annotator = smprof.annotation_begin("Optimizer")
        optimizer.step()
        smprof.annotation_end(optimizer_annotator)

        smprof.annotation_end(step_annotator)

SMProf.stop_profiling()
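With explicit begin and end calls, an exception raised between them skips the matching annotation_end() call. A defensive pattern is to pair the two calls with try/finally. The sketch below assumes `begin` and `end` are smprof.annotation_begin and smprof.annotation_end; `annotation_scope` is a hypothetical helper, and this page does not state how the profiler behaves when an annotation is left open.

```python
import contextlib


@contextlib.contextmanager
def annotation_scope(label, begin, end):
    """Pair begin(label) with end(handle), even if the body raises.

    `begin` and `end` are assumed to be smprof.annotation_begin and
    smprof.annotation_end; any callables with the same shape work.
    """
    handle = begin(label)
    try:
        yield handle
    finally:
        # Always close the annotation that was opened above.
        end(handle)
```

A block such as `with annotation_scope("Loss", smprof.annotation_begin, smprof.annotation_end):` then covers exactly the lines inside it, and the end call runs on both normal exit and error.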

After adding the annotations and setting up the profiler start and stop calls, save the script so that you can submit it with the SageMaker training job launcher in Step 2. The sample launcher assumes that the training script is named train_with_profiler_demo.py.

Step 2: Create a SageMaker AI framework estimator and activate SageMaker Profiler

The following procedure shows how to prepare a SageMaker AI framework estimator for training using the SageMaker Python SDK.

Set up a profiler_config object using the ProfilerConfig and Profiler modules as follows.
```
from sagemaker import ProfilerConfig, Profiler
profiler_config = ProfilerConfig(
    profile_params = Profiler(cpu_profiling_duration=3600)
)
```
The following describes the Profiler module and its argument.

- Profiler: The module for activating SageMaker Profiler with the training job.
  - cpu_profiling_duration (int): The duration, in seconds, of profiling on CPUs. The default is 3600 seconds.

Create a SageMaker AI framework estimator with the profiler_config object created in the previous step. The following code shows an example of creating a PyTorch estimator. To create a TensorFlow estimator, import sagemaker.tensorflow.TensorFlow instead and specify one of the TensorFlow versions supported by SageMaker Profiler. For more information about supported frameworks and instance types, see "SageMaker AI framework images pre-installed with SageMaker Profiler".
```
import sagemaker
from sagemaker.pytorch import PyTorch

estimator = PyTorch(
    framework_version="2.0.0",
    role=sagemaker.get_execution_role(),
    entry_point="train_with_profiler_demo.py", # your training job entry point
    source_dir=source_dir, # source directory for your training script
    output_path=output_path,
    base_job_name="sagemaker-profiler-demo",
    hyperparameters=hyperparameters, # if any
    instance_count=1, # Recommended to test with < 8
    instance_type="ml.p4d.24xlarge",
    profiler_config=profiler_config
)
```
Start the training job by running the fit method. With wait=False, you can silence the training job logs and let the job run in the background.
```
estimator.fit(wait=False)
```

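While the training job is running or after it has completed, you can move on to the next topic, Open the SageMaker Profiler UI application, and start exploring and visualizing the saved profiles.

If you want to access the profile data saved in the Amazon S3 bucket directly, use the following script to retrieve the S3 URI.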


import posixpath

# Ad-hoc helper that returns the S3 URI
# where the profile output data is saved.
def get_detailed_profiler_output_uri(estimator):
    config_name = None
    for processing in estimator.profiler_rule_configs:
        params = processing.get("RuleParameters", dict())
        rule = params.get("rule_to_invoke", "")
        if rule == "DetailedProfilerProcessing":
            config_name = processing.get("RuleConfigurationName")
            break
    if config_name is None:
        raise ValueError("No DetailedProfilerProcessing rule found on this estimator")
    # S3 URIs always use forward slashes, so join with posixpath, not os.path.
    return posixpath.join(
        estimator.output_path,
        estimator.latest_training_job.name,
        "rule-output",
        config_name,
    )

print(
    "Profiler output S3 URI:",
    get_detailed_profiler_output_uri(estimator),
)
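Most S3 client calls take a bucket name and a key prefix rather than a full URI, so the returned string usually needs to be split before use. The following is a small sketch using only the Python standard library; `split_s3_uri` is a hypothetical helper name, not part of the SageMaker Python SDK.

```python
from urllib.parse import urlparse


def split_s3_uri(uri):
    """Split 's3://bucket/key/prefix' into (bucket, key_prefix)."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"Not an S3 URI: {uri!r}")
    # The path starts with '/', which is not part of the S3 key.
    return parsed.netloc, parsed.path.lstrip("/")
```

The resulting bucket and prefix can then be passed, for example, as the Bucket and Prefix arguments of the boto3 S3 client's list_objects_v2 call to list the saved profile files.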

(Optional) Install the SageMaker Profiler Python package

To use SageMaker Profiler on PyTorch or TensorFlow framework images that are not listed in "SageMaker AI framework images pre-installed with SageMaker Profiler", or on your own custom Docker container for training, you can install SageMaker Profiler using one of the SageMaker Profiler Python package binary files.

Option 1: Install the SageMaker Profiler package while launching a training job

If you want to use SageMaker Profiler for training jobs that use PyTorch or TensorFlow images not listed in "SageMaker AI framework images pre-installed with SageMaker Profiler", create a requirements.txt file and place it under the path you specified for the source_dir parameter of the SageMaker AI framework estimator in Step 2. For more information about setting up a requirements.txt file in general, see "Using third-party libraries" in the SageMaker Python SDK documentation. In the requirements.txt file, add one of the S3 bucket paths for the SageMaker Profiler Python package binary files.


# requirements.txt
https://smppy.s3.amazonaws.com/tensorflow/cu112/smprof-0.3.332-cp39-cp39-linux_x86_64.whl

Option 2: Install the SageMaker Profiler package in your custom Docker container

If you use a custom Docker container for training, add one of the SageMaker Profiler Python package binary files to your Dockerfile.


# Install the smprof package version compatible with your CUDA version
RUN pip install https://smppy.s3.amazonaws.com/tensorflow/cu112/smprof-0.3.332-cp39-cp39-linux_x86_64.whl
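The wheel file name encodes the Python version it was built for (cp39 in the example above), while the CUDA version (cu112) appears only in the URL path. Before adding a wheel to a container, it can help to check that its Python tag matches the interpreter in the image. The following sketch parses a wheel file name according to the standard wheel naming convention; both function names are hypothetical, and the check covers the Python tag only, not CUDA.

```python
import sys


def parse_wheel_filename(url_or_name):
    """Return the tags encoded in a wheel file name."""
    filename = url_or_name.rsplit("/", 1)[-1]
    if not filename.endswith(".whl"):
        raise ValueError(f"Not a wheel file: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) not in (5, 6):
        raise ValueError(f"Unexpected wheel file name: {filename!r}")
    # An optional build tag may sit between the version and the Python tag,
    # so the last three fields are read from the end.
    name, version = parts[0], parts[1]
    python_tag, abi_tag, platform_tag = parts[-3:]
    return {
        "name": name,
        "version": version,
        "python": python_tag,
        "abi": abi_tag,
        "platform": platform_tag,
    }


def matches_running_python(python_tag):
    """True if a CPython tag such as 'cp39' matches the current interpreter."""
    expected = f"cp{sys.version_info.major}{sys.version_info.minor}"
    return python_tag == expected
```

Running `matches_running_python(parse_wheel_filename(url)["python"])` inside the target image indicates whether pip is likely to accept the wheel for that interpreter.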

For general guidance on running a custom Docker container for training on SageMaker AI, see "Adapting your own training container".
