Create a model quality baseline

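Create a baseline job that compares your model's predictions with the ground truth labels in a baseline dataset that you have stored in Amazon S3. Typically, you use a training dataset as the baseline dataset. The baseline job calculates metrics for the model and suggests constraints to use when monitoring for model quality drift.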

To create a baseline job, you need a dataset that contains predictions from your model along with labels that represent the ground truth for your data.
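
As an illustration of that layout, the following minimal sketch builds a small baseline CSV with one column each for predictions, probabilities, and ground truth labels. The column names match the ones used in the suggest_baseline call on this page. The records, the 0.5 cutoff, and the helper name build_baseline_csv are hypothetical, and uploading the result to Amazon S3 is left out.

```python
import csv
import io

# Hypothetical model outputs: (probability of the positive class, true label).
scored_records = [(0.92, 1), (0.15, 0), (0.67, 1), (0.40, 1), (0.08, 0)]
THRESHOLD = 0.5  # assumed cutoff for turning a probability into a class


def build_baseline_csv(records, threshold=THRESHOLD):
    """Return CSV text with prediction, probability, and label columns."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["prediction", "probability", "label"])  # header row
    for probability, label in records:
        prediction = 1 if probability >= threshold else 0
        writer.writerow([prediction, probability, label])
    return buffer.getvalue()


baseline_csv = build_baseline_csv(scored_records)
print(baseline_csv)
```

Each row pairs what the model predicted with what actually happened, which is the information a baseline job needs to compute quality metrics.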

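To create a baseline job, use the ModelQualityMonitor class provided by the SageMaker Python SDK, and complete the following steps.

To create a model quality baseline job

First, create an instance of the ModelQualityMonitor class. The following code snippet shows how to do this.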


```
from sagemaker import get_execution_role, Session
from sagemaker.model_monitor import ModelQualityMonitor

# IAM role and session that the monitor uses to run processing jobs.
role = get_execution_role()
session = Session()

model_quality_monitor = ModelQualityMonitor(
    role=role,
    instance_count=1,
    instance_type='ml.m5.xlarge',
    volume_size_in_gb=20,
    max_runtime_in_seconds=1800,
    sagemaker_session=session
)
```

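Next, call the suggest_baseline method of the ModelQualityMonitor object to run a baseline job. The following code snippet assumes that you have a baseline dataset, stored in Amazon S3, that contains both predictions and labels.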


```
from sagemaker.model_monitor.dataset_format import DatasetFormat

# baseline_dataset_uri and baseline_results_uri must already hold S3 URIs.
baseline_job_name = "MyBaseLineJob"
job = model_quality_monitor.suggest_baseline(
    job_name=baseline_job_name,
    baseline_dataset=baseline_dataset_uri,  # The S3 location of the baseline dataset.
    dataset_format=DatasetFormat.csv(header=True),
    output_s3_uri=baseline_results_uri,  # The S3 location to store the results.
    problem_type='BinaryClassification',
    inference_attribute="prediction",  # The column in the dataset that contains predictions.
    probability_attribute="probability",  # The column in the dataset that contains probabilities.
    ground_truth_attribute="label"  # The column in the dataset that contains ground truth labels.
)
job.wait(logs=False)
```
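
To give a sense of the kind of metrics a baseline job derives from the prediction and label columns of a binary classification dataset, here is a minimal local sketch. It is an illustration only, not the computation that SageMaker performs, and the actual job may report additional metrics. The function name and the sample data are hypothetical.

```python
def binary_classification_metrics(predictions, labels):
    """Compute confusion-matrix counts and a few metrics derived from them."""
    pairs = list(zip(predictions, labels))
    tp = sum(1 for p, y in pairs if p == 1 and y == 1)  # true positives
    tn = sum(1 for p, y in pairs if p == 0 and y == 0)  # true negatives
    fp = sum(1 for p, y in pairs if p == 1 and y == 0)  # false positives
    fn = sum(1 for p, y in pairs if p == 0 and y == 1)  # false negatives
    return {
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
        "accuracy": (tp + tn) / len(pairs) if pairs else 0.0,
    }


# Hypothetical values for the "prediction" and "label" columns.
predictions = [1, 0, 1, 0, 0]
labels = [1, 0, 1, 1, 0]

metrics = binary_classification_metrics(predictions, labels)
print(metrics)
```

In this sample, one positive record is missed, so recall is lower than precision. Metrics like these are what the suggested constraints are later expressed against.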

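After the baseline job finishes, you can see the constraints that the job generated. First, get the results of the baseline job from the latest_baselining_job attribute of the ModelQualityMonitor object.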
```
baseline_job = model_quality_monitor.latest_baselining_job
```
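The baseline job suggests constraints, which are thresholds for the metrics that Model Monitor measures. If a metric goes beyond the suggested threshold, Model Monitor reports a violation. To view the constraints that the baseline job generated, call the suggested_constraints method of the baseline job. The following code snippet loads the constraints for a binary classification model into a Pandas dataframe.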
```
import pandas as pd
pd.DataFrame(baseline_job.suggested_constraints().body_dict["binary_classification_constraints"]).T
```
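We recommend that you review the generated constraints and modify them as necessary before using them for monitoring. For example, if a constraint is too strict, you might receive more violation alerts than you want.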
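
One way to reason about such a modification is to treat each constraint as a threshold paired with a comparison operator. The sketch below assumes that each entry has threshold and comparison_operator keys with the operator values LessThanThreshold and GreaterThanThreshold; confirm this against your own constraints.json file before relying on it. The sample values and the helper names is_violation and relax are hypothetical.

```python
import copy

# Hypothetical excerpt shaped like the "binary_classification_constraints" section.
suggested = {
    "recall": {"threshold": 0.80, "comparison_operator": "LessThanThreshold"},
    "false_positive_rate": {"threshold": 0.10, "comparison_operator": "GreaterThanThreshold"},
}


def is_violation(value, constraint):
    """Return True when the metric value falls on the wrong side of the threshold."""
    operator = constraint["comparison_operator"]
    if operator == "LessThanThreshold":
        return value < constraint["threshold"]
    if operator == "GreaterThanThreshold":
        return value > constraint["threshold"]
    raise ValueError(f"Unsupported comparison operator: {operator}")


def relax(constraints, metric, new_threshold):
    """Return a copy of the constraints with one threshold replaced."""
    updated = copy.deepcopy(constraints)
    updated[metric]["threshold"] = new_threshold
    return updated


observed_recall = 0.78
# Under the suggested threshold of 0.80, this recall counts as a violation.
print(is_violation(observed_recall, suggested["recall"]))
# After lowering the threshold to 0.75, the same recall no longer does.
relaxed = relax(suggested, "recall", 0.75)
print(is_violation(observed_recall, relaxed["recall"]))
```

Working on a copy keeps the suggested values available for comparison while you decide how strict each threshold should be.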

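If your constraints contain numbers expressed in scientific notation, you need to convert them to floating-point numbers. The following example Python preprocessing script shows how to convert numbers in scientific notation to floats.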
```
import csv

def fix_scientific_notation(col):
    # Rewrite numeric strings in fixed-point form (six decimal places);
    # return non-numeric values unchanged.
    try:
        return format(float(col), "f")
    except (ValueError, TypeError):
        return col

def preprocess_handler(csv_line):
    reader = csv.reader([csv_line])
    csv_record = next(reader)
    # Skip the baseline header; change HEADER_NAME to the first column's name.
    if csv_record[0] == "HEADER_NAME":
        return []
    # Key each field by its zero-padded column index.
    return {str(i).zfill(20): fix_scientific_notation(d) for i, d in enumerate(csv_record)}
```
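
To see what the conversion does to individual fields, the following standalone sketch repeats the helper from the script above and applies it to a few hypothetical sample values. Note that the "f" format keeps six decimal places, so very small magnitudes are rounded to zero.

```python
def fix_scientific_notation(col):
    """Render a numeric string in fixed-point form; leave other values alone."""
    try:
        return format(float(col), "f")
    except (ValueError, TypeError):
        return col


# Hypothetical field values as they might appear in one CSV record.
samples = ["1.5e-3", "2E+2", "0.25", "label"]
converted = [fix_scientific_notation(s) for s in samples]
print(converted)  # ['0.001500', '200.000000', '0.250000', 'label']

# Values smaller than the six retained decimal places collapse to zero.
tiny = fix_scientific_notation("1e-7")
print(tiny)  # 0.000000
```

If your data includes values that small, consider a format with more decimal places, for example ".12f", before using the script.
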
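You can add your preprocessing script to a baseline or monitoring schedule as a record_preprocessor_script, as defined in the Model Monitor documentation.

When you are satisfied with the constraints, pass them as the constraints parameter when you create a monitoring schedule. For more information, see Schedule model quality monitoring jobs.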

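The suggested baseline constraints are contained in the constraints.json file in the location that you specify with output_s3_uri. For information about the schema of this file, see Schema for constraints (constraints.json file).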
