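Outputs for asynchronous analysis jobs
After an analysis job completes, it stores the results in the S3 bucket that you specified in the request.
Outputs for text inputs
For either format of text input documents (multi-class or multi-label), the job output consists of a single file named output.tar.gz.
It is a compressed archive that contains a text file with the output.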
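As a minimal sketch of reading the archive once it has been downloaded from the bucket, the following uses only the Python standard library. It assumes the text file inside the archive holds one JSON object per line, as in the examples in this section. The helper names and the member name `predictions.jsonl` are hypothetical and exist only to make the example self-contained.

```python
import io
import json
import tarfile


def read_output_records(archive_bytes):
    """Return every JSON record found in a gzip-compressed tar archive."""
    records = []
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r:gz") as archive:
        for member in archive.getmembers():
            if not member.isfile():
                continue  # skip directories and other non-file entries
            text = archive.extractfile(member).read().decode("utf-8")
            for raw_line in text.splitlines():
                if raw_line.strip():
                    records.append(json.loads(raw_line))
    return records


def build_sample_archive():
    """Build an in-memory stand-in for output.tar.gz (hypothetical contents)."""
    payload = (
        b'{"File": "file0.txt", "Classes": '
        b'[{"Name": "Documentary", "Score": 0.8642}]}\n'
    )
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as archive:
        info = tarfile.TarInfo(name="predictions.jsonl")  # assumed member name
        info.size = len(payload)
        archive.addfile(info, io.BytesIO(payload))
    return buffer.getvalue()


sample_records = read_output_records(build_sample_archive())
```

For a real job, the bytes passed to `read_output_records` would be the contents of the downloaded output.tar.gz rather than the in-memory sample.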
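Multi-class output
When you use a classifier trained in multi-class mode, the results show classes.
Each of these classes is one of the classes
used to create the set of categories when you trained the classifier.
For more details about these output fields, see ClassifyDocument in the Amazon Comprehend API Reference.
The following examples use these mutually exclusive classes.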
DOCUMENTARY
SCIENCE_FICTION
ROMANTIC_COMEDY
SERIOUS_DRAMA
OTHER
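If your input data format is one document per line, the output file contains one line for each line in the input. Each line includes the file name, the zero-based line number of the input line, and the class or classes found in the document. Each class is followed by a score: the confidence that Amazon Comprehend has that the individual instance was classified correctly.
For example: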
{"File": "file1.txt", "Line": "0", "Classes": [{"Name": "Documentary", "Score": 0.8642}, {"Name": "Other", "Score": 0.0381}, {"Name": "Serious_Drama", "Score": 0.0372}]}
{"File": "file1.txt", "Line": "1", "Classes": [{"Name": "Science_Fiction", "Score": 0.5}, {"Name": "Science_Fiction", "Score": 0.0381}, {"Name": "Science_Fiction", "Score": 0.0372}]}
{"File": "file2.txt", "Line": "2", "Classes": [{"Name": "Documentary", "Score": 0.1}, {"Name": "Documentary", "Score": 0.0381}, {"Name": "Documentary", "Score": 0.0372}]}
{"File": "file2.txt", "Line": "3", "Classes": [{"Name": "Serious_Drama", "Score": 0.3141}, {"Name": "Other", "Score": 0.0381}, {"Name": "Other", "Score": 0.0372}]}
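If your input data format is one document per file, the output file contains one line for each document. Each line includes the file name and the class or classes found in the document. Each class is followed by a score: the confidence that Amazon Comprehend has that the individual instance was classified correctly.
For example: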
{"File": "file0.txt", "Classes": [{"Name": "Documentary", "Score": 0.8642}, {"Name": "Other", "Score": 0.0381}, {"Name": "Serious_Drama", "Score": 0.0372}]}
{"File": "file1.txt", "Classes": [{"Name": "Science_Fiction", "Score": 0.5}, {"Name": "Science_Fiction", "Score": 0.0381}, {"Name": "Science_Fiction", "Score": 0.0372}]}
{"File": "file2.txt", "Classes": [{"Name": "Documentary", "Score": 0.1}, {"Name": "Documentary", "Score": 0.0381}, {"Name": "Documentary", "Score": 0.0372}]}
{"File": "file3.txt", "Classes": [{"Name": "Serious_Drama", "Score": 0.3141}, {"Name": "Other", "Score": 0.0381}, {"Name": "Other", "Score": 0.0372}]}
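Because the classes in multi-class mode are mutually exclusive, a typical post-processing step is to keep only the highest-scoring class for each record. The sketch below shows one way to do that for both input formats. The function names are hypothetical, and the sample records are copied from the examples in this section.

```python
import json

# Sample records: one in the one-document-per-line format (with "Line"),
# two in the one-document-per-file format (without "Line").
SAMPLE_OUTPUT = """\
{"File": "file1.txt", "Line": "0", "Classes": [{"Name": "Documentary", "Score": 0.8642}, {"Name": "Other", "Score": 0.0381}, {"Name": "Serious_Drama", "Score": 0.0372}]}
{"File": "file0.txt", "Classes": [{"Name": "Documentary", "Score": 0.8642}, {"Name": "Other", "Score": 0.0381}, {"Name": "Serious_Drama", "Score": 0.0372}]}
{"File": "file3.txt", "Classes": [{"Name": "Serious_Drama", "Score": 0.3141}, {"Name": "Other", "Score": 0.0381}, {"Name": "Other", "Score": 0.0372}]}
"""


def top_class(record):
    """Return (name, score) for the highest-scoring class in one record."""
    best = max(record["Classes"], key=lambda entry: entry["Score"])
    return best["Name"], best["Score"]


def summarize(output_text):
    """Map (file name, line number or None) to the top class of each record."""
    summary = {}
    for raw_line in output_text.splitlines():
        if not raw_line.strip():
            continue
        record = json.loads(raw_line)
        # "Line" is present only for the one-document-per-line format.
        key = (record["File"], record.get("Line"))
        summary[key] = top_class(record)
    return summary


top_classes = summarize(SAMPLE_OUTPUT)
```

Keying on both the file name and the optional line number lets the same helper handle either input format without a separate code path.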
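Multi-label output
When you use a classifier trained in multi-label mode, the results show labels.
Each of these labels is one of the labels
used to create the set of categories when you trained the classifier.
The following examples use these unique labels.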
SCIENCE_FICTION
ACTION
DRAMA
COMEDY
ROMANCE
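If your input data format is one document per line, the output file contains one line for each line in the input. Each line includes the file name, the zero-based line number of the input line, and the label or labels found in the document. Each label is followed by a score: the confidence that Amazon Comprehend has that the individual instance was classified correctly.
For example: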
{"File": "file1.txt", "Line": "0", "Labels": [{"Name": "Action", "Score": 0.8642}, {"Name": "Drama", "Score": 0.650}, {"Name": "Science Fiction", "Score": 0.0372}]}
{"File": "file1.txt", "Line": "1", "Labels": [{"Name": "Comedy", "Score": 0.5}, {"Name": "Action", "Score": 0.0381}, {"Name": "Drama", "Score": 0.0372}]}
{"File": "file1.txt", "Line": "2", "Labels": [{"Name": "Action", "Score": 0.9934}, {"Name": "Drama", "Score": 0.0381}, {"Name": "Action", "Score": 0.0372}]}
{"File": "file1.txt", "Line": "3", "Labels": [{"Name": "Romance", "Score": 0.9845}, {"Name": "Comedy", "Score": 0.8756}, {"Name": "Drama", "Score": 0.7723}, {"Name": "Science_Fiction", "Score": 0.6157}]}
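If your input data format is one document per file, the output file contains one line for each document. Each line includes the file name and the label or labels found in the document. Each label is followed by a score: the confidence that Amazon Comprehend has that the individual instance was classified correctly.
For example: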
{"File": "file0.txt", "Labels": [{"Name": "Action", "Score": 0.8642}, {"Name": "Drama", "Score": 0.650}, {"Name": "Science Fiction", "Score": 0.0372}]}
{"File": "file1.txt", "Labels": [{"Name": "Comedy", "Score": 0.5}, {"Name": "Action", "Score": 0.0381}, {"Name": "Drama", "Score": 0.0372}]}
{"File": "file2.txt", "Labels": [{"Name": "Action", "Score": 0.9934}, {"Name": "Drama", "Score": 0.0381}, {"Name": "Action", "Score": 0.0372}]}
{"File": "file3.txt", "Labels": [{"Name": "Romance", "Score": 0.9845}, {"Name": "Comedy", "Score": 0.8756}, {"Name": "Drama", "Score": 0.7723}, {"Name": "Science_Fiction", "Score": 0.6157}]}
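In multi-label mode a document can carry several labels at once, so instead of taking a single maximum it is common to keep every label whose score clears a cutoff. The following is a minimal sketch of that idea. The function name and the threshold values are assumptions chosen for illustration, not values recommended by Amazon Comprehend.

```python
import json


def labels_above(record, threshold=0.5):
    """Return the names of all labels whose score is at least `threshold`."""
    return [
        entry["Name"]
        for entry in record["Labels"]
        if entry["Score"] >= threshold
    ]


# One record copied from the one-document-per-file example.
record = json.loads(
    '{"File": "file3.txt", "Labels": ['
    '{"Name": "Romance", "Score": 0.9845}, '
    '{"Name": "Comedy", "Score": 0.8756}, '
    '{"Name": "Drama", "Score": 0.7723}, '
    '{"Name": "Science_Fiction", "Score": 0.6157}]}'
)

# The 0.7 cutoff is an arbitrary illustration value.
accepted = labels_above(record, threshold=0.7)
```

A suitable threshold depends on how the classifier was trained and on the relative cost of false positives and false negatives, so it is worth tuning against labeled data.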
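Outputs for semi-structured input documents
For semi-structured input documents, the output can include the following additional fields: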
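DocumentMetadata – Extraction information about the document. The metadata includes a list of pages in the document, with the number of characters extracted from each page. This field is present in the response if the request included the Byte parameter.
DocumentType – The document type for each page in the input document. This field is present in the response if the request included the Byte parameter.
Errors – Page-level errors that the system detected while processing the input document. The field is empty if the system encountered no errors.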
For more details about these output fields, see ClassifyDocument in the Amazon Comprehend API Reference.
The following example shows the output for a two-page scanned PDF file.
[{ #First page output
    "Classes": [
        {
            "Name": "__label__2 ",
            "Score": 0.9993996620178223
        },
        {
            "Name": "__label__3 ",
            "Score": 0.0004330444789957255
        }
    ],
    "DocumentMetadata": {
        "PageNumber": 1,
        "Pages": 2
    },
    "DocumentType": "ScannedPDF",
    "File": "file.pdf",
    "Version": "VERSION_NUMBER"
},
#Second page output
{
    "Classes": [
        {
            "Name": "__label__2 ",
            "Score": 0.9993996620178223
        },
        {
            "Name": "__label__3 ",
            "Score": 0.0004330444789957255
        }
    ],
    "DocumentMetadata": {
        "PageNumber": 2,
        "Pages": 2
    },
    "DocumentType": "ScannedPDF",
    "File": "file.pdf",
    "Version": "VERSION_NUMBER"
}]
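The page-level results above can be grouped per file to see the top class for each page. The sketch below works on a Python structure equivalent to the example, with the `#` annotations removed because they are not valid JSON. The function name is hypothetical, and stripping the trailing space from the class names is an assumption about how a consumer might want to normalize them.

```python
from collections import defaultdict

# Python equivalent of the two-page example, without the "#" annotations.
PAGE_RESULTS = [
    {
        "Classes": [
            {"Name": "__label__2 ", "Score": 0.9993996620178223},
            {"Name": "__label__3 ", "Score": 0.0004330444789957255},
        ],
        "DocumentMetadata": {"PageNumber": 1, "Pages": 2},
        "DocumentType": "ScannedPDF",
        "File": "file.pdf",
        "Version": "VERSION_NUMBER",
    },
    {
        "Classes": [
            {"Name": "__label__2 ", "Score": 0.9993996620178223},
            {"Name": "__label__3 ", "Score": 0.0004330444789957255},
        ],
        "DocumentMetadata": {"PageNumber": 2, "Pages": 2},
        "DocumentType": "ScannedPDF",
        "File": "file.pdf",
        "Version": "VERSION_NUMBER",
    },
]


def top_class_per_page(page_results):
    """Map each file name to {page number: top class name}."""
    by_file = defaultdict(dict)
    for page in page_results:
        # DocumentMetadata is optional, so fall back to an empty dict.
        page_number = page.get("DocumentMetadata", {}).get("PageNumber")
        best = max(page["Classes"], key=lambda entry: entry["Score"])
        # The sample names end with a space; strip it for cleaner keys.
        by_file[page["File"]][page_number] = best["Name"].strip()
    return dict(by_file)


pages_by_file = top_class_per_page(PAGE_RESULTS)
```

Using `.get` for DocumentMetadata keeps the helper usable on plain-text results, where the additional fields are absent.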