Understand how the results of your model evaluation job are saved in Amazon S3
The output from a model evaluation job is saved in the Amazon S3 bucket that you specified when you created the job. Results are saved as JSON Lines files (.jsonl).
Within that bucket, the results are stored under the following paths.
- For model evaluation jobs that use human workers:
  s3://user-specified-S3-output-path/job-name/job-uuid/datasets/dataset-name/file-uuid_output.jsonl
- For automatic model evaluation jobs:
  s3://user-specified-S3-output-path/job-name/job-uuid/models/model-id/taskTypes/task-type/datasets/dataset/file-uuid_output.jsonl
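The two path layouts above can be sketched as small helper functions. This is only an illustration: the function and parameter names are hypothetical, and the code does nothing more than join the path segments shown in this topic.

```python
def human_output_uri(output_path, job_name, job_uuid, dataset_name, file_uuid):
    """Build the output URI for a job that uses human workers.

    All arguments are placeholders for the values shown in the path
    layout above; the function name itself is hypothetical.
    """
    base = output_path.strip("/")
    return (
        f"s3://{base}/{job_name}/{job_uuid}"
        f"/datasets/{dataset_name}/{file_uuid}_output.jsonl"
    )


def automated_output_uri(output_path, job_name, job_uuid, model_id,
                         task_type, dataset, file_uuid):
    """Build the output URI for an automatic model evaluation job."""
    base = output_path.strip("/")
    return (
        f"s3://{base}/{job_name}/{job_uuid}"
        f"/models/{model_id}/taskTypes/{task_type}"
        f"/datasets/{dataset}/{file_uuid}_output.jsonl"
    )


# Example values are made up for illustration.
human_uri = human_output_uri("my-bucket/results", "my-job", "job-1234", "my-dataset", "file-5678")
auto_uri = automated_output_uri(
    "my-bucket/results", "my-job", "job-1234",
    "my-model", "Summarization", "my-dataset", "file-5678",
)
```

Keeping the two layouts in separate functions makes the extra models and taskTypes segments of the automatic layout explicit.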
The following topics describe how the results from automatic and human worker based model evaluation jobs are saved in Amazon S3.
Output data from automated model evaluation jobs
The results of an automated evaluation job are stored in the datasets directory when the status of the job changes to Completed.
For each metric and corresponding prompt dataset that you selected when you created the model evaluation job, a JSON Lines file is generated in the datasets directory. The file uses the following naming convention: metric_input-dataset.jsonl
Each result from the model evaluation job starts with the automatedEvaluationResult key. Its first child key, scores, contains the metrics that you selected in the Amazon Bedrock console. In this example, only a single metric, Accuracy, was selected. The scores entry also contains result, the value calculated for the selected metric. To learn more about which specific value is calculated, see Review metrics for an automated model evaluation job in Amazon Bedrock (console).
The second key, inputRecord, is a copy of what you provided in your input prompt dataset.
The third key, modelResponses, contains a list of JSON objects, each with the ARN of a model selected when you created the model evaluation job. Each object also contains the full response from the model to the prompt provided.
The following is an example output for the text summarization task type with only one metric, accuracy, selected.
{ "automatedEvaluationResult": { "scores": [{ "metricName": "Accuracy", "result": 0.31920555233955383 }] }, "inputRecord": { "prompt": "Human: two bodies have been recovered from a flooded coal mine in yiyang city , central china 's henan province , as of #:## thursday morning , rescuers said . Assistant:", "referenceResponse": "two bodies recovered from flooded coal mine in c. china" }, "modelResponses": [{ "modelIdentifier": "arn:aws:bedrock:us-west-2::foundation-model/anthropic.claude-v2", "response": " Here are the key points I gathered from your news summary:\n\n- Two bodies have been recovered from a flooded coal mine in Yiyang City, in China's Henan province.\n\n- This happened as of Thursday morning, according to rescuers.\n\n- The location is central China's Henan province. \n\n- Yiyang City is where the flooded coal mine is located.\n\n- The coal mine was flooded, leading to the two deaths and need for rescue/recovery efforts.\n\n- This news is developing as of Thursday morning, so the death toll and rescue details may still be unfolding.\n\nPlease let me know if I have accurately summarized the key information or if you would like me to expand on any part of it." }] }
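As a minimal sketch of how records shaped like the example above could be consumed, the following Python reads JSON Lines text and averages the result value per metric. The function name and the sample records are made up for illustration. The code assumes only the keys shown in the example (automatedEvaluationResult, scores, metricName, result).

```python
import json
from collections import defaultdict


def mean_scores(lines):
    """Average the `result` value per metric across JSON Lines records.

    `lines` is any iterable of strings, one JSON object per line,
    shaped like the example record above.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        record = json.loads(line)
        for score in record["automatedEvaluationResult"]["scores"]:
            totals[score["metricName"]] += score["result"]
            counts[score["metricName"]] += 1
    return {name: totals[name] / counts[name] for name in totals}


# Two shortened, made-up records that follow the documented shape.
sample_lines = [
    '{"automatedEvaluationResult": {"scores": [{"metricName": "Accuracy", "result": 0.25}]}, '
    '"inputRecord": {"prompt": "p1"}, "modelResponses": []}',
    '{"automatedEvaluationResult": {"scores": [{"metricName": "Accuracy", "result": 0.75}]}, '
    '"inputRecord": {"prompt": "p2"}, "modelResponses": []}',
]
averages = mean_scores(sample_lines)
```

In practice the lines would come from an output file downloaded from your S3 bucket, for example by iterating over an open file object.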
Output data from model evaluation jobs that use human workers
When a model evaluation job has completed, you see the following parameters in the output data returned from human review tasks.
Parameter | Value type | Example values | Description |
---|---|---|---|
flowDefinitionArn | String | arn:aws:sagemaker:us-west-2: | The Amazon Resource Name (ARN) of the human review workflow (flow definition) used to create the human loop. |
humanAnswers | List of JSON objects | | A list of JSON objects that contain worker responses. |
humanLoopName | String | system-generated-hash | A system-generated 40-character hex string. |
inputRecord | JSON object | | A JSON object that contains an entry prompt from the input dataset. |
modelResponses | List of JSON objects | | The individual responses from the models. |
inputContent | Object | | The human loop input content required to start the human loop in your S3 bucket. |
modelResponseIdMap | Object | | |
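One way to work with these parameters is to flatten the worker responses into one row per rating. The sketch below is hedged: it assumes each humanAnswers entry nests its ratings under answerContent.evaluationResults, keyed by rating method, as in the example output in this topic. The function name and the shortened sample record are hypothetical.

```python
import json


def flatten_human_answers(record):
    """Return one dict per individual rating found in a result record.

    Assumes the nesting humanEvaluationResult -> humanAnswers ->
    answerContent -> evaluationResults -> <rating method> -> entries.
    """
    rows = []
    for loop in record.get("humanEvaluationResult", []):
        for answer in loop.get("humanAnswers", []):
            results = answer["answerContent"]["evaluationResults"]
            for rating_method, entries in results.items():
                for entry in entries:
                    rows.append({
                        "workerId": answer.get("workerId"),
                        "ratingMethod": rating_method,
                        "metricName": entry["metricName"],
                        "modelResponseId": entry["modelResponseId"],
                        "result": entry["result"],
                    })
    return rows


# A shortened record; the values are taken from the example output in this topic.
sample_record = json.loads("""
{
  "humanEvaluationResult": [{
    "humanAnswers": [{
      "answerContent": {
        "evaluationResults": {
          "thumbsUpDown": [
            {"metricName": "Coherence", "modelResponseId": "0", "result": false},
            {"metricName": "Accuracy", "modelResponseId": "0", "result": true}
          ],
          "individualLikertScale": [
            {"metricName": "Toxicity", "modelResponseId": "0", "result": 1}
          ]
        }
      },
      "workerId": "444455556666"
    }]
  }]
}
""")
rows = flatten_human_answers(sample_record)
```

Flat rows like these are easier to load into a spreadsheet or data frame than the nested structure.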
The following is an example of output data from a model evaluation job.
{ "humanEvaluationResult": [{ "flowDefinitionArn": "arn:aws:sagemaker:us-west-2:111122223333:flow-definition/flow-definition-name", "humanAnswers": [{ "acceptanceTime": "2023-11-09T19:17:43.107Z", "answerContent": { "evaluationResults": { "thumbsUpDown": [{ "metricName": "Coherence", "modelResponseId": "0", "result": false }, { "metricName": "Accuracy", "modelResponseId": "0", "result": true }], "individualLikertScale": [{ "metricName": "Toxicity", "modelResponseId": "0", "result": 1 }] } }, "submissionTime": "2023-11-09T19:17:52.101Z", "timeSpentInSeconds": 8.994, "workerId": "444455556666", "workerMetadata": { "identityData": { "identityProviderType": "Cognito", "issuer": "https://cognito-idp.AWS Region.amazonaws.com/AWS Region_111222", "sub": "c6aa8eb7-9944-42e9-a6b9-" } } }], ...Additional responses have been truncated for clarity... }], "humanLoopName": "b3b1c64a2166e001e094123456789012", "inputContent":{ "additionalDataS3Uri":"s3://user-specified-S3-output-path/datasets/dataset-name/records/record-number/human-loop-additional-data.json", "evaluationMetrics":[ { "description":"testing", "metricName":"IndividualLikertScale", "ratingMethod":"IndividualLikertScale" } ], "instructions":"some dummy instructions" }, "modelResponseIdMap":{ "0":"arn:aws:bedrock:AWS Region::foundation-model/model-id" } }, "inputRecord": { "category": "Fitness", "prompt": "Human: What is High-Intensity Interval Training ? \\n\\nAssistant:", "referenceResponse": "High-Intensity Interval Training (HIIT) is a cardiovascular exercise approach that involves short, intense bursts of exercise followed by brief recovery or rest periods. In HIIT, you perform exercises at near-maximum effort during the high-intensity intervals, which can include activities like sprinting, cycling, jumping, or body weight exercises such as burpees or push-ups. These intense intervals are interspersed with short recovery periods to lower your heart rate before the next high-intensity bout. HIIT workouts usually consist of multiple cycles of these intervals, and the number of cycles and the length of intervals can vary. HIIT is known for its time efficiency, making it an excellent choice for those with busy schedules. It offers both aerobic and anaerobic benefits, improving cardiovascular fitness, burning calories, and promoting fat loss, all while boosting metabolism for continued calorie burning post-workout." }, "modelResponses": [{ "modelIdentifier": "arn:aws:bedrock:AWS Region::foundation-model/model-id", "response": " High-intensity interval training (HIIT) is a form of exercise that involves short bursts of intense activity alternated with recovery periods. Here are some key features of HIIT:\n\n- Alternates short periods of intense anaerobic exercise with less intense recovery periods. The high-intensity periods are typically done at 80-95% of maximum heart rate.\n\n- The duration of the intense periods can range from 5 seconds to 8 minutes, but typically are in the range of 30 seconds to 4 minutes.\n\n- The total HIIT session typically lasts between 10-30 minutes, including warm-up, recovery periods, and cool down. \n\n- Uses exercises like sprints, cycling, rowing, jumping rope, stair climbing, swimming, etc. Movements that get your heart rate up quickly.\n\n- The short, intense intervals help improve cardiovascular fitness by increasing heart rate and oxygen consumption.\n\n- HIIT burns more calories and fat than steady-state cardio in a shorter time period. It also helps build muscle strength and endurance.\n\n- Considered an efficient and effective form of exercise for fat loss and improving aerobic power. But it requires motivation to push yourself during the intense intervals.\n\n- Not suitable for beginners due to the high-intensity. Start with steady-state cardio and build up endurance before trying HIIT.\n\nIn summary, HIIT intersperses intense bursts of" }] } }
The following table explains how the rating method that you selected for each metric in the Amazon Bedrock console is returned in your Amazon S3 bucket. The first child key under evaluationResults identifies the rating method.
Rating method selected | Saved in Amazon S3 |
---|---|
Likert scale - Individual | IndividualLikertScale |
Likert scale - Comparison | ComparisonLikertScale |
Choice buttons | ComparisonChoice |
Ordinal rank | ComparisonRank |
Thumbs up/down | ThumbsUpDown |
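The mapping in the table can be kept as a small lookup when post-processing results. This is a sketch under one assumption: the example output in this topic writes the keys under evaluationResults with a lowercase first letter (thumbsUpDown, individualLikertScale), so the lookup below ignores case. The constant and function names are hypothetical.

```python
# Stored value (as listed in the table) -> rating method shown in the console.
RATING_METHODS = {
    "IndividualLikertScale": "Likert scale - Individual",
    "ComparisonLikertScale": "Likert scale - Comparison",
    "ComparisonChoice": "Choice buttons",
    "ComparisonRank": "Ordinal rank",
    "ThumbsUpDown": "Thumbs up/down",
}

# Case-insensitive index; matching without regard to case is an assumption
# made here to cover both spellings seen in this topic.
_BY_LOWERCASE = {key.lower(): label for key, label in RATING_METHODS.items()}


def console_rating_method(stored_key):
    """Return the console label for a stored rating method key, or None."""
    return _BY_LOWERCASE.get(stored_key.lower())


label = console_rating_method("thumbsUpDown")
```

Returning None for unknown keys lets the caller decide whether an unrecognized rating method is an error.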