MLPER-18: Include human-in-the-loop monitoring
Use human-in-the-loop monitoring to evaluate model performance efficiently. When automating decision processes, human labeling of model results is a reliable quality test for model inferences.
Compare the human labels with the model's inferences to estimate performance degradation, and mitigate degradation by re-training the model.
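The comparison described above can be sketched as follows. This is a minimal illustration, not a prescribed implementation: the baseline accuracy, audit labels, and re-training tolerance are hypothetical values chosen for the example.

```python
from typing import Sequence


def estimate_degradation(
    human_labels: Sequence[str],
    model_predictions: Sequence[str],
    baseline_accuracy: float,
) -> tuple[float, float]:
    """Compare human audit labels with model inferences on the same samples.

    Returns the observed agreement rate and the drop relative to the
    accuracy measured at deployment time (the baseline).
    """
    if len(human_labels) != len(model_predictions):
        raise ValueError("labels and predictions must align one-to-one")
    agree = sum(h == p for h, p in zip(human_labels, model_predictions))
    agreement = agree / len(human_labels)
    return agreement, baseline_accuracy - agreement


# Hypothetical audit: baseline accuracy was 0.92 at deployment;
# subject matter experts have labeled five production inferences.
agreement, drop = estimate_degradation(
    ["cat", "dog", "cat", "bird", "dog"],
    ["cat", "dog", "dog", "bird", "dog"],
    baseline_accuracy=0.92,
)

RETRAIN_TOLERANCE = 0.05  # hypothetical threshold for triggering re-training
needs_retraining = drop > RETRAIN_TOLERANCE
```

In this example the model agrees with the human labels on four of five samples (agreement 0.80), so the estimated drop of 0.12 exceeds the tolerance and re-training would be triggered.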
Implementation plan
- Use Amazon Augmented AI to get human review - Design a quality assurance system for model inferences. Establish a team of subject matter experts to audit model inferences in production. Use Amazon Augmented AI (Amazon A2I) to get human review of low-confidence predictions or of random samples of predictions. Amazon A2I uses resources in IAM, SageMaker, and Amazon S3 to create and run your human review workflows.
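A sketch of the routing logic is below: low-confidence predictions always go to human review, and a random sample of the rest is audited as well. The confidence threshold, sample rate, and prediction records are illustrative assumptions; `start_review` shows the shape of an Amazon A2I `StartHumanLoop` call but assumes you have already created a flow definition (which wires up the IAM role, the SageMaker work team, and the Amazon S3 output location) and is not invoked here.

```python
import json
import random
import uuid

CONFIDENCE_THRESHOLD = 0.70  # hypothetical cut-off for "low confidence"
SAMPLE_RATE = 0.05           # fraction of confident predictions audited anyway


def needs_human_review(confidence: float, rng: random.Random) -> bool:
    """Route low-confidence predictions, plus a random sample of the rest."""
    return confidence < CONFIDENCE_THRESHOLD or rng.random() < SAMPLE_RATE


def start_review(prediction: dict, flow_definition_arn: str) -> None:
    """Start an Amazon A2I human loop for one prediction (sketch only)."""
    import boto3  # requires AWS credentials; not executed in this example

    a2i = boto3.client("sagemaker-a2i-runtime")
    a2i.start_human_loop(
        HumanLoopName=f"review-{uuid.uuid4()}",
        FlowDefinitionArn=flow_definition_arn,
        HumanLoopInput={"InputContent": json.dumps(prediction)},
    )


rng = random.Random(42)  # seeded so the sampling decision is reproducible
predictions = [
    {"id": 1, "label": "fraud", "confidence": 0.55},
    {"id": 2, "label": "ok", "confidence": 0.98},
    {"id": 3, "label": "ok", "confidence": 0.61},
]
to_review = [p for p in predictions if needs_human_review(p["confidence"], rng)]
```

With these values, predictions 1 and 3 fall below the threshold and are routed to review; each item in `to_review` would then be passed to `start_review` with the ARN of your flow definition.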