Receiving model logs and metrics

To receive logs and metrics from custom model training or inference, members must have created an ML Configuration with a valid role that provides the necessary CloudWatch permissions (see Create a service role for custom ML modeling - ML Configuration).

System metrics

System metrics for both training and inference, such as CPU and memory utilization, are published to all members in the collaboration with valid ML Configurations. These metrics can be viewed as the job progresses via CloudWatch Metrics in the /aws/cleanroomsml/TrainedModels or /aws/cleanroomsml/TrainedModelInferenceJobs namespaces, respectively.
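As a rough illustration of how a member might discover which system metrics have been published, the sketch below lists the metric names found in one of the namespaces above by calling the CloudWatch list_metrics API. The helper name list_metric_names is hypothetical. The client is passed in as an argument (for example, a boto3 CloudWatch client) so the function can be exercised without AWS credentials. The exact metric names are not stated in this section, so the code discovers them rather than assuming any.

```python
from typing import Any, List

# Namespaces named in this section.
TRAINING_NAMESPACE = "/aws/cleanroomsml/TrainedModels"
INFERENCE_NAMESPACE = "/aws/cleanroomsml/TrainedModelInferenceJobs"


def list_metric_names(cloudwatch: Any, namespace: str) -> List[str]:
    """Return the sorted, de-duplicated metric names found in a namespace.

    `cloudwatch` is assumed to expose the CloudWatch `list_metrics` call,
    as a boto3 "cloudwatch" client does.
    """
    names = set()
    kwargs = {"Namespace": namespace}
    while True:
        page = cloudwatch.list_metrics(**kwargs)
        for metric in page.get("Metrics", []):
            names.add(metric["MetricName"])
        token = page.get("NextToken")
        if not token:
            return sorted(names)
        # Follow pagination until no further token is returned.
        kwargs["NextToken"] = token
```

With boto3 installed and credentials for a member account, a call might look like list_metric_names(boto3.client("cloudwatch"), TRAINING_NAMESPACE).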

Model logs

Access to the model logs is controlled by the privacy configuration policy of each configured model algorithm. The model author sets the privacy configuration policy when associating a configured model algorithm with a collaboration (either via the console or the CreateConfiguredModelAlgorithmAssociation API). The policy determines which members can receive the model logs.

Additionally, the model author can set a filter pattern in the privacy configuration policy to filter log events. All logs that a model container sends to stdout or stderr, and that match the filter pattern (if one is set), are sent to Amazon CloudWatch Logs. Model logs are available in the CloudWatch log groups /aws/cleanroomsml/TrainedModels for training and /aws/cleanroomsml/TrainedModelInferenceJobs for inference.
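Once logs have been delivered, a member with access could read them with the CloudWatch Logs filter_log_events API. The following is a minimal sketch under stated assumptions: the helper name iter_model_log_messages is hypothetical, the client is passed in (for example, a boto3 "logs" client), and the optional filter_pattern argument is a read-time query filter chosen by the caller. It is separate from the filter pattern that the model author sets in the privacy configuration policy.

```python
from typing import Any, Iterator, Optional


def iter_model_log_messages(
    logs: Any,
    log_group: str,
    filter_pattern: Optional[str] = None,
) -> Iterator[str]:
    """Yield log messages from a log group, following pagination.

    `logs` is assumed to expose the CloudWatch Logs `filter_log_events`
    call, as a boto3 "logs" client does.
    """
    kwargs = {"logGroupName": log_group}
    if filter_pattern:
        # Read-time filter only; it cannot recover events that were never delivered.
        kwargs["filterPattern"] = filter_pattern
    while True:
        page = logs.filter_log_events(**kwargs)
        for event in page.get("events", []):
            yield event["message"]
        token = page.get("nextToken")
        if not token:
            return
        kwargs["nextToken"] = token
```

For training logs, a call might look like list(iter_model_log_messages(boto3.client("logs"), "/aws/cleanroomsml/TrainedModels")).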

Custom defined metrics

When configuring a model algorithm (either via the console or the CreateConfiguredModelAlgorithm API), the model author can provide specific metric names and regex statements to search for in the output logs. The resulting metrics can be viewed as the job progresses via CloudWatch Metrics in the /aws/cleanroomsml/TrainedModels namespace. When associating a configured model algorithm, the model author can set an optional noise level in the metrics privacy configuration to avoid outputting raw data while still providing visibility into custom metric trends. If a noise level is set, the metrics are published at the end of the job rather than in real time.
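To make the idea of pairing a metric name with a regex concrete, the sketch below applies such definitions to a few log lines locally. It is an illustration only, not the service's implementation: the helper name extract_metrics, the metric name validation_loss, and the log line format are all hypothetical, and it assumes each regex contains exactly one capture group that matches a number. The service's actual matching rules may differ.

```python
import re
from typing import Dict, Iterable, List


def extract_metrics(
    log_lines: Iterable[str],
    definitions: Dict[str, str],
) -> Dict[str, List[float]]:
    """Collect numeric values per metric name from log lines.

    `definitions` maps a metric name to a regex whose first capture
    group is assumed to match the numeric value.
    """
    compiled = {name: re.compile(rx) for name, rx in definitions.items()}
    values: Dict[str, List[float]] = {name: [] for name in definitions}
    for line in log_lines:
        for name, pattern in compiled.items():
            match = pattern.search(line)
            if match:
                values[name].append(float(match.group(1)))
    return values


# Hypothetical log output and metric definition.
sample_logs = [
    "epoch=1 val_loss=0.52",
    "epoch=2 val_loss=0.41",
    "training complete",
]
metrics = extract_metrics(sample_logs, {"validation_loss": r"val_loss=([0-9.]+)"})
# metrics == {"validation_loss": [0.52, 0.41]}
```

Testing a regex against sample container output in this way, before configuring the algorithm, is one way to check that it matches the lines the model actually prints.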
