Configuring settings for automated sensitive data discovery
If you enable automated sensitive data discovery for your account or organization, you can adjust your automated discovery settings to refine the analyses that Amazon Macie performs. These settings specify Amazon Simple Storage Service (Amazon S3) buckets to exclude from analyses. They also specify the types and occurrences of sensitive data to detect and report—the managed data identifiers, custom data identifiers, and allow lists to use when analyzing S3 objects.
By default, Macie performs automated sensitive data discovery for all the S3 general purpose buckets that it monitors and analyzes for your account. If you're the Macie administrator for an organization, this includes buckets that your member accounts own. You can exclude specific buckets from the analyses. For example, you might exclude buckets that typically store AWS logging data, such as AWS CloudTrail event logs. If you exclude a bucket, you can include it again later.
In addition, Macie analyzes S3 objects by using only the set of managed data identifiers that we recommend for automated sensitive data discovery. Macie doesn't use custom data identifiers or allow lists that you defined. To customize the analyses, you can add or remove specific managed data identifiers, custom data identifiers, and allow lists.
If you change a setting, Macie applies your change when the next evaluation and analysis cycle starts, typically within 24 hours. In addition, your change applies only to the current AWS Region. To make the same change in additional Regions, repeat the applicable steps in each additional Region.
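Because each change applies only to the current Region, repeating a change across Regions is easy to script. The following is a minimal sketch, not an official procedure. The helper name `apply_in_regions` and its parameters are hypothetical. It assumes that you supply a factory that creates a Macie client for a Region (for example, with the AWS SDK for Python) and a function that performs one change with that client.

```python
from typing import Any, Callable, Dict, Iterable


def apply_in_regions(
    make_client: Callable[[str], Any],
    regions: Iterable[str],
    change: Callable[[Any], Any],
) -> Dict[str, Any]:
    """Run the same change once per Region and collect each result."""
    results: Dict[str, Any] = {}
    for region in regions:
        client = make_client(region)  # one client per Region
        results[region] = change(client)
    return results


# Usage (assumes boto3 is installed and credentials are configured; not run here):
#   import boto3
#   statuses = apply_in_regions(
#       lambda region: boto3.client("macie2", region_name=region),
#       ["us-east-1", "eu-west-1"],
#       lambda macie: macie.get_automated_discovery_configuration()["status"],
#   )
```

Passing the client factory as a parameter keeps the helper independent of any particular SDK setup.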
Note
To configure settings for automated sensitive data discovery, you must be the Macie administrator for an organization or have a standalone Macie account. If your account is part of an organization, only the Macie administrator for your organization can configure and manage these settings for accounts in your organization. If you have a member account, contact your Macie administrator to learn about the settings for your account and organization.
Configuration options for organizations
If an account is part of an organization that centrally manages multiple Amazon Macie accounts, the Macie administrator for the organization configures and manages automated sensitive data discovery for accounts in the organization. This includes settings that define the scope and nature of the analyses that Macie performs for the accounts. Members can't access these settings for their own accounts.
If you're the Macie administrator for an organization, you can define the scope of the analyses in several ways:
- Automatically enable automated sensitive data discovery for accounts – When you enable automated sensitive data discovery, you specify whether to enable it automatically for all existing accounts and new member accounts, only for new member accounts, or no accounts. If you enable it automatically for new member accounts, it's enabled for any account that subsequently joins your organization, when the account joins your organization in Macie. If it's enabled for an account, Macie includes S3 buckets that the account owns. If it's disabled for an account, Macie excludes buckets that the account owns.
- Selectively enable automated sensitive data discovery for accounts – With this option, you enable or disable automated sensitive data discovery for individual accounts on a case-by-case basis. If you enable it for an account, Macie includes S3 buckets that the account owns. If you don't enable it or you disable it for an account, Macie excludes buckets that the account owns.
- Exclude specific S3 buckets from automated sensitive data discovery – If you enable automated sensitive data discovery for one or more accounts, you can exclude particular S3 buckets that the accounts own. Macie then skips the buckets when it performs automated discovery for your organization. To exclude particular buckets, add them to the exclusion list in the configuration settings for your administrator account. You can exclude as many as 1,000 buckets for your organization.
By default, automated sensitive data discovery is enabled automatically for all new and existing accounts in an organization. In addition, Macie includes all the S3 buckets that the accounts own. If you keep the default settings, Macie performs automated discovery for all the buckets that it monitors and analyzes for your administrator account, which includes all the buckets that your member accounts own.
As a Macie administrator, you also define the nature of the analyses that Macie performs for your organization. You do this by configuring additional settings for your administrator account: the managed data identifiers, custom data identifiers, and allow lists that you want Macie to use when it analyzes S3 objects. Macie uses the settings for your administrator account when it analyzes S3 objects for other accounts in your organization.
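The account-level options described in this section can also be set programmatically. The following is a minimal sketch that only builds the request parameters. It assumes the AWS SDK for Python (Boto3) `macie2` client and its `update_automated_discovery_configuration` and `batch_update_automated_discovery_accounts` methods, neither of which is described in this topic. The helper names and the account IDs are hypothetical.

```python
from typing import Dict, Iterable, List

# Values assumed for the autoEnableOrganizationMembers setting:
# ALL = existing and new accounts, NEW = only new member accounts, NONE = no accounts.
AUTO_ENABLE_MODES = {"ALL", "NEW", "NONE"}


def build_org_discovery_config(auto_enable: str) -> Dict[str, str]:
    """Build keyword arguments for update_automated_discovery_configuration."""
    if auto_enable not in AUTO_ENABLE_MODES:
        raise ValueError(f"auto_enable must be one of {sorted(AUTO_ENABLE_MODES)}")
    return {"status": "ENABLED", "autoEnableOrganizationMembers": auto_enable}


def build_account_updates(
    enable: Iterable[str], disable: Iterable[str]
) -> List[Dict[str, str]]:
    """Build the accounts list for batch_update_automated_discovery_accounts."""
    updates = [{"accountId": account, "status": "ENABLED"} for account in enable]
    updates += [{"accountId": account, "status": "DISABLED"} for account in disable]
    return updates


# Usage (requires boto3 and Macie administrator credentials; not run here):
#   import boto3
#   macie = boto3.client("macie2", region_name="us-east-1")
#   macie.update_automated_discovery_configuration(**build_org_discovery_config("NEW"))
#   macie.batch_update_automated_discovery_accounts(
#       accounts=build_account_updates(enable=["111122223333"], disable=[]))
```

Keeping the request construction separate from the API calls makes it possible to review the parameters before you apply them.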
Excluding or including S3 buckets
By default, Amazon Macie performs automated sensitive data discovery for all the S3 general purpose buckets that it monitors and analyzes for your account. If you're the Macie administrator for an organization, this includes buckets that your member accounts own.
To refine the scope, you can exclude as many as 1,000 S3 buckets from analyses. If you exclude a bucket, Macie stops selecting and analyzing objects in the bucket when it performs automated sensitive data discovery. Existing sensitive data discovery statistics and details for the bucket persist. For example, the bucket's current sensitivity score remains unchanged. After you exclude a bucket, you can include it again later.
To exclude or include an S3 bucket
You can exclude or subsequently include an S3 bucket by using the Amazon Macie console or the Amazon Macie API. To do this programmatically, use the following operations: GetClassificationScope, to review a list of buckets that are currently excluded from analyses, or UpdateClassificationScope, to exclude or include a bucket in subsequent analyses.
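As an illustration of the programmatic approach, the following sketch wraps the two operations. It assumes a Boto3 `macie2` client, whose `get_classification_scope` and `update_classification_scope` methods correspond to the operations named above, and it assumes that the classification scope ID can be read from the `classificationScopeId` field returned by `get_automated_discovery_configuration`. The helper names are hypothetical.

```python
from typing import Any, Iterable, List

MAX_EXCLUDED_BUCKETS = 1000  # limit stated in this topic


def get_excluded_buckets(client: Any) -> List[str]:
    """Return the bucket names currently excluded from automated discovery."""
    config = client.get_automated_discovery_configuration()
    scope = client.get_classification_scope(id=config["classificationScopeId"])
    return list(scope.get("s3", {}).get("excludes", {}).get("bucketNames", []))


def set_bucket_exclusion(
    client: Any, bucket_names: Iterable[str], exclude: bool
) -> List[str]:
    """Exclude (exclude=True) or include again (exclude=False) the given buckets."""
    names = sorted(set(bucket_names))
    if exclude:
        total = set(get_excluded_buckets(client)) | set(names)
        if len(total) > MAX_EXCLUDED_BUCKETS:
            raise ValueError(f"Cannot exclude more than {MAX_EXCLUDED_BUCKETS} buckets")
    config = client.get_automated_discovery_configuration()
    client.update_classification_scope(
        id=config["classificationScopeId"],
        s3={
            "excludes": {
                "bucketNames": names,
                # ADD appends to the exclusion list; REMOVE takes names off it.
                "operation": "ADD" if exclude else "REMOVE",
            }
        },
    )
    return names


# Usage (requires boto3 and credentials; bucket name is illustrative; not run here):
#   import boto3
#   macie = boto3.client("macie2")
#   set_bucket_exclusion(macie, ["amzn-s3-demo-bucket"], exclude=True)
```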
To exclude or subsequently include an S3 bucket by using the console, follow these steps.
- Open the Amazon Macie console at https://console.aws.amazon.com/macie/.
- By using the AWS Region selector in the upper-right corner of the page, choose the Region in which you want to exclude or include specific S3 buckets in analyses.
- In the navigation pane, under Settings, choose Automated sensitive data discovery.
  The Automated sensitive data discovery page appears and displays your current settings. On that page, the S3 buckets section lists S3 buckets that are currently excluded, or it indicates that all buckets are currently included.
- In the S3 buckets section, choose Edit.
- Do one of the following:
  - To exclude one or more S3 buckets, choose Add buckets to the exclude list. Then, in the S3 buckets table, select the check box for each bucket to exclude. The table lists all the general purpose buckets for your account or organization in the current Region.
  - To include one or more S3 buckets that you previously excluded, choose Remove buckets from the exclude list. Then, in the S3 buckets table, select the check box for each bucket to include. The table lists all the buckets that are currently excluded from analyses.
  To find specific buckets more easily, enter search criteria in the search box above the table. You can also sort the table by choosing a column heading.
- When you finish selecting buckets, choose Add or Remove, depending on the option that you chose in the preceding step.
Tip
You can also exclude or include individual S3 buckets on a case-by-case basis while you review bucket details on the console. To do this, choose the bucket on the S3 buckets page. Then, in the details panel, change the Exclude from automated discovery setting for the bucket.
Adding or removing managed data identifiers
A managed data identifier is a set of built-in criteria and techniques that are designed to detect a specific type of sensitive data—for example, credit card numbers, AWS secret access keys, or passport numbers for a particular country or region. By default, Amazon Macie analyzes S3 objects by using the set of managed data identifiers that we recommend for automated sensitive data discovery. To review a list of these identifiers, see Default settings for automated sensitive data discovery.
You can tailor the analyses to focus on specific types of sensitive data:
- Add managed data identifiers for the types of sensitive data that you want Macie to detect and report, and
- Remove managed data identifiers for the types of sensitive data that you don't want Macie to detect and report.
If you remove a managed data identifier, your change doesn't affect existing sensitive data discovery statistics and details for S3 buckets. For example, if you remove the managed data identifier for AWS secret access keys and Macie previously detected that data in a bucket, Macie continues to report those detections.
Tip
Instead of removing a managed data identifier, which affects subsequent analyses of all S3 buckets, you can exclude its detections from sensitivity scores for only particular buckets. For more information, see Adjusting sensitivity scores for S3 buckets.
To add or remove a managed data identifier
You can add or remove a managed data identifier by using the Amazon Macie console or the Amazon Macie API. To do this programmatically, use the following operations: GetSensitivityInspectionTemplate, to determine which managed data identifiers you added or removed from current analyses, or UpdateSensitivityInspectionTemplate, to add or remove a managed data identifier from subsequent analyses.
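The following sketch shows one way to prepare the settings for such an update. It rests on several assumptions that this topic doesn't state: that the template returned by the Boto3 `get_sensitivity_inspection_template` method keeps added identifiers in `includes.managedDataIdentifierIds` and removed ones in `excludes.managedDataIdentifierIds`, and that an update overwrites these lists, so the complete merged lists are sent back. The function name is hypothetical and the identifier IDs are illustrative.

```python
from typing import Any, Dict, Iterable


def apply_managed_identifier_changes(
    template: Dict[str, Any],
    add: Iterable[str] = (),
    remove: Iterable[str] = (),
) -> Dict[str, Dict[str, Any]]:
    """Return complete includes/excludes settings with the changes applied."""
    includes = dict(template.get("includes") or {})
    excludes = dict(template.get("excludes") or {})
    added = set(includes.get("managedDataIdentifierIds") or [])
    removed = set(excludes.get("managedDataIdentifierIds") or [])
    for identifier in add:
        added.add(identifier)
        removed.discard(identifier)  # an added identifier must not stay removed
    for identifier in remove:
        removed.add(identifier)
        added.discard(identifier)  # a removed identifier must not stay added
    includes["managedDataIdentifierIds"] = sorted(added)
    excludes["managedDataIdentifierIds"] = sorted(removed)
    # Other keys, such as allow lists and custom data identifiers, are preserved.
    return {"includes": includes, "excludes": excludes}


# Usage (requires boto3 and credentials; not run here):
#   import boto3
#   macie = boto3.client("macie2")
#   template_id = macie.get_automated_discovery_configuration()[
#       "sensitivityInspectionTemplateId"]
#   current = macie.get_sensitivity_inspection_template(id=template_id)
#   settings = apply_managed_identifier_changes(current, remove=["USA_PASSPORT_NUMBER"])
#   macie.update_sensitivity_inspection_template(id=template_id, **settings)
```

Sorting the lists isn't required. It only makes the resulting request easier to compare with the previous settings.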
To add or remove a managed data identifier by using the console, follow these steps.
- Open the Amazon Macie console at https://console.aws.amazon.com/macie/.
- By using the AWS Region selector in the upper-right corner of the page, choose the Region in which you want to add or remove a managed data identifier from analyses.
- In the navigation pane, under Settings, choose Automated sensitive data discovery.
  The Automated sensitive data discovery page appears and displays your current settings. On that page, the Managed data identifiers section displays your current settings, organized into two tabs:
  - Added to default – This tab lists managed data identifiers that you added. Macie uses these identifiers in addition to the ones that are in the default set and that you haven't removed.
  - Removed from default – This tab lists managed data identifiers that you removed. Macie doesn't use these identifiers.
- In the Managed data identifiers section, choose Edit.
- Do any of the following:
  - To add one or more managed data identifiers, choose the Added to default tab. Then, in the table, select the check box for each managed data identifier to add. If a check box is already selected, you already added that identifier.
  - To remove one or more managed data identifiers, choose the Removed from default tab. Then, in the table, select the check box for each managed data identifier to remove. If a check box is already selected, you already removed that identifier.
  On each tab, the table displays a list of all the managed data identifiers that Macie currently provides. In the table, the first column specifies each managed data identifier's ID. The ID describes the type of sensitive data that an identifier is designed to detect. For example, USA_PASSPORT_NUMBER is the ID for US passport numbers. To find specific managed data identifiers more easily, enter search criteria in the search box above the table. You can also sort the table by choosing a column heading. For details about each identifier, see Using managed data identifiers.
- When you finish, choose Save.
Adding or removing custom data identifiers
A custom data identifier is a set of criteria that you define to detect sensitive data. The criteria consist of a regular expression (regex) that defines a text pattern to match and, optionally, character sequences and a proximity rule that refine the results. To learn more, see Building custom data identifiers.
By default, Amazon Macie doesn't use custom data identifiers when it performs automated sensitive data discovery. If you want Macie to use specific custom data identifiers, you can add them to the analyses. Macie then uses the custom data identifiers in addition to any managed data identifiers that you configure Macie to use.
If you add a custom data identifier, you can later remove it. Your change doesn't affect existing sensitive data discovery statistics and details for S3 buckets. For example, if you remove a custom data identifier that previously produced detections for a bucket, Macie continues to report those detections. Removing an identifier affects subsequent analyses of all buckets. As an alternative, consider excluding its detections from sensitivity scores for only particular buckets. For more information, see Adjusting sensitivity scores for S3 buckets.
To add or remove a custom data identifier
You can add or remove a custom data identifier by using the Amazon Macie console or the Amazon Macie API. To do this programmatically, use the following operations: GetSensitivityInspectionTemplate, to determine which custom data identifiers are currently used in analyses, or UpdateSensitivityInspectionTemplate, to add or remove a custom data identifier from subsequent analyses.
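A read-modify-write sequence with these two operations might look like the following sketch. It assumes a Boto3 `macie2` client, that the template ID is available in the `sensitivityInspectionTemplateId` field returned by `get_automated_discovery_configuration`, that custom data identifiers are stored in `includes.customDataIdentifierIds`, and that an update overwrites the template's lists. The function name and the identifier IDs are hypothetical.

```python
from typing import Any, Iterable, List


def update_custom_identifiers(
    client: Any, add: Iterable[str] = (), remove: Iterable[str] = ()
) -> List[str]:
    """Add or remove custom data identifier IDs and return the resulting list."""
    config = client.get_automated_discovery_configuration()
    template_id = config["sensitivityInspectionTemplateId"]
    template = client.get_sensitivity_inspection_template(id=template_id)

    # Copy the current includes so that allow lists and managed data
    # identifiers are sent back unchanged.
    includes = dict(template.get("includes") or {})
    current = set(includes.get("customDataIdentifierIds") or [])
    includes["customDataIdentifierIds"] = sorted((current | set(add)) - set(remove))

    client.update_sensitivity_inspection_template(
        id=template_id,
        includes=includes,
        excludes=template.get("excludes") or {},
    )
    return includes["customDataIdentifierIds"]


# Usage (requires boto3 and credentials; the ID is a placeholder; not run here):
#   import boto3
#   update_custom_identifiers(boto3.client("macie2"), add=["example-identifier-id"])
```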
To add or remove a custom data identifier by using the console, follow these steps.
- Open the Amazon Macie console at https://console.aws.amazon.com/macie/.
- By using the AWS Region selector in the upper-right corner of the page, choose the Region in which you want to add or remove a custom data identifier from analyses.
- In the navigation pane, under Settings, choose Automated sensitive data discovery.
  The Automated sensitive data discovery page appears and displays your current settings. On that page, the Custom data identifiers section lists custom data identifiers that you already added, or it indicates that you haven't added any custom data identifiers.
- In the Custom data identifiers section, choose Edit.
- Do any of the following:
  - To add one or more custom data identifiers, select the check box for each custom data identifier to add. If a check box is already selected, you already added that identifier.
  - To remove one or more custom data identifiers, clear the check box for each custom data identifier to remove. If a check box is already cleared, Macie doesn't currently use that identifier.
  Tip
  To review or test the settings for a custom data identifier before you add or remove it, choose the link icon next to the identifier's name. Macie opens a page that displays the identifier's settings. To also test the identifier with sample data, enter up to 1,000 characters of text in the Sample data box on that page. Then choose Test. Macie evaluates the sample data and reports the number of matches.
- When you finish, choose Save.
Adding or removing allow lists
In Amazon Macie, an allow list defines specific text or a text pattern that you want Macie to ignore when it inspects S3 objects for sensitive data. If text matches an entry or pattern in an allow list, Macie doesn’t report the text. This is the case even if the text matches the criteria of a managed or custom data identifier. To learn more, see Defining sensitive data exceptions with allow lists.
By default, Macie doesn't use allow lists when it performs automated sensitive data discovery. If you want Macie to use specific allow lists, you can add them to the analyses. If you add an allow list, you can later remove it.
To add or remove an allow list
You can add or remove an allow list by using the Amazon Macie console or the Amazon Macie API. To do this programmatically, use the following operations: GetSensitivityInspectionTemplate, to determine which allow lists are currently used in analyses, or UpdateSensitivityInspectionTemplate, to add or remove an allow list from subsequent analyses.
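For example, the following sketch looks up allow lists by name and prepares the updated settings. It assumes a Boto3 `macie2` client with a `list_allow_lists` method that returns `allowLists` entries with `id` and `name` fields, that allow list names are unique in your account, and that the template stores allow lists in `includes.allowListIds`. The function names and the list name are hypothetical.

```python
from typing import Any, Dict, Iterable, List


def resolve_allow_list_ids(client: Any, names: Iterable[str]) -> List[str]:
    """Map allow list names to IDs, following pagination tokens."""
    wanted = list(names)
    ids_by_name: Dict[str, str] = {}
    kwargs: Dict[str, str] = {}
    while True:
        page = client.list_allow_lists(**kwargs)
        for entry in page.get("allowLists", []):
            ids_by_name[entry["name"]] = entry["id"]
        if not page.get("nextToken"):
            break
        kwargs = {"nextToken": page["nextToken"]}
    missing = [name for name in wanted if name not in ids_by_name]
    if missing:
        raise LookupError(f"No allow list named: {missing}")
    return [ids_by_name[name] for name in wanted]


def includes_with_allow_lists(
    template: Dict[str, Any],
    add_ids: Iterable[str] = (),
    remove_ids: Iterable[str] = (),
) -> Dict[str, Any]:
    """Return the template's includes with allow list IDs added or removed."""
    includes = dict(template.get("includes") or {})
    current = set(includes.get("allowListIds") or [])
    includes["allowListIds"] = sorted((current | set(add_ids)) - set(remove_ids))
    return includes


# Usage (requires boto3 and credentials; not run here):
#   import boto3
#   macie = boto3.client("macie2")
#   template_id = macie.get_automated_discovery_configuration()[
#       "sensitivityInspectionTemplateId"]
#   template = macie.get_sensitivity_inspection_template(id=template_id)
#   ids = resolve_allow_list_ids(macie, ["example-allow-list"])
#   macie.update_sensitivity_inspection_template(
#       id=template_id,
#       includes=includes_with_allow_lists(template, add_ids=ids),
#       excludes=template.get("excludes") or {})
```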
To add or remove an allow list by using the console, follow these steps.
- Open the Amazon Macie console at https://console.aws.amazon.com/macie/.
- By using the AWS Region selector in the upper-right corner of the page, choose the Region in which you want to add or remove an allow list from analyses.
- In the navigation pane, under Settings, choose Automated sensitive data discovery.
  The Automated sensitive data discovery page appears and displays your current settings. On that page, the Allow lists section specifies allow lists that you already added, or it indicates that you haven't added any allow lists.
- In the Allow lists section, choose Edit.
- Do any of the following:
  - To add one or more allow lists, select the check box for each allow list to add. If a check box is already selected, you already added that list.
  - To remove one or more allow lists, clear the check box for each allow list to remove. If a check box is already cleared, Macie doesn't currently use that list.
  Tip
  To review the settings for an allow list before you add or remove it, choose the link icon next to the list's name. Macie opens a page that displays the list's settings. If the list specifies a regular expression (regex), you can also use this page to test the regex with sample data. To do this, enter up to 1,000 characters of text in the Sample data box, and then choose Test. Macie evaluates the sample data and reports the number of matches.
- When you finish, choose Save.