Retrieving sensitive data samples for a Macie finding
By using Amazon Macie, you can retrieve and reveal samples of sensitive data that Macie reports
in individual sensitive data findings. This includes sensitive data that Macie detects using
managed data identifiers, and data that
matches the criteria of custom data identifiers.
The samples can help you verify the nature of the sensitive data that Macie found. They can also
help you tailor your investigation of an affected Amazon Simple Storage Service (Amazon S3) object and bucket. You can
retrieve and reveal sensitive data samples in all the AWS Regions where Macie is currently
available except the Asia Pacific (Osaka) and Israel (Tel Aviv) Regions.
If you retrieve and reveal sensitive data samples for a finding, Macie uses data in the
corresponding sensitive data discovery
result to locate the first 1–10 occurrences of sensitive data reported by the
finding. Macie then extracts the first 1–128 characters of each occurrence from the
affected S3 object. If a finding reports multiple types of sensitive data, Macie does this for
up to 100 types of sensitive data reported by the finding.
When Macie extracts sensitive data from an affected S3 object, Macie encrypts the data with
an AWS Key Management Service (AWS KMS) key that you specify, temporarily stores the encrypted data in a cache, and
returns the data in your results for the finding. Soon after extraction and encryption, Macie
permanently deletes the data from the cache unless additional retention is temporarily required
to resolve an operational issue.
If you choose to retrieve and reveal sensitive data samples for a finding again, Macie
repeats the process for locating, extracting, encrypting, storing, and ultimately deleting the
samples.
Before you begin
Before you can retrieve and reveal sensitive data samples for findings, you need to configure and enable settings for your Amazon Macie
account. You also need to work with your AWS administrator to verify that you have
the permissions and resources that you need.
When you retrieve and reveal sensitive data samples for a finding, Macie performs a series
of tasks to locate, retrieve, encrypt, and reveal the samples. Macie doesn't use the Macie service-linked role for your account to perform
these tasks. Instead, you use your AWS Identity and Access Management (IAM) identity or allow Macie to assume an
IAM role in your account.
To retrieve and reveal sensitive data samples for a finding, you must have access to the
finding, the corresponding sensitive data discovery result, and the AWS KMS key that you
configured Macie to use to encrypt sensitive data samples. In addition, you or the IAM role
must be allowed to access the affected S3 bucket and the affected S3 object. You or the role
must also be allowed to use the AWS KMS key that was used to encrypt the affected object,
if applicable. If any IAM policies, resource policies, or other permissions settings deny
the requisite access, an error occurs and Macie doesn't return any samples for the
finding.
You must also be allowed to perform the following Macie actions:
The first three actions allow you to access your Macie account and retrieve the details of
findings. The last action allows you to retrieve and reveal sensitive data samples for
findings.
To use the Amazon Macie console to retrieve and reveal sensitive data samples, you must also
be allowed to perform the following action:
macie2:GetSensitiveDataOccurrencesAvailability. This action allows you to
determine whether samples are available for individual findings. You don't need permission to
perform this action to retrieve and reveal samples programmatically. However, having this
permission can streamline your retrieval of samples.
If you're the delegated Macie administrator for an organization and you configured Macie to assume
an IAM role to retrieve sensitive data samples, you must also be allowed to perform the
following action: macie2:GetMember. This action allows you to retrieve
information about the association between your account and an affected account. It enables
Macie to verify that you're currently the Macie administrator for the affected account.
If you're not allowed to perform the requisite actions or access the requisite data and
resources, ask your AWS administrator for assistance.
Determining whether sensitive data samples are available for a finding
To retrieve and reveal sensitive data samples for a finding, the finding needs to meet
certain criteria. It has to include location data for specific occurrences of sensitive data.
In addition, it has to specify the location of a valid, corresponding sensitive data discovery
result. The sensitive data discovery result must be stored in the same AWS Region as the
finding. If you configured Amazon Macie to access affected S3 objects by assuming an AWS Identity and Access Management
(IAM) role, the sensitive data discovery result must also be stored in an S3 object that
Macie signed with a Hash-based Message Authentication Code (HMAC) AWS KMS key.
The affected S3 object also needs to meet certain criteria. The MIME type of the object
must be one of the following:
- application/avro, for an Apache Avro object container (.avro) file
- application/gzip, for a GNU Zip compressed archive (.gz or .gzip) file
- application/json, for a JSON or JSON Lines (.json or .jsonl) file
- application/parquet, for an Apache Parquet (.parquet) file
- application/vnd.openxmlformats-officedocument.spreadsheetml.sheet, for a Microsoft Excel workbook (.xlsx) file
- application/zip, for a ZIP compressed archive (.zip) file
- text/csv, for a CSV (.csv) file
- text/plain, for a non-binary text file other than a CSV, JSON, JSON Lines, or TSV file
- text/tab-separated-values, for a TSV (.tsv) file
In addition, the contents of the S3 object must be the same as when the finding was
created. Macie checks the object's entity tag (ETag) to determine whether it matches the ETag
specified by the finding. Also, the storage size of the object can't exceed the applicable
size quota for retrieving and revealing sensitive data samples. For a list of applicable
quotas, see Quotas for Macie.
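The object criteria above can be sketched as a local pre-check. This is a minimal illustration, not how Macie itself evaluates an object: the function name, its parameters, and the idea of passing the size quota in as an argument are assumptions made for the example. Look up the real quota value in Quotas for Macie.

```python
# MIME types that the preceding list names as eligible for sample retrieval.
SUPPORTED_MIME_TYPES = frozenset({
    "application/avro",
    "application/gzip",
    "application/json",
    "application/parquet",
    "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
    "application/zip",
    "text/csv",
    "text/plain",
    "text/tab-separated-values",
})


def precheck_object(mime_type, finding_etag, current_etag, size_bytes, size_quota_bytes):
    """Return a list of problems (empty if none) based on the documented criteria.

    Hypothetical helper: it mirrors the criteria in the text only and does not
    replace the availability check that Macie performs.
    """
    problems = []
    if mime_type not in SUPPORTED_MIME_TYPES:
        problems.append("unsupported object type")
    # S3 often returns ETags wrapped in double quotes, so compare without them.
    if finding_etag.strip('"') != current_etag.strip('"'):
        problems.append("object changed since the finding was created")
    # The quota is supplied by the caller because it varies by file type.
    if size_bytes > size_quota_bytes:
        problems.append("object exceeds size quota")
    return problems
```

An empty list only means the object passes these local checks. The finding and its sensitive data discovery result must still meet the criteria described earlier.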
If a finding and the affected S3 object meet the preceding criteria, sensitive data
samples are available for the finding. You can optionally determine whether this is the case
for a particular finding before you try to retrieve and reveal samples for it.
To determine whether sensitive data samples are available for a finding
You can use the Amazon Macie console or the Amazon Macie API to determine whether sensitive
data samples are available for a finding.
- Console
-
Follow these steps on the Amazon Macie console to determine whether sensitive data
samples are available for a finding.
To determine whether samples are available for a finding
- Open the Amazon Macie console at https://console.aws.amazon.com/macie/.
- In the navigation pane, choose Findings.
- On the Findings page, choose the finding. The details panel displays information for the finding.
- In the details panel, scroll to the Sensitive data section. Then refer to the Reveal samples field.
If sensitive data samples are available for the finding, a
Review link appears in the field.
If sensitive data samples aren't available for the finding, the Reveal
samples field displays text indicating why:
- Account not in organization – You're not allowed to access the affected S3 object by using Macie. The affected account isn't currently part of your organization. Or the account is part of your organization but Macie isn't currently enabled for the account in the current AWS Region.
- Invalid classification result – There isn't a corresponding sensitive data discovery result for the finding. Or the corresponding sensitive data discovery result isn't available in the current AWS Region, is malformed or corrupted, or uses an unsupported storage format. Macie can't verify the location of the sensitive data to retrieve.
- Invalid result signature – The corresponding sensitive data discovery result is stored in an S3 object that wasn't signed by Macie. Macie can't verify the integrity and authenticity of the sensitive data discovery result. Therefore, Macie can't verify the location of the sensitive data to retrieve.
- Member role too permissive – The trust or permissions policy for the IAM role in the affected member account doesn't meet Macie requirements for restricting access to the role. Or the role's trust policy doesn't specify the correct external ID for your organization. Macie can't assume the role to retrieve the sensitive data.
- Missing GetMember permission – You're not allowed to retrieve information about the association between your account and the affected account. Macie can't determine whether you're allowed to access the affected S3 object as the delegated Macie administrator for the affected account.
- Object exceeds size quota – The storage size of the affected S3 object exceeds the size quota for retrieving and revealing samples of sensitive data from that type of file.
- Object unavailable – The affected S3 object isn't available. The object was renamed, moved, or deleted, or its contents changed after Macie created the finding. Or the object is encrypted with an AWS KMS key that isn't available. For example, the key is disabled, is scheduled for deletion, or was deleted.
- Result not signed – The corresponding sensitive data discovery result is stored in an S3 object that hasn't been signed. Macie can't verify the integrity and authenticity of the sensitive data discovery result. Therefore, Macie can't verify the location of the sensitive data to retrieve.
- Role too permissive – Your account is configured to retrieve occurrences of sensitive data by using an IAM role whose trust or permissions policy doesn't meet Macie requirements for restricting access to the role. Macie can't assume the role to retrieve the sensitive data.
- Unsupported object type – The affected S3 object uses a file or storage format that Macie doesn't support for retrieving and revealing samples of sensitive data. The MIME type of the affected S3 object isn't one of the values in the preceding list.
If there's an issue with the sensitive data discovery result for the finding,
the information in the Detailed result location field of the
finding can help you investigate the issue. This field specifies the original path
to the result in Amazon S3. To investigate an issue with an IAM role, ensure that the
role's policies meet all requirements for Macie to assume the role. For these
details, see Configuring an IAM role to access affected S3 objects.
- API
-
To programmatically determine whether sensitive data samples are available for a
finding, use the GetSensitiveDataOccurrencesAvailability operation of the Amazon Macie API. When
you submit your request, use the findingId parameter to specify the unique
identifier for the finding. To obtain this identifier, you can use the ListFindings operation.
If you're using the AWS Command Line Interface (AWS CLI), run the get-sensitive-data-occurrences-availability command and use the
finding-id parameter to specify the unique identifier for the finding. To
obtain this identifier, you can run the list-findings command.
If your request succeeds and samples are available for the finding, you receive
output similar to the following:
{
    "code": "AVAILABLE",
    "reasons": []
}
If your request succeeds and samples aren't available for the finding, the value for
the code field is UNAVAILABLE and the reasons array specifies why. For example:
{
    "code": "UNAVAILABLE",
    "reasons": [
        "UNSUPPORTED_OBJECT_TYPE"
    ]
}
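As a hedged sketch, the same check could be made from Python with the AWS SDK (boto3). The helper name samples_available is hypothetical. The client argument is assumed to be a boto3 Macie client created with boto3.client("macie2"), whose get_sensitive_data_occurrences_availability method corresponds to the operation described above.

```python
def samples_available(client, finding_id):
    """Return (available, reasons) for a finding.

    `client` is assumed to be a boto3 "macie2" client; `finding_id` is the
    unique identifier for the finding.
    """
    response = client.get_sensitive_data_occurrences_availability(findingId=finding_id)
    # The code field is AVAILABLE or UNAVAILABLE; reasons explains the latter.
    available = response.get("code") == "AVAILABLE"
    return available, list(response.get("reasons", []))


# Example usage (needs boto3 and AWS credentials, so it is left as a comment):
#   import boto3
#   client = boto3.client("macie2")
#   available, reasons = samples_available(client, "your-finding-id")
```

Because the client is passed in, the helper can be tried against a stub object before it is pointed at a real account.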
If there's an issue with the sensitive data discovery result for the finding, the
information in the classificationDetails.detailedResultsLocation field of
the finding can help you investigate the issue. This field specifies the original path
to the result in Amazon S3. To investigate an issue with an IAM role, ensure that the
role's policies meet all requirements for Macie to assume the role. For these details,
see Configuring an IAM role to access affected S3 objects.
Retrieving sensitive data samples for a finding
To retrieve and reveal sensitive data samples for a finding, you can use the Amazon Macie
console or the Amazon Macie API.
- Console
-
Follow these steps to retrieve and reveal sensitive data samples for a finding by
using the Amazon Macie console.
To retrieve and reveal sensitive data samples for a finding
- Open the Amazon Macie console at https://console.aws.amazon.com/macie/.
- In the navigation pane, choose Findings.
- On the Findings page, choose the finding. The details panel displays information for the finding.
- In the details panel, scroll to the Sensitive data section. Then, in the Reveal samples field, choose Review.
If the Review link doesn't appear in the Reveal
samples field, sensitive data samples aren't available for the
finding. To determine why this is the case, see the preceding topic.
After you choose Review, Macie displays a page that summarizes key details of the finding. The details
include the categories, types, and number of occurrences of sensitive data that
Macie found in the affected S3 object.
- In the Sensitive data section of the page, choose
Reveal samples. Macie then retrieves and reveals samples of
the first 1–10 occurrences of sensitive data reported by the finding. Each
sample contains the first 1–128 characters of an occurrence of sensitive
data. It can take several minutes to retrieve and reveal the samples.
If the finding reports multiple types of sensitive data, Macie retrieves and
reveals samples for up to 100 types. For example, the samples can span
multiple categories and types of sensitive data, such as AWS credentials,
US phone numbers, and people's names.
The samples are organized first by
sensitive data category, and then by sensitive data type.
- API
-
To retrieve and reveal sensitive data samples for a finding programmatically, use
the GetSensitiveDataOccurrences operation of the Amazon Macie API. When you submit
your request, use the findingId parameter to specify the unique identifier
for the finding. To obtain this identifier, you can use the ListFindings operation.
To retrieve and reveal sensitive data samples by using the AWS Command Line Interface (AWS CLI), run
the get-sensitive-data-occurrences command and use the finding-id
parameter to specify the unique identifier for the finding. For example:
C:\> aws macie2 get-sensitive-data-occurrences --finding-id "1f1c2d74db5d8caa76859ec52example"
Where 1f1c2d74db5d8caa76859ec52example is the unique
identifier for the finding. To obtain this identifier by using the AWS CLI, you can run
the list-findings command.
If your request succeeds, Macie begins processing your request and you receive output similar to the following:
{
    "status": "PROCESSING"
}
It can take several minutes to process your request. Wait a few minutes, and then
submit your request again.
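The resubmit step can be automated with a simple polling loop. The following is a minimal sketch, assuming a boto3 Macie client (boto3.client("macie2")) is passed in. The helper name, the 30-second delay, and the attempt limit are illustrative choices, not values that Macie prescribes.

```python
import time


def wait_for_samples(client, finding_id, delay_seconds=30, max_attempts=10, sleep=time.sleep):
    """Resubmit the request until the status is no longer PROCESSING.

    `client` is assumed to be a boto3 "macie2" client. Returns the first
    response whose status is not PROCESSING, or raises TimeoutError.
    """
    for attempt in range(max_attempts):
        response = client.get_sensitive_data_occurrences(findingId=finding_id)
        if response.get("status") != "PROCESSING":
            return response
        # Pause between attempts, but not after the final one.
        if attempt < max_attempts - 1:
            sleep(delay_seconds)
    raise TimeoutError(f"Samples for finding {finding_id} still PROCESSING after {max_attempts} attempts")
```

The sleep function is a parameter so the loop can be exercised without real delays. Each call repeats the full retrieval process on the Macie side, so a conservative delay is a reasonable default.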
If Macie can locate, retrieve, and encrypt the sensitive data samples, Macie returns
the samples in a sensitiveDataOccurrences map. The map specifies
1–100 types of sensitive data reported by the finding and 1–10 samples for
each type. Each sample contains the first 1–128 characters of an occurrence of
sensitive data reported by the finding.
In the map, each key is the ID of the managed data identifier that detected the
sensitive data, or the name and unique identifier for the custom data identifier that
detected the sensitive data. The values are samples for the specified managed data
identifier or custom data identifier. For example, the following response provides three
samples of people's names and two samples of AWS secret access keys that were detected
by managed data identifiers (NAME and AWS_CREDENTIALS, respectively).
{
    "sensitiveDataOccurrences": {
        "NAME": [
            {
                "value": "Akua Mansa"
            },
            {
                "value": "John Doe"
            },
            {
                "value": "Martha Rivera"
            }
        ],
        "AWS_CREDENTIALS": [
            {
                "value": "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"
            },
            {
                "value": "je7MtGbClwBF/2Zp9Utk/h3yCo8nvbEXAMPLEKEY"
            }
        ]
    },
    "status": "SUCCESS"
}
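To work with a response like the preceding one in code, the map can be reduced to plain lists of sample values per identifier. This is a small sketch. The function name is hypothetical, and it assumes the response has already been parsed into a Python dictionary with the shape shown above.

```python
def summarize_occurrences(response):
    """Return {identifier key: [sample values]} from a parsed response.

    Each key is a managed data identifier ID or the name and unique
    identifier for a custom data identifier, as described in the text.
    """
    occurrences = response.get("sensitiveDataOccurrences", {})
    return {
        key: [sample["value"] for sample in samples]
        for key, samples in occurrences.items()
    }


example = {
    "sensitiveDataOccurrences": {
        "NAME": [{"value": "Akua Mansa"}, {"value": "John Doe"}, {"value": "Martha Rivera"}],
        "AWS_CREDENTIALS": [
            {"value": "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"},
            {"value": "je7MtGbClwBF/2Zp9Utk/h3yCo8nvbEXAMPLEKEY"},
        ],
    },
    "status": "SUCCESS",
}
summary = summarize_occurrences(example)
```

Because the values are revealed sensitive data, avoid writing the result to logs or other persistent storage.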
If your request succeeds but sensitive data samples aren't available for the
finding, you receive an UnprocessableEntityException message that indicates
why samples aren't available. For example:
{
    "message": "An error occurred (UnprocessableEntityException) when calling the GetSensitiveDataOccurrences operation: OBJECT_UNAVAILABLE"
}
In the preceding example, Macie attempted to retrieve samples from the affected S3
object but the object isn't available anymore. The contents of the object changed after
Macie created the finding.
If your request succeeds but another type of error prevented Macie from retrieving
and revealing sensitive data samples for the finding, you receive output similar to the
following:
{
    "error": "Macie can't retrieve the samples. You're not allowed to access the affected S3 object or the object is encrypted with a key that you're not allowed to use.",
    "status": "ERROR"
}
The value for the status field is ERROR and the error field describes the
error that occurred. The information in the preceding topic can help you
investigate the error.
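The three status values described in this topic (PROCESSING, SUCCESS, and ERROR) can be handled in one place. The following sketch assumes a response that has already been parsed into a dictionary. The exception class and function name are hypothetical. It doesn't cover the UnprocessableEntityException case, which an SDK typically surfaces as a raised exception rather than as a response body.

```python
class SampleRetrievalError(Exception):
    """Raised when a response reports status ERROR (hypothetical helper class)."""


def extract_samples(response):
    """Return the sensitiveDataOccurrences map, or None if still processing.

    Raises SampleRetrievalError with the text of the error field when the
    status is ERROR, and ValueError for any status not described in the text.
    """
    status = response.get("status")
    if status == "SUCCESS":
        return response.get("sensitiveDataOccurrences", {})
    if status == "PROCESSING":
        # The caller should submit the request again after a short wait.
        return None
    if status == "ERROR":
        raise SampleRetrievalError(response.get("error", "No error description provided"))
    raise ValueError(f"Unexpected status: {status!r}")
```

Returning None for PROCESSING keeps the retry decision with the caller, while ERROR becomes an exception so it can't be mistaken for an empty result.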