Using the image and video moderation APIs
In the Amazon Rekognition Image API, you can detect inappropriate, unwanted, or offensive content synchronously using the DetectModerationLabels operation and asynchronously using the StartMediaAnalysisJob and GetMediaAnalysisJob operations. You can use the Amazon Rekognition Video API to detect such content asynchronously by using the StartContentModeration and GetContentModeration operations.
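For example, a minimal boto3 (AWS SDK for Python) sketch of the synchronous image path might look like the following; the Region, bucket, and object names are placeholder assumptions, not values from this guide.

```python
# Minimal sketch: synchronous image moderation with boto3.
# The Region, bucket, and object key below are placeholders.
import boto3

rekognition = boto3.client("rekognition", region_name="us-east-1")

response = rekognition.detect_moderation_labels(
    Image={"S3Object": {"Bucket": "amzn-s3-demo-bucket", "Name": "input.jpg"}}
)

for label in response["ModerationLabels"]:
    print(label["Name"], label.get("ParentName"), label["Confidence"])
```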
Label Categories
Amazon Rekognition uses a three-level hierarchical taxonomy to label categories of inappropriate, unwanted, or offensive content. Each Taxonomy Level 1 (L1) label has a number of Taxonomy Level 2 (L2) labels, and some L2 labels may have Taxonomy Level 3 (L3) labels. This allows a hierarchical classification of the content.
For each detected moderation label, the API also returns TaxonomyLevel, which contains the level (1, 2, or 3) that the label belongs to. For example, an image may be labeled in accordance with the following categorization: L1: Non-Explicit Nudity of Intimate parts and Kissing, L2: Non-Explicit Nudity, L3: Implied Nudity.
Note
We recommend using L1 or L2 categories to moderate your content, and using L3 categories only to remove specific concepts that you don't want to moderate (that is, to detect content that you may not want to categorize as inappropriate, unwanted, or offensive based on your moderation policy).
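As a sketch of that recommendation, the following snippet (reusing the `response` from the earlier boto3 example) keeps only L1 and L2 labels when making moderation decisions:

```python
# Sketch: act only on L1/L2 labels, per the recommendation above.
# `response` is a DetectModerationLabels result, as in the earlier example.
flagged = [
    label
    for label in response["ModerationLabels"]
    if label.get("TaxonomyLevel") in (1, 2)
]
for label in flagged:
    print(f"L{label['TaxonomyLevel']}: {label['Name']} ({label['Confidence']:.1f}%)")
```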
The following table shows the relationships between the category levels and the possible labels for each level.
| Top-Level Category (L1) | Second-Level Category (L2) | Third-Level Category (L3) | Definitions |
|---|---|---|---|
| Explicit | Explicit Nudity | Exposed Male Genitalia | Human male genitalia, including the penis (whether erect or flaccid), the scrotum, and any discernible pubic hair. This term is applicable in contexts involving sexual activity or any visual content where male genitals are displayed either completely or partially. |
| | | Exposed Female Genitalia | External parts of the female reproductive system, encompassing the vulva, vagina, and any observable pubic hair. This term is applicable in scenarios involving sexual activity or any visual content where these aspects of female anatomy are displayed either completely or partially. |
| | | Exposed Buttocks or Anus | Human buttocks or anus, including instances where the buttocks are nude or when they are discernible through sheer clothing. The definition specifically applies to situations where the buttocks or anus are directly and completely visible, excluding scenarios where any form of underwear or clothing provides complete or partial coverage. |
| | | Exposed Female Nipple | Human female nipples, including fully visible and partially visible areola (the area surrounding the nipples) and nipples. |
| | Explicit Sexual Activity | N/A | Depiction of actual or simulated sexual acts, which encompasses human sexual intercourse, oral sex, as well as male genital stimulation and female genital stimulation by other body parts and objects. The term also includes ejaculation or vaginal fluids on body parts and erotic practices or roleplaying involving bondage, discipline, dominance and submission, and sadomasochism. |
| | Sex Toys | N/A | Objects or devices used for sexual stimulation or pleasure, e.g., dildo, vibrator, butt plug, beads, etc. |
| Non-Explicit Nudity of Intimate parts and Kissing | Non-Explicit Nudity | Bare Back | Human posterior part where the majority of the skin is visible from the neck to the end of the spine. This term does not apply when the individual's back is partially or fully occluded. |
| | | Exposed Male Nipple | Human male nipples, including partially visible nipples. |
| | | Partially Exposed Buttocks | Partially exposed human buttocks. This term includes a partially visible region of the buttocks or butt cheeks due to short clothes, or a partially visible top portion of the anal cleft. The term does not apply to cases where the buttocks are fully nude. |
| | | Partially Exposed Female Breast | Partially exposed human female breast where a portion of the breast is visible or uncovered while not revealing the entire breast. This term applies when the region of the inner breast fold is visible, or when the lower breast crease is visible with the nipple fully covered or occluded. |
| | | Implied Nudity | An individual who is nude, either topless or bottomless, but with intimate parts such as buttocks, nipples, or genitalia covered, occluded, or not fully visible. |
| | Obstructed Intimate Parts | Obstructed Female Nipple | Visual depiction of a situation in which a female's nipples are covered by opaque clothing or coverings, but their shapes are clearly visible. |
| | | Obstructed Male Genitalia | Visual depiction of a situation in which a male's genitalia or penis is covered by opaque clothing or coverings, but its shape is clearly visible. This term applies when the obstructed genitalia in the image is in close-up. |
| | Kissing on the Lips | N/A | Depiction of one person's lips making contact with another person's lips. |
| Swimwear or Underwear | Female Swimwear or Underwear | N/A | Human clothing for female swimwear (e.g., one-piece swimsuits, bikinis, tankinis, etc.) and female underwear (e.g., bras, panties, briefs, lingerie, thongs, etc.). |
| | Male Swimwear or Underwear | N/A | Human clothing for male swimwear (e.g., swim trunks, boardshorts, swim briefs, etc.) and male underwear (e.g., briefs, boxers, etc.). |
| Violence | Weapons | N/A | Instruments or devices used to cause harm or damage to living beings, structures, or systems. This includes firearms (e.g., guns, rifles, machine guns, etc.), sharp weapons (e.g., swords, knives, etc.), and explosives and ammunition (e.g., missiles, bombs, bullets, etc.). |
| | Graphic Violence | Weapon Violence | The use of weapons to cause harm, damage, injury, or death to oneself, other individuals, or property. |
| | | Physical Violence | The act of causing harm to other individuals or property (e.g., hitting, fighting, pulling hair, etc.), or other acts of violence involving a crowd or multiple individuals. |
| | | Self-Harm | The act of causing harm to oneself, often by cutting body parts such as arms or legs, where cuts are typically visible. |
| | | Blood & Gore | Visual representation of violence on a person, a group of individuals, or animals, involving open wounds, bloodshed, and mutilated body parts. |
| | | Explosions and Blasts | Depiction of a violent and destructive burst of intense flames with thick smoke, or dust and smoke erupting from the ground. |
| Visually Disturbing | Death and Emaciation | Emaciated Bodies | Human bodies that are extremely thin and undernourished, with severe physical wasting and depletion of muscle and fat tissue. |
| | | Corpses | Human corpses in the form of mutilated bodies, hanging corpses, or skeletons. |
| | Crashes | Air Crash | Incidents involving air vehicles, such as airplanes, helicopters, or other flying vehicles, resulting in damage, injury, or death. This term applies when parts of the air vehicles are visible. |
| Drugs & Tobacco | Products | Pills | Small, solid, often round or oval-shaped tablets or capsules. This term applies to pills presented on their own, in a bottle, or in a transparent packet, and does not apply to a visual depiction of a person taking pills. |
| | Drugs & Tobacco Paraphernalia & Use | Smoking | The act of inhaling, exhaling, and lighting up burning substances, including cigarettes, cigars, e-cigarettes, hookah, or joints. |
| Alcohol | Alcohol Use | Drinking | The act of drinking alcoholic beverages from bottles or glasses of alcohol or liquor. |
| | Alcoholic Beverages | N/A | Close-up of one or multiple bottles of alcohol or liquor, glasses or mugs with alcohol or liquor, and glasses or mugs with alcohol or liquor held by an individual. This term does not apply to an individual drinking from bottles or glasses of alcohol or liquor. |
| Rude Gestures | Middle Finger | N/A | Visual depiction of a hand gesture in which the middle finger is extended upward while the other fingers are folded down. |
| Gambling | N/A | N/A | The act of participating in games of chance for a chance to win a prize in casinos, e.g., playing cards, blackjack, roulette, slot machines at casinos, etc. |
| Hate Symbols | Nazi Party | N/A | Visual depiction of symbols, flags, or gestures associated with the Nazi Party. |
| | White Supremacy | N/A | Visual depiction of symbols or clothing associated with the Ku Klux Klan (KKK), and images with Confederate flags. |
| | Extremist | N/A | Images containing extremist and terrorist group flags. |
Not every label in the L2 category has a supported label in the L3 category. Additionally, the L3 labels under the "Products" and "Drugs & Tobacco Paraphernalia & Use" L2 labels aren't exhaustive. These L2 labels cover concepts beyond the listed L3 labels, and in such cases only the L2 label is returned in the API response.
You determine the suitability of content for your application. For example, images of a suggestive nature might be acceptable, but images containing nudity might not. To filter images, use the ModerationLabel labels array that's returned by DetectModerationLabels (images) and by GetContentModeration (videos).
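For videos, a minimal boto3 sketch of the asynchronous flow might look like the following; the bucket and object names are placeholders, and a production application would typically use the Amazon SNS completion notification instead of polling:

```python
# Sketch: asynchronous video moderation with boto3.
import time

import boto3

rekognition = boto3.client("rekognition", region_name="us-east-1")

start = rekognition.start_content_moderation(
    Video={"S3Object": {"Bucket": "amzn-s3-demo-bucket", "Name": "input.mp4"}}
)
job_id = start["JobId"]

# Poll until the job leaves IN_PROGRESS (pagination via NextToken omitted here).
while True:
    result = rekognition.get_content_moderation(JobId=job_id)
    if result["JobStatus"] != "IN_PROGRESS":
        break
    time.sleep(5)

for detection in result["ModerationLabels"]:
    label = detection["ModerationLabel"]
    print(detection["Timestamp"], label["Name"], label["Confidence"])
```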
Content type
The API can also identify animated or illustrated content types, which are returned as part of the response:
Animated content includes video game and animation (e.g., cartoon, comics, manga, anime).
Illustrated content includes drawing, painting, and sketches.
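As a sketch, assuming the response exposes these content types as a list of name/confidence entries (as in the DetectModerationLabels response shape), they can be read along the following lines:

```python
# Sketch: reading content types from a DetectModerationLabels response.
# `response` is the result from the earlier image example.
for content_type in response.get("ContentTypes", []):
    print(content_type["Name"], content_type["Confidence"])
```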
Confidence
You can set the confidence threshold that Amazon Rekognition uses to detect inappropriate content by specifying the MinConfidence input parameter. Labels aren't returned for inappropriate content that is detected with a lower confidence than MinConfidence.

Specifying a value for MinConfidence that is less than 50% is likely to return a high number of false-positive results (i.e. higher recall, lower precision). On the other hand, specifying a MinConfidence above 50% is likely to return a lower number of false-positive results (i.e. lower recall, higher precision). If you don't specify a value for MinConfidence, Amazon Rekognition returns labels for inappropriate content that is detected with at least 50% confidence.
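For example, a stricter threshold can be passed like this (a sketch reusing the `rekognition` client and placeholder S3 names from the earlier examples):

```python
# Sketch: raising MinConfidence trades recall for precision.
response = rekognition.detect_moderation_labels(
    Image={"S3Object": {"Bucket": "amzn-s3-demo-bucket", "Name": "input.jpg"}},
    MinConfidence=75,  # only labels detected with at least 75% confidence are returned
)
```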
The ModerationLabel array contains labels in the preceding categories, along with an estimated confidence in the accuracy of the recognized content. A top-level label is returned along with any second-level labels that were identified. For example, Amazon Rekognition might return "Explicit" with a high confidence score as a top-level label. That might be enough for your filtering needs. However, if it's necessary, you can use the confidence score of a second-level label (such as "Explicit Nudity") to obtain more granular filtering. For an example, see Detecting inappropriate images.
Versioning
Amazon Rekognition Image and Amazon Rekognition Video both return the version of the moderation detection model that is used to detect inappropriate content (ModerationModelVersion).
Sorting and Aggregating
When retrieving results with GetContentModeration, you can sort and aggregate your results.
Sort order — The array of labels returned is sorted by time. To sort by label, specify NAME in the SortBy input parameter for GetContentModeration. If the label appears multiple times in the video, there will be multiple instances of the ModerationLabel element.

Label information — The ModerationLabels array element contains a ModerationLabel object, which in turn contains the label name and the confidence Amazon Rekognition has in the accuracy of the detected label. Timestamp is the time the ModerationLabel was detected, defined as the number of milliseconds elapsed since the start of the video. For results aggregated by video SEGMENTS, the StartTimestampMillis, EndTimestampMillis, and DurationMillis structures are returned, which define the start time, end time, and duration of a segment respectively.

Aggregation — Specifies how results are aggregated when returned. The default is to aggregate by TIMESTAMPS. You can also choose to aggregate by SEGMENTS, which aggregates results over a time window. Only labels detected during the segments are returned.
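A sketch of these options, reusing the `job_id` from the earlier video example:

```python
# Sketch: sort results by label name and aggregate them over video segments.
result = rekognition.get_content_moderation(
    JobId=job_id,
    SortBy="NAME",
    AggregateBy="SEGMENTS",
)

for detection in result["ModerationLabels"]:
    label = detection["ModerationLabel"]
    print(
        label["Name"],
        detection.get("StartTimestampMillis"),
        detection.get("EndTimestampMillis"),
        detection.get("DurationMillis"),
    )
```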
Custom Moderation adapter statuses
Custom Moderation adapters can be in one of the following statuses: TRAINING_IN_PROGRESS, TRAINING_COMPLETED, TRAINING_FAILED, DELETING, DEPRECATED, or EXPIRED. For a full explanation of these adapter statuses, see Managing adapters.
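As a sketch, assuming your adapter is surfaced as a project version (as in the adapter management APIs), its status can be read through DescribeProjectVersions; the project ARN below is a placeholder:

```python
# Sketch: checking adapter statuses; the project ARN is a placeholder.
response = rekognition.describe_project_versions(
    ProjectArn="arn:aws:rekognition:us-east-1:111122223333:project/my-adapter-project/1234567890123"
)
for version in response["ProjectVersionDescriptions"]:
    print(version["ProjectVersionArn"], version["Status"])
```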
Note
Amazon Rekognition isn't an authority on, and doesn't in any way claim to be an exhaustive filter of, inappropriate or offensive content. Additionally, the image and video moderation APIs don't detect whether an image includes illegal content, such as child sexual abuse material (CSAM).