
Components of a guardrail

Guardrails for Amazon Bedrock consists of a collection of filtering policies that you can configure to avoid undesirable and harmful content and to remove or mask sensitive information for privacy protection.

You can configure the following policies in a guardrail (a combined configuration sketch follows the list):

  • Content filters — You can configure thresholds to block input prompts or model responses containing harmful content such as hate speech, insults, sexual content, violence, misconduct (including criminal activity), and prompt attacks (prompt injection and jailbreaks). For example, an e-commerce site can design its online assistant to avoid using inappropriate language such as hate speech or insults.

  • Denied topics — You can define a set of topics to avoid within your generative AI application. For example, a banking assistant application can be designed to avoid topics related to illegal investment advice.

  • Word filters — You can configure a set of custom words or phrases that you want to detect and block in the interaction between your users and generative AI applications. For example, you can detect and block profanity, as well as specific custom words such as competitor names or other offensive terms.

  • Sensitive information filters — You can detect sensitive content such as Personally Identifiable Information (PII) or custom regex entities in user inputs and FM responses. Depending on your use case, you can reject inputs containing sensitive information or redact it in FM responses. For example, you can redact users’ personal information while generating summaries from customer and agent conversation transcripts.
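All four policy types are configured together when you create a guardrail, for example through the CreateGuardrail API. The sketch below uses the boto3 bedrock client and is a minimal illustration only: the guardrail name, topic definition, custom word, regex pattern, and blocked-request messages are placeholder values, and you should check the current API reference for the exact fields and filter strengths supported by your SDK version.

```python
import boto3

# Illustrative sketch: region, names, words, patterns, and messages below
# are placeholders; adjust them for your own account and use case.
bedrock = boto3.client("bedrock", region_name="us-east-1")

response = bedrock.create_guardrail(
    name="ecommerce-assistant-guardrail",
    description="Blocks harmful content, restricted topics, custom words, and PII.",
    # Content filters: thresholds for harmful content and prompt attacks.
    contentPolicyConfig={
        "filtersConfig": [
            {"type": "HATE", "inputStrength": "HIGH", "outputStrength": "HIGH"},
            {"type": "INSULTS", "inputStrength": "HIGH", "outputStrength": "HIGH"},
            {"type": "SEXUAL", "inputStrength": "HIGH", "outputStrength": "HIGH"},
            {"type": "VIOLENCE", "inputStrength": "HIGH", "outputStrength": "HIGH"},
            {"type": "MISCONDUCT", "inputStrength": "HIGH", "outputStrength": "HIGH"},
            # Prompt attacks are evaluated on user input only.
            {"type": "PROMPT_ATTACK", "inputStrength": "HIGH", "outputStrength": "NONE"},
        ]
    },
    # Denied topics: natural-language definitions of topics to avoid.
    topicPolicyConfig={
        "topicsConfig": [
            {
                "name": "Investment advice",
                "definition": "Guidance or recommendations about investing money or financial products.",
                "examples": ["Which stocks should I buy this year?"],
                "type": "DENY",
            }
        ]
    },
    # Word filters: managed profanity list plus custom words and phrases.
    wordPolicyConfig={
        "wordsConfig": [{"text": "CompetitorName"}],  # placeholder custom word
        "managedWordListsConfig": [{"type": "PROFANITY"}],
    },
    # Sensitive information filters: PII entities and custom regex patterns.
    sensitiveInformationPolicyConfig={
        "piiEntitiesConfig": [
            {"type": "EMAIL", "action": "ANONYMIZE"},
            {"type": "PHONE", "action": "ANONYMIZE"},
        ],
        "regexesConfig": [
            {
                "name": "internal-order-id",
                "description": "Internal order identifiers",
                "pattern": r"ORD-\d{8}",  # hypothetical pattern for illustration
                "action": "BLOCK",
            }
        ],
    },
    blockedInputMessaging="Sorry, I can't help with that request.",
    blockedOutputsMessaging="Sorry, I can't provide that information.",
)

print(response["guardrailId"], response["version"])
```

The identifier and version returned by the call can then be referenced at inference time, for example through the guardrail identifier and version parameters of the InvokeModel or Converse operations, so that the configured policies are applied to both the input prompt and the model response.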