Prerequisites for creating an Amazon Bedrock knowledge base with a structured data store

If you plan to connect an Amazon Bedrock knowledge base to a structured data store, you need to fulfill the prerequisites described in this topic.

Important

Executing arbitrary SQL queries can be a security risk for any Text-to-SQL application. We recommend that you take precautions as needed, such as using restricted roles, read-only databases, and sandboxing.

Review the following topics to ensure that you have all the necessary permissions set up.

Set up an Amazon Redshift query engine

Amazon Bedrock Knowledge Bases uses Amazon Redshift as the query engine for querying your data store. If your data is already in an Amazon Redshift provisioned or serverless query engine, you can skip this prerequisite. Otherwise, set up one of the following types of query engines:

To set up a query engine in Amazon Redshift provisioned
  1. Follow the procedure in Step 1: Create a sample Amazon Redshift cluster in the Amazon Redshift Getting Started Guide.

  2. Note the cluster ID.

  3. (Optional) For more information about Amazon Redshift provisioned clusters, see Amazon Redshift provisioned clusters in the Amazon Redshift Management Guide.

To set up a query engine in Amazon Redshift Serverless
  1. Follow only the setup procedure in Creating a data warehouse with Amazon Redshift Serverless in the Amazon Redshift Getting Started Guide and configure it with default settings.

  2. Note the workgroup ARN.

  3. (Optional) For more information about Amazon Redshift Serverless workgroups, see Workgroups and namespaces in the Amazon Redshift Management Guide.

Gather information about your database

Make sure your data is stored in one of the following supported structured data stores:

  • Amazon Redshift

  • AWS Glue Data Catalog (AWS Lake Formation)

Note

If your data isn't in one of the data sources above, but is in a data store supported for crawling by AWS Glue, you can set up a crawler to write your data store to an AWS Glue table by following the steps at Configuring a crawler in the AWS Glue Developer Guide.

Note the following information for when you create the knowledge base:

  • If your data is stored in an Amazon Redshift database, note the name of the database.

  • If your data is stored in AWS Glue Data Catalog (Amazon SageMaker AI Lakehouse), note the names of the tables that you want your knowledge base to have access to.

Attach permissions to your user role

For an IAM role to perform actions related to Amazon Bedrock Knowledge Bases, you must attach policies to the role that grant permissions to perform the actions. This topic describes permissions that allow a user to create and manage a knowledge base connected to a structured data store. It also describes permissions that allow a user to retrieve information from these knowledge bases and generate responses from them.

The following paragraphs describe how to set up permissions for specific use cases:

To allow an IAM role to create a knowledge base, connect it to a structured data store, manage the knowledge base, and start and manage ingestion jobs from the data source to the knowledge base, you must provide permissions to the KnowledgeBase, DataSource, and IngestionJob actions.

Note

If the role has the AmazonBedrockFullAccess AWS managed policy attached, you can skip this prerequisite.

To provide permissions to tag knowledge bases, include permissions to bedrock:TagResource and bedrock:UntagResource. To allow a role to perform these actions, attach the following policy to the role:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "CreateKB",
            "Effect": "Allow",
            "Action": [
                "bedrock:CreateKnowledgeBase"
            ],
            "Resource": "*"
        },
        {
            "Sid": "KBDataSourceManagement",
            "Effect": "Allow",
            "Action": [
                "bedrock:GetKnowledgeBase",
                "bedrock:ListKnowledgeBases",
                "bedrock:UpdateKnowledgeBase",
                "bedrock:DeleteKnowledgeBase",
                "bedrock:StartIngestionJob",
                "bedrock:GetIngestionJob",
                "bedrock:ListIngestionJobs",
                "bedrock:StopIngestionJob",
                "bedrock:TagResource",
                "bedrock:UntagResource"
            ],
            "Resource": [
                "arn:${Partition}:bedrock:${Region}:${Account}:knowledge-base/*"
            ]
        }
    ]
}

After you create a knowledge base, we recommend that you scope down the permissions in the KBDataSourceManagement statement by replacing the wildcard (*) with the ID of the knowledge base that you created.
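The scoping step described above can be sketched in Python. This is a minimal illustration rather than part of the official procedure: the function name scope_policy_to_knowledge_base, the sample account and Region values, and the knowledge base ID KB12345678 are all hypothetical, and the policy is assumed to already be loaded as a dictionary with its placeholders filled in.

```python
import copy
import json


def scope_policy_to_knowledge_base(policy, knowledge_base_id, sid="KBDataSourceManagement"):
    """Return a copy of `policy` in which the statement named `sid` has its
    knowledge-base wildcard replaced with a specific knowledge base ID."""
    scoped = copy.deepcopy(policy)  # leave the caller's policy untouched
    for statement in scoped["Statement"]:
        if statement.get("Sid") != sid:
            continue
        resources = statement["Resource"]
        if isinstance(resources, str):
            resources = [resources]
        # Swap only the trailing "*" of knowledge-base ARNs for the ID.
        statement["Resource"] = [
            r[:-1] + knowledge_base_id if r.endswith(":knowledge-base/*") else r
            for r in resources
        ]
    return scoped


# Hypothetical example values; substitute your own partition, Region, and account.
example_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "CreateKB",
            "Effect": "Allow",
            "Action": ["bedrock:CreateKnowledgeBase"],
            "Resource": "*",
        },
        {
            "Sid": "KBDataSourceManagement",
            "Effect": "Allow",
            "Action": ["bedrock:GetKnowledgeBase", "bedrock:StartIngestionJob"],
            "Resource": ["arn:aws:bedrock:us-east-1:111122223333:knowledge-base/*"],
        },
    ],
}

scoped_policy = scope_policy_to_knowledge_base(example_policy, "KB12345678")
print(json.dumps(scoped_policy, indent=2))
```

The CreateKB statement keeps its wildcard because a knowledge base ID does not exist until after creation; only the management statement is narrowed.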

To allow an IAM role to query a knowledge base connected to a structured data store, attach the following policy to the role:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "GetKB",
            "Effect": "Allow",
            "Action": [
                "bedrock:GetKnowledgeBase"
            ],
            "Resource": [
                "arn:${Partition}:bedrock:${Region}:${Account}:knowledge-base/${KnowledgeBaseId}"
            ]
        },
        {
            "Sid": "GenerateQueryAccess",
            "Effect": "Allow",
            "Action": [
                "bedrock:GenerateQuery",
                "sqlworkbench:GetSqlRecommendations"
            ],
            "Resource": "*"
        },
        {
            "Sid": "Retrieve",
            "Effect": "Allow",
            "Action": [
                "bedrock:Retrieve"
            ],
            "Resource": [
                "arn:${Partition}:bedrock:${Region}:${Account}:knowledge-base/${KnowledgeBaseId}"
            ]
        },
        {
            "Sid": "RetrieveAndGenerate",
            "Effect": "Allow",
            "Action": [
                "bedrock:RetrieveAndGenerate"
            ],
            "Resource": [
                "*"
            ]
        }
    ]
}

You can remove statements that you don't need, depending on your use case:

  • The GetKB and GenerateQueryAccess statements are required to call GenerateQuery to generate SQL queries that take into account user queries and your connected data source.

  • The Retrieve statement is required to call Retrieve to retrieve data from your structured data store.

  • The RetrieveAndGenerate statement is required to call RetrieveAndGenerate to retrieve data from your structured data store and generate responses based on the data.
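As a rough sketch of how the list above maps API operations to policy statements, the following Python builds a query policy that contains only the statements a given set of operations needs. The helper name build_query_policy, the REQUIRED_SIDS table, and the sample ARN are illustrative assumptions; the statement contents mirror the policy shown earlier.

```python
import json

# Statements each API operation needs, following the list above.
REQUIRED_SIDS = {
    "GenerateQuery": ["GetKB", "GenerateQueryAccess"],
    "Retrieve": ["Retrieve"],
    "RetrieveAndGenerate": ["RetrieveAndGenerate"],
}


def build_query_policy(operations, knowledge_base_arn):
    """Build a policy containing only the statements required by `operations`."""
    statements = {
        "GetKB": {
            "Sid": "GetKB",
            "Effect": "Allow",
            "Action": ["bedrock:GetKnowledgeBase"],
            "Resource": [knowledge_base_arn],
        },
        "GenerateQueryAccess": {
            "Sid": "GenerateQueryAccess",
            "Effect": "Allow",
            "Action": ["bedrock:GenerateQuery", "sqlworkbench:GetSqlRecommendations"],
            "Resource": "*",
        },
        "Retrieve": {
            "Sid": "Retrieve",
            "Effect": "Allow",
            "Action": ["bedrock:Retrieve"],
            "Resource": [knowledge_base_arn],
        },
        "RetrieveAndGenerate": {
            "Sid": "RetrieveAndGenerate",
            "Effect": "Allow",
            "Action": ["bedrock:RetrieveAndGenerate"],
            "Resource": ["*"],
        },
    }
    wanted = set()
    for operation in operations:
        wanted.update(REQUIRED_SIDS[operation])  # KeyError on an unknown operation
    # Dictionaries keep insertion order, so statements stay in the original order.
    selected = [stmt for sid, stmt in statements.items() if sid in wanted]
    return {"Version": "2012-10-17", "Statement": selected}


# Hypothetical ARN for illustration only.
kb_arn = "arn:aws:bedrock:us-east-1:111122223333:knowledge-base/KB12345678"
retrieve_only = build_query_policy(["Retrieve"], kb_arn)
print(json.dumps(retrieve_only, indent=2))
```

Building the policy from the operations you actually call keeps the role closer to least privilege than attaching all four statements by default.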

To further restrict permissions, you can omit actions, or you can specify resources and condition keys by which to filter permissions. For more information about actions, resources, and condition keys, see the Service Authorization Reference.

Create and set up permissions for your Amazon Bedrock Knowledge Bases service role

Amazon Bedrock Knowledge Bases uses a service role to connect knowledge bases to structured data stores, retrieve data from these data stores, and generate SQL queries based on user queries and the structure of the data stores.

Note

If you plan to use the AWS Management Console to create a knowledge base, you can skip this prerequisite. The console will create an Amazon Bedrock Knowledge Bases service role with the proper permissions.

To create a custom IAM service role with the proper permissions, follow the steps at Create a role to delegate permissions to an AWS service and attach the trust relationship defined in Trust relationship.
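For orientation, the following Python sketch builds a trust policy of the general shape used by Amazon Bedrock service roles. Treat it as an assumption rather than the authoritative definition: the Trust relationship topic referenced above is the source of truth. The function name build_trust_policy and the sample account and Region values are hypothetical.

```python
import json


def build_trust_policy(account_id, region):
    """Return a trust policy that lets Amazon Bedrock assume the role, limited
    to knowledge bases in one account and Region (assumed shape; verify it
    against the Trust relationship topic)."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "bedrock.amazonaws.com"},
                "Action": "sts:AssumeRole",
                # These conditions guard against the confused deputy problem.
                "Condition": {
                    "StringEquals": {"aws:SourceAccount": account_id},
                    "ArnLike": {
                        "aws:SourceArn": f"arn:aws:bedrock:{region}:{account_id}:knowledge-base/*"
                    },
                },
            }
        ],
    }


# Hypothetical account ID and Region for illustration only.
trust_policy = build_trust_policy("111122223333", "us-east-1")
print(json.dumps(trust_policy, indent=2))
```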

Then, add permissions for your knowledge base to access your Amazon Redshift query engine and databases.

Attach the following policy to your custom service role to allow it to access your data and generate queries using it:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "RedshiftDataAPIStatementPermissions",
            "Effect": "Allow",
            "Action": [
                "redshift-data:GetStatementResult",
                "redshift-data:DescribeStatement",
                "redshift-data:CancelStatement"
            ],
            "Resource": [
                "*"
            ],
            "Condition": {
                "StringEquals": {
                    "redshift-data:statement-owner-iam-userid": "${aws:userid}"
                }
            }
        },
        {
            "Sid": "RedshiftDataAPIExecutePermissions",
            "Effect": "Allow",
            "Action": [
                "redshift-data:ExecuteStatement"
            ],
            "Resource": [
                "arn:aws:redshift:${Region}:${Account}:cluster:${Cluster}"
            ]
        },
        {
            "Sid": "SqlWorkbenchAccess",
            "Effect": "Allow",
            "Action": [
                "sqlworkbench:GetSqlRecommendations",
                "sqlworkbench:PutSqlGenerationContext",
                "sqlworkbench:GetSqlGenerationContext",
                "sqlworkbench:DeleteSqlGenerationContext"
            ],
            "Resource": "*"
        },
        {
            "Sid": "GenerateQueryAccess",
            "Effect": "Allow",
            "Action": [
                "bedrock:GenerateQuery"
            ],
            "Resource": "*"
        }
    ]
}

You also need to add permissions that allow your service role to authenticate to the query engine. The permissions to attach depend on the type of query engine and on your authentication method.

To allow your service role to authenticate to your Amazon Redshift provisioned query engine with IAM, attach the following policy to your custom service role:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "GetCredentialsWithFederatedIAMCredentials",
            "Effect": "Allow",
            "Action": "redshift:GetClusterCredentialsWithIAM",
            "Resource": [
                "arn:aws:redshift:${region}:${account}:dbname:${cluster}/${database}"
            ]
        }
    ]
}

Note

If your data is stored in AWS Glue Data Catalog, replace ${database} with dev.

To authenticate as an Amazon Redshift database user, attach the following policy to the service role:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "GetCredentialsWithClusterCredentials",
            "Effect": "Allow",
            "Action": [
                "redshift:GetClusterCredentials"
            ],
            "Resource": [
                "arn:aws:redshift:${region}:${account}:dbuser:${cluster}/${dbuser}",
                "arn:aws:redshift:${region}:${account}:dbname:${cluster}/${database}"
            ]
        }
    ]
}
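The two resource ARNs in the policy above follow the patterns dbuser:cluster/user and dbname:cluster/database. As a small sketch, assuming hypothetical helper names (redshift_db_arns, build_cluster_credentials_policy) and sample values, they can be assembled like this:

```python
import json


def redshift_db_arns(region, account, cluster, database, db_user):
    """Return the dbuser and dbname ARNs for an Amazon Redshift provisioned cluster."""
    base = f"arn:aws:redshift:{region}:{account}"
    return [
        f"{base}:dbuser:{cluster}/{db_user}",
        f"{base}:dbname:{cluster}/{database}",
    ]


def build_cluster_credentials_policy(region, account, cluster, database, db_user):
    """Build the database-user authentication policy shown above."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "GetCredentialsWithClusterCredentials",
                "Effect": "Allow",
                "Action": ["redshift:GetClusterCredentials"],
                "Resource": redshift_db_arns(region, account, cluster, database, db_user),
            }
        ],
    }


# Hypothetical values for illustration only.
credentials_policy = build_cluster_credentials_policy(
    "us-east-1", "111122223333", "my-cluster", "dev", "kb_reader"
)
print(json.dumps(credentials_policy, indent=2))
```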

To allow your service role to authenticate to your Amazon Redshift provisioned query engine with an AWS Secrets Manager secret, do the following:

  • Attach the following policy to the role:

    {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "GetSecretPermissions",
                "Effect": "Allow",
                "Action": [
                    "secretsmanager:GetSecretValue"
                ],
                "Resource": [
                    "arn:aws:secretsmanager:${region}:${account}:secret:${secretName}"
                ]
            }
        ]
    }

For an Amazon Redshift Serverless query engine, the permissions to attach depend on your authentication method.

To allow your service role to authenticate to your Amazon Redshift Serverless query engine with IAM, attach the following policy to your custom service role:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "RedshiftServerlessGetCredentials",
            "Effect": "Allow",
            "Action": "redshift-serverless:GetCredentials",
            "Resource": [
                "arn:aws:redshift-serverless:${Region}:${Account}:workgroup:${WorkgroupId}"
            ]
        }
    ]
}

To allow your service role to authenticate to your Amazon Redshift Serverless query engine with an AWS Secrets Manager secret, do the following:

  • Attach the following policy to the role:

    {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "GetSecretPermissions",
                "Effect": "Allow",
                "Action": [
                    "secretsmanager:GetSecretValue"
                ],
                "Resource": [
                    "arn:aws:secretsmanager:${region}:${account}:secret:${secretName}"
                ]
            }
        ]
    }

Grant database access to the role you use for authentication

To grant database access to the role you use for authentication, using Amazon Redshift query editor v2, run the following SQL commands:

  1. (If you authenticate with IAM and a user wasn't already created for your database) Run the following command, which uses CREATE USER to create a database user and allow it to authenticate through IAM, replacing ${service-role} with the name of the custom Amazon Bedrock Knowledge Bases service role you created:

    CREATE USER "IAMR:${service-role}" WITH PASSWORD DISABLE;

    Important

    If you use the Amazon Bedrock Knowledge Bases service role created for you in the console and then sync your data store before you do this step, the user will be created for you, but the sync will fail because the user hasn't been granted permissions to access your data store. You must carry out the following step before syncing.

  2. Grant the database user permissions to retrieve information from your database by running the GRANT command. You can scope to specific databases, tables, or rows or columns in tables by replacing ${tableName} or ${dbName}.

    Run the command that corresponds to the service that your data is stored in:

    • If your data is stored in Amazon Redshift, run the command that corresponds to your authentication method.

      IAM
      GRANT SELECT ON ${tableName} TO "IAMR:${serviceRole}";
      Database user
      GRANT SELECT ON ${tableName} TO "${dbUser}";
      AWS Secrets Manager username
      GRANT SELECT ON ${tableName} TO "${secretsUsername}";
    • If your data is stored in AWS Glue Data Catalog, run the command that corresponds to your authentication method.

      IAM
      GRANT USAGE ON DATABASE ${schemaName} TO "IAMR:${serviceRole}";
      Database user
      GRANT USAGE ON DATABASE ${schemaName} TO "${dbUser}";
      AWS Secrets Manager username
      GRANT USAGE ON DATABASE ${schemaName} TO "${secretsUsername}";

    Important

    Don't grant CREATE, UPDATE, or DELETE access. Granting these actions can lead to unintended modification of your data.
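The SQL commands in the steps above differ only by authentication method and data store, which the following Python sketch makes explicit. It is an illustration under stated assumptions: the function names and the auth_method and data_store labels are hypothetical, only IAM roles are assumed to take the IAMR: prefix, and the quote and semicolon check is a basic guard rather than full SQL escaping.

```python
def database_user(auth_method, name):
    """Map an authentication method and a name to the Redshift database user."""
    if auth_method == "iam":
        return f"IAMR:{name}"  # IAM roles appear as IAMR:<role name>
    if auth_method in ("database_user", "secrets_manager"):
        return name
    raise ValueError(f"unknown authentication method: {auth_method}")


def build_grant_sql(auth_method, name, data_store, target, create_user=False):
    """Return the SQL statements to run in query editor v2, in order.

    `target` is a table name for "redshift" or a schema name for "glue".
    Only read access is granted, in line with the warning above.
    """
    for value in (name, target):
        if '"' in value or ";" in value:
            raise ValueError(f"unsafe identifier: {value!r}")
    user = database_user(auth_method, name)
    statements = []
    if create_user and auth_method == "iam":
        statements.append(f'CREATE USER "{user}" WITH PASSWORD DISABLE;')
    if data_store == "redshift":
        statements.append(f'GRANT SELECT ON {target} TO "{user}";')
    elif data_store == "glue":
        statements.append(f'GRANT USAGE ON DATABASE {target} TO "{user}";')
    else:
        raise ValueError(f"unknown data store: {data_store}")
    return statements


# Hypothetical role and table names for illustration only.
for sql in build_grant_sql("iam", "MyKBServiceRole", "redshift", "public.sales", create_user=True):
    print(sql)
```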

(If your data is stored in AWS Glue Data Catalog) Additional required permissions

Whether you use the Amazon Bedrock Knowledge Bases service role that the AWS Management Console creates for you or a custom role that you created yourself, you need to configure the following permissions to allow access to your data if it is stored in AWS Glue Data Catalog:

  • To allow access to your AWS Glue Data Catalog databases, attach the following permissions to the service role:

    {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "VisualEditor0",
                "Effect": "Allow",
                "Action": [
                    "glue:GetDatabases",
                    "glue:GetDatabase",
                    "glue:GetTables",
                    "glue:GetTable",
                    "glue:GetPartitions",
                    "glue:GetPartition",
                    "glue:SearchTables"
                ],
                "Resource": [
                    "arn:aws:glue:${Region}:${Account}:table/${DatabaseName}/${TableName}",
                    "arn:aws:glue:${Region}:${Account}:database/${DatabaseName}",
                    "arn:aws:glue:${Region}:${Account}:catalog"
                ]
            }
        ]
    }

  • Grant permissions to your service role through AWS Lake Formation (to learn more about Lake Formation and its relationship with Amazon Redshift, see Redshift Spectrum and AWS Lake Formation) by doing the following:

    1. Sign in to the AWS Management Console, and open the Lake Formation console at https://console.aws.amazon.com/lakeformation/.

    2. Select Data permissions from the left navigation pane.

    3. Grant permissions to the service role you're using for Amazon Bedrock Knowledge Bases.

    4. Grant Describe and Select permissions for your databases and tables.

  • Depending on the data source you use in AWS Glue Data Catalog, you might need to also add permissions to access that data source (for more information, see AWS Glue dependency on other AWS services). For example, if your data source is in an Amazon S3 location, you'll need to add the following statement to the policy above.

    {
        "Sid": "Statement1",
        "Effect": "Allow",
        "Action": [
            "s3:ListBucket",
            "s3:GetObject"
        ],
        "Resource": [
            "arn:aws:s3:::${BucketName}",
            "arn:aws:s3:::${BucketName}/*"
        ]
    }
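Adding the Amazon S3 statement to the AWS Glue policy amounts to appending one more entry to the policy's Statement array. A minimal sketch, assuming a hypothetical helper add_s3_read_access and sample resource names:

```python
import copy
import json


def add_s3_read_access(policy, bucket_name, sid="Statement1"):
    """Return a copy of `policy` with a read-only statement for one S3 bucket."""
    updated = copy.deepcopy(policy)  # do not mutate the caller's policy
    updated["Statement"].append(
        {
            "Sid": sid,
            "Effect": "Allow",
            "Action": ["s3:ListBucket", "s3:GetObject"],
            "Resource": [
                f"arn:aws:s3:::{bucket_name}",      # the bucket, for ListBucket
                f"arn:aws:s3:::{bucket_name}/*",    # its objects, for GetObject
            ],
        }
    )
    return updated


# Hypothetical Glue policy and bucket name for illustration only.
glue_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": ["glue:GetDatabase", "glue:GetTable"],
            "Resource": ["arn:aws:glue:us-east-1:111122223333:catalog"],
        }
    ],
}

glue_policy_with_s3 = add_s3_read_access(glue_policy, "my-data-bucket")
print(json.dumps(glue_policy_with_s3, indent=2))
```

Both ARNs are needed because s3:ListBucket applies to the bucket itself while s3:GetObject applies to the objects inside it.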

Request access to foundation models for RetrieveAndGenerate

If you plan to use RetrieveAndGenerate to generate responses based on retrieved data from your data source, request access to the foundation models to use for generation by following the steps at Access Amazon Bedrock foundation models.