

# Build a knowledge base by connecting to a structured data store
<a name="knowledge-base-build-structured"></a>

Amazon Bedrock Knowledge Bases allows you to connect to structured data stores, which contain data that conforms to a predefined schema. Examples of structured data include tables and databases. Amazon Bedrock Knowledge Bases can convert user queries into language that is suitable for extracting data from supported structured data stores. It can then use the converted query to retrieve data that is relevant to the query and generate appropriate responses. This enables you to use the existing structured data directly without having to convert it to a different format or generate your own SQL queries.

After you set up your knowledge base, you can submit queries to retrieve data from it through the [Retrieve](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_Retrieve.html) operation, or generate responses from the retrieved data through the [RetrieveAndGenerate](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_RetrieveAndGenerate.html) operation. Behind the scenes, these operations convert user queries into ones that are appropriate for the structured data store connected to the knowledge base.
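
As an illustration, the following AWS SDK for Python (Boto3) sketch builds a RetrieveAndGenerate request for a knowledge base and prints it. The knowledge base ID, model ARN, and question text are placeholder assumptions; the commented lines show how the request would be sent once credentials are configured:

```python
import json

# Hypothetical identifiers -- replace with your own values.
KB_ID = "EXAMPLEKBID"
MODEL_ARN = "arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-sonnet-20240229-v1:0"

# Request parameters for RetrieveAndGenerate against a knowledge base.
params = {
    "input": {"text": "What were the total sales last quarter?"},
    "retrieveAndGenerateConfiguration": {
        "type": "KNOWLEDGE_BASE",
        "knowledgeBaseConfiguration": {
            "knowledgeBaseId": KB_ID,
            "modelArn": MODEL_ARN,
        },
    },
}

# With AWS credentials configured, you would send the request with:
#   import boto3
#   client = boto3.client("bedrock-agent-runtime")
#   response = client.retrieve_and_generate(**params)
#   print(response["output"]["text"])
print(json.dumps(params, indent=2))
```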

You also have the option to convert queries independently of retrieving data by using the [GenerateQuery](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_GenerateQuery.html) API operation. This operation converts natural language queries into SQL queries that are appropriate to the data source being queried. You can use this operation independently and insert it into your workflow.
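
Similarly, a GenerateQuery request can be sketched with Boto3 as follows. The knowledge base ARN and the natural language question are placeholder assumptions, and the commented response handling is a hedged guess at how you might read the generated SQL:

```python
import json

# Hypothetical knowledge base ARN -- replace with your own.
KB_ARN = "arn:aws:bedrock:us-east-1:123456789012:knowledge-base/EXAMPLEKBID"

# Request parameters for GenerateQuery: natural language in, SQL out.
params = {
    "queryGenerationInput": {
        "type": "TEXT",
        "text": "List the ten customers with the highest order totals",
    },
    "transformationConfiguration": {
        "mode": "TEXT_TO_SQL",
        "textToSqlConfiguration": {
            "type": "KNOWLEDGE_BASE",
            "knowledgeBaseConfiguration": {"knowledgeBaseArn": KB_ARN},
        },
    },
}

# With AWS credentials configured:
#   import boto3
#   client = boto3.client("bedrock-agent-runtime")
#   response = client.generate_query(**params)
#   for q in response.get("queries", []):
#       print(q.get("sql"))
print(json.dumps(params, indent=2))
```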

Select a topic to learn about the prerequisites and process for connecting your knowledge base to a structured data store.

**Topics**
+ [Set up your query engine and permissions for creating a knowledge base with structured data store](knowledge-base-prereq-structured.md)
+ [Create a knowledge base by connecting to a structured data store](knowledge-base-structured-create.md)
+ [Sync your structured data store with your Amazon Bedrock knowledge base](kb-data-source-structured-sync-ingest.md)
+ [Cross-region inference for knowledge bases with structured data store](kb-structured-cris.md)

# Set up your query engine and permissions for creating a knowledge base with structured data store
<a name="knowledge-base-prereq-structured"></a>

This topic describes the prerequisites that you must fulfill, and the permissions that you need, before connecting an Amazon Bedrock knowledge base to a structured data store. For the general permissions requirements, see [Set up permissions for a user or role to create and manage knowledge bases](knowledge-base-prereq-permissions-general.md).

**Important**  
Executing arbitrary SQL queries can be a security risk for any Text-to-SQL application. We recommend that you take precautions as needed, such as using restricted roles, read-only databases, and sandboxing.

Amazon Bedrock Knowledge Bases uses Amazon Redshift as the query engine for querying your data store. A query engine accesses metadata from a structured data store and uses the metadata to help generate SQL queries. Amazon Redshift is a data warehouse service that uses SQL to analyze structured data across data warehouses, databases, and data lakes.

## Create Amazon Redshift query engine
<a name="kb-query-engine-setup-create"></a>

You can use Amazon Redshift Serverless or Amazon Redshift Provisioned, depending on your use case, and connect to workgroups or clusters for your data warehouse. The Amazon Redshift engine can query data stored natively in Amazon Redshift clusters, or data registered in the default AWS Glue Data Catalog (such as data in Amazon S3).

If you've already created a query engine, you can skip this prerequisite. Otherwise, perform the following steps to set up your Amazon Redshift provisioned or Amazon Redshift Serverless query engine:

**To set up a query engine in Amazon Redshift provisioned**

1. Follow the procedure in [Step 1: Create a sample Amazon Redshift cluster](https://docs.aws.amazon.com/redshift/latest/gsg/new-user.html#rs-gsg-launch-sample-cluster) in the Amazon Redshift Getting Started Guide.

1. Note the cluster ID.

1. (Optional) For more information about Amazon Redshift provisioned clusters, see [Amazon Redshift provisioned clusters](https://docs.aws.amazon.com/redshift/latest/mgmt/working-with-clusters.html) in the Amazon Redshift Management Guide.

**To set up a query engine in Amazon Redshift Serverless**

1. Follow only the setup procedure in [Creating a data warehouse with Amazon Redshift Serverless](https://docs.aws.amazon.com/redshift/latest/gsg/new-user-serverless.html#serverless-console-resource-creation) in the Amazon Redshift Getting Started Guide and configure it with default settings.

1. Note the workgroup ARN.

1. (Optional) For more information about Amazon Redshift Serverless workgroups, see [Workgroups and namespaces](https://docs.aws.amazon.com/redshift/latest/mgmt/serverless-workgroup-namespace.html) in the Amazon Redshift Management Guide.

## Configure Amazon Redshift query engine permissions
<a name="kb-query-engine-setup-redshift-permissions"></a>

Depending on the Amazon Redshift query engine that you choose, you can configure certain permissions. The permissions that you configure depend on the authentication method. The following table shows the authentication methods that can be used for different query engines:



| Authentication method | Amazon Redshift Provisioned | Amazon Redshift Serverless | 
| --- | --- | --- | 
| IAM | Yes | Yes | 
| Database username | Yes | No | 
| AWS Secrets Manager | Yes | Yes | 

Amazon Bedrock Knowledge Bases uses a [service role](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_create_for-service.html) to connect knowledge bases to structured data stores, retrieve data from these data stores, and generate SQL queries based on user queries and the structure of the data stores.

**Note**  
If you plan to use the AWS Management Console to create a knowledge base, you can skip this prerequisite. The console will create an Amazon Bedrock Knowledge Bases service role with the proper permissions.

To create a custom IAM service role with the proper permissions, follow the steps at [Create a role to delegate permissions to an AWS service](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_create_for-service.html) and attach the trust relationship defined in [Trust relationship](kb-permissions.md#kb-permissions-trust).
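
As a minimal sketch, creating such a role could be scripted as follows. The role name, account ID, and Region are placeholder assumptions, and the authoritative trust relationship is the one defined in the linked topic:

```python
import json

# Trust policy allowing Amazon Bedrock to assume the role. The account ID,
# Region, and role name below are placeholders -- substitute your own values
# and the trust relationship from the linked documentation.
trust_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"Service": "bedrock.amazonaws.com"},
            "Action": "sts:AssumeRole",
            "Condition": {
                "StringEquals": {"aws:SourceAccount": "123456789012"},
                "ArnLike": {
                    "aws:SourceArn": "arn:aws:bedrock:us-east-1:123456789012:knowledge-base/*"
                },
            },
        }
    ],
}

# With AWS credentials configured, you would create the role with:
#   import boto3
#   iam = boto3.client("iam")
#   iam.create_role(
#       RoleName="MyKnowledgeBaseRole",
#       AssumeRolePolicyDocument=json.dumps(trust_policy),
#   )
print(json.dumps(trust_policy, indent=2))
```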

Then, add permissions for your knowledge base to access your Amazon Redshift query engine and databases. Expand the section that applies to your use case:

### Your query engine is Amazon Redshift provisioned
<a name="w2aac28c10c27c13c11c15b1"></a>

Attach the following policy to your custom service role to allow it to access your data and generate queries using it:

------
#### [ JSON ]


```
{
    "Version":"2012-10-17",		 	 	 
    "Statement": [
        {
            "Sid": "RedshiftDataAPIStatementPermissions",
            "Effect": "Allow",
            "Action": [
                "redshift-data:GetStatementResult",
                "redshift-data:DescribeStatement",
                "redshift-data:CancelStatement"
            ],
            "Resource": [
                "*"
            ],
            "Condition": {
                "StringEquals": {
                    "redshift-data:statement-owner-iam-userid": "${aws:userid}"
                }
            }
        },
        {
            "Sid": "RedshiftDataAPIExecutePermissions",
            "Effect": "Allow",
            "Action": [
                "redshift-data:ExecuteStatement"
            ],
            "Resource": [
                "arn:aws:redshift:us-east-1:123456789012:cluster:${Cluster}"
            ]
        },
        {
            "Sid": "SqlWorkbenchAccess",
            "Effect": "Allow",
            "Action": [
                "sqlworkbench:GetSqlRecommendations",
                "sqlworkbench:PutSqlGenerationContext",
                "sqlworkbench:GetSqlGenerationContext",
                "sqlworkbench:DeleteSqlGenerationContext"
            ],
            "Resource": "*"
        },
        {
            "Sid": "GenerateQueryAccess",
            "Effect": "Allow",
            "Action": [
                "bedrock:GenerateQuery"
            ],
            "Resource": "*"
        }
    ]
}
```

------

You also need to add permissions to allow your service role to authenticate to the query engine. Expand a section to see the permissions for that method.

------
#### [ IAM ]

To allow your service role to authenticate to your Amazon Redshift provisioned query engine with IAM, attach the following policy to your custom service role:

------
#### [ JSON ]


```
{
    "Version":"2012-10-17",		 	 	 
    "Statement": [
        {
            "Sid": "GetCredentialsWithFederatedIAMCredentials",
            "Effect": "Allow",
            "Action": "redshift:GetClusterCredentialsWithIAM",
            "Resource": [
                "arn:aws:redshift:us-east-1:123456789012:dbname:Cluster/database"
            ]
        }
    ]
}
```

------

------
#### [ Database user ]

To authenticate as an Amazon Redshift database user, attach the following policy to the service role:

------
#### [ JSON ]


```
{
    "Version":"2012-10-17",		 	 	 
    "Statement": [
        {
            "Sid": "GetCredentialsWithClusterCredentials",
            "Effect": "Allow",
            "Action": [
                "redshift:GetClusterCredentials"
            ],
            "Resource": [
                "arn:aws:redshift:us-east-1:123456789012:dbuser:${cluster}/${dbuser}",
                "arn:aws:redshift:us-east-1:123456789012:dbname:${cluster}/${database}"
            ]
        }
    ]
}
```

------

------
#### [ AWS Secrets Manager ]

To allow your service role to authenticate to your Amazon Redshift provisioned query engine with an AWS Secrets Manager secret, do the following:
+ Attach the following policy to the role:

  ```
  {
      "Version": "2012-10-17",		 	 	 
      "Statement": [
          {
              "Sid": "GetSecretPermissions",
              "Effect": "Allow",
              "Action": [
                  "secretsmanager:GetSecretValue"
              ],
              "Resource": [
                  "arn:aws:secretsmanager:${region}:${account}:secret:${secretName}"
              ]
          }
      ]
  }
  ```
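
If you haven't created the secret yet, a Boto3 sketch for storing the database credentials might look like the following. The secret name and credential values are placeholder assumptions, as is the `username`/`password` key layout:

```python
import json

# Placeholder credentials -- replace with the database user's real values.
secret_value = {"username": "kb_user", "password": "example-password"}

# With AWS credentials configured, you would store the secret with:
#   import boto3
#   sm = boto3.client("secretsmanager")
#   resp = sm.create_secret(
#       Name="my-redshift-kb-secret",
#       SecretString=json.dumps(secret_value),
#   )
#   print(resp["ARN"])  # reference this ARN in the auth configuration
print(json.dumps(secret_value))
```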

------

### Your query engine is Amazon Redshift Serverless
<a name="w2aac28c10c27c13c11c15b3"></a>

The permissions to attach depend on your authentication method. Expand a section to see the permissions for a method.

------
#### [ IAM ]

To allow your service role to authenticate to your Amazon Redshift serverless query engine with IAM, attach the following policy to your custom service role:

------
#### [ JSON ]


```
{
    "Version":"2012-10-17",		 	 	 
    "Statement": [
        {
            "Sid": "RedshiftServerlessGetCredentials",
            "Effect": "Allow",
            "Action": "redshift-serverless:GetCredentials",
            "Resource": [
                "arn:aws:redshift-serverless:us-east-1:123456789012:workgroup/WorkgroupId"
            ]
        }
    ]
}
```

------

------
#### [ AWS Secrets Manager ]

To allow your service role to authenticate to your Amazon Redshift Serverless query engine with an AWS Secrets Manager secret, do the following:
+ Attach the following policy to the role:

  ```
  {
      "Version": "2012-10-17",		 	 	 
      "Statement": [
          {
              "Sid": "GetSecretPermissions",
              "Effect": "Allow",
              "Action": [
                  "secretsmanager:GetSecretValue"
              ],
              "Resource": [
                  "arn:aws:secretsmanager:${region}:${account}:secret:${secretName}"
              ]
          }
      ]
  }
  ```

------

## Allow knowledge base service role to access your data store
<a name="knowledge-base-prereq-structured-db-access"></a>

Make sure your data is stored in one of the following [supported structured data stores](knowledge-base-structured-create.md):
+ Amazon Redshift
+ AWS Glue Data Catalog (AWS Lake Formation)

The following table summarizes the authentication methods available for the query engine, depending on your data store:



| Authentication method | Amazon Redshift | AWS Glue Data Catalog (AWS Lake Formation) | 
| --- | --- | --- | 
| IAM | Yes | Yes | 
| Database username | Yes | No | 
| AWS Secrets Manager | Yes | No | 

To learn how to set up permissions for your Amazon Bedrock Knowledge Bases service role to access your data store and generate queries based on it, expand the section that corresponds to the service that your data store is in:

### Amazon Redshift
<a name="w2aac28c10c27c13c13c13b1"></a>

To grant your Amazon Bedrock Knowledge Bases service role access to your Amazon Redshift database, use the [Amazon Redshift query editor v2](https://docs.aws.amazon.com/redshift/latest/mgmt/query-editor-v2.html) and run the following SQL commands:

1. (If you authenticate with IAM and a user wasn't already created for your database) Run the following command, which uses [CREATE USER](https://docs.aws.amazon.com/redshift/latest/dg/r_CREATE_USER.html) to create a database user and allow it to authenticate through IAM, replacing *${service-role}* with the name of the custom Amazon Bedrock Knowledge Bases service role you created:

   ```
   CREATE USER "IAMR:${service-role}" WITH PASSWORD DISABLE;
   ```
**Important**  
If you use the Amazon Bedrock Knowledge Bases service role created for you in the console and then [sync your data store](kb-data-source-structured-sync-ingest.md) before you do this step, the user will be created for you, but the sync will fail because the user hasn't been granted permissions to access your data store. You must carry out the following step before syncing.

1. Grant an identity permissions to retrieve information from your database by running the [GRANT](https://docs.aws.amazon.com/redshift/latest/dg/r_GRANT.html) command.

------
#### [ IAM ]

   ```
   GRANT SELECT ON ALL TABLES IN SCHEMA ${schemaName} TO "IAMR:${serviceRole}";
   ```

------
#### [ Database user ]

   ```
   GRANT SELECT ON ALL TABLES IN SCHEMA ${schemaName} TO "${dbUser}";
   ```

------
#### [ AWS Secrets Manager username ]

   ```
   GRANT SELECT ON ALL TABLES IN SCHEMA ${schemaName} TO "${secretsUsername}";
   ```

------
**Important**  
Don't grant `CREATE`, `UPDATE`, or `DELETE` access. Granting these actions can lead to unintended modification of your data.

   For finer-grained control over the tables that can be accessed, you can replace `ALL TABLES` with specific table names, using the notation *${schemaName}*.*${tableName}*. For more information about this notation, see the **Query objects** section at [Cross-database queries](https://docs.aws.amazon.com/redshift/latest/dg/cross-database-overview.html).

------
#### [ IAM ]

   ```
   GRANT SELECT ON ${schemaName}.${tableName} TO "IAMR:${serviceRole}";
   ```

------
#### [ Database user ]

   ```
   GRANT SELECT ON ${schemaName}.${tableName} TO "${dbUser}";
   ```

------
#### [ AWS Secrets Manager username ]

   ```
   GRANT SELECT ON ${schemaName}.${tableName} TO "${secretsUsername}";
   ```

------

1. If you created a new schema in the Redshift database, run the following command to grant an identity usage permissions on the new schema:

   ```
   GRANT USAGE ON SCHEMA ${schemaName} TO "IAMR:${serviceRole}";
   ```
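
If you prefer to script these steps, the same SQL can also be run through the Amazon Redshift Data API instead of query editor v2. In the following sketch, the cluster ID, database, schema, and service role name are all placeholder assumptions:

```python
# Placeholders -- replace with your own schema and service role name.
SERVICE_ROLE = "MyKnowledgeBaseRole"
SCHEMA = "public"

# SQL statements from the procedure above, in order.
statements = [
    f'CREATE USER "IAMR:{SERVICE_ROLE}" WITH PASSWORD DISABLE;',
    f'GRANT SELECT ON ALL TABLES IN SCHEMA {SCHEMA} TO "IAMR:{SERVICE_ROLE}";',
    f'GRANT USAGE ON SCHEMA {SCHEMA} TO "IAMR:{SERVICE_ROLE}";',
]

# With AWS credentials configured, you would run each statement with:
#   import boto3
#   rsd = boto3.client("redshift-data")
#   for sql in statements:
#       rsd.execute_statement(
#           ClusterIdentifier="my-cluster",  # or WorkgroupName= for Serverless
#           Database="dev",
#           Sql=sql,
#       )
for sql in statements:
    print(sql)
```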

### AWS Glue Data Catalog
<a name="w2aac28c10c27c13c13c13b3"></a>

To grant your Amazon Bedrock Knowledge Bases service role access to your AWS Glue Data Catalog data store, use the [Amazon Redshift query editor v2](https://docs.aws.amazon.com/redshift/latest/mgmt/query-editor-v2.html) and run the following SQL commands:

1. Run the following command, which uses [CREATE USER](https://docs.aws.amazon.com/redshift/latest/dg/r_CREATE_USER.html) to create a database user and allow it to authenticate through IAM, replacing *${service-role}* with the name of the custom Amazon Bedrock Knowledge Bases service role you created:

   ```
   CREATE USER "IAMR:${service-role}" WITH PASSWORD DISABLE;
   ```
**Important**  
If you use the Amazon Bedrock Knowledge Bases service role created for you in the console and then [sync your data store](kb-data-source-structured-sync-ingest.md) before you do this step, the user will be created for you, but the sync will fail because the user hasn't been granted permissions to access your data store. You must carry out the following step before syncing.

1. Grant the service role permissions to retrieve information from your database by running the following [GRANT](https://docs.aws.amazon.com/redshift/latest/dg/r_GRANT.html) command:

   ```
   GRANT USAGE ON DATABASE awsdatacatalog TO "IAMR:${serviceRole}";
   ```
**Important**  
Don't grant `CREATE`, `UPDATE`, or `DELETE` access. Granting these actions can lead to unintended modification of your data.

1. To allow access to your AWS Glue Data Catalog databases, attach the following permissions to the service role:

------
#### [ JSON ]


   ```
   {
       "Version":"2012-10-17",		 	 	 
       "Statement": [
           {
               "Sid": "VisualEditor0",
               "Effect": "Allow",
               "Action": [
                   "glue:GetDatabases",
                   "glue:GetDatabase",
                   "glue:GetTables",
                   "glue:GetTable",
                   "glue:GetPartitions",
                   "glue:GetPartition",
                   "glue:SearchTables"
               ],
               "Resource": [
                   "arn:aws:glue:us-east-1:123456789012:table/${DatabaseName}/${TableName}",
                   "arn:aws:glue:us-east-1:123456789012:database/${DatabaseName}",
                   "arn:aws:glue:us-east-1:123456789012:catalog"
               ]
           }
       ]
   }
   ```

------

1. Grant permissions to your service role through AWS Lake Formation (to learn more about Lake Formation and its relationship with Amazon Redshift, see [Data sources for Redshift](https://docs.aws.amazon.com/redshift/latest/dg/copy-parameters-data-source.html)) by doing the following:

   1. Sign in to the AWS Management Console, and open the Lake Formation console at [https://console.aws.amazon.com/lakeformation/](https://console.aws.amazon.com/lakeformation/).

   1. Select **Data permissions** from the left navigation pane.

   1. Grant permissions to the service role you're using for Amazon Bedrock Knowledge Bases.

   1. Grant **Describe** and **Select** permissions for your databases and tables.

1. Depending on the data source you use in AWS Glue Data Catalog, you might need to add permissions to access that data source (for more information, see [AWS Glue dependency on other AWS services](https://docs.aws.amazon.com/glue/latest/dg/dependency-on-other-services.html)). For example, if your data source is in an Amazon S3 location, you'll need to add the following statement to the policy above.

   ```
   {
       "Sid": "Statement1",
       "Effect": "Allow",
       "Action": [
           "s3:ListBucket",
           "s3:GetObject"
       ],
       "Resource": [
           "arn:aws:s3:::${BucketName}",
           "arn:aws:s3:::${BucketName}/*"
       ]
   }
   ```

1. (Optional) If you use AWS KMS to encrypt the data in Amazon S3 or the AWS Glue Data Catalog, you need to add permissions to the role to decrypt the data with the KMS key:

   ```
   {
       "Action": [
           "kms:Decrypt"
       ],
       "Resource": [
           "arn:aws:kms:${Region}:${Account}:key/{KmsId}",
           "arn:aws:kms:${Region}:${Account}:key/{KmsId}"
       ],
       "Effect": "Allow"
   }
   ```

# Create a knowledge base by connecting to a structured data store
<a name="knowledge-base-structured-create"></a>

To connect a knowledge base to a structured data store, you specify the following components:
+ 

**Query engine configuration**  
The configuration for the compute service that will execute the generated SQL queries. The query engine converts natural language user queries into SQL queries that can be used to extract data from your data store. You can choose Amazon Redshift as your query engine. When choosing this configuration, you must specify:
  + The compute connection metadata, such as the cluster ID or the workgroup ARN, depending on the chosen query engine.
  + The authentication method for using the query engine, which can be using an IAM service role with the appropriate permissions, a query engine database user, or an AWS Secrets Manager secret that is linked to your database credentials.
+ 

**Storage configuration**  
The configuration for the data store containing your data. You can connect to Amazon Redshift Provisioned or Amazon Redshift Serverless and use Amazon Redshift or AWS Glue Data Catalog as your data store.
+ 

**(Optional) Query configurations**  
You can use optional query configurations for improving the accuracy of SQL generation:
  + **Maximum query time** – The amount of time after which the query times out.
  + **Descriptions** – Provides metadata or supplementary information about tables or columns. You can include descriptions of the tables or columns, usage notes, or any additional attributes. The descriptions you add can improve SQL query generation by providing extra context and information about the structure of the tables or columns.
  + **Inclusions and Exclusions** – Specifies a set of tables or columns to be included or excluded for SQL generation. This field is crucial if you want to limit the scope of SQL queries to a defined subset of available tables or columns. This option can help optimize the generation process by reducing unnecessary table or column references.

    If you specify inclusions, all other tables and columns are ignored. If you specify exclusions, the tables and columns you specify are ignored.
**Note**  
Inclusions and exclusions aren't a substitute for guardrails and are only intended for improving model accuracy.
  + **Curated queries** – A set of predefined question and answer examples. Questions are written as natural language queries (NLQ) and answers are the corresponding SQL query. These examples help the SQL generation process by providing examples of the kinds of queries that should be generated. They serve as reference points to improve the accuracy and relevance of generative SQL outputs.

Expand the section that corresponds to your use case:

## Use the console
<a name="knowledge-base-structured-create-console"></a>

To connect to a structured data store using the AWS Management Console, do the following:

1. Sign in to the AWS Management Console with an IAM identity that has permissions to use the Amazon Bedrock console. Then, open the Amazon Bedrock console at [https://console.aws.amazon.com/bedrock](https://console.aws.amazon.com/bedrock).

1. In the left navigation pane, choose **Knowledge bases**.

1. In the **Knowledge bases** section, choose **Create** and then select **Knowledge base with structured data store**.

1. Set up the following details for the knowledge base:

   1. (Optional) Change the default name and provide a description for your knowledge base.

   1. Select the query engine to use for retrieving data from your data store.

   1. Choose an IAM service role with the proper permissions to create and manage this knowledge base. You can let Amazon Bedrock create the service role or choose a custom role that you have created. For more information about creating a custom role, see [Set up your query engine and permissions for creating a knowledge base with structured data store](knowledge-base-prereq-structured.md).

   1. (Optional) Add tags to associate with your knowledge base. For more information, see [Tagging Amazon Bedrock resources](tagging.md).

   1. Choose **Next**.

1. Configure your query engine:

   1. Select the service in which you created a cluster or workgroup. Then choose the cluster or workgroup to use.

   1. Select the authentication method and provide the necessary fields.

   1. Select the data store in which to store your metadata. Then, choose or enter the name of the database.

   1. (Optional) Modify the query configurations as necessary. Refer to the beginning of this topic for more information about different configurations.

   1. Choose **Next**.

1. Review your knowledge base configurations and edit any sections as necessary. Confirm to create your knowledge base.

## Use the API
<a name="knowledge-base-structured-create-api"></a>

To connect to a structured data store using the Amazon Bedrock API, send a [CreateKnowledgeBase](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent_CreateKnowledgeBase.html) request with an [Agents for Amazon Bedrock build-time endpoint](https://docs.aws.amazon.com/general/latest/gr/bedrock.html#bra-bt) with the following general request body:

```
{
    "name": "string",
    "roleArn": "string",
    "knowledgeBaseConfiguration": {
        "type": "SQL",
        "sqlKnowledgeBaseConfiguration": [SqlKnowledgeBaseConfiguration](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent_SqlKnowledgeBaseConfiguration.html)
    },
    "description": "string",
    "clientToken": "string",
    "tags": {
        "string": "string"
    }
}
```

The following fields are required.



| Field | Basic description | 
| --- | --- | 
| name | A name for the knowledge base | 
| roleArn | A [knowledge base service role](kb-permissions.md) with the proper permissions. You can use the console to automatically create a service role with the proper permissions. | 
| knowledgeBaseConfiguration | Contains configurations for the knowledge base. For a structured database, specify SQL as the type and include the sqlKnowledgeBaseConfiguration field. | 

The following fields are optional.



| Field | Use | 
| --- | --- | 
| description | To include a description for the knowledge base. | 
| clientToken | To ensure the API request completes only once. For more information, see [Ensuring idempotency](https://docs.aws.amazon.com/ec2/latest/devguide/ec2-api-idempotency.html). | 
| tags | To associate tags with the knowledge base. For more information, see [Tagging Amazon Bedrock resources](tagging.md). | 

The `SQLKnowledgeBaseConfiguration` depends on the query engine that you use. For Amazon Redshift, specify the `type` field as `REDSHIFT` and include the `redshiftConfiguration` field, which maps to a [RedshiftConfiguration](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent_RedshiftConfiguration.html). For the [RedshiftConfiguration](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent_RedshiftConfiguration.html), you configure the following fields:

### queryEngineConfiguration
<a name="w2aac28c10c27c15b9b3c17b1"></a>

You can configure the following types of query engine:

#### Amazon Redshift Provisioned
<a name="w2aac28c10c27c15b9b3c17b1b5b1"></a>

If your Amazon Redshift databases are provisioned on dedicated compute nodes, the value of the `queryEngineConfiguration` field should be a [RedshiftQueryEngineConfiguration](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent_RedshiftQueryEngineConfiguration.html) in the following format:

```
{
    "type": "PROVISIONED",
    "provisionedConfiguration": {
        "clusterIdentifier": "string",
        "authConfiguration": [RedshiftProvisionedAuthConfiguration](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent_RedshiftProvisionedAuthConfiguration.html)
    }
}
```

Specify the ID of the cluster in the `clusterIdentifier` field. The [RedshiftProvisionedAuthConfiguration](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent_RedshiftProvisionedAuthConfiguration.html) depends on the type of authorization you're using. Select the tab that matches your authorization method:

------
#### [ IAM role ]

If you authorize with your IAM role, you need to specify only `IAM` as the type in the [RedshiftProvisionedAuthConfiguration](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent_RedshiftProvisionedAuthConfiguration.html) with no additional fields.

```
{
    "type": "IAM"
}
```

------
#### [ Temporary credentials user name ]

If you authorize with the database user name, specify the `type` as `USERNAME` and specify the user name in the `databaseUser` field of the [RedshiftProvisionedAuthConfiguration](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent_RedshiftProvisionedAuthConfiguration.html):

```
{
    "type": "USERNAME",
    "databaseUser": "string"
}
```

------
#### [ AWS Secrets Manager ]

If you authorize with AWS Secrets Manager, specify the `type` as `USERNAME_PASSWORD` and specify the ARN of the secret in the `usernamePasswordSecretArn` field of the [RedshiftProvisionedAuthConfiguration](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent_RedshiftProvisionedAuthConfiguration.html):

```
{
    "type": "USERNAME_PASSWORD",
    "usernamePasswordSecretArn": "string"
}
```

------

#### Amazon Redshift Serverless
<a name="w2aac28c10c27c15b9b3c17b1b5b3"></a>

If you're using Amazon Redshift Serverless, the value of the `queryEngineConfiguration` field should be a [RedshiftQueryEngineConfiguration](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent_RedshiftQueryEngineConfiguration.html) in the following format:

```
{
    "type": "SERVERLESS",
    "serverlessConfiguration": {
        "workgroupArn": "string",
        "authConfiguration": 
    }
}
```

Specify the ARN of your workgroup in the `workgroupArn` field. The [RedshiftServerlessAuthConfiguration](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent_RedshiftServerlessAuthConfiguration.html) depends on the type of authorization you're using. Select the tab that matches your authorization method:

------
#### [ IAM role ]

If you authorize with your IAM role, you need to specify only `IAM` as the type in the `RedshiftServerlessAuthConfiguration` with no additional fields.

```
{
    "type": "IAM"
}
```

------
#### [ AWS Secrets Manager ]

If you authorize with AWS Secrets Manager, specify the `type` as `USERNAME_PASSWORD` and specify the ARN of the secret in the `usernamePasswordSecretArn` field in the `RedshiftServerlessAuthConfiguration`:

```
{
    "type": "USERNAME_PASSWORD",
    "usernamePasswordSecretArn": "string"
}
```

------
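Analogous to the provisioned case, a sketch for assembling the serverless `queryEngineConfiguration` (the helper name is illustrative):

```python
def serverless_query_engine_config(workgroup_arn, auth_type, secret_arn=None):
    """Build a RedshiftQueryEngineConfiguration for Redshift Serverless.

    auth_type is "IAM" or "USERNAME_PASSWORD".
    """
    auth = {"type": auth_type}
    if auth_type == "USERNAME_PASSWORD":
        # AWS Secrets Manager secret holding the credentials
        auth["usernamePasswordSecretArn"] = secret_arn
    return {
        "type": "SERVERLESS",
        "serverlessConfiguration": {
            "workgroupArn": workgroup_arn,
            "authConfiguration": auth,
        },
    }
```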

### storageConfigurations
<a name="w2aac28c10c27c15b9b3c17b3"></a>

This field maps to an array containing a single [RedshiftQueryEngineStorageConfiguration](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent_RedshiftQueryEngineStorageConfiguration.html), whose format depends on where your data is stored.

#### AWS Glue Data Catalog
<a name="w2aac28c10c27c15b9b3c17b3b5b1"></a>

If your data is stored in AWS Glue Data Catalog, the `RedshiftQueryEngineStorageConfiguration` should be in the following format:

```
{
    "type": "AWS_DATA_CATALOG",
    "awsDataCatalogConfiguration": {
        "tableNames": ["string"]
    }
}
```

In the `tableNames` array, add the name of each table that you want to connect to your knowledge base.

**Note**  
Enter table names in the pattern described in [Cross-database queries](https://docs.aws.amazon.com/redshift/latest/dg/cross-database-overview.html) (`${databaseName}.${tableName}`). You can include all tables by specifying `${databaseName}.*`.
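As a sketch, the AWS Glue Data Catalog storage configuration can be built like this (the helper name is illustrative, and `mydb.sales`/`mydb.customers` are invented table names):

```python
def glue_storage_config(table_names):
    """Build a RedshiftQueryEngineStorageConfiguration for AWS Glue Data Catalog.

    Table names follow the cross-database pattern, for example "mydb.sales",
    or "mydb.*" to include every table in the database.
    """
    return {
        "type": "AWS_DATA_CATALOG",
        "awsDataCatalogConfiguration": {"tableNames": list(table_names)},
    }
```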

#### Amazon Redshift databases
<a name="w2aac28c10c27c15b9b3c17b3b5b3"></a>

If your data is stored in an Amazon Redshift database, the `RedshiftQueryEngineStorageConfiguration` should be in the following format:

```
{
    "type": "string",
    "redshiftConfiguration": {
        "databaseName": "string"
    }
}
```

Specify the name of your Amazon Redshift database in the `databaseName` field.

**Note**  
Enter table names in the pattern described in [Cross-database queries](https://docs.aws.amazon.com/redshift/latest/dg/cross-database-overview.html) (`${databaseName}.${tableName}`). You can include all tables by specifying `${databaseName}.*`.

If your database is mounted through Amazon SageMaker AI Lakehouse, the database name is in the format `db@schema`.
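Assuming `REDSHIFT` as the storage type value, a minimal sketch of the Amazon Redshift storage configuration (the helper name and the `dev` database name are illustrative):

```python
def redshift_storage_config(database_name):
    """Build a RedshiftQueryEngineStorageConfiguration for a Redshift database."""
    return {
        "type": "REDSHIFT",
        "redshiftConfiguration": {"databaseName": database_name},
    }

# storageConfigurations is an array containing a single configuration
storage_configurations = [redshift_storage_config("dev")]
```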

### queryGenerationConfiguration
<a name="w2aac28c10c27c15b9b3c17b5"></a>

This field maps to the following [QueryGenerationConfiguration](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent_QueryGenerationConfiguration.html) that you can use to configure how your data is queried:

```
{
    "executionTimeoutSeconds": number,
    "generationContext": {
        "tables": [
            {
                "name": "string",
                "description": "string",
                "inclusion": "string",
                "columns": [
                    {
                        "name": "string",
                        "description": "string",
                        "inclusion": "string"
                    },
                    ...
                ]
            },
            ...
        ],
        "curatedQueries": [
            {
                "naturalLanguage": "string",
                "sql": "string"
            },
            ...
        ]
    }
}
```

To limit how long a query can run, specify the timeout duration in seconds in the `executionTimeoutSeconds` field.

The `generationContext` field maps to a [QueryGenerationContext](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent_QueryGenerationContext.html) object in which you can configure as many of the following options as you need.

**Important**  
If you include a generation context, the query engine makes a best effort attempt to apply it when generating SQL. The generation context is non-deterministic and is only intended for improving model accuracy. To ensure accuracy, verify the generated SQL queries.

For information about generation contexts that you can include, expand the following sections:

#### Add descriptions for tables or columns in the database
<a name="w2aac28c10c27c15b9b3c17b5c15b1"></a>

To improve the accuracy of SQL generation for querying the database, you can provide a description for the table or column that provides more context than a short table or column name. You can do the following:
+ To add a description for a table, include a [QueryGenerationTable](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent_QueryGenerationTable.html) object in the `tables` array. In that object, specify the name of the table in the `name` field and a description in the `description` field, as in the following example:

  ```
  {
      "name": "database.schema.tableA",
      "description": "Description for Table A"
  }
  ```
+ To add a description for a column, include a [QueryGenerationTable](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent_QueryGenerationTable.html) object in the `tables` array. In that object, specify the name of the table in the `name` field and include the `columns` field, which maps to an array of [QueryGenerationColumn](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent_QueryGenerationColumn.html). In a `QueryGenerationColumn` object, include the name of the column in the `name` field and a description in the `description` field, as in the following example:

  ```
  {
      "name": "database.schema.tableA",
      "columns": [
          {
              "name": "Column A",
              "description": "Description for Column A"
          }
      ]
  }
  ```
+ You can add a description for both a table and a column in it, as in the following example:

  ```
  {
      "name": "database.schema.tableA",
      "description": "Description for Table A",
      "columns": [
          {
              "name": "columnA",
              "description": "Description for Column A"
          }
      ]
  }
  ```
**Note**  
Enter table and column names in the pattern described in [Cross-database queries](https://docs.aws.amazon.com/redshift/latest/dg/cross-database-overview.html). If your database is in AWS Glue Data Catalog, the format is `awsdatacatalog.gluedatabase.table`.

#### Include or exclude tables or columns in the database
<a name="w2aac28c10c27c15b9b3c17b5c15b3"></a>

You can suggest tables or columns to include or exclude when generating SQL by using the `inclusion` field in the [QueryGenerationTable](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent_QueryGenerationTable.html) and [QueryGenerationColumn](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent_QueryGenerationColumn.html) objects. You can specify one of the following values in the `inclusion` field:
+ INCLUDE – Only the tables or columns that you specify are included as context when generating SQL.
+ EXCLUDE – The tables or columns that you specify are excluded as context when generating SQL.

You can specify whether to include or exclude tables or columns in the following ways:
+ To include or exclude a table, include a [QueryGenerationTable](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent_QueryGenerationTable.html) object in the `tables` array. In that object, specify the name of the table in the `name` field and whether to include or exclude it in the `inclusion` field, as in the following example:

  ```
  {
      "name": "database.schema.tableA",
      "inclusion": "EXCLUDE"
  }
  ```

  With this configuration, the query engine omits `tableA` from the additional context used for generating SQL.
+ To include or exclude a column, include a [QueryGenerationTable](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent_QueryGenerationTable.html) object in the `tables` array. In that object, specify the name of the table in the `name` field and include the `columns` field, which maps to an array of [QueryGenerationColumn](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent_QueryGenerationColumn.html). In a `QueryGenerationColumn` object, include the name of the column in the `name` field and whether to include or exclude it in the `inclusion` field, as in the following example:

  ```
  {
      "name": "database.schema.tableA",
      "columns": [
          {
              "name": "database.schema.tableA.columnA",
              "inclusion": "EXCLUDE"
          }
      ]
  }
  ```

  With this configuration, SQL generation omits `columnA` in `tableA` from the context used for generating SQL.
+ You can combine tables and columns when specifying inclusions or exclusions, as in the following example:

  ```
  {
      "name": "database.schema.tableA",
      "inclusion": "INCLUDE",
      "columns": [
          {
              "name": "database.schema.tableA.columnA",
              "inclusion": "EXCLUDE"
          }
      ]
  }
  ```

  With this configuration, SQL generation includes `tableA` in the context but excludes its `columnA` column.

**Important**  
Table and column exclusions aren't substitutes for guardrails. These inclusions and exclusions are only additional context for the model to consider when generating SQL.

#### Give the query engine example mappings of natural language to SQL queries
<a name="w2aac28c10c27c15b9b3c17b5c15b5"></a>

To improve a query engine's accuracy in converting user queries into SQL queries, you can provide it examples in the `curatedQueries` field in the [QueryGenerationContext](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent_QueryGenerationContext.html) object, which maps to an array of [CuratedQuery](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent_CuratedQuery.html) objects. Each object contains the following fields:
+ naturalLanguage – An example of a query in natural language.
+ sql – The SQL query that corresponds to the natural language query.
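Tying the options above together, the following sketch shows a complete `queryGenerationConfiguration` with one curated query. The table name, column name, and query text are all invented for illustration:

```python
# Sketch of a queryGenerationConfiguration as a Python dictionary.
# All identifiers and the SQL text are illustrative.
query_generation_configuration = {
    "executionTimeoutSeconds": 200,
    "generationContext": {
        "tables": [
            {
                "name": "dev.public.orders",
                "description": "One row per customer order",
            }
        ],
        "curatedQueries": [
            {
                "naturalLanguage": "How many orders were placed last month?",
                "sql": (
                    "SELECT COUNT(*) FROM dev.public.orders "
                    "WHERE order_date >= DATE_TRUNC('month', CURRENT_DATE) "
                    "- INTERVAL '1 month' "
                    "AND order_date < DATE_TRUNC('month', CURRENT_DATE);"
                ),
            }
        ],
    },
}
```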

# Sync your structured data store with your Amazon Bedrock knowledge base
<a name="kb-data-source-structured-sync-ingest"></a>

After you connect your knowledge base to a structured data store, you perform a sync to start the metadata ingestion process so that data can be retrieved. The metadata allows Amazon Bedrock Knowledge Bases to translate user prompts into a query for the connected database.

Whenever you make modifications to your database schema, you need to sync the changes.

To learn how to ingest your metadata into your knowledge base and sync with your latest data, choose the tab for your preferred method, and then follow the steps:

------
#### [ Console ]

**To ingest your data into your knowledge base and sync with your latest data**

1. Open the Amazon Bedrock console at [https://console.aws.amazon.com/bedrock/](https://console.aws.amazon.com/bedrock/).

1. From the left navigation pane, select **Knowledge base** and choose your knowledge base.

1. In the **Data source** section, select **Sync** to begin the metadata ingestion process. To stop a sync that is in progress, select **Stop**. You can select **Sync** again later to ingest the rest of your data.

1. When data ingestion completes successfully, a green success banner appears.

1. You can choose a data source to view its **Sync history**. Select **View warnings** to see why a data ingestion job failed.

------
#### [ API ]

To ingest your data into your knowledge base and sync with your latest data, send a [StartIngestionJob](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent_StartIngestionJob.html) request with an [Agents for Amazon Bedrock build-time endpoint](https://docs.aws.amazon.com/general/latest/gr/bedrock.html#bra-bt).

Use the `ingestionJobId` returned in the response in a [GetIngestionJob](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent_GetIngestionJob.html) request with an [Agents for Amazon Bedrock build-time endpoint](https://docs.aws.amazon.com/general/latest/gr/bedrock.html#bra-bt) to track the status of the ingestion job.

You can see information for all ingestion jobs for a data source by sending a [ListIngestionJobs](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent_ListIngestionJobs.html) request with an [Agents for Amazon Bedrock build-time endpoint](https://docs.aws.amazon.com/general/latest/gr/bedrock.html#bra-bt).

To stop a data ingestion job that is currently running, send a [StopIngestionJob](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent_StopIngestionJob.html) request with an [Agents for Amazon Bedrock build-time endpoint](https://docs.aws.amazon.com/general/latest/gr/bedrock.html#bra-bt). You can send a `StartIngestionJob` request again to ingest the rest of your data when you are ready.
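The start-and-poll flow can be sketched with the AWS SDK for Python (boto3). The knowledge base and data source IDs are placeholders, and the set of terminal statuses is an assumption to verify against the `GetIngestionJob` API reference:

```python
import time

# Ingestion job statuses that mean polling can stop. These values are
# assumptions; check the GetIngestionJob API reference.
TERMINAL_STATES = {"COMPLETE", "FAILED", "STOPPED"}

def is_terminal(status):
    """Return True when an ingestion job has finished, successfully or not."""
    return status in TERMINAL_STATES

def sync_structured_data_source(kb_id, ds_id, poll_seconds=15):
    """Start a metadata ingestion job and wait for it to finish."""
    import boto3  # imported here so the helper above stays dependency-free

    client = boto3.client("bedrock-agent")  # build-time endpoint
    job = client.start_ingestion_job(
        knowledgeBaseId=kb_id, dataSourceId=ds_id
    )["ingestionJob"]
    while not is_terminal(job["status"]):
        time.sleep(poll_seconds)
        job = client.get_ingestion_job(
            knowledgeBaseId=kb_id,
            dataSourceId=ds_id,
            ingestionJobId=job["ingestionJobId"],
        )["ingestionJob"]
    return job
```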

------

**Important**  
If you use the Amazon Bedrock Knowledge Bases service role created for you in the console and then sync your data store before granting access to your database to the authentication role that you use, the sync will fail because the user hasn't been granted permissions to access your data store. For information about granting permissions to a role to access your data store, see [Allow knowledge base service role to access your data store](knowledge-base-prereq-structured.md#knowledge-base-prereq-structured-db-access).

# Cross-region inference for knowledge bases with structured data store
<a name="kb-structured-cris"></a>

Starting May 10, 2026, Amazon Bedrock Knowledge Bases with structured data store will use cross-region inference to process your API requests. With cross-region inference, Amazon Bedrock Knowledge Bases will automatically select the optimal region within your geography to process your inference request, maximizing available compute resources and model availability, and providing the best customer experience. This applies to the [GenerateQuery](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_GenerateQuery.html), [Retrieve](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_Retrieve.html), and [RetrieveAndGenerate](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_RetrieveAndGenerate.html) API operations when used with a structured data store.

Cross-region inference requests stay within the AWS Regions that are part of the geography where your data originally resides. For example, a request made within the US is kept within AWS Regions in the US. Although your knowledge base data remains stored only in the primary Region, input prompts and output results may be processed in another Region within the same geography. All data is transmitted encrypted across Amazon's secure network.

For the following Regions, geo-specific cross-region inference is not available, and inference requests may be processed in Regions outside of the local geography:
+ Asia Pacific (Seoul) (`ap-northeast-2`)
+ Asia Pacific (Mumbai) (`ap-south-1`)
+ Asia Pacific (Singapore) (`ap-southeast-1`)
+ South America (São Paulo) (`sa-east-1`)

**Note**  
There is no additional cost for using cross-region inference with knowledge bases with structured data store.

Cross-region inference is automatically enabled for all knowledge bases with structured data store. No configuration changes are required. For more information about cross-region inference and supported Regions, see [Increase throughput with cross-Region inference](cross-region-inference.md).