Connect your knowledge base to a custom data source

Instead of choosing a supported data source service, you can connect to a custom data source for the following advantages:

Flexibility and control over the data types that you want your knowledge base to have access to.
The ability to use the KnowledgeBaseDocuments API operations to directly ingest or delete documents without the need to sync changes.
The ability to view documents in your data source directly through the Amazon Bedrock console or API.
The ability to upload documents into the data source directly in the AWS Management Console or to add them inline.
The ability to add metadata directly to each document for when adding or updating a document in the data source. For more information on how to use metadata for filtering when retrieving information from a data source, see the Metadata and filtering tab in Configure and customize queries and response generation.

To connect a knowledge base to a custom data source, send a CreateDataSource request with an Agents for Amazon Bedrock build-time endpoint. Specify the knowledgeBaseId of the knowledge base to connect to, give a name to the data source, and specify the type field in the dataSourceConfiguration as CUSTOM. The following shows a minimal example to create this data source:


PUT /knowledgebases/KB12345678/datasources/ HTTP/1.1
Content-type: application/json

{
    "name": "MyCustomDataSource",
    "dataSourceConfiguration": {
        "type": "CUSTOM"
    }
}

You can include any of the following optional fields to configure the data source:

Field	Use case
description	To provide a description for the data source.
clientToken	To ensure the API request completes only once. For more information, see Ensuring idempotency.
serverSideEncryptionConfiguration	To specify a custom KMS key for transient data storage while converting your data into embeddings. For more information, see Encryption of transient data storage during data ingestion
dataDeletionPolicy	To configure what to do with the vector embeddings for your data source in your vector store, if you delete the data source. Specify `RETAIN` to retain the data in the vector store or the default option of `DELETE` to delete them.
vectorIngestionConfiguration	To configure options for ingestion of the data source. See below for more information.

The vectorIngestionConfiguration field maps to a VectorIngestionConfiguration object containing the following fields:

chunkingConfiguration – To configure the strategy to use for chunking the documents in the data source. For more information about chunking strategies, see How content chunking works for knowledge bases.
parsingConfiguration – To configure the strategy to use for parsing the data source. For more information about parsing options, see Parsing options for your data source.
customTransformationConfiguration – To customize how the data is transformed and to apply a Lambda function for greater customization. For more information about how to customize chunking of your data and processing of your metadata with a Lambda function, see Use a custom transformation Lambda function to define how your data is ingested.

After setting up your custom data source, you can add documents into it and directly ingest them into the knowledge base. Unlike other data sources, you don't need to sync a custom data source. To learn how to ingest documents directly, see Ingest changes directly into a knowledge base.

Warning Javascript is disabled or is unavailable in your browser.

To use the Amazon Web Services Documentation, Javascript must be enabled. Please refer to your browser's Help pages for instructions.

Document Conventions

Web Crawler

Customize ingestion for a data source