CreateDataSource - Amazon Bedrock

CreateDataSource

Connects a knowledge base to a data source. You specify the configuration for the specific data source service in the dataSourceConfiguration field.

Important

You can't change the chunkingConfiguration after you create the data source connector.

Request Syntax

PUT /knowledgebases/knowledgeBaseId/datasources/ HTTP/1.1
Content-type: application/json

{
   "clientToken": "string",
   "dataDeletionPolicy": "string",
   "dataSourceConfiguration": {
      "confluenceConfiguration": {
         "crawlerConfiguration": {
            "filterConfiguration": {
               "patternObjectFilter": {
                  "filters": [
                     {
                        "exclusionFilters": [ "string" ],
                        "inclusionFilters": [ "string" ],
                        "objectType": "string"
                     }
                  ]
               },
               "type": "string"
            }
         },
         "sourceConfiguration": {
            "authType": "string",
            "credentialsSecretArn": "string",
            "hostType": "string",
            "hostUrl": "string"
         }
      },
      "s3Configuration": {
         "bucketArn": "string",
         "bucketOwnerAccountId": "string",
         "inclusionPrefixes": [ "string" ]
      },
      "salesforceConfiguration": {
         "crawlerConfiguration": {
            "filterConfiguration": {
               "patternObjectFilter": {
                  "filters": [
                     {
                        "exclusionFilters": [ "string" ],
                        "inclusionFilters": [ "string" ],
                        "objectType": "string"
                     }
                  ]
               },
               "type": "string"
            }
         },
         "sourceConfiguration": {
            "authType": "string",
            "credentialsSecretArn": "string",
            "hostUrl": "string"
         }
      },
      "sharePointConfiguration": {
         "crawlerConfiguration": {
            "filterConfiguration": {
               "patternObjectFilter": {
                  "filters": [
                     {
                        "exclusionFilters": [ "string" ],
                        "inclusionFilters": [ "string" ],
                        "objectType": "string"
                     }
                  ]
               },
               "type": "string"
            }
         },
         "sourceConfiguration": {
            "authType": "string",
            "credentialsSecretArn": "string",
            "domain": "string",
            "hostType": "string",
            "siteUrls": [ "string" ],
            "tenantId": "string"
         }
      },
      "type": "string",
      "webConfiguration": {
         "crawlerConfiguration": {
            "crawlerLimits": {
               "maxPages": number,
               "rateLimit": number
            },
            "exclusionFilters": [ "string" ],
            "inclusionFilters": [ "string" ],
            "scope": "string",
            "userAgent": "string"
         },
         "sourceConfiguration": {
            "urlConfiguration": {
               "seedUrls": [
                  {
                     "url": "string"
                  }
               ]
            }
         }
      }
   },
   "description": "string",
   "name": "string",
   "serverSideEncryptionConfiguration": {
      "kmsKeyArn": "string"
   },
   "vectorIngestionConfiguration": {
      "chunkingConfiguration": {
         "chunkingStrategy": "string",
         "fixedSizeChunkingConfiguration": {
            "maxTokens": number,
            "overlapPercentage": number
         },
         "hierarchicalChunkingConfiguration": {
            "levelConfigurations": [
               {
                  "maxTokens": number
               }
            ],
            "overlapTokens": number
         },
         "semanticChunkingConfiguration": {
            "breakpointPercentileThreshold": number,
            "bufferSize": number,
            "maxTokens": number
         }
      },
      "customTransformationConfiguration": {
         "intermediateStorage": {
            "s3Location": {
               "uri": "string"
            }
         },
         "transformations": [
            {
               "stepToApply": "string",
               "transformationFunction": {
                  "transformationLambdaConfiguration": {
                     "lambdaArn": "string"
                  }
               }
            }
         ]
      },
      "parsingConfiguration": {
         "bedrockDataAutomationConfiguration": {
            "parsingModality": "string"
         },
         "bedrockFoundationModelConfiguration": {
            "modelArn": "string",
            "parsingModality": "string",
            "parsingPrompt": {
               "parsingPromptText": "string"
            }
         },
         "parsingStrategy": "string"
      }
   }
}
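Most requests fill in only a small part of this structure. The following Python sketch assembles a minimal body for an Amazon S3 data source. The helper name build_s3_data_source_request, the sample bucket ARN, and the "S3" value for the type field are illustrative assumptions that this page does not define; check the DataSourceConfiguration reference for the accepted type values.

```python
import json
import uuid


def build_s3_data_source_request(name, bucket_arn, description=None):
    """Assemble a minimal CreateDataSource request body for an S3 source.

    Only fields that appear in the request syntax are used. The "S3"
    type value is an assumption that this page does not list.
    """
    body = {
        # A UUID string is 36 characters long, which falls inside the
        # 33-256 length range documented for clientToken.
        "clientToken": str(uuid.uuid4()),
        "name": name,
        "dataSourceConfiguration": {
            "type": "S3",  # assumed enum value
            "s3Configuration": {"bucketArn": bucket_arn},
        },
    }
    if description is not None:
        body["description"] = description
    return body


if __name__ == "__main__":
    # Hypothetical bucket ARN, used only to show the resulting JSON.
    example = build_s3_data_source_request(
        "my-docs", "arn:aws:s3:::example-bucket", "Product manuals"
    )
    print(json.dumps(example, indent=3))
```

With the AWS SDK for Python, a body like this would typically be passed as keyword arguments to the bedrock-agent client's create_data_source method, together with knowledgeBaseId; verify the parameter names against the SDK documentation before relying on them.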

URI Request Parameters

The request uses the following URI parameters.

knowledgeBaseId

The unique identifier of the knowledge base to which to add the data source.

Pattern: ^[0-9a-zA-Z]{10}$

Required: Yes
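A client can check the identifier against this pattern before building the request path. The sketch below is one way to do that in Python; the function name data_sources_path and the sample identifier are hypothetical.

```python
import re

# Pattern copied from the knowledgeBaseId parameter description.
KNOWLEDGE_BASE_ID = re.compile(r"^[0-9a-zA-Z]{10}$")


def data_sources_path(knowledge_base_id):
    """Return the request path for a knowledge base, or raise ValueError."""
    # fullmatch also rejects a trailing newline, which "$" alone would allow.
    if not KNOWLEDGE_BASE_ID.fullmatch(knowledge_base_id):
        raise ValueError(
            "knowledgeBaseId must be exactly 10 alphanumeric characters"
        )
    return f"/knowledgebases/{knowledge_base_id}/datasources/"
```

For example, data_sources_path("KB12345678") returns "/knowledgebases/KB12345678/datasources/", while a shorter or punctuated identifier raises ValueError.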

Request Body

The request accepts the following data in JSON format.

clientToken

A unique, case-sensitive identifier to ensure that the API request completes no more than one time. If this token matches a previous request, Amazon Bedrock ignores the request, but does not return an error. For more information, see Ensuring idempotency.

Type: String

Length Constraints: Minimum length of 33. Maximum length of 256.

Pattern: ^[a-zA-Z0-9](-*[a-zA-Z0-9]){0,256}$

Required: No
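A UUID in its standard hyphenated form is 36 characters long and matches this pattern, so it is a convenient source of tokens. The following sketch generates a token and checks an arbitrary string against the documented constraints; the function names are illustrative.

```python
import re
import uuid

# Pattern and length limits copied from the clientToken description.
CLIENT_TOKEN = re.compile(r"^[a-zA-Z0-9](-*[a-zA-Z0-9]){0,256}$")
MIN_LENGTH, MAX_LENGTH = 33, 256


def is_valid_client_token(token):
    """Check a token against the documented length and pattern constraints."""
    return (
        MIN_LENGTH <= len(token) <= MAX_LENGTH
        and CLIENT_TOKEN.fullmatch(token) is not None
    )


def new_client_token():
    """Return a fresh token; reuse the same value when retrying a request."""
    return str(uuid.uuid4())
```

Generate the token once per logical request and send the same value on every retry, so that a repeated call is treated as the same request rather than creating a second data source.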

dataDeletionPolicy

The data deletion policy for the data source.

You can set the data deletion policy to:

  • DELETE: Deletes all data from your data source that’s converted into vector embeddings upon deletion of a knowledge base or data source resource. Note that the vector store itself is not deleted, only the data. This flag is ignored if an AWS account is deleted.

  • RETAIN: Retains all data from your data source that’s converted into vector embeddings upon deletion of a knowledge base or data source resource. Note that the vector store itself is not deleted if you delete a knowledge base or data source resource.

Type: String

Valid Values: RETAIN | DELETE

Required: No

dataSourceConfiguration

The connection configuration for the data source.

Type: DataSourceConfiguration object

Required: Yes

description

A description of the data source.

Type: String

Length Constraints: Minimum length of 1. Maximum length of 200.

Required: No

name

The name of the data source.

Type: String

Pattern: ^([0-9a-zA-Z][_-]?){1,100}$

Required: Yes
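The name pattern allows letters and digits, each optionally followed by a single underscore or hyphen, so consecutive separators and a leading separator are rejected. The sketch below checks a name and an optional description locally before the request is sent; the function name and the message wording are hypothetical.

```python
import re

# Pattern copied from the name field description.
NAME = re.compile(r"^([0-9a-zA-Z][_-]?){1,100}$")


def check_name_and_description(name, description=None):
    """Return a list of constraint violations; an empty list means valid."""
    problems = []
    if not NAME.fullmatch(name):
        problems.append("name does not match the documented pattern")
    # description is optional, but must be 1 to 200 characters if present.
    if description is not None and not 1 <= len(description) <= 200:
        problems.append("description must be 1 to 200 characters long")
    return problems
```

For instance, "my-data_source" passes, while "my--source" and "-leading" each produce a name violation.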

serverSideEncryptionConfiguration

Contains details about the server-side encryption for the data source.

Type: ServerSideEncryptionConfiguration object

Required: No

vectorIngestionConfiguration

Contains details about how to ingest the documents in the data source.

Type: VectorIngestionConfiguration object

Required: No

Response Syntax

HTTP/1.1 200
Content-type: application/json

{
   "dataSource": {
      "createdAt": "string",
      "dataDeletionPolicy": "string",
      "dataSourceConfiguration": {
         "confluenceConfiguration": {
            "crawlerConfiguration": {
               "filterConfiguration": {
                  "patternObjectFilter": {
                     "filters": [
                        {
                           "exclusionFilters": [ "string" ],
                           "inclusionFilters": [ "string" ],
                           "objectType": "string"
                        }
                     ]
                  },
                  "type": "string"
               }
            },
            "sourceConfiguration": {
               "authType": "string",
               "credentialsSecretArn": "string",
               "hostType": "string",
               "hostUrl": "string"
            }
         },
         "s3Configuration": {
            "bucketArn": "string",
            "bucketOwnerAccountId": "string",
            "inclusionPrefixes": [ "string" ]
         },
         "salesforceConfiguration": {
            "crawlerConfiguration": {
               "filterConfiguration": {
                  "patternObjectFilter": {
                     "filters": [
                        {
                           "exclusionFilters": [ "string" ],
                           "inclusionFilters": [ "string" ],
                           "objectType": "string"
                        }
                     ]
                  },
                  "type": "string"
               }
            },
            "sourceConfiguration": {
               "authType": "string",
               "credentialsSecretArn": "string",
               "hostUrl": "string"
            }
         },
         "sharePointConfiguration": {
            "crawlerConfiguration": {
               "filterConfiguration": {
                  "patternObjectFilter": {
                     "filters": [
                        {
                           "exclusionFilters": [ "string" ],
                           "inclusionFilters": [ "string" ],
                           "objectType": "string"
                        }
                     ]
                  },
                  "type": "string"
               }
            },
            "sourceConfiguration": {
               "authType": "string",
               "credentialsSecretArn": "string",
               "domain": "string",
               "hostType": "string",
               "siteUrls": [ "string" ],
               "tenantId": "string"
            }
         },
         "type": "string",
         "webConfiguration": {
            "crawlerConfiguration": {
               "crawlerLimits": {
                  "maxPages": number,
                  "rateLimit": number
               },
               "exclusionFilters": [ "string" ],
               "inclusionFilters": [ "string" ],
               "scope": "string",
               "userAgent": "string"
            },
            "sourceConfiguration": {
               "urlConfiguration": {
                  "seedUrls": [
                     {
                        "url": "string"
                     }
                  ]
               }
            }
         }
      },
      "dataSourceId": "string",
      "description": "string",
      "failureReasons": [ "string" ],
      "knowledgeBaseId": "string",
      "name": "string",
      "serverSideEncryptionConfiguration": {
         "kmsKeyArn": "string"
      },
      "status": "string",
      "updatedAt": "string",
      "vectorIngestionConfiguration": {
         "chunkingConfiguration": {
            "chunkingStrategy": "string",
            "fixedSizeChunkingConfiguration": {
               "maxTokens": number,
               "overlapPercentage": number
            },
            "hierarchicalChunkingConfiguration": {
               "levelConfigurations": [
                  {
                     "maxTokens": number
                  }
               ],
               "overlapTokens": number
            },
            "semanticChunkingConfiguration": {
               "breakpointPercentileThreshold": number,
               "bufferSize": number,
               "maxTokens": number
            }
         },
         "customTransformationConfiguration": {
            "intermediateStorage": {
               "s3Location": {
                  "uri": "string"
               }
            },
            "transformations": [
               {
                  "stepToApply": "string",
                  "transformationFunction": {
                     "transformationLambdaConfiguration": {
                        "lambdaArn": "string"
                     }
                  }
               }
            ]
         },
         "parsingConfiguration": {
            "bedrockDataAutomationConfiguration": {
               "parsingModality": "string"
            },
            "bedrockFoundationModelConfiguration": {
               "modelArn": "string",
               "parsingModality": "string",
               "parsingPrompt": {
                  "parsingPromptText": "string"
               }
            },
            "parsingStrategy": "string"
         }
      }
   }
}

Response Elements

If the action is successful, the service sends back an HTTP 200 response.

The following data is returned in JSON format by the service.

dataSource

Contains details about the data source.

Type: DataSource object
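Callers usually need only a few fields from the returned dataSource object. The sketch below pulls the identifiers, status, and any failure reasons out of a parsed JSON response. The function name is hypothetical, and treating failureReasons as possibly absent is an assumption about the response, not something this page states.

```python
def summarize_created_data_source(response):
    """Extract the fields a caller typically stores after creation."""
    data_source = response["dataSource"]
    return {
        "dataSourceId": data_source["dataSourceId"],
        "knowledgeBaseId": data_source["knowledgeBaseId"],
        "status": data_source["status"],
        # Assumed to be omitted when there is nothing to report.
        "failureReasons": data_source.get("failureReasons", []),
    }
```

The returned dataSourceId is the value to keep for later calls that refer to this data source.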

Errors

For information about the errors that are common to all actions, see Common Errors.

AccessDeniedException

The request is denied because of missing access permissions.

HTTP Status Code: 403

ConflictException

There was a conflict performing an operation.

HTTP Status Code: 409

InternalServerException

An internal server error occurred. Retry your request.

HTTP Status Code: 500

ResourceNotFoundException

The resource with the specified Amazon Resource Name (ARN) was not found. Check the ARN and try your request again.

HTTP Status Code: 404

ServiceQuotaExceededException

The number of requests exceeds the service quota. Resubmit your request later.

HTTP Status Code: 402

ThrottlingException

The number of requests exceeds the limit. Resubmit your request later.

HTTP Status Code: 429

ValidationException

Input validation failed. Check your request parameters and retry the request.

HTTP Status Code: 400
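The errors above split into those worth retrying unchanged and those that need a corrected request first. The following sketch records the documented status codes and retries only InternalServerException and ThrottlingException with exponential backoff. ApiError, call_with_retries, and the choice of which errors to retry are assumptions made for illustration; ServiceQuotaExceededException is left out because an immediate retry is unlikely to help until usage drops or the quota is raised.

```python
import time

# Error names and HTTP status codes as listed in the Errors section.
STATUS_BY_ERROR = {
    "AccessDeniedException": 403,
    "ConflictException": 409,
    "InternalServerException": 500,
    "ResourceNotFoundException": 404,
    "ServiceQuotaExceededException": 402,
    "ThrottlingException": 429,
    "ValidationException": 400,
}

# Assumed policy: retry only transient server-side and throttling errors.
RETRYABLE = {"InternalServerException", "ThrottlingException"}


class ApiError(Exception):
    """Hypothetical stand-in for an SDK error that carries an error name."""

    def __init__(self, code):
        super().__init__(code)
        self.code = code
        self.status = STATUS_BY_ERROR.get(code)


def call_with_retries(operation, max_attempts=4, base_delay=0.5, sleep=time.sleep):
    """Run operation(), retrying retryable errors with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except ApiError as err:
            if err.code not in RETRYABLE or attempt == max_attempts:
                raise
            # Delays grow as base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2 ** (attempt - 1))
```

Sending the same clientToken on each attempt keeps a retried request from completing more than once.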

See Also

For more information about using this API in one of the language-specific AWS SDKs, see the following: