CreateDataSource
Creates a data source connector for a knowledge base.
Important
You can't change the chunkingConfiguration after you create the data source connector.
Request Syntax
PUT /knowledgebases/knowledgeBaseId/datasources/ HTTP/1.1
Content-type: application/json
{
"clientToken": "string",
"dataDeletionPolicy": "string",
"dataSourceConfiguration": {
"confluenceConfiguration": {
"crawlerConfiguration": {
"filterConfiguration": {
"patternObjectFilter": {
"filters": [
{
"exclusionFilters": [ "string" ],
"inclusionFilters": [ "string" ],
"objectType": "string"
}
]
},
"type": "string"
}
},
"sourceConfiguration": {
"authType": "string",
"credentialsSecretArn": "string",
"hostType": "string",
"hostUrl": "string"
}
},
"s3Configuration": {
"bucketArn": "string",
"bucketOwnerAccountId": "string",
"inclusionPrefixes": [ "string" ]
},
"salesforceConfiguration": {
"crawlerConfiguration": {
"filterConfiguration": {
"patternObjectFilter": {
"filters": [
{
"exclusionFilters": [ "string" ],
"inclusionFilters": [ "string" ],
"objectType": "string"
}
]
},
"type": "string"
}
},
"sourceConfiguration": {
"authType": "string",
"credentialsSecretArn": "string",
"hostUrl": "string"
}
},
"sharePointConfiguration": {
"crawlerConfiguration": {
"filterConfiguration": {
"patternObjectFilter": {
"filters": [
{
"exclusionFilters": [ "string" ],
"inclusionFilters": [ "string" ],
"objectType": "string"
}
]
},
"type": "string"
}
},
"sourceConfiguration": {
"authType": "string",
"credentialsSecretArn": "string",
"domain": "string",
"hostType": "string",
"siteUrls": [ "string" ],
"tenantId": "string"
}
},
"type": "string",
"webConfiguration": {
"crawlerConfiguration": {
"crawlerLimits": {
"rateLimit": number
},
"exclusionFilters": [ "string" ],
"inclusionFilters": [ "string" ],
"scope": "string"
},
"sourceConfiguration": {
"urlConfiguration": {
"seedUrls": [
{
"url": "string"
}
]
}
}
}
},
"description": "string",
"name": "string",
"serverSideEncryptionConfiguration": {
"kmsKeyArn": "string"
},
"vectorIngestionConfiguration": {
"chunkingConfiguration": {
"chunkingStrategy": "string",
"fixedSizeChunkingConfiguration": {
"maxTokens": number,
"overlapPercentage": number
},
"hierarchicalChunkingConfiguration": {
"levelConfigurations": [
{
"maxTokens": number
}
],
"overlapTokens": number
},
"semanticChunkingConfiguration": {
"breakpointPercentileThreshold": number,
"bufferSize": number,
"maxTokens": number
}
},
"customTransformationConfiguration": {
"intermediateStorage": {
"s3Location": {
"uri": "string"
}
},
"transformations": [
{
"stepToApply": "string",
"transformationFunction": {
"transformationLambdaConfiguration": {
"lambdaArn": "string"
}
}
}
]
},
"parsingConfiguration": {
"bedrockFoundationModelConfiguration": {
"modelArn": "string",
"parsingPrompt": {
"parsingPromptText": "string"
}
},
"parsingStrategy": "string"
}
}
}
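As a rough illustration of the request syntax, the following Python sketch assembles a small request body for an S3-backed data source. Only the field names come from the syntax above. The `type` value `"S3"`, the bucket ARN, the data source name, and the helper `build_s3_data_source_body` are assumptions made for the example.

```python
import json


def build_s3_data_source_body(name, bucket_arn, description=None):
    """Assemble a minimal CreateDataSource request body for an S3 source.

    Only `name` and `dataSourceConfiguration` are marked as required on
    this page, so every other field is left out unless it is supplied.
    """
    body = {
        "name": name,
        "dataSourceConfiguration": {
            # Assumption: "S3" is the type value that selects s3Configuration.
            "type": "S3",
            "s3Configuration": {"bucketArn": bucket_arn},
        },
    }
    if description is not None:
        body["description"] = description
    return body


# Hypothetical values, for illustration only.
body = build_s3_data_source_body(
    name="example-docs",
    bucket_arn="arn:aws:s3:::amzn-s3-demo-bucket",
    description="Product manuals",
)
print(json.dumps(body, indent=2))
```

The other connector blocks (`confluenceConfiguration`, `salesforceConfiguration`, `sharePointConfiguration`, `webConfiguration`) would be filled in the same way, one per data source.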
URI Request Parameters
The request uses the following URI parameters.
- knowledgeBaseId
-
The unique identifier of the knowledge base to which to add the data source.
Pattern: ^[0-9a-zA-Z]{10}$
Required: Yes
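A client might check the identifier against the documented pattern before building the request path. The sketch below shows one way to do that; the function name `data_sources_path` and the sample identifier are hypothetical.

```python
import re

# Pattern copied from the knowledgeBaseId parameter description.
KNOWLEDGE_BASE_ID_PATTERN = re.compile(r"^[0-9a-zA-Z]{10}$")


def data_sources_path(knowledge_base_id):
    """Return the CreateDataSource request path, or raise ValueError."""
    # fullmatch rejects a trailing newline that a plain match would accept.
    if not KNOWLEDGE_BASE_ID_PATTERN.fullmatch(knowledge_base_id):
        raise ValueError(
            "knowledgeBaseId must be exactly 10 alphanumeric characters"
        )
    return f"/knowledgebases/{knowledge_base_id}/datasources/"


# Hypothetical identifier, for illustration only.
print(data_sources_path("ABCDE12345"))
```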
Request Body
The request accepts the following data in JSON format.
- clientToken
-
A unique, case-sensitive identifier to ensure that the API request completes no more than one time. If this token matches a previous request, Amazon Bedrock ignores the request, but does not return an error. For more information, see Ensuring idempotency.
Type: String
Length Constraints: Minimum length of 33. Maximum length of 256.
Pattern: ^[a-zA-Z0-9](-*[a-zA-Z0-9]){0,256}$
Required: No
- dataDeletionPolicy
-
The data deletion policy for the data source.
You can set the data deletion policy to:
- DELETE: Deletes all data from your data source that’s converted into vector embeddings upon deletion of a knowledge base or data source resource. Note that the vector store itself is not deleted, only the data. This flag is ignored if an AWS account is deleted.
- RETAIN: Retains all data from your data source that’s converted into vector embeddings upon deletion of a knowledge base or data source resource. Note that the vector store itself is not deleted if you delete a knowledge base or data source resource.
Type: String
Valid Values: RETAIN | DELETE
Required: No
- dataSourceConfiguration
-
The connection configuration for the data source.
Type: DataSourceConfiguration object
Required: Yes
- description
-
A description of the data source.
Type: String
Length Constraints: Minimum length of 1. Maximum length of 200.
Required: No
- name
-
The name of the data source.
Type: String
Pattern: ^([0-9a-zA-Z][_-]?){1,100}$
Required: Yes
- serverSideEncryptionConfiguration
-
Contains details about the server-side encryption for the data source.
Type: ServerSideEncryptionConfiguration object
Required: No
- vectorIngestionConfiguration
-
Contains details about how to ingest the documents in the data source.
Type: VectorIngestionConfiguration object
Required: No
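The constraints listed for the request body can be checked on the client side before the request is sent. The following is a minimal sketch under that assumption: the patterns, length limits, and valid values are copied from the field descriptions above, while `new_client_token` and `validate_body` are hypothetical helpers and not part of any SDK. It assumes string-valued fields and does not inspect the nested configuration objects.

```python
import re
import uuid

CLIENT_TOKEN_PATTERN = re.compile(r"^[a-zA-Z0-9](-*[a-zA-Z0-9]){0,256}$")
NAME_PATTERN = re.compile(r"^([0-9a-zA-Z][_-]?){1,100}$")
DELETION_POLICIES = {"RETAIN", "DELETE"}


def new_client_token():
    """Return a fresh idempotency token (a 36-character UUID string)."""
    return str(uuid.uuid4())


def validate_body(body):
    """Return a list of problems found in a request body (empty if none)."""
    problems = []
    for field in ("name", "dataSourceConfiguration"):
        if field not in body:
            problems.append(f"{field} is required")

    name = body.get("name")
    if name is not None and not NAME_PATTERN.fullmatch(name):
        problems.append("name does not match the documented pattern")

    token = body.get("clientToken")
    if token is not None:
        if not 33 <= len(token) <= 256:
            problems.append("clientToken must be 33 to 256 characters long")
        elif not CLIENT_TOKEN_PATTERN.fullmatch(token):
            problems.append("clientToken does not match the documented pattern")

    description = body.get("description")
    if description is not None and not 1 <= len(description) <= 200:
        problems.append("description must be 1 to 200 characters long")

    policy = body.get("dataDeletionPolicy")
    if policy is not None and policy not in DELETION_POLICIES:
        problems.append("dataDeletionPolicy must be RETAIN or DELETE")

    return problems


# Hypothetical request body, for illustration only.
example = {
    "name": "example-docs",
    "dataSourceConfiguration": {},
    "clientToken": new_client_token(),
    "dataDeletionPolicy": "RETAIN",
}
print(validate_body(example))
```

A UUID string is a convenient token here because it is 36 characters long, which satisfies the 33-character minimum, and it contains only letters, digits, and single hyphens. Reusing the same token when resending a request is what makes the call idempotent.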
Response Syntax
HTTP/1.1 200
Content-type: application/json
{
"dataSource": {
"createdAt": "string",
"dataDeletionPolicy": "string",
"dataSourceConfiguration": {
"confluenceConfiguration": {
"crawlerConfiguration": {
"filterConfiguration": {
"patternObjectFilter": {
"filters": [
{
"exclusionFilters": [ "string" ],
"inclusionFilters": [ "string" ],
"objectType": "string"
}
]
},
"type": "string"
}
},
"sourceConfiguration": {
"authType": "string",
"credentialsSecretArn": "string",
"hostType": "string",
"hostUrl": "string"
}
},
"s3Configuration": {
"bucketArn": "string",
"bucketOwnerAccountId": "string",
"inclusionPrefixes": [ "string" ]
},
"salesforceConfiguration": {
"crawlerConfiguration": {
"filterConfiguration": {
"patternObjectFilter": {
"filters": [
{
"exclusionFilters": [ "string" ],
"inclusionFilters": [ "string" ],
"objectType": "string"
}
]
},
"type": "string"
}
},
"sourceConfiguration": {
"authType": "string",
"credentialsSecretArn": "string",
"hostUrl": "string"
}
},
"sharePointConfiguration": {
"crawlerConfiguration": {
"filterConfiguration": {
"patternObjectFilter": {
"filters": [
{
"exclusionFilters": [ "string" ],
"inclusionFilters": [ "string" ],
"objectType": "string"
}
]
},
"type": "string"
}
},
"sourceConfiguration": {
"authType": "string",
"credentialsSecretArn": "string",
"domain": "string",
"hostType": "string",
"siteUrls": [ "string" ],
"tenantId": "string"
}
},
"type": "string",
"webConfiguration": {
"crawlerConfiguration": {
"crawlerLimits": {
"rateLimit": number
},
"exclusionFilters": [ "string" ],
"inclusionFilters": [ "string" ],
"scope": "string"
},
"sourceConfiguration": {
"urlConfiguration": {
"seedUrls": [
{
"url": "string"
}
]
}
}
}
},
"dataSourceId": "string",
"description": "string",
"failureReasons": [ "string" ],
"knowledgeBaseId": "string",
"name": "string",
"serverSideEncryptionConfiguration": {
"kmsKeyArn": "string"
},
"status": "string",
"updatedAt": "string",
"vectorIngestionConfiguration": {
"chunkingConfiguration": {
"chunkingStrategy": "string",
"fixedSizeChunkingConfiguration": {
"maxTokens": number,
"overlapPercentage": number
},
"hierarchicalChunkingConfiguration": {
"levelConfigurations": [
{
"maxTokens": number
}
],
"overlapTokens": number
},
"semanticChunkingConfiguration": {
"breakpointPercentileThreshold": number,
"bufferSize": number,
"maxTokens": number
}
},
"customTransformationConfiguration": {
"intermediateStorage": {
"s3Location": {
"uri": "string"
}
},
"transformations": [
{
"stepToApply": "string",
"transformationFunction": {
"transformationLambdaConfiguration": {
"lambdaArn": "string"
}
}
}
]
},
"parsingConfiguration": {
"bedrockFoundationModelConfiguration": {
"modelArn": "string",
"parsingPrompt": {
"parsingPromptText": "string"
}
},
"parsingStrategy": "string"
}
}
}
}
Response Elements
If the action is successful, the service sends back an HTTP 200 response.
The following data is returned in JSON format by the service.
- dataSource
-
Contains details about the data source.
Type: DataSource object
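To show how a caller might read the returned `dataSource` object, the sketch below parses a sample response and pulls out a few fields named in the response syntax. The payload is invented for the example: the identifiers, the name, and the status value `"AVAILABLE"` are assumptions, and `summarize_data_source` is a hypothetical helper.

```python
import json

# Hypothetical response payload; every value here is made up.
SAMPLE_RESPONSE = """
{
  "dataSource": {
    "dataSourceId": "DS12345678",
    "knowledgeBaseId": "KB12345678",
    "name": "example-docs",
    "status": "AVAILABLE",
    "failureReasons": []
  }
}
"""


def summarize_data_source(response_text):
    """Extract the fields a caller usually needs from a 200 response."""
    data_source = json.loads(response_text)["dataSource"]
    return {
        "id": data_source["dataSourceId"],
        "knowledge_base_id": data_source["knowledgeBaseId"],
        "status": data_source["status"],
        # Treat a missing failureReasons field as an empty list.
        "failure_reasons": data_source.get("failureReasons", []),
    }


summary = summarize_data_source(SAMPLE_RESPONSE)
print(summary)
```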
Errors
For information about the errors that are common to all actions, see Common Errors.
- AccessDeniedException
-
The request is denied because of missing access permissions.
HTTP Status Code: 403
- ConflictException
-
There was a conflict performing an operation.
HTTP Status Code: 409
- InternalServerException
-
An internal server error occurred. Retry your request.
HTTP Status Code: 500
- ResourceNotFoundException
-
The resource with the specified Amazon Resource Name (ARN) was not found. Check the ARN and try your request again.
HTTP Status Code: 404
- ServiceQuotaExceededException
-
The number of requests exceeds the service quota. Resubmit your request later.
HTTP Status Code: 402
- ThrottlingException
-
The number of requests exceeds the limit. Resubmit your request later.
HTTP Status Code: 429
- ValidationException
-
Input validation failed. Check your request parameters and retry the request.
HTTP Status Code: 400
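One way to act on the errors above is to separate those whose descriptions suggest retrying or resubmitting later from those that need a change to the request. The sketch below encodes that reading. The status codes are copied from this page, but the retryable set, the helper names, and the backoff values are assumptions, not documented service behavior.

```python
# Status codes as listed in the Errors section.
ERROR_STATUS_CODES = {
    "AccessDeniedException": 403,
    "ConflictException": 409,
    "InternalServerException": 500,
    "ResourceNotFoundException": 404,
    "ServiceQuotaExceededException": 402,
    "ThrottlingException": 429,
    "ValidationException": 400,
}

# Assumption: errors whose description says "Retry your request" or
# "Resubmit your request later" may succeed if sent again unchanged.
RETRYABLE_ERRORS = {
    "InternalServerException",
    "ServiceQuotaExceededException",
    "ThrottlingException",
}


def should_retry(error_name):
    """Return True if resending the same request is worth attempting."""
    return error_name in RETRYABLE_ERRORS


def backoff_delays(attempts, base=1.0, cap=30.0):
    """Return exponentially growing delays in seconds, capped at `cap`."""
    return [min(cap, base * 2 ** attempt) for attempt in range(attempts)]


for name in sorted(ERROR_STATUS_CODES):
    print(name, ERROR_STATUS_CODES[name], should_retry(name))
print(backoff_delays(4))
```

A ValidationException is left out of the retryable set because its description asks the caller to check the request parameters first, so resending the same body is unlikely to help.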
See Also
For more information about using this API in one of the language-specific AWS SDKs, see the following: