interface ParsingConfigurationProperty
Language | Type name |
---|---|
.NET | Amazon.CDK.aws_bedrock.CfnDataSource.ParsingConfigurationProperty |
Go | github.com/aws/aws-cdk-go/awscdk/v2/awsbedrock#CfnDataSource_ParsingConfigurationProperty |
Java | software.amazon.awscdk.services.bedrock.CfnDataSource.ParsingConfigurationProperty |
Python | aws_cdk.aws_bedrock.CfnDataSource.ParsingConfigurationProperty |
TypeScript | aws-cdk-lib » aws_bedrock » CfnDataSource » ParsingConfigurationProperty |
Settings for parsing document contents.
By default, the service converts the contents of each document into text before splitting it into chunks. To improve processing of PDF files with tables and images, you can configure the data source to convert the pages of text into images and use a model to describe the contents of each page.
To use a model to parse PDF documents, set the parsing strategy to BEDROCK_FOUNDATION_MODEL and specify the model or inference profile to use by ARN. You can also override the default parsing prompt with instructions for how to interpret images and tables in your documents. The following models are supported:
- Anthropic Claude 3 Sonnet: anthropic.claude-3-sonnet-20240229-v1:0
- Anthropic Claude 3 Haiku: anthropic.claude-3-haiku-20240307-v1:0
You can get the ARN of a model with the ListFoundationModels action. Standard model usage charges apply for the foundation model parsing strategy.
Example
// The code below shows an example of how to instantiate this type.
// The values are placeholders you should change.
import { aws_bedrock as bedrock } from 'aws-cdk-lib';

const parsingConfigurationProperty: bedrock.CfnDataSource.ParsingConfigurationProperty = {
  parsingStrategy: 'parsingStrategy',

  // the properties below are optional
  bedrockFoundationModelConfiguration: {
    modelArn: 'modelArn',

    // the properties below are optional
    parsingPrompt: {
      parsingPromptText: 'parsingPromptText',
    },
  },
};
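The placeholder values above can be made concrete. The following is a minimal sketch, not a definitive implementation, of a configuration that uses the BEDROCK_FOUNDATION_MODEL strategy with one of the supported models and a custom parsing prompt. To stay dependency-free, it declares local interfaces that mirror the shape shown in the example; in a CDK app you would use bedrock.CfnDataSource.ParsingConfigurationProperty instead. The helper names, the region, the prompt text, and the ARN layout (`arn:aws:bedrock:<region>::foundation-model/<model-id>`) are assumptions for illustration; confirm the actual ARN with the ListFoundationModels action.

```typescript
// Local mirrors of the property shapes shown in the example above (assumption:
// these stand in for bedrock.CfnDataSource.ParsingConfigurationProperty).
interface ParsingPrompt {
  parsingPromptText: string;
}

interface BedrockFoundationModelConfiguration {
  modelArn: string;
  parsingPrompt?: ParsingPrompt;
}

interface ParsingConfiguration {
  parsingStrategy: string;
  bedrockFoundationModelConfiguration?: BedrockFoundationModelConfiguration;
}

// Hypothetical helper: builds a foundation model ARN from a region and model ID.
// Assumes the "aws" partition and an ARN with no account ID segment.
function foundationModelArn(region: string, modelId: string): string {
  return `arn:aws:bedrock:${region}::foundation-model/${modelId}`;
}

// Hypothetical helper: returns a parsing configuration that uses a model.
function modelParsingConfiguration(
  modelArn: string,
  promptText?: string,
): ParsingConfiguration {
  return {
    parsingStrategy: 'BEDROCK_FOUNDATION_MODEL',
    bedrockFoundationModelConfiguration: {
      modelArn,
      // Include parsingPrompt only when an override is supplied, so the
      // default parsing prompt applies otherwise.
      ...(promptText !== undefined
        ? { parsingPrompt: { parsingPromptText: promptText } }
        : {}),
    },
  };
}

const parsingConfiguration = modelParsingConfiguration(
  foundationModelArn('us-east-1', 'anthropic.claude-3-haiku-20240307-v1:0'),
  'Describe every table row by row and summarize each image in one sentence.',
);
```

Omitting the second argument leaves parsingPrompt unset, which matches the optional marking on that property in the example above.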
Properties
Name | Type | Description |
---|---|---|
parsingStrategy | string | The parsing strategy for the data source. |
bedrockFoundationModelConfiguration? | IResolvable \| Bedrock | Settings for a foundation model used to parse documents for a data source. |
parsingStrategy
Type: string
The parsing strategy for the data source.
bedrockFoundationModelConfiguration?
Type: IResolvable | Bedrock (optional)
Settings for a foundation model used to parse documents for a data source.