AWS::Wisdom::KnowledgeBase SemanticChunkingConfiguration
Settings for semantic document chunking for a data source. Semantic chunking splits a document into smaller documents based on groups of similar content derived from the text with natural language processing.
Syntax
To declare this entity in your AWS CloudFormation template, use the following syntax:
JSON
{
  "BreakpointPercentileThreshold" : Number,
  "BufferSize" : Number,
  "MaxTokens" : Number
}
YAML
BreakpointPercentileThreshold: Number
BufferSize: Number
MaxTokens: Number
Properties
BreakpointPercentileThreshold
The dissimilarity threshold, expressed as a percentile, at which text is split into separate chunks.
Required: Yes
Type: Number
Minimum: 50
Maximum: 99
Update requires: No interruption
BufferSize
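The buffer size: the number of neighboring sentences on each side of a sentence that are grouped with it when content is compared for similarity. A value of 0 compares sentences on their own; a value of 1 groups each sentence with the one before and the one after it.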
Required: Yes
Type: Number
Minimum: 0
Maximum: 1
Update requires: No interruption
MaxTokens
The maximum number of tokens that a chunk can contain.
Required: Yes
Type: Number
Minimum: 1
Update requires: No interruption
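Example
The following YAML fragment is a minimal sketch of how the three required properties might be set together. The enclosing key name SemanticChunkingConfiguration is an assumption about how a parent property refers to this type, and the values are illustrative picks from within the documented ranges, not recommended settings.

```yaml
# Illustrative fragment only. The parent key name is assumed, and the
# values are arbitrary choices that fall inside the documented ranges.
SemanticChunkingConfiguration:
  BreakpointPercentileThreshold: 95   # required; allowed range is 50 to 99
  BufferSize: 1                       # required; allowed range is 0 to 1
  MaxTokens: 300                      # required; minimum is 1
```

Because every property is marked Required: Yes, omitting any of the three would be expected to fail template validation.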