AWS::Wisdom::KnowledgeBase ChunkingConfiguration
Details about how to chunk the documents in the data source. A chunk refers to an excerpt from a data source that is returned when the knowledge base that it belongs to is queried.
Syntax
To declare this entity in your AWS CloudFormation template, use the following syntax:
JSON
{
  "ChunkingStrategy" : String,
  "FixedSizeChunkingConfiguration" : FixedSizeChunkingConfiguration,
  "HierarchicalChunkingConfiguration" : HierarchicalChunkingConfiguration,
  "SemanticChunkingConfiguration" : SemanticChunkingConfiguration
}
YAML
ChunkingStrategy: String
FixedSizeChunkingConfiguration: FixedSizeChunkingConfiguration
HierarchicalChunkingConfiguration: HierarchicalChunkingConfiguration
SemanticChunkingConfiguration: SemanticChunkingConfiguration
Properties
ChunkingStrategy
The strategy the knowledge base uses to split your source data into chunks. A chunk is an excerpt from a data source that is returned when the knowledge base it belongs to is queried. The available options are listed under Allowed values. If you choose NONE, you may want to pre-process your files by splitting them so that each file corresponds to a chunk.
Required: Yes
Type: String
Allowed values: FIXED_SIZE | NONE | HIERARCHICAL | SEMANTIC
Update requires: No interruption
FixedSizeChunkingConfiguration
Configuration for fixed-size chunking. If you set ChunkingStrategy to NONE, exclude this field.
Required: No
Type: FixedSizeChunkingConfiguration
Update requires: No interruption
HierarchicalChunkingConfiguration
Settings for hierarchical document chunking for a data source. Hierarchical chunking splits documents into layers of chunks where the first layer contains large chunks, and the second layer contains smaller chunks derived from the first layer.
Required: No
Type: HierarchicalChunkingConfiguration
Update requires: No interruption
SemanticChunkingConfiguration
Settings for semantic document chunking for a data source. Semantic chunking splits a document into smaller documents based on groups of similar content derived from the text with natural language processing.
Required: No
Type: SemanticChunkingConfiguration
Update requires: No interruption
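The properties above can be combined as in the following sketch, which shows fixed-size chunking. It is an illustration under stated assumptions, not a complete template: the parent key ChunkingConfiguration, the sub-property names MaxTokens and OverlapPercentage, and the numeric values are not defined on this page and should be checked against the FixedSizeChunkingConfiguration reference before use.

```yaml
# Fragment only: the parent key name ChunkingConfiguration is an assumption.
ChunkingConfiguration:
  ChunkingStrategy: FIXED_SIZE
  FixedSizeChunkingConfiguration:
    # Assumed sub-property names and illustrative values.
    MaxTokens: 300
    OverlapPercentage: 20
```

If the files are already split so that each file corresponds to one chunk, the strategy can be set to NONE, and FixedSizeChunkingConfiguration is then left out:

```yaml
# Fragment only: no chunking is applied, so no strategy-specific block is given.
ChunkingConfiguration:
  ChunkingStrategy: NONE
```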