DatasetAugmentedManifestsListItem
An augmented manifest file that provides training data for your custom model. An augmented manifest file is a labeled dataset that is produced by Amazon SageMaker Ground Truth.
Contents
- AttributeNames
-
The JSON attribute that contains the annotations for your training documents. The number of attribute names that you specify depends on whether your augmented manifest file is the output of a single labeling job or a chained labeling job.
If your file is the output of a single labeling job, specify the LabelAttributeName key that was used when the job was created in Ground Truth.
If your file is the output of a chained labeling job, specify the LabelAttributeName key for one or more jobs in the chain. Each LabelAttributeName key provides the annotations from an individual job.
Type: Array of strings
Length Constraints: Minimum length of 1. Maximum length of 63.
Pattern:
^[a-zA-Z0-9](-*[a-zA-Z0-9])*
Required: Yes
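The length and pattern constraints above can be checked on the client side before a request is sent. The following is a minimal sketch, not part of the API: the helper name is_valid_attribute_name is hypothetical, and it assumes the pattern is meant to match the whole string (the documented pattern has no end anchor, so the service may be more lenient).

```python
import re

# Pattern and length limits copied from the AttributeNames field description.
ATTRIBUTE_NAME_PATTERN = re.compile(r"^[a-zA-Z0-9](-*[a-zA-Z0-9])*")
MIN_LENGTH, MAX_LENGTH = 1, 63


def is_valid_attribute_name(name: str) -> bool:
    """Return True if name satisfies the documented length and pattern.

    Hypothetical helper for illustration only.
    """
    if not MIN_LENGTH <= len(name) <= MAX_LENGTH:
        return False
    # Assumption: the whole string must match, so fullmatch is used
    # instead of match. This is stricter than the unanchored pattern.
    return ATTRIBUTE_NAME_PATTERN.fullmatch(name) is not None
```

Under this strict reading, a name such as my-labeling-job passes, while names with a leading hyphen, a trailing hyphen, or an underscore are rejected.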
- S3Uri
-
The Amazon S3 location of the augmented manifest file.
Type: String
Length Constraints: Maximum length of 1024.
Pattern:
s3://[a-z0-9][\.\-a-z0-9]{1,61}[a-z0-9](/.*)?
Required: Yes
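The same S3 URI pattern and length limit apply to S3Uri, AnnotationDataS3Uri, and SourceDocumentsS3Uri. As a rough sketch, a client-side check might look like the following. The helper name is_valid_s3_uri is hypothetical, and whole-string matching is an assumption, since the documented pattern is not anchored.

```python
import re

# Pattern and maximum length copied from the S3Uri field description.
S3_URI_PATTERN = re.compile(r"s3://[a-z0-9][\.\-a-z0-9]{1,61}[a-z0-9](/.*)?")
MAX_S3_URI_LENGTH = 1024


def is_valid_s3_uri(uri: str) -> bool:
    """Return True if uri satisfies the documented length and pattern.

    Hypothetical helper for illustration only.
    """
    if len(uri) > MAX_S3_URI_LENGTH:
        return False
    # Assumption: the whole string must match the pattern.
    return S3_URI_PATTERN.fullmatch(uri) is not None
```

Note that the pattern requires a bucket name of at least three characters made of lowercase letters, digits, periods, and hyphens, optionally followed by a key or prefix.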
- AnnotationDataS3Uri
-
The S3 prefix to the annotation files that are referred to in the augmented manifest file.
Type: String
Length Constraints: Maximum length of 1024.
Pattern:
s3://[a-z0-9][\.\-a-z0-9]{1,61}[a-z0-9](/.*)?
Required: No
- DocumentType
-
The type of augmented manifest. If you don't specify a value, the default is PLAIN_TEXT_DOCUMENT.
PLAIN_TEXT_DOCUMENT
A document type that represents any Unicode text that is encoded in UTF-8.
Type: String
Valid Values:
PLAIN_TEXT_DOCUMENT | SEMI_STRUCTURED_DOCUMENT
Required: No
- SourceDocumentsS3Uri
-
The S3 prefix to the source files (PDFs) that are referred to in the augmented manifest file.
Type: String
Length Constraints: Maximum length of 1024.
Pattern:
s3://[a-z0-9][\.\-a-z0-9]{1,61}[a-z0-9](/.*)?
Required: No
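To show how the fields above fit together, the following sketch assembles one DatasetAugmentedManifestsListItem as a plain Python dictionary, including only the optional fields that are set. The function name build_augmented_manifest_item and the bucket and job names in the usage note are hypothetical. The sketch checks only required fields and the DocumentType values listed above, and it makes no call to AWS.

```python
from typing import Dict, List, Optional

# Valid values copied from the DocumentType field description.
VALID_DOCUMENT_TYPES = {"PLAIN_TEXT_DOCUMENT", "SEMI_STRUCTURED_DOCUMENT"}


def build_augmented_manifest_item(
    attribute_names: List[str],
    s3_uri: str,
    document_type: Optional[str] = None,
    annotation_data_s3_uri: Optional[str] = None,
    source_documents_s3_uri: Optional[str] = None,
) -> Dict[str, object]:
    """Assemble a DatasetAugmentedManifestsListItem-shaped dictionary.

    Hypothetical helper for illustration only.
    """
    # AttributeNames and S3Uri are the two required fields.
    if not attribute_names:
        raise ValueError("AttributeNames requires at least one name")
    if not s3_uri:
        raise ValueError("S3Uri is required")
    if document_type is not None and document_type not in VALID_DOCUMENT_TYPES:
        raise ValueError(f"Unsupported DocumentType: {document_type}")

    item: Dict[str, object] = {
        "AttributeNames": list(attribute_names),
        "S3Uri": s3_uri,
    }
    optional_fields = {
        "DocumentType": document_type,
        "AnnotationDataS3Uri": annotation_data_s3_uri,
        "SourceDocumentsS3Uri": source_documents_s3_uri,
    }
    # Omit optional fields that were not provided.
    item.update({k: v for k, v in optional_fields.items() if v is not None})
    return item
```

For example, calling build_augmented_manifest_item(["my-labeling-job"], "s3://amzn-s3-demo-bucket/output.manifest") returns a dictionary that contains only AttributeNames and S3Uri. DocumentType is left out in that case, so the documented default applies.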