Running asynchronous jobs

After you train a custom classifier, you can use asynchronous jobs to analyze large documents or multiple documents in one batch.

Custom classification accepts a variety of input document types. For details, see Inputs for asynchronous custom analysis.

If you plan to analyze image files or scanned PDF documents, your IAM policy must grant permissions to use two Amazon Textract API methods (DetectDocumentText and AnalyzeDocument). Amazon Comprehend invokes these methods during text extraction. For an example policy, see Permissions required to perform document analysis actions.
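As a rough illustration of the permissions described above, the following Python sketch builds an IAM policy document that allows the two Amazon Textract actions. The helper name build_textract_policy is hypothetical, and the wildcard resource is an assumption made for illustration only; treat the example policy referenced above as the authoritative version.

```python
import json


def build_textract_policy() -> dict:
    """Return an IAM policy document allowing the two Textract actions.

    Hypothetical helper for illustration; not part of any AWS SDK.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "textract:DetectDocumentText",
                    "textract:AnalyzeDocument",
                ],
                # Assumption: a wildcard resource is used for illustration.
                "Resource": "*",
            }
        ],
    }


if __name__ == "__main__":
    # Print the policy as JSON so it can be reviewed before attaching it.
    print(json.dumps(build_textract_policy(), indent=2))
```

The resulting JSON would be attached to the IAM identity that submits the analysis job, alongside whatever Amazon Comprehend and Amazon S3 permissions the job itself needs.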

To classify semi-structured documents (image, PDF, or Docx files) with a plain-text model, use the one document per file input format and include the DocumentReaderConfig parameter in your StartDocumentClassificationJob request.
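The request described above might look roughly like the following Python sketch, which uses the boto3 Comprehend client. The helper names, the ARNs, and the S3 URIs are hypothetical placeholders, and the DocumentReaderConfig values shown are one plausible choice rather than a recommendation.

```python
def build_classification_job_request(
    job_name: str,
    classifier_arn: str,
    input_s3_uri: str,
    output_s3_uri: str,
    role_arn: str,
) -> dict:
    """Build keyword arguments for start_document_classification_job.

    Hypothetical helper for illustration; not part of boto3.
    """
    return {
        "JobName": job_name,
        "DocumentClassifierArn": classifier_arn,
        "DataAccessRoleArn": role_arn,
        "InputDataConfig": {
            "S3Uri": input_s3_uri,
            # One document per file, as required for semi-structured input.
            "InputFormat": "ONE_DOC_PER_FILE",
            # Assumption: these reader settings are chosen for illustration.
            "DocumentReaderConfig": {
                "DocumentReadAction": "TEXTRACT_DETECT_DOCUMENT_TEXT",
                "DocumentReadMode": "SERVICE_DEFAULT",
            },
        },
        "OutputDataConfig": {"S3Uri": output_s3_uri},
    }


def start_job(request: dict) -> str:
    """Submit the job and return its ID (requires AWS credentials)."""
    import boto3  # imported lazily so building the request needs no SDK

    client = boto3.client("comprehend")
    response = client.start_document_classification_job(**request)
    return response["JobId"]


if __name__ == "__main__":
    # All identifiers below are placeholders, not real resources.
    request = build_classification_job_request(
        job_name="example-classification-job",
        classifier_arn="arn:aws:comprehend:us-east-1:111122223333:document-classifier/example",
        input_s3_uri="s3://amzn-s3-demo-bucket/input/",
        output_s3_uri="s3://amzn-s3-demo-bucket/output/",
        role_arn="arn:aws:iam::111122223333:role/example-comprehend-role",
    )
    print(request)
```

Keeping the request construction separate from the API call makes it possible to inspect the parameters before submitting the job; calling start_job(request) would submit it with the credentials configured in your environment.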
