文件 AWS SDK AWS 範例 SDK 儲存庫中有更多可用的
本文為英文版的機器翻譯版本,如內容有任何歧義或不一致之處,概以英文版為準。
StartDocumentTextDetection
搭配 a AWS SDK 或 CLI 使用
下列程式碼範例示範如何使用 StartDocumentTextDetection
。
- CLI
-
- AWS CLI
-
若要開始偵測多頁文件中的文字
下列
start-document-text-detection
範例示範如何啟動多頁文件中文字的非同步偵測。Linux/macOS:
aws textract start-document-text-detection \ --document-location '
{"S3Object":{"Bucket":"bucket","Name":"document"}}
' \ --notification-channel"SNSTopicArn=arn:snsTopic,RoleArn=roleARN"
Windows:
aws textract start-document-text-detection \ --document-location "{\"S3Object\":{\"Bucket\":\"bucket\",\"Name\":\"document\"}}" \ --region
region-name
\ --notification-channel"SNSTopicArn=arn:snsTopic,RoleArn=roleArn"
輸出:
{ "JobId": "57849a3dc627d4df74123dca269d69f7b89329c870c65bb16c9fd63409d200b9" }
如需詳細資訊,請參閱 Amazon Textract 開發人員指南中的偵測和分析多頁文件中的文字
-
如需 API 詳細資訊,請參閱 AWS CLI 命令參考中的 StartDocumentTextDetection
。
-
- Python
-
- SDK for Python (Boto3)
-
注意
還有更多 on GitHub。尋找完整範例,並了解如何在 AWS 程式碼範例儲存庫
中設定和執行。 啟動非同步任務以偵測文件中的文字。
class TextractWrapper: """Encapsulates Textract functions.""" def __init__(self, textract_client, s3_resource, sqs_resource): """ :param textract_client: A Boto3 Textract client. :param s3_resource: A Boto3 Amazon S3 resource. :param sqs_resource: A Boto3 Amazon SQS resource. """ self.textract_client = textract_client self.s3_resource = s3_resource self.sqs_resource = sqs_resource def start_detection_job( self, bucket_name, document_file_name, sns_topic_arn, sns_role_arn ): """ Starts an asynchronous job to detect text elements in an image stored in an Amazon S3 bucket. Textract publishes a notification to the specified Amazon SNS topic when the job completes. The image must be in PNG, JPG, or PDF format. :param bucket_name: The name of the Amazon S3 bucket that contains the image. :param document_file_name: The name of the document image stored in Amazon S3. :param sns_topic_arn: The Amazon Resource Name (ARN) of an Amazon SNS topic where the job completion notification is published. :param sns_role_arn: The ARN of an AWS Identity and Access Management (IAM) role that can be assumed by Textract and grants permission to publish to the Amazon SNS topic. :return: The ID of the job. """ try: response = self.textract_client.start_document_text_detection( DocumentLocation={ "S3Object": {"Bucket": bucket_name, "Name": document_file_name} }, NotificationChannel={ "SNSTopicArn": sns_topic_arn, "RoleArn": sns_role_arn, }, ) job_id = response["JobId"] logger.info( "Started text detection job %s on %s.", job_id, document_file_name ) except ClientError: logger.exception("Couldn't detect text in %s.", document_file_name) raise else: return job_id
-
如需 API 詳細資訊,請參閱 StartDocumentTextDetection AWS SDK for Python (Boto3) Word 參考中的 API。
-
- SAP ABAP
-
- SDK for SAP ABAP
-
注意
還有更多 on GitHub。尋找完整範例,並了解如何在 AWS 程式碼範例儲存庫
中設定和執行。 "Starts the asynchronous detection of text in a document." "Amazon Textract can detect lines of text and the words that make up a line of text." "Create an ABAP object for the Amazon S3 object." DATA(lo_s3object) = NEW /aws1/cl_texs3object( iv_bucket = iv_s3bucket iv_name = iv_s3object ). "Create an ABAP object for the document." DATA(lo_documentlocation) = NEW /aws1/cl_texdocumentlocation( io_s3object = lo_s3object ). "Start document analysis." TRY. oo_result = lo_tex->startdocumenttextdetection( io_documentlocation = lo_documentlocation ). DATA(lv_jobid) = oo_result->get_jobid( ). "oo_result is returned for testing purposes." MESSAGE 'Document analysis started.' TYPE 'I'. CATCH /aws1/cx_texaccessdeniedex. MESSAGE 'You do not have permission to perform this action.' TYPE 'E'. CATCH /aws1/cx_texbaddocumentex. MESSAGE 'Amazon Textract is not able to read the document.' TYPE 'E'. CATCH /aws1/cx_texdocumenttoolargeex. MESSAGE 'The document is too large.' TYPE 'E'. CATCH /aws1/cx_texidempotentprmmis00. MESSAGE 'Idempotent parameter mismatch exception.' TYPE 'E'. CATCH /aws1/cx_texinternalservererr. MESSAGE 'Internal server error.' TYPE 'E'. CATCH /aws1/cx_texinvalidkmskeyex. MESSAGE 'AWS KMS key is not valid.' TYPE 'E'. CATCH /aws1/cx_texinvalidparameterex. MESSAGE 'Request has non-valid parameters.' TYPE 'E'. CATCH /aws1/cx_texinvalids3objectex. MESSAGE 'Amazon S3 object is not valid.' TYPE 'E'. CATCH /aws1/cx_texlimitexceededex. MESSAGE 'An Amazon Textract service limit was exceeded.' TYPE 'E'. CATCH /aws1/cx_texprovthruputexcdex. MESSAGE 'Provisioned throughput exceeded limit.' TYPE 'E'. CATCH /aws1/cx_texthrottlingex. MESSAGE 'The request processing exceeded the limit.' TYPE 'E'. CATCH /aws1/cx_texunsupporteddocex. MESSAGE 'The document is not supported.' TYPE 'E'. ENDTRY.
-
如需 API 詳細資訊,請參閱 StartDocumentTextDetection for Word Word 參考中的 Word。 AWS SDK SAP ABAP API
-
StartDocumentAnalysis
案例