GetDocumentAnalysis 搭配 a AWS SDK 或 CLI 使用 - AWS SDK 程式碼範例

文件 AWS SDK AWS 範例 SDK 儲存庫中有更多可用的 GitHub 範例。

本文為英文版的機器翻譯版本,如內容有任何歧義或不一致之處,概以英文版為準。

GetDocumentAnalysis 搭配 a AWS SDK 或 CLI 使用

下列程式碼範例示範如何使用 GetDocumentAnalysis

動作範例是大型程式的程式碼摘錄,必須在內容中執行。您可以在下列程式碼範例的內容中看到此動作:

CLI
AWS CLI

取得多頁文件的非同步文字分析結果

下列get-document-analysis範例顯示如何取得多頁文件的非同步文字分析結果。

aws textract get-document-analysis \ --job-id df7cf32ebbd2a5de113535fcf4d921926a701b09b4e7d089f3aebadb41e0712b \ --max-results 1000

輸出:

{ "Blocks": [ { "Geometry": { "BoundingBox": { "Width": 1.0, "Top": 0.0, "Left": 0.0, "Height": 1.0 }, "Polygon": [ { "Y": 0.0, "X": 0.0 }, { "Y": 0.0, "X": 1.0 }, { "Y": 1.0, "X": 1.0 }, { "Y": 1.0, "X": 0.0 } ] }, "Relationships": [ { "Type": "CHILD", "Ids": [ "75966e64-81c2-4540-9649-d66ec341cd8f", "bb099c24-8282-464c-a179-8a9fa0a057f0", "5ebf522d-f9e4-4dc7-bfae-a288dc094595" ] } ], "BlockType": "PAGE", "Id": "247c28ee-b63d-4aeb-9af0-5f7ea8ba109e", "Page": 1 } ], "NextToken": "cY1W3eTFvoB0cH7YrKVudI4Gb0H8J0xAYLo8xI/JunCIPWCthaKQ+07n/ElyutsSy0+1VOImoTRmP1zw4P0RFtaeV9Bzhnfedpx1YqwB4xaGDA==", "DocumentMetadata": { "Pages": 1 }, "JobStatus": "SUCCEEDED" }

如需詳細資訊,請參閱 Amazon Textract 開發人員指南中的偵測和分析多頁文件中的文字

Python
SDK for Python (Boto3)
注意

還有更多 on GitHub。尋找完整範例,並了解如何在 AWS 程式碼範例儲存庫中設定和執行。

class TextractWrapper: """Encapsulates Textract functions.""" def __init__(self, textract_client, s3_resource, sqs_resource): """ :param textract_client: A Boto3 Textract client. :param s3_resource: A Boto3 Amazon S3 resource. :param sqs_resource: A Boto3 Amazon SQS resource. """ self.textract_client = textract_client self.s3_resource = s3_resource self.sqs_resource = sqs_resource def get_analysis_job(self, job_id): """ Gets data for a previously started detection job that includes additional elements. :param job_id: The ID of the job to retrieve. :return: The job data, including a list of blocks that describe elements detected in the image. """ try: response = self.textract_client.get_document_analysis(JobId=job_id) job_status = response["JobStatus"] logger.info("Job %s status is %s.", job_id, job_status) except ClientError: logger.exception("Couldn't get data for job %s.", job_id) raise else: return response
  • 如需 API 詳細資訊,請參閱 GetDocumentAnalysis AWS SDK for Python (Boto3) Word 參考中的 API

SAP ABAP
SDK for SAP ABAP
注意

還有更多 on GitHub。尋找完整範例,並了解如何在 AWS 程式碼範例儲存庫中設定和執行。

"Gets the results for an Amazon Textract" "asynchronous operation that analyzes text in a document." TRY. oo_result = lo_tex->getdocumentanalysis( iv_jobid = iv_jobid ). "oo_result is returned for testing purposes." WHILE oo_result->get_jobstatus( ) <> 'SUCCEEDED'. IF sy-index = 10. EXIT. "Maximum 300 seconds. ENDIF. WAIT UP TO 30 SECONDS. oo_result = lo_tex->getdocumentanalysis( iv_jobid = iv_jobid ). ENDWHILE. DATA(lt_blocks) = oo_result->get_blocks( ). LOOP AT lt_blocks INTO DATA(lo_block). IF lo_block->get_text( ) = 'INGREDIENTS: POWDERED SUGAR* (CANE SUGAR,'. MESSAGE 'Found text in the doc: ' && lo_block->get_text( ) TYPE 'I'. ENDIF. ENDLOOP. MESSAGE 'Document analysis retrieved.' TYPE 'I'. CATCH /aws1/cx_texaccessdeniedex. MESSAGE 'You do not have permission to perform this action.' TYPE 'E'. CATCH /aws1/cx_texinternalservererr. MESSAGE 'Internal server error.' TYPE 'E'. CATCH /aws1/cx_texinvalidjobidex. MESSAGE 'Job ID is not valid.' TYPE 'E'. CATCH /aws1/cx_texinvalidkmskeyex. MESSAGE 'AWS KMS key is not valid.' TYPE 'E'. CATCH /aws1/cx_texinvalidparameterex. MESSAGE 'Request has non-valid parameters.' TYPE 'E'. CATCH /aws1/cx_texinvalids3objectex. MESSAGE 'Amazon S3 object is not valid.' TYPE 'E'. CATCH /aws1/cx_texprovthruputexcdex. MESSAGE 'Provisioned throughput exceeded limit.' TYPE 'E'. CATCH /aws1/cx_texthrottlingex. MESSAGE 'The request processing exceeded the limit.' TYPE 'E'. ENDTRY.