AnalyzeExpense
AnalyzeExpense synchronously analyzes an input document for financially related relationships between text.
Information is returned as ExpenseDocuments and separated as follows:
- LineItemGroups - A data set containing LineItems, which store information about the lines of text, such as an item purchased and its price on a receipt.
- SummaryFields - Contains all other information on a receipt, such as header information or the vendor's name.
Request Syntax
{
"Document": {
"Bytes": blob,
"S3Object": {
"Bucket": "string",
"Name": "string",
"Version": "string"
}
}
}
Request Parameters
The request accepts the following data in JSON format.
- Document
-
The input document, either as bytes or as an S3 object.
You pass image bytes to an Amazon Textract API operation by using the Bytes property. For example, you would use the Bytes property to pass a document loaded from a local file system. Image bytes passed by using the Bytes property must be base64 encoded. Your code might not need to encode document file bytes if you're using an AWS SDK to call Amazon Textract API operations.
You pass images stored in an S3 bucket to an Amazon Textract API operation by using the S3Object property. Documents stored in an S3 bucket don't need to be base64 encoded.
The AWS Region for the S3 bucket that contains the S3 object must match the AWS Region that you use for Amazon Textract operations.
If you use the AWS CLI to call Amazon Textract operations, passing image bytes using the Bytes property isn't supported. You must first upload the document to an Amazon S3 bucket, and then call the operation using the S3Object property.
For Amazon Textract to process an S3 object, the user must have permission to access the S3 object.
Type: Document object
Required: Yes
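As an illustration of the two ways to supply the Document parameter, the following minimal sketch builds the request body for either case. The helper name build_document and the bucket and file names are hypothetical, chosen only for this example. It assumes you are assembling the raw JSON request yourself, which is why the bytes are base64 encoded explicitly.

```python
import base64
import json


def build_document(image_bytes=None, bucket=None, name=None, version=None):
    """Build the Document request parameter (hypothetical helper).

    Pass image_bytes for a locally loaded file, or bucket and name
    (plus an optional version) for an object stored in Amazon S3.
    """
    if image_bytes is not None:
        # Raw JSON requests carry the bytes as base64 text.
        # An AWS SDK may perform this encoding for you.
        return {"Bytes": base64.b64encode(image_bytes).decode("ascii")}
    if bucket and name:
        s3_object = {"Bucket": bucket, "Name": name}
        if version:
            s3_object["Version"] = version
        # S3 objects are referenced by location and are not encoded.
        return {"S3Object": s3_object}
    raise ValueError("Supply either image_bytes or both bucket and name")


# Example: reference a document in S3 (illustrative names).
request_body = json.dumps(
    {"Document": build_document(bucket="my-bucket", name="receipt.png")}
)
```

If you call the operation through an AWS SDK instead, such as boto3's analyze_expense method on the Textract client, you would typically pass the unencoded bytes and let the SDK handle the encoding.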
Response Syntax
{
"DocumentMetadata": {
"Pages": number
},
"ExpenseDocuments": [
{
"Blocks": [
{
"BlockType": "string",
"ColumnIndex": number,
"ColumnSpan": number,
"Confidence": number,
"EntityTypes": [ "string" ],
"Geometry": {
"BoundingBox": {
"Height": number,
"Left": number,
"Top": number,
"Width": number
},
"Polygon": [
{
"X": number,
"Y": number
}
]
},
"Id": "string",
"Page": number,
"Query": {
"Alias": "string",
"Pages": [ "string" ],
"Text": "string"
},
"Relationships": [
{
"Ids": [ "string" ],
"Type": "string"
}
],
"RowIndex": number,
"RowSpan": number,
"SelectionStatus": "string",
"Text": "string",
"TextType": "string"
}
],
"ExpenseIndex": number,
"LineItemGroups": [
{
"LineItemGroupIndex": number,
"LineItems": [
{
"LineItemExpenseFields": [
{
"Currency": {
"Code": "string",
"Confidence": number
},
"GroupProperties": [
{
"Id": "string",
"Types": [ "string" ]
}
],
"LabelDetection": {
"Confidence": number,
"Geometry": {
"BoundingBox": {
"Height": number,
"Left": number,
"Top": number,
"Width": number
},
"Polygon": [
{
"X": number,
"Y": number
}
]
},
"Text": "string"
},
"PageNumber": number,
"Type": {
"Confidence": number,
"Text": "string"
},
"ValueDetection": {
"Confidence": number,
"Geometry": {
"BoundingBox": {
"Height": number,
"Left": number,
"Top": number,
"Width": number
},
"Polygon": [
{
"X": number,
"Y": number
}
]
},
"Text": "string"
}
}
]
}
]
}
],
"SummaryFields": [
{
"Currency": {
"Code": "string",
"Confidence": number
},
"GroupProperties": [
{
"Id": "string",
"Types": [ "string" ]
}
],
"LabelDetection": {
"Confidence": number,
"Geometry": {
"BoundingBox": {
"Height": number,
"Left": number,
"Top": number,
"Width": number
},
"Polygon": [
{
"X": number,
"Y": number
}
]
},
"Text": "string"
},
"PageNumber": number,
"Type": {
"Confidence": number,
"Text": "string"
},
"ValueDetection": {
"Confidence": number,
"Geometry": {
"BoundingBox": {
"Height": number,
"Left": number,
"Top": number,
"Width": number
},
"Polygon": [
{
"X": number,
"Y": number
}
]
},
"Text": "string"
}
}
]
}
]
}
Response Elements
If the action is successful, the service sends back an HTTP 200 response.
The following data is returned in JSON format by the service.
- DocumentMetadata
-
Information about the input document.
Type: DocumentMetadata object
- ExpenseDocuments
-
The expenses detected by Amazon Textract.
Type: Array of ExpenseDocument objects
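To show how the response structure fits together, here is a minimal sketch that walks ExpenseDocuments and collects the SummaryFields and LineItems into plain dictionaries. The function name summarize_expenses and the sample field values are assumptions made for this example. Only the key names come from the response syntax above.

```python
def summarize_expenses(response):
    """Flatten an AnalyzeExpense response into simple dictionaries.

    Each field is keyed by its Type text and mapped to its
    ValueDetection text. A repeated type overwrites the earlier value.
    """

    def field_pair(field):
        key = field.get("Type", {}).get("Text")
        value = field.get("ValueDetection", {}).get("Text")
        return key, value

    results = []
    for doc in response.get("ExpenseDocuments", []):
        summary = {}
        for field in doc.get("SummaryFields", []):
            key, value = field_pair(field)
            if key is not None:
                summary[key] = value

        line_items = []
        for group in doc.get("LineItemGroups", []):
            for item in group.get("LineItems", []):
                row = {}
                for field in item.get("LineItemExpenseFields", []):
                    key, value = field_pair(field)
                    if key is not None:
                        row[key] = value
                line_items.append(row)

        results.append(
            {
                "index": doc.get("ExpenseIndex"),
                "summary": summary,
                "line_items": line_items,
            }
        )
    return results


# Hand-written sample shaped like the response syntax (values are illustrative).
sample = {
    "DocumentMetadata": {"Pages": 1},
    "ExpenseDocuments": [
        {
            "ExpenseIndex": 1,
            "SummaryFields": [
                {"Type": {"Text": "VENDOR_NAME"}, "ValueDetection": {"Text": "Example Store"}}
            ],
            "LineItemGroups": [
                {
                    "LineItemGroupIndex": 1,
                    "LineItems": [
                        {
                            "LineItemExpenseFields": [
                                {"Type": {"Text": "ITEM"}, "ValueDetection": {"Text": "Coffee"}},
                                {"Type": {"Text": "PRICE"}, "ValueDetection": {"Text": "3.50"}},
                            ]
                        }
                    ],
                }
            ],
        }
    ],
}

expenses = summarize_expenses(sample)
```

A production parser would likely also keep the Confidence values and Geometry, which this sketch discards for brevity.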
Errors
- AccessDeniedException
-
You aren't authorized to perform the action. Use the Amazon Resource Name (ARN) of an authorized user or IAM role to perform the operation.
HTTP Status Code: 400
- BadDocumentException
-
Amazon Textract isn't able to read the document. For more information on the document limits in Amazon Textract, see Quotas in Amazon Textract.
HTTP Status Code: 400
- DocumentTooLargeException
-
The document can't be processed because it's too large. The maximum document size for synchronous operations is 10 MB. The maximum document size for asynchronous operations is 500 MB for PDF files.
HTTP Status Code: 400
- InternalServerError
-
Amazon Textract experienced a service issue. Try your call again.
HTTP Status Code: 500
- InvalidParameterException
-
An input parameter violated a constraint. For example, in synchronous operations, an InvalidParameterException occurs when neither the S3Object nor the Bytes value is supplied in the Document request parameter. Validate your parameter before calling the API operation again.
HTTP Status Code: 400
- InvalidS3ObjectException
-
Amazon Textract is unable to access the S3 object that's specified in the request. For more information, see Configure Access to Amazon S3. For troubleshooting information, see Troubleshooting Amazon S3.
HTTP Status Code: 400
- ProvisionedThroughputExceededException
-
The number of requests exceeded your throughput limit. If you want to increase this limit, contact Amazon Textract.
HTTP Status Code: 400
- ThrottlingException
-
Amazon Textract is temporarily unable to process the request. Try your call again.
HTTP Status Code: 500
- UnsupportedDocumentException
-
The format of the input document isn't supported. Documents for operations can be in PNG, JPEG, PDF, or TIFF format.
HTTP Status Code: 400
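Two of the errors above, InternalServerError and ThrottlingException, advise trying the call again. The following minimal sketch shows one way to retry only those errors with exponential backoff. The TextractError class, the call_with_retry function, and the attempt and delay defaults are all hypothetical, introduced for this example. They are not part of the Amazon Textract API.

```python
import time

# Error names whose descriptions above say "Try your call again".
RETRYABLE_ERRORS = {"InternalServerError", "ThrottlingException"}


class TextractError(Exception):
    """Hypothetical stand-in for an SDK error carrying the service error name."""

    def __init__(self, code):
        super().__init__(code)
        self.code = code


def call_with_retry(operation, max_attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call operation(), retrying retryable errors with exponential backoff.

    Non-retryable errors, and the final failed attempt, are re-raised.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except TextractError as err:
            if err.code not in RETRYABLE_ERRORS or attempt == max_attempts:
                raise
            # Wait base_delay, then 2x, 4x, ... before the next attempt.
            sleep(base_delay * 2 ** (attempt - 1))
```

With boto3, service errors are raised as botocore.exceptions.ClientError, and the error name is available in err.response["Error"]["Code"], so the same check could be adapted to that exception type. Note that AWS SDKs often provide their own configurable retry behavior, which may make a wrapper like this unnecessary.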
See Also
For more information about using this API in one of the language-specific AWS SDKs, see the following: