
Receipts and invoices are documents that are critical to small and medium businesses (SMBs), startups, and enterprises for managing their accounts payable processes. These types of documents are difficult to process at scale because they follow no set design rules, yet any individual customer encounters thousands of distinct types of these documents.

Amazon Textract uses machine learning (ML) to understand the context of invoices and receipts, and automatically extracts specific information like vendor name, price, and payment terms. In this post, we show how you can use Amazon Textract’s new Analyze Expense API to extract line item details in addition to key-value pairs from invoices and receipts, which is a frequent request we hear from customers. In this post, we walk you through processing an invoice/receipt using Amazon Textract and extracting a set of fields and line-item details. While AWS takes care of building, training, and deploying advanced ML models in a highly available and scalable environment, you take advantage of these models with simple-to-use API actions.

We cover the following topics in this post:

- How Amazon Textract processes invoices and receipts.
- A walkthrough of the Amazon Textract console.
- Anatomy of the Amazon Textract AnalyzeExpense API response.
- How to process the response with the Amazon Textract parser library.

Sample solution architecture to automate invoice and receipts processing.

Invoice and receipt processing using Amazon Textract

SMBs, startups, and enterprises process paper-based invoices and receipts as part of their accounts payable process to reconcile their goods received and for auditing purposes. Employees who submit expense reports also submit scans or images of the associated receipts. Companies try to standardize electronic invoicing, but some vendors only offer paper invoices, and some countries legally require paper invoices.

The peculiarities of invoices and receipts mean it’s also a difficult problem to solve at scale: invoices and receipts all look different, because each vendor designs its own documents independently. The labels are imperfect and inconsistent. Vendor name is often not explicitly labeled and has to be interpreted based on context. Other important information such as customer number, customer ID, or account ID are labeled differently from document to document.

To solve this problem, you can use Amazon Textract to process invoices and receipts at scale. Amazon Textract works with any style of invoice or receipt, no templates or configuration required, and extracts relevant data that can be tricky to extract such as contact information, items purchased, and vendor name from those documents. That includes the line-item details, not just the headline amounts.

Amazon Textract also identifies vendor names that are critical for your workflows but may not be explicitly labeled. For example, Amazon Textract can find the vendor name on a receipt even if it’s only indicated within a logo at the top of the page without an explicit key-value pair combination.

Amazon Textract also makes it easy to consolidate input from diverse receipts and invoices. Different documents use different words for the same concept. For example, Amazon Textract maps relationships between field names in different documents such as customer no., customer number, and account ID, and outputs standard taxonomy (in this case, INVOICE_RECEIPT_ID), thereby representing data consistently across document types.

Amazon Textract console walkthrough

Before we get started with the API and code samples, let’s review the Amazon Textract console. The following images show examples of both an invoice and a receipt document on the Analyze Expense output tab of the Amazon Textract console.

Amazon Textract automatically detects the vendor name, invoice number, ship to address, and more from the sample invoice and displays them on the Summary Fields tab. It also represents the standard taxonomy of fields in brackets next to the actual value on the document. For example, it identifies “INVOICE #” as the standard field INVOICE_RECEIPT_ID.

Additionally, Amazon Textract detects the items purchased and displays them on the Line Item Fields tab.

The following is a similar example of a receipt. Amazon Textract detects “Whole Foods Market” as VENDOR_NAME even though the receipt doesn’t explicitly mention it as the vendor name.

The Amazon Textract AnalyzeExpense API response

In this section, we explain the AnalyzeExpense API response structure using sample images.

AnalyzeExpense JSON response of SummaryFields:

The following is a sample receipt and the corresponding AnalyzeExpense response JSON. The AnalyzeExpense JSON output contains ExpenseDocuments, and each ExpenseDocument contains SummaryFields and LineItemGroups.
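
To make the response layout described in this post more concrete (ExpenseDocuments, each holding SummaryFields and LineItemGroups), here is a minimal Python sketch that walks such a response using plain dictionaries. It does not use the Amazon Textract parser library. The `SAMPLE_RESPONSE` below is a hand-written, heavily trimmed stand-in and not real Textract output: the invoice number and the line item values are made up for illustration, and `summarize_expense` is a hypothetical helper name. A real response carries more keys per field (confidence scores, geometry, page numbers) than shown here.

```python
from typing import Any, Dict, List

# Hand-written, trimmed stand-in for an AnalyzeExpense response (illustrative
# values only). A real response would typically come from something like:
#   boto3.client("textract").analyze_expense(Document={"Bytes": image_bytes})
SAMPLE_RESPONSE: Dict[str, Any] = {
    "ExpenseDocuments": [
        {
            "SummaryFields": [
                {
                    "Type": {"Text": "VENDOR_NAME"},
                    "ValueDetection": {"Text": "Whole Foods Market"},
                },
                {
                    "Type": {"Text": "INVOICE_RECEIPT_ID"},
                    "LabelDetection": {"Text": "INVOICE #"},
                    "ValueDetection": {"Text": "12345"},  # made-up value
                },
            ],
            "LineItemGroups": [
                {
                    "LineItems": [
                        {
                            "LineItemExpenseFields": [
                                {"Type": {"Text": "ITEM"},
                                 "ValueDetection": {"Text": "Example item"}},
                                {"Type": {"Text": "PRICE"},
                                 "ValueDetection": {"Text": "1.00"}},
                            ]
                        }
                    ]
                }
            ],
        }
    ]
}


def _type_and_value(field: Dict[str, Any]) -> tuple:
    """Return (standard type, detected value text) for one expense field."""
    field_type = field.get("Type", {}).get("Text", "OTHER")
    value = field.get("ValueDetection", {}).get("Text", "")
    return field_type, value


def summarize_expense(response: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Flatten each ExpenseDocument into summary fields and line-item rows."""
    results = []
    for doc in response.get("ExpenseDocuments", []):
        summary: Dict[str, str] = {}
        for field in doc.get("SummaryFields", []):
            field_type, value = _type_and_value(field)
            # Simplification: keep only the first value seen per standard type.
            summary.setdefault(field_type, value)

        line_items: List[Dict[str, str]] = []
        for group in doc.get("LineItemGroups", []):
            for item in group.get("LineItems", []):
                row = dict(
                    _type_and_value(f)
                    for f in item.get("LineItemExpenseFields", [])
                )
                line_items.append(row)

        results.append({"summary": summary, "line_items": line_items})
    return results


if __name__ == "__main__":
    for document in summarize_expense(SAMPLE_RESPONSE):
        print(document["summary"])
        for row in document["line_items"]:
            print("  ", row)
```

Keying the summary by the standard type (for example VENDOR_NAME or INVOICE_RECEIPT_ID) rather than by the label printed on the document is what lets downstream code treat differently worded invoices and receipts uniformly. If a document yields several fields of the same type, a production version would likely collect them in a list instead of keeping only the first.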
