
Document Information Extraction Activities in SAP IRPA

source link: https://blogs.sap.com/2021/10/07/document-information-extraction-activities-in-sap-irpa/
October 7, 2021


Introduction:

This post walks through reading data from scanned or digital documents using the two activities below, which SAP provides as part of the irpa_pdf SDK library (version 1.15.83).

  1. Extract Data Without Template: The inputs are the document type and the path of the PDF document. The output is the data extracted using the standard schema of that document type.
  2. Extract Data With Template: The inputs are the document template and the document path. The output is the data extracted using either the standard schema or a custom schema.
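
To make the inputs of the two activities concrete, here is a minimal Python sketch that models them as plain data records. It is not the irpa_pdf SDK API: the class names, field names, and the `describe` helper are hypothetical and only mirror the inputs and outputs described in the list above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WithoutTemplateInput:
    """Hypothetical record for the activity 'Extract Data Without Template'."""
    document_type: str   # for example "Invoice"
    document_path: str   # path of the PDF document


@dataclass(frozen=True)
class WithTemplateInput:
    """Hypothetical record for the activity 'Extract Data With Template'."""
    template_name: str   # name of the document template
    document_path: str   # path of the document


def describe(activity_input) -> str:
    """Say which schema the extraction relies on, as described in the text."""
    if isinstance(activity_input, WithoutTemplateInput):
        return f"standard schema of document type {activity_input.document_type}"
    if isinstance(activity_input, WithTemplateInput):
        return f"standard or custom schema of template {activity_input.template_name}"
    raise TypeError("unsupported activity input")


print(describe(WithoutTemplateInput("Invoice", "invoice.pdf")))
print(describe(WithTemplateInput("vendor1", "invoice.pdf")))
```

The point of the sketch is only the difference in inputs: the first activity is driven by a document type, the second by a template.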

Prerequisites to understand before using the above activities:

  1. Currently SAP supports only 3 document types: Invoice, Purchase Order, and Payment Advice.
  2. For each document type SAP provides a standard schema that cannot be edited. For example, the schema for the document type Invoice is SAP_invoice_schema. A schema is the list of fields (header and item) used to identify the required information in the corresponding document, such as invoice number, total, subtotal, and tax.
  3. By copying the standard schema we can create a custom schema, add or delete fields as required, and then activate it.
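
The copy-then-edit idea behind custom schemas can be sketched in a few lines of Python. This is an illustration only, not how the Document Information Extraction service stores schemas: the dictionary layout, the field names, the name `my_invoice_schema`, and the helper functions are all assumptions made for the example.

```python
import copy

# Hypothetical, simplified stand-in for a read-only standard schema.
# The field names are illustrative, not the real contents of SAP_invoice_schema.
STANDARD_INVOICE_SCHEMA = {
    "name": "SAP_invoice_schema",
    "editable": False,
    "active": True,
    "header_fields": ["invoiceNumber", "total", "subtotal", "tax"],
    "item_fields": ["description", "quantity"],
}


def copy_schema(standard, new_name):
    """Return an editable, inactive deep copy; the standard schema is untouched."""
    custom = copy.deepcopy(standard)
    custom.update(name=new_name, editable=True, active=False)
    return custom


def add_header_field(schema, field):
    """Add a header field, refusing to modify a non-editable schema."""
    if not schema["editable"]:
        raise PermissionError(f"{schema['name']} cannot be edited")
    if field not in schema["header_fields"]:
        schema["header_fields"].append(field)


def remove_header_field(schema, field):
    """Remove a header field, refusing to modify a non-editable schema."""
    if not schema["editable"]:
        raise PermissionError(f"{schema['name']} cannot be edited")
    schema["header_fields"].remove(field)


# Copy the standard schema, adjust the fields, then activate the copy.
custom = copy_schema(STANDARD_INVOICE_SCHEMA, "my_invoice_schema")
add_header_field(custom, "purchaseOrderNumber")
remove_header_field(custom, "tax")
custom["active"] = True
print(custom["header_fields"])
```

The deep copy matters here: a shallow copy would share the field lists with the standard schema, so editing the custom schema would silently change the standard one as well.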

Steps to design an automation with Extract Data With Template:

Use this approach when the document layout is complex and the AI and ML models cannot determine the schema fields on their own. By annotating sample documents while creating the template (a maximum of 5 sample invoices can be uploaded for annotation), we show the service where each field is located. The next time an invoice from the same vendor arrives, the data is extracted using this template, which can bring the accuracy close to 100 percent for that vendor's layout.

  1. How to create a template?
  • After creating the automation project, select the artifact option to create a template.


  • Provide a meaningful name and description for the template, choose the document type as per your requirement, and select either the standard or a custom schema (here I am using the standard schema). Then provide the document path and click Create.
  • After this, open the document in the Document Information Extraction editor and annotate fields such as invoice number, PO number, total, and subtotal. Then save and activate the template so that it can be consumed in the automation.
  • In the automation, pass the template name (vendor1 here) and the path of an invoice from the same vendor with different data.
  • The bot can now use this template to retrieve the required data, which is printed in the console.

Conclusion: For invoices where the activity Extract Data Without Template does not return the required field information, use the activity Extract Data With Template, following the steps above.
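
The decision described in the conclusion can be expressed as a small fallback routine. The Python sketch below is an assumption-laden illustration: the two extractor functions are stubs standing in for the real activities, and the result format (a flat dictionary of field names to values) is hypothetical.

```python
from typing import Callable, Dict, Iterable, List

# Hypothetical extractor signature: document path in, field values out.
Extractor = Callable[[str], Dict[str, str]]


def missing_fields(result: Dict[str, str], required: Iterable[str]) -> List[str]:
    """Return the required fields that are absent or empty in the result."""
    return [name for name in required if not result.get(name)]


def extract_with_fallback(
    document_path: str,
    required: Iterable[str],
    without_template: Extractor,
    with_template: Extractor,
) -> Dict[str, str]:
    """Try the template-free extraction first; fall back to the template."""
    required = list(required)
    result = without_template(document_path)
    if not missing_fields(result, required):
        return result
    return with_template(document_path)


# Stub extractors with made-up values, standing in for the two activities.
def fake_without_template(path: str) -> Dict[str, str]:
    return {"invoiceNumber": "INV-1", "total": ""}


def fake_with_template(path: str) -> Dict[str, str]:
    return {"invoiceNumber": "INV-1", "total": "100.00"}


data = extract_with_fallback(
    "invoice.pdf",
    ["invoiceNumber", "total"],
    fake_without_template,
    fake_with_template,
)
print(data)
```

Passing the extractors in as parameters keeps the fallback logic independent of any SDK, so the same check can be reused whichever activity produces the result.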

Thanks for reading and please provide your comments and questions.

For More Info: SAP Help

