Question

Document OCR in Pega Robotics

Hi Team,

I have a requirement to extract text from a handwritten PDF file.

So I used Document OCR to extract the text, but I am getting junk values. Could anyone help me resolve this?

Thanks,

Harshita

Comments

Pega
March 22, 2019 - 8:19am

Are there images in this scanned PDF? Images or other non-text content may be throwing off the ABBYY reader. Using the "OCR Dictionary Type" parameter, you can specify which types of characters OCR should look for and output. This parameter may help remove the junk values that you see. http://help.openspan.com/80/Components/DocumentOCR_Component.htm

March 25, 2019 - 1:39am
Response to heffc

Hi, thank you for your response. The scanned PDF is handwritten. Please refer to the attachment.

April 24, 2019 - 3:06am

The OCR component is not guaranteed to work with handwritten text. It was originally designed for printed documents that are scanned, not for handwriting. Further explanation can be found in ABBYY's KB article: https://abbyy.technology/en:kb:faq:recognizing_handprinted_text

November 25, 2019 - 6:31am

Hi,

Does Pega Robotics have a feature for extracting text from handwritten notes using the OCR component?

November 25, 2019 - 6:56am
Response to DivyaG93

Hi Divya,

No, we do not have a built-in feature in Pega Robotics to extract text from handwritten notes.

December 2, 2019 - 2:13am

Hi Aditya Menda,

Thanks for your response. But is there any possible way to solve it?