Can Document OCR not get text out of a scanned PDF?
I have tried processToXml and ProcessToPdf, tried putting ProcessToPdf before each of these, and tried everything with and without ocrImagesAndText set to true. Everything just returns false. I am trying to get text out of a PDF produced by scanning a paper document, but there are even some PDFs that the regular PDF connector can read and Document OCR cannot, unless I just cannot work out how to use it. I can make it get text from images in Word documents, and it can get text out of a PDF I create with print to PDF, so I know I am not doing everything wrong. Can this component really not get text from a scanned PDF?
- Unable to show the Valid certified PDF document in PDF viewer. Is there any issues in reading text from digitally signed and certified PDF?
- Reading a Scanned PDF Document using Pega Robotics OCR
- How to add a comment in a scanned pdf file and save as without OCR/using pdfconnector/viewer
- How to Wrap Text With in a Report- while Displaying in PDF?
- Incorrect text in pdf in "Best Practices for Managing Data Reference Document"