When working in an enterprise content management (ECM) system or document repository such as Microsoft SharePoint, finding the right document can be a challenge. The system has to search a very large volume of documents, you may not remember the exact title, and many documents may have similar titles. What do you do if you remember only a specific phrase or line from the document, but not its title?
The business case for Optical Character Recognition (OCR)
When a file type is not inherently searchable, as with scanned or otherwise image-based documents, your document management system cannot find the file if all you know is a phrase from it, because the system has no text to match against. You can solve this problem by ensuring that every file uploaded to your system is automatically processed with Optical Character Recognition. OCR extracts the text from these files, so your documents are not only searchable in principle but findable in practice.
Automating the OCR process
Organizations can transform the way they find and use their valuable content by automating the OCR process through their content management system. In SharePoint, for example, OCR can be performed using Adlib PDF with Windows Workflow Foundation Activities. Automated rules can be set up so that when a new item is created or changed, the OCR workflow is initiated and applied, ensuring that the new or changed document is searchable and findable within SharePoint.
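The rule described above can be sketched in a few lines of Python. This is only an illustration of the trigger logic, not the Adlib PDF or Windows Workflow Foundation API: the `Item` class, the event names, `on_item_event`, and the `ocr_engine` callback are all hypothetical names introduced here, and skipping items that already have searchable text is an assumption.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical event names; a real repository defines its own.
OCR_TRIGGER_EVENTS = {"created", "changed"}


@dataclass
class Item:
    """Hypothetical stand-in for a document stored in the repository."""
    name: str
    searchable_text: Optional[str] = None  # None means no text to search yet


def on_item_event(event: str, item: Item,
                  ocr_engine: Callable[[Item], str]) -> bool:
    """Run OCR when an item is created or changed and has no searchable text.

    `ocr_engine` is a placeholder for whatever OCR service is in use.
    Returns True if OCR was applied, False otherwise.
    """
    if event not in OCR_TRIGGER_EVENTS:
        return False  # other events (for example, deletion) are ignored
    if item.searchable_text is not None:
        return False  # assumption: already-searchable items are skipped
    item.searchable_text = ocr_engine(item)
    return True
```

In this sketch the OCR step is passed in as a function, so the trigger rule can be exercised with a stub before it is wired to a real OCR service.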
Documents that were previously non-searchable or image-based can be made text-searchable within seconds. Their content can then also be copied and reused easily.
With automated OCR capabilities, IT and IS departments no longer need to be burdened with helping users search for and find content. End users can find information themselves and share it with others in the organization.