Optimize Scanner Output to PDF – Storage is More Expensive Than You Think

By Paul Dyck | August 1, 2009

Most offices are equipped with multi-function devices (MFDs) that allow employees to scan paper documents and create an electronic version in PDF or image format.  I have heard horror stories of scan-happy employees who create high-resolution, colour scans of 70-page documents and then upload the 100 MB result into SharePoint.  Before long, they are using terabytes of storage space.  So what’s the problem? Storage is cheap, isn’t it? Well, not really.  Corporations don’t buy the $200 USB hard drives on sale at your local Best Buy. They purchase enterprise-quality hardware, and that hardware needs to be supported with backups and remotely stored disaster recovery. The real cost of storage is closer to $4,000 a terabyte.  When we consider the ecological costs of the extra hardware, power, and storage space, storage is much more expensive than most people think.
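To put rough numbers on this, here is a minimal Python sketch of the arithmetic. The $4,000 per terabyte figure and the 100 MB scan size come from the paragraph above; the function name, the count of 10,000 scans, and the use of decimal units (1 TB = 1,000,000 MB) are assumptions made only for illustration.

```python
# Rough storage cost estimate. Decimal units are assumed here:
# 1 MB = 10**6 bytes and 1 TB = 10**12 bytes.
BYTES_PER_MB = 10**6
BYTES_PER_TB = 10**12


def storage_cost_usd(num_files, avg_file_mb, cost_per_tb_usd=4000.0):
    """Estimate the cost in dollars of storing num_files files of avg_file_mb each."""
    total_bytes = num_files * avg_file_mb * BYTES_PER_MB
    return total_bytes / BYTES_PER_TB * cost_per_tb_usd


# Hypothetical volume: 10,000 scans of 100 MB each add up to one terabyte.
print(storage_cost_usd(10_000, 100))
```

Under these assumptions, a single 100 MB scan costs roughly 40 cents to keep, and 10,000 of them fill a terabyte at the full $4,000.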

Another problem with this scenario is that the content of the scanned document is stored as an image, so other users cannot search its text.  SharePoint, or any other document repository, will have little information to index the document with.

Storage costs can be minimized either by training people or by making it difficult for them to make these types of mistakes.  A centralized solution that converts image files created by MFDs to searchable PDF lets businesses set up a process that does not require users to understand how to create the optimal file output.  The conversion uses OCR (Optical Character Recognition) to create searchable PDF files, which are much smaller and far more valuable to your business.  If you are using SharePoint, this can be achieved with automated document conversion workflows that process new files as they are uploaded.
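The centralized process described above can be sketched in a few lines of Python. This is only an illustration of the idea, not SharePoint's workflow API: the function names, the list of image extensions, and the convert callable (a stand-in for whatever OCR product does the real work) are all hypothetical.

```python
from pathlib import Path
from typing import Callable, List

# Assumed set of scanner output formats; adjust to what your MFDs produce.
IMAGE_SUFFIXES = {".tif", ".tiff", ".jpg", ".jpeg", ".png", ".bmp"}


def pending_scans(folder: Path) -> List[Path]:
    """Return image files in folder that do not yet have a same-named PDF."""
    return sorted(
        p for p in folder.iterdir()
        if p.is_file()
        and p.suffix.lower() in IMAGE_SUFFIXES
        and not p.with_suffix(".pdf").exists()
    )


def convert_pending(folder: Path, convert: Callable[[Path, Path], None]) -> List[Path]:
    """Run the supplied (hypothetical) OCR converter on every pending scan.

    convert(source_image, target_pdf) is expected to write a searchable PDF.
    Returns the list of PDF paths that were requested.
    """
    created = []
    for image in pending_scans(folder):
        target = image.with_suffix(".pdf")
        convert(image, target)
        created.append(target)
    return created
```

Run on a schedule against the folder the scanners write to, a loop like this means users never have to choose OCR or compression settings themselves.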
