The PDF – More Than Just Output, It’s About The Document Lifecycle
For years I have watched companies spend massive amounts of time, resources, energy and money researching, designing and implementing DM (Document Management) and ECM (Enterprise Content Management) systems with the goal of better organizing their valuable content, providing easier access to information, improving their business processes and simply better managing their document lifecycles and, as a result, their organizational processes. While today’s DM and ECM systems are quite sophisticated (with add-ons and customizations, and we can throw SharePoint into the fray, too), their ability to achieve these goals is seriously hampered when users cannot easily discover and leverage the information they need. This is not a fault of the DM and ECM systems; it’s more of a challenge with the content contained within them. This is where I’d like to introduce how using the PDF file format can benefit all phases of the document lifecycle. The diagram below is our interpretation of the Association for Information and Image Management (AIIM) Document Lifecycle diagram.
Phase 1. Capture or Document onboarding:
This is where content gets added to the system. The content is typically office documents, legacy formats, and scanned images imported from another repository. Simply adding this content to your new ECM system does little to help other users discover and leverage the information it contains: most of the content is locked within its format (“unsearchable”), and it may not follow any standard naming convention or carry the appropriate metadata. A server-based document transformation technology like Adlib would enable you to onboard this content, automatically convert it to searchable PDF (applying OCR to images or faxes), and automatically assign the appropriate metadata and naming conventions, making the resulting PDF documents easily accessible and usable by the new ECM system.
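To make the naming and metadata step a little more concrete, here is a minimal Python sketch. It is not Adlib's API or any vendor's actual implementation; the `onboard` function, the `department_date_title.pdf` naming pattern, the metadata fields, and the list of image extensions that trigger OCR are all assumptions chosen purely for illustration. It only plans the conversion and does not perform any PDF rendering or OCR itself.

```python
import re
from dataclasses import dataclass
from datetime import date
from pathlib import PurePath

# Assumption: these source formats are images and would need OCR
# before the resulting PDF can be searched.
NEEDS_OCR = {".tif", ".tiff", ".png", ".jpg", ".jpeg"}


@dataclass
class OnboardedDocument:
    target_name: str   # standardized name for the converted PDF
    metadata: dict     # metadata to hand to the ECM system
    needs_ocr: bool    # True when the source is an image format


def slugify(text: str) -> str:
    """Lower-case the text and collapse anything non-alphanumeric into '-'."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def onboard(source_path: str, department: str, received: date) -> OnboardedDocument:
    """Plan the onboarding of one file: target name, metadata, OCR flag."""
    path = PurePath(source_path)
    ext = path.suffix.lower()
    # Hypothetical naming convention: department_YYYYMMDD_title.pdf
    target_name = f"{slugify(department)}_{received:%Y%m%d}_{slugify(path.stem)}.pdf"
    metadata = {
        "source_file": path.name,
        "source_format": ext.lstrip("."),
        "department": department,
        "received": received.isoformat(),
    }
    return OnboardedDocument(target_name, metadata, ext in NEEDS_OCR)
```

Under these assumptions, a scanned file such as `Scans/Invoice 0042 (FINAL).TIF` received by Accounts Payable on 7 May 2013 would be planned as `accounts-payable_20130507_invoice-0042-final.pdf` and flagged for OCR, while a `.docx` file would be converted without OCR.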
Phase 2. Manage:
This stage is where business process automation or collaboration typically occurs. Once content is properly captured, it can be managed more easily. Good workflow tools will leverage the document metadata (applied in phase 1) to further automate a business process. An example might be the automatic assembly of a sales proposal. Building a package from multiple files of various file types can be a very tedious task if done manually, but by leveraging metadata and workflows, this process can be fully automated, reducing the risk of human error and allowing users to focus on business tasks, not the technical task of assembling documentation.
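As a rough illustration of how metadata can drive the assembly of a sales proposal, the following Python sketch selects documents by their metadata and puts them in package order. The `Doc` fields, the `SECTION_ORDER` list, and the `plan_proposal` function are hypothetical names invented for this example; a real workflow tool would then convert and merge the selected files, which this sketch does not attempt.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Doc:
    name: str       # file name in the repository
    doc_type: str   # metadata assigned at capture, e.g. "pricing"
    product: str    # product the document applies to, or "all"


# Assumption: a proposal is built from these document types, in this order.
SECTION_ORDER = ["cover-letter", "datasheet", "pricing", "terms"]


def plan_proposal(repository, product):
    """Return the file names that make up a proposal for one product,
    ordered by section and then alphabetically within a section."""
    rank = {doc_type: i for i, doc_type in enumerate(SECTION_ORDER)}
    selected = [
        d for d in repository
        if d.doc_type in rank and d.product in (product, "all")
    ]
    selected.sort(key=lambda d: (rank[d.doc_type], d.name))
    return [d.name for d in selected]
```

The point of the design is that nobody picks files by hand: as long as the metadata assigned at capture is right, the package contents and their order follow from it.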
Phase 3. Enterprise Search:
This one is a no-brainer. If all of the content within your ECM system is searchable, you can free yourself from the shackles of taxonomy. No longer will you have to rely solely on cryptic classification codes or metadata; instead, you can search on what’s IN the documents, not on how someone thought to classify them. Benefits include finding things faster and improved collaboration, and you’ve basically set your content up to help facilitate e-discovery. If your content is not searchable, it might as well not even exist in your ECM system. If a tree falls in the forest…
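To show why searchable text matters, here is a small Python sketch of an inverted index, the basic data structure behind full-text search. It is a toy illustration, not how any particular ECM or enterprise search product works; the function names and the sample document texts in the usage note are made up, and it assumes the text has already been extracted from each document (for example, from a searchable PDF).

```python
import re
from collections import defaultdict


def tokenize(text):
    """Split text into lower-case alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Map each term to the set of document ids whose text contains it.

    `documents` is a dict of {doc_id: extracted_text}.
    """
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return the ids of documents that contain every term in the query."""
    terms = tokenize(query)
    if not terms:
        return set()
    results = set(index.get(terms[0], set()))
    for term in terms[1:]:
        results &= index.get(term, set())
    return results
```

A scanned image that was never run through OCR contributes no text to `build_index`, so no query can ever return it, no matter how carefully it was classified.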
Phase 4. Delivery:
This is where you typically transform your content into the optimal state required for final delivery. The PDF has become an industry standard for document distribution, and PDF rendering for distribution is what most server-based PDF vendors (like Adlib and Adobe) do well. This is where the content is enhanced with watermarks, headers, footers, hyperlinks, tables of contents, signatures, accessibility features, security settings, etc., then merged with other documents (even different file types) and converted to PDF for final consumption. This process results in a high-res PDF optimized for print or CAD review, or a smaller PDF optimized for web or mobile consumption.
Phase 5. Archive:
Given the out-of-this-world growth of corporate data and documents, efficient archiving has become an important requirement for many organizations and governments. Traditional archiving strategies have simply ensured that the original data is intact and retrievable over time, but one important element has been missing: what do you do with that data or those files when you want to retrieve them? For example, legacy file formats have been preserved, but the technology to view them has often vanished. Examples might be Aldus PageMaker, old versions of any application, proprietary file formats from the days of MS-DOS, etc. Even current versions of MS Office have challenges faithfully displaying documents that were created in previous versions. Throughout the document lifecycle process above, any document deemed suitable for archive can be automatically converted to PDF/A (PDF for Archiving). This is a recognized archiving standard defined in ISO 19005, advocated by the PDF Association, and already adopted globally as one of the best file formats for long-term digital archiving, as it is self-contained (all fonts and images get embedded) and, you guessed it, can be made searchable for easy future retrieval. Most importantly, it was specifically designed for long-term durability, meaning that the way a document appears today is how it is intended to appear in 50 years.
So as you can see, having a little PDF file format strategy for the capture, management and delivery of your documents within your document lifecycle can have an immense impact on the efficiency of your DM or ECM system! Learn more about this process in this new white paper.