Artificial Intelligence (AI) systems require massive amounts of data to train their algorithms, but they are not built to both amass that data and prepare it for ingestion. This creates a challenge for enterprises looking to adopt AI, because more than 80 percent of their data is scattered across file shares, silos and repositories in formats that are not readily machine-readable, such as emails, scanned documents and images.
Often positioned as the Achilles' heel of AI, this unstructured data challenge was the focus of a recent #AIChat Twitter discussion among tech professionals, hosted by Adlib Software and Nick Tang.
Beyond the Buzzword of AI
While businesses in life sciences, energy, insurance, banking and financial services are all at varying levels of AI implementation, firms in telecommunications, high-tech and finance are leading the way in overall AI adoption.*
According to Scott Mackey, Adlib’s SVP of Market Strategy, these figures are promising. “I think we’re coming out of disillusionment and into enlightenment. There are some real world examples of AI working in retail, insurance and others.”
Thought-leader Colin McGuire agreed, noting, “AI is a broad term that’s widely applied. There are already real world use cases with notable benefits, but there’s a sensationalized viewpoint in public perception. Educating the public is as much of a challenge as implementation.”
Conquering AI’s Unstructured Data Challenge
Since AI cannot run without high-quality, structured data to fuel the algorithms, it’s essential for businesses to improve the access, quality and efficiency of their data.
“With 80 percent of current data locked up in unstructured content, the value AI brings will be limited,” noted Rob Schaafsma, Adlib’s Director of Technical Business Development.
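To make that concrete, "unlocking" unstructured content usually means turning free-form text into records an algorithm can consume. The sketch below is purely illustrative (the email, field names and regex patterns are all invented for the example, not part of any Adlib product): it pulls a few structured fields out of a raw email body.

```python
import re

# Hypothetical raw email — an example of unstructured content.
raw_email = """From: jane.doe@example.com
Date: 2024-03-15
Subject: Policy renewal #48213

Please renew policy 48213 before April 1."""

def extract_fields(text):
    """Turn one unstructured email into a structured record (a dict)."""
    patterns = {
        "sender": r"^From:\s*(\S+@\S+)",
        "date": r"^Date:\s*(\d{4}-\d{2}-\d{2})",
        "subject": r"^Subject:\s*(.+)$",
    }
    record = {}
    for field, pattern in patterns.items():
        match = re.search(pattern, text, re.MULTILINE)
        # Keep the captured value if the pattern matched, else record a gap.
        record[field] = match.group(1).strip() if match else None
    return record

record = extract_fields(raw_email)
print(record)
```

Real pipelines rely on OCR, NLP and document-conversion tooling rather than hand-written patterns, but the goal is the same: structured, queryable fields where there was once only text.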
On top of the current volume of unstructured data, this content is predicted to keep growing at a rate of 62 percent annually,** making enterprise data governance an urgent imperative for any business looking to pilot an AI initiative.
How does one go about conquering this challenge? Scott suggested starting with the outcome in mind. “Determine what you are trying to achieve, then dig into the underlying elements that will feed that outcome. This includes the people, process, technology and the data sources required,” he said.
What comes first – AI or underlying data processes?
While there may be some instances where businesses have structured data and can get started with AI, Scott noted that in other scenarios “you need to ensure the fuel you’re using for AI is clean, broadly sourced and relevant.”
Author Isaac Sacolick agreed, stating that businesses have some choices before leveraging AI:
- Cleanse data using traditional methods.
- Work with noisy data, making AI harder and more expensive to implement, and less accurate.
- Put biased AI in production because of poor data.
- Do nothing and fall behind without machine learning.
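The first option, cleansing data with traditional methods, can be as unglamorous as trimming whitespace, normalizing values, dropping incomplete rows and removing duplicates. A minimal sketch (all field names and rows here are invented for the example):

```python
# Sample records containing the typical noise: stray whitespace,
# inconsistent casing, a missing value and a near-duplicate.
rows = [
    {"name": "  Acme Corp ", "country": "us"},
    {"name": "Acme Corp", "country": "US"},  # duplicate after normalization
    {"name": "Globex", "country": None},     # missing value
    {"name": "Initech", "country": "CA"},
]

def cleanse(records):
    """Trim whitespace, normalize case, drop incomplete rows, dedupe."""
    seen = set()
    cleaned = []
    for row in records:
        if any(value is None for value in row.values()):
            continue  # drop rows with missing fields
        normalized = {k: v.strip() for k, v in row.items()}
        normalized["country"] = normalized["country"].upper()
        key = tuple(sorted(normalized.items()))
        if key not in seen:  # keep only the first copy of each record
            seen.add(key)
            cleaned.append(normalized)
    return cleaned

print(cleanse(rows))
```

Even this simple pass illustrates the trade-off in the list above: the effort spent here is effort not spent fighting noisy inputs inside the model.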
Rob followed up, adding that there is "a responsibility that if a business is going to leverage AI, it needs to be done right, and that includes unlocking data from unstructured content."
Data Governance is Essential for Effective AI Output
But how do businesses go about unlocking this data so they can see results with AI? It begins by making strides to improve the visibility and accessibility of their data stores across all business units, supported by a data strategy that addresses those gaps.
For a comprehensive look at how to address data governance so that it sets the foundation for innovation with AI and other digital tactics, check out our Enterprise Guide.
Before you get started with any of the above steps as part of an AI implementation project, be sure to kick-start a Data Discovery Assessment to gain a complete picture of your content stores and understand exactly what type of content you're working with.
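A very rough first pass at that kind of discovery can be automated. The sketch below (not part of any formal assessment methodology, just an illustration) walks a content store and tallies files by extension, giving an initial sense of how much of the repository is emails, scans, office documents and so on:

```python
import os
from collections import Counter

def tally_content_types(root):
    """Walk a content store and count files by extension — a rough
    first inventory of what a Data Discovery Assessment would formalize."""
    counts = Counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # Normalize extensions so ".PDF" and ".pdf" count together.
            ext = os.path.splitext(name)[1].lower() or "(no extension)"
            counts[ext] += 1
    return counts

# Example usage (the path is hypothetical):
# for ext, n in tally_content_types("/mnt/file-share").most_common():
#     print(ext, n)
```

A real assessment goes much further (ownership, sensitivity, duplication, OCR-readability), but even a tally like this quickly shows where the unstructured bulk of the 80 percent lives.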