Human information: A data revolution
Our growing reliance on unstructured data is driving the need for machines that understand human-friendly information.
For many years, business information systems have relied exclusively on relational data. Relational data is designed to be easy for computers to understand and process, but it has forced people to think in ways that fit the machine, rather than making the machine fit the way people think.
Recent, rapid growth in unstructured data has made this situation increasingly untenable. Unstructured information already accounts for 85 percent of all information. Beyond its sheer volume, unstructured information is also where the interesting, differentiating, and vital things happen. Customers don’t send you databases; they send emails, make calls, tweet, and log in to user forums. Policies, regulations, and governance practices depend on human meaning and intent. Yet for decades, information has had to fit the technology, and humans have had to fit the machine.
Today, it is possible to make machines fit the human. Machines can be made to understand unstructured, human-friendly information, bringing you new discoveries rather than answers to the questions you already knew to ask. While the rising tide of “big data” has some businesses running scared, others see this flood of information as the next generation of competitive opportunity.
Helping computers understand meaning
Computers aren't good at understanding the meaning of words in context. People often use different words to describe the same thing. "High-efficiency aerofoil designer" and "low-drag wing design expert" describe the same job, yet the two descriptions do not share a single word.
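To make the mismatch concrete, here is a minimal Python sketch of naive keyword matching applied to the two job descriptions above. The helper names (`tokens`, `jaccard`) and the choice of Jaccard word overlap are illustrative assumptions, not a description of how any particular product works.

```python
import re


def tokens(text: str) -> set[str]:
    """Lowercase a phrase and split it into a set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def jaccard(a: str, b: str) -> float:
    """Share of distinct words that two phrases have in common (0.0 to 1.0)."""
    ta, tb = tokens(a), tokens(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


job_a = "High-efficiency aerofoil designer"
job_b = "low-drag wing design expert"

shared = tokens(job_a) & tokens(job_b)   # words present in both descriptions
score = jaccard(job_a, job_b)            # keyword-overlap similarity

print(shared, score)  # prints: set() 0.0
```

A system that compares only the literal words would score these two descriptions as completely unrelated, even though a person reads them as the same job.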
The meaning of information can change based on where it appears. And thanks to the dynamic nature of language, meanings also change over time. These contextual challenges are precisely why unstructured information has historically been excluded from business information systems.
However, once you've mastered the challenges of conceptual meaning, unstructured information can be interchanged effortlessly. This is a huge advantage over database information, which must be aligned with the specific database from which it came.
Today's de facto solution to this problem, metadata, is an attempt to apply structure to unstructured content. If there were a wealth of reliable, perfectly created metadata, this might be a viable approach. However, metadata has a number of fatal flaws:
- Time—Creating proper metadata is time-consuming. Most data creators will not bother to create any metadata, much less high-quality metadata.
- Lack of objectivity—Accurate, reliable metadata requires objectivity. People's perceptions and backgrounds color their interpretation of even the most benign data.
- Lack of standardization—There are many ways to accurately describe the same content.
Extracting rich meaning
In the unstructured world, information always relates to other pieces of information. Customer interactions such as call center calls, tweets, emails, or website comments are often connected to voicemails, emails, documents, or SMS messages. Tracking all of these connections can quickly become a rat’s nest, because every application has a separate connection to every data type. As soon as any data type or source changes, every connection that touches it must change as well.
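The scale of that tangle can be sketched with simple arithmetic. The Python snippet below counts connections when every application links to every data type, and compares it with a hypothetical arrangement in which applications and data types each connect once to a shared layer. The function names, the shared-layer model, and the figures (6 applications, 8 data types) are assumptions chosen purely for illustration.

```python
def point_to_point_links(num_apps: int, num_types: int) -> int:
    """Every application keeps its own connection to every data type."""
    return num_apps * num_types


def shared_layer_links(num_apps: int, num_types: int) -> int:
    """Hypothetical model: each application and each data type
    connects once to a single shared layer."""
    return num_apps + num_types


apps, types = 6, 8  # illustrative figures only

direct = point_to_point_links(apps, types)
shared = shared_layer_links(apps, types)

# Extra connections needed when one more data type is introduced
added_direct = point_to_point_links(apps, types + 1) - direct
added_shared = shared_layer_links(apps, types + 1) - shared

print(direct, shared)              # prints: 48 14
print(added_direct, added_shared)  # prints: 6 1
```

Under these assumptions, point-to-point connections grow with the product of applications and data types, while a shared layer grows with their sum, which is why each new source is so costly in the first arrangement.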
As business moves forward, seizing the opportunity and potential presented by unstructured, human data will be a critical priority. With its IDOL 10 product, Autonomy, an HP company, has created a single processing layer for forming a conceptual, contextual, and real-time understanding of all forms of data, both inside and outside an enterprise. It is expected that tools like this will become increasingly important in helping companies understand and act on 100 percent of enterprise information, both structured and unstructured, in real time.
The shift toward human-friendly information is a once-in-a-generation opportunity. Because of this change, companies can link a customer’s call center call to that customer’s website activity, database entry, and purchase history, all in real time. Policies can be implemented across emails, documents, voicemails, social media, SMS messages, and transaction histories, not only to flag non-compliant materials but also to stop non-compliant posts or even transactions before they occur.
To learn more about how HP Autonomy is delivering on the promise of human-friendly information, read our whitepaper "Human Information," or visit http://www.autonomy.com/.