Discover Performance: HP Software's community for IT leaders // March 2012
Human information: A data revolution
Our growing reliance on unstructured data is driving the need for machines that understand human-friendly information.
For many years, business information systems have relied exclusively on relational data. This is because relational data is designed to be easy for computers to understand and process—but it has forced people to think in ways that fit the machine, rather than the other way around.
Recent, rapid growth in unstructured data has made this situation increasingly untenable. Unstructured information already accounts for 85 percent of all information. Beyond its sheer size, unstructured information is also where all the interesting, differentiating, and vital things happen. Customers don’t send you databases: they send emails, make calls, tweet, and log into user forums. Policies, regulations, and governance practices depend on human meaning and intent. Yet for decades, information has had to fit the technology, and humans have had to fit the machine.
Today, it is possible to make machines fit the human. Machines can be made to understand unstructured, human-friendly information, bringing you new discoveries rather than answers to the questions you already knew to ask. As the rising tide of “big data” has some businesses running scared, others are seeing the use of this flood of information as the next generation of competitive opportunity.
Helping computers understand meaning
Computers aren't good at understanding the meaning of words in context. It’s not at all uncommon for humans to use different words to describe the same thing. "High-efficiency aerofoil designer" and "low-drag wing design expert" describe the same job, yet none of the words in either description match.
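The mismatch between those two job descriptions can be checked with a few lines of Python. This is only a minimal sketch of naive keyword matching, not how any particular product works; the helper names `tokens` and `jaccard` are illustrative assumptions, and word overlap is measured with a simple Jaccard score.

```python
import re

def tokens(text):
    """Lowercase a phrase and split it into a set of words (hyphens separate words)."""
    return set(re.findall(r"[a-z]+", text.lower()))

def jaccard(a, b):
    """Fraction of distinct words that the two phrases share."""
    ta, tb = tokens(a), tokens(b)
    return len(ta & tb) / len(ta | tb)

job_a = "High-efficiency aerofoil designer"
job_b = "low-drag wing design expert"

shared = tokens(job_a) & tokens(job_b)   # words present in both descriptions
score = jaccard(job_a, job_b)            # 0.0 means no exact word overlap at all

print(shared, score)
```

An exact-match comparison finds no common words, even though a human reader sees the same job. Closing that gap requires some model of meaning rather than string equality.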
The meaning of information can change based on where it appears. And thanks to the dynamic nature of language, meanings also change over time. These contextual challenges are precisely why unstructured information has historically been excluded from business information systems.
However, once you've mastered the challenges of conceptual meaning, unstructured information can be interchanged effortlessly. This is a huge advantage over database information, which must be aligned with the specific database from which it came.
Today's de facto solution to this problem, metadata, is an attempt to apply structure to unstructured content. If there were a wealth of reliable, perfectly created metadata, this might be a viable approach. However, metadata has a number of fatal flaws:
- Time—Creating proper metadata is time-consuming. Most data creators will not bother to create any metadata, much less high-quality metadata.
- Lack of objectivity—Accurate, reliable metadata requires objectivity. People's perceptions and backgrounds color their interpretation of even the most benign data.
- Lack of standardization—There are many ways to accurately describe the same content.
Extracting rich meaning
In the unstructured world, information always relates to other pieces of information. Customer data such as call center calls, tweets, emails, or website comments are often connected to voicemails, emails, documents, or SMS messages. Tracking all of these connections can quickly become a rat’s nest, because every application has a separate connection to every data type. As soon as any data type or source is changed, all the connections must also be changed.
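The growth of that rat's nest is simple arithmetic. The sketch below, with purely hypothetical counts of applications and data types, compares the number of connections needed when every application links to every data type against the number needed if each one instead connected once to a single shared layer. The function names are illustrative assumptions.

```python
def point_to_point(apps, data_types):
    """Every application keeps its own connection to every data type."""
    return apps * data_types

def shared_layer(apps, data_types):
    """Each application and each data type connects once to a common layer."""
    return apps + data_types

# Hypothetical figures chosen only for illustration.
apps, data_types = 6, 8

direct = point_to_point(apps, data_types)   # 6 * 8 connections
via_layer = shared_layer(apps, data_types)  # 6 + 8 connections

print(direct, via_layer)
```

With direct connections, changing one data type touches every application that reads it. With a shared layer, only that data type's single connection has to change.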
As business moves forward, seizing the opportunity and potential presented by unstructured, human data will be a critical priority. With its IDOL 10 product, Autonomy, an HP company, has created a single processing layer for forming a conceptual, contextual, and real-time understanding of all forms of data, both inside and outside an enterprise. Tools like this are expected to become increasingly important in helping companies understand and act on 100 percent of enterprise information, both structured and unstructured, in real time.
The shift toward human-friendly information is a once-in-a-generation opportunity. Because of this change, companies can link a customer’s call center call to that customer’s website activity, entry in the database, and purchase history—all in real time. Policies can be implemented across emails, documents, voicemails, social media, SMS messages, and transaction histories, to not only flag non-compliant materials, but to stop non-compliant posts or even transactions before they occur.
To learn more about how HP Autonomy is delivering on the promise of human-friendly information, read our whitepaper "Human Information," or visit http://www.autonomy.com/.