Discover Performance: HP Software's community for IT leaders // May 2013
March Madness: A slam-dunk analytics scenario
An application that analyzes March Madness tweeting highlights the business intelligence gold mine lurking in social media data.
During the annual bacchanal of basketball known as March Madness, thousands of NCAA basketball fans (and probably a few neglected spouses) take to Twitter to unleash their excitement, vindication, disappointment, and predictions about every aspect of the tournament. In 2012, there were more than 2 million tournament-related tweets.
With 64 basketball games scheduled in a tight three-week timeframe, the annual surfeit of riches for sports fans is also an excellent proving ground for the power of social media data analytics. Widespread use of social media is a huge business opportunity, especially for organizations with well-known brands.
With this in mind, a few sports fans across HP’s data analytics solutions teams hatched a plan to use tweets in the run-up to March Madness to demonstrate some potential uses of social media data. The core question: If you could read and understand the sentiment of all those tweets, what might you learn about the business of basketball?
- Most loved (and most hated) players?
- Geographic patterns of team loyalty?
- Trending patterns based on injury reports, sports commentary, and handicaps?
- Most influential sports analysts and commentators?
- Broadcast viewership patterns?
- Live attendance footprint?
More importantly, if you knew the answers to such questions, what could you do with that information?
Inspired by madness
A team of data analytics experts at HP built an application this spring using Autonomy, HP Vertica Analytics Platform, and TIBCO Spotfire, and ran their first test on tweets made in the days before the tournament began, to see just what they could learn from social media data. The demo:
- Analyzed 500,000 Twitter posts using a wide variety of hashtag and keyword information
- Applied sentiment analysis to determine a positive, negative, or neutral context
- Tracked location data for geographical analysis
- Tracked the language to provide cultural context
- Tracked time and date of postings to show activity spikes
- Provided a visualization dashboard to make it easy to consume query results
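The steps in the list above can be illustrated with a minimal sketch. This is not the HP team's application and does not use Autonomy, Vertica, or Spotfire; the word lists, the post fields (`text`, `region`, `posted_at`), and the sample posts are all hypothetical, and a simple word count stands in for real sentiment analysis.

```python
from collections import Counter
from datetime import datetime

# Hypothetical word lists; a production system would use a trained sentiment model.
POSITIVE = {"love", "great", "win", "clutch"}
NEGATIVE = {"hate", "awful", "choke", "bust"}

def classify(text):
    """Label a post positive, negative, or neutral by counting word-list hits."""
    words = [w.strip(".,!?#@").lower() for w in text.split()]
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

def summarize(posts):
    """Count posts by sentiment, by region, and by hour of posting."""
    by_sentiment, by_region, by_hour = Counter(), Counter(), Counter()
    for post in posts:
        by_sentiment[classify(post["text"])] += 1
        by_region[post["region"]] += 1
        hour = datetime.fromisoformat(post["posted_at"]).strftime("%Y-%m-%d %H:00")
        by_hour[hour] += 1
    return by_sentiment, by_region, by_hour

# Made-up sample posts, for illustration only.
SAMPLE_POSTS = [
    {"text": "Love this team, great win!", "region": "Midwest", "posted_at": "2013-03-19T20:15:00"},
    {"text": "My bracket is a bust already", "region": "South", "posted_at": "2013-03-19T20:40:00"},
    {"text": "Tip-off is at nine tonight", "region": "Midwest", "posted_at": "2013-03-19T21:05:00"},
]

sentiment_counts, region_counts, hour_counts = summarize(SAMPLE_POSTS)
```

The hourly counts are what a dashboard would plot to show activity spikes; the regional counts feed the geographic view.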
“We didn't have a predefined storyline,” says Kaloyan Kanev, a consultant with the Information Management and Analytics team in HP Enterprise Services. “We wanted to let the data speak to us, so we crawled a lot of data and started looking at what people were talking about, and measuring the buzz as it changed before and during the tournament.”
Once the data was queried and visualized, the team began to explore some interesting data patterns. For example, in the weeks leading up to the tournament, the data showed a major spike in tweets from areas of the United States where college basketball is wildly popular (the Midwest and South, for example). Even lesser-known players dominated Twitter at times, indicating big games, big plays, or notorious off-court activity.
The application was never designed to predict winners and losers of games: sentiment data alone is never a predictor, Kanev explains. It does, however, identify trends that could potentially be used by the NCAA or individual teams to increase ticket or merchandise sales, grow the fan base, or improve the return on investment from marketing efforts.
Getting back to business
For businesses, social media data offers new insights into customer opinions, trends, and buying behaviors that, until now, have been difficult or impossible to come by. The specific ways you can analyze this data will vary greatly by industry, but the possibilities include:
- Measure response to new campaigns
- Measure enthusiasm for new products
- Understand public reaction to news events that affect your industry
- Measure changes in sentiment over time
- Remediate poor customer service experiences
- Identify influential individuals
- Improve predictive modeling for retail sales, financial products, or customer churn
To make the most of your sentiment data analytics efforts, Kanev offers the following tips:
- Use a variety of social media data: The March Madness exercise looked only at tweets, but any significant social media site (Facebook, Foursquare, Pinterest, Instagram) is a potential source of rich business intelligence.
- Incorporate social media data with other data sources: You needn’t develop your social media intelligence in a data silo. Combine it with other hard data, including structured data such as sales figures or web traffic patterns, and use the sources together to develop predictive models that perform better than ones built from traditional data warehouses alone.
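As a rough illustration of the second tip, the sketch below joins a daily sentiment score with daily sales by date and measures how closely the two move together. All figures and names here are made up for the example, and a plain correlation is only a first check before any predictive modeling, not a model in itself.

```python
from math import sqrt

# Made-up daily figures, for illustration only.
DAILY_SENTIMENT = {"2013-03-18": 0.10, "2013-03-19": 0.35, "2013-03-20": 0.60, "2013-03-21": 0.20}
DAILY_SALES = {"2013-03-18": 120, "2013-03-19": 150, "2013-03-20": 210, "2013-03-22": 90}

def join_by_date(sentiment, sales):
    """Keep only the dates present in both sources, in date order."""
    shared = sorted(set(sentiment) & set(sales))
    return [(day, sentiment[day], sales[day]) for day in shared]

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

rows = join_by_date(DAILY_SENTIMENT, DAILY_SALES)
correlation = pearson([r[1] for r in rows], [r[2] for r in rows])
```

Dates that appear in only one source are dropped by the join, so gaps in either feed do not silently skew the comparison.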
Learn more about social media sentiment analysis with Autonomy IDOL and how Vertica’s analytics platform performs split-second queries on big data sets. For more on the future of big data in sports, check out the Vertica webinar with STATS and Sports-Reference.com.