Handbook
Getting Down to Business on Big Data Analytics
Capturing and storing big data is just the beginning. Reaping real
business value and competitive advantages from collections of
structured and unstructured data is the end goal.
1 EDITOR’S NOTE
2 WITH BIG DATA SYSTEMS, PLAN AHEAD
3 FOCUS SHARPENS ON BIG DATA’S BUSINESS VALUE
4 HADOOP LIGHTS PATH TO CONSUMER INSIGHTS
EDITOR’S NOTE
It’s All About the Analytics
Collecting reams of big data is one thing;
doing something useful with all that information is another. But the former without the
latter won’t win business intelligence, analytics and IT teams any plaudits from corporate
executives. As Gartner analyst Doug Laney put
it at the consulting company’s 2013 Business
Intelligence and Analytics Summit, successful big data initiatives depend on companies
“recognizing that there are opportunities to really innovate with this information”—and then
taking the required steps to capitalize on those
opportunities.
That’s where big data analytics comes in.
Finding the business value hidden in hoards
of big data can be a tough nut to crack—but
there’s a growing body of examples to help organizations get cracking. This three-part guide
provides insight and advice on managing big
data analytics programs. We start with a look
at four key factors to consider in planning
projects. Next we catalog guidance on deriving
business value from big data. We close with an
interview about a Hadoop-based analytics system that’s being used to examine the shopping
habits of Latino consumers.
Craig Stedman
Executive Editor
SearchBusinessAnalytics.com
STRATEGY
With Big Data Systems, Plan Ahead
By providing access to broader sets of
information, big data can help maximize the
analytical insights generated by data analysts
and business users. Successful big data analytics applications uncover trends and patterns
that enable better decision making, point to
new revenue opportunities and keep companies
ahead of their business rivals. But first, organizations often need to enhance their existing IT
infrastructure and data management processes
to support the scale and complexity of big data
architecture.
Hadoop systems and NoSQL databases have
become key tools for managing big data environments. In many cases, though, businesses
are utilizing their existing data warehouse infrastructure, or a combination of the new and
old technologies, to manage the big data flowing into their systems.
Whatever type of big data technology stack
a company deploys, there are some common
considerations that must be addressed to ensure it will provide an effective framework for
big data analysis efforts. Before getting started
on big data projects, it’s crucial to look at the,
er, bigger picture of the new data requirements
they entail. Let’s examine four of the considerations that need to be taken into account.
Data accuracy. Data quality issues are certainly no stranger to BI and data management
professionals. Many BI and analytics teams
struggle to ensure the validity of data and
convince business users to trust in the accuracy and reliability of information assets. The
widespread use of spreadsheets as personalized analytics repositories, or spreadmarts, can
contribute to a lack of trust in data: The ability
to store and manipulate analytics data in Excel
creates an environment that supports self-service analysis capabilities but might not inspire
other users to act confidently on the findings.
Data warehouses, coupled with data integration
and data quality tools, can help instill that confidence by providing standardized processes for
managing BI and analytics data. But a big data
implementation adds to the degree of difficulty
due to increased data volumes and a wider variety of data types, particularly when a mix of
structured and unstructured data is involved.
Assessing data quality measures and upgrading them as needed to handle those larger and
more varied data sets is vital to the successful
implementation and usage of a big data analytics framework.
Storage fit. One of the core demands of data
warehousing is the ability to process and store
large data sets. But not all data warehouses are
created equal in that regard. Some are optimized for complex query processing, while
others aren’t. And in many big data applications, the addition of unstructured data and the
increased velocity at which data is created and
collected compared to transactional systems
make augmenting a data warehouse with Hadoop or NoSQL technologies a necessity. For
an organization looking to capture and analyze
big data, storage capacity isn’t enough; what matters is where the data should be placed so
it can be transformed into useful information
and made available to data scientists and other
users.
Query performance. Big data analytics depends on the ability to process and query complex data in a timely fashion. A good example is
a company that developed a data warehouse to
maintain data collected from energy usage meters. During product evaluations, one vendor’s
system was able to process 7 million records
in 15 minutes, while another’s topped out at
300,000 records in the same amount of time.
Identifying the right infrastructure to support
fast data availability and high-performance
querying can make the difference between success and failure.
Scalability. With growing data volumes and
variety in many organizations, a big data platform can’t be built without the future in mind.
It’s imperative to think ahead and ask whether
the big data technologies being evaluated can
scale to the levels that will be required going
forward. That extends beyond storage capacity
to include performance as well, particularly in
companies that are looking at data from social
networks, sensors, system log files and other
non-transactional sources as extensions of
their business data.
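To make the gap in the query performance example concrete, the two results can be turned into throughput rates. The sketch below is illustrative only: the record counts and the 15-minute window come from the example above, while the 500-million-record backlog and the assumption that throughput stays constant as volumes grow are hypothetical.

```python
# Throughput comparison for the two systems in the energy-meter example.
# Record counts and the 15-minute window come from the text; the backlog
# size and the constant-throughput assumption are hypothetical.

WINDOW_SECONDS = 15 * 60


def records_per_second(records: int, seconds: int = WINDOW_SECONDS) -> float:
    """Average throughput over the evaluation window."""
    return records / seconds


def hours_to_process(total_records: int, throughput: float) -> float:
    """Hours needed to work through a backlog at a constant throughput."""
    return total_records / throughput / 3600


fast = records_per_second(7_000_000)
slow = records_per_second(300_000)
speedup = fast / slow

hypothetical_backlog = 500_000_000  # assumed volume, not from the article

print(f"fast system: {fast:,.0f} records/s")
print(f"slow system: {slow:,.0f} records/s")
print(f"speedup: {speedup:.1f}x")
print(f"backlog, fast: {hours_to_process(hypothetical_backlog, fast):.1f} h")
print(f"backlog, slow: {hours_to_process(hypothetical_backlog, slow):.1f} h")
```

Under those assumptions the faster system is roughly 23 times quicker, a difference that matters more as data volumes scale up.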
Analyzing diverse and complex data sets requires a robust and resilient big data architecture. By considering these four factors when
planning projects, organizations can determine
whether what they already have in-house can
handle the rigors of big data analytics applications or if additional software, hardware and
data management processes are required to
achieve their big data goals. —Lyndsay Wise
PROJECT
MANAGEMENT
Focus Sharpens on Big Data’s Business Value
How best to deliver big data analytics to
users was one of the big topics of discussion at
the TDWI BI Executive Summit in Las Vegas.
Presenters and attendees alike looked to chart
the way to successful analytics initiatives that
connect big data and business value, enabling
companies to get their money’s worth from the
mountains of structured and unstructured data
they’re building up in data warehouses, Hadoop
systems and NoSQL databases.
The key issue business intelligence (BI) and
analytics professionals must address hasn’t
changed with the advent of new data types
and technologies for managing them, said Barbara Wixom, an associate professor of IT at
the University of Virginia’s McIntire School
of Commerce and a visiting scholar at MIT’s
Sloan School of Management. The goal, she
said, is still to come up with a good answer to
this question: How do we get the data to the
users?
“There is no value created without use,” said
Wixom, who isn’t a fan of the term “big data”
but does agree that “data is changing.” Those
changes, she added, require data management professionals to redouble their efforts
to develop data architectures that can support
the expanding variety of big data captured by
companies.
“The quantity of sources themselves is becoming overwhelming. We have more data
sources popping up every day,” said Evan Levy,
vice president of business consulting at software vendor SAS Institute Inc. The key problem is figuring out how to move that data
around the organization and to deliver it to
business users, Levy said in a keynote speech
at a TDWI World Conference held jointly with
the BI summit. “I have a rat’s nest of code going
back and forth,” he said. “What do I know about
all the programs moving that data?”
Levy said IT and data management teams
will be best served by taking incremental approaches to dealing with the influx of big data.
He also told attendees to take cues from established manufacturing supply chain applications, which deliver final, uniform products
working from diverse raw materials.
SIGHTS SET ON HADOOP
The growing supply of Web data challenges
conventional data warehouse and analytics delivery planning, according to Jessica Thorud,
director of enterprise travel data warehouse
and business intelligence solutions at Sabre
Holdings Corp. in Southlake, Texas. She said
the travel reservation systems developer is
moving to alternative data strategies due in
part to the huge volume of shopping data it is
gathering.
“We know it is coming, and we know how to
do it,” said Thorud, who is continually looking
for new and better ways to integrate modern
Hadoop big data tools with enterprise data
warehouses and analytics systems. “We hope
the [Hadoop] tools continue to evolve so we
can connect them to the BI tools,” she said.
While it’s early in the big data era, Thorud
predicts that her company will one day deliver big data analytics to customers, especially
those focused on marketing applications. “They
want the insight. They have ideas on how to
use it in decision support,” she said. To deploy
Hadoop, Sabre selected the jointly produced
Oracle-Cloudera Big Data Appliance, which is
based on the open source distributed processing software.
Thorud’s comments came as she took part
in a summit session on delivering innovation
in travel through data and BI products. She
also encouraged attendees to focus on usability and design when building BI and analytics
applications, and she reminded them to “know
your customers’ needs and capabilities.”
MARKETING TAKES THE LEAD
Like Thorud, Wixom noted that marketing-focused applications have garnered significant
attention among new and innovative analytics
initiatives. She told the TDWI audience that
she had studied the early efforts of data warehousing, finding that back then, marketing also
led the way in adoption.
“There is lots of buzz around big data. Different people have different concepts,” said
conference attendee Masood Ali, information
management architect at the Royal Bank of
Canada, who came to the TDWI summit in part
to help determine the correct big data strategy
for his organization. Despite the hype, he sees
greater use on the way.
“Big data will soon become normal data,” he
said. “The important thing is to build for purpose, so big data has value.” —Jack Vaughan
APPLICATIONS
Hadoop Lights Path to Consumer Insights
Analytics services provider Luminar is counting on big data technology to help
it deliver valuable insights about U.S. Latino
consumers to marketers and other corporate
clients. The Denver-based company ditched
a traditional data warehouse setup in favor of
a system built around the Hortonworks Data
Platform—a Hadoop distribution—in an effort
to speed up the analytics process and make it
easier to manage the data being collected.
The idea behind the company’s service, says
Luminar President Franklin Rios, is to provide
clients with a far more reliable alternative to
focus groups and surveys—and to provide better information about Latino communities than
ever before. For example, Luminar, a unit of
Spanish-language media company Entravision
Communications Corp., can tell its customers
how Cubans in Miami spend their money on
technology, or how much the typical Puerto Rican male in New York spends on food.
SearchBusinessAnalytics.com spoke with
Rios about how his company is using the Hadoop system to support its big data analytics
efforts. Excerpts from the interview follow:
What is Luminar and what does it do?
In short, Luminar is an analytics and modeling
company that specializes in the U.S. Latino consumer. What I mean by that is we drive insights
through analytics and modeling by diving into
[data] that we ingest from multiple sources.
We give consumer packaged goods companies
(CPGs), retailers or what have you insights into
the true behavior of the Latino consumer.
Why did you see a need for a Latino-focused market analytics provider?
Before Luminar, a lot of marketers or advertising agents wanted to start reaching out
to the Latino consumer, [but] they relied on
highly sampled data from the usual suspects.
Companies would use focus groups and self-reported panels and [small surveys]. They
would extrapolate [from that] and do statistical numbers to suggest how the rest of the 52
million Latinos in the U.S. behave. There is $1.5
trillion worth of purchasing power in the U.S.
Hispanic space. With 52 million Latino men,
women and children in the U.S., I don’t know
how a sample of 10,000 self-reported Latinos
can give any true indication to any retailer or
CPG on their behavior.
What is Luminar doing that is different
from that “legacy” approach?
The traditional way is highly sampled. So we’re
saying that we are not going to do that. We’re
going to take transactional data that we’re going to license from multiple sources. We have
about 2,000 sources of data that we ingest and
we analyze and we clean up, and then we apply what we call cultural filters in order to truly
find out who is a Latino.
Some of the data you receive must be
ambiguous. How do you tackle the
problem of identifying who is who?
We have access to data that comes from loyalty
systems. So if you belong to a grocery store
loyalty system, your name and address and all
of that is in there. But how we truly start deriving who is a Latino is by starting to look at the
purchases and the transactions that the household has made. We also have access to things
like magazine subscriptions, and we know if
somebody is getting their utility bill in Spanish, and all kinds of stuff. We use a scoring
mechanism that says if you’re doing 55 or 100
of these behaviors, and if in the grocery store
the contents of your basket contain products
that are very much Latino products for cooking
and such, then the scoring keeps going up and
up and up to identify a Latino.
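The additive scoring idea described in this answer can be sketched in a few lines. This is not Luminar's actual model: the signal names, weights, product codes and threshold below are all hypothetical, chosen only to show how evidence from several sources could accumulate into a single score.

```python
from dataclasses import dataclass, field
from typing import List, Set

# All signal names, weights, product codes and the threshold are
# hypothetical; the interview does not describe the real values.
SIGNAL_WEIGHTS = {
    "spanish_language_utility_bill": 3,
    "spanish_language_magazine": 2,
}
POINTS_PER_FLAGGED_PRODUCT = 1
MATCH_THRESHOLD = 5


@dataclass
class Household:
    household_id: str
    signals: Set[str] = field(default_factory=set)
    basket: List[str] = field(default_factory=list)


def cultural_score(household: Household, flagged_products: Set[str]) -> int:
    """Add points for each observed signal and each flagged basket item."""
    signal_points = sum(SIGNAL_WEIGHTS.get(s, 0) for s in household.signals)
    basket_points = POINTS_PER_FLAGGED_PRODUCT * sum(
        1 for item in household.basket if item in flagged_products
    )
    return signal_points + basket_points


def passes_filter(
    household: Household,
    flagged_products: Set[str],
    threshold: int = MATCH_THRESHOLD,
) -> bool:
    """True when the accumulated score reaches the assumed threshold."""
    return cultural_score(household, flagged_products) >= threshold


flagged = {"sku-101", "sku-102"}  # placeholder product codes
example = Household(
    household_id="h1",
    signals={"spanish_language_utility_bill"},
    basket=["sku-101", "sku-999", "sku-102"],
)
print(cultural_score(example, flagged), passes_filter(example, flagged))
```

Keeping the weights in a plain lookup table makes it straightforward to add new signals as more data sources are licensed.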
Luminar has access to names, addresses and
other personal information about individuals.
How do you deal with privacy concerns?
The privacy issue would come into play if I
were sharing that data with my clients at the
personal level, and I’m not. I’m aggregating it to
create personas, and we identify those personas
at a group level and we start telling the behavior of the personas to our clients.
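As a rough illustration of reporting at the persona level instead of the personal level, the sketch below groups household-level records and returns only group aggregates. The field names, persona labels and sample figures are hypothetical, and the minimum group size used to suppress small groups is an added assumption, not something described in the interview.

```python
from collections import defaultdict
from statistics import mean

MIN_GROUP_SIZE = 3  # assumed suppression threshold, not from the interview


def aggregate_by_persona(records, min_group_size=MIN_GROUP_SIZE):
    """Return average spend per (persona, category), dropping small groups."""
    groups = defaultdict(list)
    for rec in records:
        # Only the persona label, category and amount are carried forward;
        # household identifiers never reach the output.
        groups[(rec["persona"], rec["category"])].append(rec["spend"])
    return {
        key: {"households": len(spends), "avg_spend": round(mean(spends), 2)}
        for key, spends in groups.items()
        if len(spends) >= min_group_size
    }


# Hypothetical household-level input.
records = [
    {"household_id": "h1", "persona": "persona_a", "category": "technology", "spend": 120.0},
    {"household_id": "h2", "persona": "persona_a", "category": "technology", "spend": 80.0},
    {"household_id": "h3", "persona": "persona_a", "category": "technology", "spend": 100.0},
    {"household_id": "h4", "persona": "persona_b", "category": "food", "spend": 60.0},
]

report = aggregate_by_persona(records)
print(report)
```

In this sample the single persona_b household is left out of the report because its group falls below the assumed minimum size.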
Your company launched a Hortonworks-based Hadoop system in 2012. How were
you handling this operation before that?
We handled it in the traditional way. We had
a data warehouse with all of the legacy pluses
and minuses that you run into in a data warehouse environment. But we realized that we
needed to be significantly more agile in being
able to ingest data, clean it up, run our cultural
filters into it and analyze the data and start
deriving insights out of it. We are ingesting
2,000 sources of data and the traditional tools
for data processing were not cutting it.
Are you still running the data warehouse
as well as the Hadoop system?
Nope. It’s off. Thank goodness we turned
it off. Not [only] am I happy but my CFO is
happy because of the savings. Currently, we
do the [extract, transform and load] processing of data using Talend software and the
data gets ingested either via Talend or using
Sqoop directly into Hortonworks. Once it’s
in Hortonworks, we then use a combination
of Hive and R to load our analytical models.
Then, the results of that are presented via
Tableau.
What advice do you have for other companies
considering a big data analytics initiative?
If anybody is starting to look into migrating
into this technology, don’t try to do it alone.
You’ve got to have the right technology partner
or consulting partner to help you through this
project. Also, don’t be geographically limited
in terms of who you evaluate. We found talent
in Latin America that was able to help us with
this process. —Mark Brunelli