The Technology Forecast review, Reshaping the workforce with the new analytics, examines the impact of new analytics tools and of the data culture that organizations can build with new data analysis tools and services.
Technology Forecast: Reshaping the workforce with the new analytics
A quarterly journal
2012, Issue 1

06 The third wave of customer analytics
30 The art and science of new analytics technology
44 Natural language processing and social media intelligence
58 Building the foundation for a data science culture

Mike Driscoll
CEO, Metamarkets
Acknowledgments

Advisory
Principal & Technology Leader
Tom DeGarmo

US Thought Leadership
Partner-in-Charge
Tom Craren

Strategic Marketing
Natalie Kontra
Jordana Marx

Center for Technology & Innovation
Managing Editor
Bo Parker

Editors
Vinod Baya
Alan Morrison

Contributors
Galen Gruman
Steve Hamby and Orbis Technologies
Bud Mathaisel
Uche Ogbuji
Bill Roberts
Brian Suda

Editorial Advisors
Larry Marion

Copy Editor
Lea Anne Bantsari

Transcriber
Dawn Regan
02 PwC Technology Forecast 2012 Issue 1
US studio
Design Lead
Tatiana Pechenik

Designer
Peggy Fresenburg

Illustrators
Don Bernhardt
James Millefolie

Production
Jeff Ginsburg

Online
Managing Director Online Marketing
Jack Teuber

Designer and Producer
Scott Schmidt

Animator
Roger Sano

Reviewers
Jeff Auker
Ken Campbell
Murali Chilakapati
Oliver Halter
Matt Moore
Rick Whitney

Special thanks
Cate Corcoran, WIT Strategy
Nisha Pathak, Metamarkets
Lisa Sheeran, Sheeran/Jager Communication

Industry perspectives
During the preparation of this publication, we benefited greatly from interviews and conversations with the following executives:

Kurt J. Bilafer, Regional Vice President, Analytics, Asia Pacific Japan, SAP
Jonathan Chihorek, Vice President, Global Supply Chain Systems, Ingram Micro
Zach Devereaux, Chief Analyst, Nexalogy Environics
Mike Driscoll, Chief Executive Officer, Metamarkets
Elissa Fink, Chief Marketing Officer, Tableau Software
Kaiser Fung, Adjunct Professor, New York University
Kent Kushar, Chief Information Officer, E. & J. Gallo Winery
Josée Latendresse, Owner, Latendresse Groupe Conseil
Mario Leone, Chief Information Officer, Ingram Micro
Jock Mackinlay, Director, Visual Analysis, Tableau Software
Jonathan Newman, Senior Director, Enterprise Web & EMEA eSolutions, Ingram Micro
Ashwin Rangan, Chief Information Officer, Edwards Lifesciences
Seth Redmore, Vice President, Marketing and Product Management, Lexalytics
Vince Schiavone, Co-founder and Executive Chairman, ListenLogic
Jon Slade, Global Online and Strategic Advertising Sales Director, Financial Times
Claude Théoret, President, Nexalogy Environics
Saul Zambrano, Senior Director, Customer Energy Solutions, Pacific Gas & Electric
The right data + the right resolution = a new culture of inquiry

Message from the editor

James Balog¹ may have more influence on the global warming debate than any scientist or politician. By using time-lapse photographic essays of shrinking glaciers, he brings art and science together to produce striking visualizations of real changes to the planet. In 60 seconds, Balog shows changes to glaciers that take place over a period of many years, introducing forehead-slapping insight to a topic that can be as difficult to see as carbon dioxide. Part of his success can be credited to creating the right perspective. If the photographs had been taken too close to or too far away from the glaciers, the insight would have been lost. Data at the right resolution is the key.

Glaciers are immense, at times more than a mile deep. Amyloid particles that are the likely cause of Alzheimer’s disease sit at the other end of the size spectrum. Scientists’ understanding of the role of amyloid particles in Alzheimer’s has relied heavily on technologies such as scanning tunneling microscopes.² These devices generate visual data at sufficient resolution so that scientists can fully explore the physical geometry of amyloid particles in relation to the brain’s neurons. Once again, data at the right resolution together with the ability to visually understand a phenomenon are moving science forward.

Science has long focused on data-driven understanding of phenomena. It’s called the scientific method. Enterprises also use data for the purposes of understanding their business outcomes and, more recently, the effectiveness and efficiency of their business processes. But because running a business is not the same as running a science experiment, there has long been a divergence between analytics as applied to science and the methods and processes that define analytics in the enterprise.

This difference partly has been a question of scale and instrumentation. Even a large science experiment (setting aside the Large Hadron Collider) will introduce sufficient control around the inquiry of interest to limit the amount of data collected and analyzed. Any large enterprise comprises tens of thousands of moving parts, from individual employees to customers to suppliers to products and services. Measuring and retaining the data on all aspects of an enterprise over all relevant periods of time are still extremely challenging, even with today’s IT capacities.

But targeting the most important determinants of success in an enterprise context for greater instrumentation (often customer information) can be and is being done today. And with Moore’s Law continuing to pay dividends, this instrumentation will expand in the future. In the process, and with careful attention to the appropriate resolution of the data being collected, enterprises that have relied entirely on the art of management will increasingly blend in the science of advanced analytics. Not surprisingly, the new role emerging in the enterprise to support these efforts is often called a “data scientist.”

This issue of the Technology Forecast examines advanced analytics through this lens of increasing instrumentation. PwC’s view is that the flow of data at this new, more complete level of resolution travels in an arc beginning with big data techniques (including NoSQL and in-memory databases), through advanced statistical packages (from the traditional SPSS and SAS to open source offerings such as R), to analytic visualization tools that put interactive graphics in the control of business unit specialists. This arc is positioning the enterprise to establish a new culture of inquiry, where decisions are driven by analytical precision that rivals scientific insight.

The first article, “The third wave of customer analytics,” on page 06 reviews the impact of basic computing trends on emerging analytics technologies. Enterprises have an unprecedented opportunity to reshape how business gets done, especially when it comes to customers. The second article, “The art and science of new analytics technology,” on page 30 explores the mix of different techniques involved in making the insights gained from analytics more useful, relevant, and visible. Some of these techniques are clearly in the data science realm, while others are more art than science. The article, “Natural language processing and social media intelligence,” on page 44 reviews many different language analytics techniques in use for social media and considers how combinations of these can be most effective. “How CIOs can build the foundation for a data science culture” on page 58 considers new analytics as an unusually promising opportunity for CIOs. In the best case scenario, the IT organization can become the go-to group, and the CIO can become the true information leader again.

This issue also includes interviews with executives who are using new analytics technologies and with subject matter experts who have been at the forefront of development in this area:

• Mike Driscoll of Metamarkets considers how NoSQL and other analytics methods are improving query speed and providing greater freedom to explore.
• Jon Slade of the Financial Times (FT.com) discusses the benefits of cloud analytics for online ad placement and pricing.
• Jock Mackinlay of Tableau Software describes the techniques behind interactive visualization and how more of the workforce can become engaged in analytics.
• Ashwin Rangan of Edwards Lifesciences highlights new ways that medical devices can be instrumented and how new business models can evolve.

Please visit pwc.com/techforecast to find these articles and other issues of the Technology Forecast online. If you would like to receive future issues of this quarterly publication as a PDF attachment, you can sign up at pwc.com/techforecast/subscribe.

As always, we welcome your feedback and your ideas for future research and analysis topics to cover.

Tom DeGarmo
US Technology Consulting Leader
thomas.p.degarmo@us.pwc.com

¹ http://www.jamesbalog.com/.
² Davide Brambilla, et al., “Nanotechnologies for Alzheimer’s disease: diagnosis, therapy, and safety issues,” Nanomedicine: Nanotechnology, Biology and Medicine 7, no. 5 (2011): 521–540.
Bahrain World Trade Center gets approximately 15% of its power from these wind turbines
The third wave of customer analytics

These days, there’s only one way to scale the analysis of customer-related information to increase sales and profits: by tapping the data and human resources of the extended enterprise.

By Alan Morrison and Bo Parker

As director of global online and strategic advertising sales for FT.com, the online face of the Financial Times, Jon Slade says he “looks at the 6 billion ad impressions [that FT.com offers] each year and works out which one is worth the most for any particular client who might buy.” This activity previously required labor-intensive extraction methods from a multitude of databases and spreadsheets. Slade made the process much faster and vastly more effective after working with Metamarkets, a company that offers a cloud-based, in-memory analytics service called Druid.

“Before, the sales team would send an e-mail to ad operations for an inventory forecast, and it could take a minimum of eight working hours and as long as two business days to get an answer,” Slade says. Now, with a direct interface to the data, it takes a mere eight seconds, freeing up the ad operations team to focus on more strategic issues. The parallel processing, in-memory technology, the interface, and many other enhancements led to better business results, including double-digit growth in ad yields and 15 to 20 percent accuracy improvement in the metrics for its ad impression supply.

The technology trends behind FT.com’s improvements in advertising operations (more accessible data; faster, less-expensive computing; new software tools; and improved user interfaces) are driving a new era in analytics use at large companies around the world, in which enterprises make decisions with a precision comparable to scientific insight. The new analytics uses a rigorous scientific method, including hypothesis formation and testing, with science-oriented statistical packages and visualization tools. It is spawning business unit “data scientists” who are replacing the centralized analytics units of the past. These trends will accelerate, and business leaders who embrace the new analytics will be able to create cultures of inquiry that lead to better decisions throughout their enterprises. (See Figure 1.)

Figure 1: How better customer analytics capabilities are affecting enterprises

• More computing speed, storage, and ability to scale: Processing power and memory keep increasing, the ability to leverage massive parallelization continues to expand in the cloud, and the cost per processed bit keeps falling.

Leads to

• More time and better tools: Data scientists are seeking larger data sets and iterating more to refine their questions and find better answers. Visualization capabilities and more intuitive user interfaces are making it possible for most people in the workforce to do at least basic exploration.
• More data sources: Social media data is the most prominent example of the many large data clouds emerging that can help enterprises understand their customers better. These clouds augment data that business units have direct access to internally now, which is also growing.
• More focus on key metrics: A core single metric can be a way to rally the entire organization’s workforce, especially when that core metric is informed by other metrics generated with the help of effective modeling.
• Better access to results: Whether an enterprise is a gaming or an e-commerce company that can instrument its own digital environment, or a smart grid utility that generates, slices, dices, and shares energy consumption analytics for its customers and partners, better analytics are going direct to the customer as well as other stakeholders. And they’re being embedded where users can more easily find them.

Leads to

• A broader culture of inquiry: Visualization and user interface improvements have made it possible to spread ad hoc analytics capabilities across the workplace to every user role. At the same time, data scientists (people who combine a creative ability to generate useful hypotheses with the savvy to simulate and model a business as it’s changing) have never been in more demand than now.

Leads to

• Less guesswork, less bias, more awareness, better decisions: The benefits of a broader culture of inquiry include new opportunities, a workforce that shares a better understanding of customer needs to be able to capitalize on the opportunities, and reduced risk. Enterprises that understand the trends described here and capitalize on them will be able to change company culture and improve how they attract and retain customers.

This issue of the Technology Forecast explores the impact of the new analytics and this culture of inquiry. This first article examines the essential ingredients of the new analytics, using several examples. The other articles in this issue focus on the technologies behind these capabilities (see the article, “The art and science of new analytics technology,” on page 30) and identify the main elements of a CIO strategic framework for effectively taking advantage of the full range of analytics capabilities (see the article, “How CIOs can build the foundation for a data science culture,” on page 58).
More computing speed, storage, and ability to scale

Basic computing trends are providing the momentum for a third wave in analytics that PwC calls the new analytics. Processing power and memory keep increasing, the ability to leverage massive parallelization continues to expand in the cloud, and the cost per processed bit keeps falling.

FT.com benefited from all of these trends. Slade needs multiple computer screens on his desk just to keep up. His job requires a deep understanding of the readership and which advertising suits them best. Ad impressions (appearances of ads on web pages) are the currency of high-volume media industry websites. The impressions need to be priced based on the reader segments most likely to see them and click through. Chief executives in France, for example, would be a reader segment FT.com would value highly.

“The trail of data that users create when they look at content on a website like ours is huge,” Slade says. “The real challenge has been trying to understand what information is useful to us and what we do about it.”

FT.com’s analytics capabilities were a challenge, too. “The way that data was held (the demographics data, the behavior data, the pricing, the available inventory) was across lots of different databases and spreadsheets,” Slade says. “We needed an almost witchcraft-like algorithm to provide answers to ‘How many impressions do I have?’ and ‘How much should I charge?’ It was an extremely labor-intensive process.”

FT.com saw a possible solution when it first talked to Metamarkets about an initial concept, which evolved as they collaborated. Using Metamarkets’ analytics platform, FT.com could quickly iterate and investigate numerous questions to improve its decision-making capabilities. “Because our technology is optimized for the cloud, we can harness the processing power of tens, hundreds, or thousands of servers depending on our customers’ data and their specific needs,” states Mike Driscoll, CEO of Metamarkets. “We can ask questions over billions of rows of data in milliseconds. That kind of speed combined with data science and visualization helps business users understand and consume information on top of big data sets.”

Decades ago, in the first wave of analytics, small groups of specialists managed computer systems, and even smaller groups of specialists looked for answers in the data. Businesspeople typically needed to ask the specialists to query and analyze the data. As enterprise data grew, collected from enterprise resource planning (ERP) systems and other sources, IT stored the more structured data in warehouses so analysts could assess it in an integrated form. When business units began to ask for reports from collections of data relevant to them, data marts were born, but IT still controlled all the sources.

The second wave of analytics saw variations of centralized top-down data collection, reporting, and analysis. In the 1980s, grassroots decentralization began to counter that trend as the PC era ushered in spreadsheets and other methods that quickly gained widespread use, and often a reputation for misuse. Data warehouses and marts continue to store a wealth of helpful data.

In both waves, the challenge for centralized analytics was to respond to business needs when the business units themselves weren’t sure what findings they wanted or clues they were seeking.

The third wave does that by giving access and tools to those who act on the findings. New analytics taps the expertise of the broad business ecosystem to address the lack of responsiveness from central analytics units. (See Figure 2.) Speed, storage, and scale improvements, with the help of cloud co-creation, have made this decentralized analytics possible. The decentralized analytics innovation has evolved faster than the centralized variety, and PwC expects this trend to continue.

Figure 2: The three waves of analytics and the impact of decentralization
Cloud computing accelerates decentralization of the analytics function. The stages along the trend toward decentralization:

• Central IT generated: Analytics functions in enterprises were all centralized in the beginning, but not always responsive to business needs.
• Self-service: PCs and then the web and an increasingly interconnected business ecosystem have provided more responsive alternatives.
• Cloud co-creation (data in the cloud): The trend toward decentralization continues as business units, customers, and other stakeholders collaborate to diagnose and work on problems of mutual interest in the cloud.

“In the middle of looking at some data, you can change your mind about what question you’re asking. You need to be able to head toward that new question on the fly,” says Jock Mackinlay, director of visual analysis at Tableau Software, one of the vendors of the new visualization front ends for analytics. “No automated system is going to keep up with the stream of human thought.”

More time and better tools

Big data techniques, including NoSQL¹ and in-memory databases, advanced statistical packages (from SPSS and SAS to open source offerings such as R), visualization tools that put interactive graphics in the control of business unit specialists, and more intuitive user interfaces, are crucial to the new analytics. They make it possible for many people in the workforce to do some basic exploration. They allow business unit data scientists to use larger data sets and to iterate more as they test hypotheses, refine questions, and find better answers to business problems.

¹ See “Making sense of Big Data,” Technology Forecast 2010, Issue 3, http://www.pwc.com/us/en/technology-forecast/2010/issue3/index.jhtml, for more information on Hadoop and other NoSQL databases.

Data scientists are nonspecialists who follow a scientific method of iterative and recursive analysis with a practical result in mind. Even without formal training, some business users in finance, marketing, operations, human capital, or other departments
already have the skills, experience, and mind-set to be data scientists. Others can be trained. The teaching of the discipline is an obvious new focus for the CIO. (See the article, “How CIOs can build the foundation for a data science culture,” on page 58.)

Visualization tools have been especially useful for Ingram Micro, a technology products distributor, which uses them to choose optimal warehouse locations around the globe. Warehouse location is a strategic decision, and Ingram Micro can run many what-if scenarios before it decides. One business result is shorter-term warehouse leases that give Ingram Micro more flexibility as supply chain requirements shift due to cost and time.

“Ensuring we are at the efficient frontier for our distribution is essential in this fast-paced and tight-margin business,” says Jonathan Chihorek, vice president of global supply chain systems at Ingram Micro. “Because of the complexity, size, and cost consequences of these warehouse location decisions, we run extensive models of where best to locate our distribution centers at least once a year, and often twice a year.”

Modeling has become easier thanks to mixed-integer linear programming optimization tools that crunch large and diverse data sets encompassing many factors. “A major improvement came from the use of fast 64-bit processors and solid-state drives that reduced scenario run times from six to eight hours down to a fraction of that,” Chihorek says. “Another breakthrough for us has been improved visualization tools, such as spider and bathtub diagrams that help our analysts choose the efficient frontier curve from a complex array of data sets that otherwise look like lists of numbers.”

Analytics tools were once the province of experts. They weren’t intuitive, and they took a long time to learn. Those who were able to use them tended to have deep backgrounds in mathematics, statistical analysis, or some scientific discipline. Only companies with dedicated teams of specialists could make use of these tools. Over time, academia and the business software community have collaborated to make analytics tools more user-friendly and more accessible to people who aren’t steeped in the mathematical expressions needed to query and get good answers from data.

Products from QlikTech, Tableau Software, and others immerse users in fully graphical environments because most people gain understanding more quickly from visual displays of numbers than from tables. “We allow users to get quickly to a graphical view of the data,” says Tableau Software’s Mackinlay. “To begin with, they’re using drag and drop for the fields in the various blended data sources they’re working with. The software interprets the drag and drop as algebraic expressions, and that gets compiled into a query database. But users don’t need to know all that. They just need to know that they suddenly get to see their data in a visual form.”

Tableau Software itself is a prime example of how these tools are changing the enterprise. “Inside Tableau we use Tableau everywhere, from the receptionist who’s keeping track of conference room utilization to the salespeople who are monitoring their pipelines,” Mackinlay says.

These tools are also enabling more finance, marketing, and operational executives to become data scientists, because they help them navigate the data thickets.

Case study
How the E. & J. Gallo Winery matches outbound shipments to retail customers

E. & J. Gallo Winery, one of the world’s largest producers and distributors of wines, recognizes the need to precisely identify its customers for two reasons: some local and state regulations mandate restrictions on alcohol distribution, and marketing brands to individuals requires knowing customer preferences.

“The majority of all wine is consumed within four hours and five miles of being purchased, so this makes it critical that we know which products need to be marketed and distributed by specific destination,” says Kent Kushar, Gallo’s CIO.

Gallo knows exactly how its products move through distributors, but tracking beyond them is less clear. Some distributors are state liquor control boards, which supply the wine products to retail outlets and other end customers. Some sales are through military post exchanges, and in some cases there are restrictions and regulations because they are offshore.

Gallo has a large compliance department to help it manage the regulatory environment in which Gallo products are sold, but Gallo wants to learn more about the customers who eventually buy and consume those products, and to learn from them information to help create new products that localize tastes.

Gallo sometimes cannot obtain point-of-sale data from retailers to complete the match of what goes out to what is sold. Syndicated data, from sources such as Information Resources, Inc. (IRI), serves as the matching link between distribution and actual consumption. This results in the accumulation of more than 1 GB of data each day as source information for compliance and marketing.

Years ago, Gallo’s senior management understood that customer analytics would be increasingly important. The company’s most recent investments are extensions of what it wanted to do 25 years ago but was limited by availability of data and tools. Since 1998, Gallo IT has been working on advanced data warehouses, analytics tools, and visualization. Gallo was an early adopter of visualization tools and created IT subgroups within brand marketing to leverage the information gathered.

The success of these early efforts has spurred Gallo to invest even more in analytics. “We went from step function growth to logarithmic growth of analytics; we recently reinvested heavily in new appliances, a new system architecture, new ETL [extract, transform, and load] tools, and new ways our SQL calls were written; and we began to coalesce unstructured data with our traditional structured consumer data,” says Kushar.

“Recognizing the power of these capabilities has resulted in our taking a 10-year horizon approach to analytics,” he adds. “Our successes with analytics to date have changed the way we think about and use analytics.”

The result is that Gallo no longer relies on a single instance database, but has created several large purpose-specific databases. “We have also created new service level agreements for our internal customers that give them faster access and more timely analytics and reporting,” Kushar says. Internal customers for Gallo IT include supply chain, sales, finance, distribution, and the web presence design team.
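The matching step in the Gallo case study, comparing what goes out with what is sold, can be pictured with a small sketch. This is only an illustration under stated assumptions: the record layout, the product and region names, and the function `match_shipments_to_sales` are all hypothetical, and nothing here reflects Gallo's actual systems or the format of IRI's syndicated data.

```python
from collections import defaultdict

def match_shipments_to_sales(shipments, syndicated_sales):
    """Compare cases shipped with cases sold for each (product, region) key.

    Both inputs are lists of (product, region, cases) tuples (hypothetical layout).
    """
    shipped = defaultdict(int)
    sold = defaultdict(int)
    for product, region, cases in shipments:
        shipped[(product, region)] += cases
    for product, region, cases in syndicated_sales:
        sold[(product, region)] += cases

    report = {}
    # Include keys that appear on only one side so unmatched volume is visible.
    for key in sorted(set(shipped) | set(sold)):
        report[key] = {
            "shipped": shipped[key],
            "sold": sold[key],
            "gap": shipped[key] - sold[key],
        }
    return report

# Invented sample data, for illustration only.
sample_shipments = [("merlot", "CA", 100), ("merlot", "CA", 50), ("chardonnay", "TX", 80)]
sample_sales = [("merlot", "CA", 120), ("chardonnay", "NV", 10)]

for key, row in match_shipments_to_sales(sample_shipments, sample_sales).items():
    print(key, row)
```

A positive gap marks volume that left the warehouse without a matching sale in the syndicated feed, and a negative gap marks sales with no recorded shipment to that region. Either one is a prompt for further tracing rather than a conclusion.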
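The warehouse location modeling that Chihorek describes can be illustrated with a toy version of the underlying problem: choose which sites to lease so that lease cost plus shipping cost is lowest. The sketch below is not Ingram Micro's model and does not use a mixed-integer solver; it swaps in exhaustive enumeration, which is only practical for a handful of candidate sites. The site names, regions, and every cost figure are invented for illustration.

```python
from itertools import combinations

def best_warehouse_set(lease_cost, ship_cost, demand):
    """Return (total_cost, open_sites) minimizing lease plus shipping cost.

    lease_cost: {site: annual lease cost}
    ship_cost:  {(site, region): cost to ship one unit}
    demand:     {region: units per year}
    """
    sites = sorted(lease_cost)
    best = None
    # Try every non-empty subset of candidate sites (brute force, not MILP).
    for k in range(1, len(sites) + 1):
        for open_sites in combinations(sites, k):
            total = sum(lease_cost[s] for s in open_sites)
            for region, units in demand.items():
                # Each region is served from its cheapest open site.
                total += units * min(ship_cost[(s, region)] for s in open_sites)
            if best is None or total < best[0]:
                best = (total, open_sites)
    return best

# Invented figures, for illustration only.
lease_cost = {"A": 500, "B": 600, "C": 400}
demand = {"east": 100, "west": 80}
ship_cost = {
    ("A", "east"): 4, ("A", "west"): 6,
    ("B", "east"): 2, ("B", "west"): 10,
    ("C", "east"): 9, ("C", "west"): 2,
}

print(best_warehouse_set(lease_cost, ship_cost, demand))  # (1360, ('B', 'C'))

# A what-if scenario: the lease at site B becomes more expensive.
what_if = dict(lease_cost, B=700)
print(best_warehouse_set(what_if, ship_cost, demand))  # (1380, ('A',))
```

The what-if run shows why short leases matter in this kind of analysis: a modest change in one cost input moves the best answer from two regional sites to a single central one.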
More data sources

The huge quantities of data in the cloud and the availability of enormous low-cost processing power can help enterprises analyze various business problems, including efforts to understand customers better, especially through social media. These external clouds augment data that business units already have direct access to internally.

Ingram Micro uses large, diverse data sets for warehouse location modeling, Chihorek says. Among them: size, weight, and other physical attributes of products; geographic patterns of consumers and anticipated demand for product categories; inbound and outbound transportation hubs, lead times, and costs; warehouse lease and operating costs, including utilities; and labor costs, to name a few.

Social media can also augment internal data for enterprises willing to learn how to use it. Some companies ignore social media because so much of the conversation seems trivial, but they miss opportunities.

Consider a North American apparel maker that was repositioning a brand of shoes and boots. The manufacturer was mining conventional business data for insights about brand status, but it had not conducted any significant analysis of social media conversations about its products, according to Josée Latendresse, who runs Latendresse Groupe Conseil, which was advising the company on its repositioning effort. “We were neglecting the wealth of information that we could find via social media,” she says.

To expand the analysis, Latendresse brought in technology and expertise from Nexalogy Environics, a company that analyzes the interest graph implied in online conversations, that is, the connections between people, places, and things. (See “Transforming collaboration with social tools,” Technology Forecast 2011, Issue 3, for more on interest graphs.) Nexalogy Environics studied millions of correlations in the interest graph and selected fewer than 1,000 relevant conversations from 90,000 that mentioned the products. In the process, Nexalogy Environics substantially increased the “signal” and reduced the “noise” in the social media about the manufacturer. (See Figure 3.)

Figure 3: Improving the signal-to-noise ratio in social media monitoring

• Social media is a high-noise environment: An initial set of relevant terms is used to cut back on the noise dramatically, a first step toward uncovering useful conversations.
• But there are ways to reduce the noise: With proper guidance, machines can do millions of correlations, clustering words by context and meaning.
• And focus on significant conversations: Visualization tools present “lexical maps” to help the enterprise unearth instances of useful customer dialog (illuminating and helpful dialogue).

Example terms shown in the figure: work boots, leather, heel, color, fashion, construction, safety, style, rugged, cool, shoes, toe, price, store, value, wear.

Source: Nexalogy Environics and PwC, 2012
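The first two steps in Figure 3, cutting noise with an initial set of relevant terms and then looking at which words occur together, can be sketched in a few lines. This is a deliberately simple stand-in and not Nexalogy Environics' method: plain keyword matching and pair counting replace the interest-graph analysis the article describes. The sample posts, the `min_hits` threshold, and the function names are assumptions made for illustration; the seed terms are borrowed from the figure.

```python
from collections import Counter
from itertools import combinations
import re

def tokenize(text):
    """Lowercase a post and return its distinct alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))

def filter_relevant(posts, seed_terms, min_hits=2):
    """Keep posts that mention at least min_hits of the seed terms."""
    seeds = set(seed_terms)
    return [p for p in posts if len(tokenize(p) & seeds) >= min_hits]

def cooccurrence(posts, vocabulary):
    """Count how often two vocabulary terms appear in the same post."""
    vocab = set(vocabulary)
    counts = Counter()
    for p in posts:
        terms = sorted(tokenize(p) & vocab)
        counts.update(combinations(terms, 2))
    return counts

# Invented posts, for illustration only.
posts = [
    "These work boots look cool, love the leather and the style",
    "Rugged boots, great for riding off-road",
    "Lunch was great today",
    "Safety toe boots for construction work, fair price",
    "New phone case in leather",
]
seed_terms = ["boots", "leather", "safety", "rugged", "construction", "style", "toe"]

relevant = filter_relevant(posts, seed_terms)
pairs = cooccurrence(relevant, seed_terms + ["cool", "fashion"])
print(len(relevant), "of", len(posts), "posts kept")
print(pairs.most_common(5))
```

In the apparel example the reduction was far steeper: fewer than 1,000 conversations selected from 90,000 is roughly 1 percent. Pair counts like these are also only raw material; clustering words by context and meaning, as the figure describes, takes considerably more than counting.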
Figure 4: Adding social media analysis techniques suggests other changes to the BI process

Here’s one example of how the larger business intelligence (BI) process might change with the addition of social media analysis.

One apparel maker started with its conventional BI analysis cycle.
Conventional BI techniques used by an apparel company client ignored social media and required lots of data cleansing. The results often lacked insight.
1. Develop questions
2. Collect data
3. Clean data
4. Analyze data
5. Present results

Then it added social media and targeted focus groups to the mix.
The company’s revised approach added several elements such as social media analysis and expanded others, but kept the focus group phase near the beginning of the cycle. The company was able to mine new insights from social media conversations about market segments that hadn’t occurred to the company to target before.
1. Develop questions
2. Refine conventional BI
   - Collect data
   - Clean data
   - Analyze data
3. Conduct focus groups (retailers and end users)
4. Select conversations
5. Analyze social media
6. Present results

Then it tuned the process for maximum impact.
The company’s current approach places focus groups near the end, where they can inform new questions more directly. This approach also stresses how the results get presented to executive leadership.
1. Develop questions
2. Refine conventional BI
   - Collect data
   - Clean data
   - Analyze data
3. Select conversations
4. Analyze social media
5. Present results
6. Tailor results to audience
7. Conduct focus groups (retailers and end users)
What Nexalogy Environics discovered suggested the next step for the brand repositioning. “The company wasn’t marketing to people who were blogging about its stuff,” says Claude Théoret, president of Nexalogy Environics. The shoes and boots were designed for specific industrial purposes, but the blogging influencers noted their fashion appeal and their utility when riding off-road on all-terrain vehicles and in other recreational settings. “That’s a whole market segment the company hadn’t discovered.”

Latendresse used the analysis to help the company expand and refine its intelligence process more generally. “The key step,” she says, “is to define the questions that you want to have answered. You will definitely be surprised, because the system will reveal customer attitudes you didn’t anticipate.”

Following the social media analysis (SMA), Latendresse saw the retailer and its user focus groups in a new light. The analysis “had more complete results than the focus groups did,” she says. “You could use the focus groups afterward to validate the information evident in the SMA.” The revised intelligence development process now places focus groups closer to the end of the cycle. (See Figure 4.)
Figure 5: The benefits of big data analytics: A carrier example

By analyzing billions of call records, carriers are able to obtain early warning of groups of subscribers likely to switch services. Here is how it works:

1. Carrier notes big peaks in churn.*
2. Dataspora brought in to analyze all call records (14 billion call data records analyzed).
3. The initial analysis debunks some myths and raises new questions discussed with the carrier. Carrier’s prime hypothesis disproved. Questions raised: Dropped calls/poor service? Merged to family plan? Preferred phone unavailable? Offer by competitor? Financial trouble? Dropped dead? Incarcerated? Friend dropped recently!
4. Further analysis confirms that friends influence other friends’ propensity to switch services.
5. Data group deploys a call record monitoring system that issues an alert that identifies at-risk subscribers.
6. Marketers begin campaigns that target at-risk subscriber groups with special offers.

Pattern spotted: Those with a relationship to a dropped customer (calls lasting longer than two minutes, more than twice in the previous month) are 500% more likely to drop.

* Churn: the proportion of contractual subscribers who leave during a given time period

Source: Metamarkets and PwC, 2012
Third parties such as Nexalogy Environics are among the first to take advantage of cloud analytics. Enterprises like the apparel maker may have good data collection methods but have overlooked opportunities to mine data in the cloud, especially social media. As cloud capabilities evolve, enterprises are learning to conduct more iteration, to question more assumptions, and to discover what else they can learn from data they already have.

More focus on key metrics
One way to start with new analytics is to rally the workforce around a single core metric, especially when that core metric is informed by other metrics generated with the help of effective modeling. The core metric and the model that helps everyone understand it can steep the culture in the language, methods, and tools around the process of obtaining that goal.

A telecom provider illustrates the point. The carrier was concerned about big peaks in churn (customers moving to another carrier) but hadn’t methodically mined the whole range of its call detail records to understand the issue. Big data analysis methods made a large-scale, iterative analysis possible. The carrier partnered with Dataspora, a consulting firm run by Driscoll before he founded Metamarkets. (See Figure 5.)[2]

“We analyzed 14 billion call data records,” Driscoll recalls, “and built a high-frequency call graph of customers who were calling each other. We found that if two subscribers who were friends spoke more than once for more than two minutes in a given month and the first subscriber cancelled their contract in October, then the second subscriber became 500 percent more likely to cancel their contract in November.”
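The call-graph rule in Driscoll's account can be illustrated with a minimal Python sketch. It is not Dataspora's implementation: the record layout `(caller, callee, seconds, month)`, the function names `strong_ties` and `at_risk`, and the sample data are all assumptions made for illustration. The sketch reads "spoke more than once for more than two minutes in a given month" as at least two calls in the month, each longer than 120 seconds.

```python
from collections import defaultdict

def strong_ties(calls, min_calls=2, min_seconds=120):
    """Map each month to the set of subscriber pairs with repeated long calls.

    `calls` is an iterable of (caller, callee, seconds, month) tuples
    (a hypothetical layout, not the carrier's actual record format).
    """
    counts = defaultdict(int)
    for caller, callee, seconds, month in calls:
        if caller != callee and seconds > min_seconds:
            # Direction is ignored: A calling B and B calling A count as one pair.
            counts[(month, frozenset((caller, callee)))] += 1
    ties = defaultdict(set)
    for (month, pair), n in counts.items():
        if n >= min_calls:
            ties[month].add(pair)
    return ties

def at_risk(calls, cancelled, month):
    """Subscribers with a strong tie in `month` to someone in `cancelled`."""
    flagged = set()
    for pair in strong_ties(calls).get(month, set()):
        if pair & cancelled:
            flagged |= pair - cancelled
    return flagged

# Made-up sample records for illustration only.
calls = [
    ("ann", "bob", 180, "oct"),
    ("bob", "ann", 150, "oct"),   # second long ann-bob call: a strong tie
    ("ann", "cy", 200, "oct"),    # only one long call: not a strong tie
    ("dee", "ann", 30, "oct"),    # too short to count
    ("dee", "ann", 500, "oct"),
]
print(at_risk(calls, {"ann"}, "oct"))  # {'bob'}
```

A rule like this only produces a watch list. The 500 percent figure in the quote comes from comparing later cancellation rates inside and outside such a list, which the sketch does not attempt.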
[2] For more best practices on methods to address churn, see Curing customer churn, PwC white paper, http://www.pwc.com/us/en/increasing-it-effectiveness/publications/curing-customer-churn.jhtml, accessed April 5, 2012.
Data mining on that scale required distributed computing across hundreds of servers and repeated hypothesis testing. The carrier assumed that dropped calls might be one reason why clusters of subscribers were cancelling contracts, but the Dataspora analysis disproved that notion, finding no correlation between dropped calls and cancellation.

“There were a few steps we took. One was to get access to all the data and next do some engineering to build a social graph and other features that might be meaningful, but we also disproved some other hypotheses,” Driscoll says. Watching what people actually did confirmed that circles of friends were cancelling in waves, which led to the peaks in churn. Intense focus on the key metric illustrated to the carrier and its workforce the power of new analytics.

Better access to results
The more pervasive the online environment, the more common the sharing of information becomes. Whether an enterprise is a gaming or an e-commerce company that can instrument its own digital environment, or a smart grid utility that generates, slices, dices, and shares energy consumption analytics for its customers and partners, better analytics are going direct to the customer as well as other stakeholders. And they’re being embedded where users can more easily find them.

For example, energy utilities preparing for the smart grid are starting to invite the help of customers by putting better data and more broadly shared operational and customer analytics at the center of a co-created energy efficiency collaboration.

Saul Zambrano, senior director of customer energy solutions at Pacific Gas & Electric (PG&E), an early installer of smart meters, points out that policymakers are encouraging more third-party access to the usage data from the meters. “One of the big policy pushes at the regulatory level is to create platforms where third parties can, assuming all privacy guidelines are met, access this data to build business models they can drive into the marketplace,” says Zambrano. “Grid management and energy management will be supplied by both the utilities and third parties.”

Zambrano emphasizes the importance of customer participation to the energy efficiency push. The issue he raises is the extent to which blended operational and customer data can benefit the larger ecosystem, by involving millions of residential and business customers. “Through the power of information and presentation, you can start to show customers different ways that they can become stewards of energy,” he says.

As a highly regulated business, the utility industry has many obstacles to overcome to get to the point where smart grids begin to reach their potential, but the vision is clear:

• Show customers a few key metrics and seasonal trends in an easy-to-understand form.

• Provide a means of improving those metrics with a deeper dive into where they’re spending the most on energy.

• Allow them an opportunity to benchmark their spending by providing comparison data.

This new kind of data sharing could be a chance to stimulate an energy efficiency competition that’s never existed between homeowners and between business property owners. It is also an example of how broadening access to new analytics can help create a culture of inquiry throughout the extended enterprise.
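As a rough illustration of the smart grid vision described for utilities (a few key metrics, seasonal trends, and benchmarking against comparison data), the Python sketch below computes a customer's seasonal averages and a simple peer ranking. It is not PG&E's method. The function names, the monthly kWh layout, the Northern Hemisphere season boundaries, and the sample numbers are all assumptions made for illustration.

```python
from bisect import bisect_left
from statistics import mean

# Assumed Northern Hemisphere seasons, keyed by calendar month number.
SEASONS = {
    "winter": (12, 1, 2),
    "spring": (3, 4, 5),
    "summer": (6, 7, 8),
    "autumn": (9, 10, 11),
}

def seasonal_average(monthly_kwh):
    """Average usage per season from a {month_number: kWh} mapping.

    Seasons with no readings are left out of the result.
    """
    result = {}
    for name, months in SEASONS.items():
        readings = [monthly_kwh[m] for m in months if m in monthly_kwh]
        if readings:
            result[name] = mean(readings)
    return result

def percentile_rank(value, peer_values):
    """Percent of peers whose usage is strictly lower than `value`."""
    ranked = sorted(peer_values)
    return 100.0 * bisect_left(ranked, value) / len(ranked)

# Made-up readings for one household and four comparable households.
usage = {1: 900, 2: 800, 7: 600}
print(seasonal_average(usage))                       # {'winter': 850, 'summer': 600}
print(percentile_rank(900, [500, 700, 900, 1100]))   # 50.0
```

A ranking like this only helps if the comparison group is genuinely comparable (similar home size and climate, for example), and choosing that group is the harder part of the benchmarking problem.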
Case study
Smart shelving: How the E. & J. Gallo Winery analytics team helps its retail partners

Some of the data in the E. & J. Gallo Winery information architecture is for production and quality control, not just customer analytics. More recently, Gallo has adopted complex event processing methods on the source information, so it can look at successes and failures early in its manufacturing execution system, sales order management, and the accounting system that front ends the general ledger.

Information and information flow are the lifeblood of Gallo, but it is clearly a team effort to make the best use of the information. In this team:

• Supply chain looks at the flows.

• Sales determines what information is needed to match supply and demand.

• R&D undertakes the heavy-duty customer data integration, and it designs pilots for brand consumption.

• IT provides the data and consulting on how to use the information.

Mining the information for patterns and insights in specific situations requires the team. A key goal is what Gallo refers to as demand sensing: to determine the stimulus that creates demand by brand and by product. This is not just a computer task, but is heavily based on human intervention to determine what the data reveal (for underlying trends of specific brands by location), or to conduct R&D in a test market, or to listen to the web platforms.

These insights inform a specific design for “smart shelving,” which is the placement of products by geography and location within the store. Gallo offers a virtual wine shelf design schematic to retailers, which helps the retailer design the exact details of how wine will be displayed: by brand, by type, and by price. Gallo’s wine shelf design schematic will help the retailer optimize sales, not just for Gallo brands but for all wine offerings.

Before Gallo’s wine shelf design schematic, wine sales were not a major source of retail profits for grocery stores, but now they are the first or second highest profit generators in those stores. “Because of information models such as the wine shelf design schematic, Gallo has been the wine category captain for some grocery stores for 11 years in a row so far,” says Kent Kushar, CIO of Gallo.
Conclusion: A broader culture of inquiry
This article has explored how enterprises are embracing the big data, tools, and science of new analytics along a path that can lead them to a broader culture of inquiry, in which improved visualization and user interfaces make it possible to spread ad hoc analytics capabilities to every user role. This culture of inquiry appears likely to become the age of the data scientists: workers who combine a creative ability to generate useful hypotheses with the savvy to simulate and model a business as it’s changing.

It’s logical that utilities are instrumenting their environments as a step toward smart grids. The data they’re generating can be overwhelming, but that data will also enable the analytics needed to reduce energy consumption to meet efficiency and environmental goals. It’s also logical that enterprises are starting to hunt for more effective ways to filter social media conversations, as apparel makers have found. The return on investment for finding a new market segment can be the difference between long-term viability and stagnation or worse.

Tackling the new kinds of data being generated is not the only analytics task ahead. Like the technology distributor, enterprises in all industries have concerns about scaling the analytics for data they’re accustomed to having and now have more. Publishers can serve readers better and optimize ad sales revenue by tuning their engines for timing, pricing, and pinpointing ad campaigns. Telecom carriers can mine all customer data more effectively to be able to reduce the expense of churn and improve margins.

What all of these examples suggest is a greater need to immerse the extended workforce (employees, partners, and customers) in the data and analytical methods they need. Without a view into everyday customer behavior, there’s no leverage for employees to influence company direction when
markets shift and there are no insights into improving customer satisfaction. Computing speed, storage, and scale make those insights possible, and it is up to management to take advantage of what is becoming a co-creative work environment in all industries, to create a culture of inquiry.

Of course, managing culture change is a much bigger challenge than simply rolling out more powerful analytics software. It is best to have several starting points and to continue to find ways to emphasize the value of analytics in new scenarios. One way to raise awareness about the power of new analytics comes from articulating the results in a visual form that everyone can understand. Another is to enable the broader workforce to work with the data themselves and to ask them to develop and share the results of their own analyses. Still another approach would be to designate, train, and compensate the more enthusiastic users in all units (finance, product groups, supply chain, human resources, and so forth) as data scientists. Table 1 presents examples of approaches to fostering a culture of inquiry.

The arc of all the trends explored in this article is leading enterprises toward establishing these cultures of inquiry, in which decisions can be informed by an analytical precision comparable to scientific insight. New market opportunities, an energized workforce with a stake in helping to achieve a better understanding of customer needs, and reduced risk are just some of the benefits of a culture of inquiry. Enterprises that understand the trends described here and capitalize on them will be able to improve how they attract and retain customers.

Table 1: Key elements of a culture of inquiry

Element | How it is manifested within an organization | Value to the organization
Executive support | Senior executives asking for data to support any opinion or proposed action and using interactive visualization tools themselves | Set the tone for the rest of the organization with examples
Data availability | Cloud architecture (whether private or public) and semantically rich data integration methods | Find good ideas from any source
Analytics tools | Higher-profile data scientists embedded in the business units | Identify hidden opportunities
Interactive visualization | Visual user interfaces and the right tool for the right person | Encourage a culture of inquiry
Training | Power users in individual departments | Spread the word and highlight the most effective and user-friendly techniques
Sharing | Internal portals or other collaborative environments to publish and discuss inquiries and results | Prove that the culture of inquiry is real
The nature of cloud-based data science

Mike Driscoll of Metamarkets talks about the analytics challenges and opportunities that businesses moving to the cloud face.

Interview conducted by Alan Morrison and Bo Parker

Mike Driscoll
Mike Driscoll is CEO of Metamarkets, a cloud-based analytics company he co-founded in San Francisco in 2010.

PwC: What’s your background, and how did you end up running a data science startup?
MD: I came to Silicon Valley after studying computer science and biology for five years, and trying to reverse engineer the genome network for uranium-breathing bacteria. That was my thesis work in grad school. There was lots of modeling and causal inference. If you were to knock this gene out, could you increase the uptake of the reduction of uranium from a soluble to an insoluble state? I was trying all these simulations and testing with the bugs to see whether you could achieve that.

PwC: You wanted to clean up radiation leaks at nuclear plants?
MD: Yes. The Department of Energy funded the research work I did. Then I came out here and I gave up on the idea of building a biotech company, because I didn’t think there was enough commercial viability there from what I’d seen.

I did think I could take this toolkit I’d developed and apply it to all these other businesses that have data. That was the genesis of the consultancy Dataspora. As we started working with companies at Dataspora, we found this huge gap between what was possible and what companies were actually doing.

Right now the real shift is that companies are moving from this very high-latency, coarse era of reporting into one where they start to have lower latency, finer granularity, and better
visibility into their operations. They realize the problem with being walking amnesiacs, knowing what happened to their customers in the last 30 days and then forgetting every 30 days.

Most businesses are just now figuring out that they have this wealth of information about their customers and how their customers interact with their products.

[Figure: Some companies don’t have all the capabilities they need to create data science value. Companies need these three capabilities to excel in creating data science value. Venn diagram of three circles labeled “Critical business questions,” “Good data,” and “Data science,” with “Value and change” at the intersection.]

PwC: On its own, the new availability of data creates demand for analytics.
MD: Yes. The absolute number-one thing driving the current focus in analytics is the increase in data. What’s different now from what happened 30 years ago is that analytics is the province of people who have data to crunch.

What’s causing the data growth? I’ve called it the attack of the exponentials: the exponential decline in the cost of compute, storage, and bandwidth, and the exponential increase in the number of nodes on the Internet. Suddenly the economics of computing over data has shifted so that almost all the data that businesses generate is worth keeping around for its analysis.

PwC: And yet, companies are still throwing data away.
MD: So many businesses keep only 60 days’ worth of data. The storage cost is so minimal! Why would you throw it away? This is the shift at the big data layer; when these companies store data, they store it in a very expensive relational database. There needs to be different temperatures of data, and companies need to put different values on the data, whether it’s hot or cold, whether it’s active. Most companies have only one temperature: they either keep it hot in a database, or they don’t keep it at all.

PwC: So they could just keep it in the cloud?
MD: Absolutely. We’re starting to see the emergence of cloud-based databases where you say, “I don’t need to maintain my own database on the premises. I can just rent some boxes in the cloud and they can persist our customer data that way.”

Metamarkets is trying to deliver DaaS (data science as a service). If a company doesn’t have analytics as a core competency, it can use a service like ours instead. There’s no reason for companies to be doing a lot of tasks that they are doing in-house. You need to pick and choose your battles.

We will see a lot of IT functions being delivered as cloud-based services. And now inside of those cloud-based services, you often will find an open source stack.

Here at Metamarkets, we’ve drawn heavily on open source. We have Hadoop on the bottom of our stack, and then at the next layer we have our own in-memory distributed database. We’re running on Amazon Web Services and have hundreds of nodes there.

PwC: How are companies that do have data science groups meeting the challenge? Take the example of an orphan drug that is proven to be safe but isn’t particularly effective for the application it was designed for. Data scientists won’t know enough about a broad range of potential biological systems for which that drug might be applicable, but the people who do have that knowledge don’t know the first thing about data science. How do you bring those two groups together?
MD: My data science Venn diagram helps illustrate how you bring those groups together. The diagram has three circles. [See above.] The first circle is data science. Data scientists are good at this. They can take data strings, perform processing, and transform them into data structures. They have great modeling skills, so they can use something like R or SAS and start to build a hypothesis that, for example, if a metric is three standard deviations above or below the specific threshold then someone may be more likely to cancel their membership. And data scientists are great at visualization.

But companies that have the tools and expertise may not be focused on a critical business question. A company is trying to build what it calls the technology genome. If you give them a list of parts in the iPhone, they can look and see how all those different parts are related to other parts in camcorders and laptops. They built this amazingly intricate graph of the
actual makeup. They’ve collected large amounts of data. They have PhDs from Caltech; they have Rhodes scholars; they have really brilliant people. But they don’t have any real critical business questions, like “How is this going to make me more money?”

The second circle in the diagram is critical business questions. Some companies have only the critical business questions, and many enterprises fall in this category. For instance, the CEO says, “We just released a new product and no one is buying it. Why?”

The third circle is good data. A beverage company or a retailer has lots of POS [point of sale] data, but it may not have the tools or expertise to dig in and figure out fast enough where a drink was selling and what demographics it was selling to, so that the company can react.

On the other hand, sometimes some web companies or small companies have critical business questions and they have the tools and expertise. But because they have no customers, they don’t have any data.

PwC: Without the data, they need to do a simulation.
MD: Right. The intersection in the Venn diagram is where value is created. When you think of an e-commerce company that says, “How do we upsell people and reduce the number of abandoned shopping carts?” Well, the company has 600 million shopping cart flows that it has collected in the last six years. So the company says, “All right, data science group, build a sequential model that shows what we need to do to intervene with people who have abandoned their shopping carts and get them to complete the purchase.”

PwC: The questioning nature of business (the culture of inquiry) seems important here. Some who lack the critical business questions don’t ask enough questions to begin with.
MD: It’s interesting: a lot of businesses have this focus on real-time data, and yet it’s not helping them get answers to critical business questions. Some companies have invested a lot in getting real-time monitoring of their systems, and it’s expensive. It’s harder to do and more fragile.

A friend of mine worked on the data team at a web company. That company developed, with a real effort, a real-time log monitoring framework where they can see how many people are logging in every second with 15-second latency across the ecosystem. It was hard to keep up and it was fragile. It broke down and they kept bringing it up, and then they realized that they take very few business actions in real time. So why devote all this effort to a real-time system?

PwC: In many cases, the data is going to be fresh enough, because the nature of the business doesn’t change that fast.
MD: Real time actually means two things. The first thing has to do with the freshness of data. The second has to do with the query speed.

By query speed, I mean that if you have a question, how long it takes to answer a question such as, “What were your top products in Malaysia around Ramadan?”

PwC: There’s a third one also, which is the speed to knowledge. The data could be staring you in the face, and you could have incredibly insightful things in the data, but you’re sitting there with your eyes saying, “I don’t know what the message is here.”
MD: That’s right. This is about how fast can you pull the data and how fast can you actually develop an insight from it.

For learning about things quickly enough after they happen, query speed is really important. This becomes a challenge at scale. One of the problems in the big data space is that databases used to be fast. You used to be able to ask a question of your inventory and you’d get an answer in seconds. SQL was quick when the scale wasn’t large; you could have an interactive dialogue with your data.
But now, because we’re collecting millions and millions of events a day, data platforms have seen real performance degradation. Lagging performance has led to degradation of insights. Companies literally are drowning in their data.

In the 1970s, when the intelligence agencies first got reconnaissance satellites, there was this proliferation in the amount of photographic data they had, and they realized that it paralyzed their decision making. So to this point of speed, I think there are a number of dimensions here. Typically when things get big, they get slow.

PwC: Isn’t that the problem the new in-memory database appliances are intended to solve?
MD: Yes. Our Druid engine on the back end is directly competitive with those proprietary appliances. The biggest difference between those appliances and what we provide is that we’re cloud based and are available on Amazon.

If your data and operations are in the cloud, it does not make sense to have your analytics on some appliance. We solve the performance problem in the cloud. Our mantra is visibility and performance at scale.

Data in the cloud liberates companies from some of these physical box confines and constraints. That means that your data can be used as inputs to other types of services. Being a cloud service really reduces friction. The coefficient of friction around data has for a long time been high, and I think we’re seeing that start to drop. Not just the scale or amount of data being collected, but the ease with which data can interoperate with different services, both inside your company and out.

I believe that’s where tremendous value lies.