DECEMBER 6TH, 2015
BUSINESS ANALYTICS AND
FALL TRIMESTER 2015 - SESSION B
Data-driven decision making is not a novel idea. Throughout history, competent
people have used whatever relevant information was available to make
better decisions. Classically speaking, more information means better decisions, and
thus the idea of Big Data was born. This assumption held true until very recently, when
advances in data gathering and storage technologies outpaced developments in
processing capabilities. This and other bottlenecks have forced professionals to
reevaluate how they interact with data.
Josh Wills, Senior Director of Data Science at Cloudera, explains that one, two, or
even thousands of data points may not be very useful from a business analytics
perspective; the value in Big Data is only retrievable when the data set is
massive (Cloudera). Massive data sets can be expensive to store, difficult to process,
and, if handled incorrectly, troublesome to learn from. As data science professionals
continue to evolve their methods of data collection and management, companies get the
benefit of richer, more relevant information.
Many companies have been collecting structured data for decades.
Structured data alone can be analyzed directly, but it will only provide
one- or two-dimensional trend analysis. Combining structured and
unstructured data creates the opportunity for automated reasoning
and, eventually, predictive analytics (Andriole).
The fastest growing form of big data is unstructured data. Unstructured implies that
the data is not intrinsically quantitative. This type of data can be highly contextual and
requires more advanced processing than simple statistical analysis. Almost 80% of
newly created data is unstructured (Cloudera). The table below lists some everyday
examples of this type of data.
The size and richness of these combined data sets make them
challenging to process. Filtering for temporal relevancy can help
home in on what really matters within the data. According to a
recent report from the job listing startup Textio, “Big Data” may not be
the hot topic it once was. Over the past year, job listings have
been transitioning to a new term: “Real Time Data” (Bass).
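As a rough illustration of filtering for temporal relevancy, the sketch below keeps only records that fall inside a recent time window. The record layout, the sample mentions, and the one-hour window are all assumptions made up for this example, not something taken from the cited sources.

```python
from datetime import datetime, timedelta


def filter_recent(records, now, window):
    """Keep only records whose timestamp falls within `window` before `now`."""
    cutoff = now - window
    return [r for r in records if cutoff <= r["timestamp"] <= now]


# Hypothetical social media mentions; all values are invented for illustration.
now = datetime(2015, 12, 6, 12, 0)
mentions = [
    {"text": "Love the new release", "timestamp": datetime(2015, 12, 6, 11, 45)},
    {"text": "Checkout page is down", "timestamp": datetime(2015, 12, 6, 11, 58)},
    {"text": "Great service last year", "timestamp": datetime(2014, 11, 2, 9, 0)},
]

# Only mentions from the last hour are passed on for analysis.
recent = filter_recent(mentions, now, timedelta(hours=1))
for m in recent:
    print(m["timestamp"].isoformat(), m["text"])
```

A shorter window trades completeness for freshness, which is the trade-off the "Real Time Data" trend emphasizes.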
In today’s mobile world, things happen fast. Social media can take ideas viral in mere
minutes. Data scientists realize that the “latest information is more important than
having a lot of information” (Bass). As the data sector continues to develop, methods for
capturing the right data at the right time will be top of mind. These new practices will
provide faster conclusions to more complex business problems. After all, decision
makers are not necessarily interested in the data itself, but in the secrets trapped inside.
Everyday examples of unstructured data:
Social media: tweets, blogs, Facebook posts
Call center notes
Emails, chat history
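To make the idea of combining structured and unstructured data more concrete, here is a minimal sketch that scores free-text call center notes with a simple keyword count and attaches the score to structured customer records. The customer data, the notes, and the keyword list are hypothetical, and keyword counting is only a stand-in for the more advanced text processing real systems would use.

```python
import re
from collections import Counter

# Structured data: hypothetical customer records keyed by customer id.
customers = {
    101: {"name": "Acme Co", "monthly_spend": 1200},
    102: {"name": "Birch LLC", "monthly_spend": 300},
}

# Unstructured data: hypothetical free-text call center notes.
notes = [
    (101, "Customer upset about late delivery, asked to cancel."),
    (102, "Asked about pricing for an upgrade."),
    (101, "Second complaint: delivery late again."),
]

# Assumed keyword list; a real system would use richer text analysis.
COMPLAINT_TERMS = {"upset", "complaint", "cancel", "late"}


def complaint_score(text):
    """Count how many words in the note appear in the complaint keyword list."""
    words = re.findall(r"[a-z]+", text.lower())
    return sum(1 for w in words if w in COMPLAINT_TERMS)


# Aggregate the text-derived score per customer.
scores = Counter()
for customer_id, text in notes:
    scores[customer_id] += complaint_score(text)

# Join the derived score onto the structured records.
combined = [
    {"id": cid, **record, "complaint_score": scores[cid]}
    for cid, record in customers.items()
]

for row in combined:
    print(row)
```

Once the text has been reduced to a number, it can be analyzed alongside spend or any other structured field.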
Andriole, Steve. "Unstructured Data: The Other Side of Analytics." Forbes. Forbes Magazine, 5 Mar. 2015.
Web. 06 Dec. 2015.
Bass, Dina. "Top 10 Rising and Falling Buzzwords in Tech Job Postings." Bloomberg.com. Bloomberg, 3
Nov. 2015. Web. 06 Dec. 2015.
Cloudera: Training A New Generation Of Data Scientists. Dir. Josh Wills. Cloudera. N.p., 3 Sept. 2013.
Web. 06 Dec. 2015.
The average person stores information on a computer using the built-in folder and file
system. For typical computer use, this system fits well and takes care of all the needs
of the user. However, when it comes to business analytics, the requirements are much
higher. Data sets are much larger and can contain many different types of data. The
sheer size of the data sets and the diversity of information types call for a more
sophisticated data management system. Businesses need databases.
Maxim Levkov summed it up by saying that “databases provide a systematic way to
access, process, and correlate data that can be stored for further use.” Databases
enable sets of information to be organized and effectively accessed. There are many
different types of databases for the many different types of data.
Relational databases recognize relationships
among stored pieces of information (McNutt).
Before the days of big data, this was the
most common form of database. Its speed and
reliability come from its clearly defined structure.
Data relationships are built across tables by
matching fields within rows. Users interact with
the data through Structured Query Language (SQL).
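The way relational tables are tied together by matching fields can be sketched with Python's built-in sqlite3 module. The table names, columns, and sample rows below are assumptions chosen for illustration only.

```python
import sqlite3

# An in-memory database keeps the example self-contained.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (
        order_id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customers(customer_id),
        amount REAL
    );
    INSERT INTO customers VALUES (1, 'Acme Co'), (2, 'Birch LLC');
    INSERT INTO orders VALUES (10, 1, 250.0), (11, 1, 100.0), (12, 2, 75.0);
""")

# The JOIN matches the customer_id field in both tables, which is how
# the relationship between customers and their orders is expressed.
rows = conn.execute("""
    SELECT c.name, SUM(o.amount) AS total
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.customer_id
    GROUP BY c.name
    ORDER BY c.name
""").fetchall()

for name, total in rows:
    print(name, total)

conn.close()
```

Because the structure is declared up front, the database can answer questions like "total order value per customer" with a single query.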
Today’s information demands a more dynamic and flexible platform. To overcome the
limitations of strictly defined, schema-style data management, “Not-only SQL” (NoSQL)
databases have been developed. These databases can handle diverse types of data,
and, unlike relational databases, NoSQL databases are built to scale (Moniruzzaman).
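The flexibility described above can be illustrated with plain Python structures standing in for two NoSQL styles. This is only a conceptual sketch: the dictionary and list below are not a real database, and the `find` helper, keys, and sample records are hypothetical.

```python
import json

# Key-value style: an opaque value is stored and fetched by a single key.
kv_store = {}
kv_store["session:42"] = "user=alice;cart=3"

# Document style: each record describes itself, and two documents in the
# same collection do not need to share the same fields.
documents = [
    {"_id": 1, "type": "tweet", "user": "alice", "text": "Loving the new app"},
    {"_id": 2, "type": "email", "from": "bob@example.com",
     "subject": "Invoice question", "attachments": ["invoice.pdf"]},
]


def find(collection, **criteria):
    """Return documents whose fields match every given criterion."""
    return [d for d in collection
            if all(d.get(k) == v for k, v in criteria.items())]


tweets = find(documents, type="tweet")
print(kv_store["session:42"])
print(json.dumps(tweets, indent=2))
```

Neither structure required a table definition before data was added, which is the contrast with the strictly defined relational model.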
The market has developed five major classes of NoSQL databases, each with its
particular strengths and weaknesses, as outlined in the chart below:
Even with advances in database architecture, relational databases have not been
completely replaced. In most cases, optimal data management will require some
combination of both the old and newer technologies.
NoSQL database classes (columns: Main Structure, Strengths, Weaknesses, Examples)
Column: groups of columns
Document: stores document rather than data
Graph: diagram of data
Key-Value: database key; simple and easy; no inherent data
XML: XML; nontraditional data
Brown, Meta S. "Next-Generation Databases Take On Big Data Management Challenges."
Forbes. Forbes Magazine, 30 June 2015. Web. 06 Dec. 2015.
Kumar, Girish. "Exploring the Different Types of NoSQL Databases Part II." 3Pillar Global.
N.p., 07 Oct. 2013. Web. 06 Dec. 2015.
McNutt, Louise-Anne. "Relational Database." Encyclopedia of Epidemiology. By Sarah
Boslaugh. Los Angeles: Sage Publications, 2008. 908-11. Web.
Moniruzzaman, A B M, and Syed Akhter Hossain. "NoSQL Database: New Era of Databases
for Big Data Analytics - Classification, Characteristics and Comparison." (n.d.): n. pag.
International Journal of Database Theory and Application. Web. 6 Dec. 2015.
Sources: Aggregated from Brown, Kumar, and Moniruzzaman
Infrastructure as it relates to business intelligence is another area that is in transition.
Historically, infrastructure was maintained with physical hardware on-site and in-house.
Small companies would either employ someone to manage IT or outsource the duties.
Large companies would have siloed IT departments that controlled access and protected
company data. When the idea of business intelligence began to develop as a
management tool, IT departments were expanded and given the responsibility of
The new trends in infrastructure break through the silos and head to the cloud.
Software as a Service, Infrastructure as a Service, and Platform as a Service offerings
allow businesses to access the most up-to-date technologies at minimal cost. Companies
no longer have to deal with building and maintaining their own data centers (Gorelik).
Enterprise-wide access to information leads to streamlined IT management and the
spreading of business analysts throughout the organization. The graphic below outlines
a possible workflow from the end user all the way up the chain to the developer.
At the top of the chain is the Developer. Developers are responsible for getting the business
systems working together. Businesses have data, and they need the proper infrastructure
in place to tie the different data sources together. Once the developer makes the data
container available, Business Analysts can begin working with reporting tools. This chart
lists all examples as “custom,” but there are many standard reporting platforms available.
Last in the line are the End Users. End users experience the final product of the
business intelligence tools. The dashboards and other interactive visualizations deliver
insights that pull data from the different business units. This company-wide distillation of
information is only possible due to the purposeful integration of the networked data
system. Without cloud infrastructure, this rich level of business intelligence would not be
Cloud infrastructure is not a requirement, but it does provide many benefits. At a
minimum, cloud infrastructure encourages open information across business units
within organizations and facilitates the rapid dissemination of business intelligence.
Baars, Henning, and Hans-Georg Kemper. "Management Support with Structured and
Unstructured Data-An Integrated Business Intelligence Framework." Taylor & Francis.
Information Systems Management, 7 Apr. 2008. Web. 06 Dec. 2015.
Gilliland, Dan. "NetSuite SuiteCloud Platform Overview." NetSuite. N.p., 28 July 2015. Web. 6
Gorelik, Eugene. "Cloud Computing Models." Cloud Computing (2013): n. pag. MIT.
Massachusetts Institute of Technology, Jan. 2013. Web. 6 Dec. 2015.
Zachman, John A. "Cloud Computing and Enterprise Architecture By: John A. Zachman."
Zachman International. N.p., 2011. Web. 06 Dec. 2015.
Analytics are evolving. As computers became commonplace in the corporate world,
companies could more easily store business data. The first stage of analytics was
explored by companies that collected their own transaction data (Davenport). Some
savvy companies would even cross-reference their internal data and look for ways to
improve efficiencies. This model carried on until computing power grew and
data pioneers began looking outside the company for more data feeds. This second
stage of analytics quickly grew to include many diverse data types and
sources (Davenport). Data sets exploded in size, and companies struggled to develop
the technologies needed to process and store these incredible amounts of information.
Today, agile companies are moving into the third stage of Davenport’s model. Companies
are leveraging their data to build better products. In this first wave of innovation,
companies have started to integrate multiple types of structured and unstructured data
from internal and external sources (Davenport). These intertwining “super sets” can
deliver completely new predictive and prescriptive insights.
The volume of data generated and recorded is growing at an exponential rate. As
companies learn about new data relationships, they should start integrating analytics
directly into their business processes (Davenport). These automated events will improve
speed of delivery and therefore increase the impact of the derived insights.
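One way to picture analytics embedded directly in a business process is a small routine that runs every time daily sales are recorded and returns a decision automatically. The sketch below is a hypothetical example, not a method described by Davenport: the `ReorderAdvisor` class, the moving-average rule, and all numbers are assumptions.

```python
from collections import deque


class ReorderAdvisor:
    """Flags a reorder when stock falls below expected demand over the lead time."""

    def __init__(self, window=3, lead_time_days=2):
        # Only the most recent `window` days of sales are kept.
        self.recent_sales = deque(maxlen=window)
        self.lead_time_days = lead_time_days

    def record_day(self, units_sold, stock_on_hand):
        """Record one day of sales and return True if a reorder is advised."""
        self.recent_sales.append(units_sold)
        expected_daily = sum(self.recent_sales) / len(self.recent_sales)
        expected_demand = expected_daily * self.lead_time_days
        return stock_on_hand < expected_demand


advisor = ReorderAdvisor(window=3, lead_time_days=2)
decisions = [
    advisor.record_day(units_sold=10, stock_on_hand=100),
    advisor.record_day(units_sold=20, stock_on_hand=80),
    advisor.record_day(units_sold=30, stock_on_hand=35),
]
print(decisions)
```

The decision is produced at the moment the data arrives, rather than waiting for an analyst to run a report, which is what shortens delivery time.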
These data initiatives need support from above if they are to stay on track. Adding the
role of Chief Analytics Officer to the C-suite will provide the required oversight. On the
ground, cross-disciplinary teams are key. Seemingly disparate data sources need to be
understood and combined to keep building on successes. Multi-disciplinary skills will
be invaluable on these data teams (Davenport).
Traditional data warehouses are losing popularity compared to new, more agile data
constructs. Companies need capable platforms that facilitate the transfer of data
sources in and out of the system.
The new age of data is all about prescriptive analytics. Key internal business processes
and external customer interactions should have analytics embedded as much as
possible. This recipe for advancement will enable companies to make better predictions
about customer needs, improve customer service, and eliminate pain points (5 Ways).
Humans can still think faster than computers can process, but data-intensive predictive
analytics can help augment human beings to better understand and solve problems
"5 Ways Companies Are Using Big Data to Help Their Customers." VentureBeat. IDA
Singapore, 21 Apr. 2014. Web. 06 Dec. 2015.
Davenport, Thomas H. "Analytics 3.0." Harvard Business Review. N.p., Dec. 2013. Web. 06
After gathering, cleaning, and analyzing the data, the final step is to share the findings.
Data analytics can reveal powerful insights, but if the findings cannot be communicated,
the effort is a waste of resources.
Sharing your findings usually involves written
reports or presentations, and visual aids help convey the results. Static
visualizations are the simplest form of expressing information. Printed maps, charts, and
graphs are common examples of static data visuals. More advanced interactive
visualizations are also becoming popular. Some interactive visuals let users manipulate
predefined filters, layers, and queries in order to look at a dataset from a different
perspective. In some cases, users are only cycling through different views of
pre-processed reports, but a few of the more exciting technologies, Tableau for example,
allow visualizations to be processed directly from the actual dataset (Spiegel).
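The difference between cycling through pre-processed reports and rendering a view directly from the dataset can be sketched in a few lines. The example below re-aggregates the raw rows each time a filter changes and draws a simple text bar chart. The sales data, the `render_view` function, and the one-mark-per-10-units scale are all assumptions for illustration; this is not how Tableau or any specific product works internally.

```python
from collections import defaultdict

# Hypothetical raw dataset; every view below is computed from these rows.
sales = [
    {"region": "East", "quarter": "Q1", "revenue": 40},
    {"region": "East", "quarter": "Q2", "revenue": 60},
    {"region": "West", "quarter": "Q1", "revenue": 30},
    {"region": "West", "quarter": "Q2", "revenue": 20},
]


def render_view(rows, group_by, **filters):
    """Apply the filters, total revenue per group, and draw one bar per group."""
    totals = defaultdict(int)
    for row in rows:
        if all(row[k] == v for k, v in filters.items()):
            totals[row[group_by]] += row["revenue"]
    # One '#' mark represents 10 units of revenue.
    lines = [f"{key:<5}{'#' * (total // 10)} {total}"
             for key, total in sorted(totals.items())]
    return "\n".join(lines)


by_region = render_view(sales, group_by="region")
q1_only = render_view(sales, group_by="region", quarter="Q1")

print(by_region)
print()
print(q1_only)
```

Changing the `quarter` filter plays the role of a user adjusting a predefined filter in an interactive visual: the chart is rebuilt from the data rather than swapped for a stored image.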
Data visualization can be an effective method of communicating your findings, but only if
the preceding steps are taken with care. As an observer of data visualizations, it is
important to understand the source of the data. If the visualization is to be credible, then
the source of the data must also be credible. Bad, dirty, incomplete, or irrelevant data
can undermine the quality of a visualization, and viewers of the final product often
cannot judge the quality of the data behind any publication.
Image: Getting Started with Data Visualization, Geoff McGhee, Stanford University
Crafting meaningful visualizations can be a challenge. Nancy Duarte at The Harvard
Business Review offers the following key points to consider:
1. Am I presenting or circulating my data?
• Presentation visuals need to be succinct. Use simple lines and contrasting colors
to prove the point. Have the backup data tables ready in case of questions, but do not
include them on the slides.
• When circulating information, provide more detail. Readers can use as much time
as they like to digest the information.
2. Am I using the right kind of chart or table?
• Be sure the visualization (chart) conveys the relationship you are claiming.
3. What message am I trying to convey?
• Use this question to identify and highlight the most important parts of the
4. Do my visuals accurately reflect the numbers?
• Formatting can be fun and pretty, but also distracting. Remember that the value of
the visualization is in the point it proves, not how advanced the chart appears.
5. Are my data memorable?
• Visualization doesn’t mean “use a chart.” Be sure to use visuals that are striking.
The more memorable the visualization, the more effective it will be at
communicating the idea.
Duarte, Nancy. "The Quick and Dirty on Data Visualization." Harvard Business Review. N.p.,
16 Apr. 2014. Web. 06 Dec. 2015.
McGhee, Geoff. "Getting Started With Data Visualization." Stanford University. The Bill Lane
Center for the American West, 5 May 2011. Web. 6 Dec. 2015.
Spiegel, Benjamin. "Analytics: A Beginner's Guide To Data Visualization." Marketing Land.
N.p., 18 Dec. 2013. Web. 06 Dec. 2015.