2. For more information get in touch on 020 7112 4949 or visit sonovate.com
Big data is changing
the world dramatically
right before our eyes –
from the amount of
data being produced to
the way in which it’s
structured and used.
This QuickView provides an overview of
what big data is, how big it is, how it’s
mined, refined and used, what the key
roles are and the popular big data
techniques used by suppliers.
Contents
Explosive Growth
What is Big Data?
Management of Big Data
How Big is Big Data?
Big Data in Numbers
5 Predictions For The $125 Billion Big Data
Analytics Market in 2015
5 Key Big Data Positions Used in a Big Data Flow
Top 10 Related IT Skills
Top 10 Industries Hiring Big Data Professionals
Top 10 Qualifications Sought by Hirers
Top 10 Database and BI skills Sought by Hirers
Key Big Data Terms Demystified
Popular Big Data Techniques and Vendors
“The most valuable commodity I know of is information.”
- Gordon Gekko
Explosive Growth
The explosion of the data industry,
which has been likened to oil in the 18th
century (an immensely valuable,
untapped asset), is fueling extraordinary
demand for professionals skilled in
“big data”. Estimates suggest that
between 2012 and 2017, use of big data could
contribute £216 billion to the UK
economy via business creation,
efficiency and innovation, and generate
58,000 new jobs. Big data is big business.
According to a report conducted by
leading business analytics software and
services company SAS, over the past five
years big data job growth has risen at an
annual rate of 212%. This presents both
challenges and opportunities for
businesses. A study by the Royal
Academy of Engineering shows that
British industry will need 1.25 million
new graduates in science, technology,
engineering and maths subjects
between now and 2020 to maintain
current employment numbers in an
ever-evolving market.
With no guarantee that universities will
produce the number of graduates
needed to meet the demand, or that
companies will invest in training and
development for existing staff,
companies who are seeking to
implement a big data strategy will need
to pursue a defined hiring strategy.
By working with hiring specialists to tap
into the existing talent pool and extract
the hard-to-find candidates to meet their
objectives, businesses will benefit
significantly.
According to Accenture, one of the world’s biggest parcel
companies is also among the world’s largest big data
users, spending $1bn annually to store and study 16
petabytes of data from every conceivable point of its
business. The scale of this figure underlines how
valuable big data is (in the right hands), how important it
is for businesses to acquire, interpret and use the right
data, and just how exciting the market potential is.
Source: SAS
1.2 Million New Grads Needed
212% Job Growth
58,000 New Jobs
What is Big Data?
Buzzword? Catchphrase?
Technology? In the last decade
a lot has been said about big
data, with hundreds of definitions,
such as:
“Big data is the derivation of value from
traditional relational database-driven business
decision making, augmented with new sources
of unstructured data” - Oracle
“Datasets whose size is beyond the ability of
typical database software tools to capture,
store, manage, and analyze” - McKinsey
“Big data is the data characterized by 3
attributes: volume, variety and velocity” - IBM
David Wellman’s succinct offering captures
the essence of what big data is really about:
“Big Data is not about the size of the data, it’s
about the value within the data.” Taking this
point back to Accenture’s research on the
$1bn parcel company, the value of big
data lies in the specific purpose and intent
it’s used for and ultimately its impact on the
bottom line. To paraphrase William Bruce
Cameron, “not everything that can be counted
counts”.
Technological Factors
Driving the Growth of Big
Data
New sources of data are being created
through:
• Digitisation of existing processes and
services, for example online banking, email
and medical records
• Automatic generation of data, such as web
server logs that record web page requests
• Reduction in the cost and size of sensors
found in aeroplanes, buildings and the
environment
• Production of new gadgets that collect and
transmit data, for example GPS location
information from mobile phones and
capacity updates from ‘smart’ waste bins
Enhanced Computing
Capabilities Driving Big
Data Include:
• Improved data storage at higher densities,
for lower cost
• Greater computing power for faster and
more complex calculations
• Cloud computing (remote access to shared
computing resources via a device connected
to a network), facilitating cheaper access to
data storage, computation, software and
other services
• Recent advances in statistical and
computational techniques, which can be
used to analyse and extract meaning from
big data
• Development of new tools such as Apache
Hadoop (which enables large data sets to be
processed across clusters of computers) and
extension of existing software, such as
Microsoft Excel.
Mining
Big data can be acquired from a vast,
and increasing, number of sources.
These include images, sound recordings,
user click streams that measure internet
activity, and data generated by computer
simulations (such as those used in
weather forecasting). Key to managing
data collection is metadata, which is
data about data. An email, for example,
automatically generates metadata
containing the addresses of the sender
and recipient, and the date and time it
was sent, to aid the manipulation and
storage of email archives. Producing
metadata for big data sets can be
challenging, and may not capture all the
nuances of the data.
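As a minimal sketch of the email example above, Python's standard email library can read the metadata carried in a message's headers. The message text, addresses and the helper name extract_metadata are invented for illustration.

```python
from email import message_from_string
from email.utils import parsedate_to_datetime

# A made-up message; the addresses and date are illustrative only.
RAW_EMAIL = """\
From: alice@example.com
To: bob@example.com
Date: Tue, 03 Mar 2015 09:30:00 +0000
Subject: Quarterly figures

Hi Bob, the figures are attached.
"""


def extract_metadata(raw: str) -> dict:
    """Return the sender, recipient and sent time recorded in the headers."""
    msg = message_from_string(raw)
    return {
        "sender": msg["From"],
        "recipient": msg["To"],
        "sent_at": parsedate_to_datetime(msg["Date"]),
    }


metadata = extract_metadata(RAW_EMAIL)
print(metadata["sender"], metadata["recipient"], metadata["sent_at"].isoformat())
```

Note that none of this touches the body of the message: the metadata alone is enough to sort, filter and archive mail.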
Refining
Data may undergo numerous processes
to improve quality and usability before
analysis, including:
Extraction – pulling out required
information from the initial data and
expressing it in a structured form
Cleansing – detecting and then
correcting or removing corrupt or
inaccurate records
Standardization – formatting data to
aid interoperability
Linkage – connecting records from
different sources.
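The refining steps above can be sketched as a small pipeline. This is an illustration only: the sample records, field names, the email check and the assumption that slash-separated dates are DD/MM/YYYY are all hypothetical.

```python
import re

# Hypothetical raw inputs: free-text sign-up lines and a separate orders source.
RAW_SIGNUPS = [
    "name=Ann Lee; email=ANN.LEE@Example.com ; joined=03/03/2015",
    "name=Raj Patel; email=raj@example.com; joined=2015-02-14",
    "name=??; email=not-an-email; joined=",
]
ORDERS = [
    {"email": "ann.lee@example.com", "total": 120.0},
    {"email": "raj@example.com", "total": 75.5},
]


def extract(line):
    """Extraction: pull key=value pairs into a structured record."""
    parts = (part.split("=", 1) for part in line.split(";"))
    return {key.strip(): value.strip() for key, value in parts}


def is_clean(record):
    """Cleansing: keep only records with a plausible email address."""
    return re.fullmatch(r"[^@\s]+@[^@\s]+\.[^@\s]+", record["email"]) is not None


def standardise(record):
    """Standardisation: lower-case emails and ISO 8601 dates."""
    record = dict(record, email=record["email"].lower())
    match = re.fullmatch(r"(\d{2})/(\d{2})/(\d{4})", record["joined"])
    if match:  # assumption: slash-separated dates are DD/MM/YYYY
        day, month, year = match.groups()
        record["joined"] = f"{year}-{month}-{day}"
    return record


def link(records, orders):
    """Linkage: join the two sources on the email address."""
    totals = {order["email"]: order["total"] for order in orders}
    return [dict(r, order_total=totals.get(r["email"])) for r in records]


extracted = [extract(line) for line in RAW_SIGNUPS]
cleaned = [record for record in extracted if is_clean(record)]
standardised = [standardise(record) for record in cleaned]
linked = link(standardised, ORDERS)
print(linked)
```

Standardising before linking matters here: the two sources only match once the email addresses share the same case.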
Management of Big Data
Use
Analytics are used to gain insight from
data. They typically involve applying an
algorithm (a sequence of calculations) to
data to find patterns, which can then be
used to make predictions or forecasts. Big
data analytics encompass various
inter-related techniques, including the
following examples.
Data mining - identifies patterns by sifting
through data. It can be applied to user click
streams to understand how customers use
web pages to inform web page design.
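A minimal sketch of mining click streams, assuming each visit is recorded as an ordered list of page names (the sessions below are invented). It counts how often visitors move directly from one page to another, which is one simple pattern a web designer might look for.

```python
from collections import Counter

# Hypothetical click streams: the ordered pages each visitor viewed.
SESSIONS = [
    ["home", "pricing", "signup"],
    ["home", "blog", "pricing", "signup"],
    ["home", "pricing", "exit"],
    ["blog", "pricing", "signup"],
]


def count_transitions(sessions):
    """Count how often visitors move from one page directly to the next."""
    transitions = Counter()
    for pages in sessions:
        # zip pairs each page with the page viewed immediately after it
        transitions.update(zip(pages, pages[1:]))
    return transitions


transitions = count_transitions(SESSIONS)
most_common_path, visits = transitions.most_common(1)[0]
print(most_common_path, visits)
```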
Machine learning - describes systems that
learn from data. For example, a system
that compares documents in two different
languages can infer translation rules;
human correction of any errors in the rules
can result in the system learning how to
improve the software.
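The translation example can be illustrated with a deliberately crude sketch: it guesses word translations by counting which words appear together in paired sentences, then lets a human override a rule. The sentence pairs and function names are hypothetical, and real translation systems are far more sophisticated.

```python
from collections import Counter, defaultdict

# Hypothetical paired sentences in two languages.
PARALLEL = [
    ("the cat", "le chat"),
    ("the dog", "le chien"),
    ("a cat", "un chat"),
    ("a dog", "un chien"),
]


def learn_rules(pairs):
    """Infer a word-for-word rule from how often words co-occur."""
    cooccurrence = defaultdict(Counter)
    for source, target in pairs:
        for word in source.split():
            cooccurrence[word].update(target.split())
    # For each source word, pick the target word seen alongside it most often.
    return {word: counts.most_common(1)[0][0]
            for word, counts in cooccurrence.items()}


def apply_corrections(rules, corrections):
    """Human feedback: corrected rules replace the learned ones."""
    return {**rules, **corrections}


rules = learn_rules(PARALLEL)
print(rules)
```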
Simulation - can be used to model the
behaviour of complex systems. For
example, building a trading simulation can
help to assess the effectiveness of
measures to reduce insider trading.
How Big is Big Data?
Research group IDC predicts the digital universe
will reach 40 zettabytes in size – that’s 40 trillion
gigabytes – by 2020. That’s a 50-fold growth in just
one decade. There are now almost as many bits of
data as there are known stars in the universe.
2013: 4.4 zettabytes, 2020: 44 zettabytes.
Source: Oracle 2012
What is a Zettabyte?
1 Zettabyte equals:
1,000 Exabytes
1,000,000 Petabytes
1,000,000,000 Terabytes
1,000,000,000,000 Gigabytes
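For readers who want to check the scale themselves, here is a minimal sketch of converting between storage units. It assumes decimal (SI) units, where each step up is a factor of 1,000; the function name convert is illustrative.

```python
# Decimal (SI) storage units, expressed in bytes.
UNITS = {
    "gigabyte": 10**9,
    "terabyte": 10**12,
    "petabyte": 10**15,
    "exabyte": 10**18,
    "zettabyte": 10**21,
}


def convert(amount, from_unit, to_unit):
    """Convert a whole number of one unit into another, in integer arithmetic."""
    return amount * UNITS[from_unit] // UNITS[to_unit]


gigabytes_in_40_zb = convert(40, "zettabyte", "gigabyte")
print(f"{gigabytes_in_40_zb:,} gigabytes")
```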
[Chart: Data in zettabytes (ZB), 2008 to 2020]
Data is growing at a 40% compound annual rate, reaching nearly 45 ZB by 2020.
1 terabyte holds the equivalent of roughly 210 single-sided DVDs.
In 2007, the estimated information content of all human knowledge was 295 exabytes.
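The growth figures quoted in this section (4.4 ZB in 2013, 44 ZB in 2020, a 40% compound annual rate) can be sanity-checked with a short compound-growth calculation. This is a rough sketch; the function names are illustrative.

```python
def project(start_value, annual_rate, years):
    """Compound a starting value forward at a fixed annual growth rate."""
    return start_value * (1 + annual_rate) ** years


def implied_rate(start_value, end_value, years):
    """Annual rate that would turn start_value into end_value over the period."""
    return (end_value / start_value) ** (1 / years) - 1


# Year-by-year projection from 4.4 ZB in 2013 at 40% a year.
for year in range(2013, 2021):
    print(year, round(project(4.4, 0.40, year - 2013), 1), "ZB")

size_2020 = project(4.4, 0.40, 2020 - 2013)
rate_2013_to_2020 = implied_rate(4.4, 44, 2020 - 2013)
print(f"Implied annual growth rate: {rate_2013_to_2020:.1%}")
```

At exactly 40% a year the 2020 projection lands in the mid-40s of zettabytes, and tenfold growth over seven years works out at a little under 40% a year, so the quoted figures are broadly consistent with each other.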
Big Data in Numbers
64,000 UK organisations with 100 or more staff will have implemented big data analytics by 2020
By 2020 over 1/3 of all data will live in or pass through the cloud
346,000 big data job opportunities created in the UK economy by 2020
The digital universe will grow from 3.2 zettabytes today to 40 zettabytes in only 6 years
Individuals create 70% of all data
Enterprises store 80% of all data
Data production will be 44 times greater in 2020 than it was in 2009
The UK is forecasting a 222% increase in big data jobs by 2017
41% rise in “big data” jobs throughout the UK, 2011 to 2013
Source: IDC/SAS
5 Predictions For The
$125 Billion Big Data
Analytics Market in
2015
The big data and analytics market
will reach $125 billion worldwide in
2015, according to IDC and The
International Institute of Analytics
(IIA).
Here are their top five predictions
for 2015:
1. Over the next five years spending on
cloud-based big data and analytics (BDA)
solutions will grow three times faster than
spending for on-premise solutions.
2. Shortage of skilled staff will persist. In the
US alone there will be 181,000 deep
analytics roles in 2018 and five times that
many positions requiring related skills in
data management and interpretation. In the
UK, there will be 47,000 big data roles by
2017, a 222% increase on 2013.
3. Growth in applications incorporating
advanced and predictive analytics, including
machine learning, will accelerate in 2015.
These apps will grow 65% faster than apps
without predictive functionality.
4. 70% of large organisations already
purchase external data and 100% will do so
by 2019. In parallel more organisations will
begin to monetise their data by selling them
or providing value-added content.
5. Rich media (video, audio, image) analytics
will at least triple in 2015 and emerge as the
key driver for BDA technology investment.
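The staffing figures in prediction 2 carry some implicit arithmetic. The sketch below works it through, assuming "a 222% increase" means the 2017 figure is the 2013 figure plus 222% of it; the derived numbers are not stated in the source.

```python
def baseline_from_increase(final_value, percent_increase):
    """Work back to the starting value given the final value and % increase."""
    return final_value / (1 + percent_increase / 100)


# UK: 47,000 roles by 2017, described as a 222% increase on 2013.
uk_roles_2013 = baseline_from_increase(47_000, 222)

# US: 181,000 deep analytics roles, and five times that many related roles.
us_related_roles = 181_000 * 5

print(round(uk_roles_2013), us_related_roles)
```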
Hiring Big Data Specialists: The Key
Roles
Data Analyst
Big Data Developer
Data Modeler
Big Data Architect
Business Data Analyst
Data Scientist
SAS Data Analyst
SAP Data Analyst
SQL Data Analyst
Data Warehousing (DWH) Developer
Data Centre Architect
Master Data Analyst
Data Governance Manager
Big Data Consultant
Data Warehousing (DWH) Analyst
Data Integration Developer
Data Migration Analyst
Master Data Consultant
Data Migration Manager
Data Business Analyst
Oracle Data Warehousing (DWH) Developer
Market Data Engineer
Data Migration Project Manager
Data Centre Project Manager
Data Centre Consultant
Data Protection Manager
SAS Data Integration (DI) Studio Developer
Big Data top 10s
For the six months to 3
March 2015, IT jobs
within the UK citing big
data also mentioned the
following IT skills in
order of popularity. The
figures indicate the
number of jobs and
their proportion against
the total number of IT
job ads sampled that
cited big data.
Top 10 Industries
1 Finance
2 Marketing
3 Banking
4 Retail
5 Telecoms
6 Advertising
7 Games
8 Pharmaceutical
9 Investment Banking
10 Legal
Top 10 Qualifications
1 Degree
2 PhD
3 Security Cleared
4 VCP4
5 SQL
6 MBA
7 Microsoft Certification
8 DV Cleared
9 ISEB
10 PMI Certification
Database & Business
Intelligence
1 Hadoop
2 NoSQL
3 SQL Server
4 Data Warehouse
5 MongoDB
6 Apache Hive
7 MySQL
8 Data Mining
9 Apache Cassandra
10 SQL Server Integration Services
Top 10 Related IT Skills
1 Java
2 Hadoop
3 Agile Software Development
4 Analytics
5 SQL
6 Business Intelligence
7 Finance
8 NoSQL
9 Python
10 SQL Server
Key Big Data Terms
Demystified
Hadoop
Hadoop is a complex software ecosystem central
to a broad range of state-of-the-art big data
technologies.
Companies that work with data at super-massive
scale inevitably need expert engineers who can
work nimbly within the Hadoop framework.
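Hadoop's best-known processing model is MapReduce, which splits a job into a map step and a reduce step so the work can be spread across a cluster. The sketch below imitates that model on a single machine in plain Python; it does not use Hadoop's actual APIs, and the sample documents are invented.

```python
from collections import defaultdict
from itertools import chain

# Hypothetical input; on a cluster each document could live on a different node.
DOCUMENTS = ["big data is big", "data is valuable"]


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    return [(word, 1) for word in document.split()]


def shuffle(pairs):
    """Shuffle: group the emitted values by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Reduce: combine the values collected for each key."""
    return {key: sum(values) for key, values in grouped.items()}


mapped = chain.from_iterable(map_phase(doc) for doc in DOCUMENTS)
word_counts = reduce_phase(shuffle(mapped))
print(word_counts)
```

Because each document is mapped independently, the map step is the part that can run on many machines at once.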
NoSQL
NoSQL (commonly referred to as "Not Only SQL")
represents a completely different framework of
databases that allows for high-performance,
agile processing of information at massive scale.
In other words, it is a database infrastructure
that has been very well-adapted to the heavy
demands of big data.
MongoDB
MongoDB is a leading NoSQL database that is
very popular among companies with big data
initiatives. Demand for talented engineers with
MongoDB familiarity is very high.
Cassandra
Cassandra is a popular NoSQL technology stack
that was originally developed at Facebook, and is
now deployed at a large number of companies
with big data initiatives.
Business Intelligence (BI)
BI is a critical capability in any data-driven
organisation, responsible for making data visible
and actionable for smarter decision-making. BI
teams accomplish this by developing tools that
make data easy to digest – i.e. data reporting,
visualisation, and query platforms such as
dashboards and OLAP tools. BI
developers/analysts require sharp technical skills
and comfort working with large database
systems.
Database Administrators (DBA)
DBAs are vital engineers at any company with
data infrastructure. The role of DBA has actually
become more complex over the years. In the
past, data may have been adequately managed
on a single server. But the big data infrastructure
of today is often comprised of a medley of
intricate, interconnected data platforms,
potentially involving large clusters of massively
parallel processing servers. Demand for this type
of DBA talent is very high.
Key Big Data Positions Used
in a Big Data Flow
Data Hygienists make sure that data coming
into the system is clean and accurate, and stays
that way over the entire data lifecycle.
Data Explorers sift through mountains of data
to discover the data you actually need.
Business Solution Architects put the
discovered data together and organise it so that
it's ready to analyse.
Data Scientists take this organised data and
create sophisticated analytics models that, for
example, help predict customer behavior and
allow advanced customer segmentation and
pricing optimization.
Campaign Experts turn the models into results.
They have a thorough knowledge of the technical
systems that deliver specific marketing
campaigns, such as which customer should get
what message when.
Popular Big Data Techniques and Vendors
Business Intelligence (BI)/Online Analytical Processing
(OLAP):
Users interactively analyse multidimensional data
Users can roll up, drill down, and slice data
BI tools provide dashboard and report capabilities
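A minimal sketch of the three OLAP operations named above, using plain Python rather than any BI product. The sales records, dimension names and helper functions are hypothetical.

```python
from collections import defaultdict

# Hypothetical sales facts with three dimensions: region, product, quarter.
SALES = [
    {"region": "North", "product": "A", "quarter": "Q1", "revenue": 100},
    {"region": "North", "product": "B", "quarter": "Q1", "revenue": 150},
    {"region": "South", "product": "A", "quarter": "Q1", "revenue": 80},
    {"region": "South", "product": "A", "quarter": "Q2", "revenue": 120},
]


def aggregate(facts, *dimensions):
    """Total revenue for each combination of the chosen dimensions."""
    totals = defaultdict(int)
    for fact in facts:
        totals[tuple(fact[d] for d in dimensions)] += fact["revenue"]
    return dict(totals)


def slice_by(facts, dimension, value):
    """Keep only the facts where one dimension has a fixed value."""
    return [fact for fact in facts if fact[dimension] == value]


by_region = aggregate(SALES, "region")                       # roll-up
by_region_product = aggregate(SALES, "region", "product")    # drill-down
q1_by_region = aggregate(slice_by(SALES, "quarter", "Q1"), "region")  # slice
print(by_region, by_region_product, q1_by_region)
```

Rolling up and drilling down are the same aggregation at coarser or finer levels of detail; slicing fixes one dimension before aggregating.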
Cluster Analysis:
Segment objects (e.g., users) into groups based on similar
properties or attributes
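One common way to do this is k-means clustering. The sketch below is a simplified one-dimensional version in plain Python; the spend figures and starting centroids are made up, and production work would normally use a dedicated library.

```python
def k_means_1d(values, centroids, iterations=10):
    """Group numbers around the nearest centroid, then re-centre and repeat."""
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for value in values:
            nearest = min(range(len(centroids)),
                          key=lambda i: abs(value - centroids[i]))
            clusters[nearest].append(value)
        # Move each centroid to the mean of its cluster (keep it if empty).
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters


# Hypothetical monthly spend per user: a low-spend and a high-spend group.
SPEND = [12, 15, 14, 90, 95, 88]
centroids, clusters = k_means_1d(SPEND, centroids=[0.0, 100.0])
print(centroids, clusters)
```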
Data Mining:
Process to discover and extract new patterns in large data sets
Predictive Modeling:
A model is created to best predict the probability of an outcome
SQL:
A computer language that manages (e.g., query, insert, delete,
extract) data from a relational database
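As a brief illustration of the operations listed (insert, delete, query), the sketch below runs SQL against an in-memory SQLite database using Python's standard sqlite3 module. The table and its rows are invented.

```python
import sqlite3

# An in-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, city TEXT)"
)

# Insert: add rows to the table.
conn.executemany(
    "INSERT INTO customers (name, city) VALUES (?, ?)",
    [("Ann", "London"), ("Raj", "Leeds"), ("Mia", "London")],
)

# Delete: remove a row that matches a condition.
conn.execute("DELETE FROM customers WHERE name = ?", ("Raj",))

# Query: extract the rows that match a condition.
london = conn.execute(
    "SELECT name FROM customers WHERE city = ? ORDER BY name", ("London",)
).fetchall()
remaining = conn.execute("SELECT COUNT(*) FROM customers").fetchone()[0]
conn.close()

print(london, remaining)
```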
Crowdsourcing:
A process for collecting data from a large community or
distributed group of people
Idea submission is a common crowdsourcing activity
Textual Analysis:
Computer algorithms that analyse natural language
Topics can be extracted from text along with their linkages
Sentiment Analysis:
A form of textual analysis that determines a positive, negative, or
neutral reaction
Often used in marketing brand campaigns
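The simplest form of sentiment analysis counts words from positive and negative word lists. The sketch below shows that approach; the word lists are tiny and invented, and commercial tools use much richer language models.

```python
# Hypothetical word lists; real lexicons contain thousands of entries.
POSITIVE = {"love", "great", "excellent", "happy"}
NEGATIVE = {"hate", "poor", "terrible", "slow"}


def sentiment(text):
    """Classify text as positive, negative or neutral by counting words."""
    words = [word.strip(".,!?").lower() for word in text.split()]
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


for comment in ["I love this brand, great service!",
                "Terrible and slow delivery.",
                "The parcel arrived on Tuesday."]:
    print(comment, "->", sentiment(comment))
```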
Network analysis:
A methodology to analyse the relationship among nodes (e.g.,
people)
On social media platforms, it can be used to create the social
graph of follower and friends’ connections among users
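A minimal sketch of a follower graph, assuming the data arrives as (follower, followed) pairs. The user names are invented, and the two questions asked of the graph are just examples.

```python
from collections import defaultdict

# Hypothetical "follows" relationships: (follower, followed).
FOLLOWS = [
    ("ann", "raj"),
    ("mia", "raj"),
    ("raj", "ann"),
    ("mia", "ann"),
    ("zoe", "raj"),
]


def build_graph(edges):
    """Index the edges both ways: who follows a user, and whom a user follows."""
    followers = defaultdict(set)
    following = defaultdict(set)
    for follower, followed in edges:
        followers[followed].add(follower)
        following[follower].add(followed)
    return followers, following


followers, following = build_graph(FOLLOWS)

# Which user has the most followers?
most_followed = max(followers, key=lambda user: len(followers[user]))

# Which of ann's connections follow her back?
mutual = sorted(user for user in following["ann"] if "ann" in following[user])
print(most_followed, mutual)
```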
[Table: Technique and Vendor, grouped by Transactional Data, Non-transactional Data and Social Data]
Reference Shelf:
Cebr: "Data Equity: unlocking the value of big data", 2012
Hadoop Summit 2014
IDC: "Big Data and Analytics and Enterprise Applications Will Continue to Drive
Software Market Growth Until 2018"
Computer Weekly: "IT Department for Big Data Projects"
McKinsey: "Big data: The next frontier for innovation, competition, and
productivity"
David Wellman: "What is Big Data"
IDC: "Worldwide Big Data and Analytics Predictions for 2015"
Accenture: "Big Success with Big Data" survey
Onrec: "How is the online recruitment IT sector faring"
ITJobs Watch: "Big Data skills in IT jobs"
POSTnote: "Big Data: An Overview"
The International Institute of Analytics: "Analytics predictions for 2015"
Data Jobs: "Key Big Data Terms Demystified"
HBR: "Five Roles You Need on Your Big Data Team"
CIO Insight: "Digital Universe Expands at an Alarming Rate"
The Telegraph: "Big data skills will lead to big IT jobs"
MIT: "The Big Data Conundrum: How to Define It?"