This document discusses big data and its growth. It notes that in 2000, 2 exabytes of new data were produced, while in 2011 1.8 zettabytes of new data were produced. By 2020, data production is expected to grow 40 times to 35 zettabytes. The traditional 3-4 V's of big data (volume, velocity, variety, veracity) are expanding to 5-7 V's with the addition of viscosity, virality, and value. Examples of big data use cases include sensor data from CERN and jet engines, social media data from Twitter, and transactional data from Walmart. Atos provides big data analytics solutions and has implemented projects for smart metering,
2. What is Big Data? Data Pioneers
10 April 2013
▶ In the year 2000 we produced 2 Exabytes of new data
▶ In the year 2011 we produced 1.8 Zettabytes of new data
▶ This is: 1,800,000,000,000,000,000,000 bytes
▶ In 2020: 40x more data, towards 35 Zettabytes
▶ This growth continues every year, eventually reaching Yottabyte(s) (1 Yottabyte = 10 to the power 24 bytes)
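The unit arithmetic behind these figures can be checked with a short sketch. It assumes decimal (SI) units, which the slide does not state explicitly, and the variable names are illustrative only.

```python
# Decimal (SI) byte units; assumed, since the slide does not say SI vs. binary.
EXABYTE = 10**18
ZETTABYTE = 10**21
YOTTABYTE = 10**24

# Figures quoted on the slide, kept as exact integers.
new_data_2000 = 2 * EXABYTE
new_data_2011 = 18 * ZETTABYTE // 10   # 1.8 Zettabytes
projected_2020 = 35 * ZETTABYTE


def growth_factor(later: int, earlier: int) -> float:
    """How many times larger `later` is than `earlier`."""
    return later / earlier


# 1.8 Zettabytes written out in bytes.
print(f"{new_data_2011:,} bytes")

# Growth in yearly new data between 2000 and 2011.
print("2000 -> 2011:", growth_factor(new_data_2011, new_data_2000))

# Baseline implied by "40x more data towards 35 Zettabytes", in Zettabytes.
implied_baseline_zb = 35 / 40
print("baseline implied by 40x:", implied_baseline_zb, "ZB")

# How far 35 Zettabytes still is from a single Yottabyte.
print("1 YB / 35 ZB:", round(growth_factor(YOTTABYTE, projected_2020), 1))
```

The 40x factor implies a baseline of about 0.875 Zettabytes, a figure the slide itself does not give.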
3. What is Big Data, the 3-4 traditional V’s
Source: Oracle
4. From the traditional 3-4 V’s towards the 5-7 V’s
Viscosity – Viscosity measures the resistance to flow in the volume of data. This resistance can come from different data sources, friction from integration flow rates, and processing required to turn the data into insight. Technologies to deal with viscosity include improved streaming, agile integration buses, and complex event processing.
Virality – Virality describes how quickly information gets dispersed across people to people (P2P) networks. Virality measures how quickly data is spread and shared to each unique node. Time is a determinant factor along with rate of spread.
Veracity – Trust & Quality
Value
5. Big Data & Internet of Things
Context is key for generating value
[Diagram. Data sources: Sensors / Actuators (NFC, GPS, accelerometer); Web Portal to get user actions; B2B Partner IS (Data Provider). Connection layer: M2M (Machine to Machine) Mediators; Subscriptions. Platform components: Aggregators, Context Broker, Big Data Aggregation Engine, Big Data Correlation Engine. On top of the platform: Applications.]
6. Struggles for Business
▶ Driver: Who is the driving force: IT, Business, Cost?
▶ Opportunities: Which opportunities can Big Data (Analytics) deliver, and how can Big Data make a difference?
▶ How to Start: Which Roadmap(s) should we follow?
▶ How to Integrate: How do we integrate Big Data (Strategy) within the current infrastructure architecture?
McKinsey calls Big Data “the next frontier for innovation, competition and productivity”
7. Use Cases
Large Hadron Collider: An example of sensor and machine data is found at the Large Hadron Collider at CERN, the
European Organization for Nuclear Research. CERN scientists can generate 40 terabytes of data every second during
experiments.
Boeing Jets: Boeing jet engines can produce 10 terabytes of operational information for every 30 minutes they run. A
four-engine jumbo jet can create 640 terabytes of data on just one Atlantic crossing; multiply that by the more than
25,000 flights flown each day, and you get an understanding of the impact that sensor and machine-produced data can
make on a BI environment.
Twitter: The micro blogging site Twitter serves more than 200 million users who produce more than 90 million "tweets"
per day, or 800 per second. Each of these posts is approximately 200 bytes in size. On an average day, this traffic equals
more than 12 gigabytes and, throughout the Twitter ecosystem, the company produces a total of eight terabytes of data
per day. In comparison, the New York Stock Exchange produces about one terabyte of data per day.
Wal-Mart: Transactional data has grown in velocity and volume at many companies. As recently as 2005, the largest data
warehouse in the world was estimated to be 100 terabytes in size. Today, Wal-Mart, the world's largest retailer, is logging
one million customer transactions per hour and feeding information into databases estimated at 2.5 petabytes in size.
Financial services: Discover fraud patterns based on multiple years' worth of credit card transactions and on a time scale that
does not allow new patterns to accumulate significant losses. Measure transaction processing latency across many
business processes by processing and correlating system log data.
Internet retailers: Discover fraud patterns in Internet retailing by mining web click logs. Assess risk by product type and
session Internet Protocol (IP) address activity.
Retailers: Perform sentiment analysis by analysing social media data.
Drug discovery: Perform large-scale text analytics on publicly available information sources.
Healthcare: Analyse medical insurance claims data for financial analysis, fraud detection, and preferred patient treatment
plans. Analyse patient electronic health records for evaluation of patient care regimes and drug safety.
Mobile telecom: Discover mobile phone churn patterns based on analysis of call detail records and correlation with
activity in subscribers' networks of callers.
IT technical support: Perform large-scale text analytics on help desk support data and publicly available support forums
to correlate system failures with known problems.
Scientific research: Analyse scientific data to extract features (e.g., identify celestial objects from telescope imagery).
Internet travel: Improve product ranking (e.g., of hotels) by analysis of multiple years' worth of web click logs.
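Several of the use cases above quote rates and totals that can be cross-checked with simple arithmetic. The sketch below does so for the Boeing, Twitter and Wal-Mart figures. It assumes decimal units, and all variable names are illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60
GB = 10**9  # decimal gigabyte, assumed

# Boeing: 10 TB per engine per 30 minutes, four engines, 640 TB per crossing.
tb_per_engine_hour = 10 * 2
tb_per_jet_hour = tb_per_engine_hour * 4
implied_flight_hours = 640 / tb_per_jet_hour

# Twitter: 90 million tweets per day at roughly 200 bytes each.
tweets_per_day = 90_000_000
tweets_per_second = tweets_per_day / SECONDS_PER_DAY
tweet_gb_per_day = tweets_per_day * 200 / GB

# Wal-Mart: one million customer transactions per hour.
transactions_per_day = 1_000_000 * 24

print("Flight hours implied by 640 TB:", implied_flight_hours)
print("Tweets per second:", round(tweets_per_second))
print("Tweet payload per day (GB):", tweet_gb_per_day)
print("Wal-Mart transactions per day:", transactions_per_day)
```

The Boeing figures are mutually consistent for an eight-hour crossing. For Twitter, 90 million tweets per day works out to roughly 1,040 per second and 18 gigabytes per day, so the quoted "800 per second" and "more than 12 gigabytes" are presumably rounded or taken from a different measurement period.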
11. Big Data & Internet of Things: Smart Metering at ERDF
▶ Atos is the first IT services company to manage such a large-scale implementation of smart meters in Europe
▶ Targeting 35 million meters being installed for French distribution system operator ERDF. The smart meter solutions developed by Atos help Smart Utilities to meet three goals: lower costs; improved delivery and more efficient services to home and business users; and a reduction of energy usage by regulating the network. At the beginning of March 2011, ERDF started operating the new IT platform for its Linky project.
12. Atos Olympic Games, London 2012 and vision for 2020 (Real Big Data)
13. Opportunity from CNES: Big Data for Control Systems
▶ Atos won a 25 M€ contract with French Space Agency
(CNES) for a “Product Line” to build Control Systems for
spacecraft
– First control system will be for a military satellite
▶ IP of some components will be shared between CNES
and Atos
▶ More interesting asset to share is infrastructure
▶ Key components : several data stores
– Distributed architecture
– Lightweight
– Very fast
– Based on manageable, understandable open source
components
• Security, maintainability, long term support, …
▶ Our innovative architecture has been a key element of
our selection
14. Red Spotted Hanky, Travel Web Site
redspottedhanky.com sells discounted train tickets on-line. Customers gather loyalty points for each ticket purchased which can be used to buy additional train tickets.
Business Issue
▶ Limited understanding of the dynamics of marketing response and external influences on web traffic and sales
▶ “Static” customer information
Use Cases
▶ How can RSH derive the best possible value from its marketing strategies, e.g.:
– Does a positive spike in social media sentiment translate into an increase in web sales?
– Does a local radio advertising campaign translate into increased web traffic and sales?
– What impact does weather have on web traffic?
Solution
▶ Cloud based Big Data platform integrating, storing and analysing unstructured and structured data
▶ Hadoop based solution integrating weather, Twitter feeds, ticketing sales, CRM and web traffic data into a single repository for trend identification and analysis
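A question such as whether a spike in social media sentiment translates into web sales can be approached, in its simplest form, as a lagged correlation between two daily series. The sketch below is not the Hadoop solution described on the slide. It is a minimal single-machine illustration, and the function names and all numbers are made up for the example.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


def lagged_correlation(driver, response, lag):
    """Correlate driver[t] with response[t + lag], for a lag of 0 or more days."""
    if lag < 0 or lag > len(driver) - 2:
        raise ValueError("lag must leave at least 2 overlapping points")
    if lag == 0:
        return pearson(driver, response)
    return pearson(driver[:-lag], response[lag:])


# Hypothetical daily values for illustration only; these are NOT RSH data.
daily_sentiment = [0.1, 0.2, 0.9, 0.3, 0.1, 0.2, 0.8, 0.2, 0.1, 0.1]
daily_web_sales = [100, 102, 101, 140, 105, 99, 103, 135, 104, 100]

for lag in range(3):
    r = lagged_correlation(daily_sentiment, daily_web_sales, lag)
    print(f"lag {lag} day(s): r = {r:.2f}")
```

In this made-up series sales peak one day after sentiment does, so the one-day lag shows the strongest correlation. A high correlation alone does not establish that sentiment drives sales; weather or advertising campaigns could move both.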
15. MyCity – Real Time Traffic Forecast
[Diagram. Inputs: traffic sensors of the City of Berlin (1 to 1200), supplying real-time sensor data; other data sources (e.g. crowd sourced and open data), supplying additional data. Servers: Traffic data server; Traffic forecast server (Data, Services & Analytics, 4 hours forecast service); Traffic web server. Outputs (real-time data and forecasted data): CityCockpit for RTTF; vehicle’s on-board unit; smartphone app.]
16. New opportunities: Customer profiling
Personal Based Economy / Personal Data Economy
▶ The customer's latest web clicks / click-stream analysis
– Show the right advertisements
– Flexible prices / offers
– Loyalty programme
▶ Customer “usage patterns” of your services
– Frequently called phone numbers: special offers
▶ Customer location
– Location based services
▶ Genetic / DNA patterns of your customer / patient
– Prescribe the most effective medication based on the best-performing statistical analysis
– Preventive medicine
▶ Policy / claims profile of customers
– Fraud detection / investigation / management
– Proactively offer an insurance package
▶ Customers have a Twitter stream, Facebook page
– Detect hobbies and interests
– Detect important events (birth, moving house, etc.)
– Quantified Self by Numbers
...and don't forget that you can combine all of this again with Big Data!
17. Keeping track of the Customer journey
From Traditional (single path, predictable process):
View TV or print ad → Go to Store → Compare Options → Choose Best Option → Buy Item
To Connected (multi channel, multi path, complex unpredictable process):
[Diagram. Touchpoints: View print ad; View banner ad; Read Blog; Read reviews; Online shopping; Watch tutorial; Watch on Youtube; Like on Facebook; Demo in store; Search; Smartphone app; Compare prices; Buy Item. Levels of Customer Interest: Awareness, Interest, Evaluation, Commitment, Loyalty.]
18. Now Banking (Atos Smart Mobility + Big Data)
Interacting with consumers and guiding them in their day-to-day lives
Through the day, from Morning to Evening:
Home: Personal FM; Casualty Man; Savings; Mortgages; Investments; Financial goals
Travel: Car insurance; Liability; Work away*
Work: Income; Life insurance
Hospital: Health insurance
Shopping: Micro credit; Credit/Debit cards; Personal loans
Culture: Sustainable banking; Sponsoring
Travel: Car insurance; Liability; Work away
Home: Personal FM; Casualty Man; Savings; Mortgages; Investments; Financial goals
*Atos proposition for banks that facilitates working anywhere, anytime
19. Risks to be aware of (several, and quite diverse)
▶ Policies: security, privacy, intellectual property, liability ..
▶ The risks of misuse: “Lies, Damned Lies and Statistics”
▶ Emergent, immature technologies
▶ Mixing “old” tech with the new platforms
▶ Data Garbage: “Digital Diogenes”
▶ Access to data can be problematic
▶ Data ownership issues
▶ Transparency is hard to achieve
▶ Scarcity of talent in a complex field (“Data Scientists”)
20. More info
▶ See Factsheet and whitepaper Open source Solutions For Big Data Management:
http://nl.Atos.net/BigData
21. Contact?
» Name: Roland Haeve
» Role: Global Director Big Data;
Information Management &
Analytics
» Mail: Roland.Haeve@atos.net
» Tel: 06-22465013
» @Rhaeve