1. © 2014 John Sing – All Rights Reserved
Big Data: the Big Picture
For your 2014+ Business and Career
Opening video
John Sing, Executive IT Consultant
http://johnsing.us
2. © 2014 John Sing – All Rights Reserved
University of South Florida - Spring 2014
John Sing 32 years of experience in enterprise servers, storage, and software
– 2009 – 2014: IBM Executive IT Consultant: IT Strategy and Planning, Enterprise
Large Scale Storage, Internet Scale Workloads and Data Center Design, Big Data
Analytics, HA/DR/BC
– 2002-2008: IBM IT Data Center Strategy, Large Scale Systems, Business
Continuity, HA/DR/BC, IBM Storage
– 1998-2001: IBM Storage Subsystems Group - Enterprise Storage Server Marketing
Manager, Planner for ESS Copy Services (FlashCopy, PPRC, XRC, Metro Mirror,
Global Mirror)
– 1994-1998: IBM Hong Kong, IBM China Marketing Specialist for High-End Storage
– 1989-1994: IBM USA Systems Center Specialist for High-End S/390 processors
– 1982-1989: IBM USA Marketing Specialist for S/370, S/390 customers (including
VSE and VSE/ESA)
john@johnsing.us
http://johnsing.us
Follow my daily IT research blog
– http://www.delicious.com/atsf_arizona
Follow me on Slideshare.net:
– http://www.slideshare.net/johnsing1
LinkedIn:
– http://www.linkedin.com/in/johnsing
3. © 2014 John Sing – All Rights Reserved
Data, the new natural Resource
Big Data in context:
Cloud, Analytics, Mobile, Social
Innovating using Big Data:
Monetizing, innovating, creating competitive
advantage out of Big Data
Agenda
4. © 2014 John Sing – All Rights Reserved
1. Data + Analytics = Information
2. Information + Context = Insight
3. Insight + Actions = Desired
Outcomes
Today’s message: The Big Data Journey to Value
5. © 2014 John Sing – All Rights Reserved
Data, the new natural resource
6. © 2014 John Sing – All Rights Reserved
What we see in the world today……
[Chart: Computing Power Growth over Time. Labels: Available data for observation; Traditional IT “sensemaking” capability]
Chart by: Jeff Jonas/Las Vegas/IBM, Chief Scientist, IBM SWG Entity Analytics http://jeffjonas.typepad.com/
7. © 2014 John Sing – All Rights Reserved
What we see in the world today ………..
[Chart: Computing Power Growth over Time. Labels: Available data for observation; Traditional IT “sensemaking” capability; Context; Enterprise Amnesia]
Chart by: Jeff Jonas/Las Vegas/IBM, Chief Scientist, IBM SWG Entity Analytics http://jeffjonas.typepad.com/
8. © 2014 John Sing – All Rights Reserved
Enterprise Amnesia, definition
A defect in memory, resulting in missed
opportunity, wasted resources, lower
revenues, unnecessary fraud losses, and
other bad news.
9. © 2014 John Sing – All Rights Reserved
WHY?
Because traditional IT methods could not keep pace
[Chart: Computing Power Growth over Time. Labels: Available Observation Space; Traditional IT “sensemaking” capabilities]
Chart by: Jeff Jonas/Las Vegas/IBM, Chief Scientist, IBM SWG Entity Analytics http://jeffjonas.typepad.com/
10. © 2014 John Sing – All Rights Reserved
This is the Big Data Opportunity
[Chart: Computing Power Growth over Time. Labels: Available Observation Space; Context; Add: Big Data Sensemaking Algorithms; Big Data capability; New/Useful Information; Data; Analytics]
11. © 2014 John Sing – All Rights Reserved
Think of the Gold Mine analogy – in the “Olden
Days”
Miners could actually see nuggets / veins
of gold
There was much more gold
out there….
– but it wasn’t visible to naked
eye…
It was a big gambling game
– You dig like crazy, but you’ve no
idea where more gold will be
found
12. © 2014 John Sing – All Rights Reserved
In the “olden days”, no one could afford to dig
everywhere
Where gold is mined on Earth (as of 2006)
Despite gold rush fevers, no one could afford to mobilize millions of
people to dig everywhere
13. © 2014 John Sing – All Rights Reserved
Gold mining in 2012:
Massive capital equipment
Millions of tons of dirt
Ore of 30 mg/kg (30 ppm)
– Needed to even see the gold
By using the right equipment
On a massive scale
We can process lots of dirt affordably
and keep the gold we find
That’s like Big Data!
14. © 2014 John Sing – All Rights Reserved
Processor power: Google, Yahoo, Facebook surpassed the
Supercomputer community in compute power and scale…… in 2008
Google in 2012:
– 200+ petaflops
– Processes 1 TB / hour
– 2003: Batch
– 2005: Warehouse
– 2011: Instant
– Dumped MapReduce
– Wrote replacement real-time indexing
(“Percolator”)
Facebook in 20 Minutes in 2012
– 30 PB cluster of storage
– 2.7M Photos, 10.2M Comments, 4.6M
Messages
– Facebook's New Realtime Analytics
System: Hadoop HBase To Process 20
Billion Events Per Day
May 21, 2008: http://www.circleid.com/posts/85218_google_surpasses_supercomputer/
http://highscalability.com real life internet architectures
http://highscalability.com/display/Search?searchQuery=facebook&moduleId=4876569
15. © 2014 John Sing – All Rights Reserved
Hmmmm. What might we find in all this data? And How?
Cisco estimate: by 2015, annual data center traffic flowing through the Internet will reach 4.8 zettabytes;
only 5% will be traditional OLTP database traffic
Data in existence today =
1,000 exabytes = 1 million
petabytes
http://venturebeat.com/2011/11/29/cisco-global-cloud-traffic/
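The unit equivalence on this slide can be checked with a few lines of arithmetic. This is a minimal sketch that assumes decimal (SI) prefixes, where each step is a factor of 1,000; the class and method names are illustrative only.

```java
// Sketch only: assumes decimal (SI) units, i.e. 1 EB = 1,000 PB and 1 ZB = 1,000 EB.
final class DataUnits {
    static final long PB_PER_EB = 1_000L;
    static final long EB_PER_ZB = 1_000L;

    /** Converts a count of exabytes to petabytes. */
    static long exabytesToPetabytes(long exabytes) {
        return Math.multiplyExact(exabytes, PB_PER_EB);
    }

    /** Converts a count of exabytes to zettabytes. */
    static double exabytesToZettabytes(long exabytes) {
        return (double) exabytes / EB_PER_ZB;
    }

    public static void main(String[] args) {
        // The slide's figure: 1,000 exabytes of data in existence.
        System.out.println(exabytesToPetabytes(1_000L) + " PB");  // 1000000 PB
        System.out.println(exabytesToZettabytes(1_000L) + " ZB"); // 1.0 ZB
    }
}
```

Under that assumption, the 1,000 exabytes quoted here is one zettabyte, which puts it on the same scale as the Cisco traffic estimate above.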
16. © 2014 John Sing – All Rights Reserved
Visualizing Big Data
Source: Wikibon March 2011
Goal: Analyze *all* the data in real time
Original source: Wikibon.org, March 1, 2011 public broadcast on “Big Data”, http://wikibon.org/blog/ten-%E2%80%9Cbig-data%E2%80%9D-realities-and-what-they-mean-to-you/
Very large
Distributed aggregation
Loosely structured
Often incomplete
Sampling not strategically competitive
17. © 2014 John Sing – All Rights Reserved
Visualizing Big Data….
Original source: Wikibon.org, March 1, 2011 public broadcast on “Big Data”, http://wikibon.org/blog/ten-%E2%80%9Cbig-data%E2%80%9D-realities-and-what-they-mean-to-you/
Petabytes, Exabytes
Millions / Billions of people
Billions / Trillions of records
Time-stamped events
Unknown inter-relationships
Flat files
18. © 2014 John Sing – All Rights Reserved
Visualizing Big Data…..
Original source: Wikibon.org, March 1, 2011 public broadcast on “Big Data”, http://wikibon.org/blog/ten-%E2%80%9Cbig-data%E2%80%9D-realities-and-what-they-mean-to-you/
Connections determined by probability
Process entire (huge) data set
Data generated by collective action over the Internet
Open Source innovation
It’s more than the algorithms….
19. © 2014 John Sing – All Rights Reserved
It’s also:
Original source: Wikibon.org, March 1, 2011 public broadcast on “Big Data”, http://wikibon.org/blog/ten-%E2%80%9Cbig-data%E2%80%9D-realities-and-what-they-mean-to-you/
It’s collaboration of algorithms
Combined / collaborated in innovative ways
A software ecosystem is essential
On a worldwide scale
Multiple worldwide “Pockets of Value”
20. © 2014 John Sing – All Rights Reserved
Kevin Slavin at TEDGlobal July 2011
“How algorithms shape our world”
http://www.ted.com/talks/kevin_slavin_how_algorithms_shape_our_world
Visualizing what Algorithms are doing
21. © 2014 John Sing – All Rights Reserved
Big Data and Hadoop: worldwide usage
eBay
LinkedIn
Yahoo!
Facebook
Major Fortune 500 customers
Including all IBM industries:
– Financial
– Healthcare
– M&E
– Telecom
– Utilities
– Retail
http://www.datanami.com/datanami/2012-04-26/six_super-scale_hadoop_deployments.html
One source for Hadoop users (but not the only one!): http://wiki.apache.org/hadoop/PoweredBy
22. © 2014 John Sing – All Rights Reserved
Hadoop is a well-developed ecosystem for Big Data app development
Hadoop – Overall name of software stack
HDFS – Hadoop Distributed File System
MapReduce – Software compute framework
• Map = queries
• Reduce = aggregates answers
Hive – Hadoop-based data warehouse
Pig – Hadoop-based language
HBase – Non-relational database for fast lookups
Flume – Populates Hadoop with data
Oozie – Workflow processing system
Whirr – Libraries to spin up Hadoop on Amazon EC2, Rackspace, etc.
Avro – Data serialization
Mahout – Data mining
Sqoop – Connectivity to non-Hadoop data stores
BigTop – Packaging / interop of all Hadoop components
http://wikibon.org/wiki/v/Big_Data:_Hadoop%2C_Business_Analytics_and_Beyond
http://blog.cloudera.com/blog/2013/01/apache-hadoop-in-2013-the-state-of-the-platform/
http://www.ibm.com/developerworks/data/library/techarticle/dm-1209hadoopbigdata/
23. © 2014 John Sing – All Rights Reserved
Visualizing why Hadoop was created for Big Data
Traditional approach: Move data to program
Big Data approach: Move function/programs to data
[Diagram, traditional: User request → Application server → Database server (query data, return data, process data) → Send result]
[Diagram, Big Data: User request → Master node → Send function to process on data → Data nodes (query & process data) → Send consolidated result]
Traditional approach
• Application server and Database server are separate
• Analysis Program can run on multiple Application servers
• Network is still in the middle
• Data has to go through network
• Designed to analyze TBs of data
Big Data approach
• Analysis Program runs where the data is: on the Data Node
• Only the Analysis Program has to go through the network
• Analysis Program is executed on every DataNode
• Designed to analyze PBs of data
• Highly scalable: 1000s of nodes, petabytes and more
Thank you to: Pascal VEZOLLE/France/IBM@IBMFR and Francois Gibello/France/IBM for the use of this slide
24. © 2014 John Sing – All Rights Reserved
Example of Hadoop in action
Example: How many hours does Clint Eastwood appear in all the movies he has made?
Task: All movies need to be parsed to find Clint’s face
• Traditional approach: Move data to program
1) Upload a movie to the application server through the network
2) The Analysis Program compares Clint’s picture with every frame of the loaded movie
3) Repeat the 2 previous steps for every movie
• Big Data approach: Move function/programs to data
1) Send the Analysis Program and Clint’s picture to all the DataNodes
2) The Analysis Program on every DataNode (all in parallel) compares Clint’s picture with every frame of the movie loaded on that node
3) The results from all DataNodes are consolidated into a single result
Thank you to: Pascal VEZOLLE/France/IBM@IBMFR and
Francois Gibello/France/IBM for the use of this slide
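The “move the function to the data” idea can be sketched in plain Java, without Hadoop. Everything here is a hypothetical stand-in: `DataNode` is a toy class holding its own local records, the face comparison is replaced by a simple string match on a frame tag, and a thread pool plays the role of the cluster.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.ToLongFunction;

final class FunctionToDataSketch {
    /** Hypothetical stand-in for a DataNode: it owns its local frames. */
    static final class DataNode {
        private final List<String> localFrames;

        DataNode(List<String> localFrames) {
            this.localFrames = localFrames;
        }

        /** Runs the shipped function next to the data; only the small result leaves the node. */
        long run(ToLongFunction<List<String>> analysis) {
            return analysis.applyAsLong(localFrames);
        }
    }

    /** Master role: send the same function to every node in parallel, then consolidate. */
    static long runEverywhere(List<DataNode> nodes, ToLongFunction<List<String>> analysis) {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, nodes.size()));
        try {
            List<Future<Long>> partials = new ArrayList<>();
            for (DataNode node : nodes) {
                Callable<Long> task = () -> node.run(analysis);
                partials.add(pool.submit(task));
            }
            long total = 0;
            for (Future<Long> partial : partials) {
                total += partial.get(); // consolidate the per-node results
            }
            return total;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("analysis failed on a node", e);
        } finally {
            pool.shutdown();
        }
    }

    /** Toy analysis: count frames tagged "clint" (a placeholder for real image matching). */
    static long countMatches(List<String> frames) {
        return frames.stream().filter("clint"::equals).count();
    }

    /** Three toy nodes holding 3, 1 and 1 frames respectively. */
    static long demo() {
        List<DataNode> cluster = Arrays.asList(
            new DataNode(Arrays.asList("clint", "other", "clint")),
            new DataNode(Arrays.asList("other")),
            new DataNode(Arrays.asList("clint")));
        return runEverywhere(cluster, FunctionToDataSketch::countMatches);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // 3
    }
}
```

Note that only the function reference and the per-node counts cross the (simulated) network; the frames never leave their node, which is the point of the slide.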
25. © 2014 John Sing – All Rights Reserved
Hadoop in action – details:
Hadoop Distributed File System = HDFS : where Hadoop stores the data
– HDFS file system spans all the nodes in a cluster with locality awareness
Hadoop data storage, computation model
– Data stored in a distributed file system, spanning many inexpensive computers
– Send function/program to the data nodes
– i.e. distribute application to compute resources where the data is stored
– Scalable to thousands of nodes and petabytes of data
MapReduce Application
1. Map Phase (break job into small parts)
2. Shuffle (transfer interim output for final processing)
3. Reduce Phase (boil all output down to a single result set)
Return a single result set
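// Excerpt: nested classes of a Hadoop word-count job class.
// Needs java.io.IOException, java.util.StringTokenizer,
// org.apache.hadoop.io.{Text, IntWritable}, org.apache.hadoop.mapreduce.{Mapper, Reducer}.
public static class TokenizerMapper
    extends Mapper<Object, Text, Text, IntWritable> {
  private final static IntWritable one = new IntWritable(1);
  private Text word = new Text();

  // Map: emit (word, 1) for every token in the input line.
  public void map(Object key, Text val, Context context)
      throws IOException, InterruptedException {
    StringTokenizer itr = new StringTokenizer(val.toString());
    while (itr.hasMoreTokens()) {
      word.set(itr.nextToken());
      context.write(word, one);
    }
  }
}

public static class IntSumReducer
    extends Reducer<Text, IntWritable, Text, IntWritable> {
  private IntWritable result = new IntWritable();

  // Reduce: sum the counts emitted for one word.
  public void reduce(Text key, Iterable<IntWritable> val, Context context)
      throws IOException, InterruptedException {
    int sum = 0;
    for (IntWritable v : val) {
      sum += v.get();
    }
    . . .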
Distribute map tasks to cluster
Hadoop Data Nodes
Data is loaded, spread, resident in Hadoop cluster
Performance = tuning MapReduce workflow, network, application, servers, and storage
http://www.ibm.com/developerworks/data/library/techarticle/dm-1209hadoopbigdata/
http://blog.cloudera.com/blog/2009/12/7-tips-for-improving-mapreduce-performance/
http://www.slideshare.net/allenwittenauer/2012-lihadoopperf
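The three phases on this slide (Map, Shuffle, Reduce) can be walked through in memory with plain Java collections. This is a sketch for intuition only: it does not use the Hadoop API, it runs in a single JVM, and the class name `MiniMapReduce` is made up for illustration.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

final class MiniMapReduce {
    /** Map phase: emit a (word, 1) pair for every token in one input line. */
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        StringTokenizer itr = new StringTokenizer(line);
        while (itr.hasMoreTokens()) {
            out.add(new AbstractMap.SimpleEntry<String, Integer>(itr.nextToken(), 1));
        }
        return out;
    }

    /** Shuffle: group all emitted values by key, so each key is reduced in one place. */
    static Map<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>()).add(pair.getValue());
        }
        return grouped;
    }

    /** Reduce phase: sum the values collected for one key. */
    static int reduce(List<Integer> values) {
        int sum = 0;
        for (int v : values) {
            sum += v;
        }
        return sum;
    }

    /** Runs map, shuffle and reduce in sequence and returns a single result set. */
    static Map<String, Integer> wordCount(List<String> lines) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String line : lines) {
            pairs.addAll(map(line));
        }
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> e : shuffle(pairs).entrySet()) {
            result.put(e.getKey(), reduce(e.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(wordCount(Arrays.asList("big data big value", "data in context")));
        // {big=2, context=1, data=2, in=1, value=1}
    }
}
```

In a real cluster each `map` call would run on the node that holds that slice of the input, and the shuffle would move the interim pairs across the network to the reducers.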
26. © 2014 John Sing – All Rights Reserved
What is being done
with Big Data today?
27. © 2014 John Sing – All Rights Reserved
Humans are collecting useful data on massive scale
Chart in public domain: IEEE Massive File Storage presentation, author: Bill Kramer, NCSA: http://storageconference.org/2010/Presentations/MSST/1.Kramer.pdf:
28. © 2014 John Sing – All Rights Reserved
We are building real-time, integrated stream computing on massive scale
Chart in public domain: IEEE Massive File Storage presentation, author: Bill Kramer, NCSA: http://storageconference.org/2010/Presentations/MSST/1.Kramer.pdf:
Chapter 1
29. © 2014 John Sing – All Rights Reserved
Data is the new natural resource
• Unlimited in amount, but you have to refine it
• Basis of competitive advantage, no matter what industry
• Every market being transformed by data
30. © 2014 John Sing – All Rights Reserved
Aerospace / defense transformation
31. © 2014 John Sing – All Rights Reserved
Automobile transformation
Ford: https://www.youtube.com/watch?v=nFUszkSv5X0
32. © 2014 John Sing – All Rights Reserved
Energy & utilities transformation
33. © 2014 John Sing – All Rights Reserved
Government transformation
Miami-Dade County: https://www.youtube.com/watch?v=toL4Yx9WYPo
Miami-Dade Police: https://www.youtube.com/watch?v=1b5RiPWd-Pw
34. © 2014 John Sing – All Rights Reserved
Media and entertainment transformation
35. © 2014 John Sing – All Rights Reserved
Predictive Analytics: Movement in a City
•10 minute-ahead volume forecast (blue) vs. actual
value (black)
•10 minute-ahead speed forecast (blue) vs. actual
value (black).
Blue line: analytics prediction 10 minutes in advance
Black line: actual result
Stockholm: http://www.youtube.com/watch?v=rfMylzF4lv8
36. © 2014 John Sing – All Rights Reserved
Predictive Analytics: Using Information to Ensure Public Safety:
Blue CRUSH in Memphis, TN & Richmond, VA
Blue CRUSH predictive analysis for officer deployment & risk management generated easy-to-read crime maps every four hours
Richmond, VA: Violent crime decreased in the first year by 32%, another 40% thereafter,
moving Richmond from #5 on the list of the most dangerous US cities to #99
Memphis Blue CRUSH Map
Police videos: http://www.youtube.com/watch?v=8SJQtn4RO7I
https://www.youtube.com/watch?v=_xsffIAHY3I
37. © 2014 John Sing – All Rights Reserved
A new class of data-rich industries has already emerged
Yesterday’s Hyperscale
Data Companies
New business models: company’s value based on amount of information stored, exploited
Today’s Hyperscale Data Companies
Aerospace
Banking
Energy
Government
Healthcare
Insurance
Manufacturing
Media and
Entertainment
Retail
3.5 PB in 2010
1 TB CT scanner → 2.5 PB/Year/Scanner
20 PB in 2011
Grow 300 TB per month, every month
Examples
Industries
Healthcare
Provider
Claims
Processor
38. © 2014 John Sing – All Rights Reserved
How much data?
39. © 2014 John Sing – All Rights Reserved
1. Data + Analytics = Information
2. Information + Context = Insight
3. Insight + Actions = Desired Outcomes
Solution: take Big Data on the Journey to Value
40. © 2014 John Sing – All Rights Reserved
Data + Analytics = Information
Information + context = Insight
So…. What is “context”?
41. © 2014 John Sing – All Rights Reserved
Review: this is the Big Data Opportunity
[Chart: Computing Power Growth over Time. Labels: Available Observation Space; Context (“context”); Add: Big Data Sensemaking Algorithms; Big Data Capability; New/Useful Information; Data; Analytics]
42. © 2014 John Sing – All Rights Reserved
No Context
scrila34@msn.com
43. © 2014 John Sing – All Rights Reserved
Context, definition
Better understanding something by taking into
account the things around it.
44. © 2014 John Sing – All Rights Reserved
Information in Context … = Insights
Top 200 Customer
Job Applicant
Identity Thief
Criminal Investigation
scrila34@msn.com
45. © 2014 John Sing – All Rights Reserved
The Puzzle Metaphor: what we mean by “Context”
Imagine an ever-growing pile of puzzle pieces of varying sizes, shapes
and colors
What it represents is unknown – there is no picture on hand
Is it one puzzle, 15 puzzles, or 1,500 different puzzles?
Some pieces are duplicates, missing, incomplete, low quality,
or have been misinterpreted
Some pieces may even be professionally fabricated lies
Until you take the pieces to the table and attempt assembly,
you don’t know what you are dealing with
46. © 2014 John Sing – All Rights Reserved
Puzzling
270 pieces: 90%
200 pieces: 66%
150 pieces: 50%
6 pieces: 2% (pure noise)
30 pieces: 10% (duplicates)
47. © 2014 John Sing – All Rights Reserved
48. © 2014 John Sing – All Rights Reserved
49. © 2014 John Sing – All Rights Reserved
First Discovery
50. © 2014 John Sing – All Rights Reserved
More Data Finds Data
51. © 2014 John Sing – All Rights Reserved
Duplicates in Front Of Your Eyes
52. © 2014 John Sing – All Rights Reserved
First Duplicate Found Here
53. © 2014 John Sing – All Rights Reserved
54. © 2014 John Sing – All Rights Reserved
55. © 2014 John Sing – All Rights Reserved
Incremental Context – Incremental Discovery
6:40pm START
22min “Hey, this one is a duplicate!”
35min “I think some pieces are missing.”
37min “Looks like a bunch of hillbillies on a porch.”
44min “Hillbillies, playing guitars, sitting on a porch,
near a barber sign … and a banjo!”
56. © 2014 John Sing – All Rights Reserved
150 pieces
50%
57. © 2014 John Sing – All Rights Reserved
Incremental Context – Incremental Discovery
47min “We should take the sky and grass off the table.”
2hr “Let’s switch sides, and see if we can make sense
of this from different perspectives.”
2hr10m “Wait, there are three … no, four puzzles.”
2hr17m “We need a bigger table.”
2hr18m “I think you threw in a few random pieces.”
58. © 2014 John Sing – All Rights Reserved
59. © 2014 John Sing – All Rights Reserved
60. © 2014 John Sing – All Rights Reserved
61. © 2014 John Sing – All Rights Reserved
How Context Accumulates
With each new observation … one of three assertions is made: 1) un-associated;
2) placed near like neighbors; or 3) connected
New observations sometimes reverse earlier assertions
Some observations produce new discovery
As the working space expands, computational effort increases
Given sufficient observations, there can come a tipping point. Thereafter,
confidence improves while computational effort decreases!
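The three assertions can be illustrated with a toy accumulator. This sketch is not Jeff Jonas's algorithm: the scoring rule (two or more shared features means connected, exactly one means near a like neighbor, none means un-associated) is a hypothetical threshold chosen only to make the idea concrete, and it does not model reversing earlier assertions or the tipping point.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class ContextAccumulator {
    enum Assertion { UNASSOCIATED, NEAR_LIKE_NEIGHBORS, CONNECTED }

    /** Every observation seen so far (the "pieces on the table"). */
    private final List<Set<String>> seen = new ArrayList<>();

    /**
     * Compares a new observation with everything already on the table.
     * Hypothetical rule: 2+ shared features = connected, exactly 1 = near, 0 = un-associated.
     */
    Assertion observe(Set<String> features) {
        int bestOverlap = 0;
        for (Set<String> earlier : seen) {
            Set<String> shared = new HashSet<>(earlier);
            shared.retainAll(features);
            bestOverlap = Math.max(bestOverlap, shared.size());
        }
        seen.add(new HashSet<>(features)); // the working space grows with each observation
        if (bestOverlap >= 2) {
            return Assertion.CONNECTED;
        }
        if (bestOverlap == 1) {
            return Assertion.NEAR_LIKE_NEIGHBORS;
        }
        return Assertion.UNASSOCIATED;
    }

    /** Feeds three puzzle-piece-like observations and records each assertion. */
    static List<Assertion> demo() {
        ContextAccumulator table = new ContextAccumulator();
        List<Assertion> out = new ArrayList<>();
        out.add(table.observe(new HashSet<>(Arrays.asList("blue", "sky"))));
        out.add(table.observe(new HashSet<>(Arrays.asList("blue", "water"))));
        out.add(table.observe(new HashSet<>(Arrays.asList("blue", "sky", "cloud"))));
        return out;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // [UNASSOCIATED, NEAR_LIKE_NEIGHBORS, CONNECTED]
    }
}
```

Each call to `observe` scans every earlier observation, which mirrors the slide's point that computational effort increases as the working space expands.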
62. © 2014 John Sing – All Rights Reserved
Big Data [in context] = Insights.
More data: better predictions
– Lower false positives
– Lower false negatives
More data: bad data … good
– Suddenly glad your data was not perfect
More data: less compute
63. © 2014 John Sing – All Rights Reserved
1. Data + Analytics = Information
2. Information + Context = Insight
3. Insight + Actions = Desired Outcomes
Quiz: The Big Data Journey to Value
64. © 2014 John Sing – All Rights Reserved
The most competitive organizations
are going to make sense of what they are observing
fast enough to do something about it
while they are observing it.
65. © 2014 John Sing – All Rights Reserved
Thus, there is a Workflow in a Big Data infrastructure
[Diagram labels: Data in Motion; Data at Rest; Data in Many Forms; Raw Data; Structured Data; Video/Audio; Network/Sensor; Streams; Stream Processing; Real-time Analytics; Information Ingestion and Operational Information; Data Integration; Master Data; Landing Area, Analytics Zone, Archive; Exploration, Integrated Warehouse, and Mart Zones; Text Analytics; Data Mining; Entity Analytics; Machine Learning; Predictive; Operational; Discovery; Deep Reflection; BI and Predictive Analytics; Navigation and Discovery; Intelligence Analysis; Decision Management; Information Governance, Security and Business Continuity; Batch parallel Big Data processing; Real-Time In-memory servers; Data Warehouse; Traditional IT]
66. © 2014 John Sing – All Rights Reserved
In order to build a workflow for Big Data, you must know:
Where/how is Big Data stored, analyzed, delivered?
Understanding Big Data in Context
67. © 2014 John Sing – All Rights Reserved
C = cloud
A = analytics
M = Mobile
S = Social
Remember this acronym: C.A.M.S.
68. © 2014 John Sing – All Rights Reserved
C = cloud
A = analytics
M = Mobile
S = Social
Big Data in Context:
Where data is generated and
collected
Where data is stored
How data is analyzed
Where data is analyzed
How data is delivered
Who is consuming it
69. © 2014 John Sing – All Rights Reserved
Cloud – today’s Delivery Model
70. © 2014 John Sing – All Rights Reserved
Where is the Big Data?
Answer: Cloud Data Centers
71. © 2014 John Sing – All Rights Reserved
Bandwidth availability is the tipping point for adoption of “The Cloud”
Worldwide broadband bandwidth availability is
becoming commonplace
Facilitates a pervasive web services delivery model
– (i.e. “The Cloud”)
Hosted in mega data centers with massive amounts:
– Processors, Storage, Network
As a result:
– We are seeing on-premise data centers worldwide
rapidly disappearing, off-premise, into the cloud
72. © 2014 John Sing – All Rights Reserved
http://wikibon.org/blog/wp-content/uploads/2011/10/5-top-data-centers.html
73. © 2014 John Sing – All Rights Reserved
http://wikibon.org/blog/wp-content/uploads/2011/10/5-top-data-centers.html
74. © 2014 John Sing – All Rights Reserved
Large Cloud Data Centers
10. SUPERNAP, LAS VEGAS, 407,000 SF
9A and 9B. MICROSOFT QUINCY AND SAN ANTONIO DATA CENTERS, 470,000 SF
75. © 2014 John Sing – All Rights Reserved
Container Data Center Architecture
7. PHOENIX ONE, PHOENIX, ARIZ. 538,000 SF
5. MICROSOFT CHICAGO DATA CENTER, CHICAGO, 700,000 SF
2. QTS METRO DATA CENTER, ATLANTA, 990,000 SF
Microsoft’s Chicago Container Data Center
76. © 2014 John Sing – All Rights Reserved
More data centers….
4. NEXT GENERATION DATA EUROPE, WALES, 750,000 SF
3. NAP OF THE AMERICAS, MIAMI, 750,000 SF
1. 350 EAST CERMAK, CHICAGO, 1.1 MILLION SQUARE FEET
Consumes 100 megawatts of power, 2nd-largest power customer for Commonwealth Edison, trailing only Chicago’s O’Hare Airport.
77. © 2014 John Sing – All Rights Reserved
Now….. what about the web giants?
i.e. Apple, Facebook, Google, Amazon, etc?
That’s Big!
Great Technology Wars of 2012 – Future of the Innovation Economy - Fast Company.com
78. © 2014 John Sing – All Rights Reserved
Apple
Here’s what powers iCloud: see Jobs at the WWDC 2011 iCloud announcement (YouTube)
Rendering of Apple's new North Carolina Data Center. Credit: Apple
Other Apple data centers:
Cork, Ireland
Munich, Germany
Newark, California
Cupertino, Calif.
Apple Data Center FAQ
Maiden, North Carolina: 500K sq ft, USD $1B
This is phase 1 only
Apple Data Center Newark, California
Purposes for all these data centers:
•iCloud
•Support Apple’s WW install base of devices
•Futures: Move Content Delivery Network in-house?
•Futures: Streaming video?
Under construction: Prineville, Oregon
79. © 2014 John Sing – All Rights Reserved
Facebook
Facebook’s North Carolina Data Center Goes Live
Lulea, Sweden - 290K sq ft (27K sq meters) by late 2012
Facebook – Prineville, Oregon
Has spent $1B on its data centers
Open Compute Project
http://www.wired.com/wiredenterprise/2011/12/facebook-data-center/all/1
80. © 2014 John Sing – All Rights Reserved
Amazon Web Services
Amazon Web Services 1Q12: 450,000 servers
Amazon Perdix Modular Datacenter
EC2 17K core, 240 teraflop cluster
42nd fastest supercomputer in world
1Q12: 450,000 servers (estimated)
1Q13: > 2 trillion objects in S3
1Q13: 1.1 M req/sec
http://aws.typepad.com/aws/2012/04/amazon-s3-905-billion-objects-and-650000-requestssecond.html
http://gigaom.com/cloud/how-big-is-amazon-web-services-bigger-than-a-billion/
http://aws.typepad.com/aws/2013/04/amazon-s3-two-trillion-objects-11-million-requests-second.html
81. © 2014 John Sing – All Rights Reserved
What is Google? Google is not a search engine
Google is a real-time “Data Factory” ecosystem
– De facto organizer of all human internet data
– Provides worldwide Patterns of Life data
• Search, analytics, etc as processing
• Interactive maps as visualization
– Android as ingest / output devices
• Motorola Mobility acquisition $12.5B
– Supporting businesses and ecosystem roles:
• Google+, Play, Shop, Books, Gmail, Docs
• Voice recognition software
The history of search engines: http://www.wordstream.com/articles/internet-search-engines-history
82. © 2014 John Sing – All Rights Reserved
Google: The Dalles, Oregon internet scale data center
83. © 2014 John Sing – All Rights Reserved
Google Data Center Photo Gallery
http://www.google.com/about/datacenters/gallery/#/
84. © 2014 John Sing – All Rights Reserved
Google Data Centers in 2008:
85. © 2014 John Sing – All Rights Reserved
Google Data Center CAPEX worldwide
Capital expenditures on datacenters:
– YTD 2013: USD$ 2.4B
– 2012: USD$ 3.2B
– 2011: USD$ 3.4B
– 2010: USD$ 4.0B
– 2009: USD$ 809M
The Dalles, Oregon
Each data center between $200M and $600M
86. © 2014 John Sing – All Rights Reserved
Time to market
Cost Reduction
Data proximity
Better/faster technology support
Self-service
Shift the culture/business process
New kinds of applications
At scale never before imagined
Why Cloud Delivery Model, Cloud Data Centers
87. © 2014 John Sing – All Rights Reserved
Primary drivers for move to cloud = business reasons
http://www.kpmg.com/global/en/issuesandinsights/articlespublications/cloud-service-providers-survey/pages/service-providers.aspx
Competitive Advantage,
Revenue
88. © 2014 John Sing – All Rights Reserved
Cloud Infrastructure Business Value
Value delivered, from traditional to cloud:
For users
– IT Infrastructure Provisioning: from weeks to minutes
– Continuous Access to data: from centralized to localized, any time, anywhere
– Capacity: from fixed to dynamic (elastic)
For IT
– Reduced admin costs: up to 50% savings
– Reduced energy costs: up to 36%
– Increased utilization: from 50% up to 90%
Time-to-Delivery, Competitive Advantage, Revenue
“Time is Money”
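To see what the utilization figure could mean in practice, here is a back-of-the-envelope sketch. It assumes a fixed workload and interchangeable servers, which the slide does not state; the class name and the 100-server starting point are illustrative only.

```java
// Sketch only: assumes a constant workload and fully interchangeable servers.
final class UtilizationSketch {
    /** Servers needed to carry the same busy capacity at a higher target utilization. */
    static int serversNeeded(int currentServers, double currentUtil, double targetUtil) {
        if (currentUtil <= 0 || currentUtil > 1 || targetUtil <= 0 || targetUtil > 1) {
            throw new IllegalArgumentException("utilization must be in (0, 1]");
        }
        double busyCapacity = currentServers * currentUtil; // work actually being done
        return (int) Math.ceil(busyCapacity / targetUtil);
    }

    public static void main(String[] args) {
        // 100 servers at 50% utilization do 50 servers' worth of work;
        // at 90% utilization that work fits on 56 servers.
        System.out.println(serversNeeded(100, 0.5, 0.9)); // 56
    }
}
```

Under those assumptions, moving from 50% to 90% utilization cuts the server count by a little under half, which is one way to read the admin and energy savings listed alongside it.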
89. © 2014 John Sing – All Rights Reserved
Growth of The Cloud by 2016
Mobile
Geo-locational
Real-time data
Shift to cloud mega-data centers
Source: http://www.datacenterknowledge.com/archives/2012/10/23/cisco-releases-2nd-annual-global-cloud-index/
> 50% in cloud
Cisco already knows > 50% workload is in the cloud
90. © 2014 John Sing – All Rights Reserved
Visualizing Mobile and Social
C = cloud
A = analytics
M = Mobile
S = Social
91. © 2014 John Sing – All Rights Reserved
Space-Time-Travel
Source - blog by: Jeff Jonas/Las Vegas/IBM, Chief Scientist, IBM SWG Entity Analytics http://jeffjonas.typepad.com/
6 billion mobile phones
6.8 billion people
92. © 2014 John Sing – All Rights Reserved
Space-Time-Travel
6 billion mobile phones
6.8 billion people
Source - blog by: Jeff Jonas/Las Vegas/IBM, Chief Scientist, IBM SWG Entity Analytics http://jeffjonas.typepad.com/
Re-Identify (figuring who is who) is somewhat trivial
Reveal
Where you spend time
Who with (e.g., friends)
Geo-location data
Mobile Phones
600B transactions / day (in US)
De-Identify
in volume
in real-time
share with third parties
93. © 2014 John Sing – All Rights Reserved
Space-Time-Travel
6 billion mobile phones
6.8 billion people
Source - blog by: Jeff Jonas/Las Vegas/IBM, Chief Scientist, IBM SWG Entity Analytics http://jeffjonas.typepad.com/
Here Now
More to come
Unravel
All of one’s secrets
Absolute identification
Ultimate biometric
Reshape
Tough problems
Image classification
Identification
Enormous Opportunity
Challenge all notions of privacy
94. © 2014 John Sing – All Rights Reserved
Possible… Like Magic…
Source - blog by: Jeff Jonas/Las Vegas/IBM, Chief Scientist, IBM SWG Entity Analytics
http://jeffjonas.typepad.com/jeff_jonas/2009/08/your-movements-speak-for-themselves-spacetime-travel-data-is-analytic-superfood.html
87% certainty where you will be this Thursday at 5pm
Top 10 people you co-locate with (home / work)
High-quality traffic-avoidance predictions pushed to you in real time
Transactions not consistent with your pattern = reduce credit card theft 90%
Political opponent crushed, resigns two days after announcing candidacy
Governments change due to mass online social networking
Cannot truly be turned off
6 billion
mobile phones
6.8 billion
people
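The predictions on this slide ("where you will be this Thursday at 5pm", "top 10 people you co-locate with") can be illustrated with a simple frequency count over timestamped location sightings. This is a minimal sketch, not the method described in the cited blog: the `SIGHTINGS` data, the place labels, and the helper names `predict_place` and `co_located` are all invented for illustration.

```python
from collections import Counter, defaultdict
from datetime import datetime

# Hypothetical sightings: (person, ISO timestamp, place label). Invented data.
SIGHTINGS = [
    ("alice", "2014-01-02T17:05", "gym"),
    ("alice", "2014-01-09T17:10", "gym"),
    ("alice", "2014-01-16T17:02", "gym"),
    ("alice", "2014-01-23T17:20", "office"),
    ("bob",   "2014-01-02T17:07", "gym"),
    ("bob",   "2014-01-09T17:12", "gym"),
    ("carol", "2014-01-23T17:25", "office"),
]

def predict_place(sightings, person, weekday, hour):
    """Most frequent past place for a weekday/hour slot, with its share of sightings."""
    counts = Counter()
    for who, stamp, place in sightings:
        t = datetime.fromisoformat(stamp)
        if who == person and t.weekday() == weekday and t.hour == hour:
            counts[place] += 1
    if not counts:
        return None, 0.0
    place, hits = counts.most_common(1)[0]
    return place, hits / sum(counts.values())

def co_located(sightings, person, top_n=10):
    """Rank other people by how often they share a (date, hour, place) slot."""
    slots = defaultdict(set)
    for who, stamp, place in sightings:
        t = datetime.fromisoformat(stamp)
        slots[(t.date(), t.hour, place)].add(who)
    counts = Counter()
    for people in slots.values():
        if person in people:
            counts.update(people - {person})
    return counts.most_common(top_n)

# Thursday is weekday 3 in Python's datetime; 17 is the 5pm hour.
print(predict_place(SIGHTINGS, "alice", 3, 17))  # ('gym', 0.75)
print(co_located(SIGHTINGS, "alice"))            # [('bob', 2), ('carol', 1)]
```

Even this naive count shows why space-time data is so revealing: a handful of repeated sightings is enough to produce a confident guess about habits and companions.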
95. © 2014 John Sing – All Rights Reserved
Mobile/Social:
84% of smartphone users check an app as soon as they wake up.
2x as many people in 2013 were willing to share their geolocation data in return for personalized offers compared to the previous year.
5 minutes: the response time users expect from a company once they have contacted them via social media.
4/5ths of U.S. adult smartphone users keep their phones with them 22 hours per day.
80% of individuals are willing to trade their information for a personalized offering.
2/3rds of U.S. adults say they would not return to a business that lost their personal, confidential information.
$1 trillion of upside potential in online retail sales if buyers trust more.
84% of Millennials say social and user-generated content has an influence on what they buy. 70% of Boomers agree.
57% of companies in 2014 expect to devote more than 25% of their IT spending to systems of engagement. (Almost double the investment one year ago.)
96. © 2014 John Sing – All Rights Reserved
Observe: how fast mobile internet grows by 2014
By 2014: mobile will be the main way of connecting to the Internet
97. © 2014 John Sing – All Rights Reserved
Mobile affects all business models…
Mobile = Geo-locational superfood for real-time analytics
98. © 2014 John Sing – All Rights Reserved
Mobile / Social endpoints for Data Supertransformagicability
TaxiWiz
HousingMaps
Source: http://mashable.com/2007/07/11/google-maps-mashups-2/
Weatherbug
99. © 2014 John Sing – All Rights Reserved
By 2016, how much mobile data? What kind?
• 2012:
– Mobile-connected devices > # people
• 2016:
– 10 billion mobile devices
– (world population: 7.3 B)
http://www.cisco.com/en/US/solutions/collateral/ns341/ns525/ns537/ns705/ns827/white_paper_c11-520862.html
Smartphones: 48%
Web data, video: 70%
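As a quick check on the 2016 figures above, the ratio of devices to people can be worked out directly. The helper name `devices_per_person` is an assumption made for this illustration; the two inputs are the numbers quoted on the slide.

```python
def devices_per_person(devices_billion, population_billion):
    """Average number of mobile devices per person."""
    return devices_billion / population_billion

# Slide figures for 2016: 10 billion mobile devices, 7.3 billion people.
ratio_2016 = devices_per_person(10.0, 7.3)
print(f"{ratio_2016:.2f} mobile devices per person")  # 1.37 mobile devices per person
```

A ratio above 1 is consistent with the slide's 2012 milestone, when mobile-connected devices first outnumbered people.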
100. © 2014 John Sing – All Rights Reserved
Monetizing, innovating, creating competitive advantage out of Big Data
Innovating using Big Data
101. © 2014 John Sing – All Rights Reserved
Different forms of automation have had a profound impact
102. © 2014 John Sing – All Rights Reserved
Manufacturing changes with an end of mass production
• Growth in manufacturing-capable countries
• Global levelling out
• Hybridised manufacturing
• Micro multi-national clusters
• Globally recognised specialisation
103. © 2014 John Sing – All Rights Reserved
3D printing has the potential to drive another step change
• Digitisation often leads to freemium
• Defining a sustainable position in the value chain
• Really understanding customer value is critical
104. © 2014 John Sing – All Rights Reserved
“The Curve”: giving away things for free, in exchange for data?
http://www.youtube.com/watch?v=pcyzn5oiDrI
Today’s changing business models
105. © 2014 John Sing – All Rights Reserved
1. Augment Products: Instrument products to create new data and extend the notion of client value
2. Codify Services: Expand use of differentiated capabilities through ecosystems or business platforms to create additional value
3. Interconnect Industries: Use information to create new value chains that reduce waste and bridge gaps between organizations
4. Trade Information: Translate data into information that is of value to adjacent industries
5. Digitise Assets: Transform analogue into digital assets
New Patterns for Innovation have emerged
106. © 2014 John Sing – All Rights Reserved
Using these patterns requires elastic enterprises
Adjacency
Leverage core competency
Earn market permission
Differentiation
Maintainable advantage
Serve individual needs
Scaling Ecosystems
Amplified innovation
Co-creation of new value
Dynamic Operating Model
Able to share the new value
Scalable business platform
Source: The Elastic Enterprise, Nicholas Vitalari and Haydn Shaughnessy
107. © 2014 John Sing – All Rights Reserved
Interesting to look at Corning
Strong light glass for light bulbs
Dishes, plates…
They are the “standard” in some cultures
Glass for LCD screens.
Now predicting the future of glass
http://www.youtube.com/watch?v=jZkHpNnXLB0
108. © 2014 John Sing – All Rights Reserved
Big Data is at the heart of innovation in business
109. © 2014 John Sing – All Rights Reserved
Big Data business decisions URL:
https://bda.expertise.client-conversations.com
Available on the internet
Complete information on Innovating with Big Data:
110. © 2014 John Sing – All Rights Reserved
Jobs replaced by Technology
http://www.businessinsider.com/the-future-of-jobs-the-onrushing-wave-2014-1
111. © 2014 John Sing – All Rights Reserved
1. Data + Analytics = Information
2. Information + Context = Insight
3. Insight + Actions = Desired Outcomes
Quiz: The Big Data Journey to Value
112. © 2014 John Sing – All Rights Reserved
Data, the new natural Resource
Data + Analytics = Information. Information + Context = Insight. Insight + Action = Outcomes
Big Data in context:
Cloud, Analytics, Mobile, Social
Innovating using Big Data:
Monetizing, innovating, creating competitive advantage out of Big Data
Summary – what we covered today:
113. © 2014 John Sing – All Rights Reserved
Thank You (English)
Merci (French)
Grazie (Italian)
Obrigado (Brazilian Portuguese)
Danke (German)
Tesekkurler (Turkish)
Also shown in: Japanese, Hebrew, Russian, Arabic, Traditional Chinese, Simplified Chinese, Hindi, Tamil, Korean, Thai
114. © 2014 John Sing – All Rights Reserved