(and live)

(and think)

The City University of New York
New York City
December 4, 2013

@BlairReeves
Blair Reeves

Product Lead, IBM Digital Analytics

IBM.com/digitalmarketing
I live here:
Durham, North Carolina

“The Year of Big Data”

Credit: Gartner Research

… is every year. From now on.
The Value of Data is Increasing

The Value of Data
… is still being decided

Book Value: $13 billion

Market Value: $114 billion
= 1.3 billion MAUs
~500 terabytes of data added… per day

$101 billion in data
A Short History of Data
300 B.C.
Great Library of Alexandria (Egypt)

970 A.D.
Al-Azhar University (Egypt)

1400
Cambridge University owns 122 books

1450s
Invention of the Gutenberg printing
press

1520s
Martin Luther translates the Bible into German, accelerating
mass literacy

1710
Copyright law is born

1770s
Press freedom guarantees; pamphleteering

1890
Herman Hollerith invents machine-readable
data for U.S. Census

1969
ARPANET – first packet-switched network (forerunner of TCP/IP)

2013
Watson

~2.8 billion global internet users
(40% of world’s population)

The Way We Use Data Will Change

Trade Exactitude for Size
Why Sample?

Correlation Over Causality

1 – Trade Exactitude for Size

Precision < Size
More data > Better algorithms

1 – Trade Exactitude for Size
1954
250 word pairs

1990
3 million word pairs

2006
>100 billion word pairs
(and counting)

2 – Why Sample?
• Sampling relies on randomness
• Difficult to drill down into
subcategories
• Requires careful pre-planning
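
The drill-down problem in the bullets above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the 100,000-row dataset, the "rare" subcategory of 50 rows, and the 1% sample size are hypothetical and do not come from the talk.

```python
import random
from collections import Counter


def count_by_category(records):
    """Count how many records fall into each category."""
    return Counter(category for category, _value in records)


# Hypothetical population: 100,000 records, of which only 50 belong
# to a rare subcategory. Names and sizes are illustrative only.
population = [("common", i) for i in range(99_950)]
population += [("rare", i) for i in range(50)]

rng = random.Random(0)                  # fixed seed so the sketch is repeatable
sample = rng.sample(population, 1_000)  # simple random sample of 1%

full_counts = count_by_category(population)
sample_counts = count_by_category(sample)

# The full dataset always contains all 50 rare rows; a 1% sample is
# expected to contain well under one of them on average.
print("rare rows in full dataset:", full_counts["rare"])
print("rare rows in 1% sample:   ", sample_counts["rare"])
```

With so few rare rows surviving the sample, any statistic computed on that subcategory rests on almost no data, whereas a query over the full dataset sees every one of them.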

2 – Why Sample?
• Sumo wrestlers
• Google Flu
• Non-linear relationships
(social media)

3 – Correlation Over Causality

When does knowing “why” matter?
Data rather than hypotheses

Correlations are value

3 – Correlation Over Causality

A/B Testing

Attribution

“Everything is obvious once you
know the answer.”
- Duncan Watts

Thanks!
BReeves@us.ibm.com
@BlairReeves

IBM.com/digitalmarketing
IBMBigDataHub.com

Why Big Data Will Survive the Hype - and Change the Way We Work


Editor's Notes

  1. “Big data” has no strict definition; the term merely refers to the process (or capability) of analyzing datasets so large that they couldn’t previously fit into computer memory. This is where Google MapReduce and Hadoop came from. The technology companies that pioneered these techniques were thus able to extract unique new value from huge troves of data that many “offline” companies across a wide range of sectors had kept for years.
  2. Today, up to a third of Amazon’s online revenue is derived from its personalization and recommendations engine. Case studies: you can cite any number of them showing how innovative companies have extracted new value from large, previously unremarkable datasets. In each of these cases, what we see is that data has become the newest natural resource, and it is being exploited to create new markets.
  3. Interestingly, guess how many companies have a line item on their balance sheets for “data”? None. Facebook is one of the single best examples of this mismatch between traditional systems of financial value and new ones. Intangible assets made up 40% of the value of public companies in the 1980s and 75% of their value in the 2010s.
  4. As human societies consume, generate and process more data, our political, legal and conceptual models must change along with them. While it took hundreds of years for mass literacy and printed information to change Western civilization, we are now living in an era where the amount of data, and access to it, are completely unprecedented. It will change how we think about the nature of information itself. 8M books were printed from 1453 to 1503. Hollerith shrank tabulating times for the U.S. Census from 8 years to <1.
  5. Interestingly, guess how many of the companies listed here have a line item on their balance sheets for “data?” None.
  6. Collecting more data, more often, frequently means sacrificing some level of precision. At large scale, accepting some noise (messiness) in exchange for a larger dataset can mean better predictive power. NoSQL.
  7. IBM 701 Machine – punch card system. Translated 60 sentences smoothly. IBM Candide – ten years’ worth of Canadian parliamentary transcripts. Ultimately it was difficult to scale due to a lack of additional data. Google Translate uses billions of websites and Google’s book-scanning project. In 2013, it covers more than 60 languages.
  8. Whether we sample is sometimes a definitional characteristic of what qualifies as “big data”: are we querying an entire dataset rather than a select part of it? Sampling is still very useful sometimes, but always as a second-best alternative to querying an entire dataset. It is an artifact of a data-constrained environment in which storage and processing power were sharply limited.
  9. Up to a third of all Amazon’s sales are a result of its recommendation and personalization engines. These product-to-product correlations matter far more than understanding WHY customers who buy one product like another.
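
As a rough illustration of the product-to-product correlations mentioned in note 9, the sketch below counts how often products are bought together and ranks recommendations purely from those counts, with no model of why the products go together. It is not Amazon's method. The function names `co_purchase_counts` and `recommend` and the sample baskets are hypothetical and were invented for this example.

```python
from collections import Counter
from itertools import combinations


def co_purchase_counts(baskets):
    """Count how often each unordered pair of products shares a basket."""
    pair_counts = Counter()
    for basket in baskets:
        # sorted(set(...)) removes duplicates and gives each pair one canonical order
        for pair in combinations(sorted(set(basket)), 2):
            pair_counts[pair] += 1
    return pair_counts


def recommend(product, pair_counts, top_n=3):
    """Rank other products by how often they were bought with `product`."""
    scores = Counter()
    for (a, b), n in pair_counts.items():
        if a == product:
            scores[b] += n
        elif b == product:
            scores[a] += n
    return [item for item, _count in scores.most_common(top_n)]


# Hypothetical purchase baskets, for illustration only.
baskets = [
    ["tent", "sleeping bag", "stove"],
    ["tent", "sleeping bag"],
    ["tent", "lantern"],
    ["stove", "lantern"],
]

counts = co_purchase_counts(baskets)
print(recommend("tent", counts))
```

The ranking uses only co-occurrence counts, which is the point of the note: the correlation is actionable even when the reason behind it is unknown.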