Roger Ehrenberg
Founder & Managing Partner
IA Ventures
http://www.flickr.com/photos/wallyg/3777954520/
http://www.flickr.com/photos/chanc/310847464/
http://www.flickr.com/photos/northeastindiana/2313044640/
http://www.flickr.com/photos/ynse/542370154/
[Four charts: infrastructure cost and reach over time]
Storage cost ($ per TB, 1970 to today): 1980, Apple: $14M per TB; 2010, Barracuda: $70 per TB
Network access (# of hosts, 1969 to today): ARPAnet Node 1 at UCLA; 1B hosts
CPU cost ($ per GFLOPS, 1960 to today): 1961, IBM 1620: $1,100,000,000; 2009, AMD Radeon: $0.59
Bandwidth cost ($ per Mbps, 1998 to today): $1200 per Mbps; $5 per Mbps
Source: Mike Driscoll, CTO Metamarkets: The Three Sexy Skills of Data Scientists (& Data Driven Startups)
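To put the charted endpoints in proportion, they can be turned into decline factors. This is a minimal sketch that uses only the figures printed on the slide; the dictionary and function names are illustrative and do not come from the deck.

```python
# Endpoint figures as printed on the slide: (start value, end value).
# The names are illustrative; only the numbers come from the deck.
ENDPOINTS = {
    "storage ($ per TB)": (14_000_000, 70),        # 1980 Apple vs. 2010 Barracuda
    "cpu ($ per GFLOPS)": (1_100_000_000, 0.59),   # 1961 IBM 1620 vs. 2009 AMD Radeon
    "bandwidth ($ per Mbps)": (1200, 5),           # 1998 vs. today
}


def decline_factor(start: float, end: float) -> float:
    """Return how many times cheaper the end price is than the start price."""
    return start / end


if __name__ == "__main__":
    for name, (start, end) in ENDPOINTS.items():
        print(f"{name}: about {decline_factor(start, end):,.0f}x cheaper")
```

On these endpoints, storage comes out 200,000 times cheaper and bandwidth 240 times cheaper, while the CPU figure falls by a factor of roughly 1.9 billion.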
Small     Thousands of sales figures (10 GB)     Stored in memory
Medium    Millions of web pages                  Stored on disk
Large     Billions of web clicks (1TB+)          Distributed storage
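The speaker notes for this slide observe that once data moves to distributed storage, even simple mathematical tasks become challenging. The following is a minimal sketch of why, assuming the data is split into shards (plain Python lists stand in for machines, and all function names are hypothetical): a mean can be assembled exactly from small per-shard summaries, whereas the tempting shortcut for a median is not exact.

```python
from statistics import median


def shard_summary(shard):
    """Per-shard partial aggregate (sum, count); small enough to ship cheaply."""
    return sum(shard), len(shard)


def distributed_mean(shards):
    """Combine per-shard (sum, count) pairs into the exact global mean."""
    total, count = 0.0, 0
    for shard_sum, shard_count in map(shard_summary, shards):
        total += shard_sum
        count += shard_count
    return total / count


def median_of_medians(shards):
    """Shortcut: median of the per-shard medians. Not exact in general."""
    return median(median(shard) for shard in shards)


if __name__ == "__main__":
    shards = [[1, 2, 3], [4, 5, 6, 7, 8, 100]]
    everything = [x for shard in shards for x in shard]
    print("mean:  ", distributed_mean(shards), "vs", sum(everything) / len(everything))
    print("median:", median_of_medians(shards), "vs", median(everything))
```

With the sample shards, the combined mean matches the single-machine mean, while the shortcut median gives 4.25 against a true median of 5. An exact distributed median needs either more data movement or an algorithm designed for it.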
✗
[Diagram: data taxonomy]
Source of Data (y-axis), from Your Data to Others' Data: Data Only From Your Platform, Hybrid, Data Only From Others' Platforms
Final Product (x-axis), from Data Product to Data-driven Product: Sell Data Directly, Sell Insight, Sell Product
[Diagram: data taxonomy, Source of Data vs. Final Product, annotated: Companies focused on delivering increasing insight]
http://www.flickr.com/photos/tps58/6158683716
Complex Data Architectures
  Proprietary Algorithms
      Rich Analytics
010001011
        Contributory
         Database
          Platform
[Cycle diagram around PRODUCT: User engagement, Data, Insight, Improvements]
http://www.billfrymire.com/blog/wp-content/uploads/2008/04/dna-strand-code.jpg
[Venn diagram: Hacking, Statistics, Domain Expertise]
Drew Conway, The Data Science Venn Diagram
[Venn diagram: Hacking, Statistics, Domain Expertise; Machine Learning where Hacking and Statistics overlap; Data Scientist at the center]
Drew Conway, The Data Science Venn Diagram
Creating Competitive Advantage Through Data (IA Ventures)

Editor's notes

  1. HOW DID WE GET HERE? WHAT IS BIG DATA? WHAT REALLY CREATES TRUE COMPETITIVE BARRIERS IN DATA-DRIVEN BUSINESSES?
  2. As I wrote in a recent post, DATA IS THE NEW DOT COM. Funds are announcing a new focus on “Big Data.” IT’S HOT. WHY NOW?
  3. Big Data is pervasive, permeating every industry: Advertising, Government, Financial Services, Commerce, Pharma, Biotech & Healthcare. The good news: data is becoming MORE ACTIONABLE. The bad news: it is INCREASINGLY DIFFICULT TO EXTRACT VALUE given the VOLUME, VELOCITY AND MULTIPLE DATA TYPES.
  4. MASSIVE ADVANCES IN INFRASTRUCTURE HAVE SEEDED THE BIG DATA REVOLUTION OVER THE PAST 50 YEARS
  5. THESE TRENDS HAVE A DIRECT IMPACT UPON BUSINESS, AND THE BOTTOM LINE, e.g., RECOMMENDATION ENGINES THAT LEVERAGE HISTORICAL DATA and PREDICTIVE ANALYTICS to generate ACTIONABLE REAL-TIME INSIGHT for customers
  6. CAN WE AGREE ON A SET OF DEFINITIONS GIVEN THE AMBIGUITY OF THE TERM?
  7. Sizes that were unimaginable a few years ago are now commonplace. Just storing and accessing the data can be difficult.
     SIZE :: MANAGED WITH :: STORED
     Small :: Excel, R :: fits in memory on one machine
     Medium :: indexed files, monolithic DB :: fits on disk on one machine
     Large :: Hadoop, Distributed DB :: stored across many machines
     Example in the IA Ventures portfolio: METAMARKETS (LARGE + REAL-TIME).
     PROBLEM: WHEN YOU MOVE TO DISTRIBUTED DATABASES, even the simplest mathematical tasks, which are trivial for small and medium-size systems, become challenging.
  8. Data that is DIFFICULT FOR COMPUTERS TO UNDERSTAND. Principal example being NATURAL LANGUAGE. TEXT, IMAGES, VIDEO. VALUABLE INFORMATION TRAPPED INSIDE THIS DATA, e.g., Twitter, earnings releases. Example in the IA Ventures portfolio: RECORDED FUTURE (LARGE + UNSTRUCTURED).
  9. More data coming in faster. Decision windows getting shorter. Valuable to worthless in a matter of minutes (seconds … no, milliseconds) :: RAPID VALUE DECAY. EVERYTHING IS BEGINNING TO LOOK LIKE TRADING, e.g., trading, ad serving. STREAMS ARE WHERE REAL-TIME INSIGHT COMES FROM :: Stream processing: insight is extracted as soon as the data shows up. Example in the IA Ventures portfolio: DATASIFT (LARGE + UNSTRUCTURED + REAL-TIME).
  10. BIG DATA = COMPLEX DATA. Extracting value from Big Data is FREAKING HARD. Big Data companies are mash-ups of these different attributes :: we like that at IA Ventures. WE BELIEVE THIS CREATES BARRIERS. STORAGE AND ANALYTICS generally go hand in hand :: LOTS OF DEPENDENCIES.
  11. THE IA VENTURES DEFINITION
  12. At IA Ventures we call this the DATA TAXONOMY: INPUTS on the y-axis, OUTPUTS on the x-axis.
  13. SINGLE SOURCE DATA PLATFORMS: TWITTER. Data generated on its platform, consumed as a discrete data stream. People come to Twitter for the stream. Higher-order enrichment delivered by others.
  14. THIRD PARTY DATA PLATFORMS: DATASIFT. Ingests a variety of streams from a range of platforms: Twitter, WordPress, LinkedIn, etc. ENRICHES THOSE STREAMS with analytics and other forms of data like SENTIMENT AND REPUTATION. Can either consume a pure data product (the Twitter firehose) or OVERLAY ADDED VALUE TO EXTRACT INSIGHT.
  15. MORE SOPHISTICATED PRODUCTIZATION AROUND THE DATA ASSET: PLACEIQ. MULTI-SOURCE: GEO DATA, WEATHER DATA, TRAFFIC DATA, ETC. COMPLEX ALGORITHMS, e.g., looking at the relationship among brand, weather forecast and time of day to optimize ad placement and offers. Create and maintain competitive advantage through FRESHNESS: TIMELY and ACTIONABLE information.
  16. SINGLE SOURCE PLATFORMS WITH RICH PRODUCT OFFERINGS. Represent a phase change: Big Data companies who don’t sell data BUT USE DATA TO OPTIMIZE PRODUCT. AMAZON: rich trove of user data that is leveraged to optimize both user experience and economic outcomes. REAL-TIME PERSONALIZATION, HYPER-CONTEXTUAL.
  17. MULTI-SOURCE, HIGHLY REFINED PRODUCT: FUSING INTERNAL AND EXTERNAL DATA FOR MAXIMUM COMPETITIVE ADVANTAGE. WAL-MART: Intersection of historical user behavior, inventory levels and weather data to optimize a promotion, shipping patterns, buying policy, etc. RENAISSANCE TECHNOLOGIES: Buy massive amounts of external data. Create their own metadata. Index and archive petabytes of data for historical analysis, model creation and calibration. The firm’s success (massive absolute and relative returns) is the ultimate example of A HIGHER-ORDER DATA-DRIVEN PRODUCT.
  18. THE TREND AS SIMPLE DATA BECOMES COMMODITIZED and ACTIONABLE INSIGHTS ARE WHAT CUSTOMERS REALLY WANT, AND ARE WILLING TO PAY FOR
  19. EXECUTION is TABLE STAKES TO PLAY THE GAME
  20. SO IF IT’S NOT ABOUT TECHNOLOGY AND ALGORITHMS, WHAT IS IT ABOUT??
  21. The rise of the CONTRIBUTORY DATABASE: DATA EXHIBITING TRADITIONAL NETWORK EFFECTS. These companies TRANSCEND SMART ALGORITHMS. In the SHORT RUN, SMARTER ALGOS provide a needed edge to gain early adoption (OUT-EXECUTE everyone else). In the LONG RUN, at scale, USER-CONTRIBUTED DATA IS WHAT CREATES THE COMPETITIVE MOAT. BILLGUARD.
  22. The rise of DATA ECONOMIES OF SCALE. Day 1: not much data, not much value. As the data asset builds, insights are gleaned, fed back into the product, users interact with the product and create more valuable usage data. BANKSIMPLE, PLACEIQ.
  23. CORE COMPETENCIES FOR A BIG DATA COMPANY
  24. Machine learning: great skills, mathematically grounded, but inability to bring deep industry knowledge to problem-solving. Research: strong industry knowledge and mathematical grounding but inability to operate at scale. Danger zone: strong dev skills plus industry knowledge but without analytical rigor. DATA SCIENTISTS ARE TRUE UNICORNS.
  25. NOT ONLY ABOUT DATA SCIENTISTS AND TECHNOLOGISTS, but DATA CENTRIC LEADERSHIP
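
Note 9 describes stream processing as extracting insight as soon as the data shows up. The following is a minimal sketch of that idea, assuming events arrive as (timestamp, value) pairs in time order; the 60-second window and the function name are illustrative choices, not anything specified in the deck.

```python
from collections import deque


def sliding_window_mean(events, window_seconds=60.0):
    """Yield (timestamp, mean of values in the trailing window) for each event.

    `events` is any iterable of (timestamp_seconds, value) pairs in
    non-decreasing timestamp order. Each result is produced as soon as its
    event arrives, without waiting for the stream to end.
    """
    window = deque()      # (timestamp, value) pairs currently inside the window
    running_sum = 0.0
    for ts, value in events:
        window.append((ts, value))
        running_sum += value
        # Evict events that have aged out of the trailing window.
        while window and window[0][0] <= ts - window_seconds:
            _, old_value = window.popleft()
            running_sum -= old_value
        yield ts, running_sum / len(window)


if __name__ == "__main__":
    stream = [(0, 10.0), (30, 20.0), (61, 30.0), (130, 40.0)]
    for ts, mean in sliding_window_mean(stream):
        print(ts, mean)
```

Because the function is a generator, it keeps only the current window in memory and emits a fresh figure per event. This matches the rapid value decay the note describes: old events drop out as new ones arrive.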