Strategies for Integrating with Hadoop
Survey Results: May, 2012

www.bileadership.com

Hadoop adoption rates
  No plans         38%
  Considering      32%
  Experimenting    20%
  Implementing      5%
  In production     4%

Based on 158 respondents, BI Leadership Forum, April, 2012

Hadoop workloads today
  Staging area             92%
  Online archive           92%
  Transformation Engine    83%
  Ad hoc queries           58%
  Scheduled reports        42%
  Visual exploration       25%
  Data mining              58%

Based on respondents that have implemented Hadoop. BI Leadership Forum, April, 2012

Hadoop workloads in 18 months
                           Today    In 18 months
  Staging area              92%        92%
  Online archive            92%        92%
  Transformation Engine     83%        92%
  Ad hoc queries            58%        67%
  Scheduled reports         42%        67%
  Visual exploration        25%        67%
  Data mining               58%        83%

Based on respondents that have implemented Hadoop. BI Leadership Forum, April, 2012

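
The shift between the "Today" and "In 18 months" columns can be read as percentage-point changes per workload. The following is a minimal Python sketch of that arithmetic, using the figures transcribed from the slide; the names `WORKLOADS` and `point_changes` are illustrative and not part of the survey.

```python
# Percentages transcribed from the "Hadoop workloads in 18 months" slide:
# workload -> (today, in 18 months). Names here are illustrative only.
WORKLOADS = {
    "Staging area": (92, 92),
    "Online archive": (92, 92),
    "Transformation Engine": (83, 92),
    "Ad hoc queries": (58, 67),
    "Scheduled reports": (42, 67),
    "Visual exploration": (25, 67),
    "Data mining": (58, 83),
}


def point_changes(workloads):
    """Return (workload, change in percentage points), largest increase first."""
    changes = [(name, later - today) for name, (today, later) in workloads.items()]
    return sorted(changes, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for name, change in point_changes(WORKLOADS):
        print(f"{name:<22} {change:+d} points")
```

On these figures, the reporting and analysis workloads (visual exploration, scheduled reports, data mining) show the largest expected increases, while staging and archiving stay flat.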
Hadoop’s impact on the data warehouse
  Replaces it                     0%
  Offloads existing workloads    50%
  Handles new workloads          67%
  Shares existing workloads      33%
  Shares new workloads           25%
  Don't know                      8%

Based on respondents that have implemented Hadoop. BI Leadership Forum, April, 2012

What data does Hadoop support?
                           Today    In 18 months
  Web logs                  67%        75%
  System logs               67%        67%
  Social media              58%        75%
  Transaction data          92%       100%
  Semi-structured data      58%        67%
  Sensor data               17%        42%
  Audio or video             0%        42%
  Email                     25%        42%
  Documents                 33%        50%

Based on respondents that have implemented Hadoop. BI Leadership Forum, April, 2012

Hadoop integration options
[Diagram; labels recovered per row]
  Import/Export:     Relational | API request, Data set, Data dump | Hadoop
  Interoperability:  Relational | SQL query (ODBC, Hive, HCatalog), SQL input, Tables, MR output | Hadoop
  Hybrid:            Relational + Hadoop; MapReduce + Relational
  Native:            Relational | (SQL response), (MR Query) | Hadoop MR App

Adoption Rate of Hadoop by Non-Implementers
  Within 12 months    40%
  Within 24 months    22%
  Within 36 months     5%
  In 3+ years          3%
  Not sure            30%
  Never                0%

Based on 76 respondents that have not yet implemented Hadoop. BI Leadership Forum, April, 2012

Expected Use of Hadoop by Non-Implementers
  Staging area             37%
  Online archive           23%
  Transformation Engine    39%
  Ad hoc queries           45%
  Scheduled reports         5%
  Visual exploration       27%
  Data mining              57%
  Not sure                 23%
  Other                     5%

Based on respondents that have not yet implemented Hadoop. BI Leadership Forum, April, 2012

Data that Non-Implementers Will Store in Hadoop
  Web logs                53%
  System logs             33%
  Social media            47%
  Transaction data        44%
  Semi-structured data    50%
  Sensor data             24%
  Audio or video           8%
  Email                   18%
  Documents               18%
  Not sure                11%

Based on respondents that have not yet implemented Hadoop. BI Leadership Forum, April, 2012


Editor's notes

  1. How can this be done without companies having to hire specialists who know how to query Hadoop using Java or how to overcome latency? Latency: via HCatalog. Query: better interfaces. These won't fix things like user concurrency. This is the aspiration, but many obstacles stand in the way: latency from batch processing, and limited user concurrency because there is no workload management, prioritization, or query optimizer. Users need to know coding.
  2. Offload log data, images, audio/video, data mining, and transformations. Teradata appliances offload certain analytical workloads; Aster offloads unstructured data. This allows Teradata to do more with what it has or to add more structured data.
  3. Figure 9
  4. Hive converts queries into MR; Aster issues standard queries without creating MR jobs.
     Connectivity. Pros: easy to build and use; bring data down to analyze in an RDBMS or in-memory cube. Cons: requires moving data from one system to the other.
     Hybrid systems. Pros: one environment for all data and processing. Cons: redundant if you already have Hadoop or an RDBMS.
     Native Hadoop. Pros: seamless access without translation. Cons: MapReduce latency and external calls.
     Interoperability. Pros: SQL access via the Hadoop API; federated queries. Cons: lack of Hadoop metadata; not bidirectional.
  5. Figure 11
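
Note 4 says that Hive converts queries into MapReduce jobs. As a rough illustration of what such a translation implies, the toy sketch below expresses a query of the form `SELECT source, COUNT(*) ... GROUP BY source` as map, shuffle, and reduce steps in plain Python. It is not Hive's planner or any Hadoop API; the function names and the sample rows are hypothetical.

```python
from collections import defaultdict


def map_phase(rows, key_column):
    """Map step: emit a (key, 1) pair for every input row."""
    for row in rows:
        yield row[key_column], 1


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce step: sum the values for each key to get a count."""
    return {key: sum(values) for key, values in groups.items()}


def count_by(rows, key_column):
    """Toy equivalent of SELECT key_column, COUNT(*) ... GROUP BY key_column."""
    return reduce_phase(shuffle(map_phase(rows, key_column)))


if __name__ == "__main__":
    # Hypothetical sample rows, for illustration only.
    sample = [{"source": "web log"}, {"source": "sensor"}, {"source": "web log"}]
    print(count_by(sample, "source"))
```

On a real cluster each phase runs as a distributed batch job, which is one way to picture the MapReduce latency the notes list as a drawback.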