THE BUSINESS ADVANTAGE OF
    HADOOP: LESSONS FROM THE FIELD
    Matt Aslett, Research Manager, 451 Research
    Mike Olson, CEO, Cloudera
    Bill Theisinger, Executive Director, Platform Data Services, YP
    Aaron Wiebe, BlackBerry Infrastructure Architect, Research In Motion




Introducing our Speakers
      Matt Aslett, Mike Olson, Bill Theisinger, Aaron Wiebe




Big Data, Total Data… Hadoop

  Matt Aslett - @maslett
   • Research manager, data
     management and analytics

  Total Data
   • Assesses data management
     approaches in an era of ‘big data’
   • Explores the drivers behind new
     approaches to data management
     and analytics
   • Explains the new and existing
     technologies used to store,
     process and deliver value from
     data



                           © 2012 by The 451 Group. All rights reserved
‘Big Data’
 “Big data” describes the realization of greater business intelligence
  by storing, processing and analyzing data that was previously
  ignored due to the limitations of traditional data management
  technologies to handle its volume, velocity and/or variety.




   Volume: The volume of data is too large for traditional database
   software tools to cope with

   Velocity: The data is being produced at a rate that is beyond the
   performance limits of traditional systems

   Variety: The data lacks the structure to make it suitable for storage
   and analysis in traditional databases and data warehouses


‘Total Data’
 The adoption of non-traditional data processing technologies is
   driven not just by the nature of the data, but also by the user’s
   particular data processing requirements.




Totality: The desire to process and analyze data in its entirety, rather
than analyzing a sample of data and extrapolating the results.

Exploration: The interest in exploratory analytic approaches, in which
schema is defined in response to the nature of the query.

Frequency: The desire to increase the rate of analysis in order to
generate more accurate and timely business intelligence.

Dependency: The reliance on existing technologies and skills, and the need
to balance investment in those existing technologies and skills with the
adoption of new techniques.

A virtuous circle?
 Increased use of interactive applications
and data-generating machines


 New commercial opportunities for
analyzing previously ignored data

 Increased desire to store and
process all available data


 More economically feasible to store
and process previously ignored data

 New infrastructure investments to
support new data processing software


What is Apache Hadoop?
 Distributed data storage (HDFS) and processing (MapReduce)
 Multiple associated data management projects

 • Open source
 • Vendor-supported
 • Clusters of commodity servers
 • Storage of large data volumes
 • Structured, unstructured and
   semi-structured data
 • Flexible, schema-on-read
   processing
 • Complex data sets
 • Connectors to existing
   databases, data integration
   and business intelligence tools

 Components shown: Hadoop Common, HDFS, MapReduce, HBase, Chukwa,
   Avro, Sqoop, ZooKeeper, Mahout, Pig, Flume, Whirr, Hama, Hive



What is Apache Hadoop for?


 Big-data storage: Hadoop as a platform for storing data that
 could not previously be efficiently stored.

 Big-data integration: Hadoop as a large scale data ingestion/ETL
 layer that complements existing databases.

 Big-data analytics: Hadoop as a platform for new exploratory
 analytic applications.



THE EVOLUTION OF HADOOP
    And how it’s used in the real world today




    Mike Olson
    CEO & Co-Founder, Cloudera




Fastest sort of a TB: 62 seconds
over 1,460 nodes

Sorted a PB in 16.25 hours
over 3,658 nodes
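
As a rough back-of-the-envelope reading of the two sort figures above, the sketch below converts each run into aggregate and per-node throughput. It assumes decimal units (1 TB = 10^12 bytes, 1 PB = 10^15 bytes), which the slide does not specify, and the function name `sort_throughput` is hypothetical.

```python
from typing import Tuple

TB = 10**12  # assumed decimal terabyte; the slide does not state the unit
PB = 10**15  # assumed decimal petabyte


def sort_throughput(total_bytes: float, seconds: float, nodes: int) -> Tuple[float, float]:
    """Return (aggregate GB/s, per-node MB/s) for one sort run."""
    aggregate_bytes_per_sec = total_bytes / seconds
    per_node_bytes_per_sec = aggregate_bytes_per_sec / nodes
    return aggregate_bytes_per_sec / 10**9, per_node_bytes_per_sec / 10**6


# Figures taken from the slide: 1 TB in 62 s on 1,460 nodes,
# and 1 PB in 16.25 hours on 3,658 nodes.
tb_run = sort_throughput(1 * TB, 62, 1460)
pb_run = sort_throughput(1 * PB, 16.25 * 3600, 3658)

print(f"1 TB sort: {tb_run[0]:.1f} GB/s aggregate, {tb_run[1]:.1f} MB/s per node")
print(f"1 PB sort: {pb_run[0]:.1f} GB/s aggregate, {pb_run[1]:.1f} MB/s per node")
```

Under these assumptions both runs sustain a similar aggregate rate, with the larger job handled by more nodes and more time rather than by a faster single machine.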
CORE HADOOP COMPONENTS
   Apache Hadoop is a platform for
   data storage and processing that is…

   Scalable
   Fault tolerant
   Open source

   Hadoop Distributed File System (HDFS): File Sharing & Data
   Protection Across Physical Servers

   MapReduce: Distributed Computing Across Physical Servers

 Has the Flexibility to Store and Mine Any Type of Data
   Ask questions across structured and unstructured data that were
   previously impossible to ask or solve
   Not bound by a single schema

 Excels at Processing Complex Data
   Scale-out architecture divides workloads across multiple nodes
   Flexible file system eliminates ETL bottlenecks

 Scales Economically
   Can be deployed on commodity hardware
   Open source platform guards against vendor lock-in

©2011 Cloudera, Inc. All Rights Reserved.
2008     CLOUDERA FOUNDED BY MIKE OLSON, AMR AWADALLAH & JEFF HAMMERBACHER
2009     CDH: FIRST COMMERCIAL APACHE HADOOP DISTRIBUTION
2009     HADOOP CREATOR DOUG CUTTING JOINS CLOUDERA
2010     CLOUDERA MANAGER: FIRST MANAGEMENT APPLICATION FOR HADOOP
2011     CLOUDERA REACHES 100 PRODUCTION CUSTOMERS
2011     CLOUDERA UNIVERSITY EXPANDS TO 140 COUNTRIES
2012     CLOUDERA ENTERPRISE 4: THE STANDARD FOR HADOOP IN THE ENTERPRISE
2012     CLOUDERA CONNECT REACHES 300 PARTNERS
BEYOND…  TRANSFORMING HOW COMPANIES THINK ABOUT DATA
         CHANGING THE WORLD ONE PETABYTE AT A TIME




CLOUDERA ENTERPRISE

       CLOUDERA SUPPORT:
       OUR TEAM OF EXPERTS ON CALL TO HELP YOU MEET YOUR SERVICE
       LEVEL AGREEMENTS (SLAS)

       CLOUDERA MANAGER:
       END-TO-END MANAGEMENT APPLICATION FOR THE DEPLOYMENT &
       OPERATION OF CDH

       CDH:
       BIG DATA STORAGE, PROCESSING & ANALYTICS PLATFORM BASED
       ON APACHE HADOOP – 100% OPEN SOURCE

EDUCATION
       DEVELOPERS
       ADMINISTRATORS
       DATA SCIENTISTS
       CERTIFICATION PROGRAMS

PROFESSIONAL SERVICES
       USE CASE DISCOVERY
       NEW HADOOP DEPLOYMENT
       PROOF OF CONCEPT
       PRODUCTION PILOTS
       PROCESS & TEAM DEVELOPMENT
       DEPLOYMENT CERTIFICATION




 Cloudera’s software is never installed all by itself

  It’s always deployed alongside mission-critical
     systems that represent enormous investment
  Extracting value from data requires sharing it
     across boundaries and among systems


 Goal: The right storage and the right
  processing in the right place at the right time


©2012 Cloudera, Inc. All Rights Reserved.
✛ Disparate data sources
 ✛ Disparate systems for transforming, processing
   and analyzing data
 ✛ Disparate systems for capturing and reporting
   data, and for enforcing business and legislative
   governance requirements

 All need to be connected for usability and to
   unlock the unique value of each


Consulting Services
Cloudera University

Users: Operators, Engineers, Analysts, Business Users, Customers

Tools: Management Tools, IDEs, BI / Analytics, Enterprise Reporting,
Web Application

Cloudera Enterprise
      • CDH
      • Cloudera Manager
      • Technical Support

Connected systems: Enterprise Data Warehouse, Operational Rules Engines

Data sources: Logs, Files, Web Data, Relational Databases




INDUSTRY           DATA PROCESSING              ADVANCED ANALYTICS
Web                Clickstream Sessionization   Social Network Analysis
Media              Engagement                   Content Optimization
Telecom            Mediation                    Network Analytics
Retail             Data Factory                 Loyalty & Promotions
Financial          Trade Reconciliation         Fraud Analysis
Government         Signal Intelligence (SIGINT) Entity Analysis
Biotech / Pharma   Genome Mapping               Sequencing Analysis
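
The "Clickstream Sessionization" entry in the table above can be illustrated with a minimal sketch. It is plain Python rather than an actual Hadoop job, and the 30-minute inactivity gap, the `(user_id, timestamp)` event shape and the `sessionize` function name are all assumptions made for illustration, not details from the slides. Grouping by user stands in for the map and shuffle steps, and splitting on gaps stands in for the reduce step.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

SESSION_GAP_SECONDS = 30 * 60  # assumed inactivity timeout; not from the slides


def sessionize(
    events: Iterable[Tuple[str, int]],
    gap: int = SESSION_GAP_SECONDS,
) -> Dict[str, List[List[int]]]:
    """Group click timestamps (in seconds) into per-user sessions."""
    # "Map/shuffle": key every event by its user id.
    by_user: Dict[str, List[int]] = defaultdict(list)
    for user_id, timestamp in events:
        by_user[user_id].append(timestamp)

    # "Reduce": sort each user's clicks and start a new session
    # whenever the gap since the previous click exceeds the timeout.
    sessions: Dict[str, List[List[int]]] = {}
    for user_id, stamps in by_user.items():
        stamps.sort()
        user_sessions = [[stamps[0]]]
        for timestamp in stamps[1:]:
            if timestamp - user_sessions[-1][-1] > gap:
                user_sessions.append([timestamp])
            else:
                user_sessions[-1].append(timestamp)
        sessions[user_id] = user_sessions
    return sessions


clicks = [("a", 0), ("b", 50), ("a", 100), ("a", 5000)]
result = sessionize(clicks)
print(result)
```

In a real cluster the same grouping would be expressed as a MapReduce, Pig or Hive job over log files; the sketch only shows the logic applied to each user's events.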
Hadoop@YP



                                                                                                                                                                      Sept 26, 2012
William Theisinger
Executive Director, Platform Computing


     © 2012 YP Holdings LLC Intellectual Property. All rights reserved. YP Holdings LLC, the YP Holdings LLC logo and all other YP Holdings LLC marks contained herein are
 trademarks of YP Holdings LLC Intellectual Property and/or YP Holdings LLC affiliated companies. All other marks contained herein are the property of their respective owners.
                                                                             (INTERNAL USE ONLY)
Challenges




What we were facing

• Increasing volume of traffic data through our distribution
  network
• Need for a system to support changing data complexity and
  detail
• Adhere to tighter SLAs
• Provide intra-day reporting
• Benefit from the intelligence trapped in our data




Legacy processing flow


Application Log Data → Data Layer → ETL processing → Data Load (×3) → Data Warehouse




• Dropped reportable events on the floor
• Loading multiple DBs
• Processing time was significant
• Reporting lag was in days, not hours
• High maintenance effort required


Hadoop Platform




Hadoop processing flow



Applications → LWES → Data Collection → Data Layer → Hadoop Platform → Data Warehouse




• All ETL processing in Hadoop
• Several systems integrate with the Hadoop platform
• All Java MapReduce, with some Hive for end users and
  dependent systems
• Reporting lag in hours, not days
• Actual reduction in maintenance needs
Next Generation




Hadoop processing flow


Applications → LWES → Data Collection → Data Layer → Hadoop Platform → Data Warehouse and HBase Platform



• Migrating some reporting to HBase
• Exposing core business KPIs via APIs
• Replacing various data marts with HBase tables/schemas
• Reducing TCO
• Alignment of core skill sets


Hadoop @ Research In Motion
Aaron Wiebe
BlackBerry Infrastructure Architect
Internal Use Only




 The Problem

 1. BlackBerry Services currently generate 500TB of
    instrumentation data daily (and growing rapidly).


 2. Traditional systems unable to cope with both growth and
    access requests.


 3. Total global dataset of ~100PB.
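
To put the figures above in perspective, the following sketch turns the daily volume into a sustained ingest rate and compares it with the total dataset. It assumes decimal units (1 TB = 10^12 bytes, 1 PB = 10^15 bytes) and a constant daily volume, neither of which the slide states; the variable names are illustrative.

```python
TB = 10**12  # assumed decimal terabyte
PB = 10**15  # assumed decimal petabyte
SECONDS_PER_DAY = 24 * 60 * 60

daily_bytes = 500 * TB   # "500TB of instrumentation data daily"
total_bytes = 100 * PB   # "Total global dataset of ~100PB"

# Average rate needed to absorb one day's data within that day.
sustained_gb_per_sec = daily_bytes / SECONDS_PER_DAY / 10**9

# How many days of data, at the current daily volume, equal the total dataset.
days_equivalent = total_bytes / daily_bytes

print(f"Sustained ingest: {sustained_gb_per_sec:.2f} GB/s")
print(f"Total dataset equals about {days_equivalent:.0f} days at the current rate")
```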
Confidential and Proprietary




 The Old Way
 [Diagram: Services feed a Filter and Split stage, which fans out to
 Event Monitoring and Alerting, two Streaming ETL pipelines feeding
 Complex Correlation and the Data Warehouse, and Archive Storage.]




 1. Focus on reducing data to required data set
 2. Pipeline data flows to avoid hitting disk
 3. Scalability issues at most stages
 4. Going back to the Archive was really time consuming




 The Hadoop Way
 [Diagram: Services feed a Filter and Split stage, which fans out to
 Event Monitoring and Alerting and to Hadoop. Hadoop hosts Archive
 Storage, ETL, Correlation and the Stage 1 DWH, and feeds the
 Data Warehouse.]




 1. Archive storage moved to HDFS
 2. ETL processes converted to Hadoop (Pig+Hive)
 3. Some data warehouse functions migrating to Hadoop






 Real Results

 1. 90% code base reduction for ETL Tools
 2. Example Performance:
      - Previous Ad-Hoc query would take around 4 days
      - Now takes 53 minutes
 3. Significant capital cost reductions over previous system
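
The example performance figures above imply a speedup that is easy to work out. The sketch below treats "around 4 days" as exactly four 24-hour days, which is an assumption; the variable names are illustrative.

```python
MINUTES_PER_DAY = 24 * 60

before_minutes = 4 * MINUTES_PER_DAY  # "around 4 days", taken as exactly 4 days
after_minutes = 53                    # "Now takes 53 minutes"

speedup = before_minutes / after_minutes
time_saved_pct = (1 - after_minutes / before_minutes) * 100

print(f"Speedup: about {speedup:.0f}x")
print(f"Query time reduced by about {time_saved_pct:.1f}%")
```

Because the original figure is approximate, the result is best read as "roughly a hundredfold" rather than as a precise ratio.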



Introducing our Speakers
      Matt Aslett, Mike Olson, Bill Theisinger, Aaron Wiebe




32

Contenu connexe

Tendances

Customer Best Practices: Optimizing Cloudera on AWS
Customer Best Practices: Optimizing Cloudera on AWSCustomer Best Practices: Optimizing Cloudera on AWS
Customer Best Practices: Optimizing Cloudera on AWSCloudera, Inc.
 
快速数据快速分析引擎-Kudu
快速数据快速分析引擎-Kudu快速数据快速分析引擎-Kudu
快速数据快速分析引擎-KuduJianwei Li
 
Moving Beyond Lambda Architectures with Apache Kudu
Moving Beyond Lambda Architectures with Apache KuduMoving Beyond Lambda Architectures with Apache Kudu
Moving Beyond Lambda Architectures with Apache KuduCloudera, Inc.
 
Analyzing Hadoop Data Using Sparklyr

Analyzing Hadoop Data Using Sparklyr
Analyzing Hadoop Data Using Sparklyr

Analyzing Hadoop Data Using Sparklyr
Cloudera, Inc.
 
Intro to HDFS and MapReduce
Intro to HDFS and MapReduceIntro to HDFS and MapReduce
Intro to HDFS and MapReduceRyan Tabora
 
Part 2: Cloudera’s Operational Database: Unlocking New Benefits in the Cloud
Part 2: Cloudera’s Operational Database: Unlocking New Benefits in the CloudPart 2: Cloudera’s Operational Database: Unlocking New Benefits in the Cloud
Part 2: Cloudera’s Operational Database: Unlocking New Benefits in the CloudCloudera, Inc.
 
Part 2: A Visual Dive into Machine Learning and Deep Learning 

Part 2: A Visual Dive into Machine Learning and Deep Learning 
Part 2: A Visual Dive into Machine Learning and Deep Learning 

Part 2: A Visual Dive into Machine Learning and Deep Learning 
Cloudera, Inc.
 
Data Science and Machine Learning for the Enterprise
Data Science and Machine Learning for the EnterpriseData Science and Machine Learning for the Enterprise
Data Science and Machine Learning for the EnterpriseCloudera, Inc.
 
Evolution from Apache Hadoop to the Enterprise Data Hub by Cloudera - ArabNet...
Evolution from Apache Hadoop to the Enterprise Data Hub by Cloudera - ArabNet...Evolution from Apache Hadoop to the Enterprise Data Hub by Cloudera - ArabNet...
Evolution from Apache Hadoop to the Enterprise Data Hub by Cloudera - ArabNet...ArabNet ME
 
Cloudera Big Data Integration Speedpitch at TDWI Munich June 2017
Cloudera Big Data Integration Speedpitch at TDWI Munich June 2017Cloudera Big Data Integration Speedpitch at TDWI Munich June 2017
Cloudera Big Data Integration Speedpitch at TDWI Munich June 2017Stefan Lipp
 
Hadoop in the cloud – The what, why and how from the experts
Hadoop in the cloud – The what, why and how from the expertsHadoop in the cloud – The what, why and how from the experts
Hadoop in the cloud – The what, why and how from the expertsDataWorks Summit
 
Hadoop Summit San Jose 2015: YARN - Past, Present and Future
Hadoop Summit San Jose 2015: YARN - Past, Present and FutureHadoop Summit San Jose 2015: YARN - Past, Present and Future
Hadoop Summit San Jose 2015: YARN - Past, Present and FutureVinod Kumar Vavilapalli
 
NOVA Data Science Meetup 2-21-2018 Presentation Cloudera Data Science Workbench
NOVA Data Science Meetup 2-21-2018 Presentation Cloudera Data Science WorkbenchNOVA Data Science Meetup 2-21-2018 Presentation Cloudera Data Science Workbench
NOVA Data Science Meetup 2-21-2018 Presentation Cloudera Data Science WorkbenchNOVA DATASCIENCE
 
巨量資料入門 The evolution of data architecture
巨量資料入門 The evolution of data architecture巨量資料入門 The evolution of data architecture
巨量資料入門 The evolution of data architectureWei-Chiu Chuang
 
Simplifying Real-Time Architectures for IoT with Apache Kudu
Simplifying Real-Time Architectures for IoT with Apache KuduSimplifying Real-Time Architectures for IoT with Apache Kudu
Simplifying Real-Time Architectures for IoT with Apache KuduCloudera, Inc.
 
Hadoop Distributed File System (HDFS) Encryption with Cloudera Navigator Key ...
Hadoop Distributed File System (HDFS) Encryption with Cloudera Navigator Key ...Hadoop Distributed File System (HDFS) Encryption with Cloudera Navigator Key ...
Hadoop Distributed File System (HDFS) Encryption with Cloudera Navigator Key ...Cloudera, Inc.
 
大数据数据治理及数据安全
大数据数据治理及数据安全大数据数据治理及数据安全
大数据数据治理及数据安全Jianwei Li
 
Introducing Cloudera Navigator Optimizer: Offload Assessments and Active Data...
Introducing Cloudera Navigator Optimizer: Offload Assessments and Active Data...Introducing Cloudera Navigator Optimizer: Offload Assessments and Active Data...
Introducing Cloudera Navigator Optimizer: Offload Assessments and Active Data...Cloudera, Inc.
 
Making Sense of Big data with Hadoop
Making Sense of Big data with HadoopMaking Sense of Big data with Hadoop
Making Sense of Big data with HadoopGwen (Chen) Shapira
 
Using hadoop to expand data warehousing
Using hadoop to expand data warehousingUsing hadoop to expand data warehousing
Using hadoop to expand data warehousingDataWorks Summit
 

Tendances (20)

Customer Best Practices: Optimizing Cloudera on AWS
Customer Best Practices: Optimizing Cloudera on AWSCustomer Best Practices: Optimizing Cloudera on AWS
Customer Best Practices: Optimizing Cloudera on AWS
 
快速数据快速分析引擎-Kudu
快速数据快速分析引擎-Kudu快速数据快速分析引擎-Kudu
快速数据快速分析引擎-Kudu
 
Moving Beyond Lambda Architectures with Apache Kudu
Moving Beyond Lambda Architectures with Apache KuduMoving Beyond Lambda Architectures with Apache Kudu
Moving Beyond Lambda Architectures with Apache Kudu
 
Analyzing Hadoop Data Using Sparklyr

Analyzing Hadoop Data Using Sparklyr
Analyzing Hadoop Data Using Sparklyr

Analyzing Hadoop Data Using Sparklyr

 
Intro to HDFS and MapReduce
Intro to HDFS and MapReduceIntro to HDFS and MapReduce
Intro to HDFS and MapReduce
 
Part 2: Cloudera’s Operational Database: Unlocking New Benefits in the Cloud
Part 2: Cloudera’s Operational Database: Unlocking New Benefits in the CloudPart 2: Cloudera’s Operational Database: Unlocking New Benefits in the Cloud
Part 2: Cloudera’s Operational Database: Unlocking New Benefits in the Cloud
 
Part 2: A Visual Dive into Machine Learning and Deep Learning 

Part 2: A Visual Dive into Machine Learning and Deep Learning 
Part 2: A Visual Dive into Machine Learning and Deep Learning 

Part 2: A Visual Dive into Machine Learning and Deep Learning 

 
Data Science and Machine Learning for the Enterprise
Data Science and Machine Learning for the EnterpriseData Science and Machine Learning for the Enterprise
Data Science and Machine Learning for the Enterprise
 
Evolution from Apache Hadoop to the Enterprise Data Hub by Cloudera - ArabNet...
Evolution from Apache Hadoop to the Enterprise Data Hub by Cloudera - ArabNet...Evolution from Apache Hadoop to the Enterprise Data Hub by Cloudera - ArabNet...
Evolution from Apache Hadoop to the Enterprise Data Hub by Cloudera - ArabNet...
 
Cloudera Big Data Integration Speedpitch at TDWI Munich June 2017
Cloudera Big Data Integration Speedpitch at TDWI Munich June 2017Cloudera Big Data Integration Speedpitch at TDWI Munich June 2017
Cloudera Big Data Integration Speedpitch at TDWI Munich June 2017
 
Hadoop in the cloud – The what, why and how from the experts
Hadoop in the cloud – The what, why and how from the expertsHadoop in the cloud – The what, why and how from the experts
Hadoop in the cloud – The what, why and how from the experts
 
Hadoop Summit San Jose 2015: YARN - Past, Present and Future
Hadoop Summit San Jose 2015: YARN - Past, Present and FutureHadoop Summit San Jose 2015: YARN - Past, Present and Future
Hadoop Summit San Jose 2015: YARN - Past, Present and Future
 
NOVA Data Science Meetup 2-21-2018 Presentation Cloudera Data Science Workbench
NOVA Data Science Meetup 2-21-2018 Presentation Cloudera Data Science WorkbenchNOVA Data Science Meetup 2-21-2018 Presentation Cloudera Data Science Workbench
NOVA Data Science Meetup 2-21-2018 Presentation Cloudera Data Science Workbench
 
巨量資料入門 The evolution of data architecture
巨量資料入門 The evolution of data architecture巨量資料入門 The evolution of data architecture
巨量資料入門 The evolution of data architecture
 
Simplifying Real-Time Architectures for IoT with Apache Kudu
Simplifying Real-Time Architectures for IoT with Apache KuduSimplifying Real-Time Architectures for IoT with Apache Kudu
Simplifying Real-Time Architectures for IoT with Apache Kudu
 
Hadoop Distributed File System (HDFS) Encryption with Cloudera Navigator Key ...
Hadoop Distributed File System (HDFS) Encryption with Cloudera Navigator Key ...Hadoop Distributed File System (HDFS) Encryption with Cloudera Navigator Key ...
Hadoop Distributed File System (HDFS) Encryption with Cloudera Navigator Key ...
 
大数据数据治理及数据安全
大数据数据治理及数据安全大数据数据治理及数据安全
大数据数据治理及数据安全
 
Introducing Cloudera Navigator Optimizer: Offload Assessments and Active Data...
Introducing Cloudera Navigator Optimizer: Offload Assessments and Active Data...Introducing Cloudera Navigator Optimizer: Offload Assessments and Active Data...
Introducing Cloudera Navigator Optimizer: Offload Assessments and Active Data...
 
Making Sense of Big data with Hadoop
Making Sense of Big data with HadoopMaking Sense of Big data with Hadoop
Making Sense of Big data with Hadoop
 
Using hadoop to expand data warehousing
Using hadoop to expand data warehousingUsing hadoop to expand data warehousing
Using hadoop to expand data warehousing
 

En vedette

Powering the Internet of Things with Apache Hadoop
Powering the Internet of Things with Apache HadoopPowering the Internet of Things with Apache Hadoop
Powering the Internet of Things with Apache HadoopCloudera, Inc.
 
Cloudera for Internet of Things
Cloudera for Internet of ThingsCloudera for Internet of Things
Cloudera for Internet of ThingsCloudera, Inc.
 
Ask Bigger Questions with Cloudera and Apache Hadoop - Big Data Day Paris 2013
Ask Bigger Questions with Cloudera and Apache Hadoop - Big Data Day Paris 2013Ask Bigger Questions with Cloudera and Apache Hadoop - Big Data Day Paris 2013
Ask Bigger Questions with Cloudera and Apache Hadoop - Big Data Day Paris 2013Publicis Sapient Engineering
 
Cloudera cluster setup and configuration
Cloudera cluster setup and configurationCloudera cluster setup and configuration
Cloudera cluster setup and configurationSudheer Kondla
 
Cloudera Impala: A modern SQL Query Engine for Hadoop
Cloudera Impala: A modern SQL Query Engine for HadoopCloudera Impala: A modern SQL Query Engine for Hadoop
Cloudera Impala: A modern SQL Query Engine for HadoopCloudera, Inc.
 
Using Morphlines for On-the-Fly ETL
Using Morphlines for On-the-Fly ETLUsing Morphlines for On-the-Fly ETL
Using Morphlines for On-the-Fly ETLCloudera, Inc.
 
Introduction to Hadoop and Cloudera, Louisville BI & Big Data Analytics Meetup
Introduction to Hadoop and Cloudera, Louisville BI & Big Data Analytics MeetupIntroduction to Hadoop and Cloudera, Louisville BI & Big Data Analytics Meetup
Introduction to Hadoop and Cloudera, Louisville BI & Big Data Analytics Meetupiwrigley
 
Building a Modern Analytic Database with Cloudera 5.8
Building a Modern Analytic Database with Cloudera 5.8Building a Modern Analytic Database with Cloudera 5.8
Building a Modern Analytic Database with Cloudera 5.8Cloudera, Inc.
 
Architectural considerations for Hadoop Applications
Architectural considerations for Hadoop ApplicationsArchitectural considerations for Hadoop Applications
Architectural considerations for Hadoop Applicationshadooparchbook
 
Gartner Data and Analytics Summit: Bringing Self-Service BI & SQL Analytics ...
 Gartner Data and Analytics Summit: Bringing Self-Service BI & SQL Analytics ... Gartner Data and Analytics Summit: Bringing Self-Service BI & SQL Analytics ...
Gartner Data and Analytics Summit: Bringing Self-Service BI & SQL Analytics ...Cloudera, Inc.
 
Dynamic Empowerment Webinar #1--The Power of Goals
Dynamic Empowerment Webinar #1--The Power of GoalsDynamic Empowerment Webinar #1--The Power of Goals
Dynamic Empowerment Webinar #1--The Power of Goalsaltonbaird
 
Disueña tu profesión. Disueña tu barrio. Disueña tu vida
Disueña tu profesión. Disueña tu barrio. Disueña tu vidaDisueña tu profesión. Disueña tu barrio. Disueña tu vida
Disueña tu profesión. Disueña tu barrio. Disueña tu vidaRafa Cofiño
 
Η Σπάρτη
Η ΣπάρτηΗ Σπάρτη
Η Σπάρτηvasso76
 
Android Market
Android MarketAndroid Market
Android MarketTeo Romera
 
Daniel Avidor - Deciphering the Viral Code – The Secrets of Redmatch
Daniel Avidor - Deciphering the Viral Code – The Secrets of RedmatchDaniel Avidor - Deciphering the Viral Code – The Secrets of Redmatch
Daniel Avidor - Deciphering the Viral Code – The Secrets of RedmatchMIT Forum of Israel
 

En vedette (20)

Powering the Internet of Things with Apache Hadoop
Powering the Internet of Things with Apache HadoopPowering the Internet of Things with Apache Hadoop
Powering the Internet of Things with Apache Hadoop
 
Cloudera for Internet of Things
Cloudera for Internet of ThingsCloudera for Internet of Things
Cloudera for Internet of Things
 
IoT Data as Service with Hadoop
IoT Data as Service with HadoopIoT Data as Service with Hadoop
IoT Data as Service with Hadoop
 
Ask Bigger Questions with Cloudera and Apache Hadoop - Big Data Day Paris 2013
Ask Bigger Questions with Cloudera and Apache Hadoop - Big Data Day Paris 2013Ask Bigger Questions with Cloudera and Apache Hadoop - Big Data Day Paris 2013
Ask Bigger Questions with Cloudera and Apache Hadoop - Big Data Day Paris 2013
 
The Business Advantage of Hadoop: Lessons from the Field – Cloudera Summer Webinar Series: 451 Research

  • 1. THE BUSINESS ADVANTAGE OF HADOOP: LESSONS FROM THE FIELD. Matt Aslett, Research Manager, 451 Research; Mike Olson, CEO, Cloudera; Bill Theisinger, Executive Director, Platform Data Services, YP; Aaron Wiebe, Blackberry Infrastructure Architect, Research In Motion
  • 2. Introducing our Speakers: Matt Aslett, Mike Olson, Bill Theisinger, Aaron Wiebe
  • 3. Big Data, Total Data… Hadoop. Matt Aslett (@maslett): research manager, data management and analytics. Total Data: assesses data management approaches in an era of ‘big data’; explores the drivers behind new approaches to data management and analytics; explains the new and existing technologies used to store and process and deliver value from data. © 2012 by The 451 Group. All rights reserved
  • 4. ‘Big Data’: “Big data” describes the realization of greater business intelligence by storing, processing and analyzing data that was previously ignored due to the limitations of traditional data management technologies to handle its volume, velocity and/or variety.
      Volume: The volume of data is too large for traditional database software tools to cope with.
      Velocity: The data is being produced at a rate that is beyond the performance limits of traditional systems.
      Variety: The data lacks the structure to make it suitable for storage and analysis in traditional databases and data warehouses.
  • 5. ‘Total Data’: The adoption of non-traditional data processing technologies is driven not just by the nature of the data, but also by the user’s particular data processing requirements.
      Totality: The desire to process and analyze data in its entirety, rather than analyzing a sample of data and extrapolating the results.
      Exploration: The interest in exploratory analytic approaches, in which schema is defined in response to the nature of the query.
      Frequency: The desire to increase the rate of analysis in order to generate more accurate and timely business intelligence.
      Dependency: The reliance on existing technologies and skills, and the need to balance investment in those existing technologies and skills with the adoption of new techniques.
  • 6. A virtuous circle? Increased use of interactive applications and data-generating machines; new commercial opportunities for analyzing previously ignored data; increased desire to store and process all available data; more economically feasible to store and process previously ignored data; new infrastructure investments to support new data processing software.
  • 7. What is Apache Hadoop? Distributed data storage (HDFS) and processing (MapReduce), with multiple associated data management projects.
      Characteristics: open source; vendor-supported; clusters of commodity servers; storage of large data volumes; structured, unstructured and semi-structured data; flexible, schema-on-read processing; complex data sets; connectors to existing databases, data integration and business intelligence tools.
      Projects shown in the diagram: Chukwa, Sqoop, ZooKeeper, Pig, HBase, Avro, Mahout, Flume, MapReduce, Whirr, Hama, HDFS, Hive, Hadoop Common.
  • 8. What is Apache Hadoop for?
      Big-data storage: Hadoop as a platform for storing data that could not previously be efficiently stored.
      Big-data integration: Hadoop as a large scale data ingestion/ETL layer that complements existing databases.
      Big-data analytics: Hadoop as a platform for new exploratory analytic applications.
  • 9. THE EVOLUTION OF HADOOP: And how it’s used in the real world today. Mike Olson, CEO & Co-Founder, Cloudera
  • 10. Fastest sort of a TB: 62 seconds over 1,460 nodes. Sorted a PB in 16.25 hours over 3,658 nodes.
  • 11. CORE HADOOP COMPONENTS: Apache Hadoop is a platform for data storage and processing that is scalable, fault tolerant and open source.
      Hadoop Distributed File System (HDFS): File Sharing & Data Protection Across Physical Servers.
      MapReduce: Distributed Computing Across Physical Servers.
      Has the Flexibility to Store and Mine Any Type of Data: Ask questions across structured and unstructured data that were previously impossible to ask or solve. Not bound by a single schema.
      Excels at Processing Complex Data: Scale-out architecture divides workloads across multiple nodes. Flexible file system eliminates ETL bottlenecks.
      Scales Economically: Can be deployed on commodity hardware. Open source platform guards against vendor lock.
      ©2011 Cloudera, Inc. All Rights Reserved.
  • 12. CHANGING THE WORLD ONE PETABYTE AT A TIME
      2008: CLOUDERA FOUNDED BY MIKE OLSON, AMR AWADALLAH & JEFF HAMMERBACHER
      2009: CDH: FIRST COMMERCIAL APACHE HADOOP DISTRIBUTION
      2009: HADOOP CREATOR DOUG CUTTING JOINS CLOUDERA
      2010: CLOUDERA MANAGER: FIRST MANAGEMENT APPLICATION FOR HADOOP
      2011: CLOUDERA REACHES 100 PRODUCTION CUSTOMERS
      2011: CLOUDERA UNIVERSITY EXPANDS TO 140 COUNTRIES
      2012: CLOUDERA ENTERPRISE 4: THE STANDARD FOR HADOOP IN THE ENTERPRISE
      2012: CLOUDERA CONNECT REACHES 300 PARTNERS
      BEYOND…: TRANSFORMING HOW COMPANIES THINK ABOUT DATA
  • 13. CLOUDERA ENTERPRISE EDUCATION CLOUDERA SUPPORT: OUR TEAM OF EXPERTS ON CALL TO HELP YOU MEET YOUR SERVICE DEVELOPERS LEVEL AGREEMENTS (SLAS) ADMINISTRATORS CLOUDERA MANAGER: END-TO-END MANAGEMENT APPLICATION FOR THE DEPLOYMENT & OPERATION OF CDH DATA SCIENTISTS CDH: BIG DATA STORAGE, PROCESSING & ANALYTICS PLATFORM BASED CERTIFICATION ON APACHE HADOOP – 100% OPEN SOURCE PROGRAMS PROFESSIONAL SERVICES USE CASE NEW HADOOP PROOF OF PRODUCTION PROCESS & TEAM DEPLOYMENT DISCOVERY DEPLOYMENT CONCEPT PILOTS DEVELOPMENT CERTIFICATION 13
  • 14. • Cloudera’s software is never installed all by itself • It’s always deployed alongside mission-critical systems that represent enormous investment • Extracting value from data requires sharing it across boundaries and among systems. Goal: the right storage and the right processing in the right place at the right time. ©2012 Cloudera, Inc. All Rights Reserved.
  • 15. • Disparate data sources • Disparate systems for transforming, processing and analyzing data • Disparate systems for capturing and reporting data, and for enforcing business and legislative governance requirements. All need to be connected for usability and to unlock the unique value of each.
  • 16. [Architecture diagram] Users: Operators, Engineers, Analysts, Business Users, Customers. Tools: Management Tools, IDEs, BI / Analytics, Enterprise Reporting, Web Application. Systems: Enterprise Data Warehouse, Cloudera Enterprise (CDH, Cloudera Manager, Technical Support), Operational Rules Engines. Data sources: Logs, Files, Web Data, Relational Databases. Services: Consulting Services, Cloudera University.
  • 17. INDUSTRY: DATA PROCESSING / ADVANCED ANALYTICS • Web: Clickstream Sessionization / Social Network Analysis • Media: Engagement / Content Optimization • Telecom: Mediation / Network Analytics • Retail: Data Factory / Loyalty & Promotions • Financial: Trade Reconciliation / Fraud Analysis • Government: Signal Intelligence (SIGINT) / Entity Analysis • Biotech / Pharma: Genome Mapping / Sequencing Analysis
  • 19. Hadoop@YP Sept 26, 2012 William Theisinger Executive Director, Platform Computing © 2012 YP Holdings LLC Intellectual Property. All rights reserved. YP Holdings LLC, the YP Holdings LLC logo and all other YP Holdings LLC marks contained herein are trademarks of YP Holdings LLC Intellectual Property and/or YP Holdings LLC affiliated companies. All other marks contained herein are the property of their respective owners. (INTERNAL USE ONLY)
  • 21. What we were facing • Increasing volume of traffic data through our distribution network • Need for a system to support changing data complexity and detail • Need to adhere to tighter SLAs • Need to provide intra-day reporting • Need to benefit from the intelligence trapped in our data
  • 22. Legacy processing flow [Diagram: Application, Log Data Layer, ETL / data processing, multiple Data Load steps, Data Warehouse] • Reportable events were dropped on the floor • Multiple DBs had to be loaded • Processing time was significant • Reporting lag was in days, not hours • High maintenance effort required
  • 24. Hadoop processing flow [Diagram: Applications, LWES, Data Collection Layer, Hadoop Platform, Data Warehouse] • All ETL processing in Hadoop • Several systems integrate with the Hadoop platform • All Java MapReduce, with some Hive for end users and dependent systems • Reporting lag in hours, not days • Actual reduction in maintenance needs
  • 26. Hadoop processing flow [Diagram: Applications, LWES, Data Collection Layer, Hadoop Platform, HBase Platform, Data Warehouse] • Migrating some reporting to HBase • Exposing core business KPIs via APIs • Replacing various data marts with HBase tables/schemas • Reducing TCO • Alignment of core skill sets
  • 27. Hadoop @ Research In Motion Aaron Wiebe BlackBerry Infrastructure Architect
  • 28. Internal Use Only. The Problem: 1. BlackBerry Services currently generate 500 TB of instrumentation data daily (and growing rapidly). 2. Traditional systems are unable to cope with both growth and access requests. 3. Total global dataset of ~100 PB. Confidential and Proprietary
  • 29. The Old Way [Diagram: Services, Filter and Split, Event Monitoring, Alerting, Streaming ETL, Complex Correlation, Streaming ETL, Data Warehouse, Archive Storage] • Focus on reducing data to the required data set • Pipeline data flows to avoid hitting disk • Scalability issues at most stages • Going back to the archive was very time-consuming
  • 30. The Hadoop Way [Diagram: Services, Filter and Split, Event Monitoring, Alerting, Hadoop (Archive Storage, ETL, Correlation), Stage 1 DWH, Data Warehouse] • Archive storage moved to HDFS • ETL processes converted to Hadoop (Pig + Hive) • Some data warehouse functions migrating to Hadoop
  • 31. Real Results • 90% code base reduction for ETL tools • Example performance: a previous ad-hoc query would take around 4 days and now takes 53 minutes • Significant capital cost reductions over the previous system
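For scale, the ad-hoc query example on slide 31 works out to a speedup of roughly two orders of magnitude. The short calculation below uses only the two figures quoted on the slide; "around 4 days" is approximate, so the result is an estimate rather than a measured value.

```python
MINUTES_PER_DAY = 24 * 60

before_minutes = 4 * MINUTES_PER_DAY  # "around 4 days" on the old system
after_minutes = 53                    # reported runtime on Hadoop

speedup = before_minutes / after_minutes
print(f"Approximate speedup: {speedup:.0f}x")
```

That is a little over 100 times faster, which moves such queries from a multi-day wait to something that finishes within an hour.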
  • 32. Introducing our Speakers Matt Mike Bill Aaron Aslett Olson Theisinger Wiebe 32

Editor's notes

  1. Hadoop typically solves two types of problems. Data processing is the first step after collection: data is combined and prepared, and features are extracted and curated. Advanced analytics is where science is applied: extracting and understanding models of how the business operates. The results are then integrated back into business operations. These go by different terms in different industries. The applicability of these solutions is broad. We’ve successfully deployed Hadoop and helped solve a diverse set of business problems.
  2. Speak to the size and scope of the problem. Problems with handling ~100 PB of data using traditional methods.
  3. - Data is lost as pipelines progress. - Going back for information after the fact is hard, if not impossible.
  4. This is where Hadoop fit for us
  5. - But changing to Hadoop has a bigger, more far-reaching impact overall. - Things we couldn’t even consider doing are now feasible.