REALIZING THE PROMISE OF BIG
    DATA WITH HADOOP
    Noel Yuhanna - Forrester
    Omer Trajman - Cloudera
    Jeremy Dyer & Marty Smith - RelayHealth




Hadoop and Big Data
Noel Yuhanna, Principal Analyst




© 2012 Forrester Research, Inc. Reproduction Prohibited
Enterprises have 100s of terabytes or petabytes
 of data but most of it is unused…




Unused data is a valuable asset and should be leveraged!

Big Data - Problem or opportunity?

 Big data presents serious challenges:
     – Strains the current limits of IT infrastructure and resources
     – Requires an upgrade across the stack: storage, compute


 A huge opportunity exists with big data!
     – Improve operational efficiency
     – Offer new insights that can provide competitive advantage
     – Deliver advanced, predictive analytics – with more precision
     – Support activities and analysis that generate revenue and bring businesses
       closer to their customers much faster




Big Data requires a new approach to data
processing and analytics
Organizations need to be able to:
      • Process any data at any given time
      • Manage very large data sets that run into 100s of TB and PBs
      • Process data economically
      • Integrate with many sources of data
      • Support predictive analytics and a self-service data management
        platform




What is Hadoop and how can it help?
Open source software that enables distributed
parallel processing of large amounts of data across
low-cost commodity servers.
 It leverages an extensible framework for building
  advanced analytics and new data management
  capabilities.

 It’s already being commercialized and adopted rapidly
  in enterprises.


[Figure labels: Large amounts of data; Hadoop; Flexible; Distributed processing; Economical; Scalable; Open Source; Insights]
How are organizations adopting Hadoop?
 Hadoop adoption:
      – Current adoption is estimated at 20%, mostly in mid-sized to large
        organizations
      – Adoption is likely to double through 2016
      – Adoption seen across all vertical industries with various use cases
      – Many organizations are currently running a POC or sandbox on the
        Hadoop platform


 How Hadoop will evolve in organizations:
      – Will start out as an independent project focusing on priority analytics
      – Will start to integrate with existing systems, apps and databases
      – Will embed seamlessly into data management and analytical platforms
      – Will become the data platform delivering self-service capabilities

How to get going on the Big Data journey

 Big Data is here to stay! Hadoop is here to stay!
 Hadoop should be part of your data management and BI strategy
 Integrate Hadoop with existing data management, databases and apps
 Hadoop can help save money, deliver new insights and possibilities
 Don’t limit yourself to structured data only
 A big data initiative is not a one-time project; it's an ongoing process




CLOUDERA: THE STANDARD FOR
APACHE HADOOP IN THE ENTERPRISE
OMER TRAJMAN, VP CUSTOMER SOLUTIONS
“YOU CAN’T SOLVE 21ST CENTURY PROBLEMS WITH 20TH CENTURY TECHNOLOGIES”
HOSPITALS NEED MORE COMPREHENSIVE PATIENT INFORMATION
BANKS MUST DETECT FRAUD FASTER
BROADCAST NETWORKS WANT TO DELIVER CUSTOMIZED CONTENT BY HOUSEHOLD
AIRLINES WANT TO UPDATE FLIGHT PRICES IN REAL-TIME
POWER COMPANIES WANT TO SAVE CUSTOMERS MONEY BY ANALYZING USAGE DATA
OIL COMPANIES WANT TO PREDICT THE LOCATION OF DEPOSITS MORE ACCURATELY
RETAILERS WANT TO CREATE MORE TARGETED OFFERS TO CUSTOMERS
PARTICLE PHYSICISTS WANT REAL-TIME DATA FROM THE HADRON COLLIDER
SCIENTIFIC APPROACH
TO DATA REQUIRES…
STORAGE FORMATS
FLEXIBILITY
EXTENSIBILITY
COMPACT STORAGE
FAST LOAD/STORE
WIDELY SUPPORTED
SIX CHARACTERISTICS OF
ENTERPRISE-GRADE HADOOP

1   HIGH AVAILABILITY
    THERE’S NO DOWNTIME. YOUR DATA IS ALWAYS AVAILABLE FOR DECISIONS

2   GRANULAR SECURITY
    PROCESS AND CONTROL SENSITIVE DATA WITH CONFIDENCE

3   ROBUST MANAGEMENT
    ACHIEVE OPTIMAL PERFORMANCE VIA CENTRALIZED ADMINISTRATION

4   SCALABLE AND EXTENSIBLE
    ADAPTS TO YOUR WORKLOAD AND GROWS WITH THE BUSINESS

5   CERTIFIED AND COMPATIBLE
    EXTEND AND LEVERAGE EXISTING INFRASTRUCTURE INVESTMENTS

6   GLOBAL SUPPORT AND SERVICES
    ACHIEVE SLAs AND ADHERE TO EXISTING IT POLICIES
HADOOP PROVIDES A DATA HUB FOR ALL BIG DATA WORKLOADS




       • Brings storage and computation together in one single system
       • Works with every type of data in its native format
       • Changes the economics of data management
APACHE HADOOP
CO-EXISTS WITH EDW, ETL & BI TOOLS
[Diagram labels]
Users: Operators, Engineers, Analysts, Business Users, Customers
Tools: Management Tools, IDEs, BI / Analytics, Enterprise Reporting, Web Application
Cloudera Services: Consulting Services, Cloudera University
Cloudera Enterprise: Cloudera Manager, Cloudera Support
Platform: Cloudera’s Distribution Including Apache Hadoop (CDH) & Cloudera Manager Free Edition
Adjacent systems: Enterprise Data Warehouse, Operational Rules Engines
Data sources: Logs, Files, Web Data, Relational Databases
CLOUDERA’S PARTNER ECOSYSTEM:
WIDEST INTEGRATION
All the industry leaders develop on CDH.

CDH4: Big Data storage, processing and analytics platform based
on Apache Hadoop – 100% open source
Layers: Storage, Computation, Access, Integration

Partner categories: BI / Analytics, Data Integration, Database, OS / Cloud / Sys Mgmt, Hardware
REDEFINE WHAT’S
POSSIBLE WITH
YOUR DATA
Why Hadoop, Why Cloudera, Why Now?




          Agenda
          ✛ RelayHealth (RH) overview
          ✛ What is our need
          ✛ Why our system/data is complicated
          ✛ How Hadoop meets our needs
McKesson Corporation



   ✛   Largest healthcare company in the world
       $103+ billion in revenues; Fortune 15; S&P 500
       Est. 1833
       Headquarters: San Francisco

   ✛   Business
       Distribution Solutions
       Technology Solutions

   ✛   Extensive resource base
       32,000+ employees solely dedicated to healthcare

   ✛   Comprehensive array of solutions
       Significant value through a single relationship

   ✛   Broadest customer base in healthcare
       Experienced partners in improving healthcare
Overview of Financial Solutions




  200,000 Physicians
  2,000 Hospitals
  1,900 Payers / Health Plans

                Provider-to-Payer Interactions
                Total Interactions: 2.4 Billion/Year
Business Challenges



 ✛ Help customers save money
     > Small reductions to time in AR (accounts receivable) yield
       big savings, better cash flow

 ✛ Meet regulatory challenges
     > Must store 7 years of transactional data
What Big Data Means to RelayHealth



                     Every single day:

                     + millions of transactions generated


                     + thousands of files received


                     + 150GB+ log data collected
                         …to be stored for 7 years
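
A rough sizing of the log data alone may make the retention requirement more concrete. This is only a sketch: the 150 GB per day and the 7 years come from the slide, while the 365-day year and the replication factor of 3 (the HDFS default) are assumptions, since the talk does not state the cluster's settings. Transactions and received files are not counted.

```python
# Back-of-the-envelope sizing for the log data alone.
LOG_GB_PER_DAY = 150      # from the slide ("150GB+", so treat as a lower bound)
RETENTION_YEARS = 7       # from the slide
DAYS_PER_YEAR = 365       # assumption: leap days ignored
REPLICATION_FACTOR = 3    # assumption: HDFS default, not stated in the talk


def retained_log_tb(gb_per_day, years, days_per_year=DAYS_PER_YEAR):
    """Raw log volume held once the retention window is full, in decimal TB."""
    return gb_per_day * days_per_year * years / 1000


raw_tb = retained_log_tb(LOG_GB_PER_DAY, RETENTION_YEARS)
replicated_tb = raw_tb * REPLICATION_FACTOR

print(f"raw: {raw_tb:.2f} TB, replicated x{REPLICATION_FACTOR}: {replicated_tb:.2f} TB")
```

Under these assumptions the logs alone reach roughly 383 TB raw, or about 1.15 PB once replicated three ways.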
Why RelayHealth Considered Hadoop




✛ Business requirement around data storage & retrieval


✛ Looked at traditional solutions




    > Database: $$$; not easy to index files
    > File System: untenable when searching
    > Hybrid (File System + Solr): not scalable
Achieving Operational Efficiency with Hadoop & Cloudera




✛ Why Hadoop?
    > Store billions of files across machines
    > Mine data in files using M/R
    > Aggregate log data & search through it using unique
       customer identifying information
    > Store data in its highest fidelity state

✛ Why Cloudera?
    > Core Apache Hadoop leveraging OSS community
    > Integration with other open source solutions:
       HBase, Solr, Camel
    > Committer level knowledge of code & how it works
    > World-class support
    > Cloudera Manager
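
To illustrate the "mine data in files using M/R" and "aggregate log data by customer identifying information" points, here is a minimal single-process sketch of the map, shuffle and reduce steps. It is not RelayHealth's job and does not use the Hadoop API: the log line format, the customer IDs, and every function name below are hypothetical, chosen only to show the shape of the computation.

```python
from collections import defaultdict

# Hypothetical log lines: "<timestamp> <customer_id> <event>".
SAMPLE_LOGS = [
    "2012-08-01T10:00:00 cust-001 claim_submitted",
    "2012-08-01T10:00:05 cust-002 claim_submitted",
    "2012-08-01T10:01:00 cust-001 claim_acknowledged",
    "malformed line",
]


def map_phase(line):
    """Emit (customer_id, event) pairs; skip lines that do not parse."""
    parts = line.split()
    if len(parts) == 3:
        _, customer_id, event = parts
        yield customer_id, event


def shuffle(pairs):
    """Group values by key, as a framework would between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(customer_id, events):
    """Collapse one customer's events into a count and a list of events."""
    return customer_id, {"count": len(events), "events": events}


def run_job(lines):
    """Run map, shuffle and reduce in sequence over an iterable of lines."""
    mapped = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())


index = run_job(SAMPLE_LOGS)
print(index["cust-001"])
```

On a real cluster the map and reduce functions would run in parallel on the machines that hold the file blocks; the grouping by customer ID is what makes the aggregated logs searchable per customer.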
Changing Perception




✛ Simple archive vs. a way to share data across the organization


✛ Building the ability to collect data flowing through our system at all
   points needed

✛ Integrating CDH into the rest of the enterprise
    > Storing data in its highest fidelity state
    > Moving away from traditional warehousing systems
    > Ability to distill data in the cluster for mining in other systems – CDH
      connectors
Summary




✛ Challenge:
   ✛ Shorten healthcare providers’ payment cycles via
      streamlined message processing
      ✛ RDBMS can’t keep up with growing data volumes + data storage
          mandates for regulatory compliance

✛ Solution:
   ✛ Hadoop: scalable, flexible data processing & analysis on
      multi-structured data
      ✛ Cloudera Enterprise: adding expertise, support &
          management tools to open source Hadoop
Q&A

THANK YOU!

REGISTER NOW FOR THE REMAINING
‘POWER OF HADOOP’ WEBINARS:

WHAT THE HADOOP: WHY YOUR BUSINESS CAN’T
AFFORD TO IGNORE THE POWER OF HADOOP
GIGAOM PRO AND CLOUDERA
WEDNESDAY, AUGUST 29, 10AM PST

THE BUSINESS ADVANTAGE OF HADOOP: LESSONS
FROM THE FIELD
451 RESEARCH AND CLOUDERA
THURSDAY, SEPTEMBER 26, 10AM PST

Contenu connexe

Tendances

Part 1: Lambda Architectures: Simplified by Apache Kudu
Part 1: Lambda Architectures: Simplified by Apache KuduPart 1: Lambda Architectures: Simplified by Apache Kudu
Part 1: Lambda Architectures: Simplified by Apache KuduCloudera, Inc.
 
Consolidate your data marts for fast, flexible analytics 5.24.18
Consolidate your data marts for fast, flexible analytics 5.24.18Consolidate your data marts for fast, flexible analytics 5.24.18
Consolidate your data marts for fast, flexible analytics 5.24.18Cloudera, Inc.
 
Get started with Cloudera's cyber solution
Get started with Cloudera's cyber solutionGet started with Cloudera's cyber solution
Get started with Cloudera's cyber solutionCloudera, Inc.
 
快速数据快速分析引擎-Kudu
快速数据快速分析引擎-Kudu快速数据快速分析引擎-Kudu
快速数据快速分析引擎-KuduJianwei Li
 
IT @ Intel: Preparing the Future Enterprise with the Internet of Things
IT @ Intel: Preparing the Future Enterprise with the Internet of ThingsIT @ Intel: Preparing the Future Enterprise with the Internet of Things
IT @ Intel: Preparing the Future Enterprise with the Internet of ThingsIntel IT Center
 
Kudu Forrester Webinar
Kudu Forrester WebinarKudu Forrester Webinar
Kudu Forrester WebinarCloudera, Inc.
 
Cloudera Altus: Big Data in the Cloud Made Easy
Cloudera Altus: Big Data in the Cloud Made EasyCloudera Altus: Big Data in the Cloud Made Easy
Cloudera Altus: Big Data in the Cloud Made EasyCloudera, Inc.
 
Leveraging the cloud for analytics and machine learning 1.29.19
Leveraging the cloud for analytics and machine learning 1.29.19Leveraging the cloud for analytics and machine learning 1.29.19
Leveraging the cloud for analytics and machine learning 1.29.19Cloudera, Inc.
 
IoT: How Data Science Driven Software is Eating the Connected World
IoT: How Data Science Driven Software is Eating the Connected WorldIoT: How Data Science Driven Software is Eating the Connected World
IoT: How Data Science Driven Software is Eating the Connected WorldDataWorks Summit
 
Hadoop in the cloud – The what, why and how from the experts
Hadoop in the cloud – The what, why and how from the expertsHadoop in the cloud – The what, why and how from the experts
Hadoop in the cloud – The what, why and how from the expertsDataWorks Summit
 
Hadoop Essentials -- The What, Why and How to Meet Agency Objectives
Hadoop Essentials -- The What, Why and How to Meet Agency ObjectivesHadoop Essentials -- The What, Why and How to Meet Agency Objectives
Hadoop Essentials -- The What, Why and How to Meet Agency ObjectivesCloudera, Inc.
 
Designing Data Pipelines for Automous and Trusted Analytics
Designing Data Pipelines for Automous and Trusted AnalyticsDesigning Data Pipelines for Automous and Trusted Analytics
Designing Data Pipelines for Automous and Trusted AnalyticsDataWorks Summit
 
Moving Beyond Lambda Architectures with Apache Kudu
Moving Beyond Lambda Architectures with Apache KuduMoving Beyond Lambda Architectures with Apache Kudu
Moving Beyond Lambda Architectures with Apache KuduCloudera, Inc.
 
Simplifying Real-Time Architectures for IoT with Apache Kudu
Simplifying Real-Time Architectures for IoT with Apache KuduSimplifying Real-Time Architectures for IoT with Apache Kudu
Simplifying Real-Time Architectures for IoT with Apache KuduCloudera, Inc.
 
Part 2: Cloudera’s Operational Database: Unlocking New Benefits in the Cloud
Part 2: Cloudera’s Operational Database: Unlocking New Benefits in the CloudPart 2: Cloudera’s Operational Database: Unlocking New Benefits in the Cloud
Part 2: Cloudera’s Operational Database: Unlocking New Benefits in the CloudCloudera, Inc.
 
Introducing Cloudera DataFlow (CDF) 2.13.19
Introducing Cloudera DataFlow (CDF) 2.13.19Introducing Cloudera DataFlow (CDF) 2.13.19
Introducing Cloudera DataFlow (CDF) 2.13.19Cloudera, Inc.
 
The Practice of Big Data - The Hadoop ecosystem explained with usage scenarios
The Practice of Big Data - The Hadoop ecosystem explained with usage scenariosThe Practice of Big Data - The Hadoop ecosystem explained with usage scenarios
The Practice of Big Data - The Hadoop ecosystem explained with usage scenarioskcmallu
 
Evolution from Apache Hadoop to the Enterprise Data Hub by Cloudera - ArabNet...
Evolution from Apache Hadoop to the Enterprise Data Hub by Cloudera - ArabNet...Evolution from Apache Hadoop to the Enterprise Data Hub by Cloudera - ArabNet...
Evolution from Apache Hadoop to the Enterprise Data Hub by Cloudera - ArabNet...ArabNet ME
 
Cloudera - The Modern Platform for Analytics
Cloudera - The Modern Platform for AnalyticsCloudera - The Modern Platform for Analytics
Cloudera - The Modern Platform for AnalyticsCloudera, Inc.
 

Tendances (20)

Part 1: Lambda Architectures: Simplified by Apache Kudu
Part 1: Lambda Architectures: Simplified by Apache KuduPart 1: Lambda Architectures: Simplified by Apache Kudu
Part 1: Lambda Architectures: Simplified by Apache Kudu
 
Consolidate your data marts for fast, flexible analytics 5.24.18
Consolidate your data marts for fast, flexible analytics 5.24.18Consolidate your data marts for fast, flexible analytics 5.24.18
Consolidate your data marts for fast, flexible analytics 5.24.18
 
Get started with Cloudera's cyber solution
Get started with Cloudera's cyber solutionGet started with Cloudera's cyber solution
Get started with Cloudera's cyber solution
 
快速数据快速分析引擎-Kudu
快速数据快速分析引擎-Kudu快速数据快速分析引擎-Kudu
快速数据快速分析引擎-Kudu
 
IT @ Intel: Preparing the Future Enterprise with the Internet of Things
IT @ Intel: Preparing the Future Enterprise with the Internet of ThingsIT @ Intel: Preparing the Future Enterprise with the Internet of Things
IT @ Intel: Preparing the Future Enterprise with the Internet of Things
 
Kudu Forrester Webinar
Kudu Forrester WebinarKudu Forrester Webinar
Kudu Forrester Webinar
 
Cloudera Altus: Big Data in the Cloud Made Easy
Cloudera Altus: Big Data in the Cloud Made EasyCloudera Altus: Big Data in the Cloud Made Easy
Cloudera Altus: Big Data in the Cloud Made Easy
 
Leveraging the cloud for analytics and machine learning 1.29.19
Leveraging the cloud for analytics and machine learning 1.29.19Leveraging the cloud for analytics and machine learning 1.29.19
Leveraging the cloud for analytics and machine learning 1.29.19
 
IoT: How Data Science Driven Software is Eating the Connected World
IoT: How Data Science Driven Software is Eating the Connected WorldIoT: How Data Science Driven Software is Eating the Connected World
IoT: How Data Science Driven Software is Eating the Connected World
 
Hadoop in the cloud – The what, why and how from the experts
Hadoop in the cloud – The what, why and how from the expertsHadoop in the cloud – The what, why and how from the experts
Hadoop in the cloud – The what, why and how from the experts
 
Big Data Fundamentals
Big Data FundamentalsBig Data Fundamentals
Big Data Fundamentals
 
Hadoop Essentials -- The What, Why and How to Meet Agency Objectives
Hadoop Essentials -- The What, Why and How to Meet Agency ObjectivesHadoop Essentials -- The What, Why and How to Meet Agency Objectives
Hadoop Essentials -- The What, Why and How to Meet Agency Objectives
 
Designing Data Pipelines for Automous and Trusted Analytics
Designing Data Pipelines for Automous and Trusted AnalyticsDesigning Data Pipelines for Automous and Trusted Analytics
Designing Data Pipelines for Automous and Trusted Analytics
 
Moving Beyond Lambda Architectures with Apache Kudu
Moving Beyond Lambda Architectures with Apache KuduMoving Beyond Lambda Architectures with Apache Kudu
Moving Beyond Lambda Architectures with Apache Kudu
 
Simplifying Real-Time Architectures for IoT with Apache Kudu
Simplifying Real-Time Architectures for IoT with Apache KuduSimplifying Real-Time Architectures for IoT with Apache Kudu
Simplifying Real-Time Architectures for IoT with Apache Kudu
 
Part 2: Cloudera’s Operational Database: Unlocking New Benefits in the Cloud
Part 2: Cloudera’s Operational Database: Unlocking New Benefits in the CloudPart 2: Cloudera’s Operational Database: Unlocking New Benefits in the Cloud
Part 2: Cloudera’s Operational Database: Unlocking New Benefits in the Cloud
 
Introducing Cloudera DataFlow (CDF) 2.13.19
Introducing Cloudera DataFlow (CDF) 2.13.19Introducing Cloudera DataFlow (CDF) 2.13.19
Introducing Cloudera DataFlow (CDF) 2.13.19
 
The Practice of Big Data - The Hadoop ecosystem explained with usage scenarios
The Practice of Big Data - The Hadoop ecosystem explained with usage scenariosThe Practice of Big Data - The Hadoop ecosystem explained with usage scenarios
The Practice of Big Data - The Hadoop ecosystem explained with usage scenarios
 
Evolution from Apache Hadoop to the Enterprise Data Hub by Cloudera - ArabNet...
Evolution from Apache Hadoop to the Enterprise Data Hub by Cloudera - ArabNet...Evolution from Apache Hadoop to the Enterprise Data Hub by Cloudera - ArabNet...
Evolution from Apache Hadoop to the Enterprise Data Hub by Cloudera - ArabNet...
 
Cloudera - The Modern Platform for Analytics
Cloudera - The Modern Platform for AnalyticsCloudera - The Modern Platform for Analytics
Cloudera - The Modern Platform for Analytics
 

En vedette

Case study: Hadoop as ELT for Leading US Retailer - Happiest Minds
Case study: Hadoop as ELT for Leading US Retailer - Happiest MindsCase study: Hadoop as ELT for Leading US Retailer - Happiest Minds
Case study: Hadoop as ELT for Leading US Retailer - Happiest MindsHappiest Minds Technologies
 
Modernizing Your IT Infrastructure with Hadoop - Cloudera Summer Webinar Seri...
Modernizing Your IT Infrastructure with Hadoop - Cloudera Summer Webinar Seri...Modernizing Your IT Infrastructure with Hadoop - Cloudera Summer Webinar Seri...
Modernizing Your IT Infrastructure with Hadoop - Cloudera Summer Webinar Seri...Cloudera, Inc.
 
Chicago Data Summit: Cloudera's Distribution including Apache Hadoop & Cloude...
Chicago Data Summit: Cloudera's Distribution including Apache Hadoop & Cloude...Chicago Data Summit: Cloudera's Distribution including Apache Hadoop & Cloude...
Chicago Data Summit: Cloudera's Distribution including Apache Hadoop & Cloude...Cloudera, Inc.
 
SUSE, Hadoop and Big Data Update. Stephen Mogg, SUSE UK
SUSE, Hadoop and Big Data Update. Stephen Mogg, SUSE UKSUSE, Hadoop and Big Data Update. Stephen Mogg, SUSE UK
SUSE, Hadoop and Big Data Update. Stephen Mogg, SUSE UKhuguk
 
Hw09 Clouderas Distribution For Hadoop
Hw09   Clouderas Distribution For HadoopHw09   Clouderas Distribution For Hadoop
Hw09 Clouderas Distribution For HadoopCloudera, Inc.
 
The Evolution of the Hadoop Ecosystem
The Evolution of the Hadoop EcosystemThe Evolution of the Hadoop Ecosystem
The Evolution of the Hadoop EcosystemCloudera, Inc.
 
An Introduction to Hadoop and Cloudera: Nashville Cloudera User Group, 10/23/14
An Introduction to Hadoop and Cloudera: Nashville Cloudera User Group, 10/23/14An Introduction to Hadoop and Cloudera: Nashville Cloudera User Group, 10/23/14
An Introduction to Hadoop and Cloudera: Nashville Cloudera User Group, 10/23/14iwrigley
 
Geber Consulting - Big Data in Healthcare
Geber Consulting - Big Data in Healthcare Geber Consulting - Big Data in Healthcare
Geber Consulting - Big Data in Healthcare Martin Hiesboeck
 
IBM Insight 2014 session (4152 )- Accelerating Insights in Healthcare with “B...
IBM Insight 2014 session (4152 )- Accelerating Insights in Healthcare with “B...IBM Insight 2014 session (4152 )- Accelerating Insights in Healthcare with “B...
IBM Insight 2014 session (4152 )- Accelerating Insights in Healthcare with “B...Alex Zeltov
 
Константин Швачко, Yahoo!, - Scaling Storage and Computation with Hadoop
Константин Швачко, Yahoo!, - Scaling Storage and Computation with HadoopКонстантин Швачко, Yahoo!, - Scaling Storage and Computation with Hadoop
Константин Швачко, Yahoo!, - Scaling Storage and Computation with HadoopMedia Gorod
 
Hospital Readmission Reduction: How Important are Follow Up Calls? (Hint: Very)
Hospital Readmission Reduction: How Important are Follow Up Calls? (Hint: Very)Hospital Readmission Reduction: How Important are Follow Up Calls? (Hint: Very)
Hospital Readmission Reduction: How Important are Follow Up Calls? (Hint: Very)SironaHealth
 
Healthcare Analytics Maturity Model
Healthcare Analytics Maturity ModelHealthcare Analytics Maturity Model
Healthcare Analytics Maturity ModelFrank Wang
 
Healthcare Predictive Analytics with the OR-(Denny Lee and Ayad Shammout, Dat...
Healthcare Predictive Analytics with the OR-(Denny Lee and Ayad Shammout, Dat...Healthcare Predictive Analytics with the OR-(Denny Lee and Ayad Shammout, Dat...
Healthcare Predictive Analytics with the OR-(Denny Lee and Ayad Shammout, Dat...Spark Summit
 
Predicting Hospital Readmission Using Cascading
Predicting Hospital Readmission Using CascadingPredicting Hospital Readmission Using Cascading
Predicting Hospital Readmission Using CascadingCascading
 
Big Data Analytics in Healthcare and Life Sciences
Big Data Analytics in Healthcare and Life SciencesBig Data Analytics in Healthcare and Life Sciences
Big Data Analytics in Healthcare and Life SciencesAli Sanousi, MD, MBA, PhD
 
Big Data, CEP and IoT : Redefining Healthcare Information Systems and Analytics
Big Data, CEP and IoT : Redefining Healthcare Information Systems and AnalyticsBig Data, CEP and IoT : Redefining Healthcare Information Systems and Analytics
Big Data, CEP and IoT : Redefining Healthcare Information Systems and AnalyticsTauseef Naquishbandi
 
Medicine of the Future—The Transformation from Reactive to Proactive (P4) Med...
Medicine of the Future—The Transformation from Reactive to Proactive (P4) Med...Medicine of the Future—The Transformation from Reactive to Proactive (P4) Med...
Medicine of the Future—The Transformation from Reactive to Proactive (P4) Med...Ryan Squire
 
Agile deployment predictive analytics on hadoop
Agile deployment predictive analytics on hadoopAgile deployment predictive analytics on hadoop
Agile deployment predictive analytics on hadoopDataWorks Summit
 
Using Big Data to Transform Your Customer’s Experience - Part 1

Using Big Data to Transform Your Customer’s Experience - Part 1
Using Big Data to Transform Your Customer’s Experience - Part 1

Using Big Data to Transform Your Customer’s Experience - Part 1
Cloudera, Inc.
 

En vedette (20)

Case study: Hadoop as ELT for Leading US Retailer - Happiest Minds
Case study: Hadoop as ELT for Leading US Retailer - Happiest MindsCase study: Hadoop as ELT for Leading US Retailer - Happiest Minds
Case study: Hadoop as ELT for Leading US Retailer - Happiest Minds
 
Modernizing Your IT Infrastructure with Hadoop - Cloudera Summer Webinar Seri...
Modernizing Your IT Infrastructure with Hadoop - Cloudera Summer Webinar Seri...Modernizing Your IT Infrastructure with Hadoop - Cloudera Summer Webinar Seri...
Modernizing Your IT Infrastructure with Hadoop - Cloudera Summer Webinar Seri...
 
Chicago Data Summit: Cloudera's Distribution including Apache Hadoop & Cloude...
Chicago Data Summit: Cloudera's Distribution including Apache Hadoop & Cloude...Chicago Data Summit: Cloudera's Distribution including Apache Hadoop & Cloude...
Chicago Data Summit: Cloudera's Distribution including Apache Hadoop & Cloude...
 
Cloudera
ClouderaCloudera
Cloudera
 
SUSE, Hadoop and Big Data Update. Stephen Mogg, SUSE UK
SUSE, Hadoop and Big Data Update. Stephen Mogg, SUSE UKSUSE, Hadoop and Big Data Update. Stephen Mogg, SUSE UK
SUSE, Hadoop and Big Data Update. Stephen Mogg, SUSE UK
 
Hw09 Clouderas Distribution For Hadoop
Hw09   Clouderas Distribution For HadoopHw09   Clouderas Distribution For Hadoop
Hw09 Clouderas Distribution For Hadoop
 
The Evolution of the Hadoop Ecosystem
The Evolution of the Hadoop EcosystemThe Evolution of the Hadoop Ecosystem
The Evolution of the Hadoop Ecosystem
 
An Introduction to Hadoop and Cloudera: Nashville Cloudera User Group, 10/23/14
An Introduction to Hadoop and Cloudera: Nashville Cloudera User Group, 10/23/14An Introduction to Hadoop and Cloudera: Nashville Cloudera User Group, 10/23/14
An Introduction to Hadoop and Cloudera: Nashville Cloudera User Group, 10/23/14
 
Geber Consulting - Big Data in Healthcare
Geber Consulting - Big Data in Healthcare Geber Consulting - Big Data in Healthcare
Geber Consulting - Big Data in Healthcare
 
IBM Insight 2014 session (4152 )- Accelerating Insights in Healthcare with “B...
IBM Insight 2014 session (4152 )- Accelerating Insights in Healthcare with “B...IBM Insight 2014 session (4152 )- Accelerating Insights in Healthcare with “B...
IBM Insight 2014 session (4152 )- Accelerating Insights in Healthcare with “B...
 
Константин Швачко, Yahoo!, - Scaling Storage and Computation with Hadoop
Константин Швачко, Yahoo!, - Scaling Storage and Computation with HadoopКонстантин Швачко, Yahoo!, - Scaling Storage and Computation with Hadoop
Realizing the Promise of Big Data with Hadoop - Cloudera Summer Webinar Series | Forrester

  • 1. REALIZING THE PROMISE OF BIG DATA WITH HADOOP. Noel Yuhanna - Forrester; Omer Trajman - Cloudera; Jeremy Dyer & Marty Smith - RelayHealth
  • 2. Hadoop and Big Data. Noel Yuhanna, Principal Analyst. © 2012 Forrester Research, Inc. Reproduction Prohibited
  • 3. Enterprises have 100s of terabytes or petabytes of data, but most of it is unused. Unused data is a valuable asset and should be leveraged!
  • 4. Big Data - Problem or opportunity? Big data presents serious challenges: it strains the current limits of IT infrastructure and resources; it requires an upgrade across the stack (storage, compute). A huge opportunity exists with big data: improve operational efficiency; offer new insights that can provide competitive advantage; deliver advanced, predictive analytics with more precision; support activities and analysis that generate revenue and bring businesses closer to their customers much faster.
  • 5. Big Data requires a new approach to data processing and analytics. Organizations need to be able to: process any data at any given time; manage very large data sets that run into 100s of TB and PBs; process data economically; integrate with many sources of data; support predictive analytics and a self-service data management platform.
  • 6. What is Hadoop and how can it help? Open source software that enables distributed parallel processing of large amounts of data across low-cost commodity servers. It leverages an extensible framework for building advanced analytics and new data management capabilities. It’s already being commercialized and adopted rapidly in enterprises. Diagram: large amounts of data go into Hadoop (flexible, distributed processing, economical, scalable, open source) and insights come out.
  • 7. How are organizations adopting Hadoop? Hadoop adoption: current adoption estimate is 20%, seen mostly in mid-sized to large organizations; adoption is likely to double through 2016; adoption is seen across all vertical industries with various use cases; many organizations are currently doing a POC/sandbox with the Hadoop platform. How Hadoop will evolve in organizations: it will start out as an independent project focusing on priority analytics; it will start to integrate with existing systems, apps and databases; it will embed seamlessly into data management and analytical platforms; Hadoop will become the data platform delivering self-service capabilities.
  • 8. How to get going on the Big Data journey. Big Data is here to stay! Hadoop is here to stay! Hadoop should be part of your data management and BI strategy. Integrate Hadoop with existing data management, databases and apps. Hadoop can help save money and deliver new insights and possibilities. Don’t limit yourself to structured data only. A big data initiative is not a one-time project; it’s an ongoing process.
  • 9. CLOUDERA: THE STANDARD FOR APACHE HADOOP IN THE ENTERPRISE. OMER TRAJMAN, VP CUSTOMER SOLUTIONS
  • 10. “YOU CAN’T SOLVE 21ST CENTURY PROBLEMS WITH 20TH CENTURY TECHNOLOGIES”
  • 11. HOSPITALS NEED MORE COMPREHENSIVE PATIENT INFORMATION. BANKS MUST DETECT FRAUD FASTER. BROADCAST NETWORKS WANT TO DELIVER CUSTOMIZED CONTENT BY HOUSEHOLD. AIRLINES WANT TO UPDATE FLIGHT PRICES IN REAL-TIME. POWER COMPANIES WANT TO SAVE CUSTOMERS MONEY BY ANALYZING USAGE DATA. OIL COMPANIES WANT TO PREDICT THE LOCATION OF DEPOSITS MORE ACCURATELY. RETAILERS WANT TO CREATE MORE TARGETED OFFERS TO CUSTOMERS. PARTICLE PHYSICISTS WANT REAL-TIME DATA FROM THE HADRON COLLIDER.
  • 12. SCIENTIFIC APPROACH TO DATA REQUIRES… STORAGE FORMATS; FLEXIBILITY; EXTENSIBILITY; COMPACT STORAGE; FAST LOAD/STORE; WIDELY SUPPORTED
  • 13. SIX CHARACTERISTICS OF ENTERPRISE-GRADE HADOOP. 1 HIGH AVAILABILITY: THERE’S NO DOWNTIME. YOUR DATA IS ALWAYS AVAILABLE FOR DECISIONS. 2 GRANULAR SECURITY: PROCESS AND CONTROL SENSITIVE DATA WITH CONFIDENCE. 3 ROBUST MANAGEMENT: ACHIEVE OPTIMAL PERFORMANCE VIA CENTRALIZED ADMINISTRATION. 4 SCALABLE AND EXTENSIBLE: ADAPTS TO YOUR WORKLOAD AND GROWS WITH THE BUSINESS. 5 CERTIFIED AND COMPATIBLE: EXTEND AND LEVERAGE EXISTING INFRASTRUCTURE INVESTMENTS. 6 GLOBAL SUPPORT AND SERVICES: ACHIEVE SLAs AND ADHERE TO EXISTING IT POLICIES.
  • 14. HADOOP PROVIDES A DATA HUB FOR ALL BIG DATA WORKLOADS • Brings storage and computation together in one single system • Works with every type of data in its native format • Changes the economics of data management
  • 15. APACHE HADOOP CO-EXISTS WITH EDW, ETL & BI TOOLS [Architecture diagram] • Users: OPERATORS, ENGINEERS, ANALYSTS, BUSINESS USERS, CUSTOMERS • Tools: Management Tools, IDE’s, BI / Analytics, Enterprise Reporting, Web Application • Platforms: Cloudera’s Distribution Including Apache Hadoop (CDH) & Cloudera Manager Free Edition, Enterprise Data Warehouse, Operational Rules Engines • Data sources: Logs, Files, Web Data, Relational Databases • Cloudera Services: Consulting Services, Cloudera University • Cloudera Enterprise: Cloudera Manager, Cloudera Support
  • 16. CLOUDERA’S PARTNER ECOSYSTEM: WIDEST INTEGRATION • All the industry leaders develop on CDH. • CDH4 (STORAGE, COMPUTATION, ACCESS, INTEGRATION): Big Data storage, processing and analytics platform based on Apache Hadoop, 100% open source • Partner categories: BI / Analytics, Data Integration, Database, OS / Cloud / Sys Mgmt, Hardware
  • 19. Why Hadoop, Why Cloudera, Why Now? Agenda ✛ RH overview ✛ What is our need ✛ Why our system/data is complicated ✛ How Hadoop meets our needs
  • 20. McKesson Corporation ✛ Largest healthcare company in the world: $103+ billion in revenues; Fortune 15; S&P 500; Est. 1833; Headquarters: San Francisco ✛ Business: Distribution Solutions, Technology Solutions ✛ Extensive resource base: 32,000+ employees solely dedicated to healthcare ✛ Comprehensive array of solutions: Significant value through a single relationship ✛ Broadest customer base in healthcare: Experienced partners in improving healthcare
  • 21. Overview of Financial Solutions • 200,000 Physicians • 2,000 Hospitals • 1900 Payers / Health Plans • Provider-to-Payer Interactions, Total Interactions: 2.4 Billion/Year
  • 22. Business Challenges ✛ Help customers save money ✛ Small reductions to time in AR lead to big savings, better cash flow ✛ Meet regulatory challenges > Must store 7 years of transactional data
  • 23. What Big Data Means to RelayHealth Every single day: + millions of transactions generated + thousands of files received + 150GB+ log data collected …to be stored for 7 years
  • 24. Why RelayHealth Considered Hadoop ✛ Business requirement around data storage & retrieval ✛ Looked at traditional solutions > Database: $$$; Not easy to index files > File System: Untenable when searching > Hybrid (File System + Solr): Not scalable
  • 25. Achieving Operational Efficiency with Hadoop & Cloudera ✛ Why Hadoop? > Store billions of files across machines > Mine data in files using M/R > Aggregate log data & search through it using unique customer identifying information > Store data in its highest fidelity state ✛ Why Cloudera? > Core Apache Hadoop leveraging OSS community > Integration with other open source solutions: HBase, Solr, Camel > Committer level knowledge of code & how it works > World-class support > Cloudera Manager
  • 26. Changing Perception ✛ Simple archive vs. a way to share data across the organization ✛ Building the ability to collect data flowing through our system at all points needed ✛ Integrating CDH into the rest of the enterprise > Storing data in its highest fidelity state > Moving away from traditional warehousing systems > Ability to distill data in the cluster for mining in other systems – CDH connectors
  • 27. Summary ✛ Challenge: > Shorten healthcare providers’ payment cycles via streamlined message processing > RDBMS can’t keep up with growing data volumes + data storage mandates for regulatory compliance ✛ Solution: > Hadoop: scalable, flexible data processing & analysis on multi-structured data > Cloudera Enterprise: adding expertise, support & management tools to open source Hadoop
  • 29. THANK YOU! REGISTER NOW FOR THE REMAINING ‘POWER OF HADOOP’ WEBINARS: • WHAT THE HADOOP: WHY YOUR BUSINESS CAN’T AFFORD TO IGNORE THE POWER OF HADOOP (GIGAOM PRO AND CLOUDERA), WEDNESDAY, AUGUST 29, 10AM PST • THE BUSINESS ADVANTAGE OF HADOOP: LESSONS FROM THE FIELD (451 RESEARCH AND CLOUDERA), THURSDAY, SEPTEMBER 26, 10AM PST
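
Slide 25 mentions mining files with M/R and searching aggregated log data by customer-identifying information. The sketch below imitates the map, shuffle and reduce phases in plain Python to show the shape of such a job. It is not the presenters' implementation: the log format, the `customer=` field and every function name are hypothetical, and a real job would run on a cluster (for example as a Java MapReduce job or through Hadoop Streaming) rather than in one process.

```python
from collections import defaultdict

# Hypothetical log format (an assumption, not from the slides):
# "<ISO timestamp> customer=<id> <free-text message>"
SAMPLE_LOGS = [
    "2012-08-01T09:00:00 customer=C100 claim submitted",
    "2012-08-01T09:00:05 customer=C200 eligibility check",
    "2012-08-01T09:01:10 customer=C100 claim rejected",
    "2012-08-01T09:02:00 malformed line without an id",
]


def map_phase(line):
    """Emit one (customer_id, line) pair; lines with no customer field emit nothing."""
    for token in line.split():
        if token.startswith("customer="):
            yield token.split("=", 1)[1], line
            return


def shuffle(pairs):
    """Group values by key, which the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(customer_id, lines):
    """Summarize one customer's lines: count plus first and last timestamp."""
    timestamps = sorted(line.split()[0] for line in lines)
    summary = {"count": len(lines), "first": timestamps[0], "last": timestamps[-1]}
    return customer_id, summary


def run_job(lines):
    """Run map, shuffle and reduce in sequence over an iterable of log lines."""
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


result = run_job(SAMPLE_LOGS)
for customer_id in sorted(result):
    print(customer_id, result[customer_id])
```

Keying the map output on the customer identifier is the design choice that matters here: all lines for one customer reach the same reducer, so a per-customer search or summary needs no global index.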

Editor's notes

  1. http://www.flickr.com/photos/ychi2010/6769591849/sizes/m/in/photostream/ For decades companies have been making decisions based on transactional data stored in relational databases. Beyond that data is a potential treasure trove of non-traditional, less structured data that can be mined for useful insight. Decreases in the cost of storage and compute power have made it feasible to collect this data, which would have been thrown away only a few years ago. As a result, more and more companies are looking to include non-traditional yet potentially valuable data with their traditional enterprise data in the analysis process.
  2. FALLBACK
  3. Data science involves looking at data differently. Rather than creating a uniform schema (rows and columns), tools like Hadoop give data scientists the flexibility to store data in a format that fits the question we're trying to answer. This requires an underlying system that's flexible: a system that can store and process any type of data, starting with its original raw format and allowing scientists to transform and apply a schema to suit the particular problem. Data scientists use tools and technologies that can read and write data in compact storage, are fast to read and write and can be accessed from a wide variety of languages. We use libraries such as Avro, which gives flexibility to structure and process data.
  4. Standard pitch from CDH4 launch… When we talk about bringing Hadoop to the enterprise, there are six essential characteristics or areas that we focus on. High Availability: most customers want to use Hadoop to power mission critical applications and workflows. As such the system must run with maximum uptime to keep all data and processes available to the business. Granular Security: enterprises require the ability to secure sensitive data types as well as control who has access to system resources and when. Cloudera works with the open source community to build these capabilities into the platform and provides simple configuration and enforcement through our management application. Robust Management: Hadoop is a distributed system with many moving parts. Centralized management is critical for successful implementation. Scalable and Extensible: one of the great things about Hadoop is its massive scalability. We want to make it easy for you to take advantage of this by integrating your applications with the platform. Certified and Compatible: enterprises have invested significant amounts of time and money into their existing infrastructure (data warehouses, BI applications, etc.). We want to make sure that Hadoop integrates seamlessly with those technologies. Global Support and Services: as Hadoop becomes a critical component of the data management infrastructure, we want to empower our customers to meet stringent service level agreements and build out their own Hadoop workforce.
  5. Hadoop is an open-source framework for running applications on large clusters of commodity hardware. As a result, it delivers enormous processing power and the ability to handle virtually limitless concurrent tasks and jobs, making it a remarkably low-cost complement to traditional enterprise data infrastructure. Organizations use Hadoop in five ways: 1) as a staging area for the data warehouse and analytics store, 2) for initial discovery and analysis, 3) for storage and analysis of unstructured/semistructured content, 4) to make total data available for analysis, 5) for low-cost storage of large data volumes. With traditional database and data analytics tools, information is stored in neat rows and columns, and there are limits to how much data you can juggle and how quickly. The Hadoop Distributed File System provides an environment to exploit massively parallel processing against large amounts of data. Hadoop changes the dynamics of large scale computing. With Hadoop, you can distribute raw data across a vast cluster of low-cost machines, and you can process that data in the same place you store it. The result is that you can store all your data and analyze it as needed. This is a paradigm shift: merging the power of analytics with the power of Hadoop data storage and processing to get better answers faster. This new paradigm will significantly improve an organization’s ability to assimilate vast data assets and give it the compute and analytical power to tackle problems/opportunities it never thought possible. As businesses become more analytical to gain competitive advantage and comply with new regulations, enterprise data warehouses are pushed to answer more ad-hoc questions from more people analyzing vastly larger volumes of data, often in real-time. Hadoop and next-gen analytic platforms are fundamental building blocks of the architecture needed to compete effectively in a data-driven world. Hadoop is the next wave of strategic enterprise information management.
THE ‘BIG DATA’ SHIFT: “Big Data analysis is usually iterative: you ask one question or examine one data set, then think of more questions or decide to look at more data. That’s different from the ‘single source of truth’ approach to standard BI and data warehousing.” (PwC 2010 Technology Forecast)
BRINGS STORAGE AND COMPUTATION TOGETHER IN A SINGLE SYSTEM: PROCESS & ANALYZE DATA IN PLACE; REMOVE NETWORK BOTTLENECKS; ELIMINATE DATA MIGRATIONS
WORKS WITH EVERY TYPE OF DATA, IN ITS NATIVE FORMAT: NO NEED TO FIT A SINGLE SCHEMA; NOTHING LOST THROUGH ETL; LOOK AT ALL YOUR DATA FOR A COMPREHENSIVE VIEW
CHANGES THE ECONOMICS OF DATA MANAGEMENT: OSS + COMMODITY HARDWARE; KEEP EVERYTHING ONLINE; SUPERCOMPUTING FOR EVERYONE
  6. Hadoop is not a single entity. It is a rich, complex, and evolving ecosystem of multiple open source products from Apache. In addition, the ecosystem expands almost daily as more open source and vendor products support or extend Hadoop products and technical approaches. We are a platform company. Within our partner ecosystem you get everything you need to leverage big data. Hadoop is now a first-class citizen in the enterprise IT department. With so many key IT vendors “attaching to Hadoop” via the Cloudera Connect program, the penetration of Hadoop-related technologies into the heart of the enterprise analytics environment is accelerated. Coordinating your traditional and Big Data processes takes a vendor that understands the legacy and modern approaches to data processing. Cloudera is differentiated by its combination of platform + methodology + ecosystem (methodology = data computing).
  7. The possibilities of big data continue to evolve rapidly, driven by innovation in the underlying technologies, platforms, and analytic capabilities for handling data, as well as the evolution of behavior among its users as more and more individuals live and work digital lives. To evolve into an organization that is “data-driven” and competes on data, the business must make better use of data as it moves through daily operations, which demands a radical rethinking of traditional data warehousing and transaction processing. Hadoop leverages several resources that have been outside the information architectures we have today. It is bringing in new programming languages, new skills and new data, and is being deployed as a new platform. Think of how it can extend and supplement the way we leverage information; it is synergistic if we put the pieces together right. What is possible now that so many of the constraints are removed?
  8. Business Challenges: We need to use all the data we collect to help our customers. Small reductions to time in AR lead to big savings and better cash flow. Relay has an existing suite of Analytics products, but we always want to do more. This means keeping data at much higher fidelity. Regulatory challenges: Need to store these transactions to meet regulatory compliance.
  9. Storage of transaction data: Millions of transactions per day. Thousands of files coming in, as well as data flowing through web service and direct connection requests. Storage of log data: Average over 150 GB of log data collected per day. Data is used for troubleshooting customer issues and may be used 30 to 60 days after it is collected.
  10. Project in place to meet business requirement around storage and retrieval of data. Looked at traditional solutions. Database: too costly, would not allow for easy indexing of files. File system: using enterprise standards (lots of CPUs and SAN), proved to be untenable when searching. Hybrid: file system + Solr. Did not investigate very thoroughly as there were issues around working with that volume of data.
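
To put the log-retention figures in perspective, here is a rough back-of-envelope estimate. The 150 GB per day and the 7-year retention period come from the slides and notes; the 365-day year, decimal terabytes, zero growth, no compression and a replication factor of 3 (the HDFS default for `dfs.replication`) are assumptions made for illustration, and the function name is hypothetical.

```python
# Figures stated in the deck.
GB_PER_DAY = 150
RETENTION_YEARS = 7

# Assumptions for this sketch (not from the deck).
DAYS_PER_YEAR = 365        # ignores leap days
REPLICATION_FACTOR = 3     # HDFS default value of dfs.replication


def retained_log_volume_tb(gb_per_day=GB_PER_DAY, years=RETENTION_YEARS, replication=1):
    """Return decimal terabytes (1 TB = 1000 GB) held once the retention window is full.

    Assumes a constant daily volume and no compression.
    """
    return gb_per_day * DAYS_PER_YEAR * years * replication / 1000


logical_tb = retained_log_volume_tb()
raw_tb = retained_log_volume_tb(replication=REPLICATION_FACTOR)
print(f"logical volume: {logical_tb:.2f} TB")
print(f"raw disk at {REPLICATION_FACTOR}x replication: {raw_tb:.2f} TB")
```

Under these assumptions the log data alone comes to roughly 383 TB of logical data, or a little over 1.1 PB of raw disk once replicated, before counting the transaction files. Compression would lower the figure and growth in daily volume would raise it.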