Hortonworks
Enabling Apache Hadoop to
power next-generation enterprise data architectures




June 2012




© Hortonworks Inc. 2012                               Page 1
Topics
• Big Data Market Overview

• Hortonworks Company & Strategy Overview

• Hortonworks Offerings
  – Hortonworks Data Platform Subscriptions
  – Public & On-site Training
  – Expert Short-term Consulting Services




Big Data = Transactions + Interactions + Observations

[Figure: nested tiers of data sources. Vertical axis: data volume, from Megabytes to Petabytes. Horizontal axis: Increasing Data Variety and Complexity.]

• ERP (Megabytes)
  – Purchase detail; Purchase record; Payment record
• CRM (Gigabytes)
  – Segmentation; Offer details; Customer Touches; Support Contacts
• WEB (Terabytes)
  – Web logs; Offer history; A/B testing; Dynamic Pricing; Affiliate Networks; Search Marketing; Behavioral Targeting; Dynamic Funnels
• BIG DATA (Petabytes)
  – Sensors / RFID / Devices; User Click Stream; Mobile Web; Sentiment; User Generated Content; Social Interactions & Feeds; Spatial & GPS Coordinates; External Demographics; Business Data Feeds; HD Video, Audio, Images; Speech to Text; Product/Service Logs; SMS/MMS

Source: Contents of above graphic created in partnership with Teradata, Inc.

What is Apache Hadoop?


One of the best examples of open source driving innovation and creating a market

• Collection of Open Source Projects
   – Apache Software Foundation (ASF)
   – Loosely coupled, ship early/often

• Foundation for Big Data Solutions
   – Stores petabytes of data reliably
        – Hadoop Distributed File System
   – Runs highly distributed computations
        – Hadoop MapReduce framework
   – Enables a rational economics model
        – Commodity servers & storage
   – Powers data-driven business
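
As a rough illustration of the MapReduce model named above, the sketch below counts words in plain Python, in memory, on a single machine. It does not use Hadoop or any Hadoop API. The function names (map_phase, shuffle, reduce_phase, word_count) are hypothetical and chosen only for this example.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(line: str) -> Iterator[Tuple[str, int]]:
    """Map step: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle step: group all emitted values by key."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce step: collapse the values for one key into a single count."""
    return key, sum(values)


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle, and reduce in sequence over the input lines."""
    pairs = (pair for line in lines for pair in map_phase(line))
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["big data", "Big Hadoop"]))
```

In a real cluster the map and reduce steps would run in parallel across many machines, with the framework handling the shuffle between them; this sketch only shows the shape of the computation.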

7 Key Drivers for Hadoop
    Business Pressure
1   Opportunity to enable innovative new business models

2   Potential new insights that drive competitive advantage

    Technical Pressure
3   Data collected and stored continues to grow exponentially

4   Data is increasingly everywhere and in many formats

5   Traditional solutions not designed for new requirements

    Financial Pressure
6   Cost of data systems, as % of IT spend, continues to grow

7   Cost advantages of commodity hardware & open source

3 Phases of Hadoop Adoption
• Educate/Evaluate (1 - 12 months)
  – Stage: Awareness, adoption and proof of enterprise viability
  – Description: See it -> Learn it -> Do it. Evaluation, exploration, POCs, Dev & Admin training
  – Key Questions:
       – What are the potential use cases? Which one should I focus on?
       – How do I get value now?
       – Where does Hadoop fit in my data architecture?
       – Can I leverage my existing tools/platforms?
       – Can I replace any of my existing systems?

• Initial Production (9 - 24 months)
  – Stage: Departmental production usage
  – Description: Single business use case, focused solution architecture
  – Key Questions:
       – Can the solution enable future business models?
       – Am I maximizing the value from the chosen use case?
       – How does this solution interact within our departmental data architecture?
       – How do I operationalize the solution?

• Wide-scale Production (18 - 36 months)
  – Stage: Enterprise wide production usage
  – Description: Multiple use cases, broader solution architecture
  – Key Questions:
       – How can the solution be leveraged enterprise-wide?
       – What is required to enable, integrate, operate at scale?
       – What does our next-generation data architecture look like?
       – How can I maximize access to data while minimizing risk?

What’s Needed to Accelerate Adoption?
• Enterprise tooling to become a complete data platform
  – Open deployment & provisioning
  – Higher quality data loading
  – Monitoring and management
  – APIs for easy and efficient integration


• Ecosystem support & development
  – Existing infrastructure vendors need to continue to integrate
  – Apps need to continue to be developed on this infrastructure


• Market to rally around core Apache Hadoop
  – To avoid splintering/market fragmentation
  – To accelerate adoption

Topics
• Big Data Market Overview

• Hortonworks Company & Strategy Overview

• Hortonworks Offerings
  – Hortonworks Data Platform Subscriptions
  – Public & On-site Training
  – Expert Architectural Services




Hortonworks Vision & Role

We believe that by the end of 2015, more than half the world's data will be processed by Apache Hadoop.

1   Make Hadoop easy to use and consume

2   Make Hadoop an enterprise-viable data platform

3   Provide open APIs and data services

4   Enable ecosystem at each layer of the data stack

5   Be stewards of the core and innovators on the edges

Hortonworks Strategy
• Lead within Hadoop Community
   – Team has delivered every major Hadoop release since 0.1
   – Experience managing the world’s largest deployment
   – Ongoing access to Yahoo!’s 1,000+ users and 40k+ nodes for testing, QA, etc.



• Embrace & Enable Hadoop Ecosystem
   – 100% open source software

   – Full lifecycle support subscriptions

   – Expert role-based training

   – Enable solution architectures


Enable Hadoop to be Next-Gen Data Platform

[Figure: Hortonworks Data Platform at the center, surrounded by four goals]

• Enable the ecosystem at each layer
  – Applications & Solutions
  – Tools & Languages
  – BI & Analytics
  – Data Management Systems
  – Data Movement & Integration
  – Infrastructure Platform

• Make Hadoop easy to use/consume
  – Usability
  – Ease of Installation

• Make Hadoop an enterprise-viable platform
  – Installation & Configuration
  – Administration
  – Monitoring
  – Data Extract & Load

• Provide open APIs and data services
  – Load and process data
  – Enterprise data services

Next-Generation Data Architecture
[Figure: data sources, a Big Data Refinery (Apache Hadoop), Business Transactions & Interactions, and Business Intelligence & Analytics]

• Data sources
  – Audio, Video, Images
  – Docs, Text, XML
  – Web Logs, Clicks
  – Social, Graph, Feeds
  – Sensors, Devices, RFID
  – Spatial, GPS
  – Events, Other

• Big Data Refinery
  – Apache Hadoop

• Business Transactions & Interactions
  – Web, Mobile, CRM, ERP, SCM, …
  – SQL, NoSQL, NewSQL

• Business Intelligence & Analytics
  – Dashboards, Reports, Visualization, …
  – EDW, MPP, NewSQL

Maximizing the Value from ALL of your Data
[Figure: numbered data flows between data sources, the Big Data Refinery, Business Transactions & Interactions, and Business Intelligence & Analytics]

1   Classic ETL processing

2   Store, aggregate, and transform multi-structured data to unlock value

3   Share refined data and runtime models

4   Retain runtime models and historical data for ongoing refinement & analysis

5   Retain historical data to unlock additional value

• Data sources
  – Audio, Video, Images
  – Docs, Text, XML
  – Web Logs, Clicks
  – Social, Graph, Feeds
  – Sensors, Devices, RFID
  – Spatial, GPS
  – Events, Other

• Business Transactions & Interactions
  – Web, Mobile, CRM, ERP, SCM, …

• Business Intelligence & Analytics
  – Dashboards, Reports, Visualization, …

Topics
• Big Data Market Overview

• Hortonworks Company & Strategy Overview

• Hortonworks Offerings
  – Hortonworks Data Platform Subscriptions
  – Public & On-site Training
  – Expert Short-term Consulting Services




Balancing Innovation & Stability
• Apache: Be aggressive - ship early and often
  – Projects need to keep innovating and visibly improve
  – Aim for big improvements on trunk
  – Make early buggy releases


• Hortonworks: Be predictable - ship when stable
  – We need to ship stable, working releases
  – Make packaged binary releases available
  – We need to do regular sustaining engineering releases
  – QA for stable Hadoop releases
  – HDP quarterly release trains sweep in stable Apache projects
       – Keeps HDP reasonably current and predictable while minimizing the risk of
         thrashing that coordinating a large number of Apache projects can cause




Hadoop Now, Next, and Beyond
  Apache community, including Hortonworks, investing to improve Hadoop:
  • Make Hadoop an open, extensible, and enterprise viable platform
  • Enable more applications to run on Apache Hadoop

[Figure: three-step progression]
  • “Hadoop.Now” (Hadoop 1.0), HDP 1
    – Most stable Hadoop ever
  • “Hadoop.Next” (Hadoop 2.x), HDP 2
    – Next-gen MapReduce & HDFS
  • “Hadoop.Beyond”
    – Integrate w/ecosystem




Hortonworks Support Subscriptions
Objective: help organizations to successfully develop
and deploy solutions based upon Apache Hadoop
• Full-lifecycle technical support available
  – Developer support for design, development and POCs
  – Production support for staging and production environments
       – Up to 24x7 with 1-hour response times

• Delivered by the Apache Hadoop experts
  – Backed by development team that has released every major
    version of Apache Hadoop since 0.1

• Forward-compatibility
  – Hortonworks’ leadership role helps ensure bug fixes and patches
    can be included in future versions of Hadoop projects



Cluster Subscriptions
• Starter
  – Unit: 3 month
  – Access: Web, Monday to Friday, 6am to 6pm PT
  – Incidents: Unlimited
  – Response: Business Day

• Standard (Per Cluster)
  – Unit: 20 Nodes w/ 250TB of Storage (Compute or Storage Expansion)
  – Access: Web, Monday to Friday, 6am to 6pm PT
  – Incidents: Unlimited
  – Response: Business Day

• Enterprise (Per Cluster)
  – Unit: 20 Nodes w/ 250TB of Storage (Compute or Storage Expansion)
  – Access: Web and Phone, 24 x 7
  – Incidents: Unlimited
  – Response: Priority 1: 1 Hour; Priority 2: 4 Hours; Priority 3: 8 Hours / Biz Day

• Supported Software (all subscriptions)
  – Hortonworks Data Platform (HDP) and patches and updates for HDP
  – Software acquired via Hortonworks website and Cluster Subscriptions

• Support Coverage (all subscriptions)
  – Cluster operators can interact with the expert Hortonworks support staff during the
    proof-of-concept, staging and deployment phases.
  – We Support: Configuration and installation questions, explanation of routine
    maintenance, analysis of performance issues, diagnosis of system or application
    issues and any bug fixes or patches that may be necessary.
  – We Don’t Support: Production issues with customer code, end-to-end debugging of
    customer code, development of customer code, 3rd-party products used during
    development and deployment.

Developer Subscription
• Price: Per Developer

• Supported Software
  – Hortonworks Data Platform (HDP) and patches and updates for HDP
  – Software acquired via Hortonworks website, Cluster Subscriptions, or Virtual/Cloud
    Sandbox environments

• Support Coverage
  – Developers can interact with the expert Hortonworks support staff to receive guidance
    on the use of the software and answers for “how-to” questions.
  – We Support: Design advice, performance tuning advice, code snippet review and
    advice, problem diagnosis, bug reports, and other development related questions.
  – We Don't Support: Production issues with customer code, end-to-end debugging of
    customer code, development of customer code, 3rd-party products used during
    development and deployment.

• Access: Web, Monday to Friday, 6am to 6pm PT
• Incidents: Unlimited
• Response: 4 Hours / Business Day


Hortonworks Training
Objective: help organizations overcome Hadoop
knowledge gaps
• Expert role-based training for developers,
  administrators & data analysts
  – Heavy emphasis on hands-on labs
  – Extensive schedule of public training courses available
    (hortonworks.com/training)

• Comprehensive certification programs



• Customized, on-site courses available

Hortonworks Architectural Services
• Services team dedicated to Hadoop Architecture and
  Optimization
  – Extensive cluster experience from smaller <100 clusters to the
    largest in the world
  – Recognized technical experts on Hadoop
• We work closely with the technical teams to
  understand the business need and use case
  – Translate the needs and use cases to technical requirements
  – Call out other considerations, based on our extensive knowledge,
    for growing and expanding clusters
• Designed for short-term, high-impact knowledge
  transfer and assistance
  – Complement internal technical team and SI

Thank You!
Questions & Answers





Contenu connexe

Tendances

ScaleBase Webinar: Methods and Challenges to Scale Out a MySQL Database
ScaleBase Webinar: Methods and Challenges to Scale Out a MySQL DatabaseScaleBase Webinar: Methods and Challenges to Scale Out a MySQL Database
ScaleBase Webinar: Methods and Challenges to Scale Out a MySQL DatabaseScaleBase
 
Compegence: Nagaraj Kulkarni - Hadoop and No SQL_TDWI_2011Jul23_Preso
Compegence: Nagaraj Kulkarni - Hadoop and No SQL_TDWI_2011Jul23_PresoCompegence: Nagaraj Kulkarni - Hadoop and No SQL_TDWI_2011Jul23_Preso
Compegence: Nagaraj Kulkarni - Hadoop and No SQL_TDWI_2011Jul23_PresoCOMPEGENCE
 
B13 Driving Business Intelligence John Robson
B13 Driving Business Intelligence John RobsonB13 Driving Business Intelligence John Robson
B13 Driving Business Intelligence John RobsonProvoke Solutions
 
Introduccion a SQL Server Master Data Services
Introduccion a SQL Server Master Data ServicesIntroduccion a SQL Server Master Data Services
Introduccion a SQL Server Master Data ServicesEduardo Castro
 
Manthan biim services and solutions
Manthan   biim services  and solutionsManthan   biim services  and solutions
Manthan biim services and solutionsJaikumar Karuppannan
 
CDM SIG: Fusion MDM for Customer Highlights [2010 OAUG Collaborate]
CDM SIG: Fusion MDM for Customer Highlights [2010 OAUG Collaborate]CDM SIG: Fusion MDM for Customer Highlights [2010 OAUG Collaborate]
CDM SIG: Fusion MDM for Customer Highlights [2010 OAUG Collaborate]Rhapsody Technologies, Inc.
 
Module 3 Adapative Customer Experience Final
Module 3 Adapative Customer Experience FinalModule 3 Adapative Customer Experience Final
Module 3 Adapative Customer Experience FinalVivastream
 
To Each Their Own: How to Solve Analytic Complexity
To Each Their Own: How to Solve Analytic ComplexityTo Each Their Own: How to Solve Analytic Complexity
To Each Their Own: How to Solve Analytic ComplexityInside Analysis
 
B13 Driving Business Intelligence
B13 Driving Business IntelligenceB13 Driving Business Intelligence
B13 Driving Business IntelligenceJohnRobson
 
The Evolution of Platforms - Drew Kurth and Matt Comstock
The Evolution of Platforms - Drew Kurth and Matt ComstockThe Evolution of Platforms - Drew Kurth and Matt Comstock
The Evolution of Platforms - Drew Kurth and Matt ComstockRazorfish
 
Scaling MySQL: Benefits of Automatic Data Distribution
Scaling MySQL: Benefits of Automatic Data DistributionScaling MySQL: Benefits of Automatic Data Distribution
Scaling MySQL: Benefits of Automatic Data DistributionScaleBase
 
The Forrester Wave Enterprise Hadoop Solutions Q1 2012
The Forrester Wave Enterprise Hadoop Solutions Q1 2012The Forrester Wave Enterprise Hadoop Solutions Q1 2012
The Forrester Wave Enterprise Hadoop Solutions Q1 2012m_hepburn
 
SunCorp Campaign Measurement
SunCorp Campaign MeasurementSunCorp Campaign Measurement
SunCorp Campaign MeasurementDatalicious
 
Module 2 Orchestration and Interaction Final
Module 2 Orchestration and Interaction FinalModule 2 Orchestration and Interaction Final
Module 2 Orchestration and Interaction FinalVivastream
 
Envision IT Seminar Presentation - Microsoft Office 365
Envision IT Seminar Presentation - Microsoft Office 365 Envision IT Seminar Presentation - Microsoft Office 365
Envision IT Seminar Presentation - Microsoft Office 365 Envision IT
 
Bizs Datasheet Gourangi 2009
Bizs Datasheet Gourangi 2009Bizs Datasheet Gourangi 2009
Bizs Datasheet Gourangi 2009soumadeep
 
Case Study: HP Products
Case Study: HP ProductsCase Study: HP Products
Case Study: HP Productsjzeiger
 
Informatica World 2006 - MDM Data Quality
Informatica World 2006 - MDM Data QualityInformatica World 2006 - MDM Data Quality
Informatica World 2006 - MDM Data QualityDatabase Architechs
 

Tendances (19)

ScaleBase Webinar: Methods and Challenges to Scale Out a MySQL Database
ScaleBase Webinar: Methods and Challenges to Scale Out a MySQL DatabaseScaleBase Webinar: Methods and Challenges to Scale Out a MySQL Database
ScaleBase Webinar: Methods and Challenges to Scale Out a MySQL Database
 
Compegence: Nagaraj Kulkarni - Hadoop and No SQL_TDWI_2011Jul23_Preso
Compegence: Nagaraj Kulkarni - Hadoop and No SQL_TDWI_2011Jul23_PresoCompegence: Nagaraj Kulkarni - Hadoop and No SQL_TDWI_2011Jul23_Preso
Compegence: Nagaraj Kulkarni - Hadoop and No SQL_TDWI_2011Jul23_Preso
 
B13 Driving Business Intelligence John Robson
B13 Driving Business Intelligence John RobsonB13 Driving Business Intelligence John Robson
B13 Driving Business Intelligence John Robson
 
Introduccion a SQL Server Master Data Services
Introduccion a SQL Server Master Data ServicesIntroduccion a SQL Server Master Data Services
Introduccion a SQL Server Master Data Services
 
Information Worker
Information WorkerInformation Worker
Information Worker
 
Manthan biim services and solutions
Manthan   biim services  and solutionsManthan   biim services  and solutions
Manthan biim services and solutions
 
CDM SIG: Fusion MDM for Customer Highlights [2010 OAUG Collaborate]
CDM SIG: Fusion MDM for Customer Highlights [2010 OAUG Collaborate]CDM SIG: Fusion MDM for Customer Highlights [2010 OAUG Collaborate]
CDM SIG: Fusion MDM for Customer Highlights [2010 OAUG Collaborate]
 
Module 3 Adapative Customer Experience Final
Module 3 Adapative Customer Experience FinalModule 3 Adapative Customer Experience Final
Module 3 Adapative Customer Experience Final
 
To Each Their Own: How to Solve Analytic Complexity
To Each Their Own: How to Solve Analytic ComplexityTo Each Their Own: How to Solve Analytic Complexity
To Each Their Own: How to Solve Analytic Complexity
 
B13 Driving Business Intelligence
B13 Driving Business IntelligenceB13 Driving Business Intelligence
B13 Driving Business Intelligence
 
The Evolution of Platforms - Drew Kurth and Matt Comstock
The Evolution of Platforms - Drew Kurth and Matt ComstockThe Evolution of Platforms - Drew Kurth and Matt Comstock
The Evolution of Platforms - Drew Kurth and Matt Comstock
 
Scaling MySQL: Benefits of Automatic Data Distribution
Scaling MySQL: Benefits of Automatic Data DistributionScaling MySQL: Benefits of Automatic Data Distribution
Scaling MySQL: Benefits of Automatic Data Distribution
 
The Forrester Wave Enterprise Hadoop Solutions Q1 2012
The Forrester Wave Enterprise Hadoop Solutions Q1 2012The Forrester Wave Enterprise Hadoop Solutions Q1 2012
The Forrester Wave Enterprise Hadoop Solutions Q1 2012
 
SunCorp Campaign Measurement
SunCorp Campaign MeasurementSunCorp Campaign Measurement
SunCorp Campaign Measurement
 
Module 2 Orchestration and Interaction Final
Module 2 Orchestration and Interaction FinalModule 2 Orchestration and Interaction Final
Module 2 Orchestration and Interaction Final
 
Envision IT Seminar Presentation - Microsoft Office 365
Envision IT Seminar Presentation - Microsoft Office 365 Envision IT Seminar Presentation - Microsoft Office 365
Envision IT Seminar Presentation - Microsoft Office 365
 
Bizs Datasheet Gourangi 2009
Bizs Datasheet Gourangi 2009Bizs Datasheet Gourangi 2009
Bizs Datasheet Gourangi 2009
 
Case Study: HP Products
Case Study: HP ProductsCase Study: HP Products
Case Study: HP Products
 
Informatica World 2006 - MDM Data Quality
Informatica World 2006 - MDM Data QualityInformatica World 2006 - MDM Data Quality
Informatica World 2006 - MDM Data Quality
 

Similaire à 2012 06 hortonworks paris hug

Talend Open Studio and Hortonworks Data Platform
Talend Open Studio and Hortonworks Data PlatformTalend Open Studio and Hortonworks Data Platform
Talend Open Studio and Hortonworks Data PlatformHortonworks
 
Hortonworks Data Platform for Systems Integrators Webinar 9-5-2012.pptx
Hortonworks Data Platform for Systems Integrators Webinar 9-5-2012.pptxHortonworks Data Platform for Systems Integrators Webinar 9-5-2012.pptx
Hortonworks Data Platform for Systems Integrators Webinar 9-5-2012.pptxHortonworks
 
Hortonworks roadshow
Hortonworks roadshowHortonworks roadshow
Hortonworks roadshowAccenture
 
The Next Generation of Big Data Analytics
The Next Generation of Big Data AnalyticsThe Next Generation of Big Data Analytics
The Next Generation of Big Data AnalyticsHortonworks
 
Introduction to Hortonworks Data Platform for Windows
Introduction to Hortonworks Data Platform for WindowsIntroduction to Hortonworks Data Platform for Windows
Introduction to Hortonworks Data Platform for WindowsHortonworks
 
Big Data, Hadoop, Hortonworks and Microsoft HDInsight
Big Data, Hadoop, Hortonworks and Microsoft HDInsightBig Data, Hadoop, Hortonworks and Microsoft HDInsight
Big Data, Hadoop, Hortonworks and Microsoft HDInsightHortonworks
 
Break Through the Traditional Advertisement Services with Big Data and Apache...
Break Through the Traditional Advertisement Services with Big Data and Apache...Break Through the Traditional Advertisement Services with Big Data and Apache...
Break Through the Traditional Advertisement Services with Big Data and Apache...Hortonworks
 
Hadoop: What It Is and What It's Not
Hadoop: What It Is and What It's NotHadoop: What It Is and What It's Not
Hadoop: What It Is and What It's NotInside Analysis
 
Hadoop's Role in the Big Data Architecture, OW2con'12, Paris
Hadoop's Role in the Big Data Architecture, OW2con'12, ParisHadoop's Role in the Big Data Architecture, OW2con'12, Paris
Hadoop's Role in the Big Data Architecture, OW2con'12, ParisOW2
 
Powering Next Generation Data Architecture With Apache Hadoop
Powering Next Generation Data Architecture With Apache HadoopPowering Next Generation Data Architecture With Apache Hadoop
Powering Next Generation Data Architecture With Apache HadoopHortonworks
 
Hadoop's Opportunity to Power Next-Generation Architectures
Hadoop's Opportunity to Power Next-Generation ArchitecturesHadoop's Opportunity to Power Next-Generation Architectures
Hadoop's Opportunity to Power Next-Generation ArchitecturesDataWorks Summit
 
The Comprehensive Approach: A Unified Information Architecture
The Comprehensive Approach: A Unified Information ArchitectureThe Comprehensive Approach: A Unified Information Architecture
The Comprehensive Approach: A Unified Information ArchitectureInside Analysis
 
Unified big data architecture
Unified big data architectureUnified big data architecture
Unified big data architectureDataWorks Summit
 
Hw09 Data Processing In The Enterprise
Hw09   Data Processing In The EnterpriseHw09   Data Processing In The Enterprise
Hw09 Data Processing In The EnterpriseCloudera, Inc.
 
Anexinet Big Data Solutions
Anexinet Big Data SolutionsAnexinet Big Data Solutions
Anexinet Big Data SolutionsMark Kromer
 
Hadoop India Summit, Feb 2011 - Informatica
Hadoop India Summit, Feb 2011 - InformaticaHadoop India Summit, Feb 2011 - Informatica
Hadoop India Summit, Feb 2011 - InformaticaSanjeev Kumar
 
Introducing the Big Data Ecosystem with Caserta Concepts & Talend
Introducing the Big Data Ecosystem with Caserta Concepts & TalendIntroducing the Big Data Ecosystem with Caserta Concepts & Talend
Introducing the Big Data Ecosystem with Caserta Concepts & TalendCaserta
 


2012 06 hortonworks paris hug

  • 1. Hortonworks Enabling Apache Hadoop to power next-generation enterprise data architectures June 2012 © Hortonworks Inc. 2012 Page 1
  • 2. Topics • Big Data Market Overview • Hortonworks Company & Strategy Overview • Hortonworks Offerings – Hortonworks Data Platform Subscriptions – Public & On-site Training – Expert Short-term Consulting Services Page 2 © Hortonworks Inc. 2012
  • 3. Big Data = Transactions + Interactions + Observations BIG DATA Sensors / RFID / Devices User Generated Content Petabytes Mobile Web Sentiment Social Interactions & Feeds User Click Stream Spatial & GPS Coordinates Web logs WEB A/B testing Terabytes External Demographics Offer history Dynamic Pricing Business Data Feeds Affiliate Networks CRM HD Video, Audio, Images Gigabytes Segmentation Search Marketing Offer details Speech to Text ERP Behavioral Targeting Purchase detail Customer Touches Product/Service Logs Megabytes Purchase record Support Contacts Dynamic Funnels Payment record SMS/MMS Increasing Data Variety and Complexity Source: Contents of above graphic created in partnership with Teradata, Inc. Page 3 © Hortonworks Inc. 2012
  • 4. What is Apache Hadoop? • Collection of Open Source Projects – Apache Software Foundation (ASF) – Loosely coupled, ship early/often • Foundation for Big Data Solutions – Stores petabytes of data reliably – Hadoop Distributed File System – Runs highly distributed computations – Hadoop MapReduce framework – Enables a rational economics model – Commodity servers & storage – Powers data-driven business. Callout: One of the best examples of open source driving innovation and creating a market. Page 4 © Hortonworks Inc. 2012
  • 5. 7 Key Drivers for Hadoop. Business Pressure: (1) Opportunity to enable innovative new business models; (2) Potential new insights that drive competitive advantage. Technical Pressure: (3) Data collected and stored continues to grow exponentially; (4) Data is increasingly everywhere and in many formats; (5) Traditional solutions not designed for new requirements. Financial Pressure: (6) Cost of data systems, as % of IT spend, continues to grow; (7) Cost advantages of commodity hardware & open source. Page 5 © Hortonworks Inc. 2012
  • 6. 3 Phases of Hadoop Adoption
      – Educate/Evaluate. Timeline: 1 - 12 months. Stage: Awareness, adoption and proof of enterprise viability. Description: See it -> Learn it -> Do it; evaluation, exploration, POCs, Dev & Admin training. Key Questions: What are the potential use cases? Which one should I focus on? How do I get value now? Where does Hadoop fit in my data architecture? Can I leverage my existing tools/platforms? Can I replace any of my existing systems?
      – Initial Production. Timeline: 9 - 24 months. Stage: Departmental production usage. Description: Single business use case, focused solution architecture. Key Questions: Can the solution enable future business models? Am I maximizing the value from the chosen use case? How does this solution interact within our departmental data architecture? How do I operationalize the solution?
      – Wide-scale Production. Timeline: 18 - 36 months. Stage: Enterprise wide production usage. Description: Multiple use cases, broader solution architecture. Key Questions: How can the solution be leveraged enterprise-wide? What is required to enable, integrate, operate at scale? What does our next-generation data architecture look like? How can I maximize access to data while minimizing risk?
    Page 6 © Hortonworks Inc. 2012
  • 7. What’s Needed to Accelerate Adoption? • Enterprise tooling to become a complete data platform – Open deployment & provisioning – Higher quality data loading – Monitoring and management – APIs for easy and efficient integration • Ecosystem support & development – Existing infrastructure vendors need to continue to integrate – Apps need to continue to be developed on this infrastructure • Market to rally around core Apache Hadoop – To avoid splintering/market fragmentation – To accelerate adoption Page 7 © Hortonworks Inc. 2012
  • 8. Topics • Big Data Market Overview • Hortonworks Company & Strategy Overview • Hortonworks Offerings – Hortonworks Data Platform Subscriptions – Public & On-site Training – Expert Architectural Services Page 8 © Hortonworks Inc. 2012
  • 9. Hortonworks Vision & Role We believe that by the end of 2015, more than half the world's data will be processed by Apache Hadoop. 1 Make Hadoop easy to use and consume 2 Make Hadoop an enterprise-viable data platform 3 Provide open APIs and data services 4 Enable ecosystem at each layer of the data stack 5 Be stewards of the core and innovators on the edges Page 9 © Hortonworks Inc. 2012
  • 10. Hortonworks Strategy • Lead within Hadoop Community – Team has delivered every major Hadoop release since 0.1 – Experience managing world’s largest deployment – Ongoing access to Y!’s 1,000+ users and 40k+ nodes for testing, QA, etc. • Embrace & Enable Hadoop Ecosystem – 100% open source software – Full lifecycle support subscriptions – Expert role-based training – Enable solution architectures Page 10 © Hortonworks Inc. 2012
  • 11. Enable Hadoop to be Next-Gen Data Platform Enable the ecosystem at each layer Make Hadoop easy to use/consume • Usability Applications & Solutions • Ease of Installation Tools & Languages Make Hadoop ent viable platform BI & Analytics Data Management Systems Installation & Configuration Data Movement & Integration Administration Infrastructure Platform Hortonworks Monitoring Data Platform Data Extract & Load Load and process data Enterprise data services Provide open APIs and data services Page 12 © Hortonworks Inc. 2012
  • 12. Next-Generation Data Architecture [Diagram] Data sources: Audio, Video, Images; Docs, Text, XML; Web Logs, Clicks; Social, Graph, Feeds; Sensors, Devices, RFID; Spatial, GPS; Events, Other. Center: Big Data Refinery (Apache Hadoop). Business Transactions & Interactions: Web, Mobile, CRM, ERP, SCM, …; SQL, NoSQL, NewSQL. Business Intelligence & Analytics: EDW, MPP, NewSQL; Dashboards, Reports, Visualization, … Page 14 © Hortonworks Inc. 2012
  • 13. Maximizing the Value from ALL of your Data [Diagram] Data sources: Audio, Video, Images; Docs, Text, XML; Web Logs, Clicks; Social, Graph, Feeds; Sensors, Devices, RFID; Spatial, GPS; Events, Other. Components: Big Data Refinery; Business Transactions & Interactions (Web, Mobile, CRM, ERP, SCM, …); Business Intelligence & Analytics (Dashboards, Reports, Visualization, …). Numbered flows: 1 Classic ETL processing. 2 Store, aggregate, and transform multi-structured data to unlock value. 3 Share refined data and runtime models. 4 Retain runtime models and historical data for ongoing refinement & analysis. 5 Retain historical data to unlock additional value. Page 15 © Hortonworks Inc. 2012
  • 14. Topics • Big Data Market Overview • Hortonworks Company & Strategy Overview • Hortonworks Offerings – Hortonworks Data Platform Subscriptions – Public & On-site Training – Expert Short-term Consulting Services Page 16 © Hortonworks Inc. 2012
  • 15. Balancing Innovation & Stability • Apache: Be aggressive - ship early and often – Projects need to keep innovating and visibly improve – Aim for big improvements on trunk – Make early buggy releases • Hortonworks: Be predictable - ship when stable – We need to ship stable, working releases – Make packaged binary releases available – We need to do regular sustaining engineering releases – QA for stable Hadoop releases – HDP quarterly release trains sweep in stable Apache projects – Enables HDP to stay reasonably current and predictable while minimizing risk of thrashing that coordinating large # of Apache projects can cause Page 17 © Hortonworks Inc. 2012
  • 16. Hadoop Now, Next, and Beyond Apache community, including Hortonworks investing to improve Hadoop: • Make Hadoop an open, extensible, and enterprise viable platform • Enable more applications to run on Apache Hadoop “Hadoop.Beyond” Integrate w/ecosystem “Hadoop.Next” (Hadoop 2.x) HDP 2 “Hadoop.Now” Next-gen MapReduce & HDFS (Hadoop 1.0) HDP 1 Most stable Hadoop ever Page 18 © Hortonworks Inc. 2012
  • 17. Hortonworks Support Subscriptions Objective: help organizations to successfully develop and deploy solutions based upon Apache Hadoop • Full-lifecycle technical support available – Developer support for design, development and POCs – Production support for staging and production environments – Up to 24x7 with 1-hour response times • Delivered by the Apache Hadoop experts – Backed by development team that has released every major version of Apache Hadoop since 0.1 • Forward-compatibility – Hortonworks’ leadership role helps ensure bug fixes and patches can be included in future versions of Hadoop projects Page 19 © Hortonworks Inc. 2012
  • 18. Cluster Subscriptions Starter Standard Enterprise Per Cluster Per Cluster Unit 3 month 20 Nodes w/ 250TB of Storage 20 Nodes w/ 250TB of Storage (Compute or Storage Expansion) (Compute or Storage Expansion) Supported Hortonworks Data Platform (HDP) and patches and updates for HDP. Software Software acquired via Hortonworks website and Cluster Subscriptions. Cluster operators can interact with the expert Hortonworks support staff during the proof-of-concept, staging and deployment phases. We Support: Configuration and installation questions, explanation of routine Support maintenance, analysis of performance issues, diagnosis of system or application Coverage issues and any bug fixes or patches that may be necessary. We Don’t Support: Production issues with customer code, end-to-end debugging of customer code, development of customer code, 3rd-party products used during development and deployment. Web, Monday to Friday, Web, Monday to Friday, Access 6am to 6pm PT 6am to 6pm PT Web and Phone, 24 x 7 Incidents Unlimited Unlimited Unlimited Priority 1: 1 Hour Response Business Day Business Day Priority 2: 4 Hours Priority 3: 8 Hours / Biz Day Page 20 © Hortonworks Inc. 2012
  • 19. Developer Subscription Developer Price Per Developer Hortonworks Data Platform (HDP) and patches and updates for HDP. Software acquired via Hortonworks website and Cluster Subscriptions. Supported Software Software acquired via Hortonworks website, Cluster Subscriptions, or Virtual/Cloud Sandbox environments. Developers can interact with the expert Hortonworks support staff to receive guidance on the use of the software and answers for “how-to” questions. We Support: Design advice, performance tuning advice, code snippet review and Support advice, problem diagnosis, bug reports, and other development related questions. Coverage We Don't Support: Production issues with customer code, end-to-end debugging of customer code, development of customer code, 3rd-party products used during development and deployment. Access Web, Monday to Friday, 6am to 6pm PT Incidents Unlimited Response 4 Hours / Business Day Page 21 © Hortonworks Inc. 2012
  • 20. Hortonworks Training
    Objective: help organizations overcome Hadoop knowledge gaps
    • Expert role-based training for developers, administrators & data analysts
      – Heavy emphasis on hands-on labs
      – Extensive schedule of public training courses available (hortonworks.com/training)
    • Comprehensive certification programs
    • Customized, on-site courses available
  • 21. Hortonworks Architectural Services
    • Services team dedicated to Hadoop Architecture and Optimization
      – Extensive cluster experience from smaller <100 clusters to the largest in the world
      – Recognized technical experts on Hadoop
    • We work closely with the technical teams to understand the business need and use case
      – Translate the needs and use cases to technical requirements
      – Call out other considerations based on our extensive knowledge for growing and expanding clusters
    • Designed for short-term high-impact knowledge transfer and assist
      – Complement internal technical team and SI
  • 22. Thank You! Questions & Answers

Editor's notes

  1. Life used to be simple and very transactional in nature. Early 90’s, ERP: transactions count your sales by customer by location. Late 90’s: the age of segmentation and targeted offers; merge customer operations with marketing. Now, life is more complex, connected, and interactional in nature! Digital marketing enables measurement of interactions across channels. Social networks, mobile commerce, and user-generated content increase the types and volumes of data generated by system-to-system communication and by data exhaust from customer behavior such as click-streams. And big data is just beginning: we don’t even list all the sensors, telematics, and other machine-generated data, which is predicted to eclipse even the data generated by social networks.
  2. Facts that bolster this vision include: 80% to 90% of the world’s data is unstructured or semi-structured (Forrester, IDC, Gartner all agree). Data volumes have increased exponentially over the past decade and are continuing to do so (IDC, McKinsey reports). Hadoop is uniquely designed to store and process this type of data…at scale…across commodity systems. The major server and storage platform vendors are all creating Hadoop-focused strategies.
  3. Apache Hadoop Leadership. Sanjay Radia: HDFS Core Lead Architect, 4+ years on Hadoop. Major projects include Append v2, Capacity scheduler, Federation and HA. Owen O’Malley: The leading committer of code to Hadoop. 5+ years on Hadoop. Original Hadoop architect at Yahoo!. Drove the implementation of Security throughout the project. Arun Murthy: Original MapReduce Lead. 5+ years on Hadoop. Currently lead architect and release manager of Apache Hadoop 0.23. Matt Foley: Release manager of Apache Hadoop 0.20.205. Former Director of Engineering for Yahoo! Mail, now running Hortonworks’ Quality and Release efforts. Devaraj Das: Built the original MapReduce development team at Yahoo!. 5+ years on Hadoop. Now leading the Apache Ambari (Hadoop Management) project. Alan Gates: Lead of Pig and HCatalog. 3+ years on Hadoop.
  4. Infrastructure Platform (Servers, Storage, Network, Operating System, Virtualization, Cloud). Systems Management (Installation, Configuration, Administration, Monitoring, Performance, Security Mgmt, Capacity Mgmt, Quality of Service). Data Management Systems (SQL, NoSQL, NewSQL, EDW, Datamarts, MPP DBs, Search, Indexing, MDM, etc.). Data Movement & Integration (ETL, Data Quality, Integration Middleware, Event Processing). Tools & Languages (IDEs, Programming Languages, other tools). Business Intelligence & Analytics (Analytics, Reporting, Visualization, and Dashboards). Applications & Solutions (SaaS offerings, bundled solutions, etc.).
  5. Infrastructure Platform (Servers, Storage, Network, Operating System, Virtualization, Cloud). Systems Management (Installation, Configuration, Administration, Monitoring, Performance, Security Mgmt, Capacity Mgmt, Quality of Service). Data Management Systems (SQL, NoSQL, NewSQL, EDW, Datamarts, MPP DBs, Search, Indexing, MDM, etc.). Data Movement & Integration (ETL, Data Quality, Integration Middleware, Event Processing). Tools & Languages (IDEs, Programming Languages, other tools). Business Intelligence & Analytics (Analytics, Reporting, Visualization, and Dashboards). Applications & Solutions (SaaS offerings, bundled solutions, etc.).
  6. In the graphic above, Apache Hadoop acts as the Big Data Refinery. It’s great at storing, aggregating, and transforming multi-structured data into more useful and valuable formats. Apache Hive is a Hadoop-related component that fits within the Business Intelligence & Analytics category since it is commonly used for querying and analyzing data within Hadoop in a SQL-like manner. Apache Hadoop can also be integrated with other EDW, MPP, and NewSQL components such as Teradata, Aster Data, HP Vertica, IBM Netezza, EMC Greenplum, SAP Hana, Microsoft SQL Server PDW and many others. Apache HBase is a Hadoop-related NoSQL Key/Value store that is commonly used for building highly responsive next-generation applications. Apache Hadoop can also be integrated with other SQL, NoSQL, and NewSQL technologies such as Oracle, MySQL, PostgreSQL, Microsoft SQL Server, IBM DB2, MongoDB, DynamoDB, MarkLogic, Riak, Redis, Neo4J, Terracotta, GemFire, SQLFire, VoltDB and many others. Finally, data movement and integration technologies help ensure data flows seamlessly between the systems in the above diagrams; the lines in the graphic are powered by technologies such as WebHDFS, Apache HCatalog, Apache Sqoop, Talend Open Studio for Big Data, Informatica, Pentaho, SnapLogic, Splunk, Attunity and many others.
  7. At the highest level, I describe three broad areas of data processing and outline how these areas interconnect. The three areas are: 1. Business Transactions & Interactions; 2. Business Intelligence & Analytics; 3. Big Data Refinery. The graphic illustrates a vision for how these three types of systems can interconnect in ways aimed at deriving maximum value from all forms of data. Enterprise IT has been connecting systems via classic ETL processing, as illustrated in Step 1 above, for many years in order to deliver structured and repeatable analysis. In this step, the business determines the questions to ask and IT collects and structures the data needed to answer those questions. The “Big Data Refinery”, as highlighted in Step 2, is a new system capable of storing, aggregating, and transforming a wide range of multi-structured raw data sources into usable formats that help fuel new insights for the business. The Big Data Refinery provides a cost-effective platform for unlocking the potential value within data and discovering the business questions worth answering with this data. A popular example of big data refining is processing Web logs, clickstreams, social interactions, social feeds, and other user-generated data sources into more accurate assessments of customer churn or more effective creation of personalized offers. More interestingly, there are businesses deriving value from processing large video, audio, and image files. Retail stores, for example, are leveraging in-store video feeds to help them better understand how customers navigate the aisles as they find and purchase products. Retailers that provide optimized shopping paths and intelligent product placement within their stores are able to drive more revenue for the business.
In this case, while the video files may be big in size, the refined output of the analysis is typically small in size but potentially big in value. The Big Data Refinery platform provides fertile ground for new types of tools and data processing workloads to emerge in support of rich multi-level data refinement solutions. With that as backdrop, Step 3 takes the model further by showing how the Big Data Refinery interacts with the systems powering Business Transactions & Interactions and Business Intelligence & Analytics. Interacting in this way opens up the ability for businesses to get a richer and more informed 360° view of customers, for example. By directly integrating the Big Data Refinery with existing Business Intelligence & Analytics solutions that contain much of the transactional information for the business, companies can enhance their ability to more accurately understand the customer behaviors that lead to the transactions. Moreover, systems focused on Business Transactions & Interactions can also benefit from connecting with the Big Data Refinery. Complex analytics and calculations of key parameters can be performed in the refinery and flow downstream to fuel runtime models powering business applications with the goal of more accurately targeting customers with the best and most relevant offers, for example. Since the Big Data Refinery is great at retaining large volumes of data for long periods of time, the model is completed with the feedback loops illustrated in Steps 4 and 5. Retaining the past 10 years of historical “Black Friday” retail data, for example, can benefit the business, especially if it’s blended with other data sources such as 10 years of weather data accessed from a third-party data provider. The point here is that the opportunities for creating value from multi-structured data sources available inside and outside the enterprise are virtually endless if you have a platform that can do it cost-effectively and at scale.
  8. “Node” means a Server or Virtual Machine capable of running the Software. “Server” means a single hardware system capable of running the Software. A hardware partition or blade is considered a separate hardware system. “Virtual Machine” means a software container that can run its own operating system and execute applications like a physical machine. “Cluster” means two or more Nodes that are interconnected for the purposes of executing application programs and sharing data. “Storage” means the total available storage space, also known as raw capacity, within the cluster.
  9. I want to be careful with how we present services… they do want people to come onsite for extended engagements.
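
The definitions in note 8 (Node, Cluster, Storage) can be sketched as a small data model. This is only an illustration under stated assumptions, not anything Hortonworks ships: the class names, the `raw_storage_tb` field, and the 12.5 TB per-node figure are hypothetical, chosen so that 20 nodes add up to the 250TB figure mentioned on the Cluster Subscriptions slide.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Node:
    """A Server or Virtual Machine capable of running the software (note 8)."""
    name: str
    raw_storage_tb: float  # hypothetical field: raw disk capacity of this node


@dataclass(frozen=True)
class Cluster:
    """Two or more interconnected Nodes (note 8)."""
    nodes: List[Node]

    def __post_init__(self) -> None:
        # Note 8 defines a Cluster as "two or more Nodes".
        if len(self.nodes) < 2:
            raise ValueError("a Cluster needs two or more Nodes")

    @property
    def storage_tb(self) -> float:
        # "Storage" is raw capacity: a plain sum over the nodes, with no
        # deduction for replication or filesystem overhead.
        return sum(node.raw_storage_tb for node in self.nodes)


# Assumed example: 20 nodes of 12.5 TB each (illustrative figures only).
cluster = Cluster([Node(f"node{i}", 12.5) for i in range(20)])
print(len(cluster.nodes), cluster.storage_tb)
```

Because Storage is defined as raw capacity, the usable space on such a cluster would be lower once data is replicated; the sketch deliberately does not model that.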