Hadoop In the Enterprise?


      Sih Lee & Peter Krey, Innovation & Shared Services
             Firmwide Engineering & Architecture


             Hadoop World, New York City, October 2nd, 2009




                                                                           © 2009 JPMorgan Chase & Co.
                                                                               All rights reserved.
                                                              Confidential and proprietary to JPMorgan Chase & Co.
Agenda

                                                                      Page



                                      JPMorgan Chase + Open Source      2


                                      Hadoop In The Enterprise?         3


                                      Active POC Pipeline               6


                                      Hadoop Positioning                7


                                      Cost Comparisons                  8


                                      Hadoop Additions & Must Haves    10
                                      Q&A                              11




                                                                             1
JPMorgan Chase + Open Source




                                  Established Multi-Year Open Source History

                                  Big Supporter of Industry Standards & Open Source Projects

                                  Numerous Production Open Source Implementations
                                   QPID (AMQP) - Top Level Apache Project (http://qpid.apache.org/)
                                   Tyger - Apache + Tomcat + Spring - Fully Integrated App Server Environment 30+ OS Components
                                   Compute Backbone (CBB) HPC Grid - 1000's of Linux Based Compute Servers
                                   MuleSoft.org (a.k.a. MuleSource) Enterprise Message Bus
                                   others …




                                                                                                      2
Hadoop In The Enterprise – Economics Driven



                                  Many Big Data Lessons Learned From Web 2.0 Community

                                  Potential For Large Capex and Opex "Dislocation"
                                    Reduced Consumption of Enterprise Premium Resources
                                    Grid Computing Economics Brought To Data Intensive Computing
                                    Stagnant Data Innovation

                                  Enabling & Potentially Disruptive Platform
                                    Many Historical Similarities
                                      Java, Linux, Tomcat, Web / Internet, …
                                      Mini's to Client / Server, Client / Server to Web, Solaris to Linux, …
                                    Key Question: What Can Be Built On Top of and Enabled by Hadoop?




                                                                                                               3
Hadoop In The Enterprise – Choice Driven




                                  Overuse of Relational Database Containers
                                    Institutional “Muscle Memory” … Not Much Else to Choose From
                                    Increasingly Large Percentage of Static Data Stored In Proprietary Transactional DB's
                                    Over-Normalized Schemas … Still Makes Sense With Cheap Compute & Storage?


                                  Enterprise Storage "Prisoners"
                                    Captive To The Economics & Technology of "A Few" Vendors
                                    Developers Need More Choice
                                    Too Much Proprietary, Single-Source Data Infrastructure
                                    Increasing Need For Minimal / No System + Storage Admins




                                                                                                       4
Hadoop In The Enterprise – Other Drivers




                                  Growing Developer Interest In "No SQL" Data Technologies
                                    Open Source, Distributed, Non-relational Databases
                                    Growing Influence Of Web 2.0 Technologies & Thinking On Enterprise
                                    Hadoop, Cassandra, HBase, Hive, CouchDB, HadoopDB, …, others
                                    memcached For Caching

                                  FSI Industry Drivers
                                    Increased Regulatory Oversight + Reporting = More Data Needed Over Longer Period Of Time
                                    Growing Need For Less Expensive Data Repository / Store
                                    Increasing Need To Support "One Off" Analysis On Large Data




                                                                                                         5
Active POC Pipeline




                                 Growing Stream of Real Projects To Gauge Hadoop "Goodness of Fit"
                                 Broad Spectrum of Use Cases
                                 Driven By Need To Impact / Dislocate OPEX + CAPEX
                                 Evaluated On Metric Based Performance, Functional, And Economic Measures
                                                                                                     6
Hadoop Positioning
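
                                  [Figure: positioning chart of data platforms by data volume and latency. Horizontal axis: data volume, from GB's (left) to TB's –> PB's (right). Vertical axis: latency, from Lower-Latency (bottom) to Higher-Latency (top). Region labels: "Index Based Access – Updates / XActns" (lower left), "Index Based Access – Analysis" (lower right), "Semi-Structured Analysis" (upper right). Plotted systems, roughly from lower left to upper right: InMemory1, SQLDB1 to SQLDB4, DW1 to DW7, Map/Reduce + HDFS.]
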
                                                                                                                                                 7
Comparative Storage Cost Bar Graph Slide


                                  "Normalized" SAN + NAS $ per GB per month versus HDFS $ per GB per month

                                  [Bar graph: six bars labeled SAN, four labeled NAS, and four labeled Hadoop; bar values are not recoverable.]
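
One way to read a "normalized" $ per GB per month figure is as total monthly cost divided by usable capacity. The sketch below illustrates that reading under stated assumptions; it is not the method behind the slide. The function name `cost_per_gb_month`, the straight-line amortization, and every input number are hypothetical. The only non-hypothetical value is the replication factor of 3, which is the HDFS default (`dfs.replication`).

```python
def cost_per_gb_month(capex_usd, amortization_months, monthly_opex_usd,
                      raw_gb, replication_factor=1):
    """Return storage cost in $ per usable GB per month.

    Assumption (not from the slides): capex is amortized straight-line
    over `amortization_months`, and usable capacity is raw capacity
    divided by the number of copies kept of each block.
    """
    if amortization_months <= 0 or raw_gb <= 0 or replication_factor < 1:
        raise ValueError("months and capacity must be positive, replication >= 1")
    usable_gb = raw_gb / replication_factor
    monthly_cost = capex_usd / amortization_months + monthly_opex_usd
    return monthly_cost / usable_gb


if __name__ == "__main__":
    # Placeholder inputs for illustration only; these are NOT figures
    # from the presentation.
    shared_array = cost_per_gb_month(
        capex_usd=36_000, amortization_months=36,
        monthly_opex_usd=200, raw_gb=12_000)
    # HDFS keeps 3 copies of each block by default, so usable capacity
    # is one third of raw capacity.
    hdfs = cost_per_gb_month(
        capex_usd=36_000, amortization_months=36,
        monthly_opex_usd=200, raw_gb=12_000, replication_factor=3)
    print(f"shared array: ${shared_array:.2f} per GB per month")
    print(f"HDFS:         ${hdfs:.2f} per GB per month")
```

Dividing raw capacity by the replication factor keeps the comparison on a usable-GB basis, so replicated HDFS storage is not credited with capacity that holds duplicate copies.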
                                                                                                             8
Enterprise Data Warehousing Costs


                                  "Normalized" bar chart utilizing retail $ per TB

                                  [Bar chart: "Data Warehouse S/W -- $K per TB". Vertical axis: $0 to $250 in $50 steps. Horizontal axis: Products. Bar values are not recoverable.]

                                                                                                9
Hadoop Additions & Must Haves




                                  Improved SQL Front-end Tool Interoperability
                                   Better Interop With Skills & Content That Firms Already Have
                                  Improved Security & ACL enforcement … Kerberos integration?
                                  Grow Developer Programming Model Skill Sets
                                  Improve Relational Container Integration & Interop For Data Archival
                                  Management & Monitoring Tools
                                  Improved Developer & Debugging Tools
                                  Reduce Latency Via Integration With Open Source Data Caching
                                   memcached, others
                                  Invitation To FSI or Enterprise Roundtable




                                                                                                         10
Q&A




                                   Sih Lee, Head of Innovation & Shared Services
                                   Firmwide Engineering & Architecture
                                   W# 212-622-3038
                                   sih.x.lee@jpmchase.com


                                   Peter Krey, Consultant, Innovation & Shared Services
                                   Firmwide Engineering & Architecture
                                   W# 212-622-2926
                                   peter.j.krey@jpmchase.com
                                                                                          11

Hw09 Data Processing In The Enterprise

  • 1. Hadoop In the Enterprise? Sih Lee & Peter Krey, Innovation & Shared Services, Firmwide Engineering & Architecture. Hadoop World, New York City, October 2nd, 2009. © 2009 JPMorgan Chase & Co. All rights reserved. Confidential and proprietary to JPMorgan Chase & Co.
  • 2. Agenda: JPMorgan Chase + Open Source; Hadoop In The Enterprise?; Active POC Pipeline; Hadoop Positioning; Cost Comparisons; Hadoop Additions & Must Haves; Q&A
  • 3. JPMorgan Chase + Open Source: Established Multi-Year Open Source History. Big Supporter of Industry Standards & Open Source Projects. Numerous Production Open Source Implementations: QPID (AMQP) - Top Level Apache Project (http://qpid.apache.org/); Tyger - Apache + Tomcat + Spring - Fully Integrated App Server Environment, 30+ OS Components; Compute Backbone (CBB) HPC Grid - 1000's of Linux Based Compute Servers; MuleSoft.org (a.k.a. MuleSource) Enterprise Message Bus; others …
  • 4. Hadoop In The Enterprise – Economics Driven: Many Big Data Lessons Learned From Web 2.0 Community. Potential For Large Capex and Opex "Dislocation". Reduced Consumption of Enterprise Premium Resources. Grid Computing Economics Brought To Data Intensive Computing. Stagnant Data Innovation. Enabling & Potentially Disruptive Platform. Many Historical Similarities: Java, Linux, Tomcat, Web / Internet, …; Mini's to Client / Server, Client / Server to Web, Solaris to Linux, … Key Question: What Can Be Built On Top of and Enabled by Hadoop?
  • 5. Hadoop In The Enterprise – Choice Driven: Overuse of Relational Database Containers. Institutional "Muscle Memory" … Not Much Else to Choose From. Increasingly Large Percentage of Static Data Stored In Proprietary Transactional DB's. Over-Normalized Schemas … Still Makes Sense With Cheap Compute & Storage? Enterprise Storage "Prisoners": Captive To The Economics & Technology of "A Few" Vendors. Developers Need More Choice. Too Much Proprietary, Single-Source Data Infrastructure. Increasing Need For Minimal / No System + Storage Admins.
  • 6. Hadoop In The Enterprise – Other Drivers: Growing Developer Interest In "No SQL" Data Technologies. Open Source, Distributed, Non-relational Databases. Growing Influence Of Web 2.0 Technologies & Thinking On Enterprise: Hadoop, Cassandra, HBase, Hive, CouchDB, HadoopDB, …, others; memcached For Caching. FSI Industry Drivers: Increased Regulatory Oversight + Reporting = More Data Needed Over Longer Period Of Time. Growing Need For Less Expensive Data Repository / Store. Increasing Need To Support "One Off" Analysis On Large Data.
  • 7. Active POC Pipeline: Growing Stream of Real Projects To Gauge Hadoop "Goodness of Fit". Broad Spectrum of Use Cases. Driven By Need To Impact / Dislocate OPEX + CAPEX. Evaluated On Metric Based Performance, Functional, And Economic Measures.
  • 8. Hadoop Positioning (diagram). Labels: Semi-Structured Analysis, Higher-Latency; Index Based Access – Updates / XActns; Index Based Access – Analysis, Lower-Latency; data volume GB's, TB's -> PB's. Plotted items: Map/Reduce + HDFS; DW1, DW2, DW3, DW4, DW5, DW6, DW7; SQLDB1, SQLDB2, SQLDB3, SQLDB4; InMemory1.
  • 9. Comparative Storage Cost (bar graph): "Normalized" SAN + NAS $ per GB per month versus HDFS $ per GB per month. Bar labels: SAN, NAS, Hadoop.
  • 10. Enterprise Data Warehousing Costs ("normalized" bar chart utilizing retail $ per TB): Data Warehouse S/W -- $K per TB. Vertical axis: $0 to $250 in steps of $50. Horizontal axis: Products.
  • 11. Hadoop Additions & Must Haves: Improved SQL Front-end Tool Interoperability. Better Interop With Skills & Content That Firms Already Have. Improved Security & ACL enforcement … Kerberos integration? Grow Developer Programming Model Skill Sets. Improve Relational Container Integration & Interop For Data Archival. Management & Monitoring Tools. Improved Developer & Debugging Tools. Reduce Latency Via Integration With Open Source Data Caching: memcached, others. Invitation To FSI or Enterprise Roundtable.
  • 12. Q&A: Sih Lee, Head of Innovation & Shared Services, Firmwide Engineering & Architecture, W# 212-622-3038, sih.x.lee@jpmchase.com. Peter Krey, Consultant, Innovation & Shared Services, Firmwide Engineering & Architecture, W# 212-622-2926, peter.j.krey@jpmchase.com.