SlideShare a Scribd company logo
1 of 17
WANdisco and Hadoop:
The Future of Big Data
     December 11, 2012
• WANdisco: Wide Area Network Distributed Computing
• Patented technology for active-active replication
• Leader in tools for software engineers (Subversion)
• No venture capital, angel investors or private equity funding
• Listed on the London Stock Exchange on June 1, 2012 in a highly
  successful IPO (LSE:WAND)
• Offices in San Ramon (CA), Boston (MA), Sheffield (UK), Belfast
  (UK), Chengdu (China), Tokyo (Japan)




                                                                    2
WANdisco Technology   Traditional Approach




                                             3
"unlike conventional solutions, the multi-site
computing system architecture does not rely on a central
transaction coordinator that is known to be a single-point-of-
failure."
                                                                  4
“Big Data is the new
       definitive source of
   competitive advantage
 across all industries. For
 those organizations that
understand and embrace
    the new reality of Big
 Data, the possibilities for
new innovation, improved
    agility, and increased
    profitability are nearly
                   endless”
                     - Wikibon




                             5
• Fixing specific problems:
    • Easy to use appliance
    • High availability
    • Disaster recovery / zero time
      to recovery (over a WAN)
• Highly differentiated
    • Nobody else can do
      active-active replication
      over a WAN




                                      6
Dr. Konstantin Shvachko
•   Co-founder of AltorStor, acquired by WANdisco
•   Was part of the team that invented Hadoop at Yahoo in 2006 and went on to
    become the Principal Big Data architect at eBay
•   Credited with the creation and maintenance of the Hadoop Distributed File
    System (HDFS), which is at the very core of both Hadoop and any replication
    solution for Hadoop
Jagane Sundar
•   Co-founder of AltorStor, acquired by WANdisco
•   Was responsible for conceiving, architecting and managing the development
    of AltoScale’s Hadoop As A Service platform before selling it to VertiCloud
•   Visionary behind AltoStor’s Cloud and Big Data Storage Appliance
•   Former Director of Hadoop Engineering at Yahoo! and managed the
    development of Hadoop 0.20.204 with Disk Fail In Place
Complementary IP and skills
•   Ideal fit with our patented active-active replication technology
•   Altostor founders faced problems (scaling, performance and high availability)
    we are planning to solve

                                                                                    7
• 24-by-7 Reliability, Availability, Scalability and Performance
• Planned as well as unplanned outages are extremely expensive
• Steep learning curve and dearth of trained specialists
• Many enterprises forced to rely on public cloud options such as Amazon
    •   Expensive hourly billing models
    •   Vendor lock-in with difficult migration paths
    •   Periodic availability and performance problems
    •   Data security concerns with cloud-based deployments
• Moving from batch model to real-time transaction model
• Our product suite will be designed to meet all of these challenges



                                                                           8
• Plug-and-play pre-packaged software eliminates the need for
  specialized Hadoop skills
• Wizard based deployment, monitoring and management
• Supports migration from Amazon to private in-house clouds
• S3-enabled filesystem unique to WANdisco’s AltoStor appliance
   • Allows searches for any kind of data (images, videos, etc.) based
      on descriptive characteristics
• HBase support for real-time transaction processing




                                                                         9
NameNode    NameNode
       NameNode




     HDFS Data




                       10
• Works over a LAN within a single data
  center
                                          NameNode    NameNode
• Works over a WAN across data centers
  thousands of miles apart

• Supports simultaneous read and write
  access on every server

                                              HDFS Data




                                                                 11
Public Cloud S3 Apps for
            the private cloud – e.g.
            JungleDisk, SmugMug,
             Senduit, Zmanda, etc.                                        Traditional Hadoop M/R
                                                    HBase Apps                     Apps

Step 1
  WANdisco AltoStor
     Appliance
  Hadoop Mgmt Server                   S3 API          HBase API        HDFS API            JobTracker
• Deploy Hadoop(s)
• Manage Hadoop                                              AltoStor Hadoop
• Monitor Hadoop
                                     Step 3



    Enterprise                                     Physical (e.g. rack of Dell servers) or
     Active                                        Virtual Infrastructure (e.g. VMware VI)
    Directory
                   Step 2
                                              for WANdisco AltoStor to use in building Hadoop

                                                                                                         12
• WANdisco AltoStor appliance has the capability to deploy on virtualized
  infrastructure such as VMware
• Advantages:
    • Extra level of reliability
    • Elasticity – live cluster shrinking and expansion
    • Extremely high hardware resource utilization
    • Resource isolation
    • Ease of management – preconfigured VMs




                                                                            13
HDFS Clients



HDFS Cluster
                                       DataNodes


      Active
                   Standby
    NameNode
                  NameNode




                      Shared Storage



                                                   14
Client                                 Client                                     Client




      Active                               Active                                       Active
    NameNode                             NameNode                             …       NameNode
                    Proposal Handler




                                                           Proposal Handler




                                                                                                      Proposal Handler
                                                                                  Dispatch
                                       Dispatch
Dispatch




                                          WANdisco PAXOS


                                                                                                                         15
Questions and
  Answers
Thank You


http://blogs.wandisco.com/autho
r/jagane-sundar

Visit us www.wandisc.com

More Related Content

What's hot

Hadoop File system (HDFS)
Hadoop File system (HDFS)Hadoop File system (HDFS)
Hadoop File system (HDFS)Prashant Gupta
 
Geo-based content processing using hbase
Geo-based content processing using hbaseGeo-based content processing using hbase
Geo-based content processing using hbaseRavi Veeramachaneni
 
Hadoop disaster recovery
Hadoop disaster recoveryHadoop disaster recovery
Hadoop disaster recoverySandeep Singh
 
Backup and Disaster Recovery in Hadoop
Backup and Disaster Recovery in HadoopBackup and Disaster Recovery in Hadoop
Backup and Disaster Recovery in Hadooplarsgeorge
 
Big Data and Hadoop - History, Technical Deep Dive, and Industry Trends
Big Data and Hadoop - History, Technical Deep Dive, and Industry TrendsBig Data and Hadoop - History, Technical Deep Dive, and Industry Trends
Big Data and Hadoop - History, Technical Deep Dive, and Industry TrendsEsther Kundin
 
Apache Hadoop
Apache HadoopApache Hadoop
Apache HadoopAjit Koti
 
Introduction to Hadoop
Introduction to HadoopIntroduction to Hadoop
Introduction to HadoopRan Ziv
 
Hadoop 101
Hadoop 101Hadoop 101
Hadoop 101EMC
 
Overview of Big data, Hadoop and Microsoft BI - version1
Overview of Big data, Hadoop and Microsoft BI - version1Overview of Big data, Hadoop and Microsoft BI - version1
Overview of Big data, Hadoop and Microsoft BI - version1Thanh Nguyen
 
Distributed Computing with Apache Hadoop: Technology Overview
Distributed Computing with Apache Hadoop: Technology OverviewDistributed Computing with Apache Hadoop: Technology Overview
Distributed Computing with Apache Hadoop: Technology OverviewKonstantin V. Shvachko
 
Design, Scale and Performance of MapR's Distribution for Hadoop
Design, Scale and Performance of MapR's Distribution for HadoopDesign, Scale and Performance of MapR's Distribution for Hadoop
Design, Scale and Performance of MapR's Distribution for Hadoopmcsrivas
 

What's hot (20)

Hadoop File system (HDFS)
Hadoop File system (HDFS)Hadoop File system (HDFS)
Hadoop File system (HDFS)
 
Geo-based content processing using hbase
Geo-based content processing using hbaseGeo-based content processing using hbase
Geo-based content processing using hbase
 
Hadoop introduction
Hadoop introductionHadoop introduction
Hadoop introduction
 
HDFS Tiered Storage
HDFS Tiered StorageHDFS Tiered Storage
HDFS Tiered Storage
 
Hadoop disaster recovery
Hadoop disaster recoveryHadoop disaster recovery
Hadoop disaster recovery
 
Backup and Disaster Recovery in Hadoop
Backup and Disaster Recovery in HadoopBackup and Disaster Recovery in Hadoop
Backup and Disaster Recovery in Hadoop
 
Hadoop
Hadoop Hadoop
Hadoop
 
Hadoop Fundamentals I
Hadoop Fundamentals IHadoop Fundamentals I
Hadoop Fundamentals I
 
Hadoop Overview kdd2011
Hadoop Overview kdd2011Hadoop Overview kdd2011
Hadoop Overview kdd2011
 
Big Data and Hadoop - History, Technical Deep Dive, and Industry Trends
Big Data and Hadoop - History, Technical Deep Dive, and Industry TrendsBig Data and Hadoop - History, Technical Deep Dive, and Industry Trends
Big Data and Hadoop - History, Technical Deep Dive, and Industry Trends
 
Apache Hadoop
Apache HadoopApache Hadoop
Apache Hadoop
 
Introduction to Hadoop
Introduction to HadoopIntroduction to Hadoop
Introduction to Hadoop
 
Hadoop
HadoopHadoop
Hadoop
 
Hadoop 101
Hadoop 101Hadoop 101
Hadoop 101
 
2. hadoop fundamentals
2. hadoop fundamentals2. hadoop fundamentals
2. hadoop fundamentals
 
Hadoop HDFS
Hadoop HDFSHadoop HDFS
Hadoop HDFS
 
Overview of Big data, Hadoop and Microsoft BI - version1
Overview of Big data, Hadoop and Microsoft BI - version1Overview of Big data, Hadoop and Microsoft BI - version1
Overview of Big data, Hadoop and Microsoft BI - version1
 
Distributed Computing with Apache Hadoop: Technology Overview
Distributed Computing with Apache Hadoop: Technology OverviewDistributed Computing with Apache Hadoop: Technology Overview
Distributed Computing with Apache Hadoop: Technology Overview
 
Design, Scale and Performance of MapR's Distribution for Hadoop
Design, Scale and Performance of MapR's Distribution for HadoopDesign, Scale and Performance of MapR's Distribution for Hadoop
Design, Scale and Performance of MapR's Distribution for Hadoop
 
Introduction to HDFS
Introduction to HDFSIntroduction to HDFS
Introduction to HDFS
 

Viewers also liked

Supporting Financial Services with a More Flexible Approach to Big Data
Supporting Financial Services with a More Flexible Approach to Big DataSupporting Financial Services with a More Flexible Approach to Big Data
Supporting Financial Services with a More Flexible Approach to Big DataHortonworks
 
Large scale ETL with Hadoop
Large scale ETL with HadoopLarge scale ETL with Hadoop
Large scale ETL with HadoopOReillyStrata
 
Non-Stop Hadoop for Hortonworks
Non-Stop Hadoop for Hortonworks Non-Stop Hadoop for Hortonworks
Non-Stop Hadoop for Hortonworks Hortonworks
 
AWS re:Invent 2016: Disaster Recovery and Business Continuity for Systemicall...
AWS re:Invent 2016: Disaster Recovery and Business Continuity for Systemicall...AWS re:Invent 2016: Disaster Recovery and Business Continuity for Systemicall...
AWS re:Invent 2016: Disaster Recovery and Business Continuity for Systemicall...Amazon Web Services
 
Business Continuity And Disaster Recovery Notes
Business Continuity And Disaster Recovery NotesBusiness Continuity And Disaster Recovery Notes
Business Continuity And Disaster Recovery NotesAlan McSweeney
 
Disaster Recovery Presentation
Disaster Recovery PresentationDisaster Recovery Presentation
Disaster Recovery PresentationTimSchaefer
 
An Introduction to Disaster Recovery Planning
An Introduction to Disaster Recovery PlanningAn Introduction to Disaster Recovery Planning
An Introduction to Disaster Recovery PlanningNEBizRecovery
 
The A to Z Guide to Business Continuity and Disaster Recovery
The A to Z Guide to Business Continuity and Disaster RecoveryThe A to Z Guide to Business Continuity and Disaster Recovery
The A to Z Guide to Business Continuity and Disaster RecoverySirius
 
Disaster Recovery Plan for IT
Disaster Recovery Plan for ITDisaster Recovery Plan for IT
Disaster Recovery Plan for IThhuihhui
 

Viewers also liked (9)

Supporting Financial Services with a More Flexible Approach to Big Data
Supporting Financial Services with a More Flexible Approach to Big DataSupporting Financial Services with a More Flexible Approach to Big Data
Supporting Financial Services with a More Flexible Approach to Big Data
 
Large scale ETL with Hadoop
Large scale ETL with HadoopLarge scale ETL with Hadoop
Large scale ETL with Hadoop
 
Non-Stop Hadoop for Hortonworks
Non-Stop Hadoop for Hortonworks Non-Stop Hadoop for Hortonworks
Non-Stop Hadoop for Hortonworks
 
AWS re:Invent 2016: Disaster Recovery and Business Continuity for Systemicall...
AWS re:Invent 2016: Disaster Recovery and Business Continuity for Systemicall...AWS re:Invent 2016: Disaster Recovery and Business Continuity for Systemicall...
AWS re:Invent 2016: Disaster Recovery and Business Continuity for Systemicall...
 
Business Continuity And Disaster Recovery Notes
Business Continuity And Disaster Recovery NotesBusiness Continuity And Disaster Recovery Notes
Business Continuity And Disaster Recovery Notes
 
Disaster Recovery Presentation
Disaster Recovery PresentationDisaster Recovery Presentation
Disaster Recovery Presentation
 
An Introduction to Disaster Recovery Planning
An Introduction to Disaster Recovery PlanningAn Introduction to Disaster Recovery Planning
An Introduction to Disaster Recovery Planning
 
The A to Z Guide to Business Continuity and Disaster Recovery
The A to Z Guide to Business Continuity and Disaster RecoveryThe A to Z Guide to Business Continuity and Disaster Recovery
The A to Z Guide to Business Continuity and Disaster Recovery
 
Disaster Recovery Plan for IT
Disaster Recovery Plan for ITDisaster Recovery Plan for IT
Disaster Recovery Plan for IT
 

Similar to Hadoop and WANdisco: The Future of Big Data

Why Virtualization is important by Tom Phelan of BlueData
Why Virtualization is important by Tom Phelan of BlueDataWhy Virtualization is important by Tom Phelan of BlueData
Why Virtualization is important by Tom Phelan of BlueDataData Con LA
 
13. The Transition to IPv6 and the Necessity for IP Address Management - Frey...
13. The Transition to IPv6 and the Necessity for IP Address Management - Frey...13. The Transition to IPv6 and the Necessity for IP Address Management - Frey...
13. The Transition to IPv6 and the Necessity for IP Address Management - Frey...Digicomp Academy AG
 
Hadoop Successes and Failures to Drive Deployment Evolution
Hadoop Successes and Failures to Drive Deployment EvolutionHadoop Successes and Failures to Drive Deployment Evolution
Hadoop Successes and Failures to Drive Deployment EvolutionBenoit Perroud
 
Vmware Serengeti - Based on Infochimps Ironfan
Vmware Serengeti - Based on Infochimps IronfanVmware Serengeti - Based on Infochimps Ironfan
Vmware Serengeti - Based on Infochimps IronfanJim Kaskade
 
Discover hdp 2.2 hdfs - final
Discover hdp 2.2   hdfs - finalDiscover hdp 2.2   hdfs - final
Discover hdp 2.2 hdfs - finalHortonworks
 
Discover hdp 2.2: Data storage innovations in Hadoop Distributed Filesystem (...
Discover hdp 2.2: Data storage innovations in Hadoop Distributed Filesystem (...Discover hdp 2.2: Data storage innovations in Hadoop Distributed Filesystem (...
Discover hdp 2.2: Data storage innovations in Hadoop Distributed Filesystem (...Hortonworks
 
End of RAID as we know it with Ceph Replication
End of RAID as we know it with Ceph ReplicationEnd of RAID as we know it with Ceph Replication
End of RAID as we know it with Ceph ReplicationCeph Community
 
Intro to GlusterFS Webinar - August 2011
Intro to GlusterFS Webinar - August 2011Intro to GlusterFS Webinar - August 2011
Intro to GlusterFS Webinar - August 2011GlusterFS
 
Discover HDP 2.2: Even Faster SQL Queries with Apache Hive and Stinger.next
Discover HDP 2.2: Even Faster SQL Queries with Apache Hive and Stinger.nextDiscover HDP 2.2: Even Faster SQL Queries with Apache Hive and Stinger.next
Discover HDP 2.2: Even Faster SQL Queries with Apache Hive and Stinger.nextHortonworks
 
Hadoop in the Clouds, Virtualization and Virtual Machines
Hadoop in the Clouds, Virtualization and Virtual MachinesHadoop in the Clouds, Virtualization and Virtual Machines
Hadoop in the Clouds, Virtualization and Virtual MachinesDataWorks Summit
 
Liberate Your Files with a Private Cloud Storage Solution powered by Open Source
Liberate Your Files with a Private Cloud Storage Solution powered by Open SourceLiberate Your Files with a Private Cloud Storage Solution powered by Open Source
Liberate Your Files with a Private Cloud Storage Solution powered by Open SourceIsaac Christoffersen
 
Discover.hdp2.2.ambari.final[1]
Discover.hdp2.2.ambari.final[1]Discover.hdp2.2.ambari.final[1]
Discover.hdp2.2.ambari.final[1]Hortonworks
 
Discover HDP 2.2: Apache Falcon for Hadoop Data Governance
Discover HDP 2.2: Apache Falcon for Hadoop Data GovernanceDiscover HDP 2.2: Apache Falcon for Hadoop Data Governance
Discover HDP 2.2: Apache Falcon for Hadoop Data GovernanceHortonworks
 
New Ceph capabilities and Reference Architectures
New Ceph capabilities and Reference ArchitecturesNew Ceph capabilities and Reference Architectures
New Ceph capabilities and Reference ArchitecturesKamesh Pemmaraju
 
Software Defined Storage, Big Data and Ceph - What Is all the Fuss About?
Software Defined Storage, Big Data and Ceph - What Is all the Fuss About?Software Defined Storage, Big Data and Ceph - What Is all the Fuss About?
Software Defined Storage, Big Data and Ceph - What Is all the Fuss About?Red_Hat_Storage
 
Inside Triton, July 2015
Inside Triton, July 2015Inside Triton, July 2015
Inside Triton, July 2015Casey Bisson
 
Red Hat Storage - Introduction to GlusterFS
Red Hat Storage - Introduction to GlusterFSRed Hat Storage - Introduction to GlusterFS
Red Hat Storage - Introduction to GlusterFSGlusterFS
 
App cap2956v2-121001194956-phpapp01 (1)
App cap2956v2-121001194956-phpapp01 (1)App cap2956v2-121001194956-phpapp01 (1)
App cap2956v2-121001194956-phpapp01 (1)outstanding59
 
Inside the Hadoop Machine @ VMworld
Inside the Hadoop Machine @ VMworldInside the Hadoop Machine @ VMworld
Inside the Hadoop Machine @ VMworldRichard McDougall
 
App Cap2956v2 121001194956 Phpapp01 (1)
App Cap2956v2 121001194956 Phpapp01 (1)App Cap2956v2 121001194956 Phpapp01 (1)
App Cap2956v2 121001194956 Phpapp01 (1)outstanding59
 

Similar to Hadoop and WANdisco: The Future of Big Data (20)

Why Virtualization is important by Tom Phelan of BlueData
Why Virtualization is important by Tom Phelan of BlueDataWhy Virtualization is important by Tom Phelan of BlueData
Why Virtualization is important by Tom Phelan of BlueData
 
13. The Transition to IPv6 and the Necessity for IP Address Management - Frey...
13. The Transition to IPv6 and the Necessity for IP Address Management - Frey...13. The Transition to IPv6 and the Necessity for IP Address Management - Frey...
13. The Transition to IPv6 and the Necessity for IP Address Management - Frey...
 
Hadoop Successes and Failures to Drive Deployment Evolution
Hadoop Successes and Failures to Drive Deployment EvolutionHadoop Successes and Failures to Drive Deployment Evolution
Hadoop Successes and Failures to Drive Deployment Evolution
 
Vmware Serengeti - Based on Infochimps Ironfan
Vmware Serengeti - Based on Infochimps IronfanVmware Serengeti - Based on Infochimps Ironfan
Vmware Serengeti - Based on Infochimps Ironfan
 
Discover hdp 2.2 hdfs - final
Discover hdp 2.2   hdfs - finalDiscover hdp 2.2   hdfs - final
Discover hdp 2.2 hdfs - final
 
Discover hdp 2.2: Data storage innovations in Hadoop Distributed Filesystem (...
Discover hdp 2.2: Data storage innovations in Hadoop Distributed Filesystem (...Discover hdp 2.2: Data storage innovations in Hadoop Distributed Filesystem (...
Discover hdp 2.2: Data storage innovations in Hadoop Distributed Filesystem (...
 
End of RAID as we know it with Ceph Replication
End of RAID as we know it with Ceph ReplicationEnd of RAID as we know it with Ceph Replication
End of RAID as we know it with Ceph Replication
 
Intro to GlusterFS Webinar - August 2011
Intro to GlusterFS Webinar - August 2011Intro to GlusterFS Webinar - August 2011
Intro to GlusterFS Webinar - August 2011
 
Discover HDP 2.2: Even Faster SQL Queries with Apache Hive and Stinger.next
Discover HDP 2.2: Even Faster SQL Queries with Apache Hive and Stinger.nextDiscover HDP 2.2: Even Faster SQL Queries with Apache Hive and Stinger.next
Discover HDP 2.2: Even Faster SQL Queries with Apache Hive and Stinger.next
 
Hadoop in the Clouds, Virtualization and Virtual Machines
Hadoop in the Clouds, Virtualization and Virtual MachinesHadoop in the Clouds, Virtualization and Virtual Machines
Hadoop in the Clouds, Virtualization and Virtual Machines
 
Liberate Your Files with a Private Cloud Storage Solution powered by Open Source
Liberate Your Files with a Private Cloud Storage Solution powered by Open SourceLiberate Your Files with a Private Cloud Storage Solution powered by Open Source
Liberate Your Files with a Private Cloud Storage Solution powered by Open Source
 
Discover.hdp2.2.ambari.final[1]
Discover.hdp2.2.ambari.final[1]Discover.hdp2.2.ambari.final[1]
Discover.hdp2.2.ambari.final[1]
 
Discover HDP 2.2: Apache Falcon for Hadoop Data Governance
Discover HDP 2.2: Apache Falcon for Hadoop Data GovernanceDiscover HDP 2.2: Apache Falcon for Hadoop Data Governance
Discover HDP 2.2: Apache Falcon for Hadoop Data Governance
 
New Ceph capabilities and Reference Architectures
New Ceph capabilities and Reference ArchitecturesNew Ceph capabilities and Reference Architectures
New Ceph capabilities and Reference Architectures
 
Software Defined Storage, Big Data and Ceph - What Is all the Fuss About?
Software Defined Storage, Big Data and Ceph - What Is all the Fuss About?Software Defined Storage, Big Data and Ceph - What Is all the Fuss About?
Software Defined Storage, Big Data and Ceph - What Is all the Fuss About?
 
Inside Triton, July 2015
Inside Triton, July 2015Inside Triton, July 2015
Inside Triton, July 2015
 
Red Hat Storage - Introduction to GlusterFS
Red Hat Storage - Introduction to GlusterFSRed Hat Storage - Introduction to GlusterFS
Red Hat Storage - Introduction to GlusterFS
 
App cap2956v2-121001194956-phpapp01 (1)
App cap2956v2-121001194956-phpapp01 (1)App cap2956v2-121001194956-phpapp01 (1)
App cap2956v2-121001194956-phpapp01 (1)
 
Inside the Hadoop Machine @ VMworld
Inside the Hadoop Machine @ VMworldInside the Hadoop Machine @ VMworld
Inside the Hadoop Machine @ VMworld
 
App Cap2956v2 121001194956 Phpapp01 (1)
App Cap2956v2 121001194956 Phpapp01 (1)App Cap2956v2 121001194956 Phpapp01 (1)
App Cap2956v2 121001194956 Phpapp01 (1)
 

More from WANdisco Plc

Hadoop scalability
Hadoop scalabilityHadoop scalability
Hadoop scalabilityWANdisco Plc
 
Forrester On Using Subversion to Optimize Globally Distributed Development
Forrester On Using Subversion to Optimize Globally Distributed DevelopmentForrester On Using Subversion to Optimize Globally Distributed Development
Forrester On Using Subversion to Optimize Globally Distributed DevelopmentWANdisco Plc
 
03.13.13 WANDisco SVN Training: Advanced Branching & Merging
03.13.13 WANDisco SVN Training: Advanced Branching & Merging03.13.13 WANDisco SVN Training: Advanced Branching & Merging
03.13.13 WANDisco SVN Training: Advanced Branching & MergingWANdisco Plc
 
02.28.13 WANDisco SVN Training: Getting Info Out of SVN
02.28.13 WANDisco SVN Training: Getting Info Out of SVN02.28.13 WANDisco SVN Training: Getting Info Out of SVN
02.28.13 WANDisco SVN Training: Getting Info Out of SVNWANdisco Plc
 
02.19.13 WANDisco SVN Training: Branching Options for Development
02.19.13 WANDisco SVN Training: Branching Options for Development02.19.13 WANDisco SVN Training: Branching Options for Development
02.19.13 WANDisco SVN Training: Branching Options for DevelopmentWANdisco Plc
 
uberSVN introduction by WANdisco
uberSVN introduction by WANdiscouberSVN introduction by WANdisco
uberSVN introduction by WANdiscoWANdisco Plc
 
WANdisco Subversion Support Services
WANdisco Subversion Support ServicesWANdisco Subversion Support Services
WANdisco Subversion Support ServicesWANdisco Plc
 
Make Subversion Agile
Make Subversion AgileMake Subversion Agile
Make Subversion AgileWANdisco Plc
 
Subversion in 2010 and Beyond
Subversion in 2010 and BeyondSubversion in 2010 and Beyond
Subversion in 2010 and BeyondWANdisco Plc
 
Forrester Research on Optimizing Globally Distributed Software Development Us...
Forrester Research on Optimizing Globally Distributed Software Development Us...Forrester Research on Optimizing Globally Distributed Software Development Us...
Forrester Research on Optimizing Globally Distributed Software Development Us...WANdisco Plc
 
Forrester Research on Globally Distributed Development Using Subversion
Forrester Research on Globally Distributed Development Using SubversionForrester Research on Globally Distributed Development Using Subversion
Forrester Research on Globally Distributed Development Using SubversionWANdisco Plc
 

More from WANdisco Plc (13)

Hadoop scalability
Hadoop scalabilityHadoop scalability
Hadoop scalability
 
Forrester On Using Subversion to Optimize Globally Distributed Development
Forrester On Using Subversion to Optimize Globally Distributed DevelopmentForrester On Using Subversion to Optimize Globally Distributed Development
Forrester On Using Subversion to Optimize Globally Distributed Development
 
03.13.13 WANDisco SVN Training: Advanced Branching & Merging
03.13.13 WANDisco SVN Training: Advanced Branching & Merging03.13.13 WANDisco SVN Training: Advanced Branching & Merging
03.13.13 WANDisco SVN Training: Advanced Branching & Merging
 
02.28.13 WANDisco SVN Training: Getting Info Out of SVN
02.28.13 WANDisco SVN Training: Getting Info Out of SVN02.28.13 WANDisco SVN Training: Getting Info Out of SVN
02.28.13 WANDisco SVN Training: Getting Info Out of SVN
 
02.19.13 WANDisco SVN Training: Branching Options for Development
02.19.13 WANDisco SVN Training: Branching Options for Development02.19.13 WANDisco SVN Training: Branching Options for Development
02.19.13 WANDisco SVN Training: Branching Options for Development
 
uberSVN introduction by WANdisco
uberSVN introduction by WANdiscouberSVN introduction by WANdisco
uberSVN introduction by WANdisco
 
Subversion Zen
Subversion ZenSubversion Zen
Subversion Zen
 
WANdisco Subversion Support Services
WANdisco Subversion Support ServicesWANdisco Subversion Support Services
WANdisco Subversion Support Services
 
Make Subversion Agile
Make Subversion AgileMake Subversion Agile
Make Subversion Agile
 
Why Svn
Why SvnWhy Svn
Why Svn
 
Subversion in 2010 and Beyond
Subversion in 2010 and BeyondSubversion in 2010 and Beyond
Subversion in 2010 and Beyond
 
Forrester Research on Optimizing Globally Distributed Software Development Us...
Forrester Research on Optimizing Globally Distributed Software Development Us...Forrester Research on Optimizing Globally Distributed Software Development Us...
Forrester Research on Optimizing Globally Distributed Software Development Us...
 
Forrester Research on Globally Distributed Development Using Subversion
Forrester Research on Globally Distributed Development Using SubversionForrester Research on Globally Distributed Development Using Subversion
Forrester Research on Globally Distributed Development Using Subversion
 

Recently uploaded

08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Enterprise Knowledge
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...gurkirankumar98700
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Paola De la Torre
 

Recently uploaded (20)

08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 

Hadoop and WANdisco: The Future of Big Data

  • 1. WANdisco and Hadoop: The Future of Big Data December 11, 2012
  • 2. • WANdisco: Wide Area Network Distributed Computing • Patented technology for active-active replication • Leader in tools for software engineers (Subversion) • No venture capital, angel investors or private equity funding • Listed on the London Stock Exchange on June 1, 2012 in a highly successful IPO (LSE:WAND) • Offices in San Ramon (CA), Boston (MA), Sheffield (UK), Belfast (UK), Chengdu (China), Tokyo (Japan) 2
  • 3. WANdisco Technology Traditional Approach 3
  • 4. "unlike conventional solutions, the multi-site computing system architecture does not rely on a central transaction coordinator that is known to be a single-point-of- failure." 4
  • 5. “Big Data is the new definitive source of competitive advantage across all industries. For those organizations that understand and embrace the new reality of Big Data, the possibilities for new innovation, improved agility, and increased profitability are nearly endless” - Wikibon 5
  • 6. • Fixing specific problems: • Easy to use appliance • High availability • Disaster recovery / zero time to recovery (over a WAN) • Highly differentiated • Nobody else can do active-active replication over a WAN 6
  • 7. Dr. Konstantin Shvachko • Co-founder of AltorStor, acquired by WANdisco • Was part of the team that invented Hadoop at Yahoo in 2006 and went on to become the Principal Big Data architect at eBay • Credited with the creation and maintenance of the Hadoop Distributed File System (HDFS), which is at the very core of both Hadoop and any replication solution for Hadoop Jagane Sundar • Co-founder of AltorStor, acquired by WANdisco • Was responsible for conceiving, architecting and managing the development of AltoScale’s Hadoop As A Service platform before selling it to VertiCloud • Visionary behind AltoStor’s Cloud and Big Data Storage Appliance • Former Director of Hadoop Engineering at Yahoo! and managed the development of Hadoop 0.20.204 with Disk Fail In Place Complementary IP and skills • Ideal fit with our patented active-active replication technology • Altostor founders faced problems (scaling, performance and high availability) we are planning to solve 7
  • 8. • 24-by-7 Reliability, Availability, Scalability and Performance • Planned as well as unplanned outages are extremely expensive • Steep learning curve and dearth of trained specialists • Many enterprises forced to rely on public cloud options such as Amazon • Expensive hourly billing models • Vendor lock-in with difficult migration paths • Periodic availability and performance problems • Data security concerns with cloud-based deployments • Moving from batch model to real-time transaction model • Our product suite will be designed to meet all of these challenges 8
  • 9. • Plug-and-play pre-packaged software eliminates the need for specialized Hadoop skills • Wizard based deployment, monitoring and management • Supports migration from Amazon to private in-house clouds • S3-enabled filesystem unique to WANdisco’s AltoStor appliance • Allows searches for any kind of data (images, videos, etc.) based on descriptive characteristics • HBase support for real-time transaction processing 9
  • 10. NameNode NameNode NameNode HDFS Data 10
  • 11. • Works over a LAN within a single data center NameNode NameNode • Works over a WAN across data centers thousands of miles apart • Supports simultaneous read and write access on every server HDFS Data 11
  • 12. Public Cloud S3 Apps for the private cloud – e.g. JungleDisk, SmugMug, Senduit, Zmanda, etc. Traditional Hadoop M/R HBase Apps Apps Step 1 WANdisco AltoStor Appliance Hadoop Mgmt Server S3 API HBase API HDFS API JobTracker • Deploy Hadoop(s) • Manage Hadoop AltoStor Hadoop • Monitor Hadoop Step 3 Enterprise Physical (e.g. rack of Dell servers) or Active Virtual Infrastructure (e.g. VMware VI) Directory Step 2 for WANdisco AltoStor to use in building Hadoop 12
  • 13. • WANdisco AltoStor appliance has the capability to deploy on virtualized infrastructure such as VMware • Advantages: • Extra level of reliability • Elasticity – live cluster shrinking and expansion • Extremely high hardware resource utilization • Resource isolation • Ease of management – preconfigured VMs 13
  • 14. HDFS Clients HDFS Cluster DataNodes Active Standby NameNode NameNode Shared Storage 14
  • 15. Client Client Client Active Active Active NameNode NameNode … NameNode Proposal Handler Proposal Handler Proposal Handler Dispatch Dispatch Dispatch WANdisco PAXOS 15
  • 16. Questions and Answers

Editor's Notes

  1. SPOFs – Name Node, Hbase, YARN
  2. SPOFs – Name Node, Hbase, YARN