Greenplum Database on HDFS (GOH)

Presenter: Lei Chang
lei.chang@emc.com

© Copyright 2012 EMC Corporation. All rights reserved.
Outline

   •  Introduction
   •  Architecture
   •  Features
   •  Performance study




EMC Greenplum Unified Analytics Platform




GOH use cases

   •  All Greenplum customers who want to minimize the amount of duplicate storage they have to buy for analytics
         –  Managing scale is much easier if you focus on the growth of one pool rather than on many fragmented pools.
   •  Customers who want the functionality of GPDB with the generality and storage provided by their HBase store.
   •  Potential ability to plug various storage systems, such as Isilon, Atmos, MapR Filesystem, CloudStore, GPFS, Lustre, PVFS and Ceph, into the GPDB/Hadoop software stack




[Figure: GOH architecture. A master host and five segment hosts, each running primary and mirror segments, are linked by the GPDB Interconnect. Tables are stored in an HDFS filespace: the diagram labels Meta Ops going to the Namenode and Read/Write going to the Datanodes, with block (B) replication between Datanodes across Rack1 and Rack2.]




GOH features

   •  A pluggable storage layer. If a new file system supports the full semantics of the HDFS interface, it can be added as storage for GPDB AO (append-only) tables.
   •  Attributed filespaces
   •  HDFS filespaces are natively supported
   •  Full transaction support for AO tables on HDFS.
   •  HDFS truncation capability to support the transaction capability of GOH.
   •  HDFS native C interface to eliminate the concurrency limitation of the current Java JNI-based client.
   •  All current GPDB functionality: fault tolerance, etc.



Pluggable storage: user interface

   CREATE FUNCTION open_func AS '(' obj_file ',' link_symbol ')'

   CREATE FILESYSTEM filesystemname [OWNER ownername] (
       connect = connect_func,
       open = open_func,
       close = close_func,
       read = read_func,
       write = write_func,
       seek = seek_func,
       ...
   )
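
One way to picture what CREATE FILESYSTEM registers is a table of callbacks keyed by operation name. The sketch below is a Python illustration only, not GPDB code: the `FileSystemOps` class, the in-memory `memfs` backend, the `FILESYSTEMS` registry, and every function signature are assumptions made for the example. The slide itself names only the callback slots (connect, open, close, read, write, seek).

```python
from dataclasses import dataclass
from typing import Callable, Dict

# Hypothetical in-memory "storage": path -> file contents.
_store: Dict[str, bytearray] = {}


@dataclass
class Handle:
    """Open-file state for the toy backend (illustrative only)."""
    path: str
    pos: int = 0
    closed: bool = False


def mem_connect(uri: str) -> str:
    # A real backend would open a connection; here the URI stands in for it.
    return uri


def mem_open(conn: str, path: str) -> Handle:
    _store.setdefault(path, bytearray())
    return Handle(path)


def mem_close(h: Handle) -> None:
    h.closed = True


def mem_read(h: Handle, n: int) -> bytes:
    data = bytes(_store[h.path][h.pos:h.pos + n])
    h.pos += len(data)
    return data


def mem_write(h: Handle, data: bytes) -> int:
    # Writes always append, in the spirit of append-only table storage.
    _store[h.path].extend(data)
    return len(data)


def mem_seek(h: Handle, offset: int) -> None:
    h.pos = offset


@dataclass(frozen=True)
class FileSystemOps:
    """Callback table mirroring the slots named in CREATE FILESYSTEM."""
    connect: Callable
    open: Callable
    close: Callable
    read: Callable
    write: Callable
    seek: Callable


# Registry playing the role of the catalog: file system name -> callbacks.
FILESYSTEMS: Dict[str, FileSystemOps] = {
    "memfs": FileSystemOps(mem_connect, mem_open, mem_close,
                           mem_read, mem_write, mem_seek),
}


def roundtrip(fs_name: str, path: str, payload: bytes) -> bytes:
    """Write payload through the named file system, then read it back."""
    fs = FILESYSTEMS[fs_name]
    conn = fs.connect("mem://local")
    h = fs.open(conn, path)
    fs.write(h, payload)
    fs.seek(h, 0)
    out = fs.read(h, len(payload))
    fs.close(h)
    return out
```

The caller only ever goes through the table, so adding another backend means registering one more `FileSystemOps` entry; nothing in `roundtrip` changes.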
  	
  




Attributed filespaces

   •  The number of replicas for the tables in the filespace
   •  Whether mirroring is supported for the tables stored in the filespace
   •  Other attributes…




Example SQL

   CREATE FILESPACE goh ON HDFS
   (
       1: 'hdfs://name-node/users/changl1/gp-data/gohmaster/gpseg-1',
       2: 'hdfs://name-node/users/changl1/gp-data/goh/gpseg0',
       3: 'hdfs://name-node/users/changl1/gp-data/goh/gpseg1'
   ) WITH (NUMREPLICA = 3, MIRRORING = false);




Transaction support

   •  When a load transaction is aborted, some garbage data is left at the end of the file. In HDFS-like systems, data cannot be truncated or overwritten, so we need some method of handling the partial data in order to support transactions.
         –  Option 1: Load data into a separate HDFS file. This leads to an unlimited number of files.
         –  Option 2: Use metadata to record the boundary of the garbage data, and implement a kind of vacuum mechanism.
         –  Option 3: Implement HDFS truncation.
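
The difference between options 2 and 3 can be sketched on a local file. This is a toy model under stated assumptions, not GOH code: the `AppendOnlyFile` class and its method names are hypothetical, a local file stands in for an HDFS file, and the vacuum pass of option 2 is not modeled (only the recorded boundary is).

```python
import os
import tempfile


class AppendOnlyFile:
    """Toy append-only segment file with a committed length kept as metadata."""

    def __init__(self, path: str) -> None:
        self.path = path
        self.committed_eof = 0  # metadata: number of bytes visible to readers
        open(path, "wb").close()

    def append(self, data: bytes) -> None:
        # Uncommitted bytes land after committed_eof.
        with open(self.path, "ab") as f:
            f.write(data)

    def commit(self) -> None:
        self.committed_eof = os.path.getsize(self.path)

    def abort_keep_garbage(self) -> int:
        """Option 2: leave the bytes in place; return how many are garbage."""
        return os.path.getsize(self.path) - self.committed_eof

    def abort_truncate(self) -> None:
        """Option 3: cut the file back to the last committed length."""
        os.truncate(self.path, self.committed_eof)

    def read_committed(self) -> bytes:
        # Readers never look past committed_eof, so garbage stays invisible.
        with open(self.path, "rb") as f:
            return f.read(self.committed_eof)


def demo() -> tuple:
    with tempfile.TemporaryDirectory() as d:
        f = AppendOnlyFile(os.path.join(d, "seg0"))
        f.append(b"good")
        f.commit()
        f.append(b"partial")              # a load that will be aborted
        garbage = f.abort_keep_garbage()  # option 2 view of the abort
        visible = f.read_committed()
        f.abort_truncate()                # option 3 view of the abort
        size_after = os.path.getsize(f.path)
        return garbage, visible, size_after
```

With option 2 the file keeps growing until something reclaims the garbage; with option 3 the file size returns to the committed length immediately, which is why truncation support in the file system is attractive here.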
  




HDFS C client: why

   •  libhdfs (the current HDFS C client) is based on JNI. This makes it difficult for GOH to support a large number of concurrent queries.
   •  Example:
         –  6 segments on each segment host
         –  50 concurrent queries
         –  Each query may have 12 or more QE processes that do scans
         –  There will be about 600 processes, which start 600 JVMs to access HDFS.
         –  If each JVM uses 500 MB of memory, the JVMs will consume 600 * 500 MB = 300 GB of memory.
         –  Thus naïve usage of libhdfs is not suitable for GOH. Currently we have three options to solve this problem
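
The arithmetic in the example can be checked with a few lines of Python. The function name and parameters are made up for this sketch; it takes the slide's figures at face value (one JVM per scanning QE process, decimal units).

```python
def jvm_memory_estimate(concurrent_queries: int,
                        qe_per_query: int,
                        jvm_mb: int) -> tuple:
    """Return (process_count, total_memory_gb) for one-JVM-per-QE access."""
    processes = concurrent_queries * qe_per_query
    total_gb = processes * jvm_mb / 1000  # decimal GB, as on the slide
    return processes, total_gb


processes, total_gb = jvm_memory_estimate(50, 12, 500)
print(processes, total_gb)  # 600 300.0
```

The total scales linearly with both the query count and the per-query process count, so the cost comes from starting a JVM in every process rather than from any single query.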
  




HDFS client: three options

   •  Option 1: use HDFS FUSE. HDFS FUSE introduces some performance overhead, and its scalability has not been verified yet.
   •  Option 2: implement a C RPC interface that communicates directly with the NameNode and DataNodes. Many changes are needed whenever the RPC protocol changes.
   •  Option 3: implement a webhdfs-based C client. webhdfs is based on HTTP, which also introduces some cost, so performance should be benchmarked. The webhdfs-based method has several benefits, such as ease of implementation and low maintenance cost.
   •  Currently, we have implemented option 2 and option 3.
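
To make option 3 concrete, here is a sketch of the kind of request a webhdfs-based client issues. It only builds the URL and sends nothing. It is written in Python for brevity rather than C, and the helper names, the host `name-node`, port 50070 and the file name `t1` are assumptions for illustration. The `/webhdfs/v1/<path>?op=OPEN&offset=...&length=...` form is the documented WebHDFS REST layout.

```python
from urllib.parse import quote, urlencode


def webhdfs_url(host: str, port: int, path: str, op: str, **params) -> str:
    """Build a WebHDFS REST URL: http://host:port/webhdfs/v1/<path>?op=..."""
    query = urlencode({"op": op, **params})
    return f"http://{host}:{port}/webhdfs/v1{quote(path)}?{query}"


def open_range_url(host: str, port: int, path: str,
                   offset: int, length: int) -> str:
    # op=OPEN with offset/length lets a scan request a single byte range.
    return webhdfs_url(host, port, path, "OPEN", offset=offset, length=length)


# Hypothetical host, port, and file name, used only for illustration.
url = open_range_url("name-node", 50070,
                     "/users/changl1/gp-data/goh/gpseg0/t1",
                     offset=0, length=1024)
print(url)
```

In WebHDFS an OPEN request sent to the NameNode is answered with an HTTP redirect to a DataNode that serves the bytes, so each read costs at least two HTTP exchanges. That is part of the overhead the slide says should be benchmarked.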
  




HDFS truncate

   •  API
         –  truncate (DistributedFileSystem) - truncate a file to a specified length
         –  void truncate(Path src, long length) throws IOException;
   •  Semantics
         –  Only a single writer/appender/truncater is allowed. Users can only call truncate on closed files.
         –  HDFS guarantees the atomicity of a truncate operation. That is, it either succeeds or fails; it does not leave the file in an undefined state.
         –  Concurrent readers may read content of a file that will be truncated by a concurrent truncate operation, but they must be able to read all the data that is not affected by the concurrent truncate operation.




HDFS truncate implementation (HDFS-3107)

   •  Get the lease of the to-be-truncated file (F)
   •  If the truncate is at a block boundary
         –  Delete the tail blocks as an atomic operation.
   •  Otherwise, if the truncate is not at a block boundary
         –  Copy the last block (B) of the result file (R) to a temporary file (T).
         –  Remove the tail blocks of file F (including B, B+1, …), then concat F and T to get R.
   •  Release the lease for the file
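
The block-level steps above can be simulated on an in-memory list of blocks. This is a sketch under stated assumptions, not the HDFS-3107 patch: `truncate_blocks` is a hypothetical helper, a file is modeled as a Python list of byte strings, and lease handling and atomicity are not modeled.

```python
from typing import List


def truncate_blocks(blocks: List[bytes], new_length: int,
                    block_size: int) -> List[bytes]:
    """Return the block list of the result file R after truncating F.

    blocks: the file F as a list of blocks; every block except possibly the
    last one holds exactly block_size bytes.
    """
    total = sum(len(b) for b in blocks)
    if not 0 <= new_length <= total:
        raise ValueError("new_length must be within the current file size")

    full, rest = divmod(new_length, block_size)
    if rest == 0:
        # Block boundary: dropping the tail blocks is the whole operation.
        return blocks[:full]

    # Not at a boundary: copy the surviving prefix of block B to temp file T,
    temp = blocks[full][:rest]
    # remove B and everything after it from F, then concat F and T to get R.
    return blocks[:full] + [temp]


f_blocks = [b"aaaa", b"bbbb", b"cc"]      # block size 4, file length 10
print(truncate_blocks(f_blocks, 8, 4))    # [b'aaaa', b'bbbb']
print(truncate_blocks(f_blocks, 6, 4))    # [b'aaaa', b'bb']
```

Only the non-boundary case copies data, and it copies at most one block, so the cost of a truncate stays bounded regardless of how many tail blocks are removed.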
  




Performance study (to be added)




Thank you!





Greenplum Database on HDFS

  • 1. Greenplum Database on HDFS (GOH). Presenter: Lei Chang, lei.chang@emc.com. © Copyright 2012 EMC Corporation. All rights reserved.
  • 2. Outline
      •  Introduction
      •  Architecture
      •  Features
      •  Performance study
  • 3. EMC Greenplum Unified Analytics Platform
  • 4. GOH use cases
      •  All Greenplum customers who want to minimize the amount of duplicate storage they have to buy for analytics.
          –  Managing scale is much easier if you focus on the growth of one pool rather than on many fragmented pools.
      •  Customers who want the functionality of GPDB with the generality and storage provided by their HBase store.
      •  Potential ability to plug various storage systems, such as Isilon, Atmos, MapR Filesystem, CloudStore, GPFS, Lustre, PVFS, and Ceph, into the GPDB/Hadoop software stack.
  • 5. [Architecture diagram. Labels: Master host; GPDB Interconnect; Segment hosts, each running Segment and Segment (Mirror) instances; Meta Ops; Read/Write; Tables in HDFS filespace; NameNode; DataNodes with block (B) replication; Rack1; Rack2.]
  • 6. GOH features
      •  A pluggable storage layer. If a new file system supports the full semantics of the HDFS interface, it can be added as GPDB AO (append-only) table storage.
      •  Attributed filespaces.
      •  HDFS filespaces are natively supported.
      •  Full transaction support for AO tables on HDFS.
      •  HDFS truncation capability to support the transaction capability of GOH.
      •  A native C interface to HDFS that eliminates the concurrency limitation of the current Java JNI-based client.
      •  All current GPDB functionality: fault tolerance, etc.
  • 7. Pluggable storage: user interface
      CREATE FUNCTION open_func AS '(' obj_file ',' link_symbol ')'

      CREATE FILESYSTEM filesystemname [OWNER ownername] (
          connect = connect_func,
          open = open_func,
          close = close_func,
          read = read_func,
          write = write_func,
          seek = seek_func,
          ...
      )
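The CREATE FILESYSTEM statement binds a file system name to a set of callback functions. The idea can be sketched in Python as a registry of callbacks. This is only an illustration of the dispatch pattern, not GOH code: the registry, the `create_filesystem` helper, the in-memory `memfs` backend, and the callback signatures are all assumptions, and only the six operations named on the slide are checked.

```python
import io
from typing import Callable, Dict

# The six operations named on the slide; the slide's "..." implies more exist.
REQUIRED_OPS = ("connect", "open", "close", "read", "write", "seek")

# Hypothetical registry: file system name -> {operation name -> callback}.
FILESYSTEMS: Dict[str, Dict[str, Callable]] = {}


def create_filesystem(name: str, **ops: Callable) -> None:
    """Register a file system, refusing it if any required callback is missing."""
    missing = [op for op in REQUIRED_OPS if op not in ops]
    if missing:
        raise ValueError(f"missing callbacks: {missing}")
    FILESYSTEMS[name] = ops


# Toy in-memory backend (an assumption, standing in for HDFS or another store).
_store: Dict[str, io.BytesIO] = {}

create_filesystem(
    "memfs",
    connect=lambda uri: _store,
    open=lambda conn, path: conn.setdefault(path, io.BytesIO()),
    close=lambda handle: None,
    read=lambda handle, n: handle.read(n),
    write=lambda handle, data: handle.write(data),
    seek=lambda handle, pos: handle.seek(pos),
)

# The caller only ever goes through the registered callbacks.
fs = FILESYSTEMS["memfs"]
conn = fs["connect"]("mem://")
handle = fs["open"](conn, "/t1")
fs["write"](handle, b"hello")
fs["seek"](handle, 0)
result = fs["read"](handle, 5)
fs["close"](handle)
```

Because the caller touches storage only through the callback table, swapping in a different backend means registering a different set of functions under a new name.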
  • 8. Attributed filespaces
      •  The number of replicas for the tables in the filespace
      •  Whether mirroring is supported for the tables stored in the filespace
      •  Other attributes…
  • 9. Example SQL
      CREATE FILESPACE goh ON HDFS
      (
          1: 'hdfs://name-node/users/changl1/gp-data/gohmaster/gpseg-1',
          2: 'hdfs://name-node/users/changl1/gp-data/goh/gpseg0',
          3: 'hdfs://name-node/users/changl1/gp-data/goh/gpseg1'
      ) WITH (NUMREPLICA = 3, MIRRORING = false);
  • 10. Transaction support
      •  When a load transaction is aborted, some garbage data is left at the end of the file. On HDFS-like systems, data cannot be truncated or overwritten, so we need a way to handle the partial data in order to support transactions.
          –  Option 1: Load data into a separate HDFS file. This can produce an unlimited number of files.
          –  Option 2: Use metadata to record the boundary of the garbage data, and implement a kind of vacuum mechanism.
          –  Option 3: Implement HDFS truncation.
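Option 2 can be illustrated with a toy model in Python. This is a minimal sketch under stated assumptions, not the GOH design: the class, its method names, and the choice to track committed byte ranges are all hypothetical. Bytes are only ever appended, an aborted load leaves its bytes in place without recording them, readers see only committed ranges, and a vacuum step rewrites the file without the garbage.

```python
from typing import List, Tuple


class AppendOnlyFile:
    """Toy append-only file whose metadata records which byte ranges are committed."""

    def __init__(self) -> None:
        self.data = bytearray()                   # physical bytes; append-only
        self.committed: List[Tuple[int, int]] = []  # visible (start, end) ranges

    def load(self, payload: bytes, commit: bool) -> None:
        start = len(self.data)
        self.data += payload                      # bytes can only be appended
        if commit:
            self.committed.append((start, len(self.data)))
        # On abort the bytes stay in the file, but no range is recorded for them.

    def read(self) -> bytes:
        """Return only committed data, skipping garbage from aborted loads."""
        return b"".join(bytes(self.data[s:e]) for s, e in self.committed)

    def garbage_bytes(self) -> int:
        return len(self.data) - sum(e - s for s, e in self.committed)

    def vacuum(self) -> "AppendOnlyFile":
        """Rewrite the committed data into a fresh file, dropping the garbage."""
        fresh = AppendOnlyFile()
        fresh.load(self.read(), commit=True)
        return fresh


f = AppendOnlyFile()
f.load(b"row1;", commit=True)
f.load(b"partial", commit=False)   # aborted load leaves 7 garbage bytes
f.load(b"row2;", commit=True)
compacted = f.vacuum()
```

The cost of this approach is visible in the model: garbage accumulates until a vacuum runs, which is the overhead that HDFS truncation (option 3) avoids.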
  • 11. HDFS C client: why
      •  libhdfs (the current HDFS C client) is based on JNI, which makes it difficult for GOH to support a large number of concurrent queries.
      •  Example:
          –  6 segments on each segment host
          –  50 concurrent queries
          –  Each query may have 12 or more QE processes that do scans.
          –  There will be about 600 processes (50 queries × 12 QE processes) that start 600 JVMs to access HDFS.
          –  If each JVM uses 500 MB of memory, the JVMs will consume 600 × 500 MB = 300 GB of memory.
          –  Thus naive use of libhdfs is not suitable for GOH. Currently we have three options to solve this problem.
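The arithmetic in this example can be checked with a few lines of Python. The input figures are the ones on the slide; the function name and the decimal MB-to-GB conversion (chosen to match the slide's 300 GB figure) are assumptions made for illustration.

```python
MB_PER_GB = 1000  # decimal units, matching the slide's 300 GB figure


def jvm_memory_gb(concurrent_queries: int, qe_per_query: int, jvm_mb: int) -> float:
    """Total JVM memory if every scanning QE process starts its own JVM."""
    processes = concurrent_queries * qe_per_query
    return processes * jvm_mb / MB_PER_GB


processes = 50 * 12                     # 50 queries, 12 QE processes each
total_gb = jvm_memory_gb(50, 12, 500)   # 500 MB per JVM
print(processes, total_gb)
```

Since the slide says each query may have "12 or more" QE processes, this figure is a lower bound for the stated workload.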
  • 12. HDFS client: three options
      •  Option 1: Use HDFS FUSE. HDFS FUSE introduces some performance overhead, and its scalability has not been verified yet.
      •  Option 2: Implement a C RPC interface that communicates directly with the NameNode and DataNodes. Many changes are needed whenever the RPC protocol changes.
      •  Option 3: Implement a webhdfs-based C client. webhdfs is based on HTTP, which also introduces some cost, so performance should be benchmarked. The webhdfs-based approach has several benefits, such as ease of implementation and low maintenance cost.
      •  Currently, we have implemented option 2 and option 3.
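To give a sense of why a webhdfs-based client is easy to implement, the sketch below builds the URL for a WebHDFS OPEN (read) request in Python. It shows only the shape of the HTTP request; the deck does not show its C client, so the function name is hypothetical, and the port 50070 in the usage line is an assumption (a common NameNode HTTP port in Hadoop releases of that era).

```python
from typing import Optional
from urllib.parse import quote, urlencode


def webhdfs_open_url(host: str, port: int, path: str,
                     offset: int = 0, length: Optional[int] = None) -> str:
    """Build the URL for a WebHDFS OPEN request on an absolute HDFS path."""
    if not path.startswith("/"):
        raise ValueError("path must be absolute")
    params = {"op": "OPEN", "offset": offset}
    if length is not None:
        params["length"] = length
    # WebHDFS exposes files under the /webhdfs/v1 prefix of the NameNode's HTTP server.
    return f"http://{host}:{port}/webhdfs/v1{quote(path)}?{urlencode(params)}"


url = webhdfs_open_url("name-node", 50070,
                       "/users/changl1/gp-data/goh/gpseg0",
                       offset=0, length=1024)
print(url)
```

A client issues an HTTP GET on this URL and follows the redirect to a DataNode, which is the extra round trip behind the cost the slide mentions.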
  • 13. HDFS truncate
      •  API
          –  truncate (DistributedFileSystem): truncate a file to a specified length
          –  void truncate(Path src, long length) throws IOException;
      •  Semantics
          –  Only a single writer/appender/truncater is allowed. Users can only call truncate on closed files.
          –  HDFS guarantees the atomicity of a truncate operation. That is, it either succeeds or fails; it does not leave the file in an undefined state.
          –  Concurrent readers may read content of a file that will be truncated by a concurrent truncate operation, but they must be able to read all the data that is not affected by the concurrent truncate operation.
  • 14. HDFS truncate implementation (HDFS-3107)
      •  Get the lease of the to-be-truncated file (F).
      •  If the truncate is at a block boundary:
          –  Delete the tail blocks as an atomic operation.
      •  Otherwise, if the truncate is not at a block boundary:
          –  Copy the last block (B) of the result file (R) to a temporary file (T).
          –  Remove the tail blocks of file F (including B, B+1, …), then concat F and T to get R.
      •  Release the lease for the file.
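The two cases above can be sketched in Python on a file modeled as a list of blocks. This is a simplified illustration, not the HDFS-3107 patch: lease handling and atomicity are omitted, the tiny block size is chosen for readability, and the function names are hypothetical. It assumes every block except possibly the last is full.

```python
from typing import List

BLOCK_SIZE = 4  # tiny block size for illustration; real HDFS blocks are far larger


def split_blocks(data: bytes, block_size: int = BLOCK_SIZE) -> List[bytes]:
    """Split file contents into fixed-size blocks (the last one may be short)."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def truncate(blocks: List[bytes], length: int,
             block_size: int = BLOCK_SIZE) -> List[bytes]:
    """Return the blocks of the result file R after truncating F to `length` bytes."""
    total = sum(len(b) for b in blocks)
    if not 0 <= length <= total:
        raise ValueError("length must be within the current file size")
    full, remainder = divmod(length, block_size)
    if remainder == 0:
        # Truncate at a block boundary: simply drop the tail blocks.
        return blocks[:full]
    # Not at a block boundary: copy the kept prefix of block B into a temporary file T.
    temp = blocks[full][:remainder]
    # Remove B and everything after it from F, then concat F and T to get R.
    return blocks[:full] + [temp]


blocks = split_blocks(b"abcdefghij")   # [b"abcd", b"efgh", b"ij"]
at_boundary = truncate(blocks, 8)
mid_block = truncate(blocks, 6)
```

Only the mid-block case needs the copy-and-concat step; a truncate that lands on a block boundary is a pure metadata change.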
  • 15. Performance study (to be added)
  • 16. Thank you!