Large Scale ETL with Hadoop
    Eric Sammer | Principal Solution Architect
    @esammer
    Strata + Hadoop World 2012




ETL is like “REST” or “Disaster Recovery”
       Everyone defines it differently (and loves to fight
       about it)
       It’s more of a problem/solution space than a thing
       Hard to generalize without being lossy in some
       way
       Worst, it’s trivial at face value, complicated in
       practice

So why is ETL hard?
       It’s not because ƒ(A) → B is hard (anymore)
       Data integration
       Organization and management
       Process orchestration and scheduling
       Accessibility
       How it all fits together


Hadoop is two components
      HDFS – Massive, redundant data storage
      MapReduce – Batch-oriented data processing at
      scale




The ecosystem brings additional functionality
      Higher level languages and abstractions on
      MapReduce
          Hive, Pig, Cascading, ...
      File, relational, and streaming data integration
          Flume, Sqoop, WebHDFS, ...
      Process orchestration and scheduling
          Oozie, Azkaban, ...
      Libraries for parsing and text extraction
          Tika, ?, ...
      ...and now low latency query with Impala


To truly scale ETL, separate infrastructure from
     processes, and make it a macro-level service
     (composed of other services).




The services of ETL
       Process Repository
       Metadata Repository
       Scheduling
       Process Orchestration
       Integration Adapters or Channels
       Service and Process Instrumentation and
       Collection

What do we have today?
       HDFS and MapReduce – The core
       Flume – Streaming event data integration
       Sqoop – Batch exchange of relational database
       tables
       Oozie – Process orchestration and basic
       scheduling
       Impala – Fast analysis of data quality

MapReduce is the assembly language of data
     processing
        “Simple things are hard, but hard things are
        possible”
        Comparatively low level
        Java knowledge required
        Use higher level tools where possible
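To make the "assembly language" point concrete, here is a minimal sketch of the map, shuffle, and reduce phases for a word count. It is plain Python running in one process, not Hadoop's Java API, and all function names are illustrative assumptions. Even this toy version needs three explicit phases for something a higher-level tool expresses in a line or two.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(lines: Iterable[str]) -> Iterator[Tuple[str, int]]:
    """Emit a (word, 1) pair for every word in every input line."""
    for line in lines:
        for word in line.lower().split():
            yield word, 1


def shuffle_phase(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group values by key, the step a framework performs between map and reduce."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(grouped)


def reduce_phase(grouped: Dict[str, List[int]]) -> Dict[str, int]:
    """Sum the grouped counts for each word."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Chain the three phases together."""
    return reduce_phase(shuffle_phase(map_phase(lines)))
```

Calling `word_count(["a b a", "b c"])` counts two occurrences each of `a` and `b` and one of `c`.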


Data organization in HDFS
        Standard file system tricks to make operations
        atomic
        Use a well-defined structure that supports tooling
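The slide does not name the specific tricks. One common example is to write to a temporary name and then rename into place, so readers never see a half-written file. The sketch below shows the pattern on a local file system as a stand-in; `publish_atomically` is a hypothetical helper, and on a real cluster the rename would go through an HDFS client rather than `os.replace`.

```python
import os
import tempfile
from pathlib import Path


def publish_atomically(final_path: Path, data: bytes) -> None:
    """Write data to a hidden temporary file, then rename it into place.

    The temporary file lives in the destination directory so the rename
    stays within one file system and is a single metadata operation.
    """
    final_path.parent.mkdir(parents=True, exist_ok=True)
    fd, tmp_name = tempfile.mkstemp(dir=final_path.parent, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
        # Readers see either the old file or the complete new one.
        os.replace(tmp_name, final_path)
    except BaseException:
        # Clean up the partial file if anything failed before the rename.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
```

Tools that scan a directory can then skip any name starting with `.tmp-` and treat everything else as complete.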




Data organization in HDFS – Hierarchy
       /intent
          /category
             /application (optional)
                /dataset
                    /partitions
                       /files

       Examples:
       /data/fraud/txs/2012-01-01/20120101-00.avro
       /data/fraud/txs/2012-01-01/20120101-01.avro
       /group/research/model-17/training-txs/part-00000.avro
       /group/research/model-17/training-txs/part-00001.avro
       /user/esammer/scratch/foo/
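A well-defined layout is easiest to keep consistent when paths come from one function instead of being concatenated by hand in every job. The following is a minimal sketch of such a helper; `dataset_path` and its parameter names are assumptions chosen to mirror the hierarchy above, and it assumes date-based partitions.

```python
from datetime import date
from typing import Optional


def dataset_path(
    intent: str,
    category: str,
    dataset: str,
    partition: Optional[date] = None,
    application: Optional[str] = None,
) -> str:
    """Build /intent/category[/application]/dataset[/partition]/ as a string."""
    parts = [intent, category]
    if application:
        parts.append(application)  # the optional application level
    parts.append(dataset)
    if partition is not None:
        parts.append(partition.isoformat())  # e.g. 2012-01-01
    return "/" + "/".join(parts) + "/"


# Directory of the first example above
fraud_dir = dataset_path("data", "fraud", "txs", partition=date(2012, 1, 1))

# Directory of the research example, which uses the application level
model_dir = dataset_path("group", "research", "training-txs", application="model-17")
```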



A view of data integration




Event
    headers: {
      app:  1234,
      type: 321,
      ts:   <epoch>
    },
    body:   <bytes>

Streaming Data (sources → Flume Agent → HDFS)
    Sources:
        Syslog Events
        Application Events
        Clickstream Events
        Point of Sale Events
    Flume (Channel 1) → /data/ops/syslog/2012-01-01/
    Flume (Channel 2) → /data/web/core/2012-01-01/
                        /data/web/retail/2012-01-01/
    Flume (Channel 3) → /data/pos/US/NY/17/2012-01-01/
                        /data/pos/US/CA/42/2012-01-01/

Relational Data (databases → Sqoop → HDFS)
    Web App Database → Sqoop (Job 1) → /data/wdb/<database>/<table>/
    EDW → Sqoop (Job 2) → /data/edw/<database>/<table>/
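The event in this diagram, a set of headers plus an opaque body, can be modeled in a few lines. The sketch below is an illustration in plain Python, not Flume's API. It assumes the `ts` header holds epoch seconds and that partitions are UTC dates; `partition_dir` is a hypothetical helper showing how a header could select the dated directory.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Dict


@dataclass(frozen=True)
class Event:
    """An event as drawn on the slide: string headers and a raw byte body."""
    headers: Dict[str, str]
    body: bytes


def partition_dir(base: str, event: Event) -> str:
    """Return the dated directory under `base` for this event.

    Assumes headers["ts"] is epoch seconds and partitions are UTC days.
    """
    ts = int(event.headers["ts"])
    day = datetime.fromtimestamp(ts, tz=timezone.utc).strftime("%Y-%m-%d")
    return f"{base.rstrip('/')}/{day}/"


# 1325376000 is 2012-01-01T00:00:00Z
syslog_event = Event(
    headers={"app": "1234", "type": "321", "ts": "1325376000"},
    body=b"raw log line",
)
target_dir = partition_dir("/data/ops/syslog", syslog_event)
```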




Structure data in tiers
        A clear hierarchy of source/derived relationships
        One step on the road to proper lineage
        Simple “fault and rebuild” processes
        Examples
           Tier 0 – Raw data from source systems
           Tier 1 – Derived from 0, cleansed, normalized
           Tier 2 – Derived from 1, aggregated
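With source/derived relationships recorded explicitly, "fault and rebuild" becomes a graph walk: mark a dataset as bad, find everything derived from it, and rebuild in tier order. The sketch below assumes a simple in-memory mapping from each derived dataset to its sources. The dataset names and the `rebuild_order` function are hypothetical and do not come from any Hadoop tool.

```python
from typing import Dict, List, Set

# Hypothetical lineage: derived dataset -> datasets it is built from.
LINEAGE: Dict[str, List[str]] = {
    "tier1/clicks": ["tier0/clicks"],
    "tier1/orders": ["tier0/orders"],
    "tier2/clicks-daily": ["tier1/clicks"],
}


def rebuild_order(faulted: str, derived_from: Dict[str, List[str]]) -> List[str]:
    """List every dataset downstream of `faulted`, sources before dependents."""
    # Step 1: collect everything that transitively depends on the faulted dataset.
    affected: Set[str] = set()
    frontier = [faulted]
    while frontier:
        current = frontier.pop()
        for child, parents in derived_from.items():
            if current in parents and child not in affected:
                affected.add(child)
                frontier.append(child)

    # Step 2: emit a dataset only once none of its sources still await rebuild.
    order: List[str] = []
    remaining = set(affected)
    while remaining:
        ready = sorted(
            d for d in remaining
            if not any(p in remaining for p in derived_from[d])
        )
        if not ready:
            raise ValueError("cycle in lineage")
        order.extend(ready)
        remaining.difference_update(ready)
    return order
```

Faulting `tier0/clicks` in this example yields `tier1/clicks` followed by `tier2/clicks-daily`, and leaves `tier1/orders` untouched.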


HDFS (Tier 0)
     /data/ops/syslog/2012-01-01/
     /data/web/core/2012-01-01/
     /data/web/retail/2012-01-01/
     /data/pos/US/NY/17/2012-01-01/
     /data/pos/US/CA/42/2012-01-01/
     /data/wdb/<database>/<table>/
     /data/edw/<database>/<table>/

Processes
     Sessionization
     Event Report Aggregation
     Inventory Reconciliation

HDFS (Tier 1)
     /data/reporting/sessions-day/YYYY-MM-DD/
     /data/reporting/events-day/YYYY-MM-DD/
     /data/reporting/events-hour/YYYY-MM-DD/

HDFS (For export)
     /export/edw/inventory/item-diff/<ts>/




There’s a lot to do
       Build libraries or services to reveal higher-level
       interfaces
       Data management and lifecycle events
       Instrument jobs and services for performance/
       quality
       Metadata, metadata, metadata (metadata)
       Process (job) deployment, service location,

To the contributors, potential and current
        We have work to do
        Still way too much scaffolding work




I’m out of time (for now)
        Join me for office hours – 1:40 - 2:20 in
        Rhinelander
        I’m signing copies of Hadoop Operations tonight




 
Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...
 
2024 April Patch Tuesday
2024 April Patch Tuesday2024 April Patch Tuesday
2024 April Patch Tuesday
 
Top 10 Hubspot Development Companies in 2024
Top 10 Hubspot Development Companies in 2024Top 10 Hubspot Development Companies in 2024
Top 10 Hubspot Development Companies in 2024
 
Generative AI - Gitex v1Generative AI - Gitex v1.pptx
Generative AI - Gitex v1Generative AI - Gitex v1.pptxGenerative AI - Gitex v1Generative AI - Gitex v1.pptx
Generative AI - Gitex v1Generative AI - Gitex v1.pptx
 
Microsoft 365 Copilot: How to boost your productivity with AI – Part two: Dat...
Microsoft 365 Copilot: How to boost your productivity with AI – Part two: Dat...Microsoft 365 Copilot: How to boost your productivity with AI – Part two: Dat...
Microsoft 365 Copilot: How to boost your productivity with AI – Part two: Dat...
 
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentEmixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
 
All These Sophisticated Attacks, Can We Really Detect Them - PDF
All These Sophisticated Attacks, Can We Really Detect Them - PDFAll These Sophisticated Attacks, Can We Really Detect Them - PDF
All These Sophisticated Attacks, Can We Really Detect Them - PDF
 
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
 
Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...
Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...
Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...
 
Accelerating Enterprise Software Engineering with Platformless
Accelerating Enterprise Software Engineering with PlatformlessAccelerating Enterprise Software Engineering with Platformless
Accelerating Enterprise Software Engineering with Platformless
 
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS:  6 Ways to Automate Your Data IntegrationBridging Between CAD & GIS:  6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integration
 
Glenn Lazarus- Why Your Observability Strategy Needs Security Observability
Glenn Lazarus- Why Your Observability Strategy Needs Security ObservabilityGlenn Lazarus- Why Your Observability Strategy Needs Security Observability
Glenn Lazarus- Why Your Observability Strategy Needs Security Observability
 
Microservices, Docker deploy and Microservices source code in C#
Microservices, Docker deploy and Microservices source code in C#Microservices, Docker deploy and Microservices source code in C#
Microservices, Docker deploy and Microservices source code in C#
 
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...
 
How Tech Giants Cut Corners to Harvest Data for A.I.
How Tech Giants Cut Corners to Harvest Data for A.I.How Tech Giants Cut Corners to Harvest Data for A.I.
How Tech Giants Cut Corners to Harvest Data for A.I.
 
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
 

Large Scale ETL with Hadoop

  • 1. Large Scale ETL with Hadoop
       Eric Sammer | Principal Solution Architect
       @esammer
       Strata + Hadoop World 2012
  • 2–6. ETL is like “REST” or “Disaster Recovery”
       Everyone defines it differently (and loves to fight about it)
       It’s more of a problem/solution space than a thing
       Hard to generalize without being lossy in some way
       Worst, it’s trivial at face value, complicated in practice
  • 7–13. So why is ETL hard?
       It’s not because ƒ(A) → B is hard (anymore)
       Data integration
       Organization and management
       Process orchestration and scheduling
       Accessibility
       How it all fits together
  • 14–16. Hadoop is two components
       HDFS – Massive, redundant data storage
       MapReduce – Batch-oriented data processing at scale
  • 17–26. The ecosystem brings additional functionality
       Higher level languages and abstractions on MapReduce: Hive, Pig, Cascading, ...
       File, relational, and streaming data integration: Flume, Sqoop, WebHDFS, ...
       Process orchestration and scheduling: Oozie, Azkaban, ...
       Libraries for parsing and text extraction: Tika, ?, ...
       ...and now low latency query with Impala
  • 27–29. To truly scale ETL, separate infrastructure from processes, and make it a macro-level service (composed of other services).
  • 30–36. The services of ETL
       Process Repository
       Metadata Repository
       Scheduling
       Process Orchestration
       Integration Adapters or Channels
       Service and Process Instrumentation and Collection
  • 37–42. What do we have today?
       HDFS and MapReduce – The core
       Flume – Streaming event data integration
       Sqoop – Batch exchange of relational database tables
       Oozie – Process orchestration and basic scheduling
       Impala – Fast analysis of data quality
  • 43–47. MapReduce is the assembly language of data processing
       “Simple things are hard, but hard things are possible”
       Comparatively low level
       Java knowledge required
       Use higher level tools where possible
  • 49–50. Data organization in HDFS
       Standard file system tricks to make operations atomic
       Use a well-defined structure that supports tooling
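The slide does not say which file system tricks it has in mind. One widely used candidate is to write under a temporary name and rename into place once the write is complete, so readers never see a half-written file. The sketch below illustrates that idea on a local POSIX file system with the Python standard library; it is a stand-in, not HDFS client code, and `publish_atomically` is a hypothetical helper name.

```python
import os
import tempfile


def publish_atomically(final_path: str, data: bytes) -> None:
    """Write data to a hidden temp file, then rename it into place.

    Assumption: a local POSIX file system stands in for HDFS here.
    """
    directory = os.path.dirname(final_path) or "."
    os.makedirs(directory, exist_ok=True)
    # Keep the temp file in the target directory so the rename never
    # crosses a file system boundary.
    fd, tmp_path = tempfile.mkstemp(prefix=".tmp-", dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())
        # The rename is the "commit": the final name either does not
        # exist yet or points at a complete file.
        os.replace(tmp_path, final_path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

The same pattern is often applied at directory level: a job writes a whole partition into a scratch directory and renames the directory when it finishes.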
  • 51. Data organization in HDFS – Hierarchy
       /intent
         /category
           /application (optional)
             /dataset
               /partitions
                 /files
       Examples:
         /data/fraud/txs/2012-01-01/20120101-00.avro
         /data/fraud/txs/2012-01-01/20120101-01.avro
         /group/research/model-17/training-txs/part-00000.avro
         /group/research/model-17/training-txs/part-00001.avro
         /user/esammer/scratch/foo/
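A fixed hierarchy like this is easy to encode in a small helper, which is one way to make the structure "support tooling". The following is a minimal sketch, not anything shown in the talk: `dataset_path` and its parameter names are hypothetical, and treating partitions as zero or more path components is an assumption drawn from the example paths.

```python
import posixpath
from typing import Optional, Sequence


def dataset_path(
    intent: str,
    category: str,
    dataset: str,
    application: Optional[str] = None,
    partitions: Sequence[str] = (),
    filename: Optional[str] = None,
) -> str:
    """Build /intent/category[/application]/dataset[/partitions...][/file]."""
    parts = [intent, category]
    if application:
        parts.append(application)  # the only level the slide marks optional
    parts.append(dataset)
    parts.extend(partitions)
    if filename:
        parts.append(filename)
    for part in parts:
        # Reject components that would silently change the hierarchy depth.
        if not part or "/" in part:
            raise ValueError(f"invalid path component: {part!r}")
    return posixpath.join("/", *parts)
```

For example, `dataset_path("data", "fraud", "txs", partitions=["2012-01-01"], filename="20120101-00.avro")` reproduces the first example path on the slide.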
  • 52. A view of data integration
  • 53. [Diagram] Event: headers: { app: 1234, type: 321, ts: <epoch> }, body: <bytes>
       Streaming data (Flume agents → HDFS):
         Syslog events → Flume (Channel 1) → /data/ops/syslog/2012-01-01/
         Application events, clickstream events → Flume (Channel 2) → /data/web/core/2012-01-01/, /data/web/retail/2012-01-01/
         Point of sale events → Flume (Channel 3) → /data/pos/US/NY/17/2012-01-01/, /data/pos/US/CA/42/2012-01-01/
       Relational data (Sqoop → HDFS):
         Web app database → Sqoop (Job 1) → /data/wdb/<database>/<table>/
         EDW → Sqoop (Job 2) → /data/edw/<database>/<table>/
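To make the event shape in the diagram concrete, here is a small sketch of how headers could drive the choice of a date-partitioned target directory. It is plain Python, not Flume's API or configuration. The `Event` class, the `ROUTES` table, and the pairing of app 1234 / type 321 with the syslog directory are all hypothetical, and `ts` is assumed to be epoch seconds in UTC.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Dict, Tuple


@dataclass(frozen=True)
class Event:
    """Event as drawn on the slide: string headers plus an opaque body."""
    headers: Dict[str, str]
    body: bytes


# Hypothetical routing table: (app, type) -> base directory in HDFS.
ROUTES: Dict[Tuple[str, str], str] = {
    ("1234", "321"): "/data/ops/syslog",
}


def target_directory(event: Event) -> str:
    """Pick the daily partition directory for an event from its headers."""
    base = ROUTES[(event.headers["app"], event.headers["type"])]
    # Assumption: ts is epoch seconds, partitioned by UTC calendar day.
    day = datetime.fromtimestamp(int(event.headers["ts"]), tz=timezone.utc)
    return f"{base}/{day.strftime('%Y-%m-%d')}/"
```

Routing on headers alone means the body never has to be parsed on the ingest path, which keeps the integration layer independent of payload formats.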
  • 54–61. Structure data in tiers
       A clear hierarchy of source/derived relationships
       One step on the road to proper lineage
       Simple “fault and rebuild” processes
       Examples
         Tier 0 – Raw data from source systems
         Tier 1 – Derived from 0, cleansed, normalized
         Tier 2 – Derived from 1, aggregated
  • 62. [Diagram] Tier 0 datasets feed processes that write Tier 1 and export datasets
       HDFS (Tier 0):
         /data/ops/syslog/2012-01-01/
         /data/web/core/2012-01-01/
         /data/web/retail/2012-01-01/
         /data/pos/US/NY/17/2012-01-01/
         /data/pos/US/CA/42/2012-01-01/
         /data/wdb/<database>/<table>/
         /data/edw/<database>/<table>/
       Processes: Sessionization, Event Report Aggregation, Inventory Reconciliation
       HDFS (Tier 1):
         /data/reporting/sessions-day/YYYY-MM-DD/
         /data/reporting/events-day/YYYY-MM-DD/
         /data/reporting/events-hour/YYYY-MM-DD/
       HDFS (For export):
         /export/edw/inventory/item-diff/<ts>/
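The "fault and rebuild" idea from the tiering slides can be sketched as a walk over source/derived relationships: when a dataset is found to be bad, everything derived from it, directly or transitively, is a rebuild candidate. The code below is only an illustration. The `SOURCES` mapping is a hypothetical reading of the diagram (the slide does not spell out which inputs feed which outputs), and `datasets_to_rebuild` is an invented helper name.

```python
from collections import deque
from typing import Dict, List, Optional, Set

# Hypothetical lineage: derived dataset -> the datasets it is built from.
SOURCES: Dict[str, List[str]] = {
    "/data/reporting/sessions-day": ["/data/web/core", "/data/web/retail"],
    "/data/reporting/events-day": ["/data/web/core", "/data/web/retail"],
    "/data/reporting/events-hour": ["/data/web/core", "/data/web/retail"],
    "/export/edw/inventory/item-diff": ["/data/pos", "/data/edw"],
}


def datasets_to_rebuild(
    faulted: str, sources: Optional[Dict[str, List[str]]] = None
) -> List[str]:
    """Return every dataset derived, directly or not, from `faulted`."""
    sources = SOURCES if sources is None else sources
    # Invert the mapping: source -> datasets that read from it.
    dependents: Dict[str, List[str]] = {}
    for derived, inputs in sources.items():
        for src in inputs:
            dependents.setdefault(src, []).append(derived)
    seen: Set[str] = set()
    queue = deque([faulted])
    while queue:
        current = queue.popleft()
        for derived in dependents.get(current, []):
            if derived not in seen:
                seen.add(derived)
                queue.append(derived)
    # Sorted for a stable result; this is not a build order.
    return sorted(seen)
```

Because each tier is derived only from the tier below it, the walk always moves upward and the raw Tier 0 data is never a rebuild target.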
  • 63–68. There’s a lot to do
       Build libraries or services to reveal higher-level interfaces
       Data management and lifecycle events
       Instrument jobs and services for performance/quality
       Metadata, metadata, metadata (metadata)
       Process (job) deployment, service location,
  • 69–72. To the contributors, potential and current
       We have work to do
       Still way too much scaffolding work
  • 73–75. I’m out of time (for now)
       Join me for office hours – 1:40 - 2:20 in Rhinelander
       I’m signing copies of Hadoop Operations tonight