fsimage and edits in CDH3 and CDH4
Tatsuo Kawasaki
tatsuo@cloudera.com
objective
HDFS metadata (fsimage and edits) management is different
between CDH3 and CDH4.
This presentation introduces these differences.

Please let me know if you find any issues.
HDFS metadata in CDH3
[root@localhost ~]# ls -ltr /var/lib/hadoop-0.20/cache/hadoop/dfs/name/current/
total 1100
-rw-r--r-- 1 hdfs hdfs 101 Jan 30 00:21 VERSION
-rw-r--r-- 1 hdfs hdfs    8 Jan 30 00:21 fstime
-rw-r--r-- 1 hdfs hdfs 57248 Jan 30 00:21 fsimage
-rw-r--r-- 1 hdfs hdfs 1048580 Jan 31 16:16 edits



after checkpoint
[root@localhost ~]# ls -ltr /var/lib/hadoop-0.20/cache/hadoop/dfs/name/current/
total 84
-rw-r--r-- 1 hdfs hdfs 101 Feb 5 14:37 VERSION
-rw-r--r-- 1 hdfs hdfs 8 Feb 5 14:37 fstime
-rw-r--r-- 1 hdfs hdfs 66760 Feb 5 14:37 fsimage
-rw-r--r-- 1 hdfs hdfs 4 Feb 5 14:37 edits
timeline (CDH3)
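 NameNode
   t0: fsimage, edits and fstime exist
   t1: put file
       - update edits
       - update metadata in memory
   t2: checkpoint start
       - create edits.new
       - the Secondary NameNode gets fsimage and edits
   t3: checkpoint done
       - the Secondary NameNode transfers fsimage.ckpt to the NameNode
   t4: rename fsimage.ckpt to fsimage, rename edits.new to edits,
       and update the time in fstime

 Secondary NameNode
   - merge fsimage and edits into fsimage.ckpt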
Secondary NN Web UI (CDH3)
HDFS metadata in CDH4
After formatting HDFS
-bash-4.1$ ls -l /var/lib/hadoop-hdfs/cache/hdfs/dfs/name/current/
total 1040
-rw-r--r-- 1 hdfs hdfs 1048576 Feb 5 01:35 edits_inprogress_0000000000000000001
-rw-rw-r-- 1 hdfs hdfs 119 Feb 5 01:33 fsimage_0000000000000000000
-rw-rw-r-- 1 hdfs hdfs 62 Feb 5 01:33 fsimage_0000000000000000000.md5
-rw-r--r-- 1 hdfs hdfs    2 Feb 5 01:35 seen_txid
-rw-rw-r-- 1 hdfs hdfs 202 Feb 5 01:33 VERSION

-bash-4.1$ hexdump -C /var/lib/hadoop-hdfs/cache/hdfs/dfs/name/current/seen_txid
00000000 31 0a                         |1.|
00000002




The transaction ID is stored in seen_txid.
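The hexdump above shows that seen_txid holds the transaction ID as ASCII digits followed by a newline. A minimal Python sketch of decoding those bytes (the helper name `parse_seen_txid` is hypothetical and not part of Hadoop):

```python
def parse_seen_txid(raw: bytes) -> int:
    """Decode the contents of a seen_txid file: ASCII digits plus a newline."""
    return int(raw.decode("ascii").strip())


# Bytes copied from the hexdump above: 0x31 is "1", 0x0a is "\n".
raw = bytes.fromhex("31 0a")
print(parse_seen_txid(raw))  # prints 1
```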
try to add a new file
[training@localhost ~]$ hadoop fs -put /etc/hosts hosts
[training@localhost ~]$
oiv - fsimage viewer
-bash-4.1$ hdfs oiv -i /var/lib/hadoop-hdfs/cache/hdfs/dfs/name/current/fsimage_0000000000000000000 -o aaa
-bash-4.1$ cat aaa
drwxr-xr-x - hdfs supergroup          0 1969-12-31 19:00 /

The 'hosts' file has not been written to the fsimage before checkpointing.
oev – edits viewer
          Callouts: TXID 20 is the start transaction ID of this segment; the OP_ADD record (TXID 22) is the put transaction.

          -bash-4.1$ hdfs oev -i /var/lib/hadoop-hdfs/cache/hdfs/dfs/name/current/edits_0000000000000000020-0000000000000000027 -o bbb
          -bash-4.1$ cat bbb
          <?xml version="1.0" encoding="UTF-8"?>
          <EDITS>
           <EDITS_VERSION>-40</EDITS_VERSION>
           <RECORD>
            <OPCODE>OP_START_LOG_SEGMENT</OPCODE>
            <DATA>
              <TXID>20</TXID>
            </DATA>
           </RECORD>
           <RECORD>
            <OPCODE>OP_SET_GENSTAMP</OPCODE>
            <DATA>
              <TXID>21</TXID>
              <GENSTAMP>1003</GENSTAMP>
            </DATA>
           </RECORD>
           <RECORD>
            <OPCODE>OP_ADD</OPCODE>
            <DATA>
              <TXID>22</TXID>
              <LENGTH>0</LENGTH>
              <PATH>/user/training/hosts._COPYING_</PATH>
              <REPLICATION>1</REPLICATION>
              <MTIME>1360046220628</MTIME>
              <ATIME>1360046220628</ATIME>
              <BLOCKSIZE>67108864</BLOCKSIZE>
              <CLIENT_NAME>DFSClient_NONMAPREDUCE_1911533003_1</CLIENT_NAME>
              <CLIENT_MACHINE>127.0.0.1</CLIENT_MACHINE>
              <PERMISSION_STATUS>
               <USERNAME>training</USERNAME>
               <GROUPNAME>supergroup</GROUPNAME>
               <MODE>420</MODE>
              </PERMISSION_STATUS>
            </DATA>
           </RECORD>
oev – edits viewer (cont)
      File name: edits_0000000000000000020-0000000000000000027

      <RECORD>
        <OPCODE>OP_SET_GENSTAMP</OPCODE>
        <DATA>
         <TXID>23</TXID>
         <GENSTAMP>1004</GENSTAMP>
        </DATA>
       </RECORD>
      <RECORD>
        <OPCODE>OP_UPDATE_BLOCKS</OPCODE>
        <DATA>
         <TXID>24</TXID>
         <PATH>/user/training/hosts._COPYING_</PATH>
         <BLOCK>
          <BLOCK_ID>-3498739165311848505</BLOCK_ID>
          <NUM_BYTES>0</NUM_BYTES>
          <GENSTAMP>1004</GENSTAMP>
         </BLOCK>
        </DATA>
       </RECORD>
       <RECORD>
        <OPCODE>OP_CLOSE</OPCODE>
        <DATA>
         <TXID>25</TXID>
         <LENGTH>0</LENGTH>
         <PATH>/user/training/hosts._COPYING_</PATH>
         <REPLICATION>1</REPLICATION>
         <MTIME>1360046220735</MTIME>
         <ATIME>1360046220628</ATIME>
         <BLOCKSIZE>67108864</BLOCKSIZE>
         <CLIENT_NAME></CLIENT_NAME>
         <CLIENT_MACHINE></CLIENT_MACHINE>
         <BLOCK>
          <BLOCK_ID>-3498739165311848505</BLOCK_ID>
          <NUM_BYTES>83</NUM_BYTES>
          <GENSTAMP>1004</GENSTAMP>
         </BLOCK>
oev – edits viewer (cont)
      File name: edits_0000000000000000020-0000000000000000027
      Callout: TXID 27 (OP_END_LOG_SEGMENT) is the end transaction ID of this segment.

         <PERMISSION_STATUS>
          <USERNAME>training</USERNAME>
          <GROUPNAME>supergroup</GROUPNAME>
          <MODE>420</MODE>
         </PERMISSION_STATUS>
        </DATA>
       </RECORD>
       <RECORD>
        <OPCODE>OP_RENAME_OLD</OPCODE>
        <DATA>
         <TXID>26</TXID>
         <LENGTH>0</LENGTH>
         <SRC>/user/training/hosts._COPYING_</SRC>
         <DST>/user/training/hosts</DST>
         <TIMESTAMP>1360046220738</TIMESTAMP>
        </DATA>
       </RECORD>
       <RECORD>
        <OPCODE>OP_END_LOG_SEGMENT</OPCODE>
        <DATA>
         <TXID>27</TXID>
        </DATA>
       </RECORD>
      </EDITS>
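The XML written by `hdfs oev` can be summarized with a few lines of Python. The sketch below is only an illustration: `SAMPLE` is an abbreviated copy of the output above (most fields and some records are dropped), and the helper name `summarize` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of the oev output above; only OPCODE and TXID are kept.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<EDITS>
 <EDITS_VERSION>-40</EDITS_VERSION>
 <RECORD><OPCODE>OP_START_LOG_SEGMENT</OPCODE><DATA><TXID>20</TXID></DATA></RECORD>
 <RECORD><OPCODE>OP_ADD</OPCODE><DATA><TXID>22</TXID></DATA></RECORD>
 <RECORD><OPCODE>OP_RENAME_OLD</OPCODE><DATA><TXID>26</TXID></DATA></RECORD>
 <RECORD><OPCODE>OP_END_LOG_SEGMENT</OPCODE><DATA><TXID>27</TXID></DATA></RECORD>
</EDITS>"""


def summarize(xml_bytes):
    """Return (txid, opcode) pairs for every RECORD in an oev XML dump."""
    root = ET.fromstring(xml_bytes)
    rows = []
    for record in root.findall("RECORD"):
        opcode = record.findtext("OPCODE")
        txid = int(record.findtext("DATA/TXID"))
        rows.append((txid, opcode))
    return rows


for txid, opcode in summarize(SAMPLE):
    print(txid, opcode)
```

The first and last transaction IDs in the list (20 and 27) are the same numbers that appear in the segment's file name.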
After checkpointing
-bash-4.1$ ls -l /var/lib/hadoop-hdfs/cache/hdfs/dfs/name/current/
total 1376
-rw-r--r-- 1 hdfs hdfs 1317 Feb 5 01:36 edits_0000000000000000001-0000000000000000019
-rw-r--r-- 1 hdfs hdfs 471 Feb 5 01:37 edits_0000000000000000020-0000000000000000027
-rw-r--r-- 1 hdfs hdfs 30 Feb 5 01:38 edits_0000000000000000028-0000000000000000029
-rw-r--r-- 1 hdfs hdfs 30 Feb 5 01:39 edits_0000000000000000030-0000000000000000031
-rw-r--r-- 1 hdfs hdfs 30 Feb 5 01:40 edits_0000000000000000032-0000000000000000033
-rw-r--r-- 1 hdfs hdfs 30 Feb 5 01:41 edits_0000000000000000034-0000000000000000035
(omitted)
-rw-r--r-- 1 hdfs hdfs 30 Feb 5 02:53 edits_0000000000000000178-0000000000000000179
-rw-r--r-- 1 hdfs hdfs 30 Feb 5 02:54 edits_0000000000000000180-0000000000000000181
-rw-r--r-- 1 hdfs hdfs 30 Feb 5 02:55 edits_0000000000000000182-0000000000000000183
-rw-r--r-- 1 hdfs hdfs 30 Feb 5 02:56 edits_0000000000000000184-0000000000000000185
-rw-r--r-- 1 hdfs hdfs 30 Feb 5 02:58 edits_0000000000000000186-0000000000000000187
-rw-r--r-- 1 hdfs hdfs 1048576 Feb 5 02:58 edits_inprogress_0000000000000000188
-rw-rw-r-- 1 hdfs hdfs 119 Feb 5 01:33 fsimage_0000000000000000000
-rw-rw-r-- 1 hdfs hdfs 62 Feb 5 01:33 fsimage_0000000000000000000.md5
-rw-r--r-- 1 hdfs hdfs 1211 Feb 5 02:58 fsimage_0000000000000000187
-rw-r--r-- 1 hdfs hdfs 62 Feb 5 02:58 fsimage_0000000000000000187.md5
-rw-r--r-- 1 hdfs hdfs     4 Feb 5 02:58 seen_txid

-bash-4.1$ hexdump -C /var/lib/hadoop-hdfs/cache/hdfs/dfs/name/current/seen_txid
00000000 31 38 38 0a                      |188.|
00000004

The transaction ID is stored in seen_txid.
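The file names in the listing above encode transaction IDs, so the state of the directory can be read from the names alone. A rough Python sketch, assuming the naming patterns seen in the listing (the helper names `edits` and `classify` are hypothetical, and only a few of the listed files are reproduced):

```python
import re

FINALIZED = re.compile(r"^edits_(\d+)-(\d+)$")
INPROGRESS = re.compile(r"^edits_inprogress_(\d+)$")
IMAGE = re.compile(r"^fsimage_(\d+)$")


def edits(start, end):
    """Build a finalized segment name with 19-digit zero-padded IDs."""
    return f"edits_{start:019d}-{end:019d}"


# A subset of the names from the listing above.
NAMES = [
    edits(1, 19), edits(20, 27), edits(186, 187),
    f"edits_inprogress_{188:019d}",
    f"fsimage_{0:019d}", f"fsimage_{0:019d}.md5",
    f"fsimage_{187:019d}", f"fsimage_{187:019d}.md5",
    "seen_txid",
]


def classify(names):
    """Sort file names into finalized segments, in-progress segments and images."""
    info = {"finalized": [], "inprogress": [], "images": []}
    for name in names:
        m = FINALIZED.match(name)
        if m:
            info["finalized"].append((int(m.group(1)), int(m.group(2))))
            continue
        m = INPROGRESS.match(name)
        if m:
            info["inprogress"].append(int(m.group(1)))
            continue
        m = IMAGE.match(name)
        if m:
            info["images"].append(int(m.group(1)))
    return info


info = classify(NAMES)
latest_image = max(info["images"])
print(latest_image, info["inprogress"])  # prints 187 [188]
```

In this listing the in-progress segment starts one transaction after the newest fsimage, and that start ID (188) is the value found in seen_txid.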
oiv - fsimage viewer
-bash-4.1$ hdfs oiv -i /var/lib/hadoop-hdfs/cache/hdfs/dfs/name/current/fsimage_0000000000000000187 -o aaa
-bash-4.1$ cat aaa
drwxr-xr-x - hdfs supergroup          0 2013-02-05 01:35 /
drwxr-xr-x - hdfs supergroup          0 2013-02-05 01:35 /user
drwxr-xr-x - mapred supergroup        0 2013-02-05 01:35 /var
drwxrwxrwt - hdfs supergroup          0 2013-02-05 01:37 /user/training
-rw-r--r-- 1 training supergroup     83 2013-02-05 01:37 /user/training/hosts
drwxr-xr-x - mapred supergroup        0 2013-02-05 01:35 /var/lib
drwxr-xr-x - mapred supergroup        0 2013-02-05 01:35 /var/lib/hadoop-hdfs
drwxr-xr-x - mapred supergroup        0 2013-02-05 01:35 /var/lib/hadoop-hdfs/cache
drwxr-xr-x - mapred supergroup        0 2013-02-05 01:35 /var/lib/hadoop-hdfs/cache/mapred
drwxr-xr-x - mapred supergroup        0 2013-02-05 01:35 /var/lib/hadoop-hdfs/cache/mapred/mapred
drwx------ - mapred supergroup        0 2013-02-05 01:35 /var/lib/hadoop-hdfs/cache/mapred/mapred/system
-rw------- 1 mapred supergroup        4 2013-02-05 01:35 /var/lib/hadoop-hdfs/cache/mapred/mapred/system/jobtracker.info

The 'hosts' file appears in the fsimage after the checkpoint.
oev – edits viewer
    -bash-4.1$ hdfs oev -i /var/lib/hadoop-hdfs/cache/hdfs/dfs/name/current/edits_inprogress_0000000000000000188 -o bbb
    -bash-4.1$ cat bbb
    <?xml version="1.0" encoding="UTF-8"?>
    <EDITS>
     <EDITS_VERSION>-40</EDITS_VERSION>
     <RECORD>
      <OPCODE>OP_START_LOG_SEGMENT</OPCODE>
      <DATA>
       <TXID>188</TXID>
      </DATA>
     </RECORD>
    </EDITS>

    The segment starts at transaction ID 188, the value stored in seen_txid.
timeline (CDH4)-1
 NameNode
   t0: fsimage_0, edits_inprogress_1
       (10 transactions)
   t1: log roll
       - finalize edits_inprogress_1 and rename it to edits_1-10 (transactions 1-10)
       - create a new edits_inprogress_11, named with the new transaction ID
       (22 transactions)
   t2: log roll
       - finalize edits_inprogress_11 and rename it to edits_11-32 (transactions 11-32)
       - create edits_inprogress_33; edits_1-10 is kept

 A log roll is triggered by:
   1) NN startup
   2) saveNamespace
   3) Secondary NN checkpoint
   4) a storage directory becoming available
   5) an admin operation
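The rolling behavior in the timeline above can be mimicked with a toy model. This is only a sketch of the naming scheme, not NameNode code: the class `EditLogDir` and its methods are hypothetical, and transactions are counted the way the diagram counts them.

```python
class EditLogDir:
    """Toy model of edit log segment naming across log rolls."""

    def __init__(self):
        self.finalized = []        # list of (first_txid, last_txid)
        self.inprogress_start = 1  # first txid of the open segment
        self.last_txid = 0         # last txid written so far

    def log(self, count):
        """Record `count` transactions in the open segment."""
        self.last_txid += count

    def roll(self):
        """Finalize the open segment and start a new one at the next txid."""
        self.finalized.append((self.inprogress_start, self.last_txid))
        self.inprogress_start = self.last_txid + 1

    def files(self):
        names = [f"edits_{a}-{b}" for a, b in self.finalized]
        names.append(f"edits_inprogress_{self.inprogress_start}")
        return names


d = EditLogDir()
d.log(10)
d.roll()  # t1 in the timeline
d.log(22)
d.roll()  # t2 in the timeline
print(d.files())  # prints ['edits_1-10', 'edits_11-32', 'edits_inprogress_33']
```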
timeline (CDH4)-2
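 NameNode
   t4: checkpoint start
       - holds fsimage_0, edits_inprogress_33, edits_1-10, edits_11-32
       - the edit log is rolled: edits_inprogress_33 becomes edits_33-33
         and edits_inprogress_34 is created
   t5: checkpoint done
       - fsimage_ckpt_33 is transferred from the Secondary NameNode
   t6: fsimage_ckpt_33 is renamed to fsimage_33
       - holds fsimage_0, fsimage_33, edits_inprogress_34, edits_1-10,
         edits_11-32, edits_33-33

 Secondary NameNode
   - get the fsimage and edits from the NameNode
   - merge them into fsimage_ckpt_33
   - transfer fsimage_ckpt_33 to the NameNode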
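Because every file name carries its transaction IDs, the set of edit segments that must be merged into an image can be worked out from the names alone. The following Python sketch illustrates that idea under simplifying assumptions; `segments_to_apply` is a hypothetical helper, not the Secondary NameNode's actual logic.

```python
def segments_to_apply(image_txid, segments, target_txid):
    """Pick the finalized segments needed to go from an image to target_txid.

    image_txid: last transaction already contained in the fsimage.
    segments: iterable of (first_txid, last_txid) for finalized edits files.
    """
    needed = []
    next_txid = image_txid + 1
    for start, end in sorted(segments):
        if end <= image_txid or start > target_txid:
            continue  # already in the image, or beyond the target
        if start != next_txid:
            raise ValueError(f"gap in edit log before transaction {start}")
        needed.append((start, end))
        next_txid = end + 1
    if next_txid <= target_txid:
        raise ValueError(f"edit log ends before transaction {target_txid}")
    return needed


# State from the timeline above: fsimage_0 plus three finalized segments.
needed = segments_to_apply(0, [(1, 10), (11, 32), (33, 33)], 33)
print(needed)                             # prints [(1, 10), (11, 32), (33, 33)]
print(f"fsimage_ckpt_{needed[-1][1]}")    # prints fsimage_ckpt_33
```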
parameters (CDH4)
Example directory contents
   fsimage_0
   fsimage_33
   edits_inprogress_34
   edits_1-10
   edits_11-32
   edits_33-33

Retention
   dfs.namenode.num.checkpoints.retained
      The number of image checkpoint files that will be retained
   dfs.namenode.num.extra.edits.retained
      The number of extra transactions that should be retained

Checkpoint
   dfs.namenode.checkpoint.period
      Interval between checkpoints
   dfs.namenode.checkpoint.txns
      Number of transactions between checkpoints
   dfs.namenode.checkpoint.check.period
      How often, in seconds, the Secondary NameNode polls the NameNode

* fstime is no longer necessary since it's all encapsulated in the transaction IDs
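How the interval and transaction settings interact can be sketched as a simple predicate: a checkpoint is due when either the period has elapsed or enough uncheckpointed transactions have accumulated. The function name and the numeric values below are illustrative assumptions and are not taken from the slides.

```python
def should_checkpoint(seconds_since_last, uncheckpointed_txns,
                      period_s, txns_threshold):
    """Return True when either checkpoint condition is met.

    period_s stands in for dfs.namenode.checkpoint.period and
    txns_threshold for dfs.namenode.checkpoint.txns.
    """
    return (seconds_since_last >= period_s
            or uncheckpointed_txns >= txns_threshold)


# Illustrative settings only.
PERIOD_S = 3600
TXNS = 40000

print(should_checkpoint(120, 500, PERIOD_S, TXNS))     # prints False
print(should_checkpoint(120, 40000, PERIOD_S, TXNS))   # prints True
print(should_checkpoint(3600, 0, PERIOD_S, TXNS))      # prints True
```

In this model the poll interval (dfs.namenode.checkpoint.check.period) only controls how often such a check would be evaluated.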
Secondary NN web UI (CDH4)
reference
• HDFS parameters
  • http://archive.cloudera.com/cdh4/cdh/4/hadoop/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml
• HDFS-1073
  • https://issues.apache.org/jira/secure/attachment/12478323/hdfs
    1073.pdf
• O’Reilly, Hadoop: The Definitive Guide, 3rd edition

WordPress Websites for Engineers: Elevate Your Brand
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 
unit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxunit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptx
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 

HDFS metadata (fsimage and edits): differences between CDH3 and CDH4

  • 1. fsimage and edits in CDH3 and CDH4 Tatsuo Kawasaki tatsuo@cloudera.com
  • 2. objective: HDFS metadata (fsimage and edits) management differs between CDH3 and CDH4. This presentation introduces these differences. Please let me know if you find any issues.
  • 3. HDFS metadata in CDH3
      [root@localhost ~]# ls -ltr /var/lib/hadoop-0.20/cache/hadoop/dfs/name/current/
      total 1100
      -rw-r--r-- 1 hdfs hdfs     101 Jan 30 00:21 VERSION
      -rw-r--r-- 1 hdfs hdfs       8 Jan 30 00:21 fstime
      -rw-r--r-- 1 hdfs hdfs   57248 Jan 30 00:21 fsimage
      -rw-r--r-- 1 hdfs hdfs 1048580 Jan 31 16:16 edits
      after checkpoint
      [root@localhost ~]# ls -ltr /var/lib/hadoop-0.20/cache/hadoop/dfs/name/current/
      total 84
      -rw-r--r-- 1 hdfs hdfs   101 Feb 5 14:37 VERSION
      -rw-r--r-- 1 hdfs hdfs     8 Feb 5 14:37 fstime
      -rw-r--r-- 1 hdfs hdfs 66760 Feb 5 14:37 fsimage
      -rw-r--r-- 1 hdfs hdfs     4 Feb 5 14:37 edits
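  • 4. timeline (CDH3)
      t0: the NameNode holds fsimage, edits, and fstime.
      t1 (put file): the NameNode updates edits and updates the metadata in memory.
      t2 (checkpoint start): the NameNode creates edits.new; the Secondary NameNode gets fsimage and edits and merges them into fsimage.ckpt.
      t3 (checkpoint done): the Secondary NameNode transfers fsimage.ckpt to the NameNode.
      t4: the NameNode renames fsimage.ckpt to fsimage and edits.new to edits, and updates the time in fstime.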
  • 5. Secondary NN Web UI (CDH3)
  • 6. HDFS metadata in CDH4
      After formatting HDFS
      -bash-4.1$ ls -l /var/lib/hadoop-hdfs/cache/hdfs/dfs/name/current/
      total 1040
      -rw-r--r-- 1 hdfs hdfs 1048576 Feb 5 01:35 edits_inprogress_0000000000000000001
      -rw-rw-r-- 1 hdfs hdfs     119 Feb 5 01:33 fsimage_0000000000000000000
      -rw-rw-r-- 1 hdfs hdfs      62 Feb 5 01:33 fsimage_0000000000000000000.md5
      -rw-r--r-- 1 hdfs hdfs       2 Feb 5 01:35 seen_txid
      -rw-rw-r-- 1 hdfs hdfs     202 Feb 5 01:33 VERSION
      -bash-4.1$ hexdump -C /var/lib/hadoop-hdfs/cache/hdfs/dfs/name/current/seen_txid
      00000000  31 0a  |1.|
      00000002
      The transaction ID is stored in seen_txid.
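The hexdump above shows that seen_txid holds the transaction ID as ASCII decimal digits followed by a newline. A minimal Python sketch of decoding it, assuming that layout holds; `parse_seen_txid` is a hypothetical helper name, not part of Hadoop:

```python
def parse_seen_txid(raw: bytes) -> int:
    """Decode the ASCII decimal transaction ID stored in a seen_txid file."""
    # Assumption: the file is plain ASCII digits plus a trailing newline,
    # as the hexdump on the slide suggests.
    return int(raw.decode("ascii").strip())


# The two bytes shown by hexdump on the slide: 31 0a
print(parse_seen_txid(bytes.fromhex("310a")))
```

On a real NameNode the bytes would come from reading the seen_txid file in the name directory rather than from a hex literal.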
  • 7. try to add a new file
      [training@localhost ~]$ hadoop fs -put /etc/hosts hosts
      [training@localhost ~]$
  • 8. oiv - fsimage viewer
      -bash-4.1$ hdfs oiv -i /var/lib/hadoop-hdfs/cache/hdfs/dfs/name/current/fsimage_0000000000000000000 -o aaa
      -bash-4.1$ cat aaa
      drwxr-xr-x - hdfs supergroup 0 1969-12-31 19:00 /
      The 'hosts' file has not been written to the fsimage before checkpointing.
  • 9. oev - edits viewer
      -bash-4.1$ hdfs oev -i /var/lib/hadoop-hdfs/cache/hdfs/dfs/name/current/edits_0000000000000000020-0000000000000000027 -o bbb
      -bash-4.1$ cat bbb
      <?xml version="1.0" encoding="UTF-8"?>
      <EDITS>
        <EDITS_VERSION>-40</EDITS_VERSION>
        <RECORD>
          <OPCODE>OP_START_LOG_SEGMENT</OPCODE>
          <DATA>
            <TXID>20</TXID>
          </DATA>
        </RECORD>
        <RECORD>
          <OPCODE>OP_SET_GENSTAMP</OPCODE>
          <DATA>
            <TXID>21</TXID>
            <GENSTAMP>1003</GENSTAMP>
          </DATA>
        </RECORD>
        <RECORD>
          <OPCODE>OP_ADD</OPCODE>
          <DATA>
            <TXID>22</TXID>
            <LENGTH>0</LENGTH>
            <PATH>/user/training/hosts._COPYING_</PATH>
            <REPLICATION>1</REPLICATION>
            <MTIME>1360046220628</MTIME>
            <ATIME>1360046220628</ATIME>
            <BLOCKSIZE>67108864</BLOCKSIZE>
            <CLIENT_NAME>DFSClient_NONMAPREDUCE_1911533003_1</CLIENT_NAME>
            <CLIENT_MACHINE>127.0.0.1</CLIENT_MACHINE>
            <PERMISSION_STATUS>
              <USERNAME>training</USERNAME>
              <GROUPNAME>supergroup</GROUPNAME>
              <MODE>420</MODE>
            </PERMISSION_STATUS>
          </DATA>
        </RECORD>
      Slide callouts: TXID 20 is the start transaction ID of the segment; each TXID element is a transaction ID; the OP_ADD record is the put transaction.
  • 10. oev - edits viewer (cont)
      File name: edits_0000000000000000020-0000000000000000027
        <RECORD>
          <OPCODE>OP_SET_GENSTAMP</OPCODE>
          <DATA>
            <TXID>23</TXID>
            <GENSTAMP>1004</GENSTAMP>
          </DATA>
        </RECORD>
        <RECORD>
          <OPCODE>OP_UPDATE_BLOCKS</OPCODE>
          <DATA>
            <TXID>24</TXID>
            <PATH>/user/training/hosts._COPYING_</PATH>
            <BLOCK>
              <BLOCK_ID>-3498739165311848505</BLOCK_ID>
              <NUM_BYTES>0</NUM_BYTES>
              <GENSTAMP>1004</GENSTAMP>
            </BLOCK>
          </DATA>
        </RECORD>
        <RECORD>
          <OPCODE>OP_CLOSE</OPCODE>
          <DATA>
            <TXID>25</TXID>
            <LENGTH>0</LENGTH>
            <PATH>/user/training/hosts._COPYING_</PATH>
            <REPLICATION>1</REPLICATION>
            <MTIME>1360046220735</MTIME>
            <ATIME>1360046220628</ATIME>
            <BLOCKSIZE>67108864</BLOCKSIZE>
            <CLIENT_NAME></CLIENT_NAME>
            <CLIENT_MACHINE></CLIENT_MACHINE>
            <BLOCK>
              <BLOCK_ID>-3498739165311848505</BLOCK_ID>
              <NUM_BYTES>83</NUM_BYTES>
              <GENSTAMP>1004</GENSTAMP>
            </BLOCK>
  • 11. oev - edits viewer (cont)
      File name: edits_0000000000000000020-0000000000000000027
            <PERMISSION_STATUS>
              <USERNAME>training</USERNAME>
              <GROUPNAME>supergroup</GROUPNAME>
              <MODE>420</MODE>
            </PERMISSION_STATUS>
          </DATA>
        </RECORD>
        <RECORD>
          <OPCODE>OP_RENAME_OLD</OPCODE>
          <DATA>
            <TXID>26</TXID>
            <LENGTH>0</LENGTH>
            <SRC>/user/training/hosts._COPYING_</SRC>
            <DST>/user/training/hosts</DST>
            <TIMESTAMP>1360046220738</TIMESTAMP>
          </DATA>
        </RECORD>
        <RECORD>
          <OPCODE>OP_END_LOG_SEGMENT</OPCODE>
          <DATA>
            <TXID>27</TXID>
          </DATA>
        </RECORD>
      </EDITS>
      Slide callout: TXID 27 is the end transaction ID of the segment.
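The oev output on these slides can be summarized programmatically. The following is a rough Python sketch that lists each record's transaction ID and opcode, assuming the XML layout shown above (TXID nested inside DATA, as in this -40 edits version); `summarize_edits` is a hypothetical helper and the sample is a shortened copy of the slide output:

```python
import xml.etree.ElementTree as ET


def summarize_edits(xml_text):
    """Return (txid, opcode) pairs for every RECORD in oev XML output."""
    root = ET.fromstring(xml_text)
    summary = []
    for record in root.iter("RECORD"):
        opcode = record.findtext("OPCODE")
        # Assumption: TXID sits under DATA, as in the slide's output.
        txid = int(record.findtext("DATA/TXID"))
        summary.append((txid, opcode))
    return summary


# Shortened sample based on the slide (only three of the eight records).
SAMPLE = """<EDITS>
  <EDITS_VERSION>-40</EDITS_VERSION>
  <RECORD><OPCODE>OP_START_LOG_SEGMENT</OPCODE><DATA><TXID>20</TXID></DATA></RECORD>
  <RECORD><OPCODE>OP_RENAME_OLD</OPCODE><DATA><TXID>26</TXID></DATA></RECORD>
  <RECORD><OPCODE>OP_END_LOG_SEGMENT</OPCODE><DATA><TXID>27</TXID></DATA></RECORD>
</EDITS>"""

for txid, opcode in summarize_edits(SAMPLE):
    print(txid, opcode)
```

The first and last pairs correspond to the start and end transaction IDs that also appear in the edits file name.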
  • 12. After checkpointing
      -bash-4.1$ ls -l /var/lib/hadoop-hdfs/cache/hdfs/dfs/name/current/
      total 1376
      -rw-r--r-- 1 hdfs hdfs    1317 Feb 5 01:36 edits_0000000000000000001-0000000000000000019
      -rw-r--r-- 1 hdfs hdfs     471 Feb 5 01:37 edits_0000000000000000020-0000000000000000027
      -rw-r--r-- 1 hdfs hdfs      30 Feb 5 01:38 edits_0000000000000000028-0000000000000000029
      -rw-r--r-- 1 hdfs hdfs      30 Feb 5 01:39 edits_0000000000000000030-0000000000000000031
      -rw-r--r-- 1 hdfs hdfs      30 Feb 5 01:40 edits_0000000000000000032-0000000000000000033
      -rw-r--r-- 1 hdfs hdfs      30 Feb 5 01:41 edits_0000000000000000034-0000000000000000035
      (omitted)
      -rw-r--r-- 1 hdfs hdfs      30 Feb 5 02:53 edits_0000000000000000178-0000000000000000179
      -rw-r--r-- 1 hdfs hdfs      30 Feb 5 02:54 edits_0000000000000000180-0000000000000000181
      -rw-r--r-- 1 hdfs hdfs      30 Feb 5 02:55 edits_0000000000000000182-0000000000000000183
      -rw-r--r-- 1 hdfs hdfs      30 Feb 5 02:56 edits_0000000000000000184-0000000000000000185
      -rw-r--r-- 1 hdfs hdfs      30 Feb 5 02:58 edits_0000000000000000186-0000000000000000187
      -rw-r--r-- 1 hdfs hdfs 1048576 Feb 5 02:58 edits_inprogress_0000000000000000188
      -rw-rw-r-- 1 hdfs hdfs     119 Feb 5 01:33 fsimage_0000000000000000000
      -rw-rw-r-- 1 hdfs hdfs      62 Feb 5 01:33 fsimage_0000000000000000000.md5
      -rw-r--r-- 1 hdfs hdfs    1211 Feb 5 02:58 fsimage_0000000000000000187
      -rw-r--r-- 1 hdfs hdfs      62 Feb 5 02:58 fsimage_0000000000000000187.md5
      -rw-r--r-- 1 hdfs hdfs       4 Feb 5 02:58 seen_txid
      -bash-4.1$ hexdump -C /var/lib/hadoop-hdfs/cache/hdfs/dfs/name/current/seen_txid
      00000000  31 38 38 0a  |188.|
      00000004
      The transaction ID is stored in seen_txid.
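The finalized edits files in this listing encode their first and last transaction IDs in the file name, so consecutive segments can be checked for gaps. A small Python sketch of that check, assuming the `edits_<start>-<end>` naming seen on the slide; `finalized_ranges` and `find_gaps` are hypothetical helper names:

```python
import re

# Matches finalized segments only; edits_inprogress_* and fsimage_* are skipped.
FINALIZED = re.compile(r"^edits_(\d+)-(\d+)$")


def finalized_ranges(names):
    """Extract sorted (first_txid, last_txid) pairs from file names."""
    ranges = []
    for name in names:
        match = FINALIZED.match(name)
        if match:
            ranges.append((int(match.group(1)), int(match.group(2))))
    return sorted(ranges)


def find_gaps(ranges):
    """Return (first_missing, last_missing) pairs between adjacent segments."""
    gaps = []
    for (_, prev_end), (next_start, _) in zip(ranges, ranges[1:]):
        if next_start != prev_end + 1:
            gaps.append((prev_end + 1, next_start - 1))
    return gaps


names = [
    "edits_0000000000000000001-0000000000000000019",
    "edits_0000000000000000020-0000000000000000027",
    "edits_0000000000000000028-0000000000000000029",
    "edits_inprogress_0000000000000000188",
    "fsimage_0000000000000000187",
]
print(finalized_ranges(names))
print(find_gaps(finalized_ranges(names)))
```

Because the slide's listing omits the middle of the directory, running this on the abbreviated names would report a gap; on the full directory the segments should be contiguous.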
  • 13. oiv - fsimage viewer
      -bash-4.1$ hdfs oiv -i /var/lib/hadoop-hdfs/cache/hdfs/dfs/name/current/fsimage_0000000000000000187 -o aaa
      -bash-4.1$ cat aaa
      drwxr-xr-x - hdfs supergroup 0 2013-02-05 01:35 /
      drwxr-xr-x - hdfs supergroup 0 2013-02-05 01:35 /user
      drwxr-xr-x - mapred supergroup 0 2013-02-05 01:35 /var
      drwxrwxrwt - hdfs supergroup 0 2013-02-05 01:37 /user/training
      -rw-r--r-- 1 training supergroup 83 2013-02-05 01:37 /user/training/hosts
      drwxr-xr-x - mapred supergroup 0 2013-02-05 01:35 /var/lib
      drwxr-xr-x - mapred supergroup 0 2013-02-05 01:35 /var/lib/hadoop-hdfs
      drwxr-xr-x - mapred supergroup 0 2013-02-05 01:35 /var/lib/hadoop-hdfs/cache
      drwxr-xr-x - mapred supergroup 0 2013-02-05 01:35 /var/lib/hadoop-hdfs/cache/mapred
      drwxr-xr-x - mapred supergroup 0 2013-02-05 01:35 /var/lib/hadoop-hdfs/cache/mapred/mapred
      drwx------ - mapred supergroup 0 2013-02-05 01:35 /var/lib/hadoop-hdfs/cache/mapred/mapred/system
      -rw------- 1 mapred supergroup 4 2013-02-05 01:35 /var/lib/hadoop-hdfs/cache/mapred/mapred/system/jobtracker.info
      The 'hosts' file appears in the fsimage after the checkpoint.
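Checking whether a path made it into the fsimage can be done by scanning the oiv text output. A minimal Python sketch, assuming the ls-style layout shown on this slide (eight whitespace-separated columns with the path last); `paths_in_listing` is a hypothetical helper and the sample is a shortened copy of the slide output:

```python
def paths_in_listing(text):
    """Collect the path column from ls-style oiv output."""
    paths = []
    for line in text.splitlines():
        # Columns: permissions, replication, owner, group, size, date, time, path.
        fields = line.split(None, 7)
        if len(fields) == 8:
            paths.append(fields[7].strip())
    return paths


LISTING = """drwxr-xr-x - hdfs supergroup 0 2013-02-05 01:35 /
drwxr-xr-x - hdfs supergroup 0 2013-02-05 01:35 /user
drwxrwxrwt - hdfs supergroup 0 2013-02-05 01:37 /user/training
-rw-r--r-- 1 training supergroup 83 2013-02-05 01:37 /user/training/hosts"""

print("/user/training/hosts" in paths_in_listing(LISTING))
```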
  • 14. oev - edits viewer
      -bash-4.1$ hdfs oev -i /var/lib/hadoop-hdfs/cache/hdfs/dfs/name/current/edits_inprogress_0000000000000000188 -o bbb
      -bash-4.1$ cat bbb
      <?xml version="1.0" encoding="UTF-8"?>
      <EDITS>
        <EDITS_VERSION>-40</EDITS_VERSION>
        <RECORD>
          <OPCODE>OP_START_LOG_SEGMENT</OPCODE>
          <DATA>
            <TXID>188</TXID>
          </DATA>
        </RECORD>
      </EDITS>
      Slide callout: TXID 188 is the transaction ID.
  • 15. timeline (CDH4)-1
      t0: the NameNode holds fsimage_0 and edits_inprogress_1; 10 transactions follow.
      t1: edits_inprogress_1 is finalized and renamed to edits_1-10 (transactions 1-10), and a new edits_inprogress_11 is created using the new transaction ID; 22 transactions follow.
      t2: edits_inprogress_11 is finalized and renamed to edits_11-32 (transactions 11-32), and edits_inprogress_33 is created.
      Events that trigger a log roll: 1) NN startup 2) saveNamespace 3) Secondary NN checkpoint 4) a storage directory becomes available 5) admin operation
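The naming arithmetic in this timeline can be sketched in a few lines of Python. This is only an illustration of the file-name pattern, not NameNode code; `roll_log` is a hypothetical helper, and the 19-digit zero padding is taken from the directory listings earlier in the deck:

```python
def roll_log(first_txid: int, txn_count: int):
    """Return (finalized segment name, next in-progress name) for a log roll.

    first_txid is the ID the in-progress segment started at and txn_count
    is how many transactions it received before the roll.
    """
    last_txid = first_txid + txn_count - 1
    finalized = f"edits_{first_txid:019d}-{last_txid:019d}"
    next_inprogress = f"edits_inprogress_{last_txid + 1:019d}"
    return finalized, next_inprogress


# 10 transactions starting at 1, then 22 transactions starting at 11.
print(roll_log(1, 10))
print(roll_log(11, 22))
```

With these inputs the segments cover transactions 1-10 and 11-32, and the next in-progress segment starts at 33, matching the file names on the slide.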
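  • 16. timeline (CDH4)-2
      Checkpoint start: the NameNode holds fsimage_0, edits_1-10, edits_11-32, and edits_inprogress_33. The log is rolled, so edits_inprogress_33 becomes edits_33-33 and edits_inprogress_34 is created.
      During the checkpoint: the Secondary NameNode gets the metadata from the NameNode, merges it into fsimage_ckpt_33, and transfers fsimage_ckpt_33 to the NameNode.
      Checkpoint done: the NameNode renames fsimage_ckpt_33 to fsimage_33; fsimage_0 is still present.
      (The slide labels these points t4, t5, and t6.)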
  • 17. parameters (CDH4)
      • dfs.namenode.num.checkpoints.retained: the number of image checkpoint files that will be retained (fsimage_0, fsimage_33)
      • dfs.namenode.num.extra.edits.retained: the number of extra transactions which should be retained (edits_1-10, edits_11-32, edits_33-33)
      • dfs.namenode.checkpoint.period: the checkpoint interval
      • dfs.namenode.checkpoint.txns: the number of transactions between checkpoints
      • dfs.namenode.checkpoint.check.period: the Secondary NameNode polls the NameNode at this interval, in seconds
      * fstime is no longer necessary since it's all encapsulated in the transaction IDs
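How the interval and transaction-count parameters combine can be illustrated with a simplified Python model: a checkpoint is due when either threshold is reached. This is a sketch of the decision only, not the Secondary NameNode's actual code; `checkpoint_due` is a hypothetical helper and the threshold values below are illustrative, not defaults:

```python
def checkpoint_due(seconds_since_last, txns_since_last,
                   period_seconds, txns_threshold):
    """Simplified model: checkpoint when either limit has been reached.

    period_seconds stands in for dfs.namenode.checkpoint.period and
    txns_threshold for dfs.namenode.checkpoint.txns.
    """
    return (seconds_since_last >= period_seconds
            or txns_since_last >= txns_threshold)


# Illustrative values only (assumptions, not taken from the slides).
PERIOD_SECONDS = 3600
TXNS_THRESHOLD = 40000

print(checkpoint_due(120, 50000, PERIOD_SECONDS, TXNS_THRESHOLD))
print(checkpoint_due(120, 10, PERIOD_SECONDS, TXNS_THRESHOLD))
```

In this model the check itself would run once per dfs.namenode.checkpoint.check.period, which is why that parameter is described as a polling interval.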
  • 18. Secondary NN web UI (CDH4)
  • 19. reference
      • HDFS parameters: http://archive.cloudera.com/cdh4/cdh/4/hadoop/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml
      • HDFS-1073: https://issues.apache.org/jira/secure/attachment/12478323/hdfs1073.pdf
      • O'Reilly, Hadoop: The Definitive Guide, 3rd edition