How Append, the new operation in the Hadoop Distributed File System (HDFS), works: the internals of its processing, and the states it introduces beyond those of the write operation.
16. http://dataera.wordpress.com
http://linkedin.com/in/yuechen2
State
A block on the Name Node (NN) is in one of 4 states:
complete
under_construction
under_recovery
committed
A replica on a Data Node (DN) is in one of 5 states:
Finalized
RBW (ReplicaBeingWritten: in the write pipeline, visible to readers)
RUR (ReplicaUnderRecovery: the lease has expired)
RWR (ReplicaWaitingToBeRecovered: if a DN goes down and restarts, all of its RBW replicas become RWR)
Temporary (being transmitted between DNs)
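The two sets of states above can be written down as a pair of Java enums. This is only an illustrative sketch: the type names `BlockState`, `ReplicaState`, and `StateDemo` and the `longName` helper are assumptions made for this example, not a copy of the HDFS source.

```java
// Illustrative sketch only: type and method names are assumptions,
// not the actual HDFS classes.
enum BlockState {
    // Block states tracked by the Name Node.
    COMPLETE, UNDER_CONSTRUCTION, UNDER_RECOVERY, COMMITTED
}

enum ReplicaState {
    // Replica states tracked by each Data Node.
    FINALIZED("Finalized"),
    RBW("ReplicaBeingWritten"),
    RUR("ReplicaUnderRecovery"),
    RWR("ReplicaWaitingToBeRecovered"),
    TEMPORARY("Temporary");

    private final String longName;

    ReplicaState(String longName) {
        this.longName = longName;
    }

    String longName() {
        return longName;
    }
}

class StateDemo {
    public static void main(String[] args) {
        // Print each abbreviation next to its expanded name.
        for (ReplicaState s : ReplicaState.values()) {
            System.out.println(s + " = " + s.longName());
        }
    }
}
```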
Overall Procedure
From the client's perspective, an append starts with a call to append on DistributedFileSystem, which returns a stream object, FSDataOutputStream out. To append data to the file, the client calls out.write, and then calls out.close when it is done.
write/append
1) Normal close
DFSOutputStream.close() -> FSNamesystem.completeFile() -> commitOrCompleteLastBlock()
The file in the NN (Name Node) becomes an INode, no longer an INodeUnderConstruction.
2) Abnormal close
The file remains an INodeUnderConstruction, and the lease (write lock) on the file is not released.
Lease recovery
Block recovery
Lease Recovery
When a file is not closed normally, the 3 replicas of its last block may be in different states, differing in size and in generation stamp (the version of the block).
The recovery procedure checks whether the previous lease holder has renewed the lease and whether the lease has exceeded the softLimit (its time limit); if the limit is exceeded, it calls internalReleaseLease().
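The soft-limit check described above can be sketched in a few lines of Java. This is a hypothetical model, not the HDFS lease code: the class `LeaseSketch` and its methods are invented for illustration, and the 60-second limit used in `main` is an assumed value chosen only for the demo.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical model of a write lease; not the HDFS implementation.
class LeaseSketch {
    private final String holder;
    private Instant lastRenewal;

    LeaseSketch(String holder, Instant now) {
        this.holder = holder;
        this.lastRenewal = now;
    }

    String holder() {
        return holder;
    }

    // The holder calls this periodically to keep the lease alive.
    void renew(Instant now) {
        lastRenewal = now;
    }

    // True once the holder has gone longer than softLimit without renewing.
    boolean softLimitExpired(Instant now, Duration softLimit) {
        return Duration.between(lastRenewal, now).compareTo(softLimit) > 0;
    }

    public static void main(String[] args) {
        Duration softLimit = Duration.ofSeconds(60); // assumed value for the demo
        Instant start = Instant.now();
        LeaseSketch lease = new LeaseSketch("client-1", start);

        // Simulate a later check without an intervening renewal.
        Instant later = start.plusSeconds(90);
        if (lease.softLimitExpired(later, softLimit)) {
            // This is the point where the text says internalReleaseLease() is called.
            System.out.println("soft limit exceeded for " + lease.holder()
                    + ": lease recovery would start");
        }
    }
}
```

Passing the current time in as a parameter, rather than reading the clock inside the method, keeps the expiry check deterministic and easy to exercise.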
Block Recovery
The NN delivers the block recovery command to a DN in the reply to that DN's heartbeat.
Find the best state among all replicas, and recover the remaining replicas to that state.
State Ranking: Finalized > RBW > RWR > RUR > Temporary
When recovery finishes, the interrupted operation (append, write, etc.) continues.
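Picking the best state by the ranking above can be sketched as follows. The sketch covers only the state comparison; it does not model how replicas are actually brought to that state (lengths, generation stamps), and the names `RankedReplicaState` and `BlockRecoverySketch` are assumptions made for this example.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Declaration order encodes the ranking from the text: earlier is better.
enum RankedReplicaState { FINALIZED, RBW, RWR, RUR, TEMPORARY }

// Hypothetical helper; not the HDFS recovery code.
class BlockRecoverySketch {
    // Returns the highest-ranked state found among the replicas.
    static RankedReplicaState bestState(List<RankedReplicaState> replicas) {
        if (replicas.isEmpty()) {
            throw new IllegalArgumentException("no replicas to recover");
        }
        // Enums compare by declaration order, so the minimum is the best state.
        return Collections.min(replicas);
    }

    public static void main(String[] args) {
        List<RankedReplicaState> replicas = Arrays.asList(
                RankedReplicaState.RWR,
                RankedReplicaState.RBW,
                RankedReplicaState.RBW);
        System.out.println("best state: " + bestState(replicas));
    }
}
```

Encoding the ranking in the enum's declaration order keeps the comparison to a single library call, at the cost of making that order significant.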