Network Coding for Distributed
Storage Systems*
Presented by
Jayant Apte
ASPITRG
7/9/13 & 7/11/13
*Dimakis, A.G.; Godfrey, P.B.; Wu, Y.; Wainwright, M.J.; Ramchandran, K., "Network Coding for
Distributed Storage Systems," Information Theory, IEEE Transactions on, vol. 56, no. 9,
pp. 4539-4551, Sept. 2010
Outline
● Part 1
– Single Source Multi-cast Linear Network Coding
● Part 2
– The repair problem
– Reduction of repair problem to single source multicast network
– Family of single source multi-cast networks arising from the reduction
– A lower bound on min-cuts (equivalently, on max-flow, and hence on the
coding capacity of the network)
– Minimization of storage bandwidth subject to this lower bound
Some background on single source
multi-cast network coding
*Koetter, R.; Medard, M., "An algebraic approach to network coding," Networking,
IEEE/ACM Transactions on, vol. 11, no. 5, pp. 782-795, Oct. 2003
Max-Flow-Min-Cut Theorem
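The theorem states that the maximum flow from a source to a sink equals the capacity of the minimum cut separating them. As a minimal sketch (not taken from the slides), the snippet below checks this on the butterfly network with unit-capacity edges; the node names `s`, `a`, `b`, `c`, `d`, `t1`, `t2` and both helper functions are illustrative assumptions.

```python
from collections import deque
from itertools import combinations

def max_flow(capacity, source, sink):
    """Edmonds-Karp: augment along shortest residual paths until none is left."""
    residual = {}
    for (u, v), c in capacity.items():
        residual.setdefault(u, {})
        residual.setdefault(v, {})
        residual[u][v] = residual[u].get(v, 0) + c
        residual[v].setdefault(u, 0)  # reverse edge, initially empty
    flow = 0
    while True:
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, c in residual[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return flow
        # Bottleneck capacity along the path found by the search.
        bottleneck, v = float("inf"), sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        # Push the bottleneck along the path and update reverse edges.
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        flow += bottleneck

def min_cut(capacity, source, sink):
    """Brute force over every node set that contains source but not sink."""
    others = sorted({n for edge in capacity for n in edge} - {source, sink})
    best = float("inf")
    for r in range(len(others) + 1):
        for subset in combinations(others, r):
            side = set(subset) | {source}
            value = sum(c for (u, v), c in capacity.items()
                        if u in side and v not in side)
            best = min(best, value)
    return best

# Butterfly network, every edge has capacity 1 (assumed example topology).
butterfly = {e: 1 for e in [("s", "a"), ("s", "b"), ("a", "t1"), ("a", "c"),
                            ("b", "c"), ("b", "t2"), ("c", "d"),
                            ("d", "t1"), ("d", "t2")]}

for sink in ("t1", "t2"):
    print(sink, max_flow(butterfly, "s", sink), min_cut(butterfly, "s", sink))
```

The brute-force cut enumeration is exponential in the number of nodes and is only meant to make the equality visible on a small graph.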
Basic Network Model
Local coding coefficients
Global coding coefficients
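To make the distinction concrete: local coding coefficients say how a node mixes its incoming edges onto an outgoing edge, while the global coding vector of an edge expresses what it carries in terms of the source symbols. The sketch below is an illustration on the butterfly network over GF(2), assuming every local coefficient equals 1; the topology and names are assumptions, not the slide's example.

```python
# Edges of the butterfly network in topological order, as (tail, head).
edges = [("s", "a"), ("s", "b"), ("a", "t1"), ("a", "c"), ("b", "c"),
         ("b", "t2"), ("c", "d"), ("d", "t1"), ("d", "t2")]

# Global coding vectors over GF(2): (coefficient of x1, coefficient of x2).
# The source injects x1 on (s, a) and x2 on (s, b).
global_vec = {("s", "a"): (1, 0), ("s", "b"): (0, 1)}

for edge in edges:
    if edge in global_vec:
        continue
    incoming = [f for f in edges if f[1] == edge[0]]
    vec = (0, 0)
    for f in incoming:
        # Local coding coefficient 1 for every (incoming, outgoing) pair,
        # so the outgoing edge carries the XOR of all incoming edges.
        vec = (vec[0] ^ global_vec[f][0], vec[1] ^ global_vec[f][1])
    global_vec[edge] = vec

def decodable(sink):
    """A sink decodes x1, x2 if its two incoming vectors are independent over GF(2)."""
    (a, b), (c, d) = [global_vec[e] for e in edges if e[1] == sink]
    return (a * d - b * c) % 2 == 1

print(global_vec[("c", "d")], decodable("t1"), decodable("t2"))
```

The bottleneck edge (c, d) carries x1 + x2, which is what lets both sinks recover both symbols.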
Matrix formulation
The transfer matrix
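The formula on this slide did not survive extraction. As a hedged reminder of the form used in the algebraic framework of [2]: A maps the source processes onto edges, F collects the local coding coefficients between adjacent edges, and B maps edges to the sink outputs. Here M denotes the transfer matrix, not the file size used in Part 2.

```latex
\[
  M = A\,(I - F)^{-1} B^{T},
  \qquad
  (I - F)^{-1} = I + F + F^{2} + \cdots
\]
```

For an acyclic network F is nilpotent, so the series has finitely many non-zero terms.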
Proof of Theorem 2
Proof of Theorem 3
Extension to multicast
Part 2 - Outline
● Introduction
● The repair problem
● Reduction of repair problem to single source multicast network
● Family of single source multi-cast networks arising from the
reduction
● A lower bound on min-cuts (equivalently, on max-flow, and
hence on the coding capacity of the network)
● Minimization of storage bandwidth subject to this lower bound
Distributed storage
● We are living in an internet age
● Demand for large-scale data storage has increased
significantly
● Social networks, file sharing and video sharing require
seamless storage, access and security for massive
amounts of data
● Storage media (viz. hard drives) are individually
unreliable
● Hence we introduce redundancy via erasure codes
to improve reliability
A storage code ((4,2) MDS)
[Figure: a file is split into blocks A1, A2, B1, B2 (labelled Fragment 1 and Fragment 2) and encoded onto four disks]
● Disk 1: A1, A2
● Disk 2: B1, B2
● Disk 3: A1+B1, A2+B2
● Disk 4: A2+B1, A1+A2+B2
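The MDS property of this code (any 2 of the 4 disks suffice to recover A1, A2, B1, B2) can be checked mechanically. The sketch below assumes the additions in the figure are XORs over GF(2) and encodes each stored block as a 4-bit mask; the helper names are illustrative.

```python
from itertools import combinations

# Each stored block is a GF(2) combination of (A1, A2, B1, B2), written as a
# bitmask: bit 0 = A1, bit 1 = A2, bit 2 = B1, bit 3 = B2.
A1, A2, B1, B2 = 1, 2, 4, 8
disks = {
    1: [A1, A2],
    2: [B1, B2],
    3: [A1 ^ B1, A2 ^ B2],
    4: [A2 ^ B1, A1 ^ A2 ^ B2],
}

def gf2_rank(rows):
    """Rank over GF(2) of bitmask row vectors, by Gaussian elimination."""
    pivots = {}  # leading bit position -> reduced row with that leading bit
    for row in rows:
        while row:
            top = row.bit_length() - 1
            if top not in pivots:
                pivots[top] = row
                break
            row ^= pivots[top]  # cancel the leading bit and keep reducing
    return len(pivots)

def recoverable(pair):
    """True if the four blocks on the two disks determine A1, A2, B1, B2."""
    first, second = pair
    return gf2_rank(disks[first] + disks[second]) == 4

all_pairs_ok = all(recoverable(pair) for pair in combinations(disks, 2))
print(all_pairs_ok)
```

A single disk only ever yields rank 2, so two disks is also the minimum needed.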
Problem Definition
● Storage nodes are distributed and connected in a network
● Together they represent some storage code (MDS, or
approximately MDS such as LDPC)
● The repair problem arises when a storage node of the
system fails
● The nodes that are still functioning are called active nodes
● A newcomer node, called the repair node, must connect to a subset
of the active nodes, obtain information from them and reconstruct
the storage code, i.e. repair the code
● The objective is to minimize the amount of information transferred
in this process
Notation
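The symbols on this slide were images and did not survive extraction. As a hedged reconstruction, the notation of [1], which the talk follows, is summarized below; the slide itself may have used only a subset of these or different letters.

```latex
\begin{align*}
  M      &: \text{size of the data object (file)} \\
  n      &: \text{number of storage nodes} \\
  k      &: \text{number of nodes a data collector contacts to recover the file} \\
  d      &: \text{number of active nodes a newcomer contacts during repair} \\
  \alpha &: \text{amount of data stored at each node} \\
  \beta  &: \text{amount of data downloaded from each of the } d \text{ contacted nodes} \\
  \gamma &= d\beta : \text{total repair bandwidth}
\end{align*}
```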
The repair problem
[Figure: Example: a (4,2) MDS code. Fragments y1, y2 are encoded into x1, x2, x3, x4; x5 is the newcomer. β = repair bandwidth per node]
The repair problem
● The data object (2 Mb) is divided into two fragments, y1 and y2 (1 Mb each)
● 4 encoded fragments are generated: x1, x2, x3, x4 (1 Mb each)
● x4 fails; the newcomer x5 needs to communicate with the existing nodes
and create a new encoded packet
● Any two out of x1, x2, x3, x5 must suffice to recover the
original data object
The repair problem
● What (and how much) should x1, x2, x3 communicate to x5
so that the amounts of data they transfer are minimized?
[Figure: Example 1: a (4,2) MDS code with fragments y1, y2, nodes x1 to x4 and newcomer x5]
Variants of the repair problem
● Exact Repair: Failed blocks are exactly regenerated,
i.e. the newcomer node must reconstruct an exact replica of
the encoded block in the failed node
● Functional Repair: The newly generated data block
need not be an exact replica of the encoded block on the
failed node
● Exact repair of the systematic part: Only the
systematic part is repaired exactly, so there is always an
uncoded copy of the original file available
Functional repair example
(Using RLNC)
[Figure: functional repair of a (4,2) code; each box is 0.5 Mb]
● File fragments: a1, b1, a2, b2
● Encoded data blocks:
– Node 1: a1, b1
– Node 2: a2, b2
– Node 3: a1+b1+a2+b2, a1+2b1+a2+2b2
– Node 4 (failed): a1+2b1+3a2+b2, 3a1+2b1+2a2+3b2
● Encoded repair packets (coding coefficients in parentheses):
– From node 1: p1 = a1+2b1 (1, 2)
– From node 2: p2 = 2a2+b2 (2, 1)
– From node 3: p3 = 4a1+5b1+4a2+5b2 (3, 1)
● Repair node (combining p1, p2, p3; edge labels 1, 1, 1, 1, 2, 2):
5a1+7b1+8a2+7b2 and 6a1+9b1+6a2+6b2
● The flow across the cut between the active nodes and the repair node is the repair b/w
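The arithmetic in this example can be replayed directly. The sketch below represents every packet as its coefficient vector on (a1, b1, a2, b2) and uses plain integer arithmetic to mirror the figure (an assumption: an actual RLNC scheme would work in a finite field). The repair-node coefficients (1, 2, 1) and (2, 1, 1) are derived here so that the results match the two blocks shown at the repair node; the figure only lists the edge labels.

```python
def combine(coeffs, blocks):
    """Linear combination of coefficient vectors over (a1, b1, a2, b2)."""
    return tuple(sum(c * blk[j] for c, blk in zip(coeffs, blocks))
                 for j in range(4))

# Blocks stored on the three surviving nodes.
node1 = [(1, 0, 0, 0), (0, 1, 0, 0)]   # a1, b1
node2 = [(0, 0, 1, 0), (0, 0, 0, 1)]   # a2, b2
node3 = [(1, 1, 1, 1), (1, 2, 1, 2)]   # a1+b1+a2+b2, a1+2b1+a2+2b2

# Each surviving node sends one 0.5 Mb packet mixing its two blocks.
p1 = combine((1, 2), node1)            # a1 + 2b1
p2 = combine((2, 1), node2)            # 2a2 + b2
p3 = combine((3, 1), node3)            # 4a1 + 5b1 + 4a2 + 5b2

# The repair node mixes the three packets into two new stored blocks.
new_block_1 = combine((1, 2, 1), [p1, p2, p3])
new_block_2 = combine((2, 1, 1), [p1, p2, p3])

BOX_MB = 0.5
repair_bandwidth_mb = 3 * BOX_MB       # three packets cross the cut
naive_bandwidth_mb = 4 * BOX_MB        # downloading the whole 2 Mb file instead

print(p3, new_block_1, new_block_2, repair_bandwidth_mb, naive_bandwidth_mb)
```

The new blocks differ from the ones node 4 held before it failed, which is what makes this a functional rather than an exact repair.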
An attempt at solution
[Figure: Example 1: a (4,2) MDS code with fragments y1, y2, nodes x1 to x4 and newcomer x5]
● x5 recovers the original data object and creates a new
independent linear combination
Can we do better than this?
YES!
Reduction to information flow graph
Example
[Figure: information flow graph corresponding to Example 1, a (4,2) MDS code, after node 4 has failed. Vertices: source S; x1_in to x5_in and x1_out to x5_out; data collector DC]
Dynamic nature of information flow
graph due to given failure pattern
[Figure: the same information flow graph corresponding to Example 1, a (4,2) MDS code; node 4 has failed]
Family of information flow graphs
[Figure: information flow graph corresponding to Example 1, a (4,2) MDS code, extended with a second newcomer x6 (x6_in, x6_out) after node 3 also failed, say a few minutes later]
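The two-failure graph can be built and probed numerically. The sketch below is an illustration under stated assumptions: each node x_i is split into x_i_in and x_i_out joined by an edge of capacity alpha (its storage), a newcomer downloads beta from each of d = 3 nodes, x6 is assumed to contact x1, x2 and x5, and capacities are integers in units chosen by the caller. The max-flow routine is a plain Ford-Fulkerson, not something from the slides.

```python
INF = 10**9  # stands in for the infinite-capacity edges

def max_flow(cap, s, t):
    """Ford-Fulkerson with depth-first augmenting paths (integer capacities)."""
    res = {}
    for (u, v), c in cap.items():
        res.setdefault(u, {})[v] = res.get(u, {}).get(v, 0) + c
        res.setdefault(v, {}).setdefault(u, 0)  # reverse edge

    def push(u, limit, seen):
        if u == t:
            return limit
        seen.add(u)
        for v, c in res[u].items():
            if c > 0 and v not in seen:
                sent = push(v, min(limit, c), seen)
                if sent:
                    res[u][v] -= sent
                    res[v][u] += sent
                    return sent
        return 0

    total = 0
    while True:
        sent = push(s, INF, set())
        if not sent:
            return total
        total += sent

def flow_graph(alpha, beta, collector_nodes):
    """Information flow graph after x4 and then x3 have failed."""
    cap = {}
    for i in (1, 2, 3, 4):                 # initial nodes, fed by the source
        cap[("S", f"x{i}_in")] = INF
        cap[(f"x{i}_in", f"x{i}_out")] = alpha
    for j in (1, 2, 3):                    # x4 failed: x5 contacts x1, x2, x3
        cap[(f"x{j}_out", "x5_in")] = beta
    cap[("x5_in", "x5_out")] = alpha
    for j in (1, 2, 5):                    # x3 failed: x6 contacts x1, x2, x5
        cap[(f"x{j}_out", "x6_in")] = beta
    cap[("x6_in", "x6_out")] = alpha
    for i in collector_nodes:              # DC reads k = 2 nodes completely
        cap[(f"x{i}_out", "DC")] = INF
    return cap

# Units of 0.5 Mb: file M = 4, alpha = 2, beta = 1 (the functional repair example).
print(max_flow(flow_graph(2, 1, (5, 6)), "S", "DC"))
# Units of 0.25 Mb: file M = 8, alpha = 4, beta = 1 (too little repair traffic).
print(max_flow(flow_graph(4, 1, (5, 6)), "S", "DC"))
```

In the first setting the min-cut to a collector reading the two newcomers equals the file size, so recovery is possible; in the second it falls short of the file size, so no code can let that collector recover the file.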
Lemma 1
Information flow graph
[Figure sequence: an information flow graph rooted at the source S, shown over six slides]
Proof
WLOG
Minimize α subject to the lower
bound
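The formulas on these slides were images and are missing from the transcript. As a hedged reconstruction in the notation of [1] (α = storage per node, β = download per contacted node, γ = dβ = total repair bandwidth, M = file size, d ≥ k), the min-cut bound and the resulting optimization problem read:

```latex
\begin{align*}
  \operatorname{mincut}(S, \mathrm{DC})
    &\ge \sum_{i=0}^{k-1} \min\{(d-i)\beta,\ \alpha\}, \\
  \alpha^{*}(n,k,d,\gamma)
    &= \min \alpha
    \quad \text{subject to} \quad
    \sum_{i=0}^{k-1} \min\Bigl\{\Bigl(1-\tfrac{i}{d}\Bigr)\gamma,\ \alpha\Bigr\} \ge M,
    \quad \alpha \ge 0 .
\end{align*}
```

The two sums agree term by term because (d - i)β = (1 - i/d)γ when γ = dβ.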
Nature of constraint
LHS of constraint as function of α
Solution to the optimization
Simplification of solution
Solution
Minimum repair bandwidth
Storage-Bandwidth Tradeoff
Relationship between α and γ [1]
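The tradeoff can be explored numerically without the closed-form solution. The sketch below assumes the constraint from [1], sum over i = 0..k-1 of min((1 - i/d)γ, α) ≥ M, and solves for the smallest feasible α at a given γ with exact fractions; the two extreme points use the minimum-storage (MSR) and minimum-bandwidth (MBR) expressions from [1]. Function names are illustrative.

```python
from fractions import Fraction

def min_storage(M, k, d, gamma):
    """Smallest alpha with sum_i min((1 - i/d) * gamma, alpha) >= M, else None."""
    caps = sorted(Fraction(d - i, d) * gamma for i in range(k))  # ascending
    if sum(caps) < M:
        return None  # infeasible: no amount of storage reaches M at this gamma
    saturated = Fraction(0)
    for m, cap in enumerate(caps):
        # Suppose the m smallest terms are saturated and the rest equal alpha.
        alpha = (M - saturated) / (k - m)
        if alpha <= cap:
            return alpha
        saturated += cap
    return None  # not reached when the feasibility check above passes

def msr_point(M, k, d):
    """Minimum-storage point (alpha, gamma)."""
    return Fraction(M, k), Fraction(M * d, k * (d - k + 1))

def mbr_point(M, k, d):
    """Minimum-bandwidth point (alpha, gamma); here alpha equals gamma."""
    g = Fraction(2 * M * d, 2 * k * d - k * k + k)
    return g, g

# Example 1 parameters: M = 2 Mb, k = 2, d = 3.
M, k, d = 2, 2, 3
for gamma in (Fraction(1), Fraction(6, 5), Fraction(3, 2), Fraction(2)):
    print(gamma, min_storage(M, k, d, gamma))
print(msr_point(M, k, d), mbr_point(M, k, d))
```

For these parameters the MSR point needs γ = 1.5 Mb with α = 1 Mb, which matches the functional repair example, while the naive repair spends γ = 2 Mb for the same storage.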
References
● [1] Dimakis, A.G.; Godfrey, P.B.; Wu, Y.; Wainwright, M.J.; Ramchandran, K., "Network Coding for
Distributed Storage Systems," Information Theory, IEEE Transactions on, vol. 56, no. 9,
pp. 4539-4551, Sept. 2010
● [2] Koetter, R.; Medard, M., "An algebraic approach to network coding," Networking,
IEEE/ACM Transactions on, vol. 11, no. 5, pp. 782-795, Oct. 2003
● [3] Ho, T.; Lun, D., Network Coding: An Introduction, Cambridge University Press,
New York, NY, USA, 2008
● [4] Dimakis, A.G.; Ramchandran, K.; Wu, Y.; Suh, C., "A Survey on Network
Codes for Distributed Storage," Proceedings of the IEEE, vol. 99, no. 3, pp. 476-489,
March 2011
