This module has been created to answer all the questions on how IPFS can be used for dynamic real-time applications. In this module, you will learn:
- how to reason about dynamic data on IPFS,
- IPNS, the simplest construction for naming in IPFS,
- how PubSub can offer subsecond speeds for interactive applications,
- how CRDTs are a fundamental building block for distributed applications,
- what is available in the ecosystem.
2. a
CONTENT ADDRESSING CONTENT DISCOVERY
& ROUTING
CONTENT EXCHANGE
● Chunking
● Linking Chunks
in Merkle DAGs
● From Data to
Data Structures with IPLD
● Anatomy of the IPFS CID
● Routing & Provider
Records
● DHT-based Routing
● Gossip-based Routing
● Bitswap
● GraphSync
MUTABLE NAMES &
MESSAGE DELIVERY
● Dynamic Data
● IPNS
● PubSub
● CRDTs
IPFS Components
3. Agenda
➔ Motivation
➔ Messaging Layer
◆ IPFS/libp2p PubSub
◆ PubSub Routers (Floodsub & Gossipsub)
➔ Mutable Pointers
◆ Self-Certified Names (IPNS)
◆ IPNS over DHT, PubSub & DNS
➔ Type Systems for Distributed Applications
◆ Conflict Free Replicated Data Types (CRDT)
➔ Access Controls for Distributed Applications
◆ Cryptographic ACL / Capabilities System
➔ Examples of tools and libraries available in the ecosystem
7. ➔ IPFS gives us immutable identifiers (CIDs), however, Rich Web Apps need dynamic
content that changes rapidly
➔ Ability to create, edit and update existing data, from text to images, videos and so on.
➔ Propagating updates to other participants, created a shared view into the latest state
of a dataset
➔ Grant and revoke read and write permissions to data
➔ Support multiple interaction patterns, from one-to-many (e.g. Blog), many-to-many (e.g.
Social Networks) and many-to-one (e.g. RSS/feed subscriptions)
What is Mutable Content about?
12. Benefits
➔ Layer for distributing updates
➔ Message Oriented Comms -> Simple to program
➔ Support for different topologies -> Different types
of interaction patterns supported
➔ Loose Coupling -> Separation of Concerns
➔ Scalable -> Adapt as your network grows
A Messaging Layer supported by
P2P PubSub is key for supporting
different interaction patterns
Challenges
➔ Permission-less network -> can’t control who
join/leaves
➔ Network topology is bottom-up (e.g. brokerless)
➔ Network Churn
➔ Optimizations for Latency/Bandwidth/Delivery
Guarantees are about tradeoffs, not one-size fits all
13. ➔ Topic based interface
➔ Design Goals
◆ Reliability: All messages get delivered to all peers
subscribed to the topic.
◆ Speed: Messages are delivered quickly.
◆ Efficiency: The network is not flooded with excess
copies of messages.
◆ Resilience: Peers can join and leave the network
without disrupting it. There is no central point of
failure.
◆ Scale: Topics can have enormous numbers of
subscribers and handle a large throughput of
messages.
◆ Simplicity: The system is simple to understand and
implement. Each peer only needs to remember a
small amount of state.
IPFS/libp2p PubSub
➔ Parameterizable
➔ Two main implementations
◆ FloodSub
◆ Gossipsub
14. ➔ Design
◆ Simplest possible design
◆ Ambient Peer Discovery (e.g. IPFS Main Network’s DHT, MulticastDNS and
other systems)
◆ Routing is achieved by Flooding
➔ Pros
◆ Robust delivery guarantees, even in the presence of high
network churn
◆ Minimum latency delivery
➔ Cons
◆ Huge bandwidth overhead
(duplicate messages received)
◆ Unbounded degree flooding
(wasterful)
FloodSub
Time
16. ➔ Bleeding edge P2P PubSub
➔ Design
◆ Self Stabilizing algorithm
◆ 2 Networks: Metadata (control plane) & Message (data plane)
◆ Nodes establish local meshes with reciprocal agreements on subscriptions
◆ Degree bounded by default to 6
◆ Nodes send Gossip with
● GRAFT & PRUNE to establish peering agreements
● Gossip about what messages have been delivered and seen
◆ Eager push/Lazy pull model
◆ Scoring function to reward benevolent peers and penalize bad behaving peers
◆ Support for protocol extensions
➔ Pros
◆ Resilient to Churn
◆ Resilient to Sybil, Eclipse & Spam Attacks
◆ Minimizes Bandwidth usage
➔ Cons
◆ For small networks, it doesn’t achieve the same minimum latency guarantees as
Floodsub
Gossipsub with
hardening extensions
17. ➔ Simulation
◆ 100 peers
◆ 5 msg/s
◆ Run for 2s
◆ Time expansion 10x
for visualisation
Gossipsub with
hardening extensions
18. ➔ Evaluation Performed with over 10 000 peers
➔ Demonstrated successfully how Gossipsub is resilient to a variety of attacks
➔ The implementation of Gossipsub and the code used for the evaluation is all Open Source
➔ Paper Publish with evaluation and Architecture
Evaluation Report & Paper
19. ➔ Documents all known challenges & existing solutions
➔ Invitation for collaboration in solving the challenges at new lengths
Open Problem Statement
22. Goals
➔ No central central coordinator to distribute
updates
➔ Ability to certify that the updates originate
from the author
Decentralized updates are a
requirement for several use cases
Challenges
➔ Different expectations of delivery depending on
the application
➔ Scalability (# of nodes) and frequency
(# of updates)
23. ➔ Self-Certified Naming
➔ Design
◆ Verifiability:
◆ Versatility: A name can point to one or more
immutable addresses
◆ Transport independent: Not be tied to a specific
method of distribution.
➔ Names (aka Pointers) are referenced by the
PublicKey of the signer vs. the Content of the
name
IPNS, the
InterPlanetary NameSystem
➔ Several implementations:
◆ IPNS over DHT
◆ IPNS over PubSub
◆ IPNS over DNS
◆ IPNS over ETH Domains
◆ IPNS over Namecoin (one of the first
implementations)
◆ IPNS over QRCodes? NFC? BLE Beacons?
Multiple possibilities
24. ➔ Each name is a Public/Private Key Pair
➔ The Private Key is used to signed statements (aka records)
containing the state of the name
➔ The Public Key is used by clients for
◆ Discovering the new records published
◆ Certify the record update was indeed issued by the author
➔ Every time the owner of the name wants to issue an update, it
simply creates a new record, signs it and broadcasts it over
one of the selected transports (e.g DHT, PubSub, DNS, etc)
➔ Human Readable naming is enabled via a registrar. Examples:
◆ DNS some-domain.com -> TXT Record /ipns/Qm_CID of
Public Key.. -> /ipfs/Qm_CID of content
◆ UnstoppableDomains some-domain.crypto -> /ipns/Qm_CID
of Public Key.. -> /ipfs/Qm_CID of content
IPNS Architecture
25. ➔ Documents all known challenges & existing solutions
➔ Invitation for collaboration in solving the challenges at new lengths
Open Problem Statements
27. ➔ “A distributed system is one in which the failure of a computer you
didn't even know existed can render your own computer unusable.”
Leslie Lamport
➔ Synchronization errors happen
◆ Simultaneous Writers
◆ Propagation Delay
➔ Distributed Consensus is not a trivial problem
➔ There must be a better way! 💡
Mutable Pointers are not enough
for Distributed Applications
28. Type Systems for
Distributed Apps
Motivation
Messaging Layer
Mutable Pointers
Type Systems for
Distributed Apps
Access Controls for
Distributed Apps
Mutable Content
Tools & Libraries
29. Goals
➔ No need for central coordinator and avoid
central points of failure and/or chokepoints
(e.g. case for Operational Transforms)
➔ Support multiple times of collaboration
Convergent state and/or
Consensus over Distributed Data
Challenges
➔ Avoid resource intensive and slow constructions
(e.g. PBFT, Nakamoto Consensus, etc)
➔ Support Web scale
➔ Fast sync, Snapshotting and Garbage Collection
30. ➔ Fairly new and large field of distributed
systems research
➔ Design
◆ Merge function is defined a priority and
distributed to all participants
◆ Independent of the order of the events received,
all participants will know how to merge the
events and converge on the same state
➔ CRDTs are not a one-size fits all data
structure, instead they are many data
structures that fit specific use cases:
◆ G-Set (Grow-only Set)
◆ 2P-Set (Two-Phase Set)
◆ LWW-Element-Set (Last-Write-Wins-Element-Set)
◆ OR-Set (Observed-Remove Set)
◆ Sequence/Ordered Set
◆ Many more..
Conflict-Free Replicated Data
Types (CRDTs)
➔ Pros
◆ Order of reception is not important, as long as the
events arrive
◆ Does not require any live interactivity with all
participants
◆ Can be used to build all sorts of applications, from
collaborative doc editors, chats and much more
➔ Cons
◆ Can generate a lot of non-trivially garbage collectable
objects for long running programs
32. Access Controls for
Distributed Apps
Motivation
Messaging Layer
Mutable Pointers
Type Systems for
Distributed Apps
Access Controls for
Distributed Apps
Mutable Content
Tools & Libraries
33. Goals
➔ Grant and Remove access to data
➔ No need for central coordinator and avoid
central points of failure and/or chokepoints
(e.g. case for Operational Transforms)
➔ Support multiple times of collaboration
Access Controls
Challenges
➔ Avoid resource intensive and slow constructions
(e.g. PBFT, Nakamoto Consensus, etc) to agree
on the ACL
➔ Support Web scale
➔ Fast sync, Snapshotting and Garbage Collection
34. ➔ Private collaborations over data in a Public Network
➔ Design
◆ Symmetric Keys give Read Access
● Only the ones with the key can decrypt and therefore
read
◆ Asymmetric Keys give Write Access
● Similar to how IPNS gives ownership of who is trusted to
updated the Record
➔ Pros
◆ Fine grain control of Read & Write Access
◆ Authorship is verifiable
◆ Does not require a centralized party for coordination
➔ Cons
◆ Require nodes to have a way to synchronize on the latest
state of ACL to ensure that only nodes allowed receive the
latest updates
Cryptographic ACLs
aka Capabilities Systems
37. ➔ Open Source Collaborative Text Editor
➔ IPNS to get the latest release the App
➔ IPFS to load the contents
➔ Public Key Pair & Symmetric Key at the URL
➔ CRDT to create the shared state
➔ PubSub to distribute messages
➔ Features
◆ Real-time collaboration
◆ Conflict-Free
PeerPad as an example of the
vertical integration