1. Symbiotic Coupling of P2P and Cloud Systems:
The Wikipedia Case
Lars Bremer, University of Paderborn, Germany
Kalman Graffi, University of Düsseldorf, Germany
2. Lars Bremer, Kalman Graffi: Symbiotic Coupling of P2P and Cloud Systems: The Wikipedia Case 2
Know these banners?
3. Background on Wikipedia
Wikipedia
– Collaborative Internet Encyclopaedia
Numbers on English Wikipedia
– Alexa rank: 6
– Article Count: 3.8 million
– Edits: 3.4 million per month
– Page Views: 11.3 million per hour
Figures show the popularity of articles
– Top: All articles
– Bottom: Top 250 articles
Problem: Costs through high traffic
[Figure: Wikipedia page view distribution. Page views (log scale, 10^0 to 10^6) plotted against article rank. Top panel: all ranks (rank axis up to 2.5 x 10^6). Bottom panel: top 250 articles.]
4. Motivation and Outline of our Work
Goal: Efficiency increase
– Cloud-like performance
• Maintain high data availability
• Quick article delivery
– Low operational costs
• Users should help in sharing articles
• Donations of network resources
Approach
– Combine peer-to-peer (p2p) and centralized (cloud) architecture
– Cloud is used as backup and main host
• Much less traffic and lower costs
– Users participate in p2p overlay
• Look up articles there first
• Provide downloaded articles to other peers
5. Outline
Motivation / Use Case
Background on Structured P2P Overlays
Symbiotic Coupling of P2P and Cloud Systems
Evaluation
Conclusions
6. Background on Structured P2P Overlays
Nodes and objects share the same ID space
Each object is managed by a (responsible) node
Assignment is based on IDs
Nodes maintain a topology / routing structure to support:
Lookup: getResponsibleNode(ID)
After that: e.g. data transfer
[Figure: ring of nodes with IDs 611, 709, 1008, 1622, 2011, 2207, 2906, 3485. A node looks up H("my data") = 3107 to find the responsible node (Lookup), then exchanges the data with it (Data transfer).]
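The lookup step can be illustrated with a minimal Python sketch. It assumes the Chord convention that the responsible node is the first node whose ID is equal to or greater than the object ID, wrapping around the ring. The ID space size, the hash function H, and the function name get_responsible_node are assumptions made for illustration. A real hash would not map "my data" to 3107, so the example passes that ID directly.

```python
import bisect
import hashlib

ID_SPACE = 4096  # assumed size of the shared ID space (illustrative only)


def H(name: str) -> int:
    """Map a name into the shared ID space (assumed hash: SHA-1 modulo ID_SPACE)."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % ID_SPACE


def get_responsible_node(node_ids, object_id):
    """Return the first node ID >= object_id, wrapping around the ring."""
    ring = sorted(node_ids)
    idx = bisect.bisect_left(ring, object_id)
    return ring[idx % len(ring)]


# Illustrative node IDs and the object ID 3107 from the slide.
nodes = [611, 709, 1008, 1622, 2011, 2207, 2906, 3485]
responsible = get_responsible_node(nodes, 3107)
print(responsible)
```

In a deployed overlay no node knows the full ring; the routing structure forwards the request hop by hop until it reaches the responsible node. The sorted list here only stands in for that routing.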
7. Cloud Computing vs. P2P Technology
Cloud and P2P
– Access to a distributed pool of
resources:
• Storage, bandwidth, computational
power
Cloud computing
– Resource providers: companies
– Controlled environment
• No (/minimal) churn
• Homogeneous devices
– Selective centralized structures
– Mainly paid by usage
P2P systems
– Resource providers: user devices
– Uncontrolled environment
• Churn
• Heterogeneous devices
• Uncertainty / unpredictability
• Distributed access points
8. Symbiotic Coupling of P2P and Cloud Systems
Goal: High performance at low costs
– Performance: data availability, low delays
– Costs: traffic at cloud operator (linked to monetary expenses)
Our approach
– Main service (here Wikipedia) remains as main data pool
– Nodes install a p2p overlay add-on
• Allows sharing the content of specific services
– Nodes visiting Wikipedia
• Join the p2p overlay and remain online for a while
– Articles are served and provided in p2p overlay
– If not available / initially: download from cloud
9. Model Overview
[Figure: model overview. A reference table maps articles to provider nodes (A1: N1, N2, N4; A2: N2). The nodes N1 to N4 keep document tables with the articles they hold.]
• Key Idea: References
• All downloaded Articles are published
• Cloud used as Fallback
Overview on the Architecture
Document space
– Article ID is hashed article name
– Responsible node maintains a list of article providers
– Article providers
• Have downloaded the article once
• Are registered at the responsible node
We use Chord
– Any other DHT would also work
– It only needs to support key-based routing
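The interplay of reference table, document tables, and cloud fallback can be sketched as follows. This is a simplified illustration, not the authors' implementation: the class Overlay, the functions fetch_from_cloud and get_article, and the single global reference table are assumptions. In the described system the reference table is distributed over the responsible nodes of the DHT and keyed by the hashed article name.

```python
from collections import defaultdict


class Overlay:
    """Toy stand-in for the p2p overlay with one global reference table."""

    def __init__(self):
        self.references = defaultdict(list)  # article id -> provider node names
        self.documents = defaultdict(dict)   # node name -> {article id: content}

    def register(self, article_id, node, content):
        """Store the article at the node and publish a reference to it."""
        self.documents[node][article_id] = content
        if node not in self.references[article_id]:
            self.references[article_id].append(node)


def fetch_from_cloud(article_id):
    """Hypothetical placeholder for a download from the cloud service."""
    return f"content of {article_id}"


def get_article(overlay, node, article_id):
    """Look up the article in the overlay first, fall back to the cloud."""
    if article_id in overlay.documents[node]:
        return overlay.documents[node][article_id], "local"
    providers = [p for p in overlay.references[article_id] if p != node]
    if providers:
        content, source = overlay.documents[providers[0]][article_id], "peer"
    else:
        content, source = fetch_from_cloud(article_id), "cloud"
    overlay.register(article_id, node, content)  # every download is published
    return content, source


overlay = Overlay()
first = get_article(overlay, "N1", "A1")   # nobody holds A1 yet
second = get_article(overlay, "N2", "A1")  # N1 now provides A1
print(first[1], second[1])
```

The first request for an article goes to the cloud; later requests are served by peers that registered as providers.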
10. Operation: Initial Lookup for an Article
11. Operation: Further Lookup for an Article
12. Other Operations
Update
– Editing done on the Cloud
– Active vs. passive updates
• Active: Cloud actively informs the nodes holding references
• Passive: Responsible peers periodically check for updates
– Frequency based on object popularity
– Old references are discarded
• They point to outdated content
– A new reference table is built up
Leave
– Leaving node informs all nodes holding references to it
– A departure can also be detected without notification, but this introduces delay
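A minimal sketch of how a responsible node might handle updates and leaves is given below. The integer version numbers, the class ReferenceTable, and its method names are assumptions made for illustration; the slides do not specify how outdated references are recognized.

```python
class ReferenceTable:
    """Reference table of one responsible node (illustrative sketch)."""

    def __init__(self):
        self.version = {}    # article id -> version the references point to
        self.providers = {}  # article id -> set of provider node names

    def on_update(self, article_id, new_version):
        """Article was edited on the cloud: discard the old references."""
        if new_version > self.version.get(article_id, -1):
            self.version[article_id] = new_version
            self.providers[article_id] = set()

    def add_reference(self, article_id, node, version):
        """Register a provider; references to outdated versions are ignored."""
        self.on_update(article_id, version)  # a newer version resets the table
        if version == self.version[article_id]:
            self.providers[article_id].add(node)

    def on_leave(self, node):
        """A leaving node informs us: remove all references to it."""
        for nodes in self.providers.values():
            nodes.discard(node)


table = ReferenceTable()
table.add_reference("A1", "N1", 1)
table.add_reference("A1", "N2", 1)
table.on_leave("N1")        # N1 leaves the overlay
table.on_update("A1", 2)    # A1 was edited; old references are dropped
table.add_reference("A1", "N3", 2)
print(table.providers["A1"])
```

The same on_update call could serve both modes: with active updates the cloud triggers it, with passive updates the responsible peer triggers it after a periodic check.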
13. Evaluation
Main questions
– What is the efficiency gain?
– How much traffic is saved?
Approach
– Evaluation through simulation
Layer setup
– User model: downscaled Wikipedia workload
– Application: document storage
– Overlay: Chord
– Network model:
• Global Network Positioning delay model
• OECD bandwidth model
14. PeerfactSim.KOM (see www.peerfact.org)
Type
– Event-based simulator
– Written in Java
– Simulations up to 100K peers possible
– Focus on simulation of p2p systems on various layers
• User
• Application
• Services: monitoring, replication …
• Overlays
• Network models
Invitation to join the community
– Several universities use and extend the simulator actively
– Used and heavily extended in the project
15. Layered View
Layered Architecture
– Easy exchange of components
– Testing of new applications
– Testing of new mechanisms
Main idea
– Layers have several implementations
– Enables testing of individual layer mechanisms
• on their own
• in combination with other layers
See www.peerfact.org
[Figure: layered architecture of the simulator with the components User, Application, Service, Overlay, Transport, Network, and SimulationEngine.]
16. Simulation Setup / Workload Model
17. Simulation Results
[Figure (a): Reference Lookup and Total Download Time, all queries. Y-axis: Reference Lookup Time (Sec), 0.2 to 2.4; x-axis: 0 to 1.0. Curves: MST 30, 60, 90, and 120 min, each for Reference and for Article.]
18. Simulation Results
[Figure (b): Traffic Load in the Cloud. Y-axis: Articles/min, 0 to 80.]
19. Simulation Results
[Figure (c): Traffic Load Savings for Session Time 120 min. Y-axis: Articles/min, 0 to 80; x-axis: Simulation Time (min), 0 to 180. Series: Downloaded from Peer, Downloaded from Cloud.]
20. Conclusions and Future Work
Symbiotic p2p/cloud approach lowers operational costs
– Users take load and share content
– Traffic load on server was reduced to 27.6% in this experiment
– Websites with many users can benefit from p2p support
– WebP2P – browser-based p2p via peerjs, nodejs, etc. is coming
– User devices are powerful, so load can be handled "for free"
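How a figure such as "reduced to 27.6%" relates to the split between peer and cloud downloads can be shown with a small calculation. The sketch assumes the share is measured in downloaded articles; the function name cloud_share and the counts 276 and 724 are hypothetical values chosen only to reproduce the percentage, not data from the experiment.

```python
def cloud_share(from_cloud: int, from_peer: int) -> float:
    """Fraction of all downloads that still had to be served by the cloud."""
    total = from_cloud + from_peer
    if total == 0:
        raise ValueError("no downloads recorded")
    return from_cloud / total


# Hypothetical counts: 276 downloads from the cloud, 724 from peers.
share = cloud_share(276, 724)
print(f"cloud share: {share:.1%}, saved: {1 - share:.1%}")
```

Under this reading, a cloud share of 27.6% means that 72.4% of the downloads were served by peers.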
Future Work
– Investigate WebP2P approach
• Browser plugin to create p2p overlay
– Create p2p framework for social networks
• Use capacity of user devices to host a social network
• See http://www.p2pframework.com
– Further extend PeerfactSim.KOM – the p2p system simulator
• See http://www.peerfact.org
21. Thank You for Your Attention
Jun.-Prof. Dr.-Ing. Kalman Graffi
Technology of Social Networks Group
Institute of Computer Science
Heinrich-Heine-Universität Düsseldorf
eMail: graffi@cs.uni-duesseldorf.de
Web: www.p2pframework.com