3. Wikimedia
Overview
• Wikimedia Foundation
– American non-profit organization, founded by Jimmy Wales
– Several online collaborative wiki projects
⇒ Wikipedia, etc.
– Develop and maintain open content (CC BY-SA)
4. Wikimedia
Overview
• History
– 2001, Wikipedia
⇒ Perl, text files, UseModWiki
– 2002-2003, Wikipedia upgrade
⇒ PHP, MySQL, MediaWiki
– 2003, Wikimedia Foundation
– After that, various projects
– 2011, 1 billion people, 50 million articles
6. Wikimedia
Features
• Operating the world's fifth-largest web property
• Giving Wikimedia's volunteers the best possible tools to do their work
• Developing recruiting resources for new volunteers
• Staging outreach and community events worldwide
7. Wikimedia
Features
• Partnering with cultural institutions
• Working with the educational sector
• Providing access to Wikipedia everywhere
• Informing our decision-making with facts and data
10. Wikimedia
Technical Figures
• 25,000 ~ 60,000 HTTP requests per sec
• 3.5 Gbit per sec of data traffic
• 3 Data centers : Tampa, Amsterdam, Seoul
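To put the request figures in perspective, the sketch below estimates how many requests per second would still reach the backend after caching. It assumes, purely for illustration, that the Squid hit rates reported later in this deck (85% for text, 98% for media) apply uniformly to the whole request stream, which the slides do not state. The function name `backend_requests_per_sec` is hypothetical.

```python
def backend_requests_per_sec(total_per_sec, hit_rate_percent):
    """Requests per second that miss the cache and reach the backend.

    Integer percentages keep the arithmetic exact.
    """
    if not 0 <= hit_rate_percent <= 100:
        raise ValueError("hit_rate_percent must be between 0 and 100")
    return total_per_sec * (100 - hit_rate_percent) // 100


# Request range from the slide; hit rates are the deck's Squid figures.
# Applying one hit rate to all traffic is an assumption of this sketch.
for total in (25_000, 60_000):
    for label, hit in (("text", 85), ("media", 98)):
        misses = backend_requests_per_sec(total, hit)
        print(f"{total} req/s at {hit}% {label} hit rate -> {misses} backend req/s")
```

Under these assumptions, even the peak of 60,000 requests per second would leave at most about 9,000 requests per second for the servers behind the caches.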
13. Wikimedia
CDN (Content Distribution Network)
• 3 clusters on 3 different continents
– Primary cluster in Tampa, Florida
– Secondary caching-only clusters in Amsterdam, the Netherlands and Seoul, South Korea
• Geographic load balancing (GLB)
– Hand out DNS answers based on the estimated location of the querying DNS resolver
• Squid caching
– Split into two groups : Text and Media
– 75 Squid servers
⇒ up to 40 GB disk, 8 GB memory
– Hit rates : 85% for Text, 98% for Media
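The geographic load balancing idea above can be sketched as a lookup from the querying resolver's address to the nearest cluster. This is a minimal illustration, not Wikimedia's actual implementation: the table `REGION_TABLE`, the function `cluster_for_resolver`, and the address prefixes (reserved documentation ranges) are all assumptions made for the example.

```python
import ipaddress

# Hypothetical resolver-prefix -> cluster table. The prefixes are reserved
# documentation ranges (RFC 5737), not real Wikimedia data.
REGION_TABLE = [
    (ipaddress.ip_network("192.0.2.0/24"), "tampa"),
    (ipaddress.ip_network("198.51.100.0/24"), "amsterdam"),
    (ipaddress.ip_network("203.0.113.0/24"), "seoul"),
]

# Resolvers with no known location fall back to the primary cluster.
PRIMARY_CLUSTER = "tampa"


def cluster_for_resolver(resolver_ip):
    """Pick the cluster whose address the DNS server should hand out."""
    addr = ipaddress.ip_address(resolver_ip)
    for network, cluster in REGION_TABLE:
        if addr in network:
            return cluster
    return PRIMARY_CLUSTER


print(cluster_for_resolver("198.51.100.7"))  # resolver estimated to be in Europe
print(cluster_for_resolver("10.1.2.3"))      # unknown location, primary cluster
```

The decision is based on the resolver's address rather than the end user's, so the estimate is only as good as the assumption that users sit near their DNS resolver.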
14. Wikimedia
MediaWiki
• MediaWiki
– Free web-based wiki software platform
– All Wikimedia projects run on a MediaWiki platform
– Open source software (GPL v2)
– Release Ver. 1.18.2
• Characteristics in wiki project
– Scales well with multiple CPUs
⇒ Quad-core servers
– One centrally managed software installation
⇒ Hardware shared with external storage
– Simple implementation with LAMP
– Memcached tasks
– Additional extensions
16. Wikimedia
MediaWiki
• Persistent Data
– Metadata in core databases
– Actual text in external storage
⇒ Text of all revisions, compressed
– Uploaded files in image servers
• Database
– Separate database per wiki
– One master database, many replicated slaves
• Core Database Scaling
– Separating read and write operations
⇒ Read on slaves, write on master
– Separating expensive and cheap operations
– Separating big, popular and small wikis
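The per-wiki databases and the read/write separation described above can be sketched as a small router. This is an illustrative sketch only: the class `WikiDatabase`, the function `route`, the host names, and the rule of classifying a query by its first SQL keyword are all assumptions, not MediaWiki's actual load-balancing code. The code says "replica" where the slide says "slave".

```python
import itertools

# Statements treated as reads in this sketch; everything else is a write.
READ_VERBS = {"SELECT", "SHOW"}


class WikiDatabase:
    """One wiki's database: a single master plus replicated read servers."""

    def __init__(self, master, replicas):
        self.master = master
        # Rotate through replicas round-robin; None means "no replicas".
        self._replica_cycle = itertools.cycle(replicas) if replicas else None

    def server_for(self, sql):
        """Return the host that should execute this statement."""
        words = sql.split()
        verb = words[0].upper() if words else ""
        if verb in READ_VERBS and self._replica_cycle is not None:
            return next(self._replica_cycle)
        return self.master


# Separate database per wiki (all host names are hypothetical).
WIKIS = {
    "enwiki": WikiDatabase("en-master", ["en-replica1", "en-replica2"]),
    "smallwiki": WikiDatabase("shared-master", ["shared-replica1"]),
}


def route(wiki, sql):
    """Look up the wiki's database, then split reads from writes."""
    return WIKIS[wiki].server_for(sql)
```

For example, `route("enwiki", "UPDATE page SET page_touched = 1")` returns the master, while successive `SELECT` statements alternate between the two replicas. A real deployment would also have to account for replication lag, which this sketch ignores.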
21. Wikimedia
References
• Sunil H. A. North, Deborah M. (2010). Investigating Pedagogical Value of Wiki Technology
• Mark Bergsma. (2007). Wikimedia Architecture
• http://www.mediawiki.org/wiki/MediaWiki
• http://en.wikipedia.org/wiki/Wikimedia