In this session I'll share the design, architecture and implementation of some of the most common elements of any social platform - Open API, profiles, searches, lists and activity streams. These "pillars" of a social platform bear most of the weight behind a jazzy UI, and scaling them has its own challenges. I will also talk about how we built the Social Platform at IGN from the ground up, including not-so-unique challenges like integration with legacy systems.
8. People are tired of creating accounts on every site. Need to support the existing login method if the platform caters to an existing audience. Existing auth may not work well with Open API initiatives. Open API and OAuth: 2-legged (service to service); 3-legged (user to app to service).
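To make the 2-legged vs. 3-legged distinction concrete, here is a minimal sketch of request signing. It assumes OAuth 1.0-style HMAC-SHA1 signatures (the slide does not state the OAuth version), and the function names, URL and secrets are hypothetical. In the 2-legged case there is no user token, so the token secret is empty; a 3-legged call would pass the user's token secret.

```python
import base64
import hashlib
import hmac
from urllib.parse import quote


def percent_encode(value: str) -> str:
    # Escape everything except unreserved characters (letters, digits, - . _ ~).
    return quote(value, safe="-._~")


def signature_base_string(method: str, url: str, params: dict) -> str:
    # Encode and sort the parameters, join them as k=v pairs,
    # then encode the joined string once more.
    pairs = sorted((percent_encode(k), percent_encode(v)) for k, v in params.items())
    normalized = "&".join(f"{k}={v}" for k, v in pairs)
    return "&".join([method.upper(), percent_encode(url), percent_encode(normalized)])


def sign(method, url, params, consumer_secret, token_secret=""):
    # 2-legged (service to service): token_secret stays empty.
    # 3-legged (user to app to service): pass the user's token secret.
    key = f"{percent_encode(consumer_secret)}&{percent_encode(token_secret)}"
    base = signature_base_string(method, url, params)
    digest = hmac.new(key.encode(), base.encode(), hashlib.sha1).digest()
    return base64.b64encode(digest).decode()
```

The only difference between the two flows in this sketch is whether a user-granted token secret takes part in the signing key.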
9. Identify the bottlenecks. Measure everything. Use CDNs for all static content. Front-end optimization via async loading. Database optimization via indexes and sharding. Caching. Scaling the sorts. Scaling up vs. scaling out. CAP theorem. Relational vs. NoSQL storage. Read vs. write heaviness.
10. Query vs. propagation: queries are read heavy; propagation is write heavy; deletion is a pain with propagation. Activity aggregation: aggregation on actor vs. object. Normalized vs. denormalized storage. Comments. Decorating the activities on each request.
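The query vs. propagation trade-off can be sketched with an in-memory model. This is an illustration only, not the IGN implementation; the class and method names are hypothetical, and activity IDs are assumed to increase over time so that sorting by ID gives chronological order.

```python
from collections import defaultdict


class ActivityStreams:
    def __init__(self):
        self.followers = defaultdict(set)   # actor -> users following the actor
        self.following = defaultdict(set)   # user -> actors the user follows
        self.by_actor = defaultdict(list)   # query model: one copy per activity
        self.inbox = defaultdict(list)      # propagation model: one copy per follower

    def follow(self, user, actor):
        self.followers[actor].add(user)
        self.following[user].add(actor)

    def publish(self, actor, activity_id):
        self.by_actor[actor].append(activity_id)   # query model: a single write
        for user in self.followers[actor]:         # propagation: one write per follower
            self.inbox[user].append(activity_id)

    def feed_by_query(self, user):
        # Read heavy: gather from every followed actor on each request.
        items = [a for actor in self.following[user] for a in self.by_actor[actor]]
        return sorted(items)

    def feed_by_propagation(self, user):
        # Cheap read: the feed was materialized at write time.
        return list(self.inbox[user])

    def delete(self, actor, activity_id):
        self.by_actor[actor].remove(activity_id)
        # The painful part: every propagated copy has to be found and removed.
        for user in self.followers[actor]:
            if activity_id in self.inbox[user]:
                self.inbox[user].remove(activity_id)
```

With the query model a delete touches one list; with propagation it touches as many inboxes as the actor has followers.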
11. Integration with legacy touchpoints. Opening up the API: more channels like mobile; more independent applications; rate limiting and access control. Don’t forget existing data: data outlives code.
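Rate limiting for an open API is often done per API key with a token bucket. The sketch below is one possible approach, not the mechanism used at IGN; the class name, the parameters and the injectable clock are assumptions made for illustration.

```python
import time


class TokenBucket:
    def __init__(self, capacity, refill_per_second, clock=time.monotonic):
        self.capacity = capacity                    # maximum burst size
        self.refill_per_second = refill_per_second  # sustained request rate
        self.clock = clock                          # injectable so tests can control time
        self.tokens = float(capacity)
        self.last = clock()

    def allow(self):
        # Top up tokens for the time elapsed since the last call, capped at capacity.
        now = self.clock()
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_second)
        self.last = now
        # Spend one token per request; reject when the bucket is empty.
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

A service would typically keep one bucket per API key and return an error response when `allow()` is false.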
12. Flexibility in the code to adapt to changing requirements quickly and seamlessly: good design, DRY, SoC (separation of concerns). Flexibility in the infrastructure to adapt to changing traffic and behavior: virtualization, heavy replication. Flexibility in the team to respond to changes: process.
13. Automated testing wherever possible. Developer focus on test coverage (80+%). Continuous integration and deployment: Cucumber + Hudson. Cross-browser testing (yes, including IE).
15. Java services: Tomcat with Shindig 1.1, 4 nodes, REST/JSON. Ruby: Rails admin app for moderation and points/levels; migration scripts; Twitter bot for routing #myign tweets to the platform; misc. scripts to invalidate memcached keys and test service endpoints.
16. Memcached. Extremely trivial to set up and maintain; almost never dies; massive scale-out. Careful with: cache hotspots, concurrent writes, on-the-fly scale-out, key/value size limits.
17. MySQL. Proven, cheap to develop and operate; Maslow’s hammer; easy scale-out. Hard to store (and retrieve) network graphs; write scaling with a single master; not the best choice for activity streams; schema changes lock the table(s).
18. MongoDB. Awesome write scaling: great for the activity propagation model. In-place updates using $push and $set. Excellent for storing social relationships as documents. Very easy to cluster: we are running replica pairs and plan to move to replica sets. Schema-less: no need to run alter scripts on an 18M-row table.
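The $push and $set operators on this slide are MongoDB-style update operators that modify a stored document in place. The sketch below shows the shape of such an update document and emulates its effect on a plain Python dict; `apply_update` is a hypothetical stand-in, not a driver call, and the field names are invented for illustration. Against a real server the same update document would be passed to the driver's update method.

```python
def apply_update(doc, update):
    """In-memory stand-in for how $set and $push change a document."""
    # $set overwrites (or creates) the named fields.
    for field, value in update.get("$set", {}).items():
        doc[field] = value
    # $push appends a value to an array field, creating the array if needed.
    for field, value in update.get("$push", {}).items():
        doc.setdefault(field, []).append(value)
    return doc


# A follower's stream document (hypothetical schema).
stream = {"_id": "user:42", "activities": []}

# Propagating one activity is a single in-place update, with no read-modify-write.
update = {
    "$push": {"activities": {"actor": "user:7", "verb": "post", "object": "blog:99"}},
    "$set": {"last_actor": "user:7"},
}
apply_update(stream, update)
```

Because the new field `last_actor` simply appears on the document, no schema migration is involved, which is the schema-less point made on the slide.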
19. Queryable: rich query language ($in, $size, $exists, $slice); MapReduce for heavy data crunching. Supports indexing: you can even index collections inside a document. Storage: ~4x storage compared to relational data. Emerging technology: index defragmentation; $or and indexing (to be supported in 1.7); load balancing support in the driver (coming soon).
20. RabbitMQ for messaging. Ease of clustering. Written in Erlang for high performance and availability. Used for: propagation of activities, sending out email alerts, indexing data in Solr.
21. Person: GET @self, @friends, @followers, @all; PUT/POST @self, @friends. Activities: GET @global, @self, @friends; POST @self. MediaItems: GET @self, @all; POST @self. AppData (for applications to store/retrieve data as key-value pairs): GET/POST @self. Status: GET @friends, @self, @followers; POST @self.
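The endpoint matrix on this slide can be written down as a lookup table, which is handy for request validation or for generating API docs. This is a sketch only: the table mirrors the slide, while `ALLOWED` and `is_supported` are hypothetical names and the actual URL layout of the service is not given in the source.

```python
# Resource -> HTTP method -> selectors listed on the slide.
ALLOWED = {
    "Person": {
        "GET": {"@self", "@friends", "@followers", "@all"},
        "PUT": {"@self", "@friends"},
        "POST": {"@self", "@friends"},
    },
    "Activities": {"GET": {"@global", "@self", "@friends"}, "POST": {"@self"}},
    "MediaItems": {"GET": {"@self", "@all"}, "POST": {"@self"}},
    "AppData": {"GET": {"@self"}, "POST": {"@self"}},
    "Status": {"GET": {"@friends", "@self", "@followers"}, "POST": {"@self"}},
}


def is_supported(resource, method, selector):
    # Unknown resources or methods fall through to an empty set.
    return selector in ALLOWED.get(resource, {}).get(method.upper(), set())
```

For example, reading a user's followers is listed for Person, but posting to @followers is not.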
22. Must-have for any Java/Ruby webapp. Monitoring and troubleshooting. Save a ton of $ and time with efficient root cause analysis tools. Agents for Ruby and Java; IGN engineers helped write PHP and Memcached agents.
23. Social applications and community. Check the pulse of the community: UserVoice (http://ign.uservoice.com). Less is more: distinguish yourself and focus on your niche. Be agile - release early, release often. Do not shock your audience: announce the changes/features on a blog. Eat your own dog food: http://people.ign.com/ign-labs
24. Released July 2010 as beta. Daily API requests: ~25M. Daily page views: ~30K. Daily uniques: ~12K. 6 ms response times. Expected traffic: 8-10x with more integration and the mobile platform.
25. Manish Pandit Engineering Manager, Social Platform at IGN Email: pandit.manish-at-gmail.com Twitter: @lobster1234 LinkedIn: http://www.linkedin.com/in/mpandit Blog: http://contrarianwisdom.blogspot.com MyIGN: http://people.ign.com/mpanditign