Follow along with the video: http://www.youtube.com/watch?v=2SQ0O_oPpe4
How do data infrastructure, insights and products change when your user base grows by orders of magnitude? When should you move your user-facing data product off your laptop? (hint: now!) Does your data offer insights about the world at large, or is it just mirroring your early adopters? In this talk, I will share some of the data scaling lessons we've learned at LinkedIn, recount war stories (and close calls!) and document the evolution of the data scientist.
5. Possible: high risk, rapid innovation; chasing the long tail (by hand). Not possible: long-tail recommendations; network effects; insights into the world at large.
14. Possible: insights into the world at large; network effects; infrastructure innovation. Not possible: long-tail recommendations; segmented insights and products.
17. Data infrastructure team! ~1900 machines; Kafka real-time data streams; reporting: there's a (mobile) app for that! … and servers, and dedicated teams. Infrastructure – evolved.
25. Possible: sliced-and-diced insights and products; network effects; economies of scale; fast A/B tests. Not possible: casual, hour-long outages; testing in production on 100% of users.
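
Slide 25 pairs "fast A/B tests" with "no testing in production on 100% of users". One common way to reconcile the two is to expose a change to a small, stable slice of users chosen by hashing. The sketch below is only an illustration of that general idea: the talk does not say how LinkedIn assigns users to experiments, and the function names `bucket` and `in_treatment`, the experiment name `new-ranking`, and the string user IDs are all assumptions.

```python
import hashlib


def bucket(user_id: str, experiment: str, buckets: int = 100) -> int:
    """Map a user to a stable bucket in [0, buckets) for one experiment.

    Hashing the experiment name together with the user ID means the same
    user can land in different buckets for different experiments.
    """
    key = f"{experiment}:{user_id}".encode("utf-8")
    digest = hashlib.sha256(key).hexdigest()
    return int(digest, 16) % buckets


def in_treatment(user_id: str, experiment: str, percent: int) -> bool:
    """Return True if the user falls inside the first `percent` buckets."""
    return bucket(user_id, experiment) < percent


if __name__ == "__main__":
    # Hypothetical user IDs; expose roughly 5% of them to the change.
    users = [f"user-{i}" for i in range(10_000)]
    exposed = sum(in_treatment(u, "new-ranking", 5) for u in users)
    print(f"{exposed} of {len(users)} users exposed")
```

Because the assignment depends only on the user ID and the experiment name, a user sees the same variant on every request without any stored state, and raising `percent` only adds users to the treatment group rather than reshuffling it.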