- 1. Recommendations @ LinkedIn<br />1<br />
- 2. Think Platform<br />Leverage Hadoop<br />2<br />
- 3. The world’s largest professional network<br />Over 50% of members are now international<br />135M+*<br />75% of Fortune 100 Companies use LinkedIn to hire**<br />>2M Company Pages**<br />~2/sec New Members joining<br />*as of Nov 4, 2011 **as of June 30, 2011<br />3<br />
- 4. 4<br />Recommendations Opportunity<br />
- 11. The Recommendations Opportunity<br />Pandora Search for People<br />Groups browse maps<br />Events You May Be Interested In<br />11<br />
- 12. 50%<br />12<br />
- 13. 13<br />Positions<br />Education<br />Summary<br />Experience<br />Skills<br />
- 14. Are all titles the same?<br /><ul><li>Software Engineer</li><li>Technical Yahoo</li><li>Member Technical Staff</li><li>Software Development Engineer</li><li>SDE</li></ul>
- 18. Are all companies the same?<br />‘IBM’ has 8000+ variations<br /><ul><li>ibm – ireland</li><li>ibm research</li><li>T J Watson Labs</li><li>International Bus. Machines</li></ul>
- 21. Recommendation Trade-offs<br />The need for a common platform<br />Real Time<br />Time Independent<br />16<br />
- 22. Recommendation Trade-offs<br />The need for a common platform<br />Content Analysis<br />Collaborative<br />17<br />
- 23. Recommendation Trade-offs<br />The need for a common platform<br />Precision<br />Recall<br />18<br />
- 24. Matching<br />Features compared on both sides: Seniority, Skills, Title, Specialty, Education, Experience, Location, Industry, Related Titles, Related Companies, Related Industries<br />Feature pairs: Title -> Title, Specialty -> Specialty, Skills -> Skills, Seniority -> Seniority, Summary -> Summary, Title -> Related Title, Education -> Education<br />Binary match: exact match, exact match in bucket<br />Soft match: v1 = tf * idf, cos θ = (v1 · v2) / (|v1| * |v2|)<br />Similarity scores shown in the diagram: 0.58, 0.94, 0.26, 0.18, 0.98, ..., 0.16, 0.40<br />
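
The soft match on this slide (tf * idf vectors compared by cosine) can be sketched roughly as follows. This is a minimal illustration, not LinkedIn's implementation: the function names, the smoothed idf formula, and the toy skill lists are all assumptions made for the example.

```python
import math
from collections import Counter


def tfidf_vector(tokens, doc_freq, num_docs):
    """Weight each term by its count times a smoothed inverse document frequency.

    The smoothing log((1 + N) / (1 + df)) is an assumption; the slide only says tf * idf.
    """
    counts = Counter(tokens)
    return {
        term: count * math.log((1 + num_docs) / (1 + doc_freq.get(term, 0)))
        for term, count in counts.items()
    }


def cosine(v1, v2):
    """cos(theta) = (v1 . v2) / (|v1| * |v2|) for sparse vectors stored as dicts."""
    dot = sum(weight * v2.get(term, 0.0) for term, weight in v1.items())
    norm1 = math.sqrt(sum(w * w for w in v1.values()))
    norm2 = math.sqrt(sum(w * w for w in v2.values()))
    if norm1 == 0.0 or norm2 == 0.0:
        return 0.0  # an empty vector matches nothing
    return dot / (norm1 * norm2)


# Hypothetical skill lists standing in for three profiles.
corpus = [
    ["hadoop", "java", "pig"],
    ["java", "spring"],
    ["hadoop", "pig", "machine", "learning"],
]
doc_freq = Counter(term for doc in corpus for term in set(doc))
vectors = [tfidf_vector(doc, doc_freq, len(corpus)) for doc in corpus]

sim_0_1 = cosine(vectors[0], vectors[1])  # shares only "java"
sim_0_2 = cosine(vectors[0], vectors[2])  # shares "hadoop" and "pig"
```

In this toy corpus the first profile scores higher against the third than against the second, because two shared terms outweigh one.
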
- 25. Similarity score vector (Skills -> Skills: 0.94)<br />Importance weight vector (Skills -> Skills: 0.70)<br />Normalization, Scoring & Ranking<br />Filtering: Location, Company, Industry<br />Feedback<br />
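
One way to read this slide is that each candidate's similarity score vector is combined with the importance weight vector, candidates are filtered, and the rest are ranked. The sketch below assumes a normalized weighted sum and a location filter; only the Skills -> Skills values (similarity 0.94, importance 0.70) come from the slide, and every other name and number is hypothetical.

```python
def score(similarity, importance):
    """Weighted average of per-feature similarities, with weights normalized to sum to 1."""
    total = sum(importance.values())
    if total == 0:
        return 0.0
    return sum(w * similarity.get(f, 0.0) for f, w in importance.items()) / total


def rank(candidates, importance, allowed_locations):
    """Drop candidates outside the allowed locations, then sort by descending score."""
    kept = [c for c in candidates if c["location"] in allowed_locations]
    return sorted(kept, key=lambda c: score(c["similarity"], importance), reverse=True)


# Only the skills weight (0.70) is from the slide; the other weights are illustrative.
importance = {"skills": 0.70, "title": 0.20, "seniority": 0.10}

# Hypothetical candidates; "a" uses the slide's Skills -> Skills similarity of 0.94.
candidates = [
    {"id": "a", "location": "us", "similarity": {"skills": 0.94, "title": 0.58, "seniority": 0.40}},
    {"id": "b", "location": "us", "similarity": {"skills": 0.26, "title": 0.98, "seniority": 0.16}},
    {"id": "c", "location": "eu", "similarity": {"skills": 0.98, "title": 0.98, "seniority": 0.98}},
]

ranked = rank(candidates, importance, allowed_locations={"us"})
ranked_ids = [c["id"] for c in ranked]
```

Candidate "c" has the highest similarities but is removed by the filter before scoring, which is why filtering and ranking are separate steps in the sketch.
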
- 26. Technologies<br />
- 27. 22<br />Hadoop Case Studies<br /><ul><li>Scaling</li><li>Blending Recommendation Algorithms</li><li>Grandfathering</li><li>Model Selection</li><li>A/B Testing</li><li>Tracking and Reporting</li></ul>
- 32. 23<br />Scaling<br />Billions of Recommendations<br />Latency > 1 sec<br />Minhashing<br />Latency < 1 sec<br />Recall = Low<br />Latency < 1 sec<br />Recall = High<br />
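
The scaling slide names minhashing as one option. As a rough illustration of the general technique (not of LinkedIn's pipeline), the sketch below builds MinHash signatures and estimates Jaccard similarity from them. Seeding MD5 to simulate independent hash functions, the signature length of 256, and the toy item sets are all assumptions.

```python
import hashlib


def _hash(seed, item):
    """Deterministic 64-bit hash of an item under a given seed (seeded MD5 is an assumption)."""
    digest = hashlib.md5(f"{seed}:{item}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def minhash_signature(items, num_hashes=256):
    """For each seeded hash function, keep the minimum hash value over the set."""
    unique = set(items)
    if not unique:
        raise ValueError("cannot build a signature for an empty set")
    return [min(_hash(seed, item) for item in unique) for seed in range(num_hashes)]


def estimate_jaccard(sig_a, sig_b):
    """The fraction of matching signature positions approximates the Jaccard similarity."""
    if len(sig_a) != len(sig_b):
        raise ValueError("signatures must have the same length")
    matches = sum(1 for a, b in zip(sig_a, sig_b) if a == b)
    return matches / len(sig_a)


# Hypothetical item sets with true Jaccard similarity 50 / 150 = 1/3.
set_a = [str(i) for i in range(100)]
set_b = [str(i) for i in range(50, 150)]
estimate = estimate_jaccard(minhash_signature(set_a), minhash_signature(set_b))
```

Comparing fixed-length signatures instead of full sets is what keeps latency low; the estimate is approximate, and similar pairs can be missed when signatures are later bucketed, which is consistent with the recall trade-off the slide lists.
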
- 33. 24<br />Hadoop Case Studies<br /><ul><li>Scaling ✔</li><li>Blending Recommendation Algorithms</li><li>Grandfathering</li><li>Model Selection</li><li>A/B Testing</li><li>Tracking and Reporting</li></ul>
- 38. 25<br />Blending Recommendation Algorithms<br />Co-View<br />Impact Latency ~ Minutes<br />Complexity = High<br />Co-View<br />Impact Latency ~ Hours<br />Complexity = Low<br />
- 39. 26<br />Hadoop Case Studies<br /><ul><li>Scaling ✔</li><li>Blending Recommendation Algorithms ✔</li><li>Grandfathering</li><li>Model Selection</li><li>A/B Testing</li><li>Tracking and Reporting</li></ul>
- 44. 27<br />Grandfathering<br />Adding and Changing Features<br />Next Profile Edit<br />No Time Guarantees<br />Minimal Disruption<br />Parallel Feature Extraction Pipeline<br />Time ~ Week<br />Significant Systems Work<br />Time ~ Hour<br />Minimal Disruption<br />Grandfather When Ready<br />
- 45. 28<br />Hadoop Case Studies<br /><ul><li>Scaling ✔</li><li>Blending Recommendation Algorithms ✔</li><li>Grandfathering ✔</li><li>Model Selection</li><li>A/B Testing</li><li>Tracking and Reporting</li></ul>
- 52. 29<br />Model Selection<br /><ul><li>Features</li><li>Models</li><li>Parameters</li></ul>Decision Trees<br />SVM<br />Logistic Regression<br />Content, Collaborative<br />L1+L2 Regularization<br />
- 53. 30<br />Hadoop Case Studies<br /><ul><li>Scaling ✔</li><li>Blending Recommendation Algorithms ✔</li><li>Grandfathering ✔</li><li>Model Selection ✔</li><li>A/B Testing</li><li>Tracking and Reporting</li></ul>
- 58. 31<br />A/B Testing<br />Is Option A Better Than Option B? Let’s Test<br />A: New Model, 10% of traffic<br />B: Old Model, 90% of traffic<br />Send 10% of members who have more than 100 connections AND who have logged in within the past week AND who are based in Europe<br />
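
The targeting rule on the A/B testing slide (10% of members with more than 100 connections who logged in within the past week and are based in Europe) could be expressed along these lines. This is a sketch under assumptions: the `Member` fields, the experiment name, and hash-based bucketing are illustrative choices, not the mechanism described in the talk.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class Member:
    # Hypothetical fields chosen to mirror the three conditions on the slide.
    member_id: int
    connections: int
    days_since_login: int
    region: str


def in_segment(member):
    """More than 100 connections AND logged in within the past week AND based in Europe."""
    return (
        member.connections > 100
        and member.days_since_login <= 7
        and member.region == "Europe"
    )


def bucket(member_id, experiment, buckets=100):
    """Stable bucket in [0, buckets), so a member sees the same model on every visit."""
    digest = hashlib.sha256(f"{experiment}:{member_id}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % buckets


def assign_model(member, experiment="new-model-test", treatment_percent=10):
    """Send treatment_percent of the target segment to the new model (A); everyone else gets the old model (B)."""
    if in_segment(member) and bucket(member.member_id, experiment) < treatment_percent:
        return "new"
    return "old"


# Hypothetical members for illustration.
eligible = Member(member_id=42, connections=250, days_since_login=2, region="Europe")
ineligible = Member(member_id=43, connections=20, days_since_login=2, region="Europe")
choice_for_ineligible = assign_model(ineligible)
```

Hashing the member id together with the experiment name keeps assignments stable for a member within one experiment while keeping bucket choices uncorrelated across experiments.
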
- 59. 32<br />Hadoop Case Studies<br /><ul><li>Scaling ✔</li><li>Blending Recommendation Algorithms ✔</li><li>Grandfathering ✔</li><li>Model Selection ✔</li><li>A/B Testing ✔</li><li>Tracking and Reporting</li></ul>
- 64. 33<br />Tracking and Reporting<br />K-way joins across billions of rows<br />Up to the minute reporting<br />Nearsightedness<br />K-way join complexity<br />Lacks up to the minute reporting<br />Simple k-way joins<br />
- 65. 34<br />Think Platform<br />Leverage Hadoop<br />
- 66. 35<br />Come work with us at LinkedIn<br />You<br />Applied Research Engineer<br />LinkedIn<br />

