
Data Day Texas - Recommendations

Preetha Appan is the technical lead of the recommendations team at Indeed. Her past contributions to Indeed's job and resume search engines include keyword tokenization improvements, query expansion features, and major infrastructure and performance improvements. She enjoys working on challenging problems in machine learning and information retrieval.


2. Job recommendations are significantly different: rapid inventory growth (millions of new jobs discovered every day); ~1.5 million new users visit Indeed every day; the average lifespan of a job is ~30 days; one job posting is usually meant to hire one individual.
6. Compute similarity: for u_i in Users: for u_j in Users: SIM[i, j] = compute_similarity(u_i, u_j)
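The nested loop on slide 6 can be sketched as follows. This is a minimal illustration, assuming each user is represented by the set of jobs they interacted with and that `compute_similarity` is Jaccard similarity; the slide names the function but does not define it, and the sample data is hypothetical.

```python
from itertools import combinations

def compute_similarity(items_a, items_b):
    """Jaccard similarity of two sets (an assumed choice of metric)."""
    union = items_a | items_b
    if not union:
        return 0.0
    return len(items_a & items_b) / len(union)

def all_pairs_similarity(user_items):
    """Compare every unordered pair of users: O(n^2) calls for n users."""
    sim = {}
    for ui, uj in combinations(sorted(user_items), 2):
        sim[(ui, uj)] = compute_similarity(user_items[ui], user_items[uj])
    return sim

# Hypothetical click data, for illustration only.
user_items = {
    "user1": {"job1", "job2"},
    "user2": {"job2", "job3", "job5"},
    "user3": {"job7"},
}
sim = all_pairs_similarity(user_items)
```

The number of comparisons grows quadratically with the number of users, which is presumably why the later slides replace exact comparison with minhash signatures.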
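9. $J(U_i, U_j) = \frac{|\mathrm{Items}[U_i] \cap \mathrm{Items}[U_j]|}{|\mathrm{Items}[U_i] \cup \mathrm{Items}[U_j]|}$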
10. $\mathrm{Items}[U_i] = \{x_1, x_2, \ldots, x_n\}$; for a hash function $H$, $\mathrm{minhash}_H(U_i) = \min\{H(x) \mid x \in \mathrm{Items}[U_i]\}$
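The definition on slide 10 can be sketched in a few lines. The hash function here is an assumption: a seeded SHA-256 truncated to 64 bits, chosen only because it is deterministic; `make_hash` and the sample items are hypothetical, and the slides do not say which hash family is used.

```python
import hashlib

def make_hash(seed):
    """Return a deterministic hash function H: str -> int, keyed by `seed`."""
    def h(x):
        digest = hashlib.sha256(f"{seed}:{x}".encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")
    return h

def minhash(items, h):
    """minhash_H(U) = smallest value of H(x) over all x in the user's items."""
    return min(h(x) for x in items)

h = make_hash(0)
items = {"job1", "job2", "job3"}
value = minhash(items, h)
```

The result depends only on the contents of the set, not on the order in which items are visited.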
11. Similarity(U1, U2) = 1 if minhash(U1) == minhash(U2), and 0 otherwise. This is an unbiased estimator of the Jaccard similarity.
13. For each hash function $H_k$: $\Pr[\mathrm{minhash}_{H_k}(U_i) = \mathrm{minhash}_{H_k}(U_j)] = J(U_i, U_j)$
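The collision property on slide 13 can be checked empirically: the fraction of hash functions on which two users' minhash values agree should approach their Jaccard similarity. The sketch below assumes seeded SHA-256 as a stand-in for a family of independent hash functions; the helper names and data are hypothetical.

```python
import hashlib

def hash_with_seed(seed, x):
    """Deterministic 64-bit hash of x under the hash function number `seed`."""
    digest = hashlib.sha256(f"{seed}:{x}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")

def minhash(items, seed):
    return min(hash_with_seed(seed, x) for x in items)

def jaccard(a, b):
    return len(a & b) / len(a | b)

def estimate_jaccard(a, b, num_hashes):
    """Fraction of hash functions whose minhash values match for a and b."""
    matches = sum(minhash(a, k) == minhash(b, k) for k in range(num_hashes))
    return matches / num_hashes

a = {f"job{i}" for i in range(0, 60)}
b = {f"job{i}" for i in range(30, 90)}
exact = jaccard(a, b)  # 30 shared items out of 90 distinct items
estimate = estimate_jaccard(a, b, num_hashes=500)
```

Each individual comparison is a 0/1 estimate, so the average only becomes accurate as the number of hash functions grows.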
14. user → {job1, job2, job3, ...}
15. H = {H1, H2, ..., H20}; for user in Users: for hash in H: minhash[user][hash] = min{hash(x) | x ∈ Items[user]}
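The loop on slide 15 can be sketched as a signature computation: one minhash value per hash function, per user. Only the count of 20 hash functions comes from the slide; the seeded SHA-256 hash, the function names, and the sample data are assumptions for illustration.

```python
import hashlib

NUM_HASHES = 20  # the slide uses H1..H20

def hash_with_seed(seed, x):
    """Deterministic 64-bit hash of x under the hash function number `seed`."""
    digest = hashlib.sha256(f"{seed}:{x}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")

def signature(items, num_hashes=NUM_HASHES):
    """Tuple of minhash values, one per hash function."""
    return tuple(
        min(hash_with_seed(k, x) for x in items)
        for k in range(num_hashes)
    )

# Hypothetical user -> jobs mapping.
user_items = {
    "user1": {"job1", "job2", "job3"},
    "user2": {"job2", "job3", "job4"},
    "user3": {"job1", "job2", "job3"},
}
signatures = {user: signature(items) for user, items in user_items.items()}
```

Users with identical item sets get identical signatures, and users with overlapping sets tend to agree on some positions.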
16. for u_i in Users: for u_j in Users: SIM[i, j] = compute_similarity(u_i, u_j)
17. user1 → (111, 123, 134, 148, ..., 129); user2 → (101, 123, 139, 148, ..., 135); user3 → (191, 103, 126, 108, ..., 119); user4 → (191, 103, 126, 108, ..., 129); ...
18. user → {cluster}; cluster → {users}
19. 123 → (user1, user2); 148 → (user1, user2); 129 → (user1, user4); 191 → (user3, user4); ...
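The step from signatures (slide 17) to clusters (slide 19) amounts to inverting the user → minhash mapping. The sketch below uses only the signature values visible on the slide; the elided middle entries are left out rather than guessed. Keying clusters by the bare minhash value follows the slide, and it assumes values produced by different hash functions do not collide; keying by (hash index, value) would avoid that assumption.

```python
from collections import defaultdict

# Visible values from slide 17; the elided entries ("..") are omitted.
signatures = {
    "user1": (111, 123, 134, 148, 129),
    "user2": (101, 123, 139, 148, 135),
    "user3": (191, 103, 126, 108, 119),
    "user4": (191, 103, 126, 108, 129),
}

def build_clusters(signatures):
    """Invert user -> minhash values into minhash value -> set of users."""
    clusters = defaultdict(set)
    for user, sig in signatures.items():
        for value in sig:
            clusters[value].add(user)
    return dict(clusters)

clusters = build_clusters(signatures)
# Clusters with more than one member link users who share a minhash value.
shared = {c: users for c, users in clusters.items() if len(users) > 1}
```

Each user is touched once per hash function, so no pairwise comparison between users is needed.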
21. user1 → {job1, job2}; user2 → {job2, job3, job5}; 123 → {user1, user2}
22. user1 → {job1, job2}; user2 → {job2, job3, job5}; 123 → {job1, job2, job3, job5}
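Slides 21 and 22 replace each cluster's member list with the union of its members' jobs. A minimal sketch using the values from those slides (the function name is hypothetical):

```python
# Values taken from slides 21 and 22.
user_jobs = {
    "user1": {"job1", "job2"},
    "user2": {"job2", "job3", "job5"},
}
cluster_users = {123: {"user1", "user2"}}

def jobs_for_clusters(cluster_users, user_jobs):
    """Map each cluster to the union of the jobs of its member users."""
    return {
        cluster: set().union(*(user_jobs[u] for u in users))
        for cluster, users in cluster_users.items()
    }

cluster_jobs = jobs_for_clusters(cluster_users, user_jobs)
```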
23. user → {cluster}: user1 → {111, 123, ...}
25. 111 → {job5, job2, job9}; 123 → {job1, job2, job3, job5}; combined: {job2, job5, job9, job1, job3}
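The lookup on slides 23 to 26 can be sketched as: find the user's clusters, then merge the jobs attached to each cluster. Ranking by how many of the user's clusters contain a job is an assumption made here for illustration; the slides show only the merged set, not how it is ordered.

```python
from collections import Counter

# Values taken from slides 24 and 25.
user_clusters = {"user1": [111, 123]}
cluster_jobs = {
    111: ["job5", "job2", "job9"],
    123: ["job1", "job2", "job3", "job5"],
}

def recommend(user, user_clusters, cluster_jobs):
    """Merge jobs from the user's clusters; jobs seen in more clusters rank first."""
    counts = Counter()
    for cluster in user_clusters[user]:
        counts.update(cluster_jobs[cluster])
    # Ties are broken by job name so the output is deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [job for job, _ in ranked]

recs = recommend("user1", user_clusters, cluster_jobs)
```

With this data, job2 and job5 appear in both clusters and therefore come first; the remaining jobs appear in one cluster each.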
31. {101, 121}; {“Software Engineer”, “Java Developer”, “Python Developer”}; minhash({“Software Engineer”, “Java Developer”, “Python Developer”}) → {99, 135} → {99, 121}
32. minhash({“Software Engineer”, “Java Developer”, “Python Developer”}) → {99, 121}; 99 → add {“Software Engineer”, “Java Developer”, “Python Developer”}; 121 → add {“Software Engineer”, “Java Developer”, “Python Developer”}
33. {“Software Engineer”, “Java Developer”, “Python Developer”}; {99, 121}; 99 → {“Software Engineer”, “Java Developer”, “Python Developer”}
34. {99, 131}; {“Software Engineer”, “Java Developer”, “Python Developer”}
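One plausible reading of the numbers on slides 28 to 33 is an incremental update: an existing signature {101, 121} is combined with the minhash values of newly seen items {99, 135} by taking the smaller value at each position, giving {99, 121}. This works because the minimum over a union of sets equals the minimum of the per-set minimums. The sketch below follows that reading; it is an interpretation, not something the slides state, and the function names are hypothetical.

```python
from collections import defaultdict

def merge_signature(current, incoming):
    """Element-wise minimum of two signatures of equal length."""
    return tuple(min(a, b) for a, b in zip(current, incoming))

def apply_update(cluster_items, current, incoming, new_items):
    """Merge signatures, then attach the new items to each resulting cluster."""
    updated = merge_signature(current, incoming)
    for cluster in updated:
        cluster_items[cluster].update(new_items)
    return updated

cluster_items = defaultdict(set)
new_items = {"Software Engineer", "Java Developer", "Python Developer"}
updated = apply_update(cluster_items, (101, 121), (99, 135), new_items)
```

Under this reading the signature never has to be recomputed from the full history; only the new items are hashed.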
37. http://go.indeed.com/docservice
42. Engineering blog & talks: http://indeed.tech; Open Source: http://opensource.indeedeng.io; Careers: http://indeed.jobs; Twitter: @IndeedEng