Query Understanding at LinkedIn [Talk at Facebook]

  1. 1. Query Understanding and Search Assistance @ LinkedIn Abhi Lad (Engineering Lead, Search Quality)
  2. 2. Outline
      ● Search at LinkedIn
      ● Goal of search
      ● Search assistance / Guided search
      ● Query understanding & rewriting
  3. 3. Search at LinkedIn
  4. 4. Search at LinkedIn Universal search box
  5. 5. Search at LinkedIn Navigational People search
  6. 6. Search at LinkedIn Exploratory People search FACETS
  7. 7. Search at LinkedIn Exploratory People search
  8. 8. Search at LinkedIn Job Search
  9. 9. Search at LinkedIn Federated Search (figure labels: JOBS, PEOPLE, PEOPLE)
  10. 10. Goal of Search
  11. 11. Goal of search: Help users find who or what they are looking for with minimal effort
  12. 12. Goal of search: Help users find who or what they are looking for with minimal effort
      1. Help users frame “good” queries
      2. Understand the user’s underlying intent / information need
      3. Rewrite the query to ensure a good result set
      4. Rank the results based on the user and the query
      5. Provide good result attribution: snippets, highlighting
      6. Propose next actions to refine results
  13. 13. Search Assistance
  14. 14. Search Assistance (especially useful for exploratory queries)
      ● Query Assistance [pre-retrieval]: Help users frame their queries easily
        ○ Autocomplete, Search suggestions in typeahead, Spellcheck, ...
      ● Guided Search [post-retrieval]: Guide users through their search process
        ○ Facet suggestions, Related searches, ...
  15. 15. Autocomplete & Search Suggestions Query autocomplete Search suggestions
  16. 16. Autocomplete & Search Suggestions Query autocomplete => Entity detection => Search suggestions
  17. 17. Autocomplete & Search Suggestions: Query autocomplete => Entity detection => Search suggestions
      Autocomplete system:
      ● Based on query logs
      ● Index and retrieve using Lucene FST
      ● Can complete last part of the query (even if entire query was previously unseen)
      (Do not index people names)
  18. 18. Autocomplete & Search Suggestions: Autocomplete
      Use query logs to index unigrams (tokens), bigrams, and entities (companies, titles, skills, locations)
      ● Compute co-occurrence statistics
      ● Build FST for efficient “prefix => entity” retrieval
      Query: [senior digital product manager sa|n francisco]
      Score based on entity co-occurrence using last entity in the query (product manager):
      ● P(san francisco | product manager)
      ● P(san diego | product manager)
      ● P(sandisk | product manager)
      Fall back to bigram co-occurrence:
      ● P(francisco | san) x P(san | manager)
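The entity co-occurrence scoring on this slide can be sketched roughly as follows. This is a minimal illustration, not LinkedIn's implementation: the co-occurrence counts are invented, the names `build_counts` and `complete` are hypothetical, and a plain prefix scan stands in for the Lucene FST lookup mentioned on the slide.

```python
from collections import Counter, defaultdict

# Toy (context entity, following entity) pairs, invented for illustration.
COOCCURRENCES = [
    ("product manager", "san francisco"),
    ("product manager", "san francisco"),
    ("product manager", "san francisco"),
    ("product manager", "san diego"),
    ("product manager", "sandisk"),
    ("software engineer", "san diego"),
]

def build_counts(pairs):
    """Count how often each entity follows each context entity."""
    counts = defaultdict(Counter)
    for context, completion in pairs:
        counts[context][completion] += 1
    return counts

def complete(prefix, context, counts):
    """Rank entities starting with `prefix` by P(entity | context).

    A linear scan replaces the FST here (assumption made for brevity).
    """
    seen = counts.get(context, Counter())
    total = sum(seen.values())
    scored = [
        (entity, n / total)
        for entity, n in seen.items()
        if entity.startswith(prefix)
    ]
    # Highest probability first; ties broken alphabetically.
    return sorted(scored, key=lambda item: (-item[1], item[0]))

counts = build_counts(COOCCURRENCES)
suggestions = complete("sa", "product manager", counts)
```

With these toy counts, `san francisco` ranks first for the prefix `sa` after `product manager`. A fuller version would fall back to bigram statistics when the context entity has no co-occurrence data, as the slide describes.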
  19. 19. Autocomplete & Search Suggestions: Autocomplete
      ● Personalization
        ○ [ma]
          ■ machinist
          ■ manager
          ■ machine learning?
      ● Implicit spelling correction
        ○ [macine lear] => machine learning
      ● Use similar entities to complete previously unseen queries
        ○ [software engineer] ⇔ [software developer]
        ○ Complete [hadoop software de|veloper] based on [hadoop software engineer]
  20. 20. Autocomplete & Search Suggestions: Search Suggestions
      ● Personalization
        ○ [hadoop]
          ■ “People with hadoop skills”
          ■ “Jobs requiring hadoop skills”
      ● Suggestions with multiple entities
        ○ [hadoop engineer san francisco]
          ■ “Hadoop engineer jobs in San Francisco”
  21. 21. Spellcheck ● Fix obvious typos ● Help users spell names
  22. 22. Spellcheck People names Companies Titles Past queries
  23. 23. Spellcheck PROBLEM: User profiles as well as query logs contain many spelling errors (Frequency alone is not helpful due to the long-tail distribution of entities)
  24. 24. Spellcheck PROBLEM: User profiles as well as query logs contain many spelling errors SOLUTION: Use query chains and click data to infer correct spelling
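One way to read "use query chains and click data" is sketched below: when a user retypes a query with a small change and then clicks a result, the second query counts as a vote for the correct spelling. The log format, the `max_distance` threshold, and the function names are assumptions made for illustration; the slides do not describe the actual model.

```python
from collections import Counter, defaultdict

def edit_distance(a, b):
    """Standard Levenshtein distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]

# Toy query chains: (first query, follow-up query, did the follow-up get a click?)
QUERY_CHAINS = [
    ("macine learning", "machine learning", True),
    ("macine learning", "machine learning", True),
    ("macine learning", "marine learning", False),
    ("machine learning", "deep learning", True),
]

def learn_corrections(chains, max_distance=2):
    """Map each query to its most-voted nearby reformulation that led to a click."""
    votes = defaultdict(Counter)
    for first, second, clicked in chains:
        # Only close rewrites count as spelling fixes; distant ones are topic changes.
        if clicked and 0 < edit_distance(first, second) <= max_distance:
            votes[first][second] += 1
    return {query: c.most_common(1)[0][0] for query, c in votes.items()}

corrections = learn_corrections(QUERY_CHAINS)
```

Here the unclicked rewrite and the topic change to `deep learning` are both ignored, so frequency of a misspelling in the logs does not by itself make it look correct.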
  25. 25. Spellcheck
      ● Better error model
        ○ Improved metaphone (version 3)
        ○ Platform aware: Keyboard edit distance on mobile
      ● Machine-learned model
      ● Support for partial queries
        ○ Spellcheck-as-you-type for “Instant” search
  26. 26. Facet Suggestions
  27. 27. Facet Suggestions
  28. 28. Facet Suggestions
      ● Query awareness
        ○ For TITLE queries, suggest seniority facet
        ○ Don’t suggest facets for name queries
        ○ Don’t suggest redundant/conflicting facets (location facet when query has location)
      ● User awareness
        ○ User profile: Users often restrict search results to their own location, industry, seniority
        ○ User behavior: Recruiters often restrict to particular industry, location
      ● Document set awareness
        ○ Ensure minimum number of results
        ○ Bias towards higher-quality results (people, jobs, …)
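The query-awareness rules on this slide can be expressed as a few simple filters. The sketch below is illustrative only: the candidate facet list, the tag names, and the `TAG_TO_FACET` mapping are assumptions, and the user-awareness and document-set-awareness signals are left out.

```python
# Hypothetical mapping from a query tag to the facet it makes redundant.
TAG_TO_FACET = {"LOCATION": "location", "COMPANY": "company"}

def suggest_facets(query_tags,
                   candidates=("location", "industry", "company", "seniority")):
    """Apply the three query-awareness rules to a list of candidate facets."""
    tags = set(query_tags)
    # Rule: don't suggest facets for name queries.
    if "NAME" in tags:
        return []
    # Rule: drop facets the query already constrains (e.g. location).
    covered = {TAG_TO_FACET[t] for t in tags if t in TAG_TO_FACET}
    facets = [f for f in candidates if f not in covered]
    # Rule: for title queries, promote the seniority facet.
    if "TITLE" in tags and "seniority" in facets:
        facets.remove("seniority")
        facets.insert(0, "seniority")
    return facets

title_in_city = suggest_facets(["TITLE", "LOCATION"])
name_query = suggest_facets(["NAME"])
```

For a title query that already names a location, the location facet is dropped and seniority moves to the front; a name query gets no facet suggestions at all.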
  29. 29. Query Understanding and Rewriting
  30. 30. Query Understanding
  31. 31. Query Tagging (Recognized entities: Names, titles, companies, schools, locations, skills)
  32. 32. Query Tagger
      Sequential model trained on the following data:
      ● Emission probabilities (dictionary)
        ○ Profiles: Names, Titles, Schools, Locations
        ○ Standardized data: Companies, Skills
      ● Transition probabilities
        ○ Query logs
        ○ Tags for query tokens inferred based on result clicks
  33. 33. Query Tagger
      Prediction:
      1. Segmentation: Maximum likelihood using unigram/bigram counts
         [data scientist] [linkedin] [mountain view]
      2. Sequence labeling: Viterbi decoding
         [TITLE] [COMPANY] [LOCATION]
      3. Entity linking: Dictionary
         [TITLE ID=435] [COMPANY ID=1337] [LOCATION ID=us:ca:mountain_view]
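The sequence-labeling step (step 2) can be illustrated with a small Viterbi decoder over already-segmented queries. All probabilities below are invented toy values, not figures from the talk, and the segmentation and entity-linking steps are assumed to happen elsewhere.

```python
import math

TAGS = ["TITLE", "COMPANY", "LOCATION"]

# Toy emission probabilities P(segment | tag); in the talk these come from dictionaries.
EMISSION = {
    "data scientist": {"TITLE": 0.9, "COMPANY": 0.05, "LOCATION": 0.05},
    "linkedin": {"TITLE": 0.05, "COMPANY": 0.9, "LOCATION": 0.05},
    "mountain view": {"TITLE": 0.05, "COMPANY": 0.15, "LOCATION": 0.8},
}
# Toy start and transition probabilities; in the talk these come from query logs.
START = {"TITLE": 0.6, "COMPANY": 0.2, "LOCATION": 0.2}
TRANSITION = {
    "TITLE": {"TITLE": 0.1, "COMPANY": 0.6, "LOCATION": 0.3},
    "COMPANY": {"TITLE": 0.2, "COMPANY": 0.1, "LOCATION": 0.7},
    "LOCATION": {"TITLE": 0.4, "COMPANY": 0.4, "LOCATION": 0.2},
}

def viterbi(segments):
    """Return the most likely tag sequence for known segments (log-space Viterbi)."""
    # best[tag] = (log probability, path) of the best path ending in `tag`.
    best = {
        t: (math.log(START[t]) + math.log(EMISSION[segments[0]][t]), [t])
        for t in TAGS
    }
    for seg in segments[1:]:
        nxt = {}
        for t in TAGS:
            # Pick the best previous tag to transition from.
            score, path = max(
                (best[p][0] + math.log(TRANSITION[p][t]), best[p][1]) for p in TAGS
            )
            nxt[t] = (score + math.log(EMISSION[seg][t]), path + [t])
        best = nxt
    return max(best.values())[1]

tags = viterbi(["data scientist", "linkedin", "mountain view"])
```

This sketch only handles segments present in the toy `EMISSION` table; a real tagger would need smoothing for unseen segments.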
  34. 34. Query Tagging: Using query tags
      ● Query tags used for ranking model selection
        ○ Name query => NAME MODEL
        ○ Title query, Skill query => TITLE MODEL
        ○ ...
      ● More precise matching with documents
        ○ [software engineer google new york] is rewritten to [TITLE:(software engineer) COMPANY:(google) GEO:(new york)]
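Both uses of query tags on this slide are easy to sketch once a query has been tagged. The function names `rewrite` and `select_model` are hypothetical, the order in which tags are checked is an assumption, and any query type the slide leaves unspecified returns `None` rather than a guessed model.

```python
def rewrite(tagged_segments):
    """Turn (tag, text) pairs into a fielded query string."""
    return " ".join(f"{tag}:({text})" for tag, text in tagged_segments)

def select_model(tagged_segments):
    """Pick a ranking model from the query tags.

    Only the two mappings shown on the slide are covered; checking NAME
    before TITLE/SKILL is an assumption.
    """
    tags = {tag for tag, _ in tagged_segments}
    if "NAME" in tags:
        return "NAME MODEL"
    if tags & {"TITLE", "SKILL"}:
        return "TITLE MODEL"
    return None  # Other cases are elided on the slide.

tagged = [("TITLE", "software engineer"), ("COMPANY", "google"), ("GEO", "new york")]
fielded_query = rewrite(tagged)
model = select_model(tagged)
```

Matching each segment against its own field is what makes retrieval more precise: `google` is only matched against company fields, not against any text on a profile.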
  35. 35. Entity-based filtering (figure labels: BEFORE, AFTER, escape hatch)
  36. 36. Query Expansion Name synonyms Job Title synonyms
  37. 37. Query Expansion
      ● Titles
        ○ Query reformulations
          ■ [programmer] => [software engineer] => CLICK
          ■ [lawyer] => [attorney] => CLICK
          ■ [attorney] => [legal counsel] => CLICK
      ● Names
        ○ Query reformulations
        ○ Dictionaries
          ■ bob == robert
          ■ beth == elizabeth
          ■ ...
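Mining title synonyms from reformulations that end in a click might look like the sketch below. The log rows, the `min_support` threshold, and the function names are illustrative assumptions rather than details from the talk.

```python
from collections import Counter

# Toy reformulation log: (original query, rewritten query, did the rewrite get a click?)
REFORMULATIONS = [
    ("programmer", "software engineer", True),
    ("programmer", "software engineer", True),
    ("lawyer", "attorney", True),
    ("lawyer", "attorney", True),
    ("attorney", "legal counsel", True),
    ("lawyer", "doctor", False),
]

def mine_synonyms(log, min_support=2):
    """Keep rewrites that led to a click at least `min_support` times."""
    counts = Counter((a, b) for a, b, clicked in log if clicked)
    synonyms = {}
    for (a, b), n in counts.items():
        if n >= min_support:
            synonyms.setdefault(a, []).append(b)
    return synonyms

def expand(query, synonyms):
    """Return the query followed by its mined synonyms."""
    return [query] + synonyms.get(query, [])

synonyms = mine_synonyms(REFORMULATIONS)
expanded = expand("lawyer", synonyms)
```

The support threshold filters out one-off rewrites, so in this toy log `attorney` => `legal counsel` is not yet trusted as a synonym pair.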
  38. 38. Name spelling variants Name Clustering
  39. 39. Name spelling variants: Name Clustering
      Two-step clustering:
      1. Coarse clustering: metaphone
      2. Finer clustering: edit distance, hand-written rules…
      Each name is assigned to a cluster:
      NC_SRIRAM = {sriram, sreeram, sriraam, shriram, …}
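The two-step clustering can be sketched as coarse bucketing by a phonetic key followed by an edit-distance split inside each bucket. Note the substitution: `phonetic_key` below is a crude stand-in for metaphone, written only for this example, and the distance threshold and the comparison against each cluster's first member are assumptions. The hand-written rules mentioned on the slide are not modeled.

```python
from collections import defaultdict

def phonetic_key(name):
    """Crude phonetic key (NOT metaphone): keep the first letter, drop
    vowels and h/w/y from the rest, then collapse repeated letters."""
    name = name.lower()
    rest = "".join(c for c in name[1:] if c not in "aeiouhwy")
    key = name[:1]
    for c in rest:
        if c != key[-1]:
            key += c
    return key

def edit_distance(a, b):
    """Standard Levenshtein distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]

def cluster_names(names, max_distance=2):
    """Step 1: bucket by phonetic key. Step 2: split buckets by edit distance."""
    buckets = defaultdict(list)
    for name in names:
        buckets[phonetic_key(name)].append(name)
    clusters = []
    for bucket in buckets.values():
        bucket_clusters = []
        for name in bucket:
            for cluster in bucket_clusters:
                # Compare against the cluster's first member as its representative.
                if edit_distance(name, cluster[0]) <= max_distance:
                    cluster.append(name)
                    break
            else:
                bucket_clusters.append([name])
        clusters.extend(bucket_clusters)
    return clusters

clusters = cluster_names(["sriram", "sreeram", "sriraam", "shriram", "sarma"])
```

In this toy run all five names share the coarse key `srm`, and the edit-distance step then separates `sarma` from the four `sriram` variants. Restricting the pairwise comparison to names inside one bucket is what keeps the finer step cheap.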
  40. 40. Summary
      ● Search assistance and guided search are critical for ensuring search success
        ○ Good query => good results
      ● High degree of structure in queries and documents (profiles, jobs, …)
        ○ Query understanding and Document understanding are crucial
        ○ “Things not Strings” => entity-based retrieval
      ● Query understanding and rewriting play an important role in result set quality
        ○ A good initial set of documents simplifies the ranker’s job
        ○ Good result set => accurate facet counts
        ○ Allows for sorting options other than relevance (recency, number of connections, …)
  41. 41. Thank You!
