1. uPick
2. Agenda
Introduction to Named Entity (NE) concepts.
Extracting relationships among NEs and the problems associated with them.
Our approach: human in the loop.
Proposed design and results.
4. Named Entity: Definition
A named entity is an atomic element in a body of text.
Types: person, organization, location, etc.
Different named entities, when linked together, form a relation.
5. Named Entity: An example
Sachin Tendulkar was born in Bombay.
'Sachin Tendulkar' is an NE of type 'Person'; 'Bombay' is an NE of type 'Location'.
6. Relationship: Structure
Subject – Relation – Object
Subject and Object: NE of any type.
Relation: a verb, adjective, or adverb.
9. Extracting relationships among NEs: Importance
They signify a fact related to a named entity.
Useful in question-answering systems.
Useful in improving the accuracy of search results.
10. Extracting relationships among NEs: Standard process
1. Identify named entities within a sentence.
2. Find the verb or adjective that connects the identified named entities.
3. Connect them together to form a relation.
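The three steps above can be sketched over a POS-tagged sentence. This is a minimal illustrative implementation, not the authors' code: it assumes Penn Treebank tags (NNP = proper noun, VB* = verb, JJ = adjective), and in practice the tagging would come from a trained POS tagger.

```python
def extract_relations(tagged):
    """tagged: list of (token, pos_tag) pairs for one sentence.
    Returns (subject, relation, object) triples."""
    # Step 1: group consecutive proper nouns into named-entity chunks.
    entities, connectors = [], []
    i = 0
    while i < len(tagged):
        if tagged[i][1] == "NNP":
            j = i
            while j < len(tagged) and tagged[j][1] == "NNP":
                j += 1
            entities.append((i, " ".join(tok for tok, _ in tagged[i:j])))
            i = j
        else:
            # Step 2: remember verbs/adjectives that may connect entities.
            if tagged[i][1].startswith("VB") or tagged[i][1] == "JJ":
                connectors.append((i, tagged[i][0]))
            i += 1
    # Step 3: connect adjacent entity pairs through the words between them.
    relations = []
    for (s_pos, subj), (o_pos, obj) in zip(entities, entities[1:]):
        between = [w for p, w in connectors if s_pos < p < o_pos]
        if between:
            relations.append((subj, " ".join(between), obj))
    return relations
```

Running it on the tagged example sentence from slide 5 yields the triple ('Sachin Tendulkar', 'was born', 'Bombay').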
11. Extracting relationships among NEs: Difficulty
Co-references.
Use of abbreviations, acronyms, and ambiguous words at several places.
Complex structure of the sentences.
12. Extracting relationships among NEs: Difficulty example
"Tom called his father last night. They talked for an hour. He said he would be home the next day."
What is 'He' referring to? Tom or his father?
13. Extracting relationships among NEs: Required process
1. Identify part-of-speech constructs: noun, verb, adjective, etc.
2. Determine co-references, acronyms, and abbreviations.
3. Connect them together to form a relationship.
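Step 2 can be sketched with two deliberately naive heuristics. Both the acronym table and the "most recent entity" pronoun rule are illustrative assumptions, not the authors' method; the naive binding fails on genuinely ambiguous cases like the Tom example on slide 12, which is precisely why human verification is needed.

```python
# Hypothetical expansion table; in practice it would be mined from the corpus.
ACRONYMS = {"BCCI": "Board of Control for Cricket in India"}
PRONOUNS = {"he", "she", "him", "her", "they", "them"}

def resolve(tokens, known_entities):
    """Expand known acronyms and naively bind each pronoun to the most
    recently mentioned named entity."""
    resolved, last_entity = [], None
    for tok in tokens:
        if tok in known_entities:
            last_entity = tok
            resolved.append(tok)
        elif tok.lower() in PRONOUNS and last_entity is not None:
            resolved.append(last_entity)
        else:
            resolved.append(ACRONYMS.get(tok, tok))
    return resolved
```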
14. Extracting relationships among NEs: Automated approaches
Natural language processing: part-of-speech tagger, CRF.
Machine learning: Hidden Markov Model, Support Vector Machine.
Statistical methods: Maximum Entropy method.
Other methods: vocabulary-based systems, context-based clustering.
15. Issues with the automated extraction techniques
Dependency on external vocabulary sources, like Wikipedia, WordNet, MindNet, etc.:
Maintenance and updating of vocabulary sources is manual, costly, and requires expertise.
Limited size produces context-based noise.
Scalability:
Domain-dependent.
Corpus-dependent.
Relation-specific.
18. Crowdsourcing: In terms of NE relationship extraction
Advantages:
Humans can easily extract and find different facts from a text body.
They can also verify the accuracy of the relationships obtained from automated techniques.
Disadvantages:
Humans do not scale like computers.
They find the task boring and cumbersome.
They need incentives to participate.
20. 1. Monetary incentives
Features:
Can scale massively.
Harness majority vote.
Example: Amazon Mechanical Turk.
Disadvantages:
Money is not the only source of motivation.
Work is not credited, so may encourage cheating.
Requires filtering and monitoring.
Raises labor-law concerns.
21. 2. Social incentives
Features:
Distribute tasks among social peers (crowd).
Harness trust.
Examples: Flickr, Facebook, Quora.
Disadvantages:
Do not scale beyond the closed network.
Specific to the participating crowd.
Requires filtering and monitoring.
22. 3. Fun incentives
Features:
Games are seductive.
Bring collaboration, curiosity, challenges, competition, fun.
The task is generally hidden.
Examples: ESP, GWAP.
Disadvantages:
Need someone to play with.
Improper game play may encourage cheating.
More cognitive work leads to less fun.
Requires filtering and monitoring.
23. Existing crowdsourcing ideas
Monetary incentive:
Amazon Mechanical Turk: collecting named entities, finding relational hierarchies, phrase detection, etc.
Still no solution for verification!
Fun incentive:
Games With a Purpose: Verbosity, Categorilla, Phrase Detectives.
Used for collecting common-sense facts and producing entities for templates.
Still a higher cognitive task!
27. uPick working
Step 1: Extract NEs and relations using a POS tagger (automated technique).
Step 2: Present the extracted relations to a crowd in the form of a game (challenge).
Step 3: Filter the relations by collecting the majority votes.
29. uPick scoring
For the first player, compare the output with the expert judgments.
For subsequent players, check the majority vote (> 50%).
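The scoring rule can be sketched as follows. This is a hypothetical reconstruction of the rule as described on the slide: the one-point reward and the strict >50% threshold handling are illustrative assumptions.

```python
from collections import Counter

def majority_label(votes):
    """votes: list of True/False judgements on one extracted relation.
    Returns the winning label once strictly more than 50% of players
    agree, or None while there is no majority."""
    label, count = Counter(votes).most_common(1)[0]
    return label if count / len(votes) > 0.5 else None

def score_player(answer, prior_votes, expert_label=None):
    """First player (no prior votes) is checked against the expert
    judgement; subsequent players against the running majority vote."""
    reference = expert_label if not prior_votes else majority_label(prior_votes)
    return 1 if reference is not None and answer == reference else 0
```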
30. uPick benefits
Effectiveness:
Minimal cognitive effort because of click-based interaction.
No dependency on external resources, therefore scalable.
Generalization:
Language- and corpus-independent.
Can be extended to solve other similar NLP problems.
31. uPick user study
Supervised laboratory study.
Participants: 12 (4 males and 8 females).
Two one-hour sessions: training and game play.
Document set of four, on Ashok Maurya, Sachin Tendulkar, Shahrukh Khan, and Sonia Gandhi.
32. RESULTS: Accuracy of the uPick scheme after considering majority votes of the participants

                                                      D1    D2    D3    D4
Total number of presented relations                   37    39    40    33
Correctly identified valid relations                  19    18    19    15
Incorrectly identified valid relations as invalid      5     6     4     1
Correctly identified invalid relations                12    12    16    15
Incorrectly identified invalid relations as valid      1     3     1     2
Accuracy                                             84%   77%   87%   91%
(correctly identified relations / total relations)
Accuracy using automated techniques only             65%   61%   57%   49%
(valid relations / total relations)
33. Conclusion
Participants did not find the game design
engaging.
uPick proved helpful in remembering various
facts related to a text body.
34. Future Work
Leader board.
More engaging game-play design, for example physics-based puzzles and object-finding games.
Extension to a question-answering system based on individual documents.
Editor's notes
In traditional computation, a task is outsourced to one or more computers for solving. But in human computation, the task is routed to the crowd. Some problems are easy for humans but not for systems, e.g., identifying the objects in a room.