1. KAIST Education for the World, Research for the Future
Task and Context Extraction and Representation
for Sharing Human Experiences
2012. 11. 29
Jihee Ryu
Web Science and Engineering Major
Information Retrieval and Natural Language Processing Lab
2. Why Human Experience Sharing?
Necessity of Experiential Problem Solving Knowledge
2 © 2012 IR&NLP Lab. All rights reserved.
3. A. Change a Flat Tire When You Are a Woman Alone
1. Loosen lug nuts on tire.
2. Install spare tire.
B. Change a Tire like a Real Woman
1. Call AAA.
2. Be placed on “hold”.
User Context Info
[On U.S. highway]
[1 year driving experience]
[Heading to New York]
[Female]
4. Experience Mining
Building Relational Knowledge about Experiences
Experiential Knowledge Distillation
Web -> [Automatic extraction] -> Experiential Sentences & Context -> [Aggregation & abstraction] -> Context-anchored Experiential Knowledge
Experiential Sentences & Context
Event          People        Place          Time
Play Soccer    Yongho, …     Expo Park      2011-08-10
Play Baseball  Chulsoo, …    Gapchun Park   2009-09-02
…
Context-anchored Experiential Knowledge
Event (Type)   People (Type)  Place (Type)   Time (Type)
(Sport)        (student)      (Park)         (Summer)
…
5. From What?
Various types of open contents on the Web!
How-to articles -> Human Task mining
Blog posts -> Event Context mining
Microblog posts -> Place Semantics mining
-> Human Experiential KB
7. Human Task Model
Goal --hasTopic--> Topic
Goal --hasAction--> Action
Action --hasNextAction--> Action
Action --hasObject--> Object
Action --hasTime--> Time
Action --hasLocation--> Location
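The task model on this slide can be sketched as plain Python data classes. This is only an illustration under stated assumptions, not the lab's implementation: the class names `Goal` and `Action` and the field names mirror the slide's concepts and `has*` relations, while representing time and location as optional strings and the sample topic "Food" are hypothetical choices.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Action:
    """One step of a task: a verb applied to an object, with optional context."""
    verb: str
    obj: Optional[str] = None               # hasObject
    time: Optional[str] = None              # hasTime
    location: Optional[str] = None          # hasLocation
    next_action: Optional["Action"] = None  # hasNextAction


@dataclass
class Goal:
    """A task goal with its topic and its ordered actions."""
    name: str
    topic: Optional[str] = None                          # hasTopic
    actions: List[Action] = field(default_factory=list)  # hasAction

    def add_action(self, action: Action) -> None:
        """Append an action and link it from the previous one (hasNextAction)."""
        if self.actions:
            self.actions[-1].next_action = action
        self.actions.append(action)


# The topic value "Food" is a made-up example, not taken from the slides.
goal = Goal(name="Make Omelet Soup", topic="Food")
for verb, obj in [("place", "water"), ("boil", "onion"), ("drop", "egg")]:
    goal.add_action(Action(verb=verb, obj=obj))
```

Keeping `hasNextAction` as an explicit link, in addition to the ordered list, lets a consumer walk the steps from any action without access to the enclosing goal.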
8. Human Task Extraction
Title: How to Make Omelet Soup
Goal: Make Omelet Soup
Action Sequence
Step 1 Place the water or canned chicken broth in a large saucepan. -> (place, water) (place, broth)
Boil the sweet yellow onion for several minutes. -> (boil, onion)
Step 2 Add the powdered chicken broth along with the canned mushrooms. -> (add, broth)
Boil the soup for a few more minutes, and then add the chopped green onion. -> (boil, soup) (add, onion)
Step 3 Drop the eggs into the simmering broth a few minutes before you're ready to serve the omelet soup. -> (drop, egg)
Ingredients: water, broth, soup, onion, egg
9. Hybrid Extraction Method
Sentences: "Eat fruit every day." -> (eat, fruit); "Turn off the car." -> (turn off, car)
1. Retrieve and apply a rule (Syntactic Patterns). Matched? Yes: extract verb and ingredients.
2. No: select the best label sequence (CRFs Model). Prob. > threshold? Yes: extract verb and ingredients.
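The control flow of the hybrid method (rule first, sequence model as fallback, accepted only above a probability threshold) can be sketched as follows. Everything concrete here is an assumption for illustration: the tiny word lists stand in for the real syntactic patterns, the `labeler` callable stands in for a trained CRF, and the default threshold of 0.5 is arbitrary.

```python
import re
from typing import Callable, List, Optional, Tuple

Pair = Tuple[str, str]
# A fallback labeler returns one label per token ("V", "O" or "-") and a probability.
Labeler = Callable[[List[str]], Tuple[List[str], float]]

# Toy lexicons standing in for real syntactic patterns (illustrative assumptions).
VERBS = {"eat", "turn", "boil", "add", "place", "drop"}
PARTICLES = {"off", "on", "up", "down", "out"}
DETERMINERS = {"the", "a", "an", "your"}


def tokenize(sentence: str) -> List[str]:
    """Lowercase the sentence and keep alphabetic tokens."""
    return re.findall(r"[a-z']+", sentence.lower())


def apply_rule(tokens: List[str]) -> Optional[Pair]:
    """Imperative pattern: VERB [PARTICLE] [DETERMINER]* OBJECT."""
    if not tokens or tokens[0] not in VERBS:
        return None
    verb, i = tokens[0], 1
    if i < len(tokens) and tokens[i] in PARTICLES:
        verb, i = f"{verb} {tokens[i]}", i + 1
    while i < len(tokens) and tokens[i] in DETERMINERS:
        i += 1
    return (verb, tokens[i]) if i < len(tokens) else None


def extract(sentence: str, labeler: Optional[Labeler] = None,
            threshold: float = 0.5) -> Optional[Pair]:
    """Try the rule first; otherwise use the labeler if it is confident enough."""
    tokens = tokenize(sentence)
    pair = apply_rule(tokens)
    if pair is not None or labeler is None:
        return pair
    labels, prob = labeler(tokens)
    if prob <= threshold:
        return None  # below the threshold: extract nothing
    verb = " ".join(t for t, lab in zip(tokens, labels) if lab == "V")
    obj = " ".join(t for t, lab in zip(tokens, labels) if lab == "O")
    return (verb, obj) if verb and obj else None
```

With these toy lexicons, `extract("Eat fruit every day.")` yields `("eat", "fruit")` and `extract("Turn off the car.")` yields `("turn off", "car")`, matching the two example sentences on the slide.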
10. Next Challenging Issues
A large fraction of sentences (more than 40%) in how-to instructions are not imperative sentences.
Difficulties arising from variations in writing
Scoping ambiguity
E.g. Clear or glitter nail polish should go on the nails.
Anaphora
E.g. Make it fun and unique
Condition
E.g. If your computers are only a few years old
Ellipsis
E.g. So why don't you?
Implicit meaning
E.g. Studying improves grades. (Study hard!)
Grammatical mistake
E.g. IM a friend! (Make a friend relationship in an instant messenger)
Case                  Percentage in all the clauses in 30 sample documents
Scoping ambiguity     13.9%
Anaphora              13.1%
Condition             11.9%
Ellipsis              1.9%
Implicit meaning      1.3%
Grammatical mistake   1.3%
Etc.                  56.6%
11. Feature Sets
Syntactic Features
Feature Name      Feature Values
Clause Type       main, subordinate
Person            1st person, 2nd person, 3rd person
Auxiliary Verb    will, shall, can, may, must, able to, …
Voice             active, passive, n/a
Tense             past, present, future
Polarity          negated, non-negated
Modality Features
Feature Name      Examples
Obligation        You have to ask about the car.
Permission        You can search for the world weather.
Explanation       The cost for delivery is already included.
Supposition       You will have access to the weather.
12. Result: Actionable Clause Detection
Task: Actionable Clause Detection
Used Feature Sets                      F1(NB)  F1(DT)  F1(SVM)
Syntactic Features (micro only)        0.933   0.942   0.948
+ Modality Features (micro & macro)    0.862   0.963   0.966
NB: Naïve Bayes, DT: Decision Tree, SVM: Support Vector Machines
13. Bridge to Semantic Web
AcTN knowledge representation YAGO knowledge representation
14. Changing Data Representation
Current Form: Refined tabular data records [plain text]
Ultimate Target Form: Well-designed ontology entries [well-formed RDF]
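As a rough sketch of the move from tabular records to RDF, the snippet below serializes one extracted action record as N-Triples lines. The namespace `http://example.org/task/`, the `verb` property and the IRI layout are hypothetical; only `hasAction` and `hasObject` echo relation names from the Human Task Model slide, and the actual target ontology is not given in the slides.

```python
from typing import Iterator
from urllib.parse import quote

# Hypothetical namespace; the slides do not name the target vocabulary.
NS = "http://example.org/task/"


def iri(local_name: str) -> str:
    """Build an IRI in the hypothetical namespace, percent-encoding the local name."""
    return f"<{NS}{quote(local_name.replace(' ', '_'))}>"


def literal(text: str) -> str:
    """Quote a plain string literal, escaping backslashes and double quotes only."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def record_to_ntriples(goal: str, step: int, verb: str, obj: str) -> Iterator[str]:
    """Turn one tabular (goal, step, verb, object) record into N-Triples lines."""
    goal_iri = iri(goal)
    action_iri = iri(f"{goal}/action{step}")
    yield f"{goal_iri} {iri('hasAction')} {action_iri} ."
    yield f"{action_iri} {iri('verb')} {literal(verb)} ."
    yield f"{action_iri} {iri('hasObject')} {literal(obj)} ."


triples = list(record_to_ntriples("Make Omelet Soup", 1, "place", "water"))
```

A production pipeline would more likely rely on an RDF library for escaping and validation; the hand-written serializer is only meant to make the record-to-triple mapping visible.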
16. What is an Event?
Events are defined as situations that happen
They may be punctual (examples 1-2) or last for a period of time (examples 3-4)
States in which something holds true also count as events (example 5)
Examples
(1) Ferdinand Magellan, a Portuguese explorer, first reached the islands in search of spices.
(2) A fresh flow of lava, gas and debris erupted there Saturday.
(3) 11,024 people were evacuated to 18 disaster relief centers.
(4) “We’re expecting a major eruption,” he said in a telephone interview early today.
(5) Israel has been scrambling to buy more masks abroad, after a shortage of several hundred thousand gas masks.
17. Event Expressions
Events may be expressed in the following forms
Type: Example
Verb: A fresh flow of lava, gas and debris erupted there Saturday.
Noun: Israel will ask the United States to delay a military strike against Iraq until the Jewish state is fully prepared for a possible Iraqi attack.
Adjective: A Philippine volcano, dormant for six centuries, began exploding with searing gases, thick ash and deadly debris.
Predicative clause: “There is no reason why we would not be prepared,” Mordechai told the Yediot Ahronot daily.
Prepositional phrase: All 75 people on board the Aeroflot Airbus died.
18. Feature Sets
Basic Features
Named entity (NE) tags and an indication of whether the target
noun is prenominal or not.
Lexical Semantic Features (LS)
The set of target nouns’ lemmas and their WordNet hypernyms
Dependency-based Features (DF)
Nouns become events if they occur with a certain surrounding
context, namely, syntactic dependencies
Dependency-based Features sometimes need to be combined
with Lexical Semantic Features
19. Comparing with Previous Work
Improvements of about 0.22 in precision and 0.09 in recall over the state of the art.
            Llorens et al. (2010)   Proposed Method
Precision   0.727                   0.95
Recall      0.483                   0.577
F1          0.584                   0.718
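For readers checking the figures, the snippet below computes F1 from precision and recall, assuming the standard balanced definition (the harmonic mean); the slide itself does not state the formula. Applied to the proposed method's precision (0.95) and recall (0.577), it reproduces the reported F1.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall reported for the proposed method on this slide.
proposed_f1 = f1_score(0.95, 0.577)
print(round(proposed_f1, 3))  # 0.718
```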
21. Place Semantics
As GPS-enabled mobile devices have come into wide use,
location-based services are gaining popularity
But it is hard to provide appropriate context-aware
services to users when the system uses only the user’s location,
i.e. GPS (latitude, longitude)
In contrast to a location, a Place is a space to which people
attach meaning
If we know the meaning of a place, its Place Semantics, we
can offer users much more suitable services
22. Motivation
Scenario
Recently, Lena moved to Korea from the USA. She doesn’t know Korean culture and
geography at all because she has never lived outside the USA before.
Are there any places similar to Brooklyn Bowl,
which I often visited to relieve stress?
How about Olympic Bowling Alley?
No, thanks! It’s NOT the place I wanted.
Brooklyn Bowl is a bowling alley in New York City. People enjoy bowling, have
parties, drink beer and hold music events at Brooklyn Bowl.
23. Place Semantics Mining
People leave texts about why they visit and what they do when
they check in at a Place on Foursquare
We can learn how places are perceived from those texts
We apply LDA to extract Place Semantics
A document is composed of the texts written at a place.
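The document-construction step described on this slide (one document per place, made of the texts written there) can be sketched with the standard library alone. The check-in texts below are made up, whitespace tokenization is an assumption, and the LDA step itself is left to whichever topic-modeling library is used.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def build_place_documents(checkins: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Group check-in texts by place and return one token list per place."""
    docs: Dict[str, List[str]] = defaultdict(list)
    for place, text in checkins:
        # Naive lowercase whitespace tokenization (an assumption for this sketch).
        docs[place].extend(text.lower().split())
    return dict(docs)


# Made-up check-in texts, for illustration only.
checkins = [
    ("Brooklyn Bowl", "bowling and beer with friends"),
    ("Brooklyn Bowl", "great live music tonight"),
    ("XL Night Club", "party after work"),
]
documents = build_place_documents(checkins)
```

Each value in `documents` is the bag of tokens for one place; a topic model fitted over these documents would then give a topic distribution per place.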
24. Similarity between Two Places
Are there any places similar to Brooklyn Bowl,
which I often visited to relieve stress?
How about XL Night Club?
[Chart: topic proportions of Brooklyn Bowl and XL Night Club. Topics: Have a party & Drink beer; Enjoy a music show; After work; Eat food; Watch sports game; Others. Values shown: 41%, 32%, 27%, 5%, 3%, 26%, 7%, 11%, 18%, 30%.]
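One way to compare two places is to measure how close their topic distributions are. The slides do not say which measure is used, so the cosine similarity below is an assumption chosen for simplicity, and the two vectors are hypothetical values rather than the chart's figures.

```python
import math
from typing import Sequence


def cosine_similarity(p: Sequence[float], q: Sequence[float]) -> float:
    """Cosine of the angle between two topic-proportion vectors of equal length."""
    if len(p) != len(q):
        raise ValueError("vectors must have the same number of topics")
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return dot / norm if norm else 0.0


# Hypothetical topic proportions over the same four topics (not the slide's figures).
place_a = [0.5, 0.3, 0.1, 0.1]
place_b = [0.4, 0.3, 0.2, 0.1]
similarity = cosine_similarity(place_a, place_b)
```

A divergence designed for probability distributions, such as Jensen-Shannon, would be an equally reasonable choice; the point is only that places are compared by what people do there rather than by category labels.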
26. Application of Our Results
Semantic Annotation
Adds diversity and richness to text processing
28. KAIST Education for the World, Research for the Future
Jihee Ryu (jiheeryu@kaist.ac.kr)
http://jihee.kr
Yoonjae Jeong (hybris@kaist.ac.kr)
Eunyoung Kim (ey_kim@kaist.ac.kr)
Sung-Hyon Myaeng (myaeng@kaist.ac.kr)
http://ir.kaist.ac.kr/member/professor/
IR&NLP Lab
http://ir.kaist.ac.kr
29. Reference
1) Jung, Y., Ryu, J., Kim, K., Myaeng, S.H.: Automatic Construction of a Large-Scale
Situation Ontology by Mining How-to Instructions from the Web. Web Semantics:
Science, Services and Agents on the World Wide Web (2010)
2) Ryu, J., Jung, Y., Kim, K., Myaeng, S.H.: Automatic Extraction of Human Activity
Knowledge from Method-Describing Web Articles. 1st Workshop on Automated
Knowledge Base Construction (2010)
3) Park, K.C., Jeong, Y., Myaeng, S.H.: Detecting Experiences from Weblogs. 48th
Annual Meeting of the Association for Computational Linguistics (2010)
4) Ryu, J., Jung, Y., Myaeng, S.H.: Actionable Clause Detection from Non-imperative
Sentences in How-to Instructions: A Step for Actionable Information Extraction.
15th International Conference on Text, Speech and Dialogue (2012)
5) Jeong, Y., Myaeng, S.H.: Using Syntactic Dependencies and WordNet Classes for
Noun Event Recognition. Workshop on Detection, Representation, and Exploitation
of Events in the Semantic Web in conjunction with the 11th International Semantic
Web Conference 2012 (2012)
6) Carter, E., Donald, J.: Space and place: theories of identity and location. Lawrence
& Wishart Ltd. (1993)
30. Data Collection: How-to Articles
General How-to Articles
1,850,725 articles from eHow & 109,781 articles from wikiHow
eHow Category Group                       # doc
Computers & Software, Internet            323,289
Home Building & Design & Safety           307,277
Culture, Holidays, Hobbies, Weddings      238,143
Business, Investment, Personal Finance    153,458
Arts, Entertainment, Music                149,426
Family, Parenting, Pets, Plants           135,909
Cars, Car Repair                          108,386
Healthcare, Fitness, Sports               103,758
Education, Careers, Employment            103,717
Electronics                               101,403
Food, Recipes                             63,553
Fashion, Beauty                           62,406
Total (as of December 2011)               1,850,725
wikiHow Category Group                    # doc
Computers, Electronics                    18,265
Family Life, Home, Pets, Relationships    18,220
Hobbies, Holidays, Travel                 14,514
Health, Sports                            14,161
Youth                                     9,161
Personal Care, Style                      7,031
Education, Communications                 6,775
Finance, Business, Work                   6,729
Food, Entertaining                        6,099
Arts, Entertainment                       5,151
Cars, Vehicles                            2,316
Philosophy, Religion                      1,359
Total (as of December 2011)               109,781