Lab seminar 20100604
2. • Damon Horowitz, Sepandar D. Kamvar
• The Anatomy of a Large-Scale Social Search Engine
• WWW 2010
• Aardvark QA
• web
4. • Aardvark
• : Google
“Do you have any good babysitter recommendations in Palo Alto for my 6-year-old twins? I’m looking for somebody that won’t let them watch TV.”
5. • Crawler and Indexer
• Query Analyzer
• Ranking Function
• UI
– UI
7. $$s(u_i, u_j, q) = p(u_i \mid u_j) \cdot p(u_i \mid q) = p(u_i \mid u_j) \sum_{t \in T} p(u_i \mid t)\, p(t \mid q)$$
• p(ui|uj): quality score
• p(ui|q): relevance score
• u: user, q: question, t: topic
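The scoring formula on this slide can be sketched in a few lines of Python. This is only an illustration under assumptions: the function names, the dictionary layout, and all toy probabilities are made up for the example and do not come from the slides.

```python
from typing import Dict


def relevance(p_user_given_topic: Dict[str, float],
              p_topic_given_query: Dict[str, float]) -> float:
    """p(u_i | q) = sum over topics t of p(u_i | t) * p(t | q)."""
    return sum(
        p_user_given_topic.get(topic, 0.0) * p_tq
        for topic, p_tq in p_topic_given_query.items()
    )


def score(p_connect: float,
          p_user_given_topic: Dict[str, float],
          p_topic_given_query: Dict[str, float]) -> float:
    """s(u_i, u_j, q) = p(u_i | u_j) * p(u_i | q)."""
    return p_connect * relevance(p_user_given_topic, p_topic_given_query)


# Toy numbers (hypothetical): the question is mostly about babysitting.
p_t_given_q = {"babysitting": 0.7, "palo alto": 0.3}
# Candidate answerer u_i: p(u_i | t) for each topic.
p_ui_given_t = {"babysitting": 0.2, "palo alto": 0.5, "cooking": 0.9}
# Quality score p(u_i | u_j) between answerer and asker.
s = score(0.4, p_ui_given_t, p_t_given_q)
```

Topics that do not appear in p(t|q), such as "cooking" here, contribute nothing to the relevance score, so only the overlap between the question's topics and the candidate's topics matters.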
8. P(ui|t)
• $$p(u_i \mid t) = \frac{p(t \mid u_i)\, p(u_i)}{p(t)}$$
• $$s(t \mid u_i) = p(t \mid u_i) + \gamma \sum_{u \in U} p(t \mid u)$$
• facebook
• blog
• /twitter
$$\sum_{t \in T} p(t \mid u_i) = 1$$
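A minimal Python sketch of the two steps on this slide, smoothing a user's topic profile and inverting it with Bayes' rule. It assumes that U is simply every user in a small toy table, that p(t) is obtained by marginalizing over users, and that the smoothed scores are renormalized so they sum to 1 over topics. The names `smooth_topics` and `invert`, the value of gamma, and the data are all hypothetical.

```python
from typing import Dict

Profiles = Dict[str, Dict[str, float]]  # user -> {topic: p(t | u)}


def normalize(dist: Dict[str, float]) -> Dict[str, float]:
    """Rescale non-negative weights so that they sum to 1."""
    total = sum(dist.values())
    return {k: v / total for k, v in dist.items()} if total > 0 else dict(dist)


def smooth_topics(profiles: Profiles, user: str, gamma: float) -> Dict[str, float]:
    """s(t | u_i) = p(t | u_i) + gamma * sum_u p(t | u), renormalized over t."""
    topics = {t for p in profiles.values() for t in p}
    raw = {
        t: profiles[user].get(t, 0.0)
        + gamma * sum(p.get(t, 0.0) for p in profiles.values())
        for t in topics
    }
    return normalize(raw)


def invert(profiles: Profiles, prior: Dict[str, float]) -> Dict[str, Dict[str, float]]:
    """p(u_i | t) = p(t | u_i) p(u_i) / p(t), with p(t) = sum_u p(t | u) p(u)."""
    topics = {t for p in profiles.values() for t in p}
    result: Dict[str, Dict[str, float]] = {}
    for t in topics:
        p_t = sum(profiles[u].get(t, 0.0) * prior[u] for u in profiles)
        result[t] = (
            {u: profiles[u].get(t, 0.0) * prior[u] / p_t for u in profiles}
            if p_t > 0 else {}
        )
    return result


# Toy data (hypothetical): two users with uniform prior p(u).
profiles = {
    "alice": {"cooking": 0.5, "travel": 0.5},
    "bob": {"cooking": 1.0},
}
prior = {"alice": 0.5, "bob": 0.5}
smoothed_alice = smooth_topics(profiles, "alice", gamma=0.1)
posterior = invert(profiles, prior)
```

In this toy setup bob ends up with two thirds of the probability mass for "cooking", because his whole profile is concentrated on that topic.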
10. P(t|q) :
• Non Question Classifier
• Inappropriate Question Classifier
• Trivial Question Classifier
• Location Sensitive Classifier
11. P(t|q) :
– Keyword Match Topic Mapper
– Taxonomy Topic Mapper
  • SVM 3000
– Salient Term Topic Mapper
  • tf-idf
– User Tag Topic Mapper
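To make the tf-idf bullet concrete, here is a rough Python sketch of how salient terms in a question could be weighted and normalized into a distribution that plays the role of p(t|q). It is not the mapper described in the talk: treating single words as topics, the whitespace tokenizer, the smoothed idf formula, and the tiny corpus are all assumptions chosen for illustration.

```python
import math
from collections import Counter
from typing import Dict, List

PUNCT = ".,?!\"'"


def tokenize(text: str) -> List[str]:
    """Lowercase whitespace tokenizer that strips surrounding punctuation."""
    return [w.strip(PUNCT).lower() for w in text.split() if w.strip(PUNCT)]


def tfidf_topics(question: str, corpus: List[str]) -> Dict[str, float]:
    """Weight each question term by tf * idf, then normalize to sum to 1."""
    docs = [set(tokenize(d)) for d in corpus]
    n_docs = len(docs)
    term_counts = Counter(tokenize(question))
    weights: Dict[str, float] = {}
    for term, count in term_counts.items():
        doc_freq = sum(1 for d in docs if term in d)
        # Smoothed idf (an assumption): rare terms get larger weights.
        idf = math.log((1 + n_docs) / (1 + doc_freq)) + 1.0
        weights[term] = count * idf
    total = sum(weights.values())
    return {t: w / total for t, w in weights.items()} if total > 0 else {}


# Toy corpus of earlier questions (hypothetical).
corpus = [
    "Where can I find a good restaurant?",
    "Can you recommend a good movie?",
    "Any good babysitter in Palo Alto?",
]
p_t_given_q = tfidf_topics("good babysitter recommendations", corpus)
```

Because "good" appears in every toy document, it receives the lowest weight, while rarer terms such as "babysitter" dominate the resulting distribution.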
12.
– Topic Expertise: p(ui|q)
– Connectedness: p(ui|uj)
– Availability:
16. •
– Google PC
• Mobile
Google Aardvark
– Google Aardvark
17. • Aardvark: 18.6 words, 98.1%
• 2.2 to 2.9 words, 57 to 63%
21. • 97.7% 3
• 174,605
• 1,199,323
22. • Google
– 200 Aardvark
– Aardvark google
5
– 10
Aardvark   5   71.5%   3.93 ± 1.23
Google     2   70.5%   3.07 ± 1.46
24. • “ ” Aardvark
• Aardvark
• Aardvark
• “ ”
•