An overview of issues and early work on combining human computation and scalable computing to tackle big data analytics problems. Includes a survey of relevant projects underway at the UC Berkeley AMPLab.
9. Useful Taxonomies
• Doan, Halevy, Ramakrishnan (Crowdsourcing), CACM, April 2011
– Nature of collaboration (implicit vs. explicit)
– Architecture (standalone vs. piggybacked)
– Must recruit users/workers? (yes or no)
– What do users/workers do?
• Bederson & Quinn (Human Computation), CHI '11
– Motivation (pay, altruism, enjoyment, reputation)
– Quality control (many mechanisms)
– Aggregation (how are results combined?)
– Human skill (visual recognition, language, …)
– …
10. Types of Tasks
Task Granularity    Examples
Complex Tasks       • Build a website
                    • Develop a software system
                    • Overthrow a government?
Simple Projects     • Design a logo and visual identity
                    • Write a term paper
Macro Tasks         • Write a restaurant review
                    • Test a new website feature
                    • Identify a galaxy
Micro Tasks         • Label an image
                    • Verify an address
                    • Simple entity resolution
Inspired by the report "Paid Crowdsourcing," Smartsheet.com, September 15, 2009.
20. Not Exactly Crowdsourcing, but…
"The hope is that, in not too many years, human brains and computing machines will be coupled together very tightly, and that the resulting partnership will think as no human brain has ever thought and process data in a way not approached by the information-handling machines we know today."
– J.C.R. Licklider, "Man-Computer Symbiosis," 1960
21. AMP: Integrating Diverse Resources
Algorithms: Machine Learning and Analytics
People: Crowdsourcing & Human Computation
Machines: Cloud Computing
22. The Berkeley AMPLab
• Goal: a data analytics stack integrating A, M & P
• BDAS: released as BSD/Apache open source
• 6-year duration: 2011–2017
• 8 CS faculty
• Directors: Franklin (DB), Jordan (ML), Stoica (Sys)
• Industrial support & collaboration
• NSF Expedition and DARPA XData
23. People in AMP
• Long-term goal: make people an integrated part of the system!
• Leverage human activity
• Leverage human intelligence
[Figure: people exchange questions/answers and activity data with Machines + Algorithms]
• Current AMP People projects:
– Carat: Collaborative Energy Debugging
– CrowdDB: "The World's Dumbest Database System"
– CrowdER: Hybrid Computation for Entity Resolution
– CrowdQ: Hybrid Unstructured Query Answering
24. Carat: Leveraging Human Activity
~500,000 downloads to date
A. J. Oliner et al., Collaborative Energy Debugging for Mobile Devices, Workshop on Hot Topics in System Dependability (HotDep), 2012.
25. Carat: How it works
Collaborative Detection of Energy Bugs
26. Leveraging Human Intelligence
First Attempt: CrowdSQL
[Figure: CrowdDB architecture — conventional components (Parser, Optimizer, Executor, Statistics, Files/Access Methods over Disk 1 and Disk 2) extended with crowd components (Turker Relationship Manager, Metadata Manager, UI Form Creation/Editor, UI Template Manager, HIT Manager)]
See also: Qurk (MIT), Deco (Stanford)
CrowdDB: Answering Queries with Crowdsourcing, SIGMOD 2011
Query Processing with the VLDB Crowd, VLDB 2011
27. DB-hard Queries
Company_Name              Address                    Market Cap
Google                    Googleplex, Mtn. View, CA  $210Bn
Intl. Business Machines   Armonk, NY                 $200Bn
Microsoft                 Redmond, WA                $250Bn

SELECT Market_Cap
FROM Companies
WHERE Company_Name = "IBM"

Number of rows: 0
Problem: Entity Resolution
28. DB-hard Queries
Company_Name              Address                    Market Cap
Google                    Googleplex, Mtn. View, CA  $210Bn
Intl. Business Machines   Armonk, NY                 $200Bn
Microsoft                 Redmond, WA                $250Bn

SELECT Market_Cap
FROM Companies
WHERE Company_Name = "Apple"

Number of rows: 0
Problem: Closed-World Assumption
29. DB-hard Queries
SELECT Image
FROM Pictures
WHERE Image CONTAINS "Good Looking Dog"

Number of rows: 0
Problem: Subjective Comparison
30. Leveraging Human Intelligence
First Attempt: CrowdSQL
Where to use the crowd:
• Cleaning and disambiguation
• Finding missing data
• Making subjective comparisons
[Figure: CrowdDB architecture, as on slide 26]
CrowdDB: Answering Queries with Crowdsourcing, SIGMOD 2011
Query Processing with the VLDB Crowd, VLDB 2011
33. CrowdSQL
DDL Extensions:

Crowdsourced columns:
CREATE TABLE company (
  name STRING PRIMARY KEY,
  hq_address CROWD STRING);

Crowdsourced tables:
CREATE CROWD TABLE department (
  university STRING,
  department STRING,
  phone_no STRING)
  PRIMARY KEY (university, department);

DML Extensions:

CrowdEqual:
SELECT *
FROM companies
WHERE Name ~= "Big Blue";

CROWDORDER operators (currently UDFs):
SELECT p FROM picture
WHERE subject = "Golden Gate Bridge"
ORDER BY CROWDORDER(p, "Which pic shows better %subject");
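
To make the semantics concrete, here is a minimal Python sketch of how a CrowdEqual predicate might be resolved with redundant HITs and majority voting. This is illustrative only, not CrowdDB's actual implementation; ask_workers is a hypothetical stand-in for posting a question to a platform such as Mechanical Turk, and the replication factor of 3 mirrors the experiment on the next slide.

# Illustrative sketch: resolving a CrowdEqual (~=) predicate by majority vote.
# ask_workers() is a hypothetical callable that posts a HIT and returns the
# workers' answers; CrowdDB's real pipeline is considerably more elaborate.
from collections import Counter

def crowd_equal(value_a, value_b, ask_workers, replication=3):
    """Return True if a majority of workers say the two values match."""
    question = f'Does "{value_a}" refer to the same entity as "{value_b}"?'
    answers = ask_workers(question, assignments=replication)  # e.g. ["yes", "no", "yes"]
    votes = Counter(a.strip().lower() for a in answers)
    return votes["yes"] > replication / 2

# Usage: crowd_equal("Intl. Business Machines", "Big Blue", ask_workers)
# would let the query WHERE Name ~= "Big Blue" match the IBM row above.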
34. CrowdDB Query: Picture ordering
Query:
SELECT p FROM picture
WHERE subject = "Golden Gate Bridge"
ORDER BY CROWDORDER(p, "Which pic shows better %subject");

[HIT UI: "Which picture visualizes better: Golden Gate Bridge?" with candidate pictures and a Submit button]

Data size: 30 subject areas, with 8 pictures each
Batching: 4 orderings per HIT
Replication: 3 assignments per HIT
Price: 1 cent per HIT
Evaluated against: turker votes, turker ranking, expert ranking
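
For concreteness, the back-of-the-envelope HIT arithmetic implied by these parameters, assuming all pairwise comparisons within each subject (this assumption reproduces the 210 HITs mentioned in the speaker notes; pricing is assumed to be per assignment):

# Back-of-the-envelope HIT count for the picture-ordering experiment.
subjects = 30          # subject areas
pics_per_subject = 8   # pictures to order within each subject
batch = 4              # pairwise orderings per HIT
assignments = 3        # replication: workers per HIT

pairs_per_subject = pics_per_subject * (pics_per_subject - 1) // 2  # 28 comparisons
hits_per_subject = pairs_per_subject // batch                       # 7 HITs
total_hits = subjects * hits_per_subject                            # 210 HITs
total_assignments = total_hits * assignments                        # 630 assignments
total_cost_cents = total_assignments * 1                            # at 1 cent each

print(total_hits, total_assignments, total_cost_cents)  # 210 630 630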
35. User Interface vs. Quality
[Figure: three query plans for finding professor Carey's department phone number — an MTJoin(Professor, Dep) on p.dep = d.name evaluated "Department first" or "Professor first," versus a single de-normalized MTProbe — each with its generated "Please fill out the missing professor/department data" HIT form]
• Department first: ≈10% error rate
• Professor first: ≈10% error rate
• De-normalized probe: ≈80% error rate
38. What Does This Query Mean?
SELECT COUNT(*) FROM IceCreamFlavors
Trushkowsky et al., Crowdsourcing Enumeration Queries, ICDE 2013 (to appear).
39. Estimating Completeness
SELECT COUNT(*) FROM US_States
US states using Mechanical Turk; states are unique items.
Species-estimation techniques perform well on average:
• Uniform under-predicts slightly (coefficient of variation = 0.5)
• Decent estimate after 100 HITs
[Plot: average number of unique answers vs. number of responses (HITs); the curve climbs toward the true value of 50 states within roughly the first 100 responses]
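
One classic species-richness technique that can be applied to a stream of crowd answers is the Chao1 estimator. The sketch below illustrates the general idea; it is not necessarily the exact estimator evaluated in the paper, which studies several estimators and corrections.

# Chao1 species-richness estimate from a stream of crowd answers:
#   S_est = S_obs + f1^2 / (2 * f2)
# where f1 and f2 are the counts of answers seen exactly once and twice.
from collections import Counter

def chao1(answers):
    freq = Counter(a.strip().lower() for a in answers)
    s_obs = len(freq)                             # unique answers so far
    f1 = sum(1 for c in freq.values() if c == 1)  # singletons
    f2 = sum(1 for c in freq.values() if c == 2)  # doubletons
    if f2 == 0:
        return s_obs + f1 * (f1 - 1) / 2.0        # bias-corrected variant
    return s_obs + f1 * f1 / (2.0 * f2)

# e.g. chao1(["alaska", "ohio", "ohio", "texas", ...]) approaches 50
# as responses accumulate for the US-states query.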
40. Estimating Completeness
SELECT COUNT(*) FROM IceCreamFlavors
• Ice cream flavors:
– Estimators don't converge
– Very highly skewed (CV = 5.8)
– Can detect that the number of HITs is insufficient (we are still at the beginning of the curve)
[Plot: ice cream flavors; workers give few, short lists of flavors (e.g., "alumni swirl, apple cobbler crunch, arboretum breeze, …" from the Penn State Creamery)]
41. Pay-as-you-go
• "I don't believe it is usually possible to estimate the number of species... but only an appropriate lower bound for that number. This is because there is nearly always a good chance that there are a very large number of extremely rare species." – Good, 1953
• So instead, we can ask: "What's the benefit of m additional HITs?"

Ice cream after 1,500 HITs:
m     Actual   Shen    Spline
10    1        1.79    1.62
50    7        8.91    8.22
200   39       35.4    32.9
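
The "Shen" column refers to a predictor in the style of Shen, Chao & Lin (2003) for the expected number of new species in m further samples. The sketch below is my reading of that formula; the talk's exact implementation may differ.

# Expected number of NEW answers in m additional HITs (Shen et al. style):
#   E[new] = f0_hat * (1 - (1 - f1 / (n * f0_hat + f1)) ** m)
# where f0_hat = f1^2 / (2 * f2) is the Chao1 estimate of unseen answers,
# f1/f2 are singleton/doubleton counts, and n is the number of answers so far.
def expected_new_answers(n, f1, f2, m):
    if f1 == 0 or f2 == 0:
        return 0.0  # degenerate sample; no estimate possible
    f0_hat = f1 * f1 / (2.0 * f2)
    return f0_hat * (1.0 - (1.0 - f1 / (n * f0_hat + f1)) ** m)

# With the ice-cream data after 1,500 HITs, this style of estimate yields
# the "Shen" column above (e.g. ~1.79 new flavors for m = 10 more HITs).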
43. Hybrid Entity Resolution
Threshold = 0.2
#Pairs = 8,315
#HITs = 508
Cost = $38.10
Time = 4.5 h
Time (QT) = 20 h
J. Wang et al., CrowdER: Crowdsourcing Entity Resolution, PVLDB 2012.
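
The hybrid idea is that a cheap machine-computed similarity prunes the vast majority of candidate pairs, and only sufficiently similar pairs become HITs for the crowd to verify. A minimal sketch using Jaccard similarity over word tokens (an assumed choice; CrowdER supports other similarity functions and also batches the surviving pairs into HITs):

# Hybrid entity resolution: machine similarity prunes, crowd verifies.
def jaccard(a, b):
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0

def candidate_pairs(records, threshold=0.2):
    """Keep only the pairs similar enough to be worth a crowd question."""
    pairs = []
    for i in range(len(records)):
        for j in range(i + 1, len(records)):
            if jaccard(records[i], records[j]) >= threshold:
                pairs.append((records[i], records[j]))
    return pairs  # e.g. 8,315 surviving pairs -> 508 HITs after batching

# Each surviving pair is then shown to workers ("Do these refer to the same
# entity?"), with replication and voting as in the CrowdDB slides above.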
44. CrowdQ: Query Generation
• Help find answers to unstructured queries
– Approach: generate a structured query via templates
• Machines do parsing and ontology lookup
• People do the rest: verification, entity extraction, etc.
Demartini et al., CrowdQ: Crowdsourced Query Understanding, CIDR 2013 (to appear).
46. Generic Architecture
[Figure: an application running on a Hybrid Platform that spans crowd and cloud resources]
"Middleware is the software that resides between an application and the underlying architecture. The goal of middleware is to facilitate the development of applications by providing higher-level abstractions for better programmability, performance, scalability, security, and a variety of essential features." – Middleware 2012 web page
47. The Challenge
Some issues:
• Incentives
• Latency & prediction
• Failure modes
• Work conditions
• Interface
• Task structuring
• Task routing
• …
48. Can you incentivize workers?
http://waxy.org/2008/11/the_faces_of_mechanical_turk/
50. Can you trust the crowd?
On Wikipedia, "any user can change any entry, and if enough users agree with them, it becomes true."
"The Elephant population in Africa has tripled over the past six months." [1]
Wikiality: reality as decided on by majority rule. [2]
[1] http://en.wikipedia.org/wiki/Cultural_impact_of_The_Colbert_Report
[2] http://www.urbandictionary.com/define.php?term=wikiality
51. Answer Quality Approaches
• Some general techniques (a sketch combining two of them follows below):
– Approval rate / demographic restrictions
– Qualification test
– Gold sets / honey pots
– Redundancy and voting
– Statistical measures and bias reduction
– Verification/review
• Query-specific techniques
• Worker relationship management
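
As one concrete combination of two techniques from the list (gold sets and redundancy with voting), here is an illustrative sketch; the 0.7 accuracy threshold and the function names are assumptions, not from the talk.

# Gold sets + weighted voting, two of the techniques listed above.
from collections import defaultdict

def worker_accuracy(worker_answers, gold):
    """Fraction of gold ('honey pot') questions a worker answered correctly."""
    correct = sum(1 for q, a in worker_answers.items() if q in gold and a == gold[q])
    seen = sum(1 for q in worker_answers if q in gold)
    return correct / seen if seen else 0.0

def weighted_vote(answers_by_worker, accuracies, min_accuracy=0.7):
    """Drop unreliable workers, then weight the rest by gold-set accuracy."""
    scores = defaultdict(float)
    for worker, answer in answers_by_worker.items():
        acc = accuracies.get(worker, 0.0)
        if acc >= min_accuracy:
            scores[answer] += acc
    return max(scores, key=scores.get) if scores else None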
52. Can you organize the crowd?
Soylent, a prototype word processor with a crowd inside:
• Independent agreement to identify patches (sketched below)
• Randomized order of suggestions
[Bernstein et al., Soylent: A Word Processor with a Crowd Inside, UIST 2010]
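
The "independent agreement" step can be sketched as keeping only the text spans that at least k workers flagged independently, so a single lazy or malicious worker cannot inject bad work. This is a simplification of the paper's Find-Fix-Verify pattern; k = 2 and the overlap test are assumptions.

# Independent agreement (a simplified Find-Fix-Verify "Find" stage):
# keep a flagged patch only if >= k workers independently selected an
# overlapping span of the document.
def agreed_patches(selections, k=2):
    """selections: list of (start, end) spans flagged by different workers."""
    kept = []
    for start, end in selections:
        overlapping = sum(1 for s, e in selections if s < end and start < e)
        if overlapping >= k:  # the span counts itself, so k=2 means one other agrees
            kept.append((start, end))
    return kept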
54. Can you build a low-latency crowd?
From: M. S. Bernstein, J. Brandt, R. C. Miller, D. R. Karger, "Crowds in Two Seconds: Enabling Realtime Crowdsourced Applications," UIST 2011.
56. For More Information
Crowdsourcing tutorials:
• P. Ipeirotis, Managing Crowdsourced Human Computation, WWW '11, March 2011.
• O. Alonso, M. Lease, Crowdsourcing for Information Retrieval: Principles, Methods, and Applications, SIGIR, July 2011.
• A. Doan, M. Franklin, D. Kossmann, T. Kraska, Crowdsourcing Applications and Platforms: A Data Management Perspective, VLDB 2011.
AMPLab: amplab.cs.berkeley.edu
• Papers
• Project descriptions and pages
• News updates and blogs
Speaker notes
• For the database administrator, zero rows is the correct answer, but for the CEO it is not really understandable.
• Equal is not a good fit.
• 210 HITs; it took 68 minutes to complete the whole experiment.
• Lead off by saying that a heavily skewed distribution will be difficult to estimate; only a lower bound is possible (use the quote). Instead, reason about the cost vs. benefit tradeoff: when you ask a slightly different question, you can still make progress!