Scheduling Human Intelligence
Tasks in Multi-Tenant
Crowd-Powered Systems
Djellel Eddine Difallah, University of Fribourg, CH
Gianluca Demartini, University of Sheffield, UK
Philippe Cudré-Mauroux, University of Fribourg, CH
Introduction
• Crowdsourcing relies on a large pool of humans to perform
complex tasks (paid workers, volunteers, players, etc.)
• A crowdsourcing platform (e.g., CrowdFlower, Amazon
MTurk) lets requesters tap into a pool of paid workers
as a shared resource
• Requesters publish batches of similar tasks to be
completed in exchange for a monetary reward
• Workers can arrive and leave at any time and can
choose to work on an arbitrary subset of the tasks
Introduction
Observations
• A few workers perform many tasks, followed by a
long tail of workers who perform few tasks [Ipeirotis
2010; Franklin et al. 2011]
• Large jobs are fast at the beginning, then lose
momentum toward the end [Difallah et al. 2014]
• We suspect that this leads to batches being treated
unequally, depending on batch size, freshness, requester,
and price [Difallah et al. 2015]
[Figure: (a) Batch distribution per size: normalized count vs. time (day), Jan 01 to Apr 01. (b) Cumulative throughput per batch size: normalized throughput vs. time (day), Jan 01 to Apr 01.]
Introduction
Data Analysis
• Most of the batches
present on AMT have
10 HITs or fewer
• The overall platform
throughput is
dominated by larger
batches
Batch size classes (figure legend): Tiny [0, 10], Small [10, 100], Medium [100, 1000], Large [1000, Inf]
Motivation
The case of Multi-Tenant Crowd-powered Systems (CPS)
• Definition: A CPS serves multiple customers/users (e.g., a
Crowd DBMS)
• The system posts a batch of tasks on the crowdsourcing
platform per user query
• The CPS is in constant competition to attract workers
• With itself — multiple tenants
• With other requesters
• Job starvation is problematic in business applications
Contributions
• We design a novel crowdsourcing system
architecture that allows job scheduling for a CPS
on top of a traditional crowdsourcing platform
• We devise a scheduling algorithm that embodies
a set of general design requirements
• We empirically evaluate our setup on Amazon
MTurk, with a real crowd and a set of scheduling
algorithms
HIT-Bundle
Definition
• Scheduling requires that
we have control over the
serving process of tasks
• A HIT-Bundle is a batch
that contains
heterogeneous tasks
• All tasks generated
by the CPS are published
through the HIT-Bundle
[Diagram: a HIT-Bundle containing Batch 1, Batch 2, Batch 3, and Batch 4]
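As a rough illustration of the definition above, a HIT-Bundle can be thought of as one published container that holds several tenants' batches. The sketch below is only a minimal model under that reading; the class names, fields, and methods (`Batch`, `HitBundle`, `add_batch`, `open_batches`) are hypothetical and are not taken from the authors' code.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Batch:
    """One tenant's batch of similar tasks (illustrative fields only)."""
    batch_id: str
    price: float                                   # reward per task, in dollars
    pending: deque = field(default_factory=deque)  # tasks not yet served
    running: int = 0                               # tasks currently assigned


@dataclass
class HitBundle:
    """A single published batch that wraps several heterogeneous batches."""
    batches: dict = field(default_factory=dict)

    def add_batch(self, batch_id, price, tasks):
        # Every batch generated by the CPS is registered in the bundle.
        self.batches[batch_id] = Batch(batch_id, price, deque(tasks))

    def open_batches(self):
        """Batches that still have tasks waiting to be served."""
        return [b for b in self.batches.values() if b.pending]


bundle = HitBundle()
bundle.add_batch("B1", 0.10, ["b1-t1", "b1-t2"])
bundle.add_batch("B2", 0.05, ["b2-t1"])
open_ids = [b.batch_id for b in bundle.open_batches()]
print(open_ids)
```

Because workers only ever see the bundle, the system rather than the worker decides which batch the next task comes from, which is the control that scheduling needs.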
HIT-Bundle
Micro Experiment
• Comparison of
batch execution
time using different
grouping strategies
• Distinct batches
• Combined in a
HIT-Bundle
[Figure: #HITs remaining vs. time (seconds) for batches B6 and B7, run as distinct batches (B6, B7) and combined in a HIT-Bundle (B6 - Bundle, B7 - Bundle).]
Proposed CPS
Architecture
[Figure: architecture diagram. Labeled components: Multi-Tenant Crowd-Powered System; Crowdsourcing Decision Engine; HIT-Bundle Manager; Batch Catalog (Batch A $$, Batch B $$$, Batch C $, ..); Batch Input Merger; Resource Tracker; Progress Monitor; Crowdsourcing App; HIT Manager; HIT Scheduler with its queue; HIT Results Aggregator; External HIT Page; Crowdsourcing Platform (API); Human Workers. Labeled flows and inputs: crowdsourced queries, batch merging, HIT-Bundle creation/update, status, HIT collection and reward, config_file.]
Scheduling for the Crowd
Design Guidelines
• (R1) Runtime Scalability: Adopt a runtime scheduler that a)
dynamically adapts to the current availability of the crowd, and b)
scales to make real-time scheduling decisions as the work
demand grows
• (R2) Fairness: The scheduler must provide steady progress to
large requests without blocking or starving the smaller requests
• (R3) Priority: The scheduler must be sensitive to clients who
have higher priority (e.g., those who pay more)
• (R4) Human Awareness: Unlike machines, people's performance is
affected by many factors, including context switching, training
effects, boredom, task difficulty, and interestingness
(Weighted) Fair Scheduler
• Fair Scheduling FS (R1) (R2):
• Keep track of how many tasks per
batch are currently assigned
(running_tasks)
• Assign a task from the batch with the minimum running_tasks
• The Weighted Fair Sharing WFS variant
(R3):
• Compute a weight based on priority
(e.g., price)
• weight(Bj) = p(Bj)/sum(p(B))
• Assign a task from the batch with the
minimum running_tasks/weight
• Pros: ensures that each batch receives a
proportional share of the available workers
• Cons: does not satisfy (R4) Human
Awareness
[Diagram: a HIT-Bundle with 7 tasks running. A worker calls get_task(), and FS and WFS each return the task they select. Example batches: p = $0.10, w = 0.5; p = $0.05, w = 0.25; p = $0.05, w = 0.25.]
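The two selection rules on this slide can be sketched in a few lines. This is only an illustration, not the authors' implementation: the dictionary layout and the function names `fs_pick` and `wfs_pick` are hypothetical. The prices match the slide's example (weights 0.5, 0.25, 0.25), while the split of the 7 running tasks into 3, 2, and 2 is an assumption made for the example.

```python
def open_batches(batches):
    """Batches that still have tasks left to hand out."""
    return [b for b in batches if b["pending"] > 0]


def fs_pick(batches):
    """Fair Scheduling: serve the batch with the fewest running tasks."""
    candidates = open_batches(batches)
    return min(candidates, key=lambda b: b["running"]) if candidates else None


def wfs_pick(batches):
    """Weighted Fair Sharing: serve the batch with the smallest
    running_tasks / weight, where weight(Bj) = p(Bj) / sum(p(B))."""
    candidates = open_batches(batches)
    if not candidates:
        return None
    total_price = sum(b["price"] for b in batches)
    return min(candidates,
               key=lambda b: b["running"] / (b["price"] / total_price))


# Prices from the slide; the running counts (3 + 2 + 2 = 7) are assumed.
batches = [
    {"id": "B1", "price": 0.10, "running": 3, "pending": 10},
    {"id": "B2", "price": 0.05, "running": 2, "pending": 10},
    {"id": "B3", "price": 0.05, "running": 2, "pending": 10},
]

fs_choice = fs_pick(batches)["id"]
wfs_choice = wfs_pick(batches)["id"]
print(fs_choice, wfs_choice)
```

With these assumed counts, FS picks B2 (fewest running tasks, first on a tie), whereas WFS picks B1: its ratio 3 / 0.5 = 6 is below 2 / 0.25 = 8 for the other two, so the higher-paying batch is still under its weighted share.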
Worker Context Switch
Micro Experiment
• We run a HIT-Bundle with
heterogeneous tasks
• Compute the average execution
time for each HIT
• RR: Round Robin, the task type
changes every time
• SEQ10 / SEQ25: task types
alternate every 10 and
every 25 tasks, respectively
• The mean task execution time
is significantly lower for SEQ25
[Figure: execution time per HIT (seconds) by experiment type (RR, SEQ10, SEQ25); significance marker: ** (p-value = 0.023).]
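To make the three conditions concrete, the sketch below builds the order in which task types would be served when the type changes every `block_size` tasks, and counts the resulting context switches. The helper names, the two task types "A" and "B", and the total of 50 tasks are assumptions for illustration; the experiment's actual task mix is not given on the slide.

```python
from itertools import cycle


def interleave(task_types, block_size, total):
    """Return `total` task types, switching type every `block_size` tasks."""
    order = []
    types = cycle(task_types)
    while len(order) < total:
        order.extend([next(types)] * block_size)
    return order[:total]


def count_switches(order):
    """Number of times two consecutive tasks have different types."""
    return sum(1 for a, b in zip(order, order[1:]) if a != b)


task_types = ["A", "B"]  # hypothetical task types
rr = interleave(task_types, 1, 50)      # RR: type changes every task
seq10 = interleave(task_types, 10, 50)  # SEQ10: type changes every 10 tasks
seq25 = interleave(task_types, 25, 50)  # SEQ25: type changes every 25 tasks

switches = [count_switches(o) for o in (rr, seq10, seq25)]
print(switches)
```

Under these assumptions a worker going through all 50 tasks would switch type 49 times with RR, 4 times with SEQ10, and once with SEQ25.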
Worker Conscious Fair
Scheduling WCFS
• Goal: Reduce the context switching caused by having
the worker continuously switch task types
• We modify Fair Sharing with Delay Scheduling [Zaharia
et al. 2010]
• A task gives up its priority up to K times, waiting until a worker
who just completed a similar task becomes available again
• Pros: satisfies all our design requirements; a worker
receives longer sequences of similar tasks
• Cons: K needs to be set
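One possible reading of this rule, in the spirit of delay scheduling, is sketched below. It is not the authors' implementation: the function name `wcfs_pick`, the per-batch `skips` counter, and the fallback to the fairest batch when nothing else qualifies are all assumptions. Batches are visited in fair-share order; a batch that does not match the worker's previous task type is skipped until it has been skipped K times.

```python
def wcfs_pick(batches, last_type, k):
    """Pick a batch for a worker whose previous task came from `last_type`.

    Assumed rule: visit batches in fair-share order (fewest running tasks
    first). Serve a batch if it matches the worker's last task type, or if
    it has already given up its turn `k` times; otherwise skip it once more.
    """
    candidates = sorted((b for b in batches if b["pending"] > 0),
                        key=lambda b: b["running"])
    for b in candidates:
        if last_type is None or b["id"] == last_type:
            b["skips"] = 0
            return b
        if b["skips"] >= k:
            b["skips"] = 0  # waited long enough: serve it despite the switch
            return b
        b["skips"] += 1
    # Assumed fallback: a human worker should not be left without a task.
    return candidates[0] if candidates else None


batches = [
    {"id": "B1", "running": 0, "pending": 5, "skips": 0},
    {"id": "B2", "running": 2, "pending": 5, "skips": 0},
]

# The worker just finished a B2 task; counts are left unchanged between
# calls to keep the trace short (a real scheduler would update them).
picks = [wcfs_pick(batches, "B2", k=2)["id"] for _ in range(3)]
print(picks)
```

In this trace B1 is the fairest choice but yields twice to keep the worker on B2, then is served on the third request. A larger K gives longer same-type sequences at the cost of fairness, which is why K has to be tuned.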
Experiments
Controlled Setup
• On Amazon Mechanical Turk (no simulations)
• HIT-Bundle with 5 different task types
• We artificially ensure that num_workers > 10
before starting an experiment
• We compare against basic schedulers: First In First
Out (FIFO), Round Robin (RR), and Shortest Job First
(SJF)
Controlled Experiments
Latency
All experiments are run in parallel
FIFO order: [B1, B2, B3, B4, B5]
SJF order: [B4, B3, B5, B2, B1], based on
previous evidence
• FIFO finishes jobs one after the other,
while SJF finishes the shortest jobs
first
• FS and RR offer a balanced
workforce
[Figure: (a) Batch latency: time (seconds) per batch (B1 to B5) for FIFO, FS, RR, and SJF. (b) Overall experiment latency: time (seconds) per scheduling scheme (FIFO, FS, RR, SJF).]
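The baseline schedulers differ only in the order in which they hand out tasks. The sketch below contrasts them using the FIFO and SJF batch orders quoted on this slide; the function names and the batch sizes (2 tasks each) are hypothetical and chosen only to keep the output short.

```python
def serve_order(batch_sizes, priority):
    """FIFO/SJF style: exhaust each batch before moving to the next one."""
    order = []
    for batch_id in priority:
        order.extend([batch_id] * batch_sizes[batch_id])
    return order


def round_robin_order(batch_sizes, priority):
    """RR: hand out one task per batch in turn until all are exhausted."""
    remaining = dict(batch_sizes)
    order = []
    while any(remaining.values()):
        for batch_id in priority:
            if remaining[batch_id] > 0:
                order.append(batch_id)
                remaining[batch_id] -= 1
    return order


fifo = ["B1", "B2", "B3", "B4", "B5"]     # order from the slide
sjf = ["B4", "B3", "B5", "B2", "B1"]      # order from the slide
sizes = {batch_id: 2 for batch_id in fifo}  # hypothetical batch sizes

fifo_order = serve_order(sizes, fifo)
sjf_order = serve_order(sizes, sjf)
rr_order = round_robin_order(sizes, fifo)
print(fifo_order[:3], sjf_order[:3], rr_order[:6])
```

FIFO and SJF serve one batch at a time, so batches late in the order wait for all earlier ones, while RR spreads tasks across all batches from the start.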
[Figure: (a) Vary the price: time (seconds) per batch (B1 to B5), with B2 priced at $0.02 vs. $0.05. (b) Vary the workforce: time (seconds) per batch (B1 to B5), with 10 vs. 20 workers.]
Experiments
Varying the Control Factors
The Weighted Fair Scheduler is used
• (a) Effect of increasing B2's
priority (price) on batch
execution time
• B2 executes faster
• (b) Effect of varying the number
of crowd workers involved in the
completion of the HIT batches
• The load is rebalanced (albeit
in different proportions), and
all batches complete
faster
Experiments in the Wild
Execution Trace
[Figure: number of active workers over time (12:20 to 12:50), one panel each for FS, Individual Batches, and WCFS.]
Conclusions
• Batch starvation in crowdsourcing is problematic for requesters
• We introduce a new scheduling layer that shares a pool of crowd
workers among multiple tenants of a crowd-powered system
• We perform evaluations in a real setup with real workers
• We show that a HIT-Bundle increases the overall throughput
• Our technique (Worker Conscious Fair Sharing), inspired by
large-scale data processing frameworks, minimises context switching
• Toward Service Level Agreement-aware scheduling for
crowdsourcing platforms.
Code: https://github.com/XI-lab/HIT-Scheduler

Crowd scheduling www2016
