Crowdsourcing satellite imagery:
          study of iterative vs. parallel models
                           Nicolas Maisonneuve, Bastien Chopard




                                                          Twitter: nmaisonneuve




Friday, September 21, 12                                                          1
Damage assessment after a humanitarian crisis




Port-au-Prince: 300K buildings assessed
                           in 3 months by 8 UNOSAT experts




Organizational challenges:
                           How to organize non-trained volunteers,
                                especially to enforce quality?



        Investigated scope:
        • Qualitative and quantitative study of 2 collaborative models inspired by
        computer science: iterative vs. parallel information processing

        • Controlled experiment to isolate quality = F(organisation), removing
        other parameters, e.g. training, task difficulty

        • This research does not study real-world collaborative practices but rather
        extreme/symbolic cases, to guide designers of collaborative systems



Tested Collaborative Models (1/2)
                                  iterative model




                       e.g. Wikipedia, OpenStreetMap, assembly lines
Tested Collaborative Models (2/2)
                                   parallel model




                                                 aggregation




             e.g. voting systems in society, distributed computing
Tested Collaborative Models (2/2)
                                   parallel model




            Older version (17th to mid-20th century): when "computers" were humans, often women
            (Mathematical Tables Project, 1938-1948)
Qualitative comparison
                            Iterative                      Parallel

   problem divisibility     No need to divide a complex    A complex problem needs to be
                            problem                        divided into easier pieces

   optimization tradeoff    Copying, emphasizing           Isolation, emphasizing
                            exploitation                   exploration

   quality mechanism        Sequential improvement         Redundancy + diversity of
                                                           opinions

   side effect              Path dependency effect +       Useless redundancy for obvious
                            sensitivity to vandalism       decisions + problem of aggregation




Controlled Experiment: web platform




                           Interface/instruction for the Parallel model

on 3 maps with different topologies
                    (annotated by 1 UNITAR expert)




Participants used for the experiments:
              Mechanical Turk as simulator




Data Quality Metrics

                 Quality of the collective output
                 • type I errors = p(wrong annotation)
                 • type II errors = p(missing a building)
                 • Consistency

                 Analogy with the information retrieval field:
                 • Precision = p(an annotation is a building)
                 • Recall = p(a building is annotated)
                 • F-measure = score mixing recall + precision
                 • (metrics adjusted with tolerance distance)
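
The metrics above can be sketched as follows. This is only a minimal illustration, not the authors' implementation: it assumes annotations and gold-standard buildings are 2D points, that an annotation counts as correct when it lies within a `tolerance` distance of a building not yet claimed by another annotation (greedy nearest-neighbour matching), and that the F-measure is the usual harmonic mean (F1). The function names are hypothetical.

```python
import math


def count_hits(annotations, buildings, tolerance):
    """Greedy one-to-one matching (an assumption): each annotation claims
    the nearest still-unclaimed building within the tolerance distance."""
    unclaimed = list(buildings)
    hits = 0
    for ax, ay in annotations:
        best_index, best_dist = None, tolerance
        for i, (bx, by) in enumerate(unclaimed):
            dist = math.hypot(ax - bx, ay - by)
            if dist <= best_dist:
                best_index, best_dist = i, dist
        if best_index is not None:
            unclaimed.pop(best_index)
            hits += 1
    return hits


def quality(annotations, buildings, tolerance):
    """Return (precision, recall, F1) of annotations against a gold standard."""
    hits = count_hits(annotations, buildings, tolerance)
    precision = hits / len(annotations) if annotations else 0.0
    recall = hits / len(buildings) if buildings else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure
```

With this reading, type I errors correspond to 1 - precision and type II errors to 1 - recall.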



Methodology for parallel model
                     Step 1 - collecting independent contributions:
                     N for (map1, map2, map3) = (121, 120, 113)




Methodology for parallel model
                       Step 2 - for each map,
       generating the set of groups of m=[1 to N] participants


  [Figure: example groups of participants for m=1, m=2 and m=3]


Methodology for parallel model
         Step 3 - for each group: aggregating + computing quality

 groups
of m = 2

                           Spatial Clustering of points + quorum




                 Compute Data Quality with Gold Standard

                             Precision          Recall             F-measure
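
Steps 2 and 3 could look roughly like the sketch below. It rests on several assumptions that the slides do not specify: groups are drawn by random sampling rather than enumerated exhaustively, clustering is a simple greedy centroid-based scheme with a fixed `radius`, and the quorum is the fraction of distinct group members that must support a cluster. All names are hypothetical.

```python
import math
import random


def sample_groups(contributions, m, n_groups, seed=0):
    """Draw n_groups random groups of m participants (sampling is an assumption).
    contributions maps a participant id to that participant's list of (x, y) points."""
    rng = random.Random(seed)
    ids = sorted(contributions)
    return [{pid: contributions[pid] for pid in rng.sample(ids, m)}
            for _ in range(n_groups)]


def centroid(cluster):
    """Mean position of the (participant_id, x, y) entries in a cluster."""
    return (sum(p[1] for p in cluster) / len(cluster),
            sum(p[2] for p in cluster) / len(cluster))


def cluster_points(points, radius):
    """Greedy clustering: a point joins the first cluster whose centroid
    lies within radius, otherwise it starts a new cluster."""
    clusters = []
    for pid, x, y in points:
        for cluster in clusters:
            cx, cy = centroid(cluster)
            if math.hypot(x - cx, y - cy) <= radius:
                cluster.append((pid, x, y))
                break
        else:
            clusters.append([(pid, x, y)])
    return clusters


def aggregate(group, radius, quorum):
    """Keep the centroid of every cluster supported by at least
    a quorum fraction of the distinct participants in the group."""
    points = [(pid, x, y) for pid, pts in group.items() for x, y in pts]
    kept = []
    for cluster in cluster_points(points, radius):
        voters = {p[0] for p in cluster}
        if len(voters) / len(group) >= quorum:
            kept.append(centroid(cluster))
    return kept
```

Each aggregated point set would then be scored against the gold standard, and the scores averaged over all groups of the same size m.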

The more = the better?
                              (parallel model)
      avg. F-measure




    Yes, but only up to a point:
    • (Adding more people won't change the consensus)
    • Limitation of Linus' law (compared to an iterative model, e.g.
    OpenStreetMap)
    • Wisdom != skill: we can't replace training with more people
Methodology for Iterative model




                           sample of an iterative process for map3




Methodology for Iterative model




 n instances
 of about m
  iterations

      Collected data for (map1, map2, map3) = (13, 21, 25)
              instances of about 10 iterations each
Methodology for Iterative model
            Step 2 - for each iteration, we compute the precision,
                     recall and F-measure across all the instances




                           Precision   Recall       F-measure
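
One possible reading of this step is sketched below: each instance is a list of per-iteration scores (for example F-measures), and the score at iteration i is averaged over the instances that reached that iteration. Averaging, and the handling of instances of unequal length, are assumptions; the function name is hypothetical.

```python
from statistics import mean


def mean_by_iteration(instances):
    """instances: list of per-iteration score lists, possibly of unequal length.
    Returns the mean score at each iteration index, computed over the
    instances that actually reached that iteration (an assumption)."""
    longest = max(len(scores) for scores in instances)
    return [mean(scores[i] for scores in instances if len(scores) > i)
            for i in range(longest)]
```

The same function can be applied separately to the precision, recall and F-measure series of a map.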

Interpretation of results / Comparison
               on data quality

                                  Parallel                    Iterative

   Accuracy - wrong annotations   Consensual results (*)      Error propagation

   Accuracy - missing buildings   Useless redundancy on       Accumulation of knowledge driving
                                  obvious buildings           attention to uncovered areas

   Consistency                    Redundancy                  Naive "last = best"

  (*) but parallel < iterative in difficult cases (map 2) (lack of consensus)

Side-objective: Measuring how the crowd spatially agrees
          Method: randomly take 2 participants, measure their
        spatial inter-agreement (e.g. ratio of matching points), and repeat
                              the process N times

                           A way to measure the intrinsic difficulty of a task
                                  (map 1 = easy, map 2 = quite hard)
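
The agreement measure could be sketched as below. The slide only says "ratio of points matching", so the exact ratio is an assumption: here two points match when they lie within a `tolerance` distance, and the ratio is matched pairs divided by the number of distinct locations marked by either participant (a Jaccard-style score). All names are hypothetical.

```python
import math
import random


def count_matches(a, b, tolerance):
    """Greedy one-to-one matching of points in a against points in b."""
    remaining = list(b)
    matches = 0
    for ax, ay in a:
        for i, (bx, by) in enumerate(remaining):
            if math.hypot(ax - bx, ay - by) <= tolerance:
                remaining.pop(i)
                matches += 1
                break
    return matches


def pair_agreement(a, b, tolerance):
    """Jaccard-style ratio (an assumption): matches / (|a| + |b| - matches)."""
    if not a and not b:
        return 1.0
    matches = count_matches(a, b, tolerance)
    return matches / (len(a) + len(b) - matches)


def crowd_agreement(contributions, tolerance, n_draws, seed=0):
    """Average agreement over n_draws randomly chosen pairs of participants."""
    rng = random.Random(seed)
    ids = sorted(contributions)
    total = 0.0
    for _ in range(n_draws):
        p, q = rng.sample(ids, 2)
        total += pair_agreement(contributions[p], contributions[q], tolerance)
    return total / n_draws
```

Under this sketch a lower average agreement would indicate a harder map.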
Future tracks
                     Impact of the organization beyond data
                     quality
                     • Energy / footprint needed to collectively solve a problem
                     • Participation sustainability
                     • Individual behavior (skill learning & enjoyment)
                     Skill complementarity:
                     Is the best group of 3 people made up of the 3 best people at the
                     individual level? The data says no!
                     Other symbolic organisations / mechanisms:
                     • Human cellular automata (cell = 1 person, who resubmits a task at
                     time t after being influenced by peers' results generated at time
                     t-1)
                     • Integration of game design / gamification
Crowdsourcing satellite imagery (Talk at Giscience2012)
