Presentation by Dr. Mike Dobson of TeleMapics LLC on Crowdsourcing and Map Compilation. This invited presentation was delivered at the New York Geospatial Summit on June 16, 2011 in Skaneateles, New York.
33. Structural Problems With Crowdsourcing. Issue 1: How many people will contribute? Sub-issue: How many collaborators are capable of contributing valid data? Weakness: Not enough credible data gatherers to validate the data.
34. Structural Problems With Crowdsourcing. Issue 2: Where are contributors located? Sub-issue: How does this distribution match the desired map coverage? Weakness: Not enough distribution/coverage to meet the need for comprehensive, current data. No reliable way to redirect their spatial focus.
35. Structural Problems With Crowdsourcing. Issue 3: How long does it take to get quality results over large areas of coverage? Sub-issue: Number and distribution of contributors not manageable. Weakness: Unknown, but lack of progress will become a significant problem for contributors. Even when you get the data, the lack of standards and effective quality control is a significant problem.
36. Comparison (Haklay and Ellul 2010)
Measure                                       Meridian 2    OSM 03/08     OSM 03/09     OSM 10/09
Total Length (m)                              261,678,924   169,608,142   267,950,190   324,127,812
Total Length With Attributes                  205,621,010   54,581,704    95,332,646    123,632,213
Number of "complete" cells                    112,385       30,702        57,622        73,092
Number of "complete" cells with attributes    92,467        6,670         13,595        19,719
Yep, this was my room at Finger Lakes Lodging - just kidding!
So pay attention if you want to see those sheets on your bed.
Outline of presentation
One more thing.
Way back in the last century---
But even today, we continue to screw up our map data. A bridge to Cozumel? Great idea but today it does not exist.
Wow, look at the beautiful Alpsee across from Crazy Ludwig's Castle - oh, it's not shown on the map.
The roads I am driving on must be imaginary ones, as they are not shown on the TomTom PND.
Gee - new ways to screw up maps!
This is an example of some recent field work in determining locations. The maps are from the Starbucks corporate website. We assumed that Starbucks would know the addresses of their shops and they did. In addition, their addresses provided the nearest cross street. The mapping (and geocoding) is by Microsoft, while I added additional information. The squat green placard, which was on the original graphic, is where Starbucks and Microsoft say the shops are located. I added two other pins to the same display. The red pin shows the actual location of the shop (I recorded a GPS coordinate at the door of each shop). The green pin shows the location of the cross street provided in the listing. In the limited sample of 8 shops, 6 were incorrectly located and the average error was 0.6 miles (see our blog for more information). If Starbucks can't help you find its own shop, what can we find using third parties who have less information to help determine where businesses are located?
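The kind of check described here, comparing a mapped location against a GPS fix taken at the door, can be sketched in a few lines. This is only an illustration: the coordinates are hypothetical and are not the shops from the field work, and the function names are made up for this example. The haversine formula is assumed as a reasonable way to get distance over short ranges.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two lat/lon points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def mean_position_error(pairs):
    """Average distance between each mapped location and its GPS-surveyed location."""
    errors = [haversine_miles(*mapped, *surveyed) for mapped, surveyed in pairs]
    return sum(errors) / len(errors)


# Hypothetical (mapped, surveyed) coordinate pairs, not the shops in the talk.
sample = [
    ((33.6000, -117.7000), (33.6050, -117.7000)),  # mapped pin is off by 0.005 degrees of latitude
    ((33.6100, -117.7100), (33.6100, -117.7100)),  # mapped pin matches the door
]
print(round(mean_position_error(sample), 2), "miles average error")
```

With a real sample you would feed in the geocoder's coordinate and the door coordinate for each shop, then look at both the average and the count of shops beyond some tolerance.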
How about that, Google has located the coastal town of Imperial Beach, CA in the inland town of Lemon Grove, CA.
Poor Nokia and OVI, they can't even find Providence, Rhode Island.
Navteq shows traffic flowing on a portion of I-195 that was closed months earlier and no traffic on the new route of I-195. How is that possible?
Look, having fixed the problem once, Navteq, at a later date, even showed route signs on the closed highway and traffic still flowing at normal speeds.
Of course, due to the use of out-of-date imagery, it looks like traffic is flowing on the roads. Unfortunately, the Rhode Island DOT has been deconstructing the road and, at the date this screenshot was captured, the overpasses no longer linked the road segments.
How many East Kilbride Expressways are there in the UK? Actually, there is only one.
And this is it - multi-lane, median, looks like an expressway should.
Hmmmm. This is another East Kilbride Expressway reputed to be in the Orkney Islands. Just doesn't look quite like an expressway, does it?
Spending money will not solve this problem - and there is little money to spend. So, why not try crowdsourcing?
Contrast social search and local-social search
Definition
What it is and isn't
Well, this isn't quite true either - you don't get enough eyes and you specifically may not get enough local eyes. So let's see why this is incorrect. And WHY YOU NEED LOCAL EYES.
Surowiecki, p. 278 (2004, 2005)
Surowiecki (2004, 2005), various
Data quality considerations
Well, has it worked?
Source: Muki Haklay 2009, UK data. It appears that approximately 50% of the data in the OSM United Kingdom database were contributed by less than 30 of the contributors, raising questions on whether OSM is a good example of the notions behind crowdsourcing. (Graph from Haklay 2009)
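If you want to run the same kind of concentration check on your own contributor logs, a rough sketch might look like the following. The edit counts are invented for illustration and are not Haklay's data; the function names are likewise hypothetical.

```python
def top_contributor_share(edit_counts, top_n):
    """Fraction of all edits made by the top_n most active contributors."""
    ranked = sorted(edit_counts, reverse=True)
    total = sum(ranked)
    return sum(ranked[:top_n]) / total if total else 0.0


def contributors_needed(edit_counts, share):
    """Smallest number of top contributors whose combined edits reach the given share."""
    ranked = sorted(edit_counts, reverse=True)
    target = share * sum(ranked)
    running = 0
    for count, edits in enumerate(ranked, start=1):
        running += edits
        if running >= target:
            return count
    return len(ranked)


# Hypothetical edits per contributor (one number per person), not OSM figures.
counts = [5000, 3000, 1000, 500, 200, 100, 50, 50, 50, 50]
print(top_contributor_share(counts, 2))
print(contributors_needed(counts, 0.5))
```

A small number from `contributors_needed` relative to the size of the contributor list is the signal that a few workhorses, not a crowd, are carrying the database.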
The OSM experience at other locations (source Budhathoki 2010). Hmm, a lot of one-time contributors and a few workhorses that contribute over 100,000 nodes. Are these workhorses contributing local knowledge, or merely digitizing satellite images?
Budhathoki et al. Hmmm. Looks like this is a male-dominated activity, with participants below 40, who have a lot of education and little GIS experience.
Here are some issues related to reliance on crowdsourced data for map compilation.
From Haklay and Ellul (2010). OSM has lots of cells digitized but, as of the dates shown, has made less progress in attributing these data.
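To put numbers on that gap, here is a small sketch that turns the figures from the comparison slide into attribution ratios. The figures are those shown on the slide (Haklay and Ellul 2010); the dictionary layout and function name are my own assumptions for the example.

```python
# Figures from the Haklay and Ellul (2010) comparison slide.
datasets = {
    "Meridian 2": {"length_m": 261_678_924, "attributed_m": 205_621_010,
                   "cells": 112_385, "attributed_cells": 92_467},
    "OSM 03/08": {"length_m": 169_608_142, "attributed_m": 54_581_704,
                  "cells": 30_702, "attributed_cells": 6_670},
    "OSM 03/09": {"length_m": 267_950_190, "attributed_m": 95_332_646,
                  "cells": 57_622, "attributed_cells": 13_595},
    "OSM 10/09": {"length_m": 324_127_812, "attributed_m": 123_632_213,
                  "cells": 73_092, "attributed_cells": 19_719},
}


def attribute_coverage(row):
    """Return (share of road length with attributes, share of complete cells with attributes)."""
    return (row["attributed_m"] / row["length_m"],
            row["attributed_cells"] / row["cells"])


for name, row in datasets.items():
    by_length, by_cells = attribute_coverage(row)
    print(f"{name}: {by_length:.1%} of length and {by_cells:.1%} of complete cells have attributes")
```

The point of computing ratios rather than reading raw totals is that OSM passes Meridian 2 in total length by 03/09, while its attributed share stays well below half.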
Ouch
We need to mix crowdsourcing with other compilation techniques.
In my opinion, this is the "gold standard".
Ingredients
Active and Passive Input
Active input from TomTom users through MapShare
Passive input
TomTom as of 2010 had collected approximately three trillion GPS points using Passive Community Input. Tele Atlas claims that the miles of GPS traces it receives in the United States in one day are equivalent to the total miles of roads in the United States and the data collected in Europe represent four times the total kilometers of road in Europe.
Relative accuracy using these techniques: C = standard map compilation; D = driving roads/streets; A = active UGC; P = passive UGC. Using the hybrid model (CDPA) will generate more robust and relevant databases for navigation and advertising (better POIs and locations) than other alternatives.
How Google likely approaches the hybrid problem. Satellite and other aerial imagery drive the process.
But in Google's revision process, User Generated Content (crowdsourcing) drives the process.
How Google's tools work together in an attempt at increasing data accuracy
Notice that Google's Tower of Power benefits from business input. Google is an advertising agency and its maps benefit its AdWords program and benefit from it.
Mapping systems need external format and there is nothing like a customer to shape that interaction into critical inputs for improvement. Systems like OSM that do not have direct customers may not be as successful as their commercial competitors, who can benefit from their distribution channels and customer base.
Google's approach is potentially market leading, but it is not quite "prime time".
When you use the Google Map Maker tools, they can sometimes help you learn a new language. The gift that keeps on giving!
Future roles for UGC in map updating will likely be limited, but very important nonetheless.
Both classes of benefits need to be integrated to produce the best maps
We need to find a way to make this model work.
Here are other tasks on which we need to focus.
We spoke to this concept earlier, but here it is in graphics. If the data and processes do not line up (oh my gosh, it looks like a syzygy), the search will not provide authoritative results. More importantly, you need to mine all of these areas to discover the information that can help make your maps and their function better than they are today.
The various tools used to comb data from social networks are ones we should be using in map compilation.
Yep, we need road-side imaging in addition to crowdsourcing.
This is from a $1.99 app for an iPhone called Theodolite Pro. The photo can be automatically sent to Google Maps for geocoding and display. In the image, you can see the address, its coordinates, its elevation and several other pieces of useful information. We need to explore the capability of these devices for augmenting field data collection - possibly by our customers or intended audience.
Yes, we can... ooops
If I told you that there was a fellow with a wrecking ball up the hill a mile away tearing down a library, and that at 4:32 PM the wrecking ball would snap its chain, roll down the hill and end up in your trunk, you would say with some disdain, "Yeah, right." Well, take a look at the next image.
So much for the odds. You can use crowdsourcing to your advantage when building spatial databases.