Open Corporate Data: not just good, better

Presentation given by Chris Taggart, CEO and Co-Founder of OpenCorporates at Open Knowledge Festival, Geneva, September 2013

Discussing benefits and quality of open corporate hierarchy (network) data

  1. Open Data: Not Just Good. Better
  2. Open Data is Good! http://www.flickr.com/photos/stolidsoul/433129708/sizes/o/in/photostream/
  3. But we’re not the ones we need to convince http://okfestival.org/open-government-data-camp/
  4. Most people don’t care about ‘open’ http://www.flickr.com/photos/erlin1/9312646298/sizes/l/in/photostream/
  5. Even though open data is better (than closed/proprietary)
  6. Even though open data is better (than closed/proprietary) • Better for innovation
  7. Even though open data is better (than closed/proprietary) • Better for innovation • Better for competition
  8. Even though open data is better (than closed/proprietary) • Better for innovation • Better for competition • Better for efficiency
  9. Even though open data is better (than closed/proprietary) • Better for innovation • Better for competition • Better for efficiency • Better for sharing (esp cross-organisation or cross-border)
  10. But open has a secret weapon http://www.flickr.com/photos/x-ray_delta_one/8493335701/sizes/l/in/photostream/
  11. It’s better quality too http://www.flickr.com/photos/infusionsoft/4484373179/sizes/l/in/photostream/
  12. Common proprietary data quality issues (Problem: Cause)
      • Data accuracy: Data is re-keyed. Few eyeballs. Often little downside to lying
      • Gaps in data: High (& often duplicated) cost of data entry. Limited to payers
      • Lack of granularity: Legacy systems/data models hard to reengineer in closed world
      • Errors go uncorrected: Few feedback mechanisms
      • Black box/No provenance: Can’t reveal (sometimes dubious) sources. Limits usefulness/trust
      • Isolated: Proprietary IDs are internal identifiers & are barriers to sharing & improved data quality
  19. A concrete example: corporate networks
  20. Hugely important (and valuable) • The dataset we need to understand the corporate world • Who we (or the government) are really doing business with • Political influence/donations/lobbying • Tax/resource extraction • Corporate Governance • Credit risk
  21. But proprietary datasets on this are problematic • Expensive, so relatively few users • Huge gaps in data • Uses proprietary IDs (so not clear what it refers to) • Restrictive licences • Opaque – no info re calculations, provenance or confidence
  22. But proprietary datasets on this are problematic • Expensive, so relatively few users • Huge gaps in data • Uses proprietary IDs (so not clear what it refers to) • Restrictive licences • Opaque – no info re calculations, provenance or confidence • Result: low-quality data
  23. The open data alternative
  24. The open data alternative • Enabled by a grant from the Alfred P Sloan Foundation
  25. Data from disparate public sources
  26. finding new insights
  27. no such company
  28. ...and errors too (no such company)
  29. What a modern financial company looks like (highly simplified & truncated views)
  32. What a modern financial company looks like (highly simplified & truncated views) • private unlimited company
  33. Crowd-sourcing?
  34. Ninja-sourcing! http://www.flickr.com/photos/danielygo/5531024732/sizes/l/in/photostream/
  35. The company that wants to know your network... every friend... every interaction http://www.flickr.com/photos/jeffmcneill/5260815552/sizes/l/ why bother?
  36. This is what we got from their SEC filings as text • Facebook, Inc
  37. This is what we got from their SEC filings as text (and turned into data) • Facebook, Inc
  38. This is what we got from their SEC filings as text (and turned into data) • Facebook, Inc • Pinnacle Sweden AB • Vitesse LLC • Facebook Operations LLC • Facebook Ireland Limited • Edge Network Services Limited • Andale Acquisition Corp
  39. Then we started investigating • Facebook, Inc • Facebook Ireland Limited • Edge Network Services Limited • Pinnacle Sweden AB • Vitesse LLC • Facebook Operations LLC • Andale Acquisition Corp
  40. Then we started investigating • Facebook, Inc • Facebook Ireland Limited • Edge Network Services Limited
  41. Facebook, Inc • Facebook Ireland Limited • Edge Network Services Limited
  42. Facebook, Inc • Facebook Ireland Limited • Edge Network Services Limited • Facebook Cayman Holdings Unlimited IV • Facebook Cayman Holdings Unlimited II • Facebook Cayman Holdings Unlimited III • Facebook Ireland Holdings • Randomus Investments Limited • Facebook International Holdings II Ltd • Facebook International Holdings I Ltd • Facebook Cayman Holdings Unlimited I
  43. Want to help? • jobs@opencorporates.com • investigators@opencorporates.com
