Short paper presented at Linked Science workshop, ISWC 2015.
http://linkedscience.org/events/lisc2015/
http://linkedscience.org/wp-content/uploads/2015/04/paper5.pdf
Using the Web as a Data Source: Challenges for Linked Science
1. Using the Web as a Data Source: Challenges for Linked Science
Carsten Keßler Hunter College, City University of New York
http://carsten.io @carstenkessler
15. Does LISC work here?
• Semantic annotation, linking, and archiving are problematic:
• large datasets
• in constant flux
• legal restrictions
• terms of service may change at any time
• legislation required to permit archiving
16. Can we fix sampling bias?
• Probably not, but…
• need to be aware of it
• if possible, test against other datasets to check the validity of the conclusions
• can we somehow encode this, maybe as provenance?
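One way the provenance idea above might be sketched is a small JSON-LD-style record that links a sample to the web source it was drawn from and carries a note about known bias. This is only an illustration under stated assumptions: the `prov:` terms come from the W3C PROV-O vocabulary, while the `ex:` namespace, the `ex:samplingBiasNote` and `ex:collectionMethod` properties, the `describe_sample` helper, and all identifiers and note text are hypothetical placeholders, not an established vocabulary or anything proposed in the talk.

```python
import json

# "prov" is the W3C PROV-O namespace; "ex" is a hypothetical namespace
# invented for this sketch.
CONTEXT = {
    "prov": "http://www.w3.org/ns/prov#",
    "ex": "http://example.org/lisc#",
}


def describe_sample(sample_id, source_id, bias_note, collected_via):
    """Build a JSON-LD-style dict linking a sample to its source and
    recording a free-text note about known sampling bias."""
    return {
        "@context": CONTEXT,
        "@id": sample_id,
        "@type": "prov:Entity",
        # PROV-O: the sample was derived from the original web source.
        "prov:wasDerivedFrom": {"@id": source_id},
        # PROV-O: the collection step that generated the sample.
        "prov:wasGeneratedBy": {
            "@type": "prov:Activity",
            "ex:collectionMethod": collected_via,  # hypothetical property
        },
        "ex:samplingBiasNote": bias_note,  # hypothetical property
    }


# Illustrative values only; none of these come from a real dataset.
record = describe_sample(
    sample_id="ex:sample-1",
    source_id="ex:web-source",
    bias_note="Illustrative note: sample limited to publicly accessible records.",
    collected_via="public API",
)
print(json.dumps(record, indent=2))
```

A record like this could travel with any analysis built on the sample, so that readers see the bias note next to the derivation chain. Whether a free-text note is enough, or a structured description of the bias is needed, is left open here.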
17. Should we enforce data publication?
• Tradeoff between:
• Learning about interesting new research going on at those companies, without any ability to verify or reproduce results
• Implementing and enforcing strict rules for conference and journal outlets that would force authors to provide the data used in their analysis
18. Thank you.
Carsten Keßler Hunter College, City University of New York
http://carsten.io @carstenkessler