How is Data Made? From Dataset Literacy to Data Infrastructure Literacy
1. How is Data Made?
From Dataset Literacy to
Data Infrastructure Literacy
30th June, Web Science 2015, University of Oxford
Jonathan Gray | jonathangray.org | @jwyg
12. Jonathan Gray (2012) “What Data Can and Cannot Do”. The Guardian. Available at:
http://www.theguardian.com/news/datablog/2012/may/31/data-journalism-focused-critical
16. “We cannot conceive of a more impartial and truthful
witness than the sun, as its light stamps and seals the
similitude of the wound on the photograph put before
the jury; it would be more accurate than the memory of
witnesses, and as the object of all evidence is to show
truth, why should not this dumb witness show it?”
– Franklin v. State of Georgia, 69 Ga. 36; 1882 Ga
18. Critical literacy to read images:
• How is the camera set up to take shots?
• What is captured and how?
• What is not captured?
• How does equipment mediate the image?
• Selection, framing, arrangement, post-production?
19. Instead of the camera, the elaborate sprawl
of public information systems.
25. Datasets are generated by a mixture of social and
technical processes, including:
• Laws and policies
• Administrative protocols
• Registration procedures
• Instruments and equipment
• Software systems
• Financial audits
• Feedback systems
• Management systems
• Metadata from digital services
• Standards bodies/standardisation procedures
26. Data literacy is not just about
knowing how to use data analysis software
or understanding statistics...
27. But also understanding the methods, rationales,
assumptions, definitions, technologies and institutions
through which datasets were generated.
34. Gray, J. & Davies, T. (2015) “Fighting Phantom Firms in the UK: From Opening Up Datasets to
Reshaping Data Infrastructures?”. Available at SSRN: http://ssrn.com/abstract=2610937
35. In the case of campaigning around company ownership,
the disclosure of existing datasets was not enough.
36. Civil society organisations had to undertake a more
creative, sustained and holistic engagement with
shaping and influencing the development of data
infrastructures as socio-technical systems.
37. This included research and advocacy around:
• Costs, functionalities and user interfaces of
software systems that would run the register;
• Changes to primary and secondary legislation;
• Additional administrative requirements and their
impacts on different actors inside and outside the
public sector.
38. Campaigners had to look beyond the question
of what information is released, towards the
question of what information is collected and
generated by the public sector in the first place,
and how this information is generated through
data infrastructures.
42. Bruno, I., Didier, E. & Vitale, T. (2014) “Statactivism: Forms of Action between Disclosure
and Affirmation”. Available at SSRN: http://ssrn.com/abstract=2466882
43. Not just blanket critique of, or withdrawal from,
quantification and “metrification”.