This presentation was provided by Scott Ziegler of Louisiana State University during the NISO Virtual Conference, Open Data Projects, held on Wednesday, June 13, 2018.
Ziegler Open Data in Special Collections Libraries
1. Open Data
in Special Collections Libraries;
or, How Can We Be Better Than Data Brokers?
Scott Ziegler
Louisiana State University Libraries
--
NISO Virtual Conference: Open Data Projects
June 13, 2018
2. Open Data
Open data is data that can be freely used, re-used and
redistributed by anyone - subject only, at most, to the
requirement to attribute and sharealike.
-Open Data Handbook
(http://opendatahandbook.org/guide/en/what-is-open-data/)
@ScottLZiegler 2
3. Examples of Open Data
Civic
● Philadelphia Open Data (https://www.opendataphilly.org/ )
● Baton Rouge Open Data (https://data.brla.gov/)
Weather
● National Weather Service (https://www.weather.gov/)
● Louisiana Office of State Climatology (http://www.losc.lsu.edu/)
4. Open Historic Data
As a subset of open data, open historic data is free for anyone to use for any
purpose and is created from historic material.
Specifically, I’ll be focusing on data created from historic material held within
special collections libraries.
8. From Data to Product
Data → Product
(This is the part we supply)
(Usually, lots of work needs to go here)
9. We Don’t Open Everything
Cultural Sensitivity
Libraries and Archives have
material that represent groups
in ways that are racist, sexist,
etc.
Privacy
Personally identifiable
information about living
individuals
11. Meanwhile, Out in the World
Algorithms of Oppression
Safiya Noble
Automating Inequality
Virginia Eubanks
Weapons of Math Destruction
Cathy O’Neil
12. Meanwhile, Out in the World
● September 2017: Equifax breach (reported)
● March 2018: Cambridge Analytica (reported)
● April 2018: Mark Zuckerberg testifies before Congress
● May 2018: European Union implements new data collection regulations
● June 2018: Facebook gives data to telecom firms (reported)
13. Data Brokers
Collect information about individuals from a wide variety of sources
Package data to create a profile of a person
Sell this package to advertisers, credit agents, government entities
14. So, Are We Better
Than Data Brokers?
15. Intentions and Subjects
Our intentions are better
● Research, not personal profit
Our subject is different
● Individuals we deal with are often historical
16. Intentions and Subjects
Intent:
Intent is not particularly important.
Outcomes and results are important.
- Safiya Noble, Algorithms of Oppression
Harm should be understood in wider terms than just individuals
18. Taking Advantage of the Help Already Out There
Benefit from the expertise of others
● Bring the writings of humanities/social science to the development team
19. Standardize the Practice of Asking For Help
Representation Officers
● Person in charge of investigating who is being represented in a digital project
● Research possible partners from that group/community
Tie this closely to the role of outreach/promotion.
● We want to act as though the people being described will be looking closely
at the description
20. Standardize a Path for Feedback and Adjustments
Clarify why we did what we did
● “During the planning phase of this project, we worked with the following scholars and community
groups”
And how anyone can suggest we do it otherwise
● Perhaps a form and/or contact email address
Explain what the process looks like for considering changes
● Though we might not be able to accommodate every request for modification, these are the steps
we will take after we receive your comments
21. This Is Going to Be Lots of Work
● It’s work to read books
● It’s work to apply these ideas to our day jobs
● It’s work to listen to criticism of our projects
● It’s work to try to get people to participate
And Also:
● It’s work to help us
● It’s work to explain things to us in a way that we’ll understand
22. Thank You
Thanks for listening
Please reach out if you want to talk more
@ScottLZiegler
sziegler1@lsu.edu
Editor's Notes
Talking about
(1) open data in special collections: what this is, why we do it
(2) what it means to work with data in the current social context, in which a shocking amount of data about us is gathered, packaged, and sold
After talking about one specific example of using open historic data to open new types of interaction with archival material, I’ll use the case of data brokers, people and organizations that collect and sell data, as a means for thinking about what we shouldn’t be doing with open historic data.
I rely heavily on the work and thoughts of many people. I’ll argue that doing so is the only way to ensure that we’re better than data brokers.
Beyond the legal constraints (HIPAA, COPPA, etc.) and traditional archival concerns of privacy, we’re also concerned about how our data is
While my team and I were busy playing with all this data, a lot was happening out in the world.
Significant scholarship on the misuses of data was released.
And to complement the scholarship
“Oh shit, this affects us”
Also in April, CNBC reported that data analytics firm CubeYou had its app suspended from Facebook for sharing data with advertisers, suggesting that what Cambridge Analytica was doing is much more common than we knew.
On Tuesday of this week, as I was putting this slide together, Mark Zuckerberg was testifying to Congress about the use of Facebook data.
It’s in this context that I want to weigh what I do against the creepy business of data brokers, who gather personal information, repackage it, and sell it to anyone for any purpose.
Cambridge Analytica doesn’t consider itself a data broker, I should mention.
There are plenty of shades of creepiness, to be sure.
I’m using the data broker example as a flash point against which to compare my own work.
I use it in general terms to mean people who benefit from the data of others, from the representation of others.
And what bothers me is that practice of representing other people and benefiting from this representation.
Here’s one way of thinking about it:
Basically, we’re not creating or sharing this data to make money, or to make anyone’s life harder.
And the people that we’re describing are usually deceased, and probably cannot be harmed by our work.
Dr. Noble writes, “Many people say to me, ‘But tech companies don’t mean to be racist; that’s not their intent.’ Intent is not particularly important. Outcomes and results are important.”
I’ve never thought of myself as someone who doesn’t take the expertise of others seriously, but I need to face the fact that I haven’t always acted like I do.
There have been many calls to combine technologists with social scientists and humanists. In my library, I work very closely with the development team, many of whom have a background in the humanities.
Beyond bringing the writings of others to us, standardize the way that we ask outside scholars and community members for assistance.
Build things with the assumption that they’ll be seen by the people being described in them.
I’m not sure how well this works for everything, but a place to start is to be explicit with any digital project we put out that we understand that we’re describing people who are not here to speak for themselves.
Make the process explicit: “We worked with the following scholars for advice and guidance”
When we are doing our best, there is no reason to be opaque or mysterious.
Following Cathy O’Neil’s suggestions from her book Weapons of Math Destruction, I’d like to work on projects in which the description of people is updated as needed, the decisions are transparent, and the conclusions and assumptions are open to scrutiny.
And this is just from our point of view.
It’s always work to write the books and to continuously point out to us that we inadvertently harm people
It’s work to help us
It’s work to explain things to us in a way that we will understand.
Without a way to pay these scholars for their expertise, I’m not sure how many will be willing to help. Identifying individuals based on their interests is the first step I have in mind.
However, before anything else a commitment is necessary from us: a commitment to read what we can and learn what we can; to bring the literature into the shop and struggle with the adoption of it in our everyday work.