Figures of the Many
Quantitative Concepts for Qualitative Thinking
Bernhard Rieder
Universiteit van Amsterdam
Mediastudies Department
Context
Terms like "big data", "computational social science", "digital humanities",
"digital methods", etc. are receiving a lot of attention.
They point to a set of practices for knowledge production: data analysis,
visualization, modeling, etc.
Instead of a totalizing search for a "logic" of data analysis, we could
inquire into the vocabulary of analytical gestures that constitute the
practice of data analysis.
A twofold approach to methods:
☉ Engagement, development, application => digital methods
☉ Conceptual, historical, and political analysis and critique => software studies
This presentation
How do we talk about data? How do we analyze them? What is our frame
of thought? How do we go further in terms of imagination, expressivity?
☉ 1 / Confronting "the many"
☉ 2 / Two kinds of mathematics
☉ Objects and their properties => Statistics
☉ Objects and their relations => Graph theory
Engage the theory of knowledge (epistemology) mobilized in data analysis,
but through the actual techniques and not generalizing concepts.
What styles of reasoning?
Hacking (1991) builds the concept of "style of reasoning" on A. C.
Crombie’s (1994) "styles of scientific thinking":
☉ postulation and deduction
☉ experiment and empirical research
☉ reasoning by analogy
☉ ordering by comparison and taxonomy
☉ statistical analysis of regularities and probabilities
☉ genetic development
What kind of reasoning are we mobilizing in data analysis?
Is the history of styles of reasoning simply intellectual progress, or
adaptation to a changing world, or co-constitutive of that world?
What is our world like?
"It is hard to believe that we still have to absorb the same types of
actors, the same number of entities, the same profiles of beings, and
the same modes of existence into the same types of collectives as
Comte, Durkheim, Weber, or Parson [sic], especially after science and
technology have massively multiplied the participants to be cooked in
the melting pot." (Latour 2005, 260)
The proliferation of actors and facilitation of transversal connectivity have
led to large and complex forms of socio-technical grouping and
structuring.
Forms of organization take the shape of (multi-sided) markets based
around technological platforms that facilitate transactions.
Social media use simple but flexible grammars of connectivity
(combination of point to point and list forms), exchange, and aggregation
that accommodate various practices and levels of scale.
The diversity of practices, contents, geographies, topologies, intensities,
motivations, etc. makes it hard to generalize and theorize dynamics of use.
1 / The many
Platforms like Twitter boost opportunities for connectivity between
various types of actors.
At the same time, they produce detailed data traces that are highly
centralized and searchable.
Quality / quantity
"One of my favorite fantasies is a dialogue between Mills and Lazarsfeld in which the former
reads to the latter the first sentence of The Sociological Imagination: 'Nowadays men often
feel that their private lives are a series of traps.' Lazarsfeld immediately replies: 'How many
men, which men, how long have they felt this way, which aspects of their private lives
bother them, do their public lives bother them, when do they feel free rather than trapped,
what kinds of traps do they experience, etc., etc., etc.' If Mills succumbed, the two of them
would have to apply to the National Institute of Mental Health for a million-dollar grant to
check out and elaborate that first sentence. They would need a staff of hundreds, and when
finished they would have written Americans View Their Mental Health rather than The
Sociological Imagination, provided that they finished at all, and provided that either of them
cared enough at the end to bother writing anything." (Maurice Stein, cit. in Gitlin 1978)
Theory vs. empiricism, macro vs. micro, qualitative vs. quantitative, inductive vs.
deductive, associative vs. formalistic, etc.
The promise of data analysis tools, applied to exhaustive (and cheap) data, is to
bridge the gap, to allow zooming, "quali-quanti" (Latour 2010).
Define: data
“facts and statistics collected together for reference or analysis. See also datum.
- Computing: the quantities, characters, or symbols on which operations are performed by a
computer, being stored and transmitted in the form of electrical signals and recorded on
magnetic, optical, or mechanical recording media.
- Philosophy: things known or assumed as facts, making the basis of reasoning or
calculation.” (Oxford American Dictionary)
Reasoning (OAD): "think rationally", "use one's mind", "calculate", "make sense
of", "come to the conclusion", "judge", "persuade", etc.
Reasoning as "giving reasons" – what counts as a good reason? What counts as a
good argument? As a proof? What is "good" knowledge?
Reasoning as a series of techniques, e.g. science, engineering, etc.
Why does the astronaut step into the space shuttle?
A short history of reasoning the "more"
Commercial Capitalism (13th +)
calculating for trade, arithmetic, sharing risk and profit in long-distance commerce
Rise of the Nation State (17th +)
"art of the state", mercantilism, scientific revolution
Industrialization (19th +)
urbanization, scientific management, large bureaucracies
☉ Fibonacci, "Liber Abaci", Calculating with Arabic numerals (Pisa, 1202)
☉ Unknown, "Arte dell'Abbaco", Practical arithmetic (Venice, 1478)
☉ Pacioli, "Summa de arithmetica, geometria, proportioni et proportionalità", Double entry
bookkeeping (Venice, 1494)
☉ William Petty & John Graunt, Political Arithmetick (17th century)
☉ Hermann Conring & Gottfried Achenwall, Statistik (17th & 18th century)
☉ Adolphe Quetelet, Statistical regularities and the "average man" (19th century)
☉ Francis Galton & Karl Pearson, Public health and eugenics (late 19th century)
Liber Abaci, Fibonacci, 1202
Calculation for accounting, money-changing, insurance, lending, measurement, etc.
"Having proved that there die about 3,506 persons at Paris unnecessarily, to the
damage of France, we come next to compute the value of the said damage, and
of the remedy thereof, as follows, viz., the value of the said 3,506 at 60 livres
sterling per head, being about the value of Algier slaves (which is less than the
intrinsic value of people at Paris), the whole loss of the subjects of France in that
hospital seems to be 60 times 3,506 livres sterling per annum, viz., 210,360
livres sterling, equivalent to about 2,524,320 French livres." (Petty 1655)
The Assurance of Lives, Charles Babbage, 1826
First life tables were assembled in the 17th century by John Graunt.
Babbage builds a machine to produce tables faster.
Essai sur la statistique de la population française, Adolphe d'Angeville, 1836
population census, tax register, house numbers, etc.
modern statistics, large bureaucracies, quantitative social sciences, etc.
Some conclusions for part 1
Over the last centuries, scientific thinking has become the dominant way
of producing knowledge and making decisions in most societies.
Scientific thinking implies various styles of reasoning, different ways of
"giving reasons", different analytical gestures, etc.
Styles are intrinsically connected to our "lifeworld" (Husserl 1936).
Two diagnoses:
☉ Our lifeworld is changing in significant ways => "the many"
☉ We need new ways of making sense of it => data analysis
What is the style of data analysis? Its epistemology? One or many?
What are its techniques, its analytical gestures?
2 / Two kinds of mathematics
Can there be data analysis without math? No.
Does this imply epistemological commitments? Yes.
But there are choices, e.g. between:
☉ Confirmatory data analysis => deductive
☉ Exploratory data analysis (Tukey 1962) => inductive
There is a fast-growing variety of analytical gestures focusing on large
numbers of formalized and classed objects.
2 / Two kinds of mathematics
Statistics
Observed: objects and properties
Inferred: relations
Data representation: the table
Visual representation: quantity charts
Grouping: class (similar properties)
Graph theory
Observed: objects and relations
Inferred: structure
Data representation: the matrix
Visual representation: network diagrams
Grouping: clique (dense relations)
Facebook Page "ElShaheeed", June 2010 – June 2011, (Poell / Rieder, forthcoming)
7K posts, 700K users, 3.6M comments, 10M likes (tool: netvizz), work in progress!
New media platforms funnel practices into reduced and largely formal
"grammars of action" (Agre 1989); data is therefore very clean, very
complete, and very detailed.
Can be imported with great ease into standard packages that come with
many analytical gestures built in (R, Excel, SPSS, RapidMiner, etc.).
Tools are easy, concepts are hard.
Statistics
Facebook Page "ElShaheeed", June 2010 – June 2011, charts shown:
☉ comment timescatter
☉ comment timescatter, log10 y scale
☉ comment timescatter, log10 y scale, likes on
☉ comment timeline, per day
☉ comment timeline, per month
☉ page posts by type, per month
☉ comparison timeline: comments, posts, comments per post
☉ histogram of comment lengths in characters
☉ histogram of like count
Calculating relationships between variables
Quetelet 1827, Galton 1885, Pearson 1901
"Erosion of determinism" (Hacking 1991)
Facebook Page "ElShaheeed", June 2010 – June 2011, charts shown:
☉ scatterplot comments / likes, with standard error
☉ scatterplot comments / likes, per post type
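A minimal sketch of what "calculating relationships between variables" can look like in practice: Pearson's correlation coefficient between comment and like counts. The numbers are invented for illustration and are not taken from the "ElShaheeed" data; `pearson_r` is a hypothetical helper name.

```python
import math

def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Co-variation of the two variables around their means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical per-post counts, invented for illustration only.
comments = [12, 40, 7, 150, 33, 5]
likes = [30, 95, 10, 420, 80, 9]

r = pearson_r(comments, likes)
print(f"r = {r:.3f}")
```

A value near 1 or -1 indicates a strong linear relationship and a value near 0 a weak one. The coefficient only summarizes linear association; it says nothing about why the two counts move together.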
3 / The mathematics of structure
Graph theory has a long prehistory; social network analysis starts in the
1930s with Jacob Moreno's work.
Graph theory is "a mathematical model for any system involving a binary
relation" (Harary 1969); it makes relational structure calculable.
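To make "a system involving a binary relation" concrete, here is a small sketch that turns a list of related pairs into the matrix representation. The relation and the names in it are hypothetical.

```python
# Hypothetical binary relation "mentions", given as (source, target) pairs.
relation = [("ana", "ben"), ("ben", "cy"), ("cy", "ana"), ("ana", "cy")]

# Fix an ordering of the objects so each one gets a row/column index.
nodes = sorted({n for pair in relation for n in pair})
index = {name: i for i, name in enumerate(nodes)}

# Adjacency matrix: cell [i][j] is 1 if node i relates to node j, else 0.
matrix = [[0] * len(nodes) for _ in nodes]
for source, target in relation:
    matrix[index[source]][index[target]] = 1

# Row sums give out-degree, column sums give in-degree.
out_degree = {n: sum(matrix[index[n]]) for n in nodes}
in_degree = {n: sum(row[index[n]] for row in matrix) for n in nodes}

for name, row in zip(nodes, matrix):
    print(name, row)
```

Once the relation is a matrix, structure becomes something one can calculate with: degrees are simple row and column sums.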
Three different force-based layouts of my FB profile
OpenOrd, ForceAtlas, Fruchterman-Reingold
Non force-based layouts
Circle diagram, parallel bubble lines, arc diagram
Network statistics
betweenness centrality
degree
Relational elements of graphs can be represented as tables (nodes have
properties) and analyzed through statistics.
Network statistics bridge the gap between individual units and the
structural forms they are embedded in.
This is currently an extremely prolific field of research.
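A sketch of the two measures named above, degree and betweenness centrality, on a hypothetical toy graph (two triangles joined by one edge). Betweenness is computed with Brandes' breadth-first accumulation, one common approach that is not necessarily what the tools behind these slides use. The function and variable names are illustrative.

```python
from collections import deque

def betweenness(adj):
    """Unnormalized betweenness for an undirected, unweighted graph.

    adj maps each node to the set of its neighbours.
    """
    score = {v: 0.0 for v in adj}
    for s in adj:
        # Breadth-first search from s, counting shortest paths (sigma).
        order = []
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Walk back from the farthest nodes, accumulating dependencies.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each undirected pair was counted once from each endpoint.
    return {v: c / 2 for v, c in score.items()}

# Hypothetical graph: triangle a-b-c and triangle d-e-f, linked by c-d.
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d"),
         ("d", "e"), ("d", "f"), ("e", "f")]
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

degree = {v: len(neighbours) for v, neighbours in adj.items()}
between = betweenness(adj)
for v in sorted(adj):
    print(v, degree[v], between[v])
```

In this toy graph the degrees are nearly uniform (2 or 3), while betweenness singles out the two nodes that hold the triangles together. The result is a per-node table that can then be analyzed with ordinary statistics.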
Twitter 1% sample, 24 hours: 4.3M tweets, 3.4M
users, 2M accounts mentioned, 227K unique hashtags
Helpful: baseline sampling
Twitter's API offers a random 1% statuses/sample endpoint that does
not require privileged access.
It provides datasets for researching certain types of questions and allows us to
"contextualize" (baseline) other collections.
We (Gerlitz / Rieder 2013) explored 24 hours of the 1% sample and
captured 4,376,230 tweets, sent from 3,370,796 accounts, at an average
rate of 50.65 tweets per second, leading to about 1.3GB of uncompressed
and unindexed MySQL tables.
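The reported rate follows directly from the figures above. A short worked check, which also derives tweets per account (a number not stated in the text):

```python
tweets = 4_376_230      # tweets captured (from the text)
accounts = 3_370_796    # distinct sending accounts (from the text)
seconds = 24 * 60 * 60  # one day of capture

tweets_per_second = tweets / seconds
tweets_per_account = tweets / accounts

print(f"{tweets_per_second:.2f} tweets per second")
print(f"{tweets_per_account:.2f} tweets per account")
```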
A baseline provides reference points
Beware of averages in non-normal distributions! But 1% sample is
sufficiently large to allow representative exploration of subsamples.
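Why averages mislead in skewed distributions can be shown with a few lines. The like counts below are hypothetical; they only mimic the long-tailed shape typical of platform data.

```python
import statistics

# Hypothetical like counts for ten posts: most are small, one is huge.
likes = [2, 3, 3, 4, 5, 5, 6, 8, 12, 9500]

mean = statistics.mean(likes)
median = statistics.median(likes)
below_mean = sum(1 for x in likes if x < mean)

print(f"mean = {mean}, median = {median}")
print(f"{below_mean} of {len(likes)} posts fall below the mean")
```

A single outlier pulls the mean far away from what a typical post looks like, so the median (or a log-scaled histogram) is usually the safer reference point here.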
We can qualify structures and individual elements with the help
of statistics and graph theory.
Twitter 1% sample, co-hashtag analysis
227,029 unique hashtags, 1627 displayed (freq >= 50)
☉ Size: frequency, color: modularity
☉ Size: frequency, color: user diversity
☉ Size: frequency, color: degree
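A rough sketch of how a co-hashtag network can be assembled, assuming that "co-hashtag" means two hashtags appearing in the same tweet. The tweets are hypothetical, and the frequency threshold is lowered from 50 to 2 so that the toy data still produces a graph.

```python
from collections import Counter
from itertools import combinations

# Hypothetical tweets, each reduced to the hashtags it contains.
tweets = [
    ["news", "politics"],
    ["politics", "election", "news"],
    ["music", "live"],
    ["election", "politics"],
    ["music", "news"],
]

MIN_FREQ = 2  # the slides use freq >= 50; lowered for this toy data

frequency = Counter()      # node size: number of tweets using the hashtag
co_occurrence = Counter()  # edge weight: tweets containing both hashtags
for tags in tweets:
    unique = sorted({t.lower() for t in tags})
    frequency.update(unique)
    co_occurrence.update(combinations(unique, 2))

# Keep only hashtags above the threshold, and edges between kept hashtags.
kept = {tag for tag, n in frequency.items() if n >= MIN_FREQ}
edges = {pair: w for pair, w in co_occurrence.items()
         if pair[0] in kept and pair[1] in kept}
degree = Counter(tag for pair in edges for tag in pair)

for (a, b), weight in sorted(edges.items()):
    print(a, b, weight)
```

Frequency and degree correspond to two of the node attributes shown in the diagrams. Modularity and user diversity would need a community-detection step and per-user data, which this sketch leaves out.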
Nine measures of centrality (Freeman 1979)
Label  PR α=0.85  PR α=0.7  PR α=0.55  PR α=0.4  In-Degree  Out-Degree  Degree
n34    0.0944     0.0743    0.0584     0.0460    4          1           5
n1     0.0867     0.0617    0.0450     0.0345    1          2           3
n17    0.0668     0.0521    0.0423     0.0355    2          1           3
n39    0.0663     0.0541    0.0453     0.0388    5          1           6
n22    0.0619     0.0506    0.0441     0.0393    5          1           6
n27    0.0591     0.0451    0.0371     0.0318    1          0           1
n38    0.0522     0.0561    0.0542     0.0486    6          0           6
n11    0.0492     0.0372    0.0306     0.0274    3          1           4
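The PR columns in the table vary the damping factor α of PageRank. A minimal power-iteration sketch, run on a hypothetical four-node graph (not the network behind the table), shows how the scores shift with α. Spreading the rank of nodes without outgoing links evenly is a common convention, assumed here.

```python
def pagerank(out_links, alpha=0.85, tol=1e-10, max_iter=1000):
    """PageRank by power iteration.

    out_links maps every node to the list of nodes it points to.
    alpha is the damping factor; dangling nodes spread their rank evenly.
    """
    nodes = list(out_links)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(max_iter):
        # Rank held by nodes without outgoing links is shared by everyone.
        dangling = sum(rank[v] for v in nodes if not out_links[v])
        new = {v: (1 - alpha) / n + alpha * dangling / n for v in nodes}
        for v in nodes:
            for w in out_links[v]:
                new[w] += alpha * rank[v] / len(out_links[v])
        change = sum(abs(new[v] - rank[v]) for v in nodes)
        rank = new
        if change < tol:
            break
    return rank

# Hypothetical four-node directed graph, not the network behind the table.
graph = {"a": ["b"], "b": ["c"], "c": ["a"], "d": ["c"]}

for alpha in (0.85, 0.7, 0.55, 0.4):
    scores = pagerank(graph, alpha)
    print(alpha, {v: round(s, 4) for v, s in scores.items()})
```

Lower values of α give more weight to the uniform "teleport" term, so scores move closer together and depend less on long chains of links.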
Twitter 1% sample, co-hashtag analysis
☉ Degree vs. wordFrequency
☉ Degree vs. userDiversity
Facebook Page "ElShaheeed"
700K nodes, 11M connections
☉ Color: type
☉ Color: outdegree
Conclusions
There is a lot of excitement about data analysis, but our understanding of
styles and analytical gestures is still very poor.
We need interrogation and critiques of methodology that are developed
from engagement and historical/conceptual investigation.
We need analytical gestures that are more closely tied to concepts from
the humanities and social sciences; exploration rather than confirmation.
Visualization and simpler tools are very interesting but require technical
and conceptual literacy to deliver more than illustrations.
This is probably not a fad.
"Incite, induce, deviate, make easy or difficult, enlarge or limit, render more or
less probable… These are the categories of power." (Deleuze 1986, 77)
Thank You
rieder@uva.nl
https://www.digitalmethods.net
http://thepoliticsofsystems.net
"Far better an approximate answer to the right
question, which is often vague, than an exact answer to
the wrong question, which can always be made precise.
Data analysis must progress by approximate answers, at
best, since its knowledge of what the problem really is will
at best be approximate." (Tukey 1962)

Privatization and Disinvestment - Meaning, Objectives, Advantages and Disadva...
 
URLs and Routing in the Odoo 17 Website App
URLs and Routing in the Odoo 17 Website AppURLs and Routing in the Odoo 17 Website App
URLs and Routing in the Odoo 17 Website App
 
microwave assisted reaction. General introduction
microwave assisted reaction. General introductionmicrowave assisted reaction. General introduction
microwave assisted reaction. General introduction
 
Mastering the Unannounced Regulatory Inspection
Mastering the Unannounced Regulatory InspectionMastering the Unannounced Regulatory Inspection
Mastering the Unannounced Regulatory Inspection
 
1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdf
 
TataKelola dan KamSiber Kecerdasan Buatan v022.pdf
TataKelola dan KamSiber Kecerdasan Buatan v022.pdfTataKelola dan KamSiber Kecerdasan Buatan v022.pdf
TataKelola dan KamSiber Kecerdasan Buatan v022.pdf
 
How to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptxHow to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptx
 
Introduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptxIntroduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptx
 
Separation of Lanthanides/ Lanthanides and Actinides
Separation of Lanthanides/ Lanthanides and ActinidesSeparation of Lanthanides/ Lanthanides and Actinides
Separation of Lanthanides/ Lanthanides and Actinides
 
Introduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher EducationIntroduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher Education
 
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxSOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
 
Código Creativo y Arte de Software | Unidad 1
Código Creativo y Arte de Software | Unidad 1Código Creativo y Arte de Software | Unidad 1
Código Creativo y Arte de Software | Unidad 1
 
Hybridoma Technology ( Production , Purification , and Application )
Hybridoma Technology  ( Production , Purification , and Application  ) Hybridoma Technology  ( Production , Purification , and Application  )
Hybridoma Technology ( Production , Purification , and Application )
 
Organic Name Reactions for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions  for the students and aspirants of Chemistry12th.pptxOrganic Name Reactions  for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions for the students and aspirants of Chemistry12th.pptx
 
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdfBASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdf
 
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
 
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
 

Figures of the Many - Quantitative Concepts for Qualitative Thinking

  • 1. Figures of the Many Quantitative Concepts for Qualitative Thinking Bernhard Rieder Universiteit van Amsterdam Mediastudies Department
  • 2. Context Terms like "big data", "computational social science", "digital humanities", "digital methods", etc. are receiving a lot of attention. They point to a set of practices for knowledge production: data analysis, visualization, modeling, etc. Instead of a totalizing search for a "logic" of data analysis, we could inquire into the vocabulary of analytical gestures that constitute the practice of data analysis. A twofold approach to methods: ☉ Engagement, development, application => digital methods ☉ Conceptual, historical, and political analysis and critique => software studies
  • 3. This presentation How do we talk about data? How do we analyze them? What is our frame of thought? How do we go further in terms of imagination, expressivity? ☉ 1 / Confronting "the many" ☉ 2 / Two kinds of mathematics ☉ Objects and their properties => Statistics ☉ Objects and their relations => Graph theory Engage the theory of knowledge (epistemology) mobilized in data analysis, but through the actual techniques and not generalizing concepts.
  • 4. What styles of reasoning? Hacking (1991) building the concept of "style of reasoning" on A. C. Crombie’s (1994) "styles of scientific thinking": ☉ postulation and deduction ☉ experiment and empirical research ☉ reasoning by analogy ☉ ordering by comparison and taxonomy ☉ statistical analysis of regularities and probabilities ☉ genetic development What kind of reasoning are we mobilizing in data analysis? Is the history of styles of reasoning simply intellectual progress, or adaptation to a changing world, or co-constitutive of that world? What is our world like?
  • 5.
  • 6.
  • 7. "It is hard to believe that we still have to absorb the same types of actors, the same number of entities, the same profiles of beings, and the same modes of existence into the same types of collectives as Comte, Durkheim, Weber, or Parson [sic], especially after science and technology have massively multiplied the participants to be cooked in the melting pot." (Latour 2005, 260)
  • 8. 1 / The many The proliferation of actors and facilitation of transversal connectivity have led to large and complex forms of socio-technical grouping and structuring. Forms of organization take the shape of (multi-sided) markets based around technological platforms that facilitate transactions. Social media use simple but flexible grammars of connectivity (combination of point-to-point and list forms), exchange, and aggregation that accommodate various practices and levels of scale. The diversity of practices, contents, geographies, topologies, intensities, motivations, etc. makes it hard to generalize and theorize dynamics of use.
  • 9. Platforms like Twitter boost opportunities for connectivity between various types of actors.
  • 10. At the same time, they produce detailed data traces that are highly centralized and searchable.
  • 11. Quality / quantity "One of my favorite fantasies is a dialogue between Mills and Lazarsfeld in which the former reads to the latter the first sentence of The Sociological Imagination: 'Nowadays men often feel that their private lives are a series of traps.' Lazarsfeld immediately replies: 'How many men, which men, how long have they felt this way, which aspects of their private lives bother them, do their public lives bother them, when do they feel free rather than trapped, what kinds of traps do they experience, etc., etc., etc.' If Mills succumbed, the two of them would have to apply to the National Institute of Mental Health for a million-dollar grant to check out and elaborate that first sentence. They would need a staff of hundreds, and when finished they would have written Americans View Their Mental Health rather than The Sociological Imagination, provided that they finished at all, and provided that either of them cared enough at the end to bother writing anything." (Maurice Stein, cit. in Gitlin 1978) Theory vs. empiricism, macro vs. micro, qualitative vs. quantitative, inductive vs. deductive, associative vs. formalistic, etc. The promise of data analysis tools, applied to exhaustive (and cheap) data, is to bridge the gap, to allow zooming, "quali-quanti" (Latour 2010).
  • 12. Define: data “facts and statistics collected together for reference or analysis. See also datum. - Computing: the quantities, characters, or symbols on which operations are performed by a computer, being stored and transmitted in the form of electrical signals and recorded on magnetic, optical, or mechanical recording media. - Philosophy: things known or assumed as facts, making the basis of reasoning or calculation.” (Oxford American Dictionary) Reasoning (OAD): "think rationally", "use one's mind", "calculate", "make sense of", "come to the conclusion", "judge", "persuade", etc. Reasoning as "giving reasons" – what counts as a good reason? What counts as a good argument? As a proof? What is "good" knowledge? Reasoning as a series of techniques, e.g. science, engineering, etc.
  • 13. Why does the astronaut step into the space shuttle?
  • 14. A short history of reasoning the "more" Commercial Capitalism (13th +): calculating for trade, arithmetic, sharing risk and profit in long-distance commerce. Rise of the Nation State (17th +): "art of the state", mercantilism, scientific revolution. Industrialization (19th +): urbanization, scientific management, large bureaucracies. ☉ Fibonacci, "Liber Abaci", Calculating with Arabic numerals (Pisa, 1202) ☉ Unknown, "Arte dell'Abbaco", Practical arithmetic (Venice, 1478) ☉ Pacioli, "Summa de arithmetica, geometria, proportioni et proportionalità", Double entry bookkeeping (Venice, 1494) ☉ William Petty & John Graunt, Political Arithmetick (17th century) ☉ Hermann Conring & Gottfried Achenwall, Statistik (17th & 18th century) ☉ Adolphe Quetelet, Statistical regularities and the "average man" (19th century) ☉ Francis Galton & Karl Pearson, Public health and eugenics (late 19th century)
  • 15. Liber Abaci, Fibonacci, 1202 Calculation for accounting, money-changing, insurance, lending, measurement, etc.
  • 16. "Having proved that there die about 3,506 persons at Paris unnecessarily, to the damage of France, we come next to compute the value of the said damage, and of the remedy thereof, as follows, viz., the value of the said 3,506 at 60 livres sterling per head, being about the value of Algier slaves (which is less than the intrinsic value of people at Paris), the whole loss of the subjects of France in that hospital seems to be 60 times 3,506 livres sterling per annum, viz., 210,360 livres sterling, equivalent to about 2,524,320 French livres." (Petty 1655)
  • 17. The Assurance of Lifes, Charles Babbage, 1826 First life tables were assembled in the 17th century by John Graunt. Babbage builds a machine to produce tables faster.
  • 18. Essai sur la statistique de la population française, Adolphe d'Angeville, 1836 population census, tax register, house numbers, etc. modern statistics, large bureaucracies, quantitative social sciences, etc.
  • 19. Some conclusions for part 1 Over the last centuries, scientific thinking has become the dominant way of producing knowledge and making decisions in most societies. Scientific thinking implies various styles of reasoning, different ways of "giving reasons", different analytical gestures, etc. Styles are intrinsically connected to our "lifeworld" (Husserl 1936). Two diagnoses: ☉ Our lifeworld is changing in significant ways => "the many" ☉ We need new ways of making sense of it => data analysis What is the style of data analysis? Its epistemology? One or many? What are its techniques, its analytical gestures?
  • 20. 2 / Two kinds of mathematics Can there be data analysis without math? No. Does this imply epistemological commitments? Yes. But there are choices, e.g. between: ☉ Confirmatory data analysis => deductive ☉ Exploratory data analysis (Tukey 1962) => inductive There is a fast-growing variety of analytical gestures focusing on large numbers of formalized and classed objects.
  • 21. 2 / Two kinds of mathematics Statistics: observed: objects and properties; inferred: relations; data representation: the table; visual representation: quantity charts; grouping: class (similar properties). Graph theory: observed: objects and relations; inferred: structure; data representation: the matrix; visual representation: network diagrams; grouping: clique (dense relations).
  • 22. Facebook Page "ElShaheeed", June 2010 – June 2011, (Poell / Rieder, forthcoming) 7K posts, 700K users, 3.6M comments, 10M likes (tool: netvizz), work in progress!
  • 23. Statistics New media platforms funnel practices into reduced and largely formal "grammars of action" (Agre 1989); data is therefore very clean, very complete, and very detailed. It can be imported with great ease into standard packages that come with many analytical gestures built in (R, Excel, SPSS, Rapidminer, etc.). Tools are easy, concepts are hard.
  • 24. Facebook Page "ElShaheeed", June 2010 – June 2011 comment timescatter
  • 25. Facebook Page "ElShaheeed", June 2010 – June 2011 comment timescatter, log10 y scale
  • 26. Facebook Page "ElShaheeed", June 2010 – June 2011: comment timescatter, log10 y scale, likes on
  • 27. Facebook Page "ElShaheeed", June 2010 – June 2011 comment timeline, per day
  • 28. Facebook Page "ElShaheeed", June 2010 – June 2011 comment timeline, per month
  • 29. Facebook Page "ElShaheeed", June 2010 – June 2011 page posts by type, per month
  • 30. Facebook Page "ElShaheeed", June 2010 – June 2011 comparison timeline: comments, posts, comments per post
  • 31. Facebook Page "ElShaheeed", June 2010 – June 2011 histogram of comment lengths in characters
  • 32. Facebook Page "ElShaheeed", June 2010 – June 2011 histogram of like count
  • 33. Calculating relationships between variables Quetelet 1827, Galton 1885, Pearson 1901 "Erosion of determinism" (Hacking 1991)
  • 34. Facebook Page "ElShaheeed", June 2010 – June 2011 scatterplot comments / likes, with standard error
  • 35. Facebook Page "ElShaheeed", June 2010 – June 2011: scatterplot comments / likes, per post type
  • 36. 2 / Two kinds of mathematics Statistics: observed: objects and properties; inferred: relations; data representation: the table; visual representation: quantity charts; grouping: class (similar properties). Graph theory: observed: objects and relations; inferred: structure; data representation: the matrix; visual representation: network diagrams; grouping: clique (dense relations).
  • 37. 3 / The mathematics of structure Graph theory has a long prehistory; social network analysis starts in the 1930s with Jacob Moreno's work. Graph theory is "a mathematical model for any system involving a binary relation" (Harary 1969); it makes relational structure calculable.
  • 38. Three different force-based layouts of my FB profile OpenOrd, ForceAtlas, Fruchterman-Reingold
  • 39. Non force-based layouts Circle diagram, parallel bubble lines, arc diagram
  • 40. Network statistics (figure labels: betweenness centrality, degree) Relational elements of graphs can be represented as tables (nodes have properties) and analyzed through statistics. Network statistics bridge the gap between individual units and the structural forms they are embedded in. This is currently an extremely prolific field of research.
  • 41. Twitter 1% sample, 24 hours: 4.3M tweets, 3.4M users, 2M accounts mentioned, 227K unique hashtags
  • 42. Helpful: baseline sampling Twitter's API offers a random 1% statuses/sample endpoint that does not require privileged access. It provides datasets for researching certain types of questions and allows us to "contextualize" (baseline) other collections. We (Gerlitz / Rieder 2013) explored 24 hours of the 1% sample and captured 4,376,230 tweets, sent from 3,370,796 accounts, at an average rate of 50.65 tweets per second, leading to about 1.3GB of uncompressed and unindexed MySQL tables.
  • 43. A baseline provides reference points Beware of averages in non-normal distributions! But the 1% sample is sufficiently large to allow representative exploration of subsamples. We can qualify structures and individual elements with the help of statistics and graph theory.
  • 44.
  • 45. Twitter 1% sample, co-hashtag analysis 227,029 unique hashtags, 1627 displayed (freq >= 50) Size: frequency Color: modularity
  • 46. Size: frequency Color: user diversity Twitter 1% sample, co-hashtag analysis 227,029 unique hashtags, 1627 displayed (freq >= 50)
  • 47. Size: frequency Color: degree Twitter 1% sample, co-hashtag analysis 227,029 unique hashtags, 1627 displayed (freq >= 50)
  • 48. Nine measures of centrality (Freeman 1979)
  • 49.
        Label  PR α=0.85  PR α=0.7  PR α=0.55  PR α=0.4  In-Degree  Out-Degree  Degree
        n34    0.0944     0.0743    0.0584     0.0460    4          1           5
        n1     0.0867     0.0617    0.0450     0.0345    1          2           3
        n17    0.0668     0.0521    0.0423     0.0355    2          1           3
        n39    0.0663     0.0541    0.0453     0.0388    5          1           6
        n22    0.0619     0.0506    0.0441     0.0393    5          1           6
        n27    0.0591     0.0451    0.0371     0.0318    1          0           1
        n38    0.0522     0.0561    0.0542     0.0486    6          0           6
        n11    0.0492     0.0372    0.0306     0.0274    3          1           4
  • 50. Twitter 1% sample Co-hashtag analysis Degree vs. wordFrequency
  • 51. Degree vs. userDiversity Twitter 1% sample Co-hashtag analysis
  • 52. Facebook Page "ElShaheeed" 700K nodes, 11M connections Color: type
  • 53.
  • 54. Facebook Page "ElShaheeed" 700K nodes, 11M connections Color: outdegree
  • 55. Conclusions There is a lot of excitement about data analysis, but our understanding of styles and analytical gestures is still very poor. We need interrogation and critiques of methodology that are developed from engagement and historical/conceptual investigation. We need analytical gestures that are more closely tied to concepts from the humanities and social sciences; exploration rather than confirmation. Visualization and simpler tools are very interesting but require technical and conceptual literacy to deliver more than illustrations. This is probably not a fad.
  • 56. "Incite, induce, deviate, make easy or difficult, enlarge or limit, render more or less probable… These are the categories of power." (Deleuze 1986, 77)
  • 57. Thank You rieder@uva.nl https://www.digitalmethods.net http://thepoliticsofsystems.net "Far better an approximate answer to the right question, which is often vague, than an exact answer to the wrong question, which can always be made precise. Data analysis must progress by approximate answers, at best, since its knowledge of what the problem really is will at best be approximate." (Tukey 1962)
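
The contrast drawn on slides 21, 36 and 40 (the table for statistics, the matrix for graph theory, and network statistics as a bridge between them) can be illustrated with a minimal sketch. The edge list and node names below are hypothetical and are not taken from the Facebook or Twitter datasets; the sketch only assumes a small directed graph and shows how relations stored in an adjacency matrix can be turned back into per-node properties, in the same column layout as the table on slide 49.

```python
# Hypothetical directed edges (e.g. "who mentions whom"); NOT data from the talk.
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("d", "c"), ("c", "a")]

# Collect the node labels and give each one a row/column position.
nodes = sorted({name for edge in edges for name in edge})
index = {name: position for position, name in enumerate(nodes)}

# Graph-theory representation: the adjacency matrix (row = source, column = target).
matrix = [[0] * len(nodes) for _ in nodes]
for source, target in edges:
    matrix[index[source]][index[target]] = 1

# Network statistics: relations become properties of nodes, i.e. rows of a table.
table = []
for name in nodes:
    i = index[name]
    out_degree = sum(matrix[i])                 # edges leaving this node
    in_degree = sum(row[i] for row in matrix)   # edges arriving at this node
    table.append({"label": name, "in": in_degree, "out": out_degree,
                  "degree": in_degree + out_degree})

for row in table:
    print(row)
```

Once the graph has been reduced to such a table, the usual statistical gestures (histograms, scatterplots, correlations) apply to it again, which is the sense in which network statistics connect individual units to structure.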

Editor's notes

  1. An almost classic kind of reasoning about "more". Image: http://www.prweb.com/releases/information/digital/prweb509640.htm
  2. Every one of us possesses a large number of objects, many of them computers.
  3. People do a lot of different things on Twitter, Facebook, etc. – and just because you and your immediate vicinity seem to have coherent practices, this does not mean others do.
  4. Anatomy of a tweet. https://twitter.com/ICIJorg/status/321585235491962880 https://api.twitter.com/1/statuses/show/321585235491962880.json
  5. Very large scale systems on the one side, but highly concentrated data repositories on the other. The promise of data analysis is, of course, to use that data to make sense of all the complexity.
  6. C. Wright Mills vs. Paul Lazarsfeld. Many people argue that we no longer need that grant, we already have the data.
  7. Reasoning then guides practice. Description => decision-making.
  8. "Why does the astronaut step into the space shuttle?" It does not seem like a sensible idea. What reasons are given so that we do not think of astronauts as suicidal?
  9. Cost-benefit analysis! How to price a life? (today: expected future earnings)
  10. http://www.youtube.com/watch?v=zFl6p4D59AA http://www.videohippy.com/video/11216/Little-Britain-Computer-says-No Example: opening a bank account at ABN-Amro (credit rating) http://www.creditchecker.nl/ Questions: Should I give that person money? How much? At what interest rate?
  11. Questions: I am the government, what should I do? Where should I invest? How does the economy work? Adolphe d'Angeville: Essai sur la Statistique de la Population Française, 1836 - Full document: http://www.europeana.eu/portal/record/03486/DE44EEC02EA9F56E94AD9D3BD077AB298A92514E.html
  12. Making decisions: in particular on the interpersonal level!
  13. Allows for all kinds of folding, combinations, etc. – Math is not homogeneous, but sprawling! Different forms of reasoning, different modes of aggregation. These are already analytical frameworks, different ways of formalizing.
  14. http://www.facebook.com/ElShaheeed (created by Wael Ghonim, considered to be a central place for the sparking of the Egyptian Revolution) http://apps.facebook.com/netvizz/ (tool used for extraction)
  15. Simply plotting events is an analytical gesture. (=> pattern)
  16. Changing scales, analytical gesture, "tame" large numbers and heighten visibility
  17. Adding variables => allow for comparisons
  18. Count per interval (here: day).
  19. Different visuals, change counting interval, very different effect
  20. But if we look at the number of posts published on the page, this is a very different picture! So we want to compare!
  21. Find outliers and interesting moments not only in terms of values, but relationships between values.
  22. Looking at "central tendencies" in data. When does it make sense? Here it does, because there is no power law.
  23. What do the averages characterize here? Not much – there is no "typical" post.
  24. Regression analysis is a statistical technique for estimating the relationships among variables (correlation). A probabilistic relationship: height and weight are correlated; if you are very tall, there is a good chance that you also weigh more. This is a statistical, not a deterministic, relationship. Erosion of determinism in the 19th century. Title: Recherches sur la population, les naissances, les décès, les prisons, les dépôts de mendicité, etc., dans le royaume des Pays-Bas, par M. A. Quételet, … 1827 http://gallica.bnf.fr/ark:/12148/bpt6k81568v.r=.langEN
  25. Positive correlation, but it's not 1:1
  26. And now to graph theory.
  27. Forsythe and Katz, 1946 – "adjacency matrix", Moreno, 1934
  28. Visualization is, again, one type of analysis. Which properties of the network are "made salient" by an algorithm? http://thepoliticsofsystems.net/2010/10/one-network-and-four-algorithms/ Models behind: spring simulation, simulated annealing (http://wiki.cns.iu.edu/pages/viewpage.action?pageId=1704113)
  29. So, what can we do? Logistics are important, because they determine who can do what kind of research, requirements for groups, etc. The 1% sample is easy to handle for modern hardware; but for how long?
  30. A platform that hosts many different practices, from interpersonal communication to mass-media-like outlets such as Lady Gaga's account, which has 36M followers. But means or medians are still reference points!
  31. We can of course produce descriptive statistics! Baselining allows us to make "drawing the line" more informed. It does not evacuate bias – there is no "view from nowhere" – but maybe makes it more conscious.
  32. Extend word lists (what am I missing?), account for refraction.
  33. Compare
  34. Larger roles of hashtags, not all are issue markers!
  35. "All in all, this process resulted in the specification of nine centrality measures based on three conceptual foundations. Three are based on the degrees of points and are indexes of communication activity. Three are based on the betweenness of points and are indexes of potential for control of communication. And three are based on closeness and are indexes either of independence or efficiency." (Freeman 1979) What concepts are they based on?
  36. Network metrics are highly dependent on individual variables.
  37. There is no need to analyze and visualize a graph as a network. Characterize hashtags in relation to a whole (their role beyond my sample), to better understand our fishing pole and the weight it carries. Tbt: throwback Thursday.
  38. How do we interpret this: understand the platform, understand the context of the phenomenon, understand the algorithm, etc.
  39. How do we interpret something like this?
  40. Quantitative forms allow us to fill this with "content".
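
Two points from these notes lend themselves to a small worked example: averages say little when one account dwarfs all others (notes 22, 23 and 30), and a positive correlation is a statistical rather than a 1:1 relationship (notes 24 and 25). The sketch below is only an illustration under stated assumptions: the follower counts and the per-post comment and like counts are invented, and `pearson_r` is a hypothetical helper written for this example, not a function of netvizz or any other tool mentioned in the talk.

```python
from math import sqrt
from statistics import mean, median

# Invented follower counts: nine ordinary accounts plus one with 36M followers
# (the order of magnitude mentioned in note 30). NOT real data.
followers = [12, 40, 55, 80, 120, 150, 300, 450, 900, 36_000_000]
print("mean:", mean(followers), "median:", median(followers))


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long numeric sequences."""
    mean_x, mean_y = mean(xs), mean(ys)
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariation / (spread_x * spread_y)


# Invented comment and like counts for six posts. NOT real data.
comments = [5, 12, 20, 33, 41, 58]
likes = [30, 44, 95, 120, 210, 190]
r = pearson_r(comments, likes)
print(f"Pearson r between comments and likes: {r:.2f}")
```

With a single extreme value the mean lands far above anything a "typical" account looks like, while the median stays among the ordinary accounts; the correlation coefficient, in turn, comes out positive but below 1, which is the "not 1:1" relationship described in note 25.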