This document discusses social media, linked data, and the context question. It summarizes challenges with using social media data, including issues with biases, sampling, and representativeness. It discusses the need to understand data origins to assess veracity and quality. Provenance models are proposed to document processes and integrate different data sources. Open questions are raised about defining provenance for social media and leveraging it to influence analysis while considering privacy implications.
1. Social Media, Linked Data & The Context Question
Pete Edwards
p.edwards@abdn.ac.uk
Computing Science / dot.rural Digital Economy Hub
University of Aberdeen
2. dot.rural Research Hub
• Exploring the contribution digital technologies can make to enhancing key services; generating business opportunities; boosting quality of life; promoting the economic, social and environmental sustainability of rural areas across the UK and beyond.
• Researchers from a range of disciplines:
– computer science, communications engineering, human geography, sociology, environmental science, business, medicine, transport studies
[Diagram labels: Accessibility & Mobilities; Enterprise & Culture; CS / Eng; Natural Resource Conservation; Healthcare]
3. Agenda
• Social Media – Pitfalls
• The Context Challenge
• Provenance
• Digital Social Research Experience
• Quality
• Open Questions
4. Social Media – Pitfalls #1
• Integrating different information sources to support public transport information
– Government open data (national and local), operator data, vehicle data (when available), disruption reports …
• “Can we use Twitter data?”
– Often used as a means to report disruption/service issues.
5. UK Snow – January 2013
“Snow causing #chaos in Cheltenham! Reports of up to 1.5 inches in worst affected areas... #uksnowdisaster!”
“Avoid the Penn road. It is awful!!! #uksnow #wolves #wolvessnow”
How to assess veracity of such reports?
“No trains between Portsmouth Harbour & Southampton Central until further notice #uksnow #SnowSouthern” – SouthernRailUK
6. Social Media – Pitfalls #2
• Big Data: Pitfalls, Methods and Concepts for an Emergent Field, Zeynep Tufekci 2013
– Use of social media analysis by social scientists and policy makers is challenged by inadequate attention to methodological and conceptual issues.
• Issues:
1. Little/no attention to the implicit and explicit structural biases of the platform(s) most frequently used to generate datasets.
2. Lack of clarity with regard to sampling, universe and representativeness (who are the ‘crowd’?).
3. Most analyses come from a single platform.
7. The Context Challenge
• Integration of social media data with other datasets needs to address issues around data quality.
• Social media analyses need to be more explicit in discussing the biases, implicit assumptions and steps in analytical methods.
• How do we understand the origins of data, to help assess veracity and utility?
• How do we document the processes involved in analysis?
• How do we integrate different sources together?
8. Provenance
• Lineage, history, audit trail…
• Who, What, Where, Why, When, Which, & (W)How (Goble, 2002).
• W3C Provenance Working Group
– “Information about entities, activities, and agents involved in producing a piece of data or thing, which can be used to form assessments about its quality, reliability, or trustworthiness”
9. W3C PROV Model
W3C PROV
http://www.w3.org/TR/prov-overview/
http://www.w3.org/TR/2013/REC-prov-o-20130430/
10. W3C PROV Model
• Entity – “a physical, digital, or other kind of thing with some fixed aspects”
• Activity – “something that occurs over a period of time and acts upon or with entities”
• Agent – “something that bears some form of responsibility for an activity taking place, the existence of an entity, or another agent’s activity”
Example: Annotation “Angry” wasGeneratedBy Sentiment classification; Sentiment classification wasAssociatedWith Classifier Service
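As a rough illustration, the slide's example (an “Angry” annotation generated by a sentiment classification activity that is associated with a classifier service) can be written down as PROV-style triples. This is a minimal sketch, not the project's implementation: the `http://example.org/` namespace, the `Node` and `ProvGraph` classes and the `agents_responsible_for` helper are all hypothetical names introduced here. Only the `prov:` terms (Entity, Activity, Agent, wasGeneratedBy, wasAssociatedWith) come from the W3C PROV vocabulary.

```python
from dataclasses import dataclass, field

PROV = "http://www.w3.org/ns/prov#"  # W3C PROV namespace
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
EX = "http://example.org/"  # hypothetical namespace, for illustration only


@dataclass(frozen=True)
class Node:
    kind: str  # "Entity", "Activity" or "Agent"
    name: str  # local name inside the hypothetical EX namespace

    @property
    def uri(self) -> str:
        return EX + self.name


@dataclass
class ProvGraph:
    triples: list = field(default_factory=list)

    def add(self, node: Node) -> Node:
        # Record the node's PROV type, e.g. prov:Entity.
        self.triples.append((node.uri, RDF_TYPE, PROV + node.kind))
        return node

    def was_generated_by(self, entity: Node, activity: Node) -> None:
        if entity.kind != "Entity" or activity.kind != "Activity":
            raise ValueError("wasGeneratedBy links an Entity to an Activity")
        self.triples.append((entity.uri, PROV + "wasGeneratedBy", activity.uri))

    def was_associated_with(self, activity: Node, agent: Node) -> None:
        if activity.kind != "Activity" or agent.kind != "Agent":
            raise ValueError("wasAssociatedWith links an Activity to an Agent")
        self.triples.append((activity.uri, PROV + "wasAssociatedWith", agent.uri))


def agents_responsible_for(graph: ProvGraph, entity: Node) -> list:
    """Follow wasGeneratedBy, then wasAssociatedWith, to find the agents behind an entity."""
    activities = [o for s, p, o in graph.triples
                  if s == entity.uri and p == PROV + "wasGeneratedBy"]
    return [o for s, p, o in graph.triples
            if s in activities and p == PROV + "wasAssociatedWith"]


g = ProvGraph()
annotation = g.add(Node("Entity", "annotation-angry"))
classification = g.add(Node("Activity", "sentiment-classification"))
service = g.add(Node("Agent", "classifier-service"))
g.was_generated_by(annotation, classification)
g.was_associated_with(classification, service)

print(agents_responsible_for(g, annotation))
```

Walking the two relations backwards from the annotation is what lets an analyst ask which service produced a given label, which is the kind of question the provenance record is meant to answer.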
11. Linked Data
• Using the Web to connect related data that wasn’t previously linked
• Designed for humans and machines.
• Links between a thing and its description
– RDF (Resource Description Framework)
– Pete -> worksFor -> University of Aberdeen
• Encourages reuse, reduces redundancy.
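The “Pete -> worksFor -> University of Aberdeen” statement above is a subject, predicate, object triple. The sketch below serialises it as N-Triples using only the Python standard library. The `http://example.org/` URIs and the `ntriple` helper are assumptions made for illustration; a real deployment would use published vocabularies and an RDF library.

```python
EX = "http://example.org/"  # hypothetical namespace, not taken from the slides
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"


def ntriple(subject: str, predicate: str, obj: str, literal: bool = False) -> str:
    """Serialise one statement as a single N-Triples line."""
    if literal:
        # Escape backslashes and double quotes inside plain string literals.
        escaped = obj.replace("\\", "\\\\").replace('"', '\\"')
        obj_term = '"' + escaped + '"'
    else:
        obj_term = "<" + obj + ">"
    return "<" + subject + "> <" + predicate + "> " + obj_term + " ."


pete = EX + "Pete"
aberdeen = EX + "UniversityOfAberdeen"

lines = [
    # The link between two things.
    ntriple(pete, EX + "worksFor", aberdeen),
    # A human-readable description attached to one of them.
    ntriple(aberdeen, RDFS_LABEL, "University of Aberdeen", literal=True),
]
print("\n".join(lines))
```

Because both things are named by URIs, another dataset that uses the same URI for the university can be joined to this one without any further mapping.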
13. Quality
• Reasoning about quality is seen as critical as more and more services/things publish data.
• Quality – a measure of ‘fitness for use’.
• Quality metrics should examine the context around data (including provenance).
– Outputs are quality scores categorised into quality dimensions (e.g. accuracy, relevance, …)
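One way to picture quality scores grouped into dimensions is the sketch below. Everything in it is an assumption made for illustration: the report fields (`agent_type`, `corroborating_reports`, `age_minutes`), the three metric functions, their thresholds and the averaging step are hypothetical and are not the metrics used in the work described here.

```python
from collections import defaultdict
from statistics import mean


def source_metric(report):
    # Assumption: a report attributed to a transport operator is trusted more.
    return ("accuracy", 1.0 if report["agent_type"] == "operator" else 0.5)


def corroboration_metric(report):
    # Assumption: four or more independent matching reports give full marks.
    return ("accuracy", min(report["corroborating_reports"] / 4, 1.0))


def timeliness_metric(report, max_age_minutes=60):
    # Assumption: relevance decays linearly to zero over one hour.
    return ("relevance", max(0.0, 1.0 - report["age_minutes"] / max_age_minutes))


METRICS = [source_metric, corroboration_metric, timeliness_metric]


def assess(report):
    """Run every metric and average the scores within each quality dimension."""
    by_dimension = defaultdict(list)
    for metric in METRICS:
        dimension, score = metric(report)
        by_dimension[dimension].append(score)
    return {dimension: mean(scores) for dimension, scores in by_dimension.items()}


operator_report = {"agent_type": "operator", "corroborating_reports": 2, "age_minutes": 15}
public_report = {"agent_type": "public", "corroborating_reports": 0, "age_minutes": 45}

print(assess(operator_report))
print(assess(public_report))
```

Each metric reads only the context recorded about a report (who produced it, when, and what else supports it), so the scores depend on provenance rather than on the report's text.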
16. Questions
• How should provenance attributes and characteristics be defined for social media?
• Can provenance be identified and leveraged to help influence analysis?
• In addition to veracity, what other connections can be made between provenance and elements of social media?
• What are the implications for privacy?
17. Twitter Ethnography
16-Jan | Aberdeen | Operator | FirstAberdeen | Morning Aberdeen, our control room team are reporting full service out there today, with only minor delays at the moment | 1.01
16-Jan | Aberdeen | Public | mikewareham | @FirstAberdeen just realised your driver gave me a single ticket this morning when I asked for, and paid for, a return! Ridiculous! | 2.01
16-Jan | Aberdeen | Operator | FirstAberdeen | @mikewareham Hi Mike, sorry about that - can you email us the details to customer.services@firstgroup.com and we'll investigate. Thanks | 2.02
16-Jan | Aberdeen | Public | dalgarnoamanda | @FirstAberdeen thanks to the no3 bus driver that's waited for me three days in a row while I run for the bus. Much appreciated! | 3.01
16-Jan | Aberdeen | Operator | FirstAberdeen | @dalgarnoamanda You're welcome - do you have a bus number so that we can pass on your thanks to the driver? | 3.02
16-Jan | Aberdeen | Public | dalgarnoamanda | @FirstAberdeen its the no3 that is on rosemount at the moment going towards town | 3.03
16-Jan | Aberdeen | Operator | FirstAberdeen | @dalgarnoamanda That bus isn't tracking at the moment, but the bus number's on your ticket - should be 5 digits starting with a 6 | 3.04
16-Jan | Aberdeen | Public | dalgarnoamanda | @FirstAberdeen hold on its 62191 | 3.05
16-Jan | Aberdeen | Operator | FirstAberdeen | @dalgarnoamanda That's the one - thanks for that, we'll pass on your compliments | 3.06
17-Jan | Aberdeen | Operator | FirstAberdeen | It's Friday Aberdeen! Yay! 2 minor RTC's at Nigg and Mounthooly are causing some delays to services 18 and 11,20 & 23 at the moment | 4.01
17-Jan | Aberdeen | Operator | FirstAberdeen | But apart from that, it's a quietish day on Aberdeen's roads with full service and only minor delays - let's hope it stays that way! | 5.01
16-Jan | Aberdeen | Public | AndrewWatt7 | @FirstAberdeen is there a bus that go's from union square to pittordrie football ground? | 6.01
17-Jan | Aberdeen | Operator | FirstAberdeen | @AndrewWatt7 Hi Andrew, Ser 13 will take you to the back of the ground, or a 1, 2 or X40 will drop you on King St and you can walk through | 6.02
17-Jan | Aberdeen | Public | AndrewWatt7 | @FirstAberdeen okay thanks | 6.03
16-Jan | Aberdeen | Public | FraserMacaulay | @FirstAberdeen where did you get your no.17 driver? The moon? Absolute **** | 7.01
17-Jan | Aberdeen | Operator | FirstAberdeen | @FraserMacaulay Hi Fraser, can you email us the details to customer.services@firstgroup.com and we'll have a word with the driver. Thanks | 7.02
• Contextual inquiry
• Digital diary study
• Content analysis
18. Thanks …
• Team members:
– Chris Baillie, David Corsar, Milan Markovic, Edoardo Pignotti, Paul Gault
• Collaborators:
– Caitlin Cottrill, John Nelson, Jillian Anable
• Partners:
• Funders: