Web Observatories, e-Research and the Importance of Collaboration. WST 2014 Webinar series, 20th March 2014
See Web Science Trust http://webscience.org/
1. Web Observatories, e-Research and the Importance of Collaboration
David De Roure
e-Research Centre, University of Oxford
ESRC Strategic Adviser for Data Resources
@dder
2. Overview
1. Big Data for research (UK perspective)
2. Social Media Data is distinctive
3. Several shifts in how scholarship is conducted
4. And hence the context for Web Observatories
3. Edwards, P. N., et al. (2013) Knowledge Infrastructures: Intellectual Frameworks
and Research Challenges. Ann Arbor: Deep Blue.
12. The Big Picture
[Diagram. Axes: More people; More machines. Labels: Big Data; Big Compute; Conventional Computation; "Big Social"; Social Networks; e-infrastructure; online R&D; Big Data Production & Analytics; deeply about society]
13. RCUK and Big Data
▶ "Big data is a term for a collection of datasets
so large and complex that it is beyond the
ability of typical database software tools to
capture, store, manage, and analyse them.
'Big' is not defined as being larger than a
certain number of 'bytes' because as
technology advances over time, the size of
datasets that qualify as big data will also
increase" (RCUK)
15. Research benefits of new data
▶ Undertaking research on pressing policy-related issues
without the need for new data collection
• Food consumption, social background and obesity
• Energy consumption, housing type and climatic
conditions
• Rural location, private/public transport alternatives and
incomes
• School attainment, higher education participation,
subject choices, student debt and later incomes
▶ New data such as social media enable us to ask big
questions, about big populations, and in real time – this is
transformative
18. Research questions
– Social and political
movements
– Political participation and
trust
– Individual,
group/community and
national identities
– Personal, local, national
and global security
(including crime, law
enforcement and defence)
– Rural development and
'Urban Transformations'
– Crisis prevention,
preparedness, response,
management and recovery
– Education
– Health and wellbeing
(including ageing)
– Environment and
sustainability
– Economic growth and
financial markets (including
employment and the labour
market)
28. Real life is and must be full of all kinds of social
constraint – the very processes from which
society arises. Computers can help if we use
them to create abstract social machines on the
Web: processes in which the people do the
creative work and the machine does the
administration... The stage is set for an
evolutionary growth of new social engines. The
ability to create new forms of social process
would be given to the world at large, and
development would be rapid.
Berners-Lee, Weaving the Web, 1999 (pp. 172–175)
The Order of Social Machines
29. SOCIAM: The Theory and Practice of Social Machines is funded by the UK Engineering and Physical Sciences Research Council
(EPSRC) under grant number EP/J017728/1 and comprises the Universities of Southampton, Oxford and Edinburgh. See sociam.org
33. Big data elephant versus sense-making network?
The challenge is to foster the co-constituted socio-technical
system on the right, i.e. a computationally-enabled
sense-making network of expertise, data, models, visualisations
and narratives
Iain Buchan
36. The Observatory Context
▶ New forms of data enable us to answer old
questions in new ways and to address entirely
new questions
– Especially about (new) social processes
▶ There are multiple shifts occurring:
– Academia and business
– Volumes and velocity of data
– Real-time analytics
– Computational infrastructure
– Dataflows vs datasets (and curation infrastructure)
– Correlation vs causation
– Increasing automation and ethical implications
– Machine-to-Machine in Internet of Things
39. WOW2014 Web Observatory Workshop at WWW2014
Keynote: Professor Dame Wendy Hall, The Web Observatory: A Web Science Perspective
Huanbo Luan and Tat-Seng Chua, The Design of a Live Social Observatory System
Matthew Weber, Observing the Web by Understanding the Past: Archival Internet Research
Mizuki Oka, Yasuhiro Hashimoto and Takashi Ikegami, Fluctuation and Burst Response in Social Media
Gareth Beeston, Manuel Leon, Caroline Halcrow, Xianni Xiao, Lu Liu, Jinchuan Wang, Jinho Jay Kim and Kunwoo Park, Humour Reactions in Crisis: A Proximal analysis of Chinese posts on Sina Weibo in Reaction to the Salt Panic of March 2011
Robert Simpson, Kevin Page and David De Roure, Zooniverse: Observing the World's Largest Citizen Science Platform
Paul Booth, Visualising Data in Web Observatories: A Proposal for Visual Analytics Development & Evaluation
Marie Joan Kristine Gloria, John S. Erickson, Joanne S. Luciano, Deborah McGuinness and Dominic Difranzo, Legal and Ethical Considerations: Step 1b in Building a Health Web Observatory
Ian Brown, Wendy Hall and Lisa Harris, Towards a Taxonomy for Web Observatories
Posters:
Reuben Binns, Observation without Surveillance: Web Observatories and Privacy
Besnik Fetahu, Stefan Dietze, and Wolfgang Nejdl, What's all the Data about? - Creating Structured Profiles of Linked Data on the Web
Caroline Halcrow, Jinchuan Wang, Xianni Xiao, Lu Liu, Scaling and geo-locating commonly used humour tags in Weibo
Shuangjie Li, Zhigang Wang and Juanzi Li, Observation on Heterogeneous Online Wikis of Different Languages
Panel: Web Observatory interoperability and standards, moderator David De Roure
Panellists: Wendy Hall (Web Science Trust), Jim Hendler (RPI), Thanassis Tiropanis (University of
wow.oerc.ox.ac.uk
EPSRC: Under ‘Big Data’ we are considering both very large and also complex data, including dynamic and heterogeneous data from all the various sources including sensors, social media, industry etc.
ESRC was allocated £64m, and much of this is being used to set up the ESRC Big Data Network. The ESRC’s Big Data Network will support the development of a network of innovative investments which will strengthen the UK’s competitive advantage in Big Data for the social sciences. The core aim of this network is to facilitate access to different types of data and thereby stimulate innovative research and develop new methods to undertake that research. Note, however, that the diagram is only illustrative in terms of how the UKDS and ADS will work across (that is still under discussion), and only illustrative in the number of Business and Local Government Data Research.
This network has been divided into three phases. In Phase 1 of the Big Data Network the ESRC has invested in the development of the Administrative Data Research Network (ADRN), which will provide access to de-identified administrative data collected by government departments for research use. This is the focus of this meeting and of all your grants.
A few words about Phases 2 and 3 before we pass to Vanessa to talk about the ADRN some more. Phase 2 is currently being commissioned and will deal primarily with business data and/or local government data. Phase 3, further details of which will be released in the late autumn / winter, will focus primarily on third sector data and social media data.
It is expected that there will be opportunities for interaction across all elements of the ESRC Big Data Network. They will all work together around the wider objectives of facilitating access to different forms of data and of ensuring that maximum impact is generated from the use of that data, for the mutual benefit of data owners and researchers and, through the research facilitated by the Network, to benefit society and the economy more generally.
Thanks to Simon Hettrick for additional input to this slide.