Seminar at the Polimi, Lecco site, on Open Data and its relation to Linked Data, Open Government Data, and Big Data. It addresses Open Data for prosumers and for "men in the middle" (those who build information systems that solve issues of (open) data publication). The first part of the seminar is dedicated to some Open Data examples, to the definition of Open Data, and to some Open Data publication examples. The second part of the seminar is dedicated to the issues of opening data and to my personal experience in opening data for the Autonomous Province of Trento and for the Joint Research Centre of the European Commission.
1. Lorenzino Vaccari 22/06/2016
About Open Data
(Please interrupt me and ask questions!)
lorenzino.vaccari@gmail.com
lorenzino.vaccari@jrc.ec.europa.eu
Lorenzino Vaccari
Seminar at POLIMI@Lecco
3. Lorenzino Vaccari 22/06/2016
Part 1: for data prosumers
• What are Open Data useful for?
• What is Open Data?
• Why are Open Data useful?
• How is Open Data related to Open Government Data and Big Data?
• The Open Data movement
5. Lorenzino Vaccari 22/06/2016
Open Source & Open Data together to tackle
humanitarian projects and economic development
HOT: Humanitarian OpenStreetMap Team
http://hot.openstreetmap.org
11. Lorenzino Vaccari 22/06/2016
“Open data is data that can be freely used, reused and
redistributed by anyone – subject only, at most, to
the requirement to attribute and sharealike.” *
* (Source: http://opendatahandbook.org/guide/en/what-is-open-data/)
12. Lorenzino Vaccari 22/06/2016
“open” =
• use
• reuse
• redistribution
• commercial reuse
• derivative works
BUT, may require:
• attribution
• share alike
J. Gray (OKF): http://www.slideshare.net/jwyg/open-government-data-what-why-how
13. Lorenzino Vaccari 22/06/2016
“Open” data
• Open License
• Free
• Open Access, e.g.:
• No registration
• No co-authorship
• Direct access (no services)
• ….
https://unsplash.com/@ryanmoreno
14. Lorenzino Vaccari 22/06/2016
Open License
• A license should be compatible with other
open licenses.
• A license is open if its terms satisfy the
following conditions...
https://unsplash.com/@rzunikoff
15. Lorenzino Vaccari 22/06/2016
Open license:
Required Permissions
The license must irrevocably permit (or allow) the following:
Use: The license must allow free use of the licensed work.
Redistribution: The license must allow redistribution of the licensed work, including sale,
whether on its own or as part of a collection made from works from different sources.
Modification: The license must allow the creation of derivatives of the licensed work and allow
the distribution of such derivatives under the same terms of the original licensed work.
Separation: The license must allow any part of the work to be freely used, distributed, or
modified separately from any other part of the work or from any collection of works in which it
was originally distributed. All parties who receive any distribution of any part of a work within
the terms of the original license should have the same rights as those that are granted in
conjunction with the original work.
Compilation: The license must allow the licensed work to be distributed along with other
distinct works without placing restrictions on these other works.
Non-discrimination: The license must not discriminate against any person or group.
Propagation: The rights attached to the work must apply to all to whom it is redistributed
without the need to agree to any additional legal terms.
Application to Any Purpose: The license must allow use, redistribution, modification, and
compilation for any purpose. The license must not restrict anyone from making use of the work
in a specific field of endeavor.
No Charge: The license must not impose any fee arrangement, royalty, or other compensation
or monetary remuneration as part of its conditions.
http://opendefinition.org/od/2.1/en/
16. Lorenzino Vaccari 22/06/2016
Open license:
Acceptable Conditions
The license must not limit, make uncertain, or otherwise diminish the permissions
required in Section 2.1 except by the following allowable conditions:
Attribution: The license may require distributions of the work to include attribution of
contributors, rights holders, sponsors, and creators as long as any such prescriptions are not
onerous.
Integrity: The license may require that modified versions of a licensed work carry a different
name or version number from the original work or otherwise indicate what changes have been
made.
Share-alike: The license may require distributions of the work to remain under the same license
or a similar license.
Notice: The license may require retention of copyright notices and identification of the license.
Source: The license may require that anyone distributing the work provide recipients with access
to the preferred form for making modifications.
Technical Restriction Prohibition: The license may require that distributions of the work remain
free of any technical measures that would restrict the exercise of otherwise allowed rights.
Non-aggression: The license may require modifiers to grant the public additional permissions
(for example, patent licenses) as required for exercise of the rights allowed by the license. The
license may also condition permissions on not aggressing against licensees with respect to
exercising any allowed right (again, for example, patent litigation).
http://opendefinition.org/od/2.1/en/
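The two lists above (required permissions, acceptable conditions) can be read as a simple checklist. The following is only a minimal sketch of that reading, not an official tool: the lower-case labels are taken from the headings quoted above, while the function names and the idea of describing a license as two sets of labels are assumptions made for illustration.

```python
# Labels follow the Open Definition 2.1 headings quoted on the slides above.
REQUIRED_PERMISSIONS = {
    "use", "redistribution", "modification", "separation", "compilation",
    "non-discrimination", "propagation", "application to any purpose",
    "no charge",
}
ACCEPTABLE_CONDITIONS = {
    "attribution", "integrity", "share-alike", "notice", "source",
    "technical restriction prohibition", "non-aggression",
}


def check_license(permissions, conditions):
    """Return (missing permissions, disallowed conditions) as sorted lists.

    `permissions` and `conditions` are hypothetical, hand-made descriptions
    of a license; a real license text still needs legal review.
    """
    missing = REQUIRED_PERMISSIONS - set(permissions)
    disallowed = set(conditions) - ACCEPTABLE_CONDITIONS
    return sorted(missing), sorted(disallowed)


def is_open(permissions, conditions):
    """True when nothing required is missing and no extra condition is imposed."""
    missing, disallowed = check_license(permissions, conditions)
    return not missing and not disallowed
```

For example, a license that grants every required permission but adds a "non-commercial" condition is reported as not open, because that condition is not in the list of acceptable ones.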
17. Lorenzino Vaccari 22/06/2016
Open “Data”
Best practices:
• Primary source
• Timely
• Open format
• Updated and complete
• Machine readable
• ...
21. Lorenzino Vaccari 22/06/2016
“Splendid! The data is accessible on the Web in a structured way (that is, machine-readable), however, the data is still locked-up in a document. To get the data out of the document you depend on proprietary software.”
make it available as structured data (e.g., Excel instead of image scan of a table)
22. Lorenzino Vaccari 22/06/2016
DateTime,MC
2016-01-01 00:00:00.000,58.808331
2016-01-01 00:10:00.000,59.374001
2016-01-01 00:20:00.000,58.720833
2016-01-01 00:30:00.000,57.98
2016-01-01 00:40:00.000,57.606003
2016-01-01 00:50:00.000,56.762001
2016-01-01 01:00:00.000,55.659184
2016-01-01 01:10:00.000,54.94286
2016-01-01 01:20:00.000,54.263268
2016-01-01 01:30:00.000,52.922451
2016-01-01 01:40:00.000,53.167347
2016-01-01 01:50:00.000,54.807999
2016-01-01 02:00:00.000,57.063263
2016-01-01 02:10:00.000,58.257141
2016-01-01 02:20:00.000,58.035999
2016-01-01 02:30:00.000,57.861225
2016-01-01 02:40:00.000,57.07143
2016-01-01 02:50:00.000,56.338776
2016-01-01 03:00:00.000,55.452
….
http://data.jrc.ec.europa.eu/dataset/jrc-abcis-ap-pm10mc-2016
“Excellent! The data is not only available via the Web but now everyone can use the data easily. On the other hand, it’s still data on the Web and not data in the Web.”
make it available in a non-proprietary open format (e.g., CSV as well as Excel)
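A short sketch of why a non-proprietary format helps: the CSV sample above can be read with nothing but the Python standard library. The first seven rows are copied from the sample; the function name `hourly_means` and the choice of an hourly average are assumptions made for illustration, and the meaning and unit of the `MC` column are not stated on the slide.

```python
import csv
import io
from datetime import datetime
from statistics import mean

# First rows of the sample shown above (10-minute readings).
SAMPLE = """DateTime,MC
2016-01-01 00:00:00.000,58.808331
2016-01-01 00:10:00.000,59.374001
2016-01-01 00:20:00.000,58.720833
2016-01-01 00:30:00.000,57.98
2016-01-01 00:40:00.000,57.606003
2016-01-01 00:50:00.000,56.762001
2016-01-01 01:00:00.000,55.659184
"""


def hourly_means(csv_text):
    """Group the readings by hour and return {hour: mean of MC}."""
    by_hour = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        ts = datetime.strptime(row["DateTime"], "%Y-%m-%d %H:%M:%S.%f")
        hour = ts.replace(minute=0, second=0, microsecond=0)
        by_hour.setdefault(hour, []).append(float(row["MC"]))
    return {hour: mean(values) for hour, values in sorted(by_hour.items())}


if __name__ == "__main__":
    for hour, value in hourly_means(SAMPLE).items():
        print(hour.isoformat(), round(value, 3))
```

No spreadsheet application or vendor library is involved, which is the point of the third star.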
24. Lorenzino Vaccari 22/06/2016
“Brilliant! Now it’s data in the Web linked to other data. Both the consumer and the publisher benefit from the network effect.”
https://data.europa.eu/euodp/en/data/dataset/jrc-names
link your data to other data to provide context
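The steps quoted on these slides belong to the five-star open data scheme. As a rough sketch, the rating can be computed by counting how many consecutive steps a dataset satisfies. The slides only show the steps for structured data, open format, and linking; the first step (on the Web under an open license) and the fourth (URIs to denote things) are filled in here from the general five-star scheme. The `Dataset` class and its field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    """Hypothetical description of a published dataset."""
    on_web_open_license: bool  # step 1: on the Web, under an open license
    structured: bool           # step 2: machine-readable structured data
    open_format: bool          # step 3: non-proprietary open format
    uses_uris: bool            # step 4: URIs denote things
    linked: bool               # step 5: linked to other data


def stars(ds):
    """Count stars; each star requires all the previous ones."""
    steps = (ds.on_web_open_license, ds.structured, ds.open_format,
             ds.uses_uris, ds.linked)
    count = 0
    for satisfied in steps:
        if not satisfied:
            break
        count += 1
    return count
```

Under these assumptions, a scanned table published with an open license gets one star, the same table as an Excel file gets two, and as CSV three.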
27. Lorenzino Vaccari 22/06/2016
Open Data Benefits
● Open data is the knowledge base to:
● Improve economic growth and
entrepreneurship based on the development of
digital services reusing Public Sector Information
● Respond to social needs through the publication of
innovative services and applications
● Reduce the cost of public
administrative activities within Public – Private
Partnerships (PPP)
● Improve the transparency of the activities of
public institutions and the participation of
citizens in these activities
28. Lorenzino Vaccari 22/06/2016
Economic Growth
“Today, the cumulative value
of products and services
derived from open access
to weather data is
estimated at $15 billion.”
http://www.accuweather.com/
http://www.socrata.com/blog/economic-impact-open-data/
32. Lorenzino Vaccari 22/06/2016
Open Government Data
“The three principles of
transparency, participation,
and collaboration form the
cornerstone of an open
government”
Barack Obama, 8/12/2009
https://www.whitehouse.gov/sites/default/files/omb/assets/memoranda_2010/m10-06.pdf
34. Lorenzino Vaccari 22/06/2016
Big Data & Open Data
3V’s
Variety
• Structured
• Unstructured
• Semi-structured
• …
Volume
• Terabytes
• Records
• Transactions
• Tables, Files
Velocity
• Batch
• Real Time
• Streams
• Near-time
Open Data is often one of the sources for Big Data
35. Lorenzino Vaccari 22/06/2016
State of the art: the Open Data movement
What is happening around us? Some examples...
● Globally
● Europe
● Italy
● Locally
36. Lorenzino Vaccari 22/06/2016
Open Data Charter - G8 (12/07/2013)
The principles are:
● Open Data by Default
● Quality and Quantity
● Useable by All
● Releasing Data for Improved Governance
● Releasing Data for Innovation
http://opensource.com/government/13/7/open-data-charter-g8
https://www.gov.uk/government/publications/open-data-charter/g8-open-data-charter-and-technical-annex
38. Lorenzino Vaccari 22/06/2016
The GEOSS portal
The GEOSS CORE data principles
● Full and Open Exchange of Data, recognizing Relevant International Instruments and National Policies
● Data and Products at Minimum Time Delay
● Free of Charge or Minimal Cost for Research and Education
http://www.geoportal.org/web/guest/geo_home
39. Lorenzino Vaccari 22/06/2016
OpenStreetMap: OD & Crowdsourcing
OpenStreetMap is a free map of
the world, created by someone
like you
“OpenStreetMap project creates and
provides geographical data, such as road
maps, freely available to anyone.
Behind the establishment and growth of
the project have been restrictions on use
or availability of map information
across much of the world and the advent
of inexpensive portable satellite
navigation devices”
https://www.openstreetmap.org
41. Lorenzino Vaccari 22/06/2016
OGD in Europe - Pan European
http://www.europeandataportal.eu/en/
Connecting Europe Facility
launches second call
(16/05/2016)
The Connecting Europe
Facility (CEF) in Telecom is an
EU programme to facilitate
cross-border interaction
between public
administrations, businesses and
citizens, through the
deployment of Digital Service
Infrastructures. One of its aims
is to support projects which
contribute to the European
ecosystem of the deployed
interoperable and
interconnected digital services.
…
485,473 datasets found
46. Lorenzino Vaccari 22/06/2016
Open Data @Lecco?
● Check whether Lecco has an official Open Data
website:
○ Which datasets (domains)?
○ Which formats?
■ How many stars?
○ Which licenses?
■ Is the type of license clear for each dataset?
● Are there any other Open Data websites in Lecco?
○ Are there any universities which share
Open Data?
47. Lorenzino Vaccari 22/06/2016
Your Open Data (data provider)
● Do you think you could be an Open Data
provider? E.g. with the datasets of your
thesis?
● Would you like to share them openly?
● If not, why?
48. Lorenzino Vaccari 22/06/2016
Your Open Data (data consumer)
● Which data are you working on?
○ Where do you get them from?
● Which data would you like to find on
the Internet?
○ Are the datasets you download fine for
you? If not, why?
50. Lorenzino Vaccari 22/06/2016
Part 2: for men in the middle
● Open Data Issues
● Two experiences:
○ Autonomous Province of Trento
■ The story started with GeoData…
■ Now “Open Data in Trentino”: http://dati.trentino.it
■ Community building
○ European Commission: Joint Research Centre
■ http://data.jrc.ec.europa.eu
● Want to learn more?
53. Lorenzino Vaccari, Juan Pane 22/06/2016
Organizational Barriers
● Not ready
● Lack of resources (IT, Human)
● Don’t want to be ready
http://montcomediation.org/images/MCMC_MyWayYourWay.jpg
54. Lorenzino Vaccari, Juan Pane 22/06/2016
Legal barriers
● Open the Data
○ All the data that was produced using public money has to be made publicly available (with exceptions)
● vs Privacy
○ You cannot open data that could allow correlation of private personal data
http://s177.photobucket.com/user/sealth2828/media/gavel.jpg.html
55. Lorenzino Vaccari, Juan Pane 22/06/2016
Adoption barriers
● Data is not contextualized
● Opening data is a complex task, opening
cleaned data is even more complex.
● Unclear licenses
http://www.thepadrino.com/2011/01/defendius-labyrinth-security-lock.html
57. Lorenzino Vaccari, Juan Pane 22/06/2016
Context Barriers
● Privileged access to data
● Transparency is bad for fraudulent business
http://img.gawkerassets.com/img/182n8vzdlg1iojpg/original.jpg
58. Lorenzino Vaccari, Juan Pane 22/06/2016
Barriers
● Zuiderwijk et al. 2010
● Listed 118 socio-technical impediments for
opening data in the literature such as:
○ Findability
○ Usability
○ Understandability
○ Quality
○ Linking
○ Comparability and compatibility
○ Metadata
○ ….
59. Lorenzino Vaccari, Juan Pane 22/06/2016
Meanwhile at the JRC… The Data are MINE!
Congratulations on the presentation! I am
curious about the data you used! Are
these datasets freely available? Would
you like to publish them as Open Data
in the catalog we are creating at the JRC
level? Here is a draft version: http://data.jrc.ec.europa.eu/.
Cheers,
Lorenzino
-------------------
Hi Lorenzino, sorry but I am not allowed to
publish my dataset.
Cheers,
Xyz
63. Lorenzino Vaccari 22/06/2016
Autonomous Province of Trento
● The story started with GeoData…
● Now “Open Data in Trentino”: http://dati.trentino.it
● Community building
66. Lorenzino Vaccari 22/06/2016
The “Open Data in Trentino” project
• The “Open Data in Trentino” project is a 3-year initiative
aimed at developing an open data infrastructure to
enhance Service Innovation for Trentino following the
PAT strategy for services innovation enabled by ICT. The
project will be developed within a partnership between
Trento RISE and the Autonomous Province of Trento
(PAT) according to the PAT innovation model
• Goals
• Improved quality of life for citizens
• Open Data and local businesses
• Transparency
• Improved efficiency and productivity
69. Catalogue
Juan Pane, Lorenzino Vaccari 08/10/2013
The Open Knowledge Foundation (OKF) is a
non-profit organisation founded in 2004 and
dedicated to promoting open data and open
content in all their forms – including government
data, publicly funded research and public domain
cultural content.
http://okfn.org
74. Lorenzino Vaccari, Juan Pane 22/06/2016
Create Community
http://media.gettyimages.com/photos/members-of-the-colla-vella-de-valls-climb-up-as-they-construct-a-picture-id153610809
82. Lorenzino Vaccari, Anders Friis-Christensen, Andrea Perego 22/06/2016
JRC and Open Access
As continuation of JRC's efforts to make available
and transparent to the public the scientific
knowledge produced, in 2014 the JRC will roll out
its Open Access strategy for its publications
JRC Management Plan 2014
Commission Decision on the reuse of Commission documents (2011/833/EU)
Open Access in EC and beyond
● for scientific publications/data within Horizon 2020 and by
other relevant initiatives (e.g. Research Data Alliance)
● overall trend for public move to open data (G8 charter,
INSPIRE..)
83. Lorenzino Vaccari 22/06/2016
JRC Open Data project
JRC Data Policy
- Open Data principles
- Data acquisition principles
- Data management principles
- Implementation principles
JRC Data Catalogue
Containing JRC datasets related to, e.g., Soil, Water, Air quality, Marine, Biodiversity, etc.
http://data.jrc.ec.europa.eu
EU Open Data portal
A single access point to a growing range of data from the institutions and other bodies of the EU
https://open-data.europa.eu
Commission Decision on the reuse of Commission documents (2011/833/EU)
84. Lorenzino Vaccari, Anders Friis-Christensen 22/06/2016
86. Lorenzino Vaccari, Anders Friis-Christensen 22/06/2016
Project Scope
JRC Data Policy
Data Policy Implementation Guidelines
Software components (e.g. data dissemination)
Data
Open Data
Applies to
95. Lorenzino Vaccari 22/06/2016
Open Data event in Lecco (8/6/2016)
http://www.comune.lecco.it/index.php/archivio-news/23-news-dal-comune/2437-convegno-open-data-e-sharing-economy
96. Lorenzino Vaccari 22/06/2016
Open Data & Smart Cities (EU ODP)
Analytical Report 4: Open Data in Cities
http://www.europeandataportal.eu/sites/default/files/edp_analytical_report_n4_-_open_data_in_cities_v1.0_final.pdf
97. Lorenzino Vaccari 22/06/2016
A question for you: is it difficult to
use a data catalogue? Why?
From the user's point of view (what I found):
● I do not know about it
● I cannot find what I need
○ “Spaghetti” catalogues
■ many records
■ not clear what is inside (no clear classification)
■ too few datasets
● I do not receive updates
○ On datasets I am interested in
● Even if I find it
○ I cannot access it
■ Broken links, access barriers (registrations,…)
○ Is the dataset the latest version?
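Some of the user problems listed above could be spotted by the catalogue maintainers themselves with a simple audit of the catalogue records. The sketch below only illustrates the idea: the record fields (`license`, `download_url`, `requires_registration`, `modified`) and the one-year staleness threshold are assumptions, not the schema of any real catalogue, and checking for broken links would additionally require fetching each URL.

```python
from datetime import date


def audit_record(record, today, max_age_days=365):
    """Return a list of user-facing problems found in one catalogue record.

    `record` is a plain dict with hypothetical keys; adapt the names to the
    metadata schema actually used by the catalogue.
    """
    problems = []
    if not record.get("license"):
        problems.append("unclear license")
    if not record.get("download_url"):
        problems.append("no direct download link")
    if record.get("requires_registration"):
        problems.append("access barrier: registration")
    modified = record.get("modified")
    if modified is None or (today - modified).days > max_age_days:
        problems.append("possibly not the latest version")
    return problems


if __name__ == "__main__":
    example = {
        "title": "Example dataset",  # hypothetical record
        "license": "",
        "download_url": "http://example.org/data.csv",
        "requires_registration": True,
        "modified": date(2014, 1, 1),
    }
    for problem in audit_record(example, today=date(2016, 6, 22)):
        print(problem)
```

Passing `today` explicitly keeps the check reproducible, so the same record always yields the same report.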
98. Lorenzino Vaccari 22/06/2016
Questions?
Thanks For Your Attention!!!!
Acknowledgments:
● Anders Friis-Christensen
● Maurizio Napolitano
● Juan Pane
● Andrea Perego
Lorenzino Vaccari
lorenzino.vaccari@gmail.com
lorenzino.vaccari@jrc.ec.europa.eu