10th ITS European Congress, Helsinki, Finland 16–19 June 2014 Paper Number
Crossing borders with open data
Kasvi, Jyrki*
& Salo, Jari
TIEKE Finnish Information Society Development Centre, Finland
Salomonkatu 17A, 00100 Helsinki, Finland
+358 50 4309360, jyrki.kasvi@tieke.fi
Abstract
Mobility apps for cross border traffic require access to different public data
repositories in all the countries the apps are to be used in. Opening this data is the most
effective way of making it available to app developers and facilitating service innovation.
Nevertheless, opening public data is not sufficient on its own. Both the international and
commercial nature of mobility apps set further requirements for public authorities opening
their data. These requirements include international data harmonization, open data
accessibility, integrity and safety. Finally, data roaming costs may be prohibitive for users of
international mobility apps, and may require regulation.
Keywords: Open data, cross border mobility, smart transport.
Introduction
Cross border traffic and transportation call for mobile services and applications that operate
fluently and without interruptions while people and cargo move from one country to another.
In order to develop these mobility apps or services, application developers and service
providers require a wide range of public and private information from various authorities in
several countries: road weather, traffic conditions and disruptions, public transport
timetables, border station queues, roadside services and cultural events, just to mention a few.
While the so-called PSI directive on the re-use of public sector information (Directive
2003/98/EC) has eased access to public data within the European Union, an app developer may
have to negotiate with authorities in 28 countries in order to cover the whole EU market.
What is more, when passengers and cargo move from the Union area to other countries, fluent
access to data becomes even more crucial.
The most effective solution is to open the data required by mobile traffic and transportation
services in all countries, particularly within the European common market consisting of the
European Union and the EFTA countries. One can even say that pervasive open data is a
precondition for a working mobility service market. While authorities of some EU members
like the United Kingdom and Denmark have already extensively opened their public data
repositories, many others are only just beginning. [1]
In addition, reciprocal access to corresponding public data in EU neighbours like Russia and
Turkey should be negotiated at Union level. Otherwise, the administrative burden becomes
too great for app and service developers, especially small and medium-sized technology
companies with limited legal resources.
Open data for mobility
While the PSI directive encourages public authorities to make as much of their data available
for general use as possible, data required for mobility services should have priority. In
addition to spurring economic growth [2], these services advance free movement of people,
products and services, which are among the key principles of the European Union.
As the end users of mobility apps range from hauliers to bus passengers, the apps require
access to a wide variety of public data. For example, a lorry on an international route may
involve routing apps for the driver, fleet management apps for the lorry company, logistics
apps for the haulier and queue management apps for customs authorities, all with their
particular needs for data and communication.
Different mobility users require different services requiring different data sets.
| Sector | User group | Applications and services | Data requirements |
|---|---|---|---|
| Citizens | Car drivers | Navigation, smart routing, weather service, smart parking, roadside service information, driving habit feedback | Geographic, weather, road condition, road works and maintenance, disruptions, car parks, traffic statistics, public and private services |
| Citizens | Passengers | Timetables, fares, ticket sales, lodging, service information | Timetables, reservation, disruptions |
| Private companies | Transport companies and hauliers | Fleet management, fare collection, logistics, cargo tracking, customs | Tax and customs records |
| Private companies | Road/rail/port operators | | Traffic flow, tracking |
| Public sector | Public national, EU and local authorities | Emergency services, traffic management, road maintenance, border station queue management | Traffic flow, incident detection |
While there is a variety of guidelines on how to open public data, good practices for its
utilization are less common. Some approaches stress that even open data is quite useless
without ideas on how to utilize it from the start, while others encourage opening of all
available data sets in order to see which data is most useful for developers and users.
Both approaches have merit, but the latter leaves more space for open innovation.
In the traditional, closed model, a service ‘innovation’ has to be defined before access to data
and data reparations can be negotiated between the authority and the app developer.
By contrast, the open model allows experimentation and development of new, even
surprising ideas with low economic risks. What is more, authorities are often unaware of what
kinds of data people want and need most. As a result, opening of all the available data is often
recommended. [3]
Open data for app development
While open data presents authorities with definite technical and legal requirements, meeting these
requirements is not necessarily enough to bring about new commercial apps or
services. While there has been an abundance of new ideas and innovative prototypes based
on open data, the development of actual commercial mobility apps has been somewhat
disappointing. In order to overcome this challenge, the requirements of commercial
developers and their clients should also be taken into account when opening data meant for
mobility apps.
Common requirements for open data
The commonly used description by opendefinition.org defines open data as “A piece of data or
content is open if anyone is free to use, reuse, and redistribute it – subject only, at most, to the
requirement to attribute and/or share-alike.” What does this mean in practice?
Firstly, there should be no technical constraints on access to data. The data repositories opened
should have open application programming interfaces, use standard machine readable open data
formats and provide sufficient data transfer capacity.
Furthermore, there should be no legal constraints on access to data. Data is made available
with a license that allows data to be exploited and distributed without restrictions by anybody
for any purpose. For example, Creative Commons licenses are among the most referenced
open licensing schemes. Open licenses are based on the idea that the material is free to use,
provided that the terms of use are respected.
Finally, there should be no financial constraints on data use. The public authority opening the
data does not charge users of the data any access or usage fees.
What is more, the location, technical definition and terms of use of the data opened have to be
made easily available. Different kinds of data catalogues have been used to list and describe
opened data sets with metadata. However, while local developers know where to find their national
open data catalogue, the usefulness of such catalogues for cross border service development is
often negligible.
For example, while Finnish authorities have been diligently opening their data repositories for
a few years now, the Finnish national open data catalogue (footnote 1) is still available only in
Finnish. Not even a Swedish catalogue is available, in spite of Swedish being the other official
language of Finland. As a result, the catalogue is practically unusable for foreign companies and
developers.
In order to facilitate international usage of open data, e.g. for cross border mobility apps, each
country should have a public authority responsible for maintaining local open data catalogues
and making national open data accessible not only for local companies but also for developers
from other countries.
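As a rough illustration of what a catalogue entry usable across borders might contain, the sketch below models one data set with titles in several languages and selects the best available title for a developer's preferred languages. The field names, the data set and the address are hypothetical and are not taken from any existing catalogue.

```python
from dataclasses import dataclass


@dataclass
class CatalogueEntry:
    """One opened data set as it might be listed in a catalogue (hypothetical schema)."""
    identifier: str
    titles: dict        # language code -> human-readable title
    data_format: str    # for example "JSON"
    licence: str        # for example "CC BY 4.0"
    access_url: str     # where the API or download can be found

    def title_for(self, preferred):
        """Return the title in the first preferred language that is available."""
        for lang in preferred:
            if lang in self.titles:
                return self.titles[lang]
        # Fall back to any available title rather than failing outright.
        return next(iter(self.titles.values()))


entry = CatalogueEntry(
    identifier="road-weather",
    titles={"fi": "Tiesää", "sv": "Vägväder", "en": "Road weather"},
    data_format="JSON",
    licence="CC BY 4.0",
    access_url="https://data.example.org/road-weather",  # placeholder address
)

# A developer who reads German or English gets the English title.
print(entry.title_for(["de", "en"]))
```

Keeping the language of each title explicit is one possible way to let a single catalogue serve both local and foreign developers.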
Restrictions to openness
Fully open public data will continue to be rare. The reasons behind different degrees of
openness vary, and the willingness of public agencies to provide datasets varies not just country
by country, but also agency by agency.
For example, data sources often include information that falls under
privacy policy, and therefore anonymization is needed before publication. This is often the
case with traffic related data, as data on the movement of vehicles may be used to track the
movement of their passengers. For example, congestion fee data may have to be anonymized
before opening. [4] If the need for anonymization has not been taken into account when the
database was created, the cost can be considerable.
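One simple way to reduce the tracking risk is to publish counts per charging point and hour instead of individual passages, and to suppress very small counts. The sketch below illustrates this under stated assumptions: the record layout, the example records and the threshold are hypothetical, and this aggregation is not the scheme described in [4].

```python
from collections import Counter

# Hypothetical raw records: (number plate, charging point, hour of day).
passages = [
    ("ABC-123", "north-gate", 8),
    ("XYZ-789", "north-gate", 8),
    ("ABC-123", "south-gate", 17),
    ("KLM-456", "north-gate", 8),
]

MIN_COUNT = 3  # assumed suppression threshold; a real value is a policy decision


def aggregate(records, min_count=MIN_COUNT):
    """Count passages per (charging point, hour) and drop the number plates.

    Cells with fewer than min_count passages are suppressed, because very
    small counts can still point to individual vehicles.
    """
    counts = Counter((point, hour) for _plate, point, hour in records)
    return {cell: n for cell, n in counts.items() if n >= min_count}


print(aggregate(passages))
```

Aggregation of this kind lowers the value of the data for some uses, which is one reason why the need for anonymization is best considered when a repository is designed.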
In some cases, information in the data sets could be considered to be a risk for national or
system security. This might appear more often in conjunction with maps of critical
infrastructure, such as gas pipelines or food logistics systems. Even parts of the transportation
system infrastructure may be considered sensitive.
Sometimes, the public data repositories contain information that can be considered to be
someone’s intellectual property or business asset, or part of one. For example,
hauling companies may treat movements of their cargo as trade secrets. In some cases, the
‘ownership’ of public information may be unclear and opening it may constitute a legal risk.
Footnote 1: http://www.suomi.fi/suomifi/tyohuone/yhteiset_palvelut/avoin_data/
It is an old saying that information is power. There can be mental barriers to opening data to
everyone and to new innovation and usage. These barriers are often high among agencies and
administrations used to having clear borderlines of responsibilities and fields of operation.
Although an authority may be required to gather data for its own operations, the data almost never
arrives in a form that can be automatically made open. Sometimes, the data needs to be
anonymized or aggregated to alleviate privacy concerns, at a cost. In other cases, it needs to
be re-formatted and placed in open data sources in specified formats in order to make the data
useful to outsiders. New, more efficient connections to the Internet may also be required in order
to allow downloads of large datasets without disturbing other Internet services provided by
the authority.
Quality requirements for open data
Further requirements for open data targeted at commercial mobility apps involve different
aspects of the quality of the data opened. Data should be made available with a statement
regarding its quality, as this will allow potential users to determine whether the
data is suitable for their purposes as is, or whether additional measures are required. In
practice, the various aspects of data quality address two issues: data accessibility and data
integrity.
The first quality issue is reliable data accessibility. Commercial apps may require the
data to be available 24/7 with 99.98% reliability, but the public authority providing the data
may make it available only during office hours and with 99.5% reliability. As the authority
cannot be required to improve data accessibility beyond its own operational needs, the service
developer has to take care of improving accessibility at its own cost.
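To make the difference between the two reliability figures concrete, the following sketch converts an availability percentage into expected downtime per year. It assumes that "reliability" means the share of time the data is reachable over a full year of 8760 hours, which is an interpretation and not something the figures above state.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years


def downtime_hours_per_year(availability_percent):
    """Expected yearly downtime for a given availability percentage."""
    return HOURS_PER_YEAR * (1 - availability_percent / 100)


for level in (99.5, 99.98):
    print(f"{level}% -> {downtime_hours_per_year(level):.2f} h/year")
```

Under this assumption, 99.5% corresponds to about 43.8 hours of downtime per year and 99.98% to about 1.75 hours, so the gap the service developer has to cover is roughly a factor of 25.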
What is more, authorities with open data may make changes to their data repositories, making
old data access routines obsolete. Authorities cannot be required to freeze their systems when
they open data. As a result, services based on open data may cease to operate if app providers
do not continuously maintain their data access routines.
The second data quality issue is data integrity. Users of commercial apps may require guarantees
of the flawlessness of the data used by the apps. As the apps may have considerable financial
impacts, the clients may also want to be able to trace the data back to its source in order to
ensure its authenticity and timeliness. For example, real estate prices are definitely influenced
by apps using police statistics of the neighbourhood [5]. In order to ensure data integrity, the
service provider using open data has to have sound information security. Responsibility
for the integrity of the service and for potential damages has to be defined.
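Tracing data back to its source can be supported by storing a small provenance record with every download. The sketch below is one possible approach, with hypothetical field names and a placeholder address: it records the source, the retrieval time and a SHA-256 checksum of the payload. A checksum only detects changes made after the download; proving that the data really came from the authority would additionally require something like a digital signature published by the authority.

```python
import hashlib
from datetime import datetime, timezone


def provenance_record(payload, source_url):
    """Describe where a downloaded data set came from and what it contained."""
    return {
        "source": source_url,
        "retrieved_at": datetime.now(timezone.utc).isoformat(),
        "sha256": hashlib.sha256(payload).hexdigest(),
    }


def is_unchanged(payload, record):
    """Check that the payload still matches the checksum stored at download time."""
    return hashlib.sha256(payload).hexdigest() == record["sha256"]


data = b'{"station": "example", "queue_minutes": 25}'
record = provenance_record(data, "https://data.example.org/border-queues")  # placeholder

print(is_unchanged(data, record))          # payload as downloaded
print(is_unchanged(data + b" ", record))   # payload altered afterwards
```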
As there is no completely error-free data, an error reporting and handling procedure is needed.
When app users encounter inaccuracies in open data, they should be able to report them to the
public authority maintaining the data repository.
The data quality requirements of commercial apps call for open data aggregators that collect open
data from public authorities and ensure its availability and integrity for commercial purposes.
Whether these aggregators will be private service providers or financed by the public sector
remains to be seen.
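One way an aggregator could improve availability is to keep the latest good copy of each data set and serve it while the original source is unreachable. The sketch below shows the idea only; the class, its parameters and the simulated source are hypothetical, and a production service would also need persistence, monitoring and integrity checks.

```python
import time


class CachingAggregator:
    """Serve the latest good copy of a data set when the source is unavailable."""

    def __init__(self, fetch, max_age_seconds):
        self._fetch = fetch              # callable returning fresh data or raising OSError
        self._max_age = max_age_seconds  # how old a cached copy may be when served
        self._cached = None
        self._cached_at = None

    def get(self, now=None):
        """Return (data, is_fresh); re-raise if no usable copy is available."""
        now = time.time() if now is None else now
        try:
            self._cached = self._fetch()
            self._cached_at = now
            return self._cached, True
        except OSError:
            # Source unreachable: fall back to the cache if it is recent enough.
            if self._cached_at is not None and now - self._cached_at <= self._max_age:
                return self._cached, False
            raise


calls = {"n": 0}


def flaky_fetch():
    """Simulated source that answers once and then goes offline."""
    calls["n"] += 1
    if calls["n"] > 1:
        raise OSError("source offline")
    return {"queue_minutes": 25}


aggregator = CachingAggregator(flaky_fetch, max_age_seconds=600)
print(aggregator.get(now=0))    # fresh copy from the source
print(aggregator.get(now=300))  # source is down, cached copy is served
```

Returning the freshness flag together with the data lets the app decide whether an older copy is still acceptable for its purpose.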
Care should be taken that these aggregators do not close access to original open data for
individual developers and data tinkerers, as they are a valuable source of innovation for
commercial apps developers, too. Comprehensive national and European data catalogues are
still required alongside aggregators.
Commercial open data ecosystem requires aggregators responsible for data quality.
Standards, interfaces and licences
Opening public data for mobility services is not enough. In order to facilitate mobility service
innovation, countries should harmonize traffic related data, for example between European
Union countries and between national administrative branches. App developers should not be
required to create different interfaces in order to access data in each and every EU country.
Union level coordination may be required in order to define and harmonize the standards and
interfaces used.
A positive relationship between open data and standards can be identified, as the value of data
increases the more widely it can be shared and utilized. In general, it is clear that
standards-based data can be utilized, aggregated and refined better than data in a proprietary
format. Hence, the authorities collecting and providing open data are encouraged to implement
standard data formats.
Usually, the selection of a standard to be utilized is quite straightforward. However, when
different agencies are collecting, using and opening the same or nearly equivalent data for
different purposes, a data harmonization exercise is highly recommended. The
harmonization and standardization of data structures and data exchange services are
fundamental challenges for the information society as a whole as well as for Intelligent
Transport Systems.
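A harmonization exercise often comes down to mapping each agency's own fields and conventions onto one agreed record. The following sketch illustrates this with two imaginary agencies that describe the same road works differently; all field names, date conventions and values are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RoadWorks:
    """Common target record agreed between agencies (hypothetical)."""
    road: str
    start_date: str    # ISO 8601 date, for example "2014-06-16"
    lanes_closed: int


def from_agency_a(row):
    """Agency A (assumed) uses day.month.year dates and its own field names."""
    day, month, year = row["alku"].split(".")
    iso_date = f"{year}-{int(month):02d}-{int(day):02d}"
    return RoadWorks(row["tie"], iso_date, int(row["kaistat"]))


def from_agency_b(row):
    """Agency B (assumed) already uses ISO dates but different field names."""
    return RoadWorks(row["road_id"], row["begins"], row["closed_lanes"])


a = from_agency_a({"tie": "E18", "alku": "16.6.2014", "kaistat": "1"})
b = from_agency_b({"road_id": "E18", "begins": "2014-06-16", "closed_lanes": 1})

# Once both sources are mapped to the common record they can be compared directly.
print(a == b)
```

If the agencies published the common record themselves, each app developer would not have to write and maintain such mappings separately.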
The coordination and harmonization of traffic management measures between road operators
is an essential part of maximizing the capacities of their road networks in order to reduce the
effects of congestion and improve safety. For traffic management related data, there are
several widely recognized data formats available, JSON and DATEX II being the most
notable.
The original DATEX standard was developed for the purpose of exchanging road traffic
information among road operators and between road operators and service providers in order
to improve the traffic conditions and to inform the drivers on the road and also for their
pre-trip planning.
The second generation DATEX II specification is also aimed at other actors in the traffic and
travel information sector. DATEX II has become the reference for all applications requiring
access to dynamic traffic and travel related information, ranging from road works to public
events having an impact on traffic. It has been one of DATEX II’s main achievements to
establish a logical model for this domain that is widely supported by users all over Europe.
DATEX II is already a prerequisite in some EU programmes in order to achieve conformity with
other systems developed.
JSON (JavaScript Object Notation) is a lightweight data interchange format which is based
on a subset of the JavaScript programming language. It is easy to read and write for humans
and also easy for machines to parse and generate. The JSON format is defined in the
ECMA-404 standard. It is a text format that is completely language independent but uses
conventions that are familiar to most programmers.
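As a brief illustration of generating and parsing JSON, the sketch below serializes a road weather observation to JSON text and reads it back using the json module of the Python standard library. The observation and its field names are invented for the example and do not follow any published data set.

```python
import json

# Hypothetical road weather observation; the field names are illustrative only.
observation = {
    "station": "example-station",
    "road": "E18",
    "air_temperature_c": -3.5,
    "road_surface": "icy",
}

text = json.dumps(observation, sort_keys=True)  # generate JSON text
restored = json.loads(text)                     # parse it back into a dictionary

print(text)
print(restored == observation)
```

The same text could be parsed with equivalent library calls in most other programming languages, which is what makes the format convenient for cross border services.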
Roaming across Europe
Mobility apps require reasonably priced, reliable mobile Internet access across Europe.
Fortunately, the worst excesses of data roaming prices have already been addressed by Union
regulation, but the roaming costs can still be considerable.
When passengers or cargo cross Union borders, the cost of the required mobile Internet access may
become prohibitive. Service providers may have to negotiate access contracts with national
mobile operators.
Recommendations
Recommendations for authorities opening data for mobile apps development:
1. Give priority to data required for mobility services when opening data.
2. Use standard data formats and APIs in data repositories in order to make it easier to open
them.
3. Provide sufficient broadband capacity for data downloads.
4. Inform developers of the data sets opened e.g. through open data catalogues.
5. Ensure data integrity and improve data accessibility.
6. Note the need for anonymization of data when developing new data repositories.
Recommendations for companies and developers utilizing open data in mobility apps:
1. Ascertain availability and integrity of open data used, for example through aggregators.
2. Ensure timeliness of downloaded open data used for service creation.
3. Ensure data security of your service.
Recommendations for mobility apps end users:
1. Report errors encountered in open data.
2. Participate in crowdsourced open data generation.
Recommendations for legislators:
1. Mobile data roaming costs should be kept as low as profitably possible.
2. Mobility related public data should be harmonized and standardized both internationally
and between national administrative branches.
3. A public authority should be made responsible for national data catalogue maintenance.
References
1. Open Knowledge Foundation (2013). Government data still not open enough - New
survey on eve of London summit. In: Open Data Index [online]. [cit. Apr 10th 2014].
2. Koski, H. (2011). Does Marginal Cost Pricing of Public Sector Information Spur Firm
Growth? ETLA Discussion Papers 1260.
3. Halonen, A. (2012). Being Open About Data: Analysis Of The UK Open Data Policies And
Applicability Of Open Data. London: The Finnish Institute in London.
4. Blumberg, A.J. & Chase, R. (2006). Congestion Pricing That Preserves Driver Privacy.
Intelligent Transportation Systems Conference, 2006. ITSC'06. IEEE.
5. Daily Mail Reporter (2010). Asbo App for iPhone tells you how anti-social your area is.
Daily Mail, 19 February 2010.