4. Geospatial Big Data
Raster, Vector, Sensors, Mobile
It’s easier than ever to collect geospatial data,
but how can we exploit these geospatial big data?
5. Example: Property data
One of the most valuable datasets managed by
governments worldwide
Extensively used in various domains by private and
public organizations
6. Challenges in working with property data
• Difficult to access
• Cross-sectors
• Data is highly heterogeneous and possibly large
• Data quality
• Time-consuming integration
• Lack of innovation
• …
7. How can we innovate (and make money)
with property-related (Open) Data?
8. proDataMarket project goals
• To make property data more accessible,
more usable and easier to understand
• To make it easier for:
• Property data providers to publish and
distribute their data
• Data consumers to find and access
property data needed for their businesses
• 2.5 years (2015-2017)
• €4.5M
• 20+ datasets
10. Example business case #1
Objective evaluation of the real estate properties
• Business Intelligence companies (e.g. Cerved): automation and cost reduction in property valuations, new services
• Public administration: fact-driven social policy
• Real estate agencies: faster evaluation of properties, more objective estimation of properties
• Property buyers/sellers: eliminate intermediaries
11. Example business case #1 (cont’)
Objective evaluation of the real estate properties
in Italy, by
Istat Census: snapshot of Italy, socio-demographic data about the house (its characteristics), the people of the family (personal data, education, profession, work/study place) and other people who live in the house (guests)
+
OpenStreetMap: points of interest of the city about transport, downtown, environment
+
Cadastral report: property details (surface, cadastral category, quality status, age, ownership details)
~10M buildings
=
The evaluation of real estate property: an up-to-date, objective evaluation of the real estate properties in territories in Italy (market price, €)
12. Example business case #1 (cont’)
Objective evaluation of the real estate properties
in Italy, by
SYNTHETIC INDEX ISTAT = -0.23 - (0.12 * UNEMPLOYED) + (0.2 * HISTORIC_BUILDINGS) + (0.58 * GRADUATES_ON_RESIDENTS) + (0.6 * STUDENTS_ON_RESIDENTS)
SYNTHETIC INDEX POI = -0.5 + (0.15 * closest_metro_station) + (0.14 * closest_railway_station) + (0.24 * n_bus_stops_within_800m) + (0.6 * n_small_green_areas_pois_within_800m) + (0.02 * n_pedestrian_paths_within_1000m) - (0.05 * closest_airport)
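The two synthetic indices on this slide are linear models: an intercept plus a weighted sum of indicators. A minimal Python sketch follows. The coefficients are copied from the slide. The function name `synthetic_index`, the dictionary layout and the assumption that each indicator is already a normalised number are illustrative only and not taken from the project.

```python
# Coefficients copied from the slide; how the indicators are scaled is NOT
# stated there, so inputs are assumed to be already-normalised numbers.
ISTAT_INTERCEPT = -0.23
ISTAT_WEIGHTS = {
    "UNEMPLOYED": -0.12,
    "HISTORIC_BUILDINGS": 0.2,
    "GRADUATES_ON_RESIDENTS": 0.58,
    "STUDENTS_ON_RESIDENTS": 0.6,
}

POI_INTERCEPT = -0.5
POI_WEIGHTS = {
    "closest_metro_station": 0.15,
    "closest_railway_station": 0.14,
    "n_bus_stops_within_800m": 0.24,
    "n_small_green_areas_pois_within_800m": 0.6,
    "n_pedestrian_paths_within_1000m": 0.02,
    "closest_airport": -0.05,
}


def synthetic_index(intercept, weights, indicators):
    """Return intercept + sum(weight * indicator) over all weighted indicators."""
    missing = set(weights) - set(indicators)
    if missing:
        raise KeyError(f"missing indicators: {sorted(missing)}")
    return intercept + sum(w * indicators[name] for name, w in weights.items())


if __name__ == "__main__":
    # Made-up indicator values for one area, purely for illustration.
    area = {name: 0.5 for name in ISTAT_WEIGHTS}
    print(synthetic_index(ISTAT_INTERCEPT, ISTAT_WEIGHTS, area))
```

Keeping the weights in dictionaries makes it easy to refit or swap coefficients without touching the scoring function.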
13. Example business case #1 (cont’)
Objective evaluation of the real estate properties
in Italy, by
Sample technical challenges
Semantic data heterogeneity
How to translate a point of interest
into an OSM query?
How to retrieve data for the
whole of Italy?
Structural data heterogeneity
How to compute indicators on
different data structures?
Messy data
How to exclude from computation
duplicated annotations of the same
real-world entities?
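One possible way to handle the messy-data question is to treat two annotations of the same kind that lie within a few metres of each other as the same real-world entity. The sketch below is an illustration under that assumption, not the approach used in the project. The record fields (`kind`, `lat`, `lon`) and the 30 m threshold are hypothetical.

```python
import math


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two WGS84 coordinates."""
    r = 6371000.0  # mean Earth radius in metres
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def deduplicate(pois, max_dist_m=30.0):
    """Keep the first POI of each cluster of same-kind, nearby annotations.

    Quadratic in the number of POIs, which is acceptable for a sketch;
    a spatial index would be needed for country-scale data.
    """
    kept = []
    for poi in pois:
        is_duplicate = any(
            poi["kind"] == other["kind"]
            and haversine_m(poi["lat"], poi["lon"], other["lat"], other["lon"]) <= max_dist_m
            for other in kept
        )
        if not is_duplicate:
            kept.append(poi)
    return kept
```

Counting indicators such as the number of bus stops within 800 m on the deduplicated list avoids inflating the index when the same stop was mapped twice.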
14. Example business case #2
Common Agricultural Policy (CAP) funds assignments
in Spain, by
• Stakeholders:
• Public administration (e.g. FEGA in Spain)
• Farmers and land owners
• Intermediaries (e.g. service providers)
• Problems:
• Unfair grant assignment and expenditure on audits
• Incorrect grant assignments
• Features defined subjectively
15. Example business case #2 (cont’)
Common Agricultural Policy (CAP) funds assignments
in Spain, by
Cadaster information: parcels and their features (surfaces, limits, slope…)
+
EFAs & LEs: Ecological Focus Areas and Landscape Elements accurately defined using LIDAR
+
Satellite: kind of crops, health status, set-aside zones, nitrogen-fixing crops, CO2-fixing crops…
=
CAP Funds, accurately defined: CAP parameters objectively defined, automated process to create new datasets related to CAP funds, fewer errors, fewer audits and field visits…
Fund assignment rules examples
• Crop Diversification
• Kind, density and surface of Ecological Focus Areas
• Conditionality
16. Example business case #2 (cont’)
Common Agricultural Policy (CAP) funds assignments
in Spain, by
1) Raw datasets, just points
2) Points classified by their height
3) Points are grouped: Yellow (soil), Green (trees), Orange (bushes)
4) There are patterns: groups, lines, isolated trees, etc.
5) Trees in line, hedges; non-aligned groups, copses
6) A viewer
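Steps 2 and 3 of this slide (classifying LIDAR points by height, then grouping them into soil, bushes and trees) can be sketched in a few lines of Python. The height thresholds, the `(x, y, height)` tuple layout and the function names are assumptions made for illustration. The slide gives only the three groups and their colours.

```python
# Viewer colours as listed on the slide.
COLOURS = {"soil": "yellow", "trees": "green", "bushes": "orange"}


def classify_point(height_m, bush_min=0.3, tree_min=2.0):
    """Assign a class from height above ground; thresholds are hypothetical."""
    if height_m < bush_min:
        return "soil"
    if height_m < tree_min:
        return "bushes"
    return "trees"


def group_points(points):
    """Group (x, y, height_above_ground) tuples by class, keeping x/y only."""
    groups = {"soil": [], "bushes": [], "trees": []}
    for x, y, height_m in points:
        groups[classify_point(height_m)].append((x, y))
    return groups


if __name__ == "__main__":
    sample = [(0.0, 0.0, 0.05), (1.0, 0.5, 1.2), (2.0, 1.0, 7.5)]
    for name, members in group_points(sample).items():
        print(name, COLOURS[name], members)
```

Detecting the patterns of steps 4 and 5 (lines, groups, isolated trees) would need a clustering step on the grouped points, which is left out here.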
17. Example business case #3 (cont’)
Augmented Reality (AR) for Property-related Data
in Norway, by
AR for buildings: What’s the impact of a new building on its surroundings?
AR for underground infrastructure: Where are the underground pipes?
20. Example business case #4
Reporting state-owned real estate properties
in Norway, by
Report
• A hard copy of 314 pages and as a PDF file
• 6 Person-Months
• Data collection with spreadsheets
• Quality assurance through e-mails and phone correspondence
Pains: Time consuming, Poor data quality, Static report without live updating
Reporting Service
• Live service
• Efficient sharing of data
• Simplified integration with external datasets
• Live updating
• Reliable access
• …
3rd party services
• Risk and vulnerability analysis, e.g. buildings affected by flooding
• Analysis of leasing prices
22. DataGraft: Data Transformation and
Knowledge Graph Publication Process
• Interactive design of transformations
• Repeatable transformations
• Reuse/share transformations (user-
based access)
• Cloud-based deployment of
transformations
• Self-serviced process
• Data and Transformation as-a-Service
[Figure: Raw Data → Transform → Prepared Data → Map (ontology mapping) → Generate RDF → RDF Graph → Semantic graph database]
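The transform-then-map idea behind this process can be illustrated with a small, generic Python sketch that turns tabular rows into RDF triples in N-Triples syntax. This is not DataGraft's API. The `http://example.org/` namespace, the column names and the function names are hypothetical, and every value is emitted as a plain string literal for simplicity.

```python
import csv
import io

BASE = "http://example.org/"  # hypothetical namespace, not from the slides


def row_to_triples(row, id_column, mapping):
    """Map one tabular row to N-Triples lines.

    `mapping` maps a column name to a predicate IRI. Row ids are assumed to
    be IRI-safe; empty cells produce no triple.
    """
    subject = f"<{BASE}property/{row[id_column]}>"
    triples = []
    for column, predicate in mapping.items():
        value = row[column].strip()
        if value:
            # Escape backslashes and quotes as required inside N-Triples literals.
            literal = value.replace("\\", "\\\\").replace('"', '\\"')
            triples.append(f'{subject} <{predicate}> "{literal}" .')
    return triples


def csv_to_ntriples(text, id_column, mapping):
    """Read CSV text and return all generated triples, one per line."""
    lines = []
    for row in csv.DictReader(io.StringIO(text)):
        lines.extend(row_to_triples(row, id_column, mapping))
    return "\n".join(lines)


if __name__ == "__main__":
    raw = "id,area,category\n1,85,A/2\n2,,C/1\n"
    mapping = {
        "area": BASE + "ontology/area",
        "category": BASE + "ontology/category",
    }
    print(csv_to_ntriples(raw, "id", mapping))
```

Because the mapping is plain data, the same transformation can be rerun on updated input, which mirrors the "repeatable transformations" point above.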
23. Geospatial Data is a BIG thing
Innovation with property-related
data in proDataMarket