Semantic search within Earth Observation products databases based on automatic tagging of image content

Jérôme Gasperi
2014 Conference on Big Data from Space
Frascati - Italy - November 12th, 2014

Big Data ?
The data deluge
The search paradigm
iTag
An EO tagging library
resto
An EO product search engine
What’s next ?
Conclusion and perspectives

The data deluge
Brett Ryder - http://www.economist.com/node/15579717

The current search paradigm for Earth Observation products
relies on the acquisition parameters stored in the metadata

Sven Sachsalber | http://www.palaisdetokyo.com/fr/events/sven-sachsalber

iTag
Automatic tagging of Earth Observation products

What we got: an orthorectified image
What we need: a characterized image
This is urban
This is water
This is forest

iTag provides semantic enhancement of Earth
Observation data

It uses the footprint stored in the metadata to enrich
that metadata with exogenous data,
i.e. no image processing!
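
As a rough illustration of footprint-based tagging, the Python sketch below parses a WKT footprint and tags it with every reference area whose bounding box it overlaps. It is not the iTag implementation: the `REFERENCE_BOXES` table, its approximate coordinates and all function names are assumptions made for this example. Bounding-box overlap is a deliberately coarse stand-in for the exact polygon intersection a spatial database would compute.

```python
import re

# Hypothetical reference layer: name -> (min_lon, min_lat, max_lon, max_lat).
# The boxes are rough, illustrative values, not an actual iTag data source.
REFERENCE_BOXES = {
    "Europe": (-25.0, 34.0, 45.0, 72.0),
    "Africa": (-20.0, -35.0, 52.0, 38.0),
}

def parse_wkt_polygon(wkt):
    """Return the outer ring of a single-ring WKT POLYGON as (lon, lat) tuples."""
    match = re.fullmatch(r"\s*POLYGON\s*\(\(([^()]*)\)\)\s*", wkt, re.IGNORECASE)
    if not match:
        raise ValueError("expected a single-ring WKT POLYGON")
    points = []
    for pair in match.group(1).split(","):
        lon, lat = pair.split()
        points.append((float(lon), float(lat)))
    return points

def bounding_box(points):
    """Return (min_lon, min_lat, max_lon, max_lat) for a list of points."""
    lons = [p[0] for p in points]
    lats = [p[1] for p in points]
    return (min(lons), min(lats), max(lons), max(lats))

def boxes_overlap(a, b):
    """True when two axis-aligned boxes share at least one point."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

def tag_footprint(wkt, reference=REFERENCE_BOXES):
    """Tag a footprint with the names of all overlapping reference boxes."""
    footprint = bounding_box(parse_wkt_polygon(wkt))
    return sorted(name for name, box in reference.items()
                  if boxes_overlap(footprint, box))

moscow = ("POLYGON((37.1351 55.9655,38.1006 55.9640,38.0525 55.4969,"
          "37.0926 55.5171,37.1351 55.9655))")
print(tag_footprint(moscow))
```

The point of the sketch is that only the footprint geometry is read; no pixel of the image is touched.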

Out of the box tagging sources
Continents,
Countries,
Regions,
States,
Cities,
Land cover,
Rivers,
Population count

# Polygon around Moscow (WKT, longitude latitude)
$moscow = 'POLYGON((37.1351 55.9655,38.1006 55.9640,38.0525 55.4969,37.0926 55.5171,37.1351 55.9655))';
# Initialize iTag
$iTag = new iTag();
# Tag polygon for land cover
$result = $iTag->tag($moscow, array(
    'landcover' => true
));

Tag footprint around Moscow
http://goo.gl/6AkU4y

resto
Toward an Earth Observation products search engine

Search, visualize and download
Earth Observation data

resto 2.0 architecture (diagram)
Users: Search, Visualize, Download
Admin: POST, DELETE (Data)
Components: REST Webservices, Gazetteer, Query Analyzer, Administration, iTag 2.0
Storage: Abstract Database Access Layer, PostgreSQL Driver

Abstract Database Access Layer, PostgreSQL Driver
Database resto:
  schema _collection1
  schema _collection2
  …etc…
  schema resto
  schema usersmanagement
PostgreSQL features used: PostGIS, hstore, table inheritance

resto
Search: GET
Ingest: POST

During the ingestion process, resources are automatically
tagged by the iTag library

Search images over Russia
Bounding box !!

resto provides semantic search capabilities
It uses a Query Analyzer to translate a natural language query into
a set of EO OpenSearch parameters

<with> "keyword"
<without> "keyword"
"quantity" <lesser> (than) "numeric" "unit"
"quantity" <greater> (than) "numeric" "unit"
"quantity" <equal> (to) "numeric" "unit"
<lesser> (than) "numeric" "unit" (of) "quantity"
<greater> (than) "numeric" "unit" (of) "quantity"
<equal> (to) "numeric" "unit" (of) "quantity"
"quantity" <between> "numeric" <and> "numeric" ("unit")
<between> "numeric" <and> "numeric" "unit" (of) "quantity"
<today>
<yesterday>
<before> "date"
<after> "date"
<between> "date" <and> "date"
"numeric" "(year|day|month)" <ago>
<last> "(year|day|month)"
<last> "numeric" "(year|day|month)"
"numeric" <last> "(year|day|month)"
"(year|day|month)" <last>
<since> "numeric" "(year|day|month)"
<since> "month" "year"
<since> "date"
<since> "numeric" <last> "(year|day|month)"
<since> <last> "numeric" "(year|day|month)"
<since> <last> "(year|day|month)"
<since> "(year|day|month)" <last>
Query string analysis algorithm
is based on simple recognition
of words and patterns
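
The word-and-pattern recognition described above can be sketched in Python with a few regular expressions. This is a minimal illustration, not the resto Query Analyzer: the vocabularies (`KEYWORDS`, `QUANTITIES`), the output dictionary layout and the `analyze` function are assumptions, only three of the listed patterns are covered, and location lookup through the gazetteer is left out.

```python
import re

# Hypothetical vocabularies, for illustration only.
KEYWORDS = ("urban", "water", "forest")
QUANTITIES = ("cloud cover",)

def analyze(query):
    """Extract keywords, a relative date and numeric filters from a query."""
    text = query.lower()
    result = {"keywords": [], "date": None, "filters": []}

    # Plain recognition of known keywords.
    for keyword in KEYWORDS:
        if re.search(rf"\b{re.escape(keyword)}\b", text):
            result["keywords"].append(keyword)

    # <last> "(year|day|month)" and <last> "numeric" "(year|day|month)"
    match = re.search(r"\blast\s+(?:(\d+)\s+)?(year|month|day)s?\b", text)
    if match:
        count = int(match.group(1)) if match.group(1) else 1
        result["date"] = {"last": count, "unit": match.group(2)}

    # <lesser> (than) "numeric" "unit" (of) "quantity", with "%" as the only unit
    for quantity in QUANTITIES:
        pattern = (r"\b(?:less|lesser)\s+(?:than\s+)?(\d+(?:\.\d+)?)\s*%\s+(?:of\s+)?"
                   + re.escape(quantity) + r"\b")
        match = re.search(pattern, text)
        if match:
            result["filters"].append({
                "quantity": quantity,
                "operator": "lesser",
                "value": float(match.group(1)),
                "unit": "%",
            })
    return result

print(analyze("Images of urban area in Russia acquired in last year "
              "with less than 5 % of cloud cover"))
```

A further step, not shown here, would map the resulting dictionary onto the actual EO OpenSearch parameter names.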

Example
« Images of urban area in Russia acquired in last year with less than 5 % of cloud cover »
keyword: urban area
location: Russia
date: last year
acquisition parameter: less than 5 % of cloud cover

1. Search parameters are derived from
the natural language query
2. Each search result has a « human readable URL » that can
be indexed by web crawlers (e.g. Google robots)
http://goo.gl/BCZ3z4
3. Keywords on resources are links to search requests:
they can be indexed by web crawlers too…and so on

As of version 2.0, resto supports faceted search

http://dinosaurs.wikia.com/wiki/Coelurosauria
Facets

SPOT DATABASE: 1 000 000
New products retrieved every 3 hours from ADS catalog
SEARCH: 0.2 s
Time period of 1 month within a 10 x 10 km² box
INGEST: 0.5 s
Per product, for an ingestion of ~5000 products
Orders of magnitude computed on a Dual Core 2.6 GHz | 4 GB RAM | HDD 500 TB

What’s next ?
Conclusion and perspectives

Need for « fresh » tagging reference databases
(e.g. GLC2000 replacement)

Enhance metadata with Twitter trending hashtags
e.g. add tags #mh370, #plane, #malaysianairline
to resources acquired between March 8th, 2014 and April 14th, 2014
over the southern Indian Ocean
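
One possible way to apply such a trend is sketched below in Python: a resource receives the hashtags when its acquisition date falls inside the trend period and its center lies inside the trend area. Everything here is hypothetical: the `TREND` record, the rough bounding box standing in for the southern Indian Ocean, the resource layout and the `add_trend_tags` function are assumptions made for this example.

```python
from datetime import date

# Hypothetical trend record; the bounding box is a rough illustrative guess
# given as (min_lon, min_lat, max_lon, max_lat).
TREND = {
    "hashtags": ["#mh370", "#plane", "#malaysianairline"],
    "start": date(2014, 3, 8),
    "end": date(2014, 4, 14),
    "bbox": (70.0, -50.0, 120.0, -10.0),
}

def add_trend_tags(resource, trend=TREND):
    """Return the resource tags, extended with the hashtags when it matches."""
    lon, lat = resource["center"]
    min_lon, min_lat, max_lon, max_lat = trend["bbox"]
    in_period = trend["start"] <= resource["acquired"] <= trend["end"]
    in_area = min_lon <= lon <= max_lon and min_lat <= lat <= max_lat
    tags = list(resource.get("tags", []))
    if in_period and in_area:
        # Append each hashtag once, keeping the existing tag order.
        tags.extend(t for t in trend["hashtags"] if t not in tags)
    return tags

resource = {"center": (95.0, -30.0), "acquired": date(2014, 3, 20), "tags": ["water"]}
print(add_trend_tags(resource))
```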

« Linked data is the right way to do Semantic Web »
Tim Berners-Lee

Update iTag JSON model to follow JSON-LD format
{
"@context": "http://json-ld.org/contexts/person.jsonld",
"@id": "http://dbpedia.org/resource/John_Lennon",
"name": "John Lennon",
"born": "1940-10-09",
"spouse": "http://dbpedia.org/resource/Cynthia_Lennon"
}
