PHIDIAS organised its third and final webinar of the series, dedicated to Use Case 2: Big Data Earth Observations (EO), on 18 February 2021 at 15:00 CET. It showcased how PHIDIAS takes advantage of HPC architecture to facilitate selection and image analysis activities for land surface monitoring.
Bridging the gap to facilitate selection and image analysis activities for land surface monitoring
1.
2. Objectives
Develop a catalogue that will allow users to discover and access data, open-source
software, public Application Programming Interfaces (APIs) and interactive processing
services.
Optimise and industrialise workflows to allow the greatest possible degree of data
reusability.
Implement an end-user web common interactive processing service based on notebook and
datacube technologies allowing new users to easily have open access to HPC capacities
and develop new algorithms.
Improve the FAIRisation of satellite and environmental datasets and preserve FAIR
(Findable, Accessible, Interoperable, Reusable) datasets in a Remote Data Access (RDA)
certified repository.
Develop new data pre-processing models coupled with HPC capabilities, by building a
new and innovative on-the-fly computing service for smart processing of data, addressing the
problem of efficient filtering and on-the-fly first diagnostics on incoming in-situ data.
Deploy data post-processing methods as a service for several end-users, including
scientific communities, public authorities, private entities and citizen scientists.
Consortium of 13 committed partners from 5 different EU countries, led by CINES (French national
Computing Centre for Higher Education)
3. Webinar series on PHIDIAS Use Cases
19/02/2021 WEBINAR: Bridging the gap to facilitate selection and image analysis activities for land surface monitoring 3
All recorded webinars are available on the PHIDIAS HPC website, on the Webinars page
https://www.phidias-hpc.eu/events/webinars/
4. Webinar agenda
15:00 - 15:05 Introduction
Francesco Osimanti, Trust-IT Services
15:05 - 15:15 Challenges and barriers to the availability of on-demand
processing services for the land surface community
Jean-Christophe Desconnets, IRD
15:15 - 15:25 From processing chains prototyping to HPC-based on-demand
services: the example of Soil Moisture Service
Cyprien Alexandre, IRD
15:25 - 15:30 Q&A Session
15:30 - 15:40 Online and on-demand processing of satellite images: Web
Processing Service, the missing link in the chain
Gérald Fenoy, Geolabs
15:40 - 15:50 Notebook and Datacube: an original approach for exploiting and
analysing environmental big data within HPC/HPDA environments
Dorian Ginane, Geomatys
15:50 - 16:00 Q&A and Closing Remarks
6. The PHIDIAS project has received funding from the European Union's Connecting Europe Facility under grant agreement n° INEA/CEF/ICT/A2018/1810854.
Challenges and barriers to the
availability of on-demand processing
services for the land surface community
Jean-Christophe Desconnets, IRD, WP5 leader
Webinar | February 18, 2021
7. Barriers to using Earth Observation images
Accessing relevant images
Having the skills
Finding support
Getting appropriate equipment
End-users: researchers, land managers, government agencies
8. A very large range and volume of images
Increasing diversity of supply (spectral, spatial and temporal resolution)
Need for another paradigm: from "bring the data to the user" to "bring the user to the data"
9. Increasingly interoperable catalogs
GEOSUD catalog of VHR images (SPOT6, SPOT7): findable, accessible and reusable through standard web services and comprehensive metadata; 14,000 raw images
PEPS catalog of Sentinel-1 and Sentinel-2 images: findable, accessible and reusable; well over 10 million raw images
10. But producing new knowledge with EO images requires…
Technical and scientific skills
Large storage and processing resources
Ensuring the reproducibility of data processing
11. Implementing scientific processing chains
Coming from the scientific remote sensing community
THEIA Scientific Expertise Centers (SEC) working on around ten themes: Agriculture, Forest, Urban, Coastline, Health, Water…
Outputs: scientifically validated processing chains for environmental monitoring
Figures: land cover (IOTA chain); soil moisture with VR spatial resolution
12. Bridging the gap between academic production and end-user usage
Remote sensing specialist: scientific data processing (TRL 3 - 6)
End-users: accessible, scalable and reproducible data processing (TRL 6 - 7)
TRL: Technology Readiness Level
13. WP5 objectives and user scenarios
Bridging the gap to facilitate selection and image analysis activities for land surface monitoring
Propose architecture, tools and data models to provide systematic or on-demand map production accessible to data scientists, researchers and land managers
Three THEIA processing chains:
- Soil moisture at plot scale
- Image processing with artificial intelligence: application to land cover mapping and super-resolution
- Very high resolution land use mapping (Moringa)
14. Bring data and HPC resources to the user…
→ Define models, protocols and tools to deploy and link the end-users' analysis environment and the execution environment at the HPC center
Diagram components: data processing chains, WPS server, image and product discovery, HPC/HPDA center, analysis environment
16. From processing chains prototyping to HPC-based on-demand services: the example of Soil Moisture Service
Webinar | 18 February 2021, 15:00 CET
Cyprien Alexandre, research engineer (IRD)
17. Introduction
17 18.02.2021 PHIDIAS Webinar | 18.02.2021 |
How can access to high performance computing be simplified for researchers and non-researchers?
How can a processing chain developed for research be transferred to a "non-code-friendly" public?
What are the engineering issues in developing such a processing chain?
How can access to HPC capacities be opened "for all"?
18. Soil Moisture Service (SMS)
CES Soil humidity at high spatial resolution
Nicolas Baghdadi
Loïc Lozac’h
https://gitlab.irstea.fr/loic.lozach/soilmoistureservices
Model inversion to retrieve soil moisture values from Sentinel-1 SAR images at the plot scale.
Sentinel-2 NDVI is used to adjust for the impact of vegetation on moisture estimation.
Diagram: Sentinel-1, Sentinel-2, DEM and plots as inputs; soil moisture map as output
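As a minimal illustration of the vegetation index mentioned on this slide, NDVI is conventionally computed per pixel as (NIR - Red) / (NIR + Red). The sketch below is not taken from the SMS code base: the function name and the sample reflectance values are assumptions chosen for the example.

```python
def ndvi(nir: float, red: float) -> float:
    """Normalised Difference Vegetation Index for one pixel.

    nir and red are surface reflectances (for Sentinel-2, bands B8 and B4).
    Returns 0.0 when both bands are zero to avoid a division by zero.
    """
    total = nir + red
    if total == 0:
        return 0.0
    return (nir - red) / total


# Hypothetical reflectance values for a vegetated and a bare-soil pixel.
vegetated = ndvi(nir=0.45, red=0.05)
bare_soil = ndvi(nir=0.25, red=0.20)
print(f"vegetated: {vegetated:.2f}, bare soil: {bare_soil:.2f}")
```

Dense vegetation gives values close to 1 and bare soil gives values close to 0, which is why the index can be used to correct the radar signal for vegetation cover.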
19. Soil Moisture Service (SMS)
Input data from different data providers
- Sentinel-1 images: CNES (PEPS)
- Sentinel-2 images: THEIA or PEPS
- SRTM: ESA
- Official inventory of plots, vector files: France, users
Output data
- Raster and CSV
- Vector
Technologies: Python, OTB, TensorFlow (neural network)
Workflow diagram
Packages: Download, Calibration, Pre-process, Model Inversion
Inputs: Sentinel-2 (freq 6 days), Sentinel-1 (freq 6 days), DEM, agricultural plot vector
Intermediate products: S2 monthly synthesis, {sigma0VV, incidence} time series, NDVI time synthesis (15 days), agricultural plot index raster, slope
Output: soil moisture time series
Frequency: systematic products every 6 days; on-demand products
20. Containerisation
What?
Application encapsulation method
Complete environment and all dependencies
Why in HPC?
Easily deployable, at massive scale
Can be easily maintained
No compatibility issues between hosted applications
Two main solutions: Docker vs Singularity
Docker
- Widely used for general-purpose workloads
- Difficult to partition and manage access rights between users
Singularity
- Developed for HPC, with an adapted security level
- Can run Docker images
- Open source, BSD licensed, free
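The point that Singularity can run Docker images can be sketched as follows. This builds a `singularity exec` command line without running it; the image name, script and paths are hypothetical and only serve the example.

```python
import shlex


def singularity_exec_command(image, command, binds=None):
    """Build the argument list for `singularity exec`.

    image   : path to a .sif file or a docker:// URI
    command : program and arguments to run inside the container
    binds   : optional {host_path: container_path} mapping for --bind
    """
    args = ["singularity", "exec"]
    for host_path, container_path in (binds or {}).items():
        args += ["--bind", f"{host_path}:{container_path}"]
    args.append(image)
    args += list(command)
    return args


# Hypothetical image name, script and data path, for illustration only.
cmd = singularity_exec_command(
    "docker://example/sms:latest",
    ["python", "run_sms.py", "--tile", "T31TCJ"],
    binds={"/scratch/data": "/data"},
)
print(shlex.join(cmd))
```

The resulting list can be handed to `subprocess.run` on a machine where Singularity is installed; a `docker://` URI lets the same image be reused from a Docker registry.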
21. Porting to HPC
Code adaptation to fit security rules
Ex: no downloads allowed on computing nodes
Data volumes for mainland France:
• S2: 3 TB / 15 days
• S1: 316 GB / 15 days
• SRTM: 575 GB
Data storing
• Size limitation
• Duplicated data
On-demand download
• Execution of the SMS container on the front-end machine
• Not the fastest
Virtual file system (iRODS)
• Direct link with data providers
• Use of metadata link
• Faster and secured
Complex organisation
• Specific data access protocol
• Storage
• Access rights administration
• Data migration etc…
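To make the size limitation concrete, the volumes quoted on this slide (3 terabytes of Sentinel-2 and 316 gigabytes of Sentinel-1 every 15 days, plus 575 gigabytes of SRTM) can be extrapolated to one year. This is a rough back-of-the-envelope sketch, assuming 1 TB = 1000 GB and a constant acquisition rate; neither assumption comes from the slides.

```python
# Volumes quoted on the slide for mainland France (1 TB taken as 1000 GB).
S2_GB_PER_15_DAYS = 3000
S1_GB_PER_15_DAYS = 316
SRTM_GB = 575  # static elevation data, stored once

periods_per_year = 365 / 15
yearly_gb = (S2_GB_PER_15_DAYS + S1_GB_PER_15_DAYS) * periods_per_year
total_first_year_gb = yearly_gb + SRTM_GB

print(f"Sentinel input per year: {yearly_gb / 1000:.1f} TB")   # 80.7 TB
print(f"Including SRTM: {total_first_year_gb / 1000:.1f} TB")  # 81.3 TB
```

At roughly 80 TB of raw input per year for one country, duplicating the data on the HPC side quickly becomes costly, which is the motivation given for on-demand download or a virtual file system.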
22. Computing capacities access
Soil Moisture Service
Heterogeneous community
- Not comfortable with code
- Comfortable with code and wanting to go further in the product analysis
23. Computing capacities access
Two profiles and two interfaces
Standard web carto interface with virtual
products to compute (GEOSUD portal)
Access Jupyter Notebook interface to run
OTB commands on proposed products
Dorian Ginane - GEOMATYS
24. Computing capacities access
ZOO-Kernel as processing orchestrator
Communication between the web server and the HPC front-end node
Web Processing Service (WPS, Open Geospatial Consortium)
Diagram: Soil Moisture Service, WPS server, WPS, SSH
Gérald FENOY - GeoLabs
25. Conclusion
Making HPC capacities accessible to a wide public implies taking into account several parameters…
…at the processing chain level
Code adaptation
Containerisation
…at the HPC architecture level
Security
Data storage and transfer
…for access to processing
All these developments share a common thread: FAIR
Findable, Accessible, Interoperable, Reusable
Use of standardised applications (Python, Singularity …)
Use of standardised protocols (WPS - Open Geospatial Consortium)
Link with WP3 in charge of metadata
SMS processing chain is the first chain to be included in Phidias, more to come.
26. Thank you
C. Alexandre, research engineer, UMR Espace-Dev (IRD)
cyprien.alexandre@ird.fr
27. Online and on-demand processing of satellite images: Web Processing Service, the missing link in the chain
Webinar | 18 February 2021, 15:00 CET
Gérald Fenoy, GeoLabs CEO
28. Schedule
Introduction
Web Processing Service
OpenAPI
OGC and OpenAPI
The ZOO-Project implementation
Overview
OGC API – Processes
Remote service execution
18.02.2021 PHIDIAS Webinar | 18.02.2021 | https://www.phidias-hpc.eu/ | @PhidiasHpc
29. Introduction
Diagram: data diffusion (CSW, WMS, WFS, WCS) and application diffusion (WPS server, HPC), with XML / KVP request encodings
Execute asynchronous services running on HPC and available as Singularity containers
30. Web Processing Service
Locked-in applications: client and server applications strongly linked together
The Open Geospatial Consortium published the Web Processing Service standard:
WPS provides generic interfaces
Multiple servers can be used
Multiple clients can be implemented (server / desktop / mobile)
Automatic capability addition
Reproducibility …
…
31. Web Processing Service
GetCapabilities: what is available on the servers?
- Returns the list of available services
- Analogy: the kebab menu
DescribeProcess: what are the parameters for a service?
- Returns a detailed description of the service, including its parameters
- Analogy: the options (meat: beef, pork, chicken; vegetables: salad leaves, tomatoes, ..; type of bread: thin, normal, none)
Execute (sync / async): run the service with the desired parameters
- Analogy: place your order
Execution result: as Response / Raw / Stored
- Analogy: on a plate, stored for delivery, or take away
32. OpenAPI
The Linux Foundation sponsored the OpenAPI initiative
Swagger specification 1.0 in 2011
Swagger specification 2.0 in 2014
OpenAPI specification (OAS) 3.0.0 in 2017
OAS offers a simple format (JSON/YAML) for writing REST contracts
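As a rough illustration of what such a REST contract looks like, the sketch below builds a minimal OpenAPI 3.0 document as a Python dictionary and serialises it to JSON. The API title and the single `/processes` path are assumptions made for the example, not a copy of any published contract.

```python
import json

# Hypothetical minimal contract: one GET operation listing available processes.
contract = {
    "openapi": "3.0.0",
    "info": {"title": "Example processing API", "version": "0.1.0"},
    "paths": {
        "/processes": {
            "get": {
                "summary": "List the processes offered by the server",
                "responses": {
                    "200": {
                        "description": "A JSON list of process summaries",
                        "content": {
                            "application/json": {
                                "schema": {"type": "array", "items": {"type": "object"}}
                            }
                        },
                    }
                },
            }
        }
    },
}

document = json.dumps(contract, indent=2)
print(document)
```

Because the contract is plain JSON (or YAML), generic tooling can generate documentation and client code from it without knowing anything about the service behind it.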
33. OGC and OpenAPI
Starting in 2016, the OGC recognised that its existing Web Services standards were in effect Web APIs
34. The ZOO-Project implementation
Open source (MIT/X11 License) server implementation of the OGC WPS standard
Highly flexible: build new services upon existing ones
Diagram components: WPS 1.0.0 / 2.0, OGC API - Processes, ZOO-API, ZOO-Services, ZOO-Client, HPC service, OrfeoToolbox, SAGA-GIS, MapServer
36. Remote service Execution
GeoSUD Project (WPS 2.0)
Security management
Use asynchronous callback invocations for execution history recording
Automatic publication of execution history; inputs/outputs published as OGC Web Services
Use SSH and Slurm for remote execution scheduling
Rely on mail sent from the HPC server
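The combination of SSH, Slurm and mail notification listed above can be sketched as follows. This is not the ZOO-Project implementation: it only builds the command line a WPS server could run, and the login node, script path and e-mail address are hypothetical. The `sbatch` options used (`--parsable`, `--job-name`, `--mail-type`, `--mail-user`) are standard Slurm options.

```python
import shlex


def remote_sbatch_command(login, script, job_name, mail_user):
    """Build an ssh command line that submits a batch script with Slurm.

    Nothing is executed here; the list can be passed to subprocess.run().
    """
    sbatch = [
        "sbatch",
        "--parsable",              # print only the job id, easier to record
        f"--job-name={job_name}",
        "--mail-type=END,FAIL",    # Slurm e-mails the user when the job ends or fails
        f"--mail-user={mail_user}",
        script,
    ]
    # ssh takes the remote command as a single string, so quote it safely.
    return ["ssh", login, shlex.join(sbatch)]


# Hypothetical login node, script path and address.
cmd = remote_sbatch_command(
    login="wps@hpc.example.org",
    script="/home/wps/jobs/soil_moisture.sh",
    job_name="sms-42",
    mail_user="wps@example.org",
)
print(cmd)
```

Submitting through the front-end node keeps the compute nodes isolated from the web, and the mail sent at job end gives the WPS side a signal that the result is ready.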
37. From monolithic to container
Diagram labels: Request Handler, Process Manager, …; WPS 1.0.0 / 2.0, OGC API - Processes; Produce, Route, Consume; Exchange, Queue; PG DB, redis
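The produce / route / consume pattern in this diagram can be illustrated with a small in-process sketch. It uses Python's standard `queue` and `threading` modules as a stand-in for a real message broker, so it shows the decoupling idea only; the function names, job identifiers and process names are hypothetical.

```python
import queue
import threading

jobs = queue.Queue()   # stands in for the broker queue
results = {}           # stands in for the status store (PG DB / redis on the slide)


def request_handler(job_id, process, inputs):
    """Produce: accept a request and enqueue it without running it."""
    results[job_id] = "accepted"
    jobs.put((job_id, process, inputs))


def process_manager():
    """Consume: take jobs off the queue and run them one by one."""
    while True:
        item = jobs.get()
        if item is None:          # sentinel: stop the worker
            jobs.task_done()
            break
        job_id, process, inputs = item
        results[job_id] = f"done: {process}({inputs})"
        jobs.task_done()


worker = threading.Thread(target=process_manager)
worker.start()

request_handler("job-1", "soil_moisture", {"plot": 42})   # hypothetical process
request_handler("job-2", "ndvi", {"tile": "T31TCJ"})
jobs.put(None)
jobs.join()
worker.join()
print(results)
```

Because the handler only enqueues work, several process managers can consume from the same queue, which is where the scalability of the container-based layout comes from.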
38. Conclusion
WPS facilitates the diffusion of scientists' knowledge by defining a protocol to discover,
access, interact with and reproduce service execution.
The addition of OpenAPI support, through OGC API – Processes, eases the diffusion
of knowledge by providing out-of-the-box tools to document, discover, access, interact
with and reproduce service execution.
It reduces the additional development required for integration in notebooks and web client
applications.
Scalability is ensured by using the Advanced Message Queuing Protocol
Containers ease maintenance and deployment
History management produces provenance metadata for the resulting product
40. DataCube and Notebook: another approach for exploiting images
Dorian Ginane, Project Manager, Geomatys
Webinar | February 18, 2021
41. Virtual products for bridging the gap for the user
Often, users do not need access to a collection of datasets but to processed data for a specific spatio-temporal area.
Given the volume of raw data accessible, not all of it can be pre-processed.
The idea: show users synthetic products (for example a single soil moisture product) and process raw data on the fly when needed (Analysis Ready Data approach, https://ceos.org/ard/)
42. Virtual products
A virtual product is an arbitrary aggregation of raw data files that can be manipulated as a single cube of data (for example, an optical layer mixing Sentinel-2 and another optical satellite…)
Examind DataCube helps users directly obtain and manipulate a subsample of the virtual product
https://datacube.examind.com
Source: Geomatys
43. Examind Datacube : How it works
The solution catalogues and indexes all the spatio-temporal and observation dimensions of every dataset available in the Data Lake:
- time range
- geographic extent
- resolutions
- unit of measure
- min and max values
- …
Diagram: Data Lake, Request, Tensor
Source: Geomatys
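The indexing idea described on this slide can be sketched with a small in-memory catalogue. This is not the Examind implementation: the record fields, dataset names and coordinates are hypothetical, and a real index would rely on a spatial database rather than a linear scan.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class DatasetEntry:
    """Index record for one raw file; all field names are hypothetical."""
    name: str
    start: date
    end: date
    bbox: tuple          # (min_lon, min_lat, max_lon, max_lat)
    resolution_m: float


def intersects(a, b):
    """True when two (min_lon, min_lat, max_lon, max_lat) boxes overlap."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def query(index, bbox, start, end):
    """Return the entries overlapping the requested area and time range."""
    return [
        e for e in index
        if intersects(e.bbox, bbox) and e.start <= end and start <= e.end
    ]


index = [
    DatasetEntry("s2_a", date(2021, 1, 1), date(2021, 1, 1), (3.0, 43.0, 4.0, 44.0), 10.0),
    DatasetEntry("s2_b", date(2021, 2, 1), date(2021, 2, 1), (3.0, 43.0, 4.0, 44.0), 10.0),
    DatasetEntry("s1_a", date(2021, 1, 3), date(2021, 1, 3), (8.0, 47.0, 9.0, 48.0), 20.0),
]

hits = query(index, bbox=(3.5, 43.5, 3.6, 43.6), start=date(2021, 1, 1), end=date(2021, 1, 31))
print([e.name for e in hits])  # ['s2_a']
```

Only the metadata is scanned to answer the request; the raw files behind the matching entries are read afterwards, which is what allows processing on the fly.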
44. Examind Datacube: how it works
DataCube provides a cube of data corresponding to what users or software have asked for, independently of the raw data structure:
- as a NumPy or Xarray structure, Java structure, Dask array…
- with the desired resolution and coordinate reference system
- export format: COG…
45. The virtual products can be used in different ways
- In a classic catalog
- Through a Web Map Service, as a classic spatio-temporal layer
- Through a Notebook
46. Examind Datacube, Notebook and HPC infrastructure
Notebooks are an easy way for data scientists to collaborate on the design of new algorithms or data analyses
DataCube provides an easy way to request data
Processes can be executed on the HPC cluster using WPS standards