GEO ANALYTICS CANADA
Demonstration Platform Overview
April 2020
THE PROBLEM
• Satellite EO data is now too
big to analyze using
traditional desktop analytic
tools
• Impossible to analyze
satellite EO data over wide
areas and deep timeseries
using traditional tools
NASA EO archive (EOSDIS) Growth:
approaching 246PB in 2025
THE SOLUTION
• Bring your algorithm to the data,
not the other way around
• Embrace big data tools and
systems used in other areas
• Transition away from desktop
analytics to cloud-native analytics
• This new era requires
partnerships between IT and
satellite EO experts
• Demonstration proof of concept
platform: www.geoanalytics.ca
DEMONSTRATION PLATFORM ARCHITECTURE
• Based on Hatfield’s direct experience with
ESA big data analytic platforms:
• TEPs, DIAS, Oil and Gas Platform,
Energy Corridor monitoring platform,
etc.
• Informed by competitive analysis of other
internationally known platforms:
• OpenDataCube, Google Earth Engine,
Hexagon's M.appX, CS-SI’s GeoStorm,
FAO’s Sepal, EarthServer’s Rasdaman,
Terradue’s Ellip, EOS’s Platform,
DigitalGlobe’s GBDX, and
Radiant.Earth’s platform
DEMONSTRATION PLATFORM ARCHITECTURE
(Architecture diagram; component labels grouped by tier.)
• Infrastructure as a Service
  • Object Storage: EO, ARD + project shared data storage; Docker image storage
  • NFS Storage: user secure data storage
  • Kubernetes on-demand compute
  • Kubernetes Compute Cluster: core system nodes; per-user private interactive compute nodes; on-demand scalable compute nodes
• System Functions
  • STAC indexing of EO assets
  • GT Data Store with OGC API-Features
  • OpenLDAP + DEX authentication
  • KubeFlow batch processing and machine learning
  • Web-map tile generation
  • EO data pre-processing functions
  • Cost accounting
• Software as a Service (Web Portal)
  • GitLab private code repository + Docker registry
  • Jupyter-Lab model development environment
  • System documentation + examples
  • Desktops + tools (QGIS, SNAP, etc.) in a browser
  • GT data upload and management functions
  • EO data query + discovery
  • User + cost management
  • File Browser
INFRASTRUCTURE AS A SERVICE
• Scalable storage and compute services
INFRASTRUCTURE AS A SERVICE
• Key Requirements:
• Managed Kubernetes clusters – dynamically scheduled and scaled containerized workloads
• Availability of pre-emptible nodes – large-scale computations done in a cost-effective manner
• A Canadian data center – to comply with Canadian data residency requirements
• Selected: Google Cloud
• Meets all the above requirements
• Already hosts Landsat 4-8 and Sentinel-2 collections, so there is no need to duplicate them
INFRASTRUCTURE AS A SERVICE
• Vendor Neutrality:
• GEO Analytics Canada uses technologies available
on all major cloud hosting providers
• APIs and layers of abstraction are used to
ensure neutrality
• Vendor neutrality allows us to pursue multi-cloud
integrations
• For example: distributed machine learning, with
compute done close to pre-existing data stores
COMPUTE INFRASTRUCTURE
• Entirely based on Kubernetes (K8s)
• An open-source system for automating
deployment, scaling, and management of
containerized applications
• Analytics run in parallel across many worker
nodes, so big data jobs complete quickly
• Pre-emptible nodes make on-demand
compute very inexpensive
• Applications and users request compute
resources (# of CPUs & GBs of RAM) which
are provided on-demand within seconds
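As a rough illustration of how a workload might request compute, the sketch below builds a Kubernetes Pod manifest as a Python dictionary. The pod name, image path, and the helper `build_pod_manifest` are hypothetical and not taken from the platform. The `resources.requests` fields are standard Kubernetes, and the `cloud.google.com/gke-preemptible` node label is the one GKE applies to pre-emptible nodes, assumed here because the platform is hosted on Google Cloud.

```python
import json


def build_pod_manifest(name, image, cpus, ram_gb, preemptible=True):
    """Return a Pod manifest (as a dict) requesting CPU and RAM.

    Hypothetical helper for illustration; not part of the platform's API.
    """
    spec = {
        "restartPolicy": "Never",
        "containers": [
            {
                "name": name,
                "image": image,
                # The scheduler places the pod on a node with this much free capacity.
                "resources": {
                    "requests": {"cpu": str(cpus), "memory": f"{ram_gb}G"},
                },
            }
        ],
    }
    if preemptible:
        # Assumption: GKE labels pre-emptible nodes with this key.
        spec["nodeSelector"] = {"cloud.google.com/gke-preemptible": "true"}
    return {"apiVersion": "v1", "kind": "Pod", "metadata": {"name": name}, "spec": spec}


# Hypothetical worker sized like the 1 vCPU / 5 GB pods used elsewhere in the deck.
manifest = build_pod_manifest(
    "ndvi-worker", "registry.example.com/ndvi:latest", cpus=1, ram_gb=5
)
print(json.dumps(manifest, indent=2))
```

The printed JSON could be submitted with standard Kubernetes tooling; the sketch stops short of contacting a cluster.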
STORAGE INFRASTRUCTURE
• Object storage
• Highly durable with built-in
redundancy
• Scales to exabytes of data
• Lowest cost
• On the Demonstration Platform, the
following are stored in object storage:
• Raw satellite EO data, including all
downloaded MODIS products
• Analysis ready satellite data (ARD)
• User and project team shared files
• Docker container images
STORAGE INFRASTRUCTURE
• NFS storage service
• Compatible with all Linux-based
systems used on the
demonstration platform
• Used to store user personal home
directories
• Secure – only available to a
specific user (cannot be shared)
• Transfer to project team storage
area (on object store) if sharing
required
• Back-end storage is a standard
SATA disk
SOFTWARE AS A SERVICE
• Web user interfaces and APIs for big EO data
analytics and processing
Figure: Public website – www.geoanalytics.ca
Figure: Demonstration Platform Landing Page
AUTHENTICATION, SECURITY AND USER
MANAGEMENT
• All applications and APIs require
users to be authenticated
• User management and profiles
through LDAP
• Single-Sign-On
• Uses industry standard OAuth 2
protocol
• Users only need to log in once to
gain access to all applications
• APIs require a token to
authenticate
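A minimal sketch of token-based API access, assuming the token is sent as an OAuth 2 bearer token in the `Authorization` header (the slides state only that a token is required). The endpoint URL, the token value, and the helper `authenticated_request` are placeholders. The request is built but not sent.

```python
import urllib.request


def authenticated_request(url, token):
    """Build (but do not send) a request carrying an OAuth 2 bearer token."""
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})


# Hypothetical endpoint and token, for illustration only.
req = authenticated_request("https://api.example.org/collections", "my-access-token")
print(req.get_header("Authorization"))
```

Passing `req` to `urllib.request.urlopen` would perform the call against a real endpoint.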
EO DATA QUERY AND DISCOVERY
• Web-browser based browse and
search interfaces
• Browse and search all datasets
• Query and view collections by
time, location
• SpatioTemporal Asset Catalog
(STAC) API of all EO datasets
• OGC API-Features (WFS3)
compliant metadata server
• API documented at
www.stacspec.org
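To make the time and location query concrete, here is a sketch that builds the JSON body for a STAC API item search (`POST /search`). The `collections`, `bbox`, `datetime`, and `limit` fields come from the STAC API specification. The helper name `build_stac_search` and the bounding box are assumptions chosen for illustration, and no request is sent.

```python
import json


def build_stac_search(collections, bbox, start, end, limit=10):
    """Build the JSON body for a STAC API item search (POST /search).

    Assumes the box does not cross the antimeridian.
    """
    west, south, east, north = bbox
    if not (west < east and south < north):
        raise ValueError("bbox must be (west, south, east, north)")
    return {
        "collections": list(collections),
        "bbox": [west, south, east, north],
        # A closed interval of RFC 3339 timestamps, separated by "/".
        "datetime": f"{start}/{end}",
        "limit": limit,
    }


# Hypothetical query: 2018 Landsat-8 scenes over a small box in southern Quebec.
body = build_stac_search(
    ["landsat-8-l1"],
    bbox=(-73.1, 45.5, -72.8, 45.7),
    start="2018-01-01T00:00:00Z",
    end="2018-12-31T23:59:59Z",
)
print(json.dumps(body, indent=2))
```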
EO DATA QUERY AND DISCOVERY
• Current EO Data Collections:
Collection Name        | Description                                                     | Time Period Available
landsat-8-l1           | Landsat-8 images over eastern Canadian landmass (Manitoba east) | 2003-2020
modis.MCD12Q1          | MODIS Land Cover                                                | 2000-2020
modis.MOD09GQ          | Terra Surface Reflectance                                       | 2000-2020
modis.MOD09Q1          | Terra Surface Reflectance                                       | 2000-2020
modis.MOD11A1          | Terra Land Surface Temperature and Emissivity                   | 2000-2020
modis.MOD11A2          | Terra Land Surface Temperature and Emissivity                   | 2000-2020
modis.MOD13Q1          | Terra Vegetation Indices                                        | 2000-2020
modis.mod09gq.veg.ndvi | NDVI derived from Terra Surface Reflectance                     | 2000-2020
modis.mod09gq.veg.evi2 | EVI2 derived from Terra Surface Reflectance                     | 2000-2020
EO DATA INGESTION AND PRE-PROCESSING
• Fully uses the computing power and
scalability of the IaaS tier
• Multi-stage data processing pipelines
• Enables containerized applications to
be put into a processing chain that can
be scaled massively
• Implemented using KubeFlow
• primarily designed to enable machine
learning (ML) workflows
• The same ML workflow constructs are
repurposed for EO data ingestion and
pre-processing
EO DATA INGESTION AND PRE-PROCESSING
• Proof of concept EO data pipelines created:
• Generates Level-2 Sentinel-2 products using Sen2Cor
• Runs any set of commands available through ESA’s
Sentinel Application Platform (SNAP) software
• Downloads MODIS products to the object store and adds them
to the EO metadata system
• Adds Landsat-8 images over the eastern Canadian landmass
(i.e. Manitoba east) to the EO metadata system
• Creates NDVI and EVI2 products from Terra Surface
Reflectance products
• Creates a daily thermal average product from Terra Land
Surface Temperature products
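For reference, the two vegetation indices named in the pipeline list can be written per pixel as below. This is a sketch using the standard published formulas, not the platform's pipeline code. It assumes red and near-infrared reflectance already scaled to the 0-1 range, and the sample values are made up.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index for one pixel."""
    denom = nir + red
    # Return NaN where both bands are zero to avoid dividing by zero.
    return (nir - red) / denom if denom != 0 else float("nan")


def evi2(nir, red):
    """Two-band Enhanced Vegetation Index for one pixel."""
    return 2.5 * (nir - red) / (nir + 2.4 * red + 1.0)


# Hypothetical reflectance values (0-1 scale) for a vegetated pixel.
nir, red = 0.5, 0.1
print(round(ndvi(nir, red), 4))
print(round(evi2(nir, red), 4))
```

A production pipeline would apply the same arithmetic to whole raster arrays rather than single values.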
EO DATA INGESTION AND PRE-PROCESSING
• NDVI and EVI2 derived from Terra Surface
Reflectance Pipeline:
• Processing completed for all products available
between 2000 and 2020
• Results stored in object storage and indexed in
EO data query system
• Results available through all platform systems,
including EO data query and discovery system,
File Browser, desktop in a browser, etc.
• Runtime Example:
• 3 years of data (3 TB) processed in 13 hours
• 36 processing pods (1 per month), each allocated 1 vCPU and 5 GB RAM
• Total cluster resources: 36 vCPU, 180 GB RAM
Figure: Viewing NDVI product using QGIS through the
‘desktop in a browser’ system
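The cluster totals in the runtime example follow directly from the per-pod allocation, and the same figures give a rough throughput. The short calculation below assumes decimal units (1 TB = 1000 GB). The throughput numbers are derived here and are not stated in the slides.

```python
# Figures from the runtime example.
pods = 36               # one pod per month over 3 years
vcpu_per_pod = 1
ram_gb_per_pod = 5
data_tb = 3
hours = 13

total_vcpu = pods * vcpu_per_pod          # matches the stated 36 vCPU
total_ram_gb = pods * ram_gb_per_pod      # matches the stated 180 GB RAM
gb_per_hour = data_tb * 1000 / hours      # aggregate throughput, decimal TB assumed
gb_per_pod_hour = gb_per_hour / pods      # average throughput per pod

print(total_vcpu, total_ram_gb)
print(round(gb_per_hour, 1), round(gb_per_pod_hour, 2))
```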
EO DATA INGESTION AND PRE-PROCESSING
• Converting 10 Sentinel-2 L1C tiles to L2A
• Typically ~3-4 hours
• GEOAnalytics: ~28 minutes
JUPYTER-LAB ANALYTIC ENVIRONMENT
• Python-based scalable data analytics
• Interacts with Kubernetes to provide on-demand scalable compute
• Core software systems:
• Jupyter-Lab – provides the web application framework for
interactive analytics
• Xarray – provides an N-Dimensional Array interface and toolset
• Iris – provides methods for analysing and visualising meteorological
and oceanographic data sets
• Dask – provides flexible parallel computing for analytics
• Zarr – the next generation, cloud-native file format for gridded
datasets
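The scaling idea behind this stack is to split a large array into chunks, process the chunks in parallel, and combine the partial results. The sketch below shows that pattern with only the Python standard library as a stand-in. It does not use Dask or Xarray, and the function names and the sample series are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def chunked(values, size):
    """Split a list into consecutive chunks of at most `size` items."""
    return [values[i:i + size] for i in range(0, len(values), size)]


def partial_sum(chunk):
    """Per-chunk work: return (sum, count) so results can be combined."""
    return sum(chunk), len(chunk)


def parallel_mean(values, chunk_size=4, workers=4):
    """Compute a mean chunk by chunk, then reduce the partial results."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(partial_sum, chunked(values, chunk_size)))
    total = sum(s for s, _ in partials)
    count = sum(n for _, n in partials)
    return total / count


# Hypothetical NDVI time series for one pixel.
series = [0.21, 0.25, 0.40, 0.62, 0.71, 0.68, 0.55, 0.33, 0.24, 0.20]
print(round(parallel_mean(series), 3))
```

Libraries such as Dask automate the chunking and scheduling shown here and can spread the per-chunk work across many cluster nodes instead of local threads.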
JUPYTER-LAB ANALYTIC ENVIRONMENT
• Implements a “Pangeo” Environment
• www.pangeo.io
• Supports both HPC and Cloud infrastructure
• Similar in nature to the European Joint Research
Centre’s “Earth Observation Data and Processing
Platform” (JEODPP)
• https://jeodpp.jrc.ec.europa.eu/home/
JUPYTER-LAB ANALYTIC ENVIRONMENT
• Hatfield has started a library
of example notebooks on how
to use the Jupyter-Lab
Environment
• Access Landsat data
through STAC API and
process/analyze it to
create an NDVI timeseries
• Query EO data hosted on
GEOAnalytics.ca using
OWSLib
https://github.com/geoanalytics-ca/example-notebooks
JUPYTER-LAB ANALYTIC ENVIRONMENT
• NDVI Landsat-8 Example Notebook:
• 30 nodes, 210GB RAM, 60 CPUs
• Random location close to Saint-Hyacinthe, QC
Figure panels: NDVI of 2018 acquisitions; mean NDVI
GITLAB PRIVATE CODE REPOSITORY AND
DOCKER REGISTRY
GITLAB SOURCE CODE REPOSITORY
• Collaboration and sharing
of source code with Git
• Private and shared
repositories available
DOCKER IMAGE STORE / CONTAINER REGISTRY
• The container registry is
backed by the object
store system
• Cost effective storage of
large container images
• Images in registry can be used
in scalable workflows in the
platform’s EO data ingestion
and pre-processing systems
ON-DEMAND PERSONAL UBUNTU
DESKTOPS IN A BROWSER
DESKTOPS IN A BROWSER
• Provides users with their own
personal Ubuntu desktop
environment
• Accessible through a browser
• Enables data exploration directly
on the platform, reducing the need
to download data
• Users can select the amount of
RAM + CPU on startup:
• From 1 to 31 CPUs
• From 1 to 116 GB RAM
DESKTOPS IN A BROWSER
• Pre-installed software (SNAP,
QGIS, Firefox, etc.)
• Users can install their own
software and customize the
desktop environment
• EO data stores are mounted in
desktop environment for easy
access:
• All Sentinel-2 data
• All Landsat 4-8 data
• Pre-processed data products
Figure: Viewing a Sentinel-2 product using QGIS
through the ‘desktop in a browser’ system
FILE BROWSER
• Enables browsing and
downloading of all data
stored on the platform
for use in external
systems
• Users can view and
download data from:
• All EO data stores
• Shared data
between users of the
platform
• Their own personal
data
GROUND TRUTH DATA MANAGEMENT
• Vector ground truth data
can be uploaded, viewed
and deleted
• Users upload a SHP
file which is imported
into the system
• Organized into collections
that contain features
• A SHP file is a
“collection”
GROUND TRUTH DATA MANAGEMENT
• Features can be
browsed/searched
interactively
• Features can be
searched
• Webmap displays
features
• API endpoints implement
OGC API-Features
specification (previously
referred to as WFS3)
• Implemented using
pygeoapi
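As an illustration of the OGC API-Features access pattern, the sketch below builds an items request for one collection. The `/collections/{collectionId}/items` path and the `bbox` and `limit` query parameters come from the OGC standard. The base URL, the collection name `field-plots`, and the helper `items_url` are hypothetical.

```python
from urllib.parse import urlencode


def items_url(base_url, collection_id, bbox=None, limit=10):
    """Build an OGC API - Features request for the items of one collection."""
    params = {"limit": limit}
    if bbox is not None:
        # bbox is west,south,east,north in the collection's default CRS.
        params["bbox"] = ",".join(str(v) for v in bbox)
    # Keep commas unescaped so the bbox stays readable.
    query = urlencode(params, safe=",")
    return f"{base_url.rstrip('/')}/collections/{collection_id}/items?{query}"


# Hypothetical server and ground truth collection.
url = items_url("https://gt.example.org/api", "field-plots", bbox=(-73.1, 45.5, -72.8, 45.7))
print(url)
```

A GET on such a URL normally returns a GeoJSON FeatureCollection, which maps onto the collection and feature model described for the ground truth store.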
CONCLUSION
• The proof of concept platform demonstrates how (part 1 of 2):
• Existing stores of satellite EO data can be
analyzed in place using cloud-computing
resources, rather than requiring download
• New modular and user-friendly metadata
protocols, particularly SpatioTemporal Asset
Catalogs (STAC), can be used to provide a search
interface for satellite EO dataset discovery
CONCLUSION
• The proof of concept platform demonstrates how (part 2 of 2):
• The new OGC API – Features (WFS 3) standard
can be used to manage and make available ground
truth and other in-situ datasets
• Satellite EO analytic programs in Python can be
created interactively, and then scaled to analyze
large areas and deep timeseries using the Xarray
and Dask libraries
• Ingestion, machine learning, analytical and
pre-processing applications (both binary and
Python-based) can be linked to form scalable satellite EO
data processing chains
CONCLUSION
• Bring your algorithm to the data, not the
other way around
Email contacts:
info@geoanalytics.ca
jsuwala@hatfieldgroup.com

2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Tech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfTech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdf
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 

GEO Analytics Canada Overview April 2020

  • 1. GEO ANALYTICS CANADA Demonstration Platform Overview April 2020
  • 2. THE PROBLEM • Satellite EO data is now too big to analyze using traditional desktop analytic tools • Impossible to analyze satellite EO data over wide areas and deep timeseries using traditional tools • Chart: NASA EO archive (EOSDIS) growth, approaching 246 PB in 2025
  • 3. THE SOLUTION • Bring your algorithm to the data, not the other way around • Embrace big data tools and systems used in other areas • Transition away from desktop analytics to cloud-native analytics • This new era requires partnerships between IT and satellite EO experts • Demonstration proof of concept platform: www.geoanalytics.ca
  • 4. DEMONSTRATION PLATFORM ARCHITECTURE • Based on Hatfield’s direct experience with ESA big data analytic platforms: • TEPs, DIAS, Oil and Gas Platform, Energy Corridor monitoring platform, etc. • Informed by competitive analysis of other internationally known platforms: • OpenDataCube, Google Earth Engine, Hexagon's M.appX, CS-SI’s GeoStorm, FAO’s Sepal, EarthServer’s Rasdaman, Terradue’s Ellip, EOS’s Platform, DigitalGlobe’s GBDX, and Radiant.Earth’s platform
  • 5. DEMONSTRATION PLATFORM ARCHITECTURE (diagram labels): Object Storage (EO, ARD + project shared data storage); Kubernetes On-Demand Compute; Docker image storage; System Functions; STAC indexing of EO assets; GT Data Store with OGC API-Features; OpenLDAP + DEX authentication; KubeFlow batch processing and machine learning; Kubernetes Compute Cluster (core system nodes, per-user private interactive compute nodes, on-demand scalable compute nodes); Web Portal; GitLab private code repository + Docker registry; Jupyter-Lab model development environment; System documentation + examples; Desktops + tools (QGIS, SNAP, etc.) in a browser; GT data upload and management functions; EO data query + discovery; User + cost management; Infrastructure as a Service; Software as a Service; Web-map tile generation; EO data pre-processing functions; File Browser; NFS Storage (user secure data storage); Cost accounting
  • 6. INFRASTRUCTURE AS A SERVICE • Scalable storage and compute services
  • 7. INFRASTRUCTURE AS A SERVICE • Key Requirements: • Providing managed Kubernetes clusters: dynamically scheduled and scaled containerized workloads • Availability of pre-emptible nodes: large-scale computations done in a cost-effective manner • Having a Canadian data center: to comply with Canadian data residency requirements • Selected: Google Cloud • Meets all the above requirements • Already hosts Landsat 4-8 and Sentinel-2 collections, so no need to duplicate them
  • 8. INFRASTRUCTURE AS A SERVICE • Vendor Neutrality: • GEO Analytics Canada uses technologies available on all major cloud hosting providers • APIs and layers of abstraction have been used to ensure neutrality • Vendor neutrality allows us to pursue multi-cloud integrations • For example: distributed machine learning, with compute done close to pre-existing data stores
  • 9. COMPUTE INFRASTRUCTURE • Entirely based on Kubernetes (K8s) • An open-source system for automating deployment, scaling, and management of containerized applications • Analytics run in parallel on many worker nodes so that big data workloads perform well • Pre-emptible nodes make on-demand compute very inexpensive • Applications and users request compute resources (number of CPUs and GB of RAM), which are provided on demand within seconds
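The resource requests mentioned on the compute slide can be sketched as a Kubernetes Pod manifest built in Python. This is a minimal illustration and not the platform's actual configuration: the pod name, the container image and the `build_pod_manifest` helper are hypothetical, and the 1 CPU / 5 GiB values are sample figures only.

```python
import json


def build_pod_manifest(name, image, cpus, ram_gib):
    """Return a Kubernetes Pod manifest (as a dict) requesting CPU and RAM.

    The name and image passed in are hypothetical placeholders.
    """
    resources = {"cpu": str(cpus), "memory": f"{ram_gib}Gi"}
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "restartPolicy": "Never",
            "containers": [
                {
                    "name": name,
                    "image": image,
                    # The scheduler only places the pod on a node that has
                    # at least the requested CPU and memory available.
                    "resources": {
                        "requests": resources,
                        "limits": dict(resources),
                    },
                }
            ],
        },
    }


manifest = build_pod_manifest(
    "ndvi-worker", "registry.example.com/eo/ndvi:latest", cpus=1, ram_gib=5
)
print(json.dumps(manifest, indent=2))
```

Setting the limits equal to the requests is one possible design choice; it keeps a worker from using more than it asked for, at the cost of not borrowing idle capacity.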
  • 10. STORAGE INFRASTRUCTURE • Object storage • Highly durable with built-in redundancy • Scales to exabytes of data • Lowest cost • On the Demonstration Platform, the following are stored in object storage: • Raw satellite EO data, including all downloaded MODIS products • Analysis ready satellite data (ARD) • User and project team shared files • Docker container images
  • 11. STORAGE INFRASTRUCTURE • NFS storage service • Compatible with all Linux-based systems used on the demonstration platform • Used to store user personal home directories • Secure: only available to a specific user (cannot be shared) • Transfer to project team storage area (on object store) if sharing required • Back-end storage is a standard SATA disk
  • 12. SOFTWARE AS A SERVICE • Web user interfaces and APIs for big EO data analytics and processing
  • 13. Public website: www.geoanalytics.ca (Demonstration Platform landing page)
  • 14. AUTHENTICATION, SECURITY AND USER MANAGEMENT • All applications and APIs require users to be authenticated • User management and profiles through LDAP • Single-Sign-On • Uses the industry-standard OAuth 2 protocol • Users only need to log in once to gain access to all applications • APIs require a token to authenticate
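Token-based API access of the kind described on the authentication slide is commonly done by sending the token in an `Authorization` header. The sketch below assumes the OAuth 2 bearer-token convention; the slides do not state how the platform expects the token to be passed, and the endpoint URL and token value are placeholders. The request is built but not sent.

```python
import urllib.request


def build_api_request(url, token):
    """Build (but do not send) a GET request carrying an OAuth 2 bearer token."""
    return urllib.request.Request(
        url,
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/json",
        },
        method="GET",
    )


# Hypothetical endpoint and placeholder token, for illustration only.
request = build_api_request(
    "https://api.example.com/stac/collections", "YOUR_ACCESS_TOKEN"
)
print(request.get_method(), request.full_url)
print(request.get_header("Authorization"))
```

Passing the prepared request to `urllib.request.urlopen` would send it; that step is left out here because the endpoint is fictional.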
  • 15. EO DATA QUERY AND DISCOVERY
  • 16. EO DATA QUERY AND DISCOVERY • Web-browser based browse and search interfaces • Browse and search all datasets • Query and view collections by time and location • SpatioTemporal Asset Catalog (STAC) API of all EO datasets • OGC API-Features (WFS3) compliant metadata server • API documented at www.stacspec.org
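A query by time and location against a STAC API usually takes the form of an item search with `collections`, `bbox`, `datetime` and `limit` parameters. The sketch below only builds the JSON body for such a search; the bounding box is an arbitrary illustrative area, the `build_stac_search` helper is hypothetical, and the platform's actual endpoint is not shown in the slides.

```python
import json


def build_stac_search(collection, bbox, start, end, limit=10):
    """Build the JSON body for a STAC API item search.

    bbox is [west, south, east, north] in longitude/latitude degrees;
    start and end are RFC 3339 timestamps forming a closed interval.
    """
    if len(bbox) != 4:
        raise ValueError("bbox must have four values: west, south, east, north")
    return {
        "collections": [collection],
        "bbox": list(bbox),
        "datetime": f"{start}/{end}",
        "limit": limit,
    }


# Illustrative search area and dates, not taken from the platform.
body = build_stac_search(
    "landsat-8-l1",
    [-73.1, 45.5, -72.8, 45.7],
    "2018-01-01T00:00:00Z",
    "2018-12-31T23:59:59Z",
)
print(json.dumps(body, indent=2))
```

The body would typically be sent as a POST to the catalog's `/search` endpoint, with matching items returned as a GeoJSON FeatureCollection.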
  • 17. EO DATA QUERY AND DISCOVERY • Current EO Data Collections:
      Collection Name | Description | Time Period Available
      landsat-8-l1 | Landsat-8 images over eastern Canadian landmass (Manitoba east) | 2003-2020
      modis.MCD12Q1 | MODIS Land Cover | 2000-2020
      modis.MOD09GQ | Terra Surface Reflectance | 2000-2020
      modis.MOD09Q1 | Terra Surface Reflectance | 2000-2020
      modis.MOD11A1 | Terra Land Surface Temperature and Emissivity | 2000-2020
      modis.MOD11A2 | Terra Land Surface Temperature and Emissivity | 2000-2020
      modis.MOD13Q1 | Terra Vegetation Indices | 2000-2020
      modis.mod09gq.veg.ndvi | NDVI derived from Terra Surface Reflectance | 2000-2020
      modis.mod09gq.veg.evi2 | EVI2 derived from Terra Surface Reflectance | 2000-2020
  • 18. EO DATA INGESTION AND PRE-PROCESSING
  • 19. EO DATA INGESTION AND PRE-PROCESSING • Fully uses the computing power and scalability of the IaaS tier • Multi-stage data processing pipelines • Enables containerized applications to be put into a processing chain that can be scaled massively • Implemented using KubeFlow • Primarily designed to enable machine learning (ML) workflows • The same ML workflow constructs are repurposed for EO data ingestion and pre-processing
  • 20. EO DATA INGESTION AND PRE-PROCESSING • Proof of concept EO data pipelines created: • Creates Level-2 Sentinel-2 products using Sen2Cor • Runs any set of commands available through ESA’s Sentinel Application Platform (SNAP) software • Downloads MODIS products to the object store and adds each product to the EO metadata system • Adds Landsat-8 images over the eastern Canadian landmass (i.e. Manitoba east) to the EO metadata system • Creates NDVI and EVI2 products from Terra Surface Reflectance products • Creates a daily thermal average product from Terra Land Surface Temperature products
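The NDVI and EVI2 products named in the pipeline list are computed from the red and near-infrared (NIR) reflectance bands. The sketch below uses the standard formulas, NDVI = (NIR - Red) / (NIR + Red) and EVI2 = 2.5 (NIR - Red) / (NIR + 2.4 Red + 1), in plain Python. It assumes reflectance values already scaled to the 0-1 range; the sample pixel values are made up, and the platform's own pipeline code is not shown in the slides.

```python
def ndvi(red, nir):
    """Normalized Difference Vegetation Index: (NIR - Red) / (NIR + Red)."""
    denominator = nir + red
    if denominator == 0:
        return None  # index is undefined for this pixel
    return (nir - red) / denominator


def evi2(red, nir):
    """Two-band Enhanced Vegetation Index.

    EVI2 = 2.5 * (NIR - Red) / (NIR + 2.4 * Red + 1)
    """
    denominator = nir + 2.4 * red + 1.0
    if denominator == 0:
        return None
    return 2.5 * (nir - red) / denominator


# Made-up reflectance samples (dense vegetation, sparse vegetation, bare surface).
red_band = [0.05, 0.10, 0.30]
nir_band = [0.45, 0.30, 0.30]

for red, nir in zip(red_band, nir_band):
    print(f"red={red:.2f} nir={nir:.2f} "
          f"ndvi={ndvi(red, nir):.3f} evi2={evi2(red, nir):.3f}")
```

In a production pipeline the same arithmetic would normally be applied to whole raster arrays rather than to one pixel at a time.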
  • 21. EO DATA INGESTION AND PRE-PROCESSING • NDVI and EVI2 derived from Terra Surface Reflectance Pipeline: • Processing completed for all products available between 2000 and 2020 • Results stored in object storage and indexed in EO data query system • Results available through all platform systems, including EO data query and discovery system, File Browser, desktop in a browser, etc. • Runtime Example: • 3 years of data (3 TB) processed in 13 hours • 36 processing pods (1 per month), each allocated 1 vCPU and 5 GB RAM • Total cluster resources: 36 vCPU, 180 GB RAM • Figure: Viewing the NDVI product using QGIS through the ‘desktop in a browser’ system
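The runtime example splits three years of data into one job per month and sizes the cluster as pods times per-pod resources. A small sketch of that partitioning and arithmetic follows; the per-pod figures come from the slide, while the start year and the `monthly_jobs` helper are arbitrary choices made for illustration.

```python
from datetime import date


def monthly_jobs(start_year, n_years):
    """Return one (first_day, last_day) date range per calendar month."""
    jobs = []
    for year in range(start_year, start_year + n_years):
        for month in range(1, 13):
            first = date(year, month, 1)
            # The day before the first day of the next month ends this month.
            if month == 12:
                next_first = date(year + 1, 1, 1)
            else:
                next_first = date(year, month + 1, 1)
            last = date.fromordinal(next_first.toordinal() - 1)
            jobs.append((first, last))
    return jobs


VCPU_PER_POD = 1    # from the slide
RAM_GB_PER_POD = 5  # from the slide

jobs = monthly_jobs(2017, 3)  # the start year is an arbitrary example
total_vcpu = len(jobs) * VCPU_PER_POD
total_ram_gb = len(jobs) * RAM_GB_PER_POD
print(len(jobs), total_vcpu, total_ram_gb)  # prints: 36 36 180
```

Each date range would be handed to one processing pod, which is why the cluster totals match the slide: 36 pods need 36 vCPU and 180 GB of RAM.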
  • 22. EO DATA INGESTION AND PRE-PROCESSING • Conversion of 10 Sentinel-2 L1A tiles to L2A • Typically ~3-4 hours • GEOAnalytics: ~28 minutes
  • 24. JUPYTER-LAB ANALYTIC ENVIRONMENT • Python-based scalable data analytics • Interacts with Kubernetes to provide on-demand scalable compute • Core software systems: • Jupyter-Lab: provides the web application framework for interactive analytics • Xarray: provides an N-dimensional array interface and toolset • Iris: provides methods for analysing and visualising meteorological and oceanographic data sets • Dask: provides flexible parallel computing for analytics • Zarr: the next-generation, cloud-native file format for gridded datasets
  • 26. JUPYTER-LAB ANALYTIC ENVIRONMENT • Implements a “Pangeo” Environment • www.pangeo.io • Supports both HPC and Cloud infrastructure • Similar in nature to the European Joint Research Centre’s “Earth Observation Data and Processing Platform” (JEODPP) • https://jeodpp.jrc.ec.europa.eu/home/
  • 27. JUPYTER-LAB ANALYTIC ENVIRONMENT • Hatfield has started a library of example notebooks on how to use the Jupyter-Lab Environment • Access Landsat data through the STAC API and process/analyze it to create an NDVI timeseries • Query EO data hosted on GEOAnalytics.ca using OWSLib • https://github.com/geoanalytics-ca/example-notebooks
  • 28. JUPYTER-LAB ANALYTIC ENVIRONMENT • NDVI Landsat-8 Example Notebook: • 30 nodes, 210 GB RAM, 60 CPUs • Random location close to Saint-Hyacinthe, QC • Figures: NDVI of 2018 acquisitions; mean NDVI
  • 29. GITLAB PRIVATE CODE REPOSITORY AND DOCKER REGISTRY
  • 30. GITLAB SOURCE CODE REPOSITORY • Collaboration and sharing of source code with Git • Private and shared repositories available
  • 31. DOCKER IMAGE STORE / CONTAINER REGISTRY • The container registry is backed by the object store system • Cost effective storage of large container images • Images in registry can be used in scalable workflows in the platform’s EO data ingestion and pre-processing systems
  • 33. DESKTOPS IN A BROWSER • Provides users with their own personal Ubuntu desktop environment • Accessible through a browser • Enables data exploration directly on the platform, reducing the need to download data • Users can select the amount of RAM + CPU on startup: • From 1 to 31 CPUs • From 1 to 116 GB RAM
  • 34. DESKTOPS IN A BROWSER • Pre-installed software (SNAP, QGIS, Firefox, etc.) • Users can install their own software and customize the desktop environment • EO data stores are mounted in the desktop environment for easy access: • All Sentinel-2 data • All Landsat 4-8 data • Pre-processed data products • Figure: Viewing a Sentinel-2 product using QGIS through the ‘desktop in a browser’ system
  • 36. FILE BROWSER • Enables browsing and downloading of all data stored on the platform for use in external systems • Users can view and download data from: • All EO data stores • Shared data between users of the platform • Their own personal data
  • 37. GROUND TRUTH DATA MANAGEMENT
  • 38. GROUND TRUTH DATA MANAGEMENT • Vector ground truth data can be uploaded, viewed and deleted • Users upload a SHP file which is imported into the system • Organized into collections that contain features • A SHP file is a “collection”
  • 39. GROUND TRUTH DATA MANAGEMENT • Features can be browsed and searched interactively • Webmap displays features • API endpoints implement the OGC API-Features specification (previously referred to as WFS3) • Implemented using pygeoapi
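Under the OGC API - Features core specification, the features of a collection are exposed at `/collections/{collectionId}/items`, with `limit` and `bbox` as standard query parameters. The sketch below builds such a URL; the base URL, the `field-plots` collection name and the `items_url` helper are hypothetical, since the slides do not list the platform's actual endpoints.

```python
from urllib.parse import quote, urlencode


def items_url(base_url, collection_id, bbox=None, limit=10):
    """Build an OGC API - Features URL for the items of one collection.

    bbox, if given, is (west, south, east, north) in degrees.
    """
    params = {"limit": limit}
    if bbox is not None:
        if len(bbox) != 4:
            raise ValueError("bbox must have four values")
        params["bbox"] = ",".join(str(value) for value in bbox)
    path = f"{base_url.rstrip('/')}/collections/{quote(collection_id)}/items"
    # Keep commas readable in the bbox parameter instead of encoding them.
    return f"{path}?{urlencode(params, safe=',')}"


# Hypothetical server and collection, for illustration only.
url = items_url(
    "https://gt.example.com/",
    "field-plots",
    bbox=(-73.1, 45.5, -72.8, 45.7),
    limit=5,
)
print(url)
```

A GET on such a URL is expected to return the matching features as GeoJSON, which is what a web map or a notebook would then consume.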
  • 41. CONCLUSION • The proof of concept platform demonstrates how (part 1): • Existing stores of satellite EO data can be analyzed in-place using cloud-computing resources, rather than requiring download • New modular and user-friendly metadata protocols, particularly SpatioTemporal Asset Catalogs (STAC), can be used to provide a search interface for satellite EO dataset discovery
  • 42. CONCLUSION • The proof of concept platform demonstrates how (part 2): • The new OGC API-Features (WFS 3) standard can be used to manage and make available ground truth and other in-situ datasets • Satellite EO analytic programs in Python can be created interactively, and then scaled to analyze large areas and deep timeseries using the Xarray and Dask libraries • Ingestion, machine learning, analytical and pre-processing applications (both binary and Python based) can be linked to form scalable satellite EO data processing chains
  • 43. CONCLUSION • Bring your algorithm to the data, not the other way around • Email contacts: info@geoanalytics.ca, jsuwala@hatfieldgroup.com