Ensuring Long Term Access to
Remotely Sensed HDF4 Data
with Layout Maps
Ruth Duerr, NSIDC
Christopher Lynnes, GES DISC
The HDF Group

Oct. 16 2008

HDF and HDF-EOS Workshop XII

1
Background and basic
concept

I’m Plastic Man!

HDF4 is

EXTENSIBLE

FLEXIBLE
SELF-DESCRIBING

But
There’s a cost…

Complexity!

How do we save HDF users
from having to deal with all of
the complexity under the
hood?

Through the HDF software
libraries, either by using the
HDF APIs directly or by using
HDF tools that depend on the
HDF libraries.
But what about the future…
• There is a risk in depending solely on the HDF
libraries to access HDF-formatted data over the
long term.
• It is possible, especially in the distant future, that
the libraries may not be available.

Really smart people and software?
Maybe future
data users and
their computers
will be so smart
that the HDF4
format will be a
piece of cake.

Maybe not.

We need an “easy” button

“If only we could read HDF data with an
independent program that does not rely on
the HDF API…
A possible approach [would be to] extend
hdfls to print a hierarchical map of a data file,
[and] write ncdump/hdp-like utilities to find,
assemble and write out SDSes and vdatas.”
“Leveraging HDF Utilities”
Christopher Lynnes
HDF Workshop X.
HDF4 file layout

The project

HDF4 mapping
• Problem
− The complex internal byte layout of HDF files
requires one to use the API to access HDF data.
− This makes long-term readability of HDF data
dependent on long-term allocation of resources to
support HDF software.

• Proposed solution
− Create a map of the layout of data objects in an
HDF file, allowing a simple reader to be written to
access the data.
HDF4 mapping project activities
1. Assess and categorize HDF4 data held by NASA
− To determine what types of objects to map.
− To get an idea of the magnitude of the project.

2. Develop prototype for proof of concept
− Develop markup-language based layout
specification.
− Develop tool to produce layout for an HDF4 file.
− Develop and test two independent tools to read
HDF4 data based solely on the map files.

Project activities (continued)
3. Assess results and plan next steps
− Present results and options for proceeding to the
community.
− Assess the likely usefulness of this approach, as
well as any desirable modifications
− Evaluate the effort required for a full solution that
best meets community needs
− Submit a proposal for the work needed to provide
a full solution

1. Assess and categorize

How many HDF4 products?
Data Center    HDF4 Products
ASF            0
GES-DISC       236
GHRC           54
ASDC           63
LP-DAAC        67
NSIDC          47
ORNL-DAAC      2
PO.DAAC        22
SDAC           0
MrDC           95
Total          586
Data characteristics
Product Characteristics Examined
• Product Identification
  − Product Name
  − Data Level
  − Archive Location
  − Product Version
• Whether the product was multi-file
• For HDF-EOS products
  − HDF-EOS version
  − For point data
    • Number of point data sets
    • Maximum number of levels
  − For swath data
    • Number of swaths
    • Maximum number of dimensions
    • Organized by time, space, both, or other
    • Whether dimension maps were used
  − For gridded data
    • Number of grids
    • Max number of dimensions in a grid
    • Number of projections used
    • Whether any grids were indexed
• HDF Version
• For SDS data
  − Number of SDSs
  − Maximum number of dimensions
  − Did any SDS have attributes
  − Was any SDS annotated
  − Were dimension scales used
  − Was compression used and if so what kind
  − Was chunking used
• For Vdata
  − Number of Vdata structures
  − Did any Vdata have attributes
  − Did any Vdata fields have attributes
  − Was compression used and if so what kind
  − Was chunking used
• For raster data
  − Number of 8-bit rasters
  − Number of 24-bit rasters
  − Number of general rasters
  − Whether any rasters had attributes
  − Whether any rasters were compressed
  − Whether any rasters were chunked
  − Whether there were any palettes
Other results
• Slightly more than half of the HDF4 products are in HDF-EOS 2
format
• Grids are the most common HDF-EOS data structures in use
• No products use a combination of grid, swath, and point data
structures

2. Prototype and proof of
concept

HDF4 mapping prototype workflow

[Workflow diagram]
HDF4 File "H4.hdf" -> hmap (linked with HDF4 library) -> HDF4 Mapping File (XML document) "H4.hdf.map.xml"
Map contents: Groups, Data Objects, Structural and Application Metadata; Locations of Object Data
Reader 1 (C program) and Reader 2 (Perl script) -> Object Data

Proof-of-concept results
• The HDF Group created prototype map
generation software and a draft map
specification
• Map generator was tested on a wide variety of
data products
• GES-DISC and NSIDC independently wrote
software that uses maps to read data files in
NSIDC’s and GES-DISC’s archives
• Summary - the concept is feasible!
Example map fragment
<?xml version="1.0" encoding="utf-8"?>
<hdf4:HDFMap xmlns:hdf4="http://www.hdfgroup.org/HDF4/HDF4Map">
  <hdf4:RootGroup>
    <hdf4:SDS objName="data1" objPath="/" objID="xid-DFTAG_NDG-2">
      <hdf4:Attribute name="data range" ntDesc="32-bit signed integer">
        0 255
      </hdf4:Attribute>
      <hdf4:Datatype dtypeClass="INT" dtypeSize="4" byteOrder="BE" />
      <hdf4:Dataspace ndims="2">
        10 100
      </hdf4:Dataspace>
      <hdf4:Datablock nblocks="1">
        <hdf4:BlockOffset>
          2502
        </hdf4:BlockOffset>
        <hdf4:BlockNbytes>
          4000
        </hdf4:BlockNbytes>
      </hdf4:Datablock>
    </hdf4:SDS>
  </hdf4:RootGroup>
</hdf4:HDFMap>
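
To make the idea concrete, here is a minimal sketch of a map-based reader in Python. It is not one of the project's readers (those were a C program and a Perl script); it only illustrates that the fragment above carries enough information to pull the array out with a seek and a read. The function name read_sds is hypothetical. The sketch assumes signed integers only, that a byteOrder other than "BE" means little-endian, and that a Datablock with several blocks lists its BlockOffset and BlockNbytes elements in matching order; none of these is confirmed by the draft specification shown here.

```python
import struct
import xml.etree.ElementTree as ET

# Namespace taken from the example map fragment.
NS = {"hdf4": "http://www.hdfgroup.org/HDF4/HDF4Map"}
# struct codes for signed integers by byte size (assumption: INT means signed).
INT_CODES = {1: "b", 2: "h", 4: "i", 8: "q"}


def read_sds(map_xml, hdf_file, name):
    """Return (dims, values) for the SDS called `name`.

    `map_xml` is the map document (str or bytes); `hdf_file` is the HDF4
    file opened in binary mode. No HDF library is involved.
    """
    root = ET.fromstring(map_xml)
    for sds in root.iterfind(".//hdf4:SDS", NS):
        if sds.get("objName") != name:
            continue
        dtype = sds.find("hdf4:Datatype", NS)
        if dtype.get("dtypeClass") != "INT":
            raise NotImplementedError("this sketch handles INT data only")
        size = int(dtype.get("dtypeSize"))
        # Assumption: anything other than "BE" is treated as little-endian.
        order = ">" if dtype.get("byteOrder") == "BE" else "<"
        dims = [int(n) for n in sds.find("hdf4:Dataspace", NS).text.split()]
        block = sds.find("hdf4:Datablock", NS)
        offsets = [int(e.text) for e in block.findall("hdf4:BlockOffset", NS)]
        lengths = [int(e.text) for e in block.findall("hdf4:BlockNbytes", NS)]
        raw = b""
        # Assumption: offsets and lengths pair up in document order.
        for offset, nbytes in zip(offsets, lengths):
            hdf_file.seek(offset)
            raw += hdf_file.read(nbytes)
        count = len(raw) // size
        values = struct.unpack(f"{order}{count}{INT_CODES[size]}", raw)
        return dims, values
    raise KeyError(name)
```

For the fragment above, the map is self-consistent: 10 x 100 elements of 4 bytes each is 4000 bytes, which matches BlockNbytes, so a reader can cross-check the dataspace against the block sizes before trusting the data.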

Next steps

Effort for full implementation
• Generate maps for existing archives
− GES-DISC approach: append the map XML to the XML
files already kept for each file in their archive
− NSIDC non-ECS data implementation: add an XML file
for each data file in same directory
− Other systems TBD

• Generate maps for new data
− Add map generation as a step in the ingest process
using a standalone tool
− Request product generation systems to use new API
calls that generate maps

• Develop production quality implementation of
mapping tool, and possibly an API.
• Possibly do similar assessment for HDF5 maps.
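
For archives that keep one map file next to each data file, a small audit script can report which files still lack a map. The sketch below is illustrative only: the function name files_missing_maps is hypothetical, and the ".hdf" and ".map.xml" suffixes are assumptions based on the "H4.hdf" and "H4.hdf.map.xml" names in the prototype workflow, not a documented archive convention.

```python
from pathlib import Path


def files_missing_maps(directory, data_suffix=".hdf", map_suffix=".map.xml"):
    """List data files in `directory` that have no sibling map file.

    Assumption: the map for "X.hdf" is stored as "X.hdf.map.xml"
    in the same directory.
    """
    directory = Path(directory)
    missing = []
    for data_file in sorted(directory.glob(f"*{data_suffix}")):
        map_file = data_file.with_name(data_file.name + map_suffix)
        if not map_file.exists():
            missing.append(data_file.name)
    return missing
```

Running such a check as part of ingest, or periodically over an existing archive, would give a simple measure of how much of the holdings is covered by maps.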
How you can help
• Consider what it might take to implement this for
your archive - contact Ruth if you’d like support
• Review the materials on the wiki and elsewhere - comment heavily!

For more information
• Wiki page added to Confluence wiki
• Project page at The HDF Group website:
− http://www.hdfgroup.org/projects/hdf4mapping/

• Paper at 2008 fall AGU
• Paper “Ensuring Long Term Access to Remotely
Sensed Data with Layout Maps” in the upcoming
TGRSS special issue on archiving and distribution

Thank you.
This report is based upon work supported in part
by a Cooperative Agreement with the National
Aeronautics and Space Administration (NASA)
under NASA Award NNX06AC83A.
Any opinions, findings, and conclusions or
recommendations expressed in this material are
those of the author(s) and do not necessarily
reflect the views of the National Aeronautics and
Space Administration.
Oct. 16 2008

HDF and HDF-EOS Workshop XII

39

Contenu connexe

En vedette

Hdf5 current future
Hdf5 current futureHdf5 current future
Hdf5 current futuremfolk
 
Unidata's Approach to Community Broadening through Data and Technology Sharing
Unidata's Approach to Community Broadening through Data and Technology SharingUnidata's Approach to Community Broadening through Data and Technology Sharing
Unidata's Approach to Community Broadening through Data and Technology SharingThe HDF-EOS Tools and Information Center
 

En vedette (18)

HDFView and HDF Java Products
HDFView and HDF Java ProductsHDFView and HDF Java Products
HDFView and HDF Java Products
 
Shifting the Burden from the User to the Data Provider
Shifting the Burden from the User to the Data ProviderShifting the Burden from the User to the Data Provider
Shifting the Burden from the User to the Data Provider
 
ENVI/IDL for HDF
ENVI/IDL for HDFENVI/IDL for HDF
ENVI/IDL for HDF
 
The CFD General Notation System transition to HDF5
The CFD General Notation System transition to HDF5The CFD General Notation System transition to HDF5
The CFD General Notation System transition to HDF5
 
Workshop Discussion: HDF & HDF-EOS Future Direction
Workshop Discussion: HDF & HDF-EOS Future DirectionWorkshop Discussion: HDF & HDF-EOS Future Direction
Workshop Discussion: HDF & HDF-EOS Future Direction
 
HDF5 OPeNDAP project update and demo
HDF5 OPeNDAP project update and demoHDF5 OPeNDAP project update and demo
HDF5 OPeNDAP project update and demo
 
HDF Update
HDF UpdateHDF Update
HDF Update
 
Proposal for adding Named Dimensions to HDF5 Arrays
Proposal for adding Named Dimensions to HDF5 ArraysProposal for adding Named Dimensions to HDF5 Arrays
Proposal for adding Named Dimensions to HDF5 Arrays
 
The MATLAB Low-Level HDF5 Interface
The MATLAB Low-Level HDF5 InterfaceThe MATLAB Low-Level HDF5 Interface
The MATLAB Low-Level HDF5 Interface
 
Reading HDF family of formats via NetCDF-Java / CDM
Reading HDF family of formats via NetCDF-Java / CDMReading HDF family of formats via NetCDF-Java / CDM
Reading HDF family of formats via NetCDF-Java / CDM
 
ORNL DAAC MODIS Land Product Subsets
ORNL DAAC MODIS Land Product SubsetsORNL DAAC MODIS Land Product Subsets
ORNL DAAC MODIS Land Product Subsets
 
Using HDF5 Archive Information Package to preserve HDF-EOS2 data
Using HDF5 Archive Information Package to preserve HDF-EOS2 dataUsing HDF5 Archive Information Package to preserve HDF-EOS2 data
Using HDF5 Archive Information Package to preserve HDF-EOS2 data
 
Introduction to HDF5
Introduction to HDF5Introduction to HDF5
Introduction to HDF5
 
Advanced HDF5 Features
Advanced HDF5 FeaturesAdvanced HDF5 Features
Advanced HDF5 Features
 
Hdf5 current future
Hdf5 current futureHdf5 current future
Hdf5 current future
 
HDF5 iRODS
HDF5 iRODSHDF5 iRODS
HDF5 iRODS
 
Unidata's Approach to Community Broadening through Data and Technology Sharing
Unidata's Approach to Community Broadening through Data and Technology SharingUnidata's Approach to Community Broadening through Data and Technology Sharing
Unidata's Approach to Community Broadening through Data and Technology Sharing
 
HDF5 Tools
HDF5 ToolsHDF5 Tools
HDF5 Tools
 

Plus de The HDF-EOS Tools and Information Center

STARE-PODS: A Versatile Data Store Leveraging the HDF Virtual Object Layer fo...
STARE-PODS: A Versatile Data Store Leveraging the HDF Virtual Object Layer fo...STARE-PODS: A Versatile Data Store Leveraging the HDF Virtual Object Layer fo...
STARE-PODS: A Versatile Data Store Leveraging the HDF Virtual Object Layer fo...The HDF-EOS Tools and Information Center
 

Plus de The HDF-EOS Tools and Information Center (20)

Cloud-Optimized HDF5 Files
Cloud-Optimized HDF5 FilesCloud-Optimized HDF5 Files
Cloud-Optimized HDF5 Files
 
Accessing HDF5 data in the cloud with HSDS
Accessing HDF5 data in the cloud with HSDSAccessing HDF5 data in the cloud with HSDS
Accessing HDF5 data in the cloud with HSDS
 
The State of HDF
The State of HDFThe State of HDF
The State of HDF
 
Highly Scalable Data Service (HSDS) Performance Features
Highly Scalable Data Service (HSDS) Performance FeaturesHighly Scalable Data Service (HSDS) Performance Features
Highly Scalable Data Service (HSDS) Performance Features
 
Creating Cloud-Optimized HDF5 Files
Creating Cloud-Optimized HDF5 FilesCreating Cloud-Optimized HDF5 Files
Creating Cloud-Optimized HDF5 Files
 
HDF5 OPeNDAP Handler Updates, and Performance Discussion
HDF5 OPeNDAP Handler Updates, and Performance DiscussionHDF5 OPeNDAP Handler Updates, and Performance Discussion
HDF5 OPeNDAP Handler Updates, and Performance Discussion
 
Hyrax: Serving Data from S3
Hyrax: Serving Data from S3Hyrax: Serving Data from S3
Hyrax: Serving Data from S3
 
Accessing Cloud Data and Services Using EDL, Pydap, MATLAB
Accessing Cloud Data and Services Using EDL, Pydap, MATLABAccessing Cloud Data and Services Using EDL, Pydap, MATLAB
Accessing Cloud Data and Services Using EDL, Pydap, MATLAB
 
HDF - Current status and Future Directions
HDF - Current status and Future DirectionsHDF - Current status and Future Directions
HDF - Current status and Future Directions
 
HDFEOS.org User Analsys, Updates, and Future
HDFEOS.org User Analsys, Updates, and FutureHDFEOS.org User Analsys, Updates, and Future
HDFEOS.org User Analsys, Updates, and Future
 
HDF - Current status and Future Directions
HDF - Current status and Future Directions HDF - Current status and Future Directions
HDF - Current status and Future Directions
 
H5Coro: The Cloud-Optimized Read-Only Library
H5Coro: The Cloud-Optimized Read-Only LibraryH5Coro: The Cloud-Optimized Read-Only Library
H5Coro: The Cloud-Optimized Read-Only Library
 
MATLAB Modernization on HDF5 1.10
MATLAB Modernization on HDF5 1.10MATLAB Modernization on HDF5 1.10
MATLAB Modernization on HDF5 1.10
 
HDF for the Cloud - Serverless HDF
HDF for the Cloud - Serverless HDFHDF for the Cloud - Serverless HDF
HDF for the Cloud - Serverless HDF
 
HDF5 <-> Zarr
HDF5 <-> ZarrHDF5 <-> Zarr
HDF5 <-> Zarr
 
HDF for the Cloud - New HDF Server Features
HDF for the Cloud - New HDF Server FeaturesHDF for the Cloud - New HDF Server Features
HDF for the Cloud - New HDF Server Features
 
Apache Drill and Unidata THREDDS Data Server for NASA HDF-EOS on S3
Apache Drill and Unidata THREDDS Data Server for NASA HDF-EOS on S3Apache Drill and Unidata THREDDS Data Server for NASA HDF-EOS on S3
Apache Drill and Unidata THREDDS Data Server for NASA HDF-EOS on S3
 
STARE-PODS: A Versatile Data Store Leveraging the HDF Virtual Object Layer fo...
STARE-PODS: A Versatile Data Store Leveraging the HDF Virtual Object Layer fo...STARE-PODS: A Versatile Data Store Leveraging the HDF Virtual Object Layer fo...
STARE-PODS: A Versatile Data Store Leveraging the HDF Virtual Object Layer fo...
 
HDF5 and Ecosystem: What Is New?
HDF5 and Ecosystem: What Is New?HDF5 and Ecosystem: What Is New?
HDF5 and Ecosystem: What Is New?
 
HDF5 Roadmap 2019-2020
HDF5 Roadmap 2019-2020HDF5 Roadmap 2019-2020
HDF5 Roadmap 2019-2020
 

Dernier

Glenn Lazarus- Why Your Observability Strategy Needs Security Observability
Glenn Lazarus- Why Your Observability Strategy Needs Security ObservabilityGlenn Lazarus- Why Your Observability Strategy Needs Security Observability
Glenn Lazarus- Why Your Observability Strategy Needs Security Observabilityitnewsafrica
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxLoriGlavin3
 
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better StrongerModern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better Strongerpanagenda
 
Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfNeo4j
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
 
React Native vs Ionic - The Best Mobile App Framework
React Native vs Ionic - The Best Mobile App FrameworkReact Native vs Ionic - The Best Mobile App Framework
React Native vs Ionic - The Best Mobile App FrameworkPixlogix Infotech
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxLoriGlavin3
 
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfSo einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfpanagenda
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxLoriGlavin3
 
Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesKari Kakkonen
 
Generative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfGenerative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfIngrid Airi González
 
Scale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterScale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterMydbops
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsPixlogix Infotech
 
Zeshan Sattar- Assessing the skill requirements and industry expectations for...
Zeshan Sattar- Assessing the skill requirements and industry expectations for...Zeshan Sattar- Assessing the skill requirements and industry expectations for...
Zeshan Sattar- Assessing the skill requirements and industry expectations for...itnewsafrica
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxLoriGlavin3
 
2024 April Patch Tuesday
2024 April Patch Tuesday2024 April Patch Tuesday
2024 April Patch TuesdayIvanti
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
Design pattern talk by Kaya Weers - 2024 (v2)
Design pattern talk by Kaya Weers - 2024 (v2)Design pattern talk by Kaya Weers - 2024 (v2)
Design pattern talk by Kaya Weers - 2024 (v2)Kaya Weers
 

Dernier (20)

Glenn Lazarus- Why Your Observability Strategy Needs Security Observability
Glenn Lazarus- Why Your Observability Strategy Needs Security ObservabilityGlenn Lazarus- Why Your Observability Strategy Needs Security Observability
Glenn Lazarus- Why Your Observability Strategy Needs Security Observability
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better StrongerModern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
 
Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdf
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
React Native vs Ionic - The Best Mobile App Framework
React Native vs Ionic - The Best Mobile App FrameworkReact Native vs Ionic - The Best Mobile App Framework
React Native vs Ionic - The Best Mobile App Framework
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfSo einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
 
Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examples
 
Generative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfGenerative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdf
 
Scale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterScale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL Router
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
Zeshan Sattar- Assessing the skill requirements and industry expectations for...
Zeshan Sattar- Assessing the skill requirements and industry expectations for...Zeshan Sattar- Assessing the skill requirements and industry expectations for...
Zeshan Sattar- Assessing the skill requirements and industry expectations for...
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
 
2024 April Patch Tuesday
2024 April Patch Tuesday2024 April Patch Tuesday
2024 April Patch Tuesday
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
Design pattern talk by Kaya Weers - 2024 (v2)
Design pattern talk by Kaya Weers - 2024 (v2)Design pattern talk by Kaya Weers - 2024 (v2)
Design pattern talk by Kaya Weers - 2024 (v2)
 

Ensuring Long Term Access to Remotely Sensed HDF4 Data with Layout Maps

  • 1. Ensuring Long Term Access to Remotely Sensed HDF4 Data with Layout Maps Ruth Duerr, NSIDC Christopher Lynnes, GES DISC The HDF Group Oct. 16 2008 HDF and HDF-EOS Workshop XII 1
  • 2. Background and basic concept Oct. 16 2008 HDF and HDF-EOS Workshop XII 2
  • 3. I’m Plastic Man! HDF4 is EXTENSIBLE FLEXIBLE SELFDESCRIBING Oct. 16 2008 HDF and HDF-EOS Workshop XII 3
  • 4. But There’s a cost… Oct. 16 2008 HDF and HDF-EOS Workshop XII 4
  • 5. Complexity! Oct. 16 2008 HDF and HDF-EOS Workshop XII 5
  • 6. Oct. 16 2008 HDF and HDF-EOS Workshop XII 6
  • 7. Oct. 16 2008 HDF and HDF-EOS Workshop XII 7
  • 8. Oct. 16 2008 HDF and HDF-EOS Workshop XII 8
  • 9. Oct. 16 2008 HDF and HDF-EOS Workshop XII 9
  • 10. Oct. 16 2008 HDF and HDF-EOS Workshop XII 10
  • 11. Oct. 16 2008 HDF and HDF-EOS Workshop XII 11
  • 12. Oct. 16 2008 HDF and HDF-EOS Workshop XII 12
  • 13. How do we save HDF users from having to deal with all of the complexity under the hood? Oct. 16 2008 HDF and HDF-EOS Workshop XII 13
  • 14. Through the HDF software libraries, either by using the HDF APIs directly or by using HDF tools that depend on the HDF libraries. But what about the future… Oct. 16 2008 HDF and HDF-EOS Workshop XII 14
  • 15. • There is a risk in depending solely on the HDF libraries to access HDF-formatted data over the long term. • It is possible, especially in the distant future, that the libraries may not be available. Oct. 16 2008 HDF and HDF-EOS Workshop XII 15
  • 16. Really smart people and software? Maybe future data users and their computers will be so smart that the HDF4 format will be a piece of cake. Oct. 16 2008 HDF and HDF-EOS Workshop XII 16
  • 17. Maybe not. Oct. 16 2008 HDF and HDF-EOS Workshop XII 17
  • 18. We need an “easy” button Oct. 16 2008 HDF and HDF-EOS Workshop XII 18
  • 19. “If only we could read HDF data with an independent program that does not rely on the HDF API… A possible approach [would be to] extend hdfls to print a hierarchical map of a data file, [and] write ncdump/hdp-like utilities to find, assemble and write out SDSes and vdatas.” “Leveraging HDF Utilities” Christopher Lynnes HDF Workshop X. Oct. 16 2008 HDF and HDF-EOS Workshop XII 19
  • 20. Oct. 16 2008 HDF and HDF-EOS Workshop XII 20
  • 21. HDF4 file layout Oct. 16 2008 HDF and HDF-EOS Workshop XII 21
  • 22. HDF4 file layout Oct. 16 2008 HDF and HDF-EOS Workshop XII 22
  • 23. The project Oct. 16 2008 HDF and HDF-EOS Workshop XII 23
  • 24. HDF4 mapping • Problem − The complex internal byte layout of HDF files requires one to use the API to access HDF data. − This makes long-term readability of HDF data dependent on long-term allocation of resources to support HDF software. • Proposed solution − Create a map of the layout of data objects in an HDF file, allowing a simple reader to be written to access the data. Oct. 16 2008 HDF and HDF-EOS Workshop XII 24
  • 25. HDF4 mapping project activities 1. Assess and categorize HDF4 data held by NASA − To determine what types of objects to map. − To get an idea of the magnitude of the project. 1. Develop prototype for proof of concept − Develop markup-language based layout specification. − Develop tool to produce layout for an HDF4 file. − Develop and test two independent tools to read HDF4 data based solely on the map files. Oct. 16 2008 HDF and HDF-EOS Workshop XII 25
  • 26. Project activities (continued) 3. Assess results and plan next steps − Present results and options for proceeding to the community. − Assess the likely usefulness of this approach, as well as any desirable modifications − Evaluate the effort required for a full solution that best meets community needs − Submit a proposal for the work needed to provide a full solution Oct. 16 2008 HDF and HDF-EOS Workshop XII 26
  • 27. 1. Assess and categorize Oct. 16 2008 HDF and HDF-EOS Workshop XII 27
  • 28. How many HDF4 products? Data Center ASF HDF4 Products 0 GES-DISC GHRC 54 ASDC 63 LP-DAAC 67 NSIDC 47 ORNL-DAAC 2 PO.DAAC 22 SDAC 0 MrDC 95 Total HDF and HDFEOS Workshop Oct. XII 16 2008 236 586 28
  • 29. Data characteristics Product Characteristics Examined • Product Identification − − − − • • HDF-EOS version For point data • • − − • Number of swaths Maximum number of dimensions Organized by time, space, both, or other Whether dimension maps were used For gridded data • • • • Number of grids Max number of dimensions in a grid Number of projections used Whether any grids were indexed HDF Version − • Number of SDSs Maximum number of dimensions Did any SDS have attributes Was any SDS annotated Were dimension scales used Was compression used and if so what kind Was chunking used For Vdata − − − − − HDF and HDFEOS Workshop Oct. XII 16 2008 Number of 8-bit rasters Number of 24-bit rasters Number of general rasters Whether any rasters had attributes Whether any rasters were compressed Whether any rasters were chunked Whether there were any palettes For SDS data − − − − − − Number of point data sets Maximum number of levels For swath data • • • • For raster data − − − − − − − Product Name Data Level Archive Location Product Version Whether the product was multi-file For HDF-EOS products − − • • Number of Vdata structures Did any Vdata have attributes Did any Vdata fields have attributes Was compression used and if so what kind Was chunking used 29
  • 30. Other results • Slightly more than half of the HDF4 products are in HDF-EOS 2 format • Grids are the most common HDF-EOS data structures in use • No products use a combination of grid, swath, and point data structures HDF and HDFEOS Workshop Oct. XII 16 2008 30
  • 31. 2. Prototype and proof of concept Oct. 16 2008 HDF and HDF-EOS Workshop XII 31
  • 32. HDF4 mapping prototype workflow HDF4 File HDF4 File “H4.hdf” “H4.hdf” hmap hmap linked with linked with HDF4 library HDF4 library HDF4 Mapping File HDF4 Mapping File (XML document) (XML document) “H4.hdf.map.xml” “H4.hdf.map.xml” Groups, Data Objects, Structural and Application Metadata; Locations of Object Data Object Data Reader 1 Reader 2 2 (C program) (Perl Script) (Perl Script) October 15-18, 2008 HDF and HDF-EOS Workshop XII 32
  • 33. Proof-of-concept results • The HDF Group created prototype map generation software and a draft map specification • Map generator was tested on a wide variety of data products • GES-DISC and NSIDC independently wrote software that uses maps to read data files in NSIDC’s and GES-DISC’s archives • Summary - the concept is feasible! Oct. 16 2008 HDF and HDF-EOS Workshop XII 33
  • 34. Example map fragment

      <?xml version="1.0" encoding="utf-8"?>
      <hdf4:HDFMap xmlns:hdf4="http://www.hdfgroup.org/HDF4/HDF4Map">
        <hdf4:RootGroup>
          <hdf4:SDS objName="data1" objPath="/" objID="xid-DFTAG_NDG-2">
            <hdf4:Attribute name="data range" ntDesc="32-bit signed integer">
              0 255
            </hdf4:Attribute>
            <hdf4:Datatype dtypeClass="INT" dtypeSize="4" byteOrder="BE" />
            <hdf4:Dataspace ndims="2">
              10 100
            </hdf4:Dataspace>
            <hdf4:Datablock nblocks="1">
              <hdf4:BlockOffset> 2502 </hdf4:BlockOffset>
              <hdf4:BlockNbytes> 4000 </hdf4:BlockNbytes>
            </hdf4:Datablock>
          </hdf4:SDS>
        </hdf4:RootGroup>
      </hdf4:HDFMap>
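
To make the idea concrete, here is a minimal Python sketch of how a reader might use the map fragment above to pull an SDS out of a file without the HDF4 library. It is not the C or Perl reader from the prototype. The function names `read_sds` and `make_demo_file` are hypothetical, the demo file is a stand-in with the same byte layout rather than a real HDF4 file, and the sketch assumes signed integers, a single contiguous data block, and row-major element order.

```python
import math
import os
import struct
import tempfile
import xml.etree.ElementTree as ET

NS = {"hdf4": "http://www.hdfgroup.org/HDF4/HDF4Map"}

# Map fragment from the slide (attribute element omitted for brevity).
MAP_XML = b"""<?xml version="1.0" encoding="utf-8"?>
<hdf4:HDFMap xmlns:hdf4="http://www.hdfgroup.org/HDF4/HDF4Map">
  <hdf4:RootGroup>
    <hdf4:SDS objName="data1" objPath="/" objID="xid-DFTAG_NDG-2">
      <hdf4:Datatype dtypeClass="INT" dtypeSize="4" byteOrder="BE" />
      <hdf4:Dataspace ndims="2"> 10 100 </hdf4:Dataspace>
      <hdf4:Datablock nblocks="1">
        <hdf4:BlockOffset> 2502 </hdf4:BlockOffset>
        <hdf4:BlockNbytes> 4000 </hdf4:BlockNbytes>
      </hdf4:Datablock>
    </hdf4:SDS>
  </hdf4:RootGroup>
</hdf4:HDFMap>
"""

# struct format codes for signed integers, keyed by size in bytes.
INT_CODES = {1: "b", 2: "h", 4: "i", 8: "q"}


def read_sds(map_xml, data_path, name):
    """Return the named SDS as a list of rows, using only the map (hypothetical helper)."""
    root = ET.fromstring(map_xml)
    for sds in root.iterfind(".//hdf4:SDS", NS):
        if sds.get("objName") == name:
            break
    else:
        raise KeyError(name)

    dtype = sds.find("hdf4:Datatype", NS)
    if dtype.get("dtypeClass") != "INT":
        raise NotImplementedError("this sketch only handles INT data")
    size = int(dtype.get("dtypeSize"))
    order = ">" if dtype.get("byteOrder") == "BE" else "<"
    dims = [int(d) for d in sds.find("hdf4:Dataspace", NS).text.split()]

    block = sds.find("hdf4:Datablock", NS)
    if block.get("nblocks") != "1":
        raise NotImplementedError("this sketch only handles one contiguous block")
    offset = int(block.find("hdf4:BlockOffset", NS).text)
    nbytes = int(block.find("hdf4:BlockNbytes", NS).text)

    # Sanity check: the block must hold exactly prod(dims) elements.
    count = math.prod(dims)
    if nbytes != count * size:
        raise ValueError("block size does not match shape and element size")

    with open(data_path, "rb") as f:
        f.seek(offset)
        raw = f.read(nbytes)
    flat = struct.unpack(f"{order}{count}{INT_CODES[size]}", raw)

    # Assumption: row-major order, so split on the last dimension.
    width = dims[-1]
    return [list(flat[i:i + width]) for i in range(0, count, width)]


def make_demo_file():
    """Write a stand-in binary file (NOT real HDF4) with the layout the map describes."""
    fd, path = tempfile.mkstemp(suffix=".hdf")
    with os.fdopen(fd, "wb") as f:
        f.write(b"\0" * 2502)                         # filler up to BlockOffset
        f.write(struct.pack(">1000i", *range(1000)))  # 10 x 100 big-endian int32
    return path


path = make_demo_file()
rows = read_sds(MAP_XML, path, "data1")
os.remove(path)
print(len(rows), len(rows[0]), rows[0][:3], rows[9][-1])
```

Note that the map is internally consistent: 10 × 100 elements at 4 bytes each is exactly the 4000 bytes given in BlockNbytes, which is the check the sketch performs before reading.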
  • 35. Next steps
  • 36. Effort for full implementation • Generate maps for existing archives − GES-DISC approach: append the map XML to the XML files already kept for each file in their archive − NSIDC non-ECS data implementation: add an XML file for each data file in the same directory − Other systems TBD • Generate maps for new data − Add map generation as a step in the ingest process using a standalone tool − Request product generation systems to use new API calls that generate maps • Develop a production-quality implementation of the mapping tool, and possibly an API • Possibly do a similar assessment for HDF5 maps
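
For the "one XML file per data file in the same directory" approach, an archive would need a way to find data files that do not yet have a map. The sketch below is one hedged way to do that in Python. The function name `files_missing_maps`, the `.hdf` extension, and the `<file>.map.xml` naming (borrowed from the “H4.hdf.map.xml” example in the workflow slide) are all assumptions, not a description of NSIDC's actual implementation.

```python
import os
import tempfile


def files_missing_maps(archive_root, data_ext=".hdf", map_suffix=".map.xml"):
    """List data files under archive_root that have no sibling map file (hypothetical helper)."""
    missing = []
    for dirpath, _dirnames, filenames in os.walk(archive_root):
        names = set(filenames)
        for name in names:
            # A data file "x.hdf" is covered if "x.hdf.map.xml" sits next to it.
            if name.lower().endswith(data_ext) and name + map_suffix not in names:
                missing.append(os.path.join(dirpath, name))
    return sorted(missing)


# Demo on a throwaway directory: a.hdf has a map, b.hdf does not.
with tempfile.TemporaryDirectory() as root:
    for name in ("a.hdf", "a.hdf.map.xml", "b.hdf"):
        open(os.path.join(root, name), "w").close()
    missing = [os.path.basename(p) for p in files_missing_maps(root)]
print(missing)
```

The resulting list could feed a standalone map generator during a backfill of an existing archive, or serve as a check at the end of ingest.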
  • 37. How you can help • Consider what it might take to implement this for your archive - contact Ruth if you’d like support • Review the materials on the wiki and elsewhere - comment heavily!
  • 38. For more information • Wiki page added to Confluence wiki • Project page at The HDF Group website: − http://www.hdfgroup.org/projects/hdf4mapping/ • Paper at 2008 fall AGU • Paper “Ensuring Long Term Access to Remotely Sensed Data with Layout Maps” in the upcoming TGRSS special issue on archiving and distribution
  • 39. Thank you. This report is based upon work supported in part by a Cooperative Agreement with the National Aeronautics and Space Administration (NASA) under NASA Award NNX06AC83A. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Aeronautics and Space Administration.

Editor's notes

  1. Full quote, from proposal: Through the HDF software libraries, either by using the HDF APIs directly or by using HDF tools that depend on the HDF libraries. However there is a risk in depending solely on the HDF libraries to access HDF-formatted data over the long term. It is possible, especially in the distant future, that the libraries may not be as readily available as they are today. To address this risk, it is desirable to have a way to retrieve the data independently. At the 10th HDF workshop, Christopher Lynnes of the Goddard Earth Sciences Data and Information Services Center (GES DISC) addressed this need: “If only we could read HDF data with an independent program that does not rely on the HDF API… A possible approach [would be to] extend hdfls to print a hierarchical map of a data file, [and] write ncdump/hdp-like utilities to find, assemble and write out SDSes and vdatas.” “Leveraging HDF Utilities,” Christopher Lynnes, 10th HDF Workshop. http://www.hdfeos.org/workshops/ws10/presentations/day3/Leveraging_HDF_Utilities.ppt.
  2. An XML-based prototype schema for HDF4 mapping files (XML documents) was created. For a given binary HDF4 file, an associated mapping file contains structural and application metadata for the HDF4 file, as well as the locations of the object data (array element values) in the HDF4 file. A tool was written to generate mapping files. Other tools were developed that use the mapping files to read HDF4 files without calling the HDF4 library, confirming the approach is viable. While the focus of this effort was NASA EOSDIS data stored in HDF4 files, the general methodology is also relevant to other cases where the long-term accessibility of data stored in binary files is of concern. In addition, this work demonstrates how binary HDF files can be used to efficiently store large volumes of scientific data that is referenced by text-based XML documents (the mapping files).