This document summarizes discussions from the HDF Group's HDF/HDF-EOS Workshop XIV about data interoperability. It covered topics like enabling one set of APIs to handle multiple data formats through projects like netCDF4 and CDM. It also discussed format conversions and translations between formats like HDF4, HDF5, netCDF and others. Finally, it addressed semantic and content interoperability challenges like representing latitude and longitude in different formats and how standards like CF conventions help with interpretation of metadata across tools and applications. Interoperability issues that can arise from simultaneous access of HDF5 files via HDF5 and netCDF-4 libraries were also presented.
HDF Group Data Format Interoperability Workshop
1. The HDF Group
Data Interoperability
The HDF Group Staff
Sep. 28-30, 2010
HDF/HDF-EOS Workshop XIV
1
www.hdfgroup.org
2. Interoperability
• Interoperability is a property of a product or
system, whose interfaces are completely
understood, to work with other products or
systems, present or future, without any
restricted access or implementation.
(From Wikipedia)
3. Background for data interoperability
• Focus on Earth Sciences
• Data needs to be shared across communities
• Self-described data format
• Many different types of data
• Grids, moving-sensor multidimensional fields,
time series, profiles, trajectories, geospatial
framework data, points
4. Why data interoperability?
• Data format
• Several data formats available
HDF4/5, netCDF3/4, GRIB, BUFR, Binary
• Data are created in different formats,
following different physical models
• End users would like to use familiar tools to
access data stored in different formats
5. Why data interoperability?
• Semantic/content interoperability
• An example: the missing value of a physical
variable
• Tools or applications need to know the fill
value of a physical variable
• How can they identify it? Normally via an attribute
that stores the fill value
• The attribute name varies:
“Fill_value”, “_FillValue”, “badValue”, ……
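The naming problem above can be sketched in a few lines. This is a minimal illustration, assuming a variable's attributes are available as a plain dictionary; the candidate list holds only the three names from the slide, and the helper names `find_fill_value` and `mask_fill` and the sample variables are hypothetical.

```python
import math

# Candidate attribute names taken from the slide; real files may use others.
FILL_VALUE_NAMES = ("_FillValue", "Fill_value", "badValue")


def find_fill_value(attrs):
    """Return (name, value) for the first known fill-value attribute, or None."""
    for name in FILL_VALUE_NAMES:
        if name in attrs:
            return name, attrs[name]
    return None


def mask_fill(values, attrs):
    """Replace fill values with NaN; copy the data unchanged if no attribute is found."""
    found = find_fill_value(attrs)
    if found is None:
        return list(values)
    _, fill = found
    return [math.nan if v == fill else v for v in values]


# Hypothetical variables from two products that name the attribute differently.
sst_attrs = {"units": "K", "_FillValue": -9999.0}
ozone_attrs = {"units": "DU", "badValue": -1.0}

print(find_fill_value(sst_attrs))
print(find_fill_value(ozone_attrs))
print(mask_fill([280.5, -9999.0, 281.0], sst_attrs))
```

Every tool has to carry its own list of candidate names, and a name missing from the list means fill values are silently treated as data. A single agreed attribute name removes the need for the list.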
6. Goals
• Discuss a few data interoperability cases
• Hear the audience's opinions on, and issues with,
data interoperability
7. The HDF Group
Data Format Interoperability
8. Case 1
• One set of APIs to handle multiple data
formats
9. Case 1
• netCDF4
• Combines the powerful model and simplicity of
netCDF with the features of HDF5
• netCDF interface on top of HDF5
• CDM (Common Data Model)
• Harmonizes netCDF, HDF4, HDF5, OPeNDAP,
GRIB and others
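One small ingredient of a single API over several formats is deciding which reader a file needs. The sketch below assumes dispatch on the leading signature bytes of the file. It is an illustration only, not a description of how netCDF4 or CDM are implemented, and the names `SIGNATURES`, `sniff_format` and `sniff_file` are hypothetical. It checks offset 0 only, so an HDF5 file with a user block, or a GRIB or BUFR message preceded by a transmission header, would be reported as unknown.

```python
# Leading signature bytes for some of the formats named on the slides.
SIGNATURES = [
    (b"\x89HDF\r\n\x1a\n", "HDF5 (includes netCDF-4)"),
    (b"\x0e\x03\x13\x01", "HDF4"),
    (b"CDF\x01", "netCDF-3 classic"),
    (b"CDF\x02", "netCDF-3 64-bit offset"),
    (b"GRIB", "GRIB"),
    (b"BUFR", "BUFR"),
]


def sniff_format(header: bytes) -> str:
    """Guess a format name from the first bytes of a file."""
    for magic, name in SIGNATURES:
        if header.startswith(magic):
            return name
    return "unknown"


def sniff_file(path: str) -> str:
    """Read the first 8 bytes of a file and guess its format."""
    with open(path, "rb") as f:
        return sniff_format(f.read(8))


print(sniff_format(b"\x89HDF\r\n\x1a\n"))
print(sniff_format(b"CDF\x01\x00\x00\x00\x00"))
```

A netCDF-4 file carries the HDF5 signature, so signature bytes alone cannot separate it from an arbitrary HDF5 file. That distinction depends on the file's contents.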
10. Discussions
• Any comments on, or experiences to share
with, netCDF4 and CDM?
11. Case 2
• Format conversions and translations
12. Data format Conversions/Translations
• Format conversions
• The HDF4 to HDF5 conversion tool
• The HDF-EOS2 to netCDF3/4 conversion tool
• HDF-EOS to GeoTIFF conversion tool
• ……
• Data format translations
• netCDF tools to access HDF4/5 files via OPeNDAP
• ……
13. Use netCDF tools to access HDF via OPeNDAP
[Diagram: Aqua/Aura AIRS/OMI data in HDF4/5 files are served by OPeNDAP servers through the HDF4/5 handlers; over DAP, OPeNDAP clients (libnc-dap) pass the data to visualization tools (IDV) and users. Label: Translation Layers.]
14. netCDF4 to access HDF-EOS5 files
• Augmentation
• One file can be used for both EOS5 and NetCDF-4.
• Note that EOS5 users are not affected at all.
[Diagram: augmentation turns an HDF-EOS5 file into an augmented HDF-EOS5 file, which can be read as an HDF-EOS5 file through HDF-EOS5 on HDF5 and as a NetCDF-4 file through NetCDF4 on HDF5.]
15. Discussions
• Any comments on, or experiences to share
with, data format conversions and
translations?
17. An example
• Degree of latitude and longitude
• Champaign’s latitude is 40°6´38" N
• It can be represented in two formats:
• 400638 in DDDMMSS format
• 40.1105556 in decimal format
• How can applications know which format
the file uses?
• Better to have a common standard to facilitate
the exchange of this kind of information
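The arithmetic linking the two representations is 40 + 6/60 + 38/3600. A minimal sketch of the conversion, assuming the DDDMMSS value is stored as an integer and that a negative value means south or west (the sign convention and the function name `dddmmss_to_decimal` are assumptions, not part of the slides):

```python
def dddmmss_to_decimal(packed: int) -> float:
    """Convert an integer in DDDMMSS form to decimal degrees."""
    sign = -1 if packed < 0 else 1
    packed = abs(packed)
    degrees, rest = divmod(packed, 10000)   # leading digits are degrees
    minutes, seconds = divmod(rest, 100)    # then two digits each for MM and SS
    if minutes >= 60 or seconds >= 60:
        raise ValueError("minutes and seconds must each be below 60")
    return sign * (degrees + minutes / 60 + seconds / 3600)


# Champaign's latitude from the slide: 40 degrees, 6 minutes, 38 seconds north.
print(round(dddmmss_to_decimal(400638), 7))  # 40.1105556
```

The stored number alone does not say which convention was used: 400638 and 40.1105556 describe the same latitude. The file's metadata has to state the format.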
18. Semantic/Content Interoperability
• The CF conventions have become such a standard
• Many applications/tools follow CF conventions
to interpret the metadata
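As one example of how a tool can interpret metadata this way, the CF conventions identify latitude and longitude coordinate variables by their `units` attribute. The sketch below assumes a variable's attributes are available as a plain dictionary and checks only `units`; the function name `classify_coordinate` and the sample attributes are hypothetical, and a real tool would also look at other CF attributes.

```python
# Unit strings the CF conventions accept for latitude and longitude.
CF_LATITUDE_UNITS = {
    "degrees_north", "degree_north", "degree_N",
    "degrees_N", "degreeN", "degreesN",
}
CF_LONGITUDE_UNITS = {
    "degrees_east", "degree_east", "degree_E",
    "degrees_E", "degreeE", "degreesE",
}


def classify_coordinate(attrs):
    """Return 'latitude', 'longitude', or None based on the units attribute."""
    units = attrs.get("units")
    if units in CF_LATITUDE_UNITS:
        return "latitude"
    if units in CF_LONGITUDE_UNITS:
        return "longitude"
    return None


# Hypothetical attribute sets for three variables.
print(classify_coordinate({"units": "degrees_north"}))
print(classify_coordinate({"units": "degrees_east"}))
print(classify_coordinate({"units": "K"}))
```

A file that stores latitude without one of these unit strings gives such a tool nothing to recognize, which is the situation the next slide describes.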
19. Use netCDF tools to access HDF via OPeNDAP
[Diagram: Aqua/Aura AIRS/OMI data in HDF4/5 files are served by OPeNDAP servers through the HDF4/5 handlers; over DAP, OPeNDAP clients (libnc-dap) pass the data to visualization tools (IDV) and users. Callout: "Not follow CF conventions".]
20. Discussions
• Any comments about
semantic/content interoperability?
• Any comments about the usage of
CF conventions?
21. The HDF Group
HDF5 and netCDF-4
Libraries Interoperability
Two sources of interoperability
problems and how to deal with them
22. netCDF-4 library and arbitrary HDF5 files
• netCDF-4 is built on top of HDF5
• netCDF-4 files are HDF5 files with specific
characteristics
1. Tracked creation order of objects and their
attributes
2. Absence of datasets with complex compound datatypes
or with region and object references
3. Presence of dimension scales
• netCDF-4 expects to find properties 1 and 2 in an
HDF5 file and fails to open or read the file when
those are not present (e.g., the NPOESS files)
• Solution: to be discussed with the netCDF-4 folks
23. Programming models and libraries settings
• Use case 1 (libraries settings): Simultaneous
access to an HDF5 file via HDF5 and netCDF-4
• One of the libraries will not be able to open the file due
to the different access properties used by each library
(H5F_CLOSE_WEAK in HDF5 vs.
H5F_CLOSE_STRONG in netCDF-4)
• Solution:
• Design improvements in HDF5
• New APIs to detect the situation
• Automatic detection and correction of the access
properties
• Document the best practices for applications
24. Programming models and libraries settings
• Use case 2 (programming models): Usage of
HDF5 wrapper libraries and netCDF-4 library in
the same application
• C++ H5Topen call fails due to the netCDF-4 library
shutting down the HDF5 library underneath by calling
H5close
• Solution:
• Resolved in the newest netCDF-4
• Document the best practices for applications
• For more detailed discussion see “NetCDF-4/HDF5
Libraries Interoperability Issues” at
http://www.hdfgroup.uiuc.edu/RFC/HDF5/netCDF4-HDF5/
• Let us know your use case to improve testing!
25. Thank you!