SESIP-0722-KY
HDF5 OPeNDAP Handler Updates,
and Performance Discussion
2022 ESIP Summer Meeting
This work was supported by NASA/GSFC under Raytheon Technologies contract number 80GSFC21CA001.
This document does not contain technology or Technical Data controlled under either the U.S. International Traffic
in Arms Regulations or the U.S. Export Administration Regulations.
Kent Yang
Software Engineer/NASA EED-3 contractor
myang6@hdfgroup.org
HDF5 OPeNDAP Handler History
• 2001: A prototype of HDF5 data handler
  – HDF5 to DAP2: Default option
• 2008: Handler in production
  – Climate and Forecast (CF) option:
    • Translate HDF5 metadata to follow CF
• 2008-2018: Significant improvement
  – Still HDF5 to DAP2
HDF: Hierarchical Data Format
OPeNDAP: Open-source Project for a Network Data Access Protocol
DAP: Data Access Protocol
HDF5 OPeNDAP Handler Update
• Support DAP4
  – CF option
    • Support 8-bit and 64-bit integer mapping
  – Default option
    • Support the NetCDF data model (groups, etc.)
• Documentation
  – A comprehensive user's guide on GitHub
    • https://github.com/OPENDAP/hyrax_guide/blob/master/handlers/BES_Modules_The_HDF5_Handler.adoc
NetCDF: Network Common Data Form
HDF5 Handler Performance Study
• Output NetCDF file via the handler
  – Sometimes it is very slow
[Diagram: HDF5 file -> Hyrax core (HDF5 handler -> File NetCDF) -> NetCDF file]
HDF5 Handler Performance Study
• It is slow because the HDF5 variables are compressed.
HDF5 Handler Performance Study
• How compressed variables are processed
  – HDF5 handler: Decompress via H5Dread
  – File NetCDF: Compress via H5Dwrite
[Diagram: HDF5 file -> Hyrax core (HDF5 handler: Decompress -> File NetCDF: Compress) -> NetCDF file]
HDF5 Handler Performance Study
• Compression/decompression is costly
• Solution
  – Passing through the compressed data
[Diagram: HDF5 file -> Hyrax core (HDF5 handler -> File NetCDF) -> NetCDF file; the Decompress and Compress steps are each paired with "Pass through the data"]
HDF5 Handler Performance Study
[Diagram: HDF5 file -> Hyrax core (HDF5 handler -> File NetCDF) -> NetCDF file; both modules labeled "Pass through the data"]
• Is this possible?
• A proof-of-concept Study
HDF5 Handler Performance Study
• A proof-of-concept study
  – Use the HDF5 direct chunk I/O APIs
• Packages that need to be updated
  – HDF5 handler
    • Read the compressed data to pass through
  – DAP library
    • Pass through the variable storage information
  – NetCDF-4
    • Write the passed-through compressed data
I/O: Input/Output
API: Application Programming Interface
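The two data paths can be illustrated without HDF5 at all. The sketch below is a toy model, not the handler's code: Python's standard zlib module stands in for the HDF5 deflate filter, a plain dict stands in for a file with one chunk per variable, and all function and variable names are hypothetical. In the HDF5 C library the direct chunk functions are H5Dread_chunk and H5Dwrite_chunk; that the proof of concept uses exactly these is an assumption, since the slides only say "direct chunk IO APIs".

```python
import zlib

def make_source(raw_by_name, level=6):
    """Build a toy 'file': one compressed chunk per variable."""
    return {name: zlib.compress(raw, level) for name, raw in raw_by_name.items()}

def copy_standard(src, level=6):
    """Standard path: decompress every chunk, then compress it again."""
    out = {}
    for name, chunk in src.items():
        raw = zlib.decompress(chunk)            # what a normal read does
        out[name] = zlib.compress(raw, level)   # what a normal write does
    return out

def copy_pass_through(src):
    """Pass-through path: move the stored bytes untouched."""
    return {name: bytes(chunk) for name, chunk in src.items()}

# Hypothetical data: two variables with repetitive (compressible) content.
raw = {"sst": bytes(range(256)) * 4096, "mask": b"\x00\x01" * 500_000}
src = make_source(raw)
std = copy_standard(src)
fast = copy_pass_through(src)
```

Both copies decode to the same values, but the pass-through copy never runs the compressor or decompressor. It only works when the writer is also told how each chunk was stored (filter, chunk shape, data type), which is presumably why the DAP library has to carry the variable storage information.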
HDF5 Handler Performance Study
• Testing files used
  – GHRSST and MERRA-2 data
    • Repack the data to one chunk per variable
• Test approach
  – Only the Hyrax Back-End Server (BES)
  – besstandalone program on a Linux server
  – Measure the wall clock time to output a NetCDF-4 file
GHRSST: Group for High Resolution Sea Surface Temperature
MERRA: Modern-Era Retrospective analysis for Research and Applications
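A wall clock measurement of this kind can be scripted roughly as follows. This is a minimal sketch under stated assumptions: the helper name, the repeat count, and the best-of-N policy are illustrative choices, not the method used in the study, and the slides do not give the besstandalone command line, so a placeholder command is timed instead.

```python
import subprocess
import sys
import time

def time_command(cmd, repeats=3):
    """Run cmd several times and return the best wall clock time in seconds."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
        best = min(best, time.perf_counter() - start)
    return best

# Placeholder command so the sketch runs anywhere; replace it with the
# actual besstandalone command line used on the test server.
placeholder_cmd = [sys.executable, "-c", "pass"]
elapsed = time_command(placeholder_cmd)
print(f"best of 3 runs: {elapsed:.3f} s")
```

Timing the whole process from the outside captures reading, any decompression and compression, and writing the output file, which matches the "time to output a NetCDF-4 file" measured here.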
HDF5 Handler Performance Study
• Testing files
  – GHRSST
    • File size: 237 MB
    • About 20 variables
    • 5392x3200 8-bit or 16-bit integer
  – MERRA-2
    • File size: 489 MB
    • About 50 variables
    • 24x361x576 32-bit floating-point
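The shapes above give a rough idea of how many bytes the standard path has to inflate and deflate per variable. The sketch below is plain arithmetic on the numbers from the slide; it assumes a variable has exactly the stated shape, which may not hold for every one of the "about 20" or "about 50" variables.

```python
from math import prod

def uncompressed_bytes(shape, bytes_per_value):
    """Size of one variable once fully decompressed."""
    return prod(shape) * bytes_per_value

ghrsst_8bit = uncompressed_bytes((5392, 3200), 1)    # 17,254,400 bytes
ghrsst_16bit = uncompressed_bytes((5392, 3200), 2)   # 34,508,800 bytes
merra2_f32 = uncompressed_bytes((24, 361, 576), 4)   # 19,961,856 bytes

for label, size in [("GHRSST 8-bit", ghrsst_8bit),
                    ("GHRSST 16-bit", ghrsst_16bit),
                    ("MERRA-2 float32", merra2_f32)]:
    print(f"{label}: {size / 1e6:.1f} MB per variable")
```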
Performance Study Results
• Performance improved ~30 times (MERRA-2) and ~17 times (GHRSST) compared to the standard way

Wall clock time (seconds)                          MERRA-2   GHRSST
Standard way (decompress and compress the data)    55        26
Pass through the compressed data                   1.8       1.5
Speedup                                            ~30       ~17

• Credit to the HDF5 library.
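The speedup figures follow directly from the measured times. A short check, using only the numbers reported on this slide (the variable names are illustrative):

```python
# (standard way, pass-through) wall clock times in seconds, from the slide
timings = {"MERRA-2": (55.0, 1.8), "GHRSST": (26.0, 1.5)}

speedup = {name: std / fast for name, (std, fast) in timings.items()}
for name, s in speedup.items():
    print(f"{name}: {s:.1f}x")
# prints:
# MERRA-2: 30.6x
# GHRSST: 17.3x
```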
This work was supported by NASA/GSFC under
Raytheon Technologies contract number
80GSFC21CA001.

