SlideShare une entreprise Scribd logo
1  sur  43
Télécharger pour lire hors ligne
HDF Update
Mike Folk
National Center for Supercomputing Applications
Science Data Processing Workshop
February 26-28, 2002
-1-

HDF
Topics
• What is HDF?
• HDF4 update and plans
• HDF5 update and plans

-2-

HDF
NCSA HDF Mission

To develop, promote, deploy, and support
open and free technologies that facilitate
scientific data storage, exchange, access,
analysis and discovery.

-3-

HDF
What is HDF?
•
•
•
•
•
•

Format and software for scientific data
Stores images, arrays, tables, etc.
Storage and I/O efficiency
Free and commercial software support
Promote standards
Broad user base in engineering & science

-4-

HDF
Who is supporting HDF?
• NASA/ESDIS
– Earth science applications, instrument data
– All aspects of data management

• DOE/ASCI (Accelerated Strategic Computing Init.)
– Simulations on massively parallel machines
– Emphasis on parallel I/O performance, functionality

• DOE Scientific Data Analysis & Computation Program
– High performance I/O R & D

• NCSA
– Grid, Vis, other R&D, user support

• Others
– Applications, support, some R&D
-5-

HDF
HDF4

-6-

HDF
Goals for HDF4
• Support Terra, Aqua & other EOS data users
• Maintain and upgrade library and tools
• Address differences between HDF4 & HDF5

-7-

HDF
HDF4 milestones since Sept 2000
Q4 ‘00 Q1 ‘01 Q2 ‘01 Q3 ‘01 Q4 ‘01 Q1 ‘02

HDF4 library
and utilities:

HDF4 Java
API & tools:

•H

F4
D

•V

.

r4
1

sio
er

•H

F4
D

.

r5
1

2.6
n
•V

-8-

sio
er

2.7
n

HDF
HDF4 releases
• HDF4.1 r4 (Nov 2000)
–
–
–
–

Chunking and chunking with compression
JPEG compression fixed
hdp utility enhancements
HDF4-to-GIF and GIF-to-HDF4 converters

• HDF4.1 r5 (Nov 2001)
–
–
–
–
–

Compression query functions
Vdata: set size of Vdata link blocks
Tools updates
Added Solaris 2.8
Retired Solaris 2.6, OSF1 V4.0 and HPUX 10.20
-9-

HDF
HDF5

- 10 -

HDF
HDF5 milestones since Sept 2000
Q4 ‘00 Q1 β
‘01 Q2 ‘01 Q3 ‘01 Q4 ‘01 Q1 ‘02
)
afe
s
ch
at
dp
3
rea
2
0
1
2(
Base
.4.
. 4 . 1.4.
.4. 1.4.
Th
_
_1
_1 _
_1
_
library:
β
e”
ag
β
“im
e”
&
High level
l
e”
ab
“t
“lit
library:
_
•
Java &
other tools:
HDF4-HDF5
conversion:

_

.4.
1

0
_

.4.
1

1

ion
rs

e
nv
co lity
_ ti
u
- 11 -

Ja
_

v

2.2
a

α
α2
y
y
rar ibrar
lib
_
_l

HDF
HDF5 Library Work

- 12 -

HDF
New platforms and languages
• New platforms
–
–
–
–
–
–

• Fortran 90

HP-UX
IBM SP
IA64 architecture
Solaris 2.8 with 64-bit
Linux in parallel mode
Metrowerks Code
Warrior Windows
compiler

– Solaris 2.6 & 2.7, O2K,
DEC OSF, Linux, Cray
T3E, SV1
– Parallel on T3E & O2K
– Windows

• C++

- 13 -

– Solaris 2.6 and 2.7, Free
BSD, Linux, Windows

HDF
New to HDF5
•
•
•
•
•
•

Virtual File Layer – alternate I/O drivers
Array datatype – array as atomic type
File sizes greater than 2GB on Linux
Performance improvements – parallel & serial
Improvements in configuration
Tutorials and documentation

- 14 -

HDF
Next major release -- HDF5 1.6
• New format and library features
–
–
–
–
–
–

Reclamation of free space within a file
Enhanced hyperslab/region selection
Dimension scale support
Bzip2 compression
Generic Properties
Performance improvements

• Parallel I/O performance benchmark suite
• Release date: Fall 2002
- 15 -

HDF
High level APIs
• HDF5 routines that do more operations per call
than the basic HDF5 interface
• Goals
– Make HDF5 easier to use
– Encourage standard ways to store objects in HDF5

- 16 -

HDF
High level APIs
•
•
•
•
•
•

Lite – done
Image – done
Table – partly done
Dimension scale – in the works
Unstructured grids – in the works
http://hdf.ncsa.uiuc.edu/HDF5/hdf5_hl/doc/

- 17 -

HDF
HDF5 High Level APIs – HDF5 Image
• For datasets to be interpreted as images/palettes
– 2-D raster data like HDF4 raster images

• Image operations
– Create, write, read, query

• Based on “HDF5 Image & Palette Specification”

- 18 -

HDF
HDF5 High Level APIs – HDF5 Table
• For datasets to be interpreted as “tables”
– A collection of records
– All records have the same structure
– Like Vdatas in HDF4, but more operations

• Table operations
– Create, write, read, query
– Insert, delete records or fields
– Future: sort and search
- 19 -

HDF
HDF5 tools activities

- 20 -

HDF
H5View – new features
•
•
•
•
•
•
•

Improved editing
XML support
Supports “image” data sets
Line plots of row/column data
Save data to a text file
Set user preferences
http://hdf.ncsa.uiuc.edu/java-hdf5-html/
- 21 -

HDF
Future Viewer work
• Objectives
– Support transition from HDF4 to HDF5
– Eliminate duplicate work on HDF4 and HDF5 tools
– Provide plug-in architecture for tool development

• Three stages
– Object model to cover both HDF4 & HDF5
• Nearly done

– Simple browser for both HDF4 and HDF5
• Summer 2002

– Editor
• Fall 2002

– Modular toolkit for browser plug-ins
• Fall 2002, if all goes well
- 22 -

HDF
XML and web experiments
• Explore XML, Web applications for HDF5
– Investigate use of XML
– investigate Web XML technologies, use with HDF5

• XML DTD for HDF5
• Conversions involving XML
– HDF4 HDF5 XML
– Compression and timing study
– netcdf to HDF5 translation, via XML style sheet

• XML Schema for HDF5 investigated
- 23 -

HDF
Other Java-inspired explorations
• Tomcat web server and JSP servlets
– HDF5 file XML
html display in a web browser
– Shows promise for providing remote access to HDF5

• HDF5 and CORBA
–
–
–
–

Investigating using CORBA with Java and HDF5.
Data access done in a CORBA servant written in C/C++
Browsing and presenting the information done in Java
Uses CORBA is alternative to the Java Native Interface, which
calls C directly from Java.

- 24 -

HDF
Other Tools Activities
• HDF5-to-GIF and GIF-to-HDF5 converters
• H5dump subsetting

- 25 -

HDF
Facilitating the transition
from HDF4 to HDF5

- 26 -

HDF
The transition from HDF4 to HDF5
• We support both HDF4 and HDF5.
• We will continue to maintain HDF4, as
long as we are funded to do so.
• We recommend:
• Use HDF5 for new projects
• Consider migrating from HDF4 to
HDF5 to take advantage of the improved
features and performance of HDF5
- 27 -

HDF
HDF4-to-HDF5 Work
•
•
•
•

Mapping specification
Utility
Library
Experiments

- 28 -

HDF
HDF4-to-HDF5 Work
• Later talk by Bob McGrath to cover
– The key technical challenges
– NCSA toolkit work
– Experiments to support transition

- 29 -

HDF
Other Activities of Interest

- 30 -

HDF
Parallel HDF5
•
•
•
•

HDF5 supports parallel I/O
Emphasis on performance
Driven by DOE labs
Tutorial
http://hdf.ncsa.uiuc.edu/HDF5/doc/Tutor/
• See Elena’s talk

- 31 -

HDF
Parallel I/O benchmark suite
• A set of parallel performance benchmarks that
can be distributed with HDF5
• Parallel HDF5 in MPI environment
• Measures performance of raw I/O, MPI-I/O, &
HDF5
• Draft documentation:
http://hdf/RFC/PIO_Perf/PHDF5_performance.html

- 32 -

HDF
Other performance studies
See Elena’s talk

- 33 -

HDF
Thread-Safe HDF5 (Beta)
• Uses Pthreads (POSIX threads) library.
• Implements Phase 1 of HDF5 thread safety plan
– One lock per program

• Later phases
– 2: Separate locks per major operation (read, write, convert)
– 3: Separate locks for all operations.

• For details see
– http://hdf.ncsa.uiuc.edu/HDF5/papers/mthdf/

- 34 -

HDF
HDF5-DODS* Server
• HDF5 data available from DODS clients
• Demonstrated with Ferret client
• Documentation
– HDF5-DODS Data Model and Mapping
– The HDF5-DODS Server Prototype
– Demo of HDF5-DODS Server Prototype with
the DODS-Ferret Client

• http://hdf.ncsa.uiuc.edu/apps/dods/
* Distributed Oceanographic Data System
- 35 -

HDF
Transform architecture prototype
• Provides a mechanism for operating on data as it
is moved from one location to another
• Inspired by HDF5 I/O operations when moving
data between memory and file:
– Convert numbers – size, endianness, etc.
– Selection – subsetting & subsampling
– Linear order – row major vs. column major

• Later
– Change value/Meaning of data – units, coordinate
system, etc.

- 36 -

HDF
Outreach
• Documentation
– Expanded and improved
– Searchable, printable
– HDF5 User’s Guide

• Advanced and parallel HDF5 Tutorial
– http://hdf.ncsa.uiuc.edu/HDF5/doc/Tutor/

– Presented at SC2001
http://hdf.ncsa.uiuc.edu/HDF5/papers/SC2001/SC01_tutorial/

- 37 -

HDF
szip Compression
Reporting for
Pen-Shu Yeh and Wei Xia

- 38 -

HDF
Szip Compression in HDF 4

• CCSDS lossless compression
– Fast, effective compression method for EOS data
– Outperforms other common techniques in size & speed

• Szip software
– Implements CCSDS lossless compression
– Originally developed at University of New Mexico
– Implemented in HDF-4 by UNM

• HDF-4 with CCSDS-szip option (HDF-4-Szip)
– Delivered to GSFC and tested
– Tested with MODIS Level-1B data from GSFC DAAC
- 39 -

HDF
Sample results
Convert 343 Megabyte MODIS file
Conversion performed on PC-Linux 7.1, Pentium II, 300Mhz

Compression Ratio

Compression and decompresson time
(seconds)

2.80

3.00
700
600
500
400
300
200
100
0

558
Huff

2.28

2.50

575
Huff

2.00

1.60

1.50
RLE

szip

RLE

szip

1.00

86

72

42

64

0.50
0.00

CompressTime

DecompressTime

Rle

- 40 -

Huff

szip

HDF
Szip-HDF4 software

• HDF4 library routine SDsetcompress
– compress existing dataset or create new compressed dataset

• Utilities
– bin2hdf – input array from binary file, compress using
HDF-4-Szip, and output compressed data in HDF file
– hdf2chdf – input data from HDF file, compress using HDF4-Szip, and output compressed data in new HDF file
– chdf2bin – input HDF file (with or without compression),
and output user-selected arrays as separated binary files
- 41 -

HDF
Future Work

• Compress IEEE 32 bit float data with HDF-4-Szip
• Use HDF chunking routines
– Especially beneficial for subsetting performance

• Test HDF-4-szip on different computer platforms
– Currently HDF-4-Szip has been tested under Linux only
– Others to test: Sun, SGI and HP, Windows 98 and NT

• Advocacy in the scientific community
– Add User’s Guides and documentation to HDF docs.
– Present and demo at workshops and conferences.

• Resolve distribution issues with NCSA and UNM
- 42 -

HDF
Thank you!
Information Sources
HDF • HDF website
– http://hdf.ncsa.uiuc.edu/

5

• HDF5 Information Center
– http://hdf.ncsa.uiuc.edu/HDF5/

• HDF Helpdesk
– hdfhelp@ncsa.uiuc.edu

• HDF users mailing list
– hdfnews@ncsa.uiuc.edu
- 43 -

HDF

Contenu connexe

Tendances

Tendances (20)

2011 ACSI Survey Summary
2011 ACSI Survey Summary2011 ACSI Survey Summary
2011 ACSI Survey Summary
 
Transition from HDF4 to HDF5
Transition from HDF4 to HDF5 Transition from HDF4 to HDF5
Transition from HDF4 to HDF5
 
Parallel HDF5 Developments
Parallel HDF5 DevelopmentsParallel HDF5 Developments
Parallel HDF5 Developments
 
Status of HDF-EOS, Related Software and Tools
Status of HDF-EOS, Related Software and ToolsStatus of HDF-EOS, Related Software and Tools
Status of HDF-EOS, Related Software and Tools
 
The HDF Group: Community models and outreach
The HDF Group: Community models and outreachThe HDF Group: Community models and outreach
The HDF Group: Community models and outreach
 
Ensuring Long Term Access to Remotely Sensed HDF4 Data with Layout Maps
Ensuring Long Term Access to Remotely Sensed HDF4 Data with Layout MapsEnsuring Long Term Access to Remotely Sensed HDF4 Data with Layout Maps
Ensuring Long Term Access to Remotely Sensed HDF4 Data with Layout Maps
 
HDF Product Designer
HDF Product DesignerHDF Product Designer
HDF Product Designer
 
Transitioning from HDF4 to HDF5
Transitioning from HDF4 to HDF5Transitioning from HDF4 to HDF5
Transitioning from HDF4 to HDF5
 
HDF and netCDF Data Support in ArcGIS
HDF and netCDF Data Support in ArcGISHDF and netCDF Data Support in ArcGIS
HDF and netCDF Data Support in ArcGIS
 
NetCDF and HDF5
NetCDF and HDF5NetCDF and HDF5
NetCDF and HDF5
 
HDF & HDF-EOS Data & Support at NSIDC
HDF & HDF-EOS Data & Support at NSIDCHDF & HDF-EOS Data & Support at NSIDC
HDF & HDF-EOS Data & Support at NSIDC
 
HDF Update
HDF UpdateHDF Update
HDF Update
 
HDF Update
HDF UpdateHDF Update
HDF Update
 
HDF OPeNDAP project update and demo
HDF OPeNDAP project update and demoHDF OPeNDAP project update and demo
HDF OPeNDAP project update and demo
 
HDF Project Update
HDF Project UpdateHDF Project Update
HDF Project Update
 
Hadoop pycon2011uk
Hadoop pycon2011ukHadoop pycon2011uk
Hadoop pycon2011uk
 
Migrating from HDF5 1.6 to 1.8
Migrating from HDF5 1.6 to 1.8Migrating from HDF5 1.6 to 1.8
Migrating from HDF5 1.6 to 1.8
 
Introduction to HDF5
Introduction to HDF5Introduction to HDF5
Introduction to HDF5
 
Status of HDF-EOS, Related Software and Tools
 Status of HDF-EOS, Related Software and Tools Status of HDF-EOS, Related Software and Tools
Status of HDF-EOS, Related Software and Tools
 
HDF4 Mapping Project Update
HDF4 Mapping Project UpdateHDF4 Mapping Project Update
HDF4 Mapping Project Update
 

En vedette (6)

Welcome to HDF Workshop V
Welcome to HDF Workshop VWelcome to HDF Workshop V
Welcome to HDF Workshop V
 
The LEISA Atmospheric Corrector (LAC) on Earth Observer 1 (EO1)
The LEISA Atmospheric Corrector (LAC) on Earth Observer 1 (EO1)The LEISA Atmospheric Corrector (LAC) on Earth Observer 1 (EO1)
The LEISA Atmospheric Corrector (LAC) on Earth Observer 1 (EO1)
 
HDF and HDF-EOS Experiences and Applications
HDF and HDF-EOS Experiences and ApplicationsHDF and HDF-EOS Experiences and Applications
HDF and HDF-EOS Experiences and Applications
 
Workshop Discussion: HDF & HDF-EOS Future Direction
Workshop Discussion: HDF & HDF-EOS Future DirectionWorkshop Discussion: HDF & HDF-EOS Future Direction
Workshop Discussion: HDF & HDF-EOS Future Direction
 
Subsetting
SubsettingSubsetting
Subsetting
 
HDF-EOS Aura File Format Guidelines
HDF-EOS Aura File Format GuidelinesHDF-EOS Aura File Format Guidelines
HDF-EOS Aura File Format Guidelines
 

Similaire à HDF Update

Hdf5 current future
Hdf5 current futureHdf5 current future
Hdf5 current futuremfolk
 
9.-dados e processamento distribuido-hadoop.pdf
9.-dados e processamento distribuido-hadoop.pdf9.-dados e processamento distribuido-hadoop.pdf
9.-dados e processamento distribuido-hadoop.pdfManoel Ribeiro
 
HADOOP TECHNOLOGY ppt
HADOOP  TECHNOLOGY pptHADOOP  TECHNOLOGY ppt
HADOOP TECHNOLOGY pptsravya raju
 
HADOOP TECHNOLOGY ppt
HADOOP  TECHNOLOGY pptHADOOP  TECHNOLOGY ppt
HADOOP TECHNOLOGY pptsravya raju
 
Improving long-term preservation of EOS data by independently mapping HDF4 da...
Improving long-term preservation of EOS data by independently mapping HDF4 da...Improving long-term preservation of EOS data by independently mapping HDF4 da...
Improving long-term preservation of EOS data by independently mapping HDF4 da...The HDF-EOS Tools and Information Center
 

Similaire à HDF Update (20)

HDF Update for DAAC Managers (2017-02-27)
HDF Update for DAAC Managers (2017-02-27)HDF Update for DAAC Managers (2017-02-27)
HDF Update for DAAC Managers (2017-02-27)
 
HDF Update
HDF UpdateHDF Update
HDF Update
 
HDF5 and Ecosystem: What Is New?
HDF5 and Ecosystem: What Is New?HDF5 and Ecosystem: What Is New?
HDF5 and Ecosystem: What Is New?
 
HDF Software Process - Lessons Learned & Success Factors
HDF Software Process - Lessons Learned & Success FactorsHDF Software Process - Lessons Learned & Success Factors
HDF Software Process - Lessons Learned & Success Factors
 
Support for NPP/NPOESS/JPSS by The HDF Group
 Support for NPP/NPOESS/JPSS by The HDF Group Support for NPP/NPOESS/JPSS by The HDF Group
Support for NPP/NPOESS/JPSS by The HDF Group
 
HDF Update
HDF UpdateHDF Update
HDF Update
 
HDF Project Update
HDF Project UpdateHDF Project Update
HDF Project Update
 
Hdf5 current future
Hdf5 current futureHdf5 current future
Hdf5 current future
 
HDF5 High Level and Lite Libraries
HDF5 High Level and Lite LibrariesHDF5 High Level and Lite Libraries
HDF5 High Level and Lite Libraries
 
The State of HDF
The State of HDFThe State of HDF
The State of HDF
 
HDF Product Designer
HDF Product DesignerHDF Product Designer
HDF Product Designer
 
HDF5 and The HDF Group
HDF5 and The HDF GroupHDF5 and The HDF Group
HDF5 and The HDF Group
 
9.-dados e processamento distribuido-hadoop.pdf
9.-dados e processamento distribuido-hadoop.pdf9.-dados e processamento distribuido-hadoop.pdf
9.-dados e processamento distribuido-hadoop.pdf
 
HADOOP TECHNOLOGY ppt
HADOOP  TECHNOLOGY pptHADOOP  TECHNOLOGY ppt
HADOOP TECHNOLOGY ppt
 
HADOOP TECHNOLOGY ppt
HADOOP  TECHNOLOGY pptHADOOP  TECHNOLOGY ppt
HADOOP TECHNOLOGY ppt
 
Plans for Enhanced NetCDF-4 Interface to HDF5 Data
Plans for Enhanced NetCDF-4 Interface to HDF5 DataPlans for Enhanced NetCDF-4 Interface to HDF5 Data
Plans for Enhanced NetCDF-4 Interface to HDF5 Data
 
HDF Update
HDF UpdateHDF Update
HDF Update
 
Improving long-term preservation of EOS data by independently mapping HDF4 da...
Improving long-term preservation of EOS data by independently mapping HDF4 da...Improving long-term preservation of EOS data by independently mapping HDF4 da...
Improving long-term preservation of EOS data by independently mapping HDF4 da...
 
Introduction to HDF5 Data and Programming Models
Introduction to HDF5 Data and Programming ModelsIntroduction to HDF5 Data and Programming Models
Introduction to HDF5 Data and Programming Models
 
Hierarchical Data Formats (HDF) Update
Hierarchical Data Formats (HDF) UpdateHierarchical Data Formats (HDF) Update
Hierarchical Data Formats (HDF) Update
 

Plus de The HDF-EOS Tools and Information Center

STARE-PODS: A Versatile Data Store Leveraging the HDF Virtual Object Layer fo...
STARE-PODS: A Versatile Data Store Leveraging the HDF Virtual Object Layer fo...STARE-PODS: A Versatile Data Store Leveraging the HDF Virtual Object Layer fo...
STARE-PODS: A Versatile Data Store Leveraging the HDF Virtual Object Layer fo...The HDF-EOS Tools and Information Center
 

Plus de The HDF-EOS Tools and Information Center (20)

Cloud-Optimized HDF5 Files
Cloud-Optimized HDF5 FilesCloud-Optimized HDF5 Files
Cloud-Optimized HDF5 Files
 
Accessing HDF5 data in the cloud with HSDS
Accessing HDF5 data in the cloud with HSDSAccessing HDF5 data in the cloud with HSDS
Accessing HDF5 data in the cloud with HSDS
 
Highly Scalable Data Service (HSDS) Performance Features
Highly Scalable Data Service (HSDS) Performance FeaturesHighly Scalable Data Service (HSDS) Performance Features
Highly Scalable Data Service (HSDS) Performance Features
 
Creating Cloud-Optimized HDF5 Files
Creating Cloud-Optimized HDF5 FilesCreating Cloud-Optimized HDF5 Files
Creating Cloud-Optimized HDF5 Files
 
HDF5 OPeNDAP Handler Updates, and Performance Discussion
HDF5 OPeNDAP Handler Updates, and Performance DiscussionHDF5 OPeNDAP Handler Updates, and Performance Discussion
HDF5 OPeNDAP Handler Updates, and Performance Discussion
 
Hyrax: Serving Data from S3
Hyrax: Serving Data from S3Hyrax: Serving Data from S3
Hyrax: Serving Data from S3
 
Accessing Cloud Data and Services Using EDL, Pydap, MATLAB
Accessing Cloud Data and Services Using EDL, Pydap, MATLABAccessing Cloud Data and Services Using EDL, Pydap, MATLAB
Accessing Cloud Data and Services Using EDL, Pydap, MATLAB
 
HDF - Current status and Future Directions
HDF - Current status and Future DirectionsHDF - Current status and Future Directions
HDF - Current status and Future Directions
 
HDFEOS.org User Analsys, Updates, and Future
HDFEOS.org User Analsys, Updates, and FutureHDFEOS.org User Analsys, Updates, and Future
HDFEOS.org User Analsys, Updates, and Future
 
HDF - Current status and Future Directions
HDF - Current status and Future Directions HDF - Current status and Future Directions
HDF - Current status and Future Directions
 
H5Coro: The Cloud-Optimized Read-Only Library
H5Coro: The Cloud-Optimized Read-Only LibraryH5Coro: The Cloud-Optimized Read-Only Library
H5Coro: The Cloud-Optimized Read-Only Library
 
MATLAB Modernization on HDF5 1.10
MATLAB Modernization on HDF5 1.10MATLAB Modernization on HDF5 1.10
MATLAB Modernization on HDF5 1.10
 
HDF for the Cloud - Serverless HDF
HDF for the Cloud - Serverless HDFHDF for the Cloud - Serverless HDF
HDF for the Cloud - Serverless HDF
 
HDF5 <-> Zarr
HDF5 <-> ZarrHDF5 <-> Zarr
HDF5 <-> Zarr
 
HDF for the Cloud - New HDF Server Features
HDF for the Cloud - New HDF Server FeaturesHDF for the Cloud - New HDF Server Features
HDF for the Cloud - New HDF Server Features
 
Apache Drill and Unidata THREDDS Data Server for NASA HDF-EOS on S3
Apache Drill and Unidata THREDDS Data Server for NASA HDF-EOS on S3Apache Drill and Unidata THREDDS Data Server for NASA HDF-EOS on S3
Apache Drill and Unidata THREDDS Data Server for NASA HDF-EOS on S3
 
STARE-PODS: A Versatile Data Store Leveraging the HDF Virtual Object Layer fo...
STARE-PODS: A Versatile Data Store Leveraging the HDF Virtual Object Layer fo...STARE-PODS: A Versatile Data Store Leveraging the HDF Virtual Object Layer fo...
STARE-PODS: A Versatile Data Store Leveraging the HDF Virtual Object Layer fo...
 
HDF5 Roadmap 2019-2020
HDF5 Roadmap 2019-2020HDF5 Roadmap 2019-2020
HDF5 Roadmap 2019-2020
 
Leveraging the Cloud for HDF Software Testing
Leveraging the Cloud for HDF Software TestingLeveraging the Cloud for HDF Software Testing
Leveraging the Cloud for HDF Software Testing
 
Google Colaboratory for HDF-EOS
Google Colaboratory for HDF-EOSGoogle Colaboratory for HDF-EOS
Google Colaboratory for HDF-EOS
 

Dernier

Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilV3cube
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024The Digital Insurer
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...gurkirankumar98700
 
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsRoshan Dwivedi
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j
 

Dernier (20)

Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of Brazil
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
 
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 

HDF Update

  • 1. HDF Update Mike Folk National Center for Supercomputing Applications Science Data Processing Workshop February 26-28, 2002 -1- HDF
  • 2. Topics • What is HDF? • HDF4 update and plans • HDF5 update and plans -2- HDF
  • 3. NCSA HDF Mission To develop, promote, deploy, and support open and free technologies that facilitate scientific data storage, exchange, access, analysis and discovery. -3- HDF
  • 4. What is HDF? • • • • • • Format and software for scientific data Stores images, arrays, tables, etc. Storage and I/O efficiency Free and commercial software support Promote standards Broad user base in engineering & science -4- HDF
  • 5. Who is supporting HDF? • NASA/ESDIS – Earth science applications, instrument data – All aspects of data management • DOE/ASCI (Accelerated Strategic Computing Init.) – Simulations on massively parallel machines – Emphasis on parallel I/O performance, functionality • DOE Scientific Data Analysis & Computation Program – High performance I/O R & D • NCSA – Grid, Vis, other R&D, user support • Others – Applications, support, some R&D -5- HDF
  • 7. Goals for HDF4 • Support Terra, Aqua & other EOS data users • Maintain and upgrade library and tools • Address differences between HDF4 & HDF5 -7- HDF
  • 8. HDF4 milestones since Sept 2000 Q4 ‘00 Q1 ‘01 Q2 ‘01 Q3 ‘01 Q4 ‘01 Q1 ‘02 HDF4 library and utilities: HDF4 Java API & tools: •H F4 D •V . r4 1 sio er •H F4 D . r5 1 2.6 n •V -8- sio er 2.7 n HDF
  • 9. HDF4 releases • HDF4.1 r4 (Nov 2000) – – – – Chunking and chunking with compression JPEG compression fixed hdp utility enhancements HDF4-to-GIF and GIF-to-HDF4 converters • HDF4.1 r5 (Nov 2001) – – – – – Compression query functions Vdata: set size of Vdata link blocks Tools updates Added Solaris 2.8 Retired Solaris 2.6, OSF1 V4.0 and HPUX 10.20 -9- HDF
  • 11. HDF5 milestones since Sept 2000 Q4 ‘00 Q1 β ‘01 Q2 ‘01 Q3 ‘01 Q4 ‘01 Q1 ‘02 ) afe s ch at dp 3 rea 2 0 1 2( Base .4. . 4 . 1.4. .4. 1.4. Th _ _1 _1 _ _1 _ library: β e” ag β “im e” & High level l e” ab “t “lit library: _ • Java & other tools: HDF4-HDF5 conversion: _ .4. 1 0 _ .4. 1 1 ion rs e nv co lity _ ti u - 11 - Ja _ v 2.2 a α α2 y y rar ibrar lib _ _l HDF
  • 13. New platforms and languages • New platforms – – – – – – • Fortran 90 HP-UX IBM SP IA64 architecture Solaris 2.8 with 64-bit Linux in parallel mode Metrowerks Code Warrior Windows compiler – Solaris 2.6 & 2.7, O2K, DEC OSF, Linux, Cray T3E, SV1 – Parallel on T3E & O2K – Windows • C++ - 13 - – Solaris 2.6 and 2.7, Free BSD, Linux, Windows HDF
  • 14. New to HDF5 • • • • • • Virtual File Layer – alternate I/O drivers Array datatype – array as atomic type File sizes greater than 2GB on Linux Performance improvements – parallel & serial Improvements in configuration Tutorials and documentation - 14 - HDF
  • 15. Next major release -- HDF5 1.6 • New format and library features – – – – – – Reclamation of free space within a file Enhanced hyperslab/region selection Dimension scale support Bzip2 compression Generic Properties Performance improvements • Parallel I/O performance benchmark suite • Release date: Fall 2002 - 15 - HDF
  • 16. High level APIs • HDF5 routines that do more operations per call than the basic HDF5 interface • Goals – Make HDF5 easier to use – Encourage standard ways to store objects in HDF5 - 16 - HDF
  • 17. High level APIs • • • • • • Lite – done Image – done Table – partly done Dimension scale – in the works Unstructured grids – in the works http://hdf.ncsa.uiuc.edu/HDF5/hdf5_hl/doc/ - 17 - HDF
  • 18. HDF5 High Level APIs – HDF5 Image • For datasets to be interpreted as images/palettes – 2-D raster data like HDF4 raster images • Image operations – Create, write, read, query • Based on “HDF5 Image & Palette Specification” - 18 - HDF
  • 19. HDF5 High Level APIs – HDF5 Table • For datasets to be interpreted as “tables” – A collection of records – All records have the same structure – Like Vdatas in HDF4, but more operations • Table operations – Create, write, read, query – Insert, delete records or fields – Future: sort and search - 19 - HDF
  • 21. H5View – new features • • • • • • • Improved editing XML support Supports “image” data sets Line plots of row/column data Save data to a text file Set user preferences http://hdf.ncsa.uiuc.edu/java-hdf5-html/ - 21 - HDF
  • 22. Future Viewer work • Objectives – Support transition from HDF4 to HDF5 – Eliminate duplicate work on HDF4 and HDF5 tools – Provide plug-in architecture for tool development • Three stages – Object model to cover both HDF4 & HDF5 • Nearly done – Simple browser for both HDF4 and HDF5 • Summer 2002 – Editor • Fall 2002 – Modular toolkit for browser plug-ins • Fall 2002, if all goes well - 22 - HDF
  • 23. XML and web experiments • Explore XML, Web applications for HDF5 – Investigate use of XML – investigate Web XML technologies, use with HDF5 • XML DTD for HDF5 • Conversions involving XML – HDF4 HDF5 XML – Compression and timing study – netcdf to HDF5 translation, via XML style sheet • XML Schema for HDF5 investigated - 23 - HDF
  • 24. Other Java-inspired explorations • Tomcat web server and JSP servlets – HDF5 file XML html display in a web browser – Shows promise for providing remote access to HDF5 • HDF5 and CORBA – – – – Investigating using CORBA with Java and HDF5. Data access done in a CORBA servant written in C/C++ Browsing and presenting the information done in Java Uses CORBA is alternative to the Java Native Interface, which calls C directly from Java. - 24 - HDF
  • 25. Other Tools Activities • HDF5-to-GIF and GIF-to-HDF5 converters • H5dump subsetting - 25 - HDF
  • 26. Facilitating the transition from HDF4 to HDF5 - 26 - HDF
  • 27. The transition from HDF4 to HDF5 • We support both HDF4 and HDF5. • We will continue to maintain HDF4, as long as we are funded to do so. • We recommend: • Use HDF5 for new projects • Consider migrating from HDF4 to HDF5 to take advantage of the improved features and performance of HDF5 - 27 - HDF
  • 29. HDF4-to-HDF5 Work • Later talk by Bob McGrath to cover – The key technical challenges – NCSA toolkit work – Experiments to support transition - 29 - HDF
  • 30. Other Activities of Interest - 30 - HDF
  • 31. Parallel HDF5 • • • • HDF5 supports parallel I/O Emphasis on performance Driven by DOE labs Tutorial http://hdf.ncsa.uiuc.edu/HDF5/doc/Tutor/ • See Elena’s talk - 31 - HDF
  • 32. Parallel I/O benchmark suite • A set of parallel performance benchmarks that can be distributed with HDF5 • Parallel HDF5 in MPI environment • Measures performance of raw I/O, MPI-I/O, & HDF5 • Draft documentation: http://hdf/RFC/PIO_Perf/PHDF5_performance.html - 32 - HDF
  • 33. Other performance studies See Elena’s talk - 33 - HDF
  • 34. Thread-Safe HDF5 (Beta) • Uses Pthreads (POSIX threads) library. • Implements Phase 1 of HDF5 thread safety plan – One lock per program • Later phases – 2: Separate locks per major operation (read, write, convert) – 3: Separate locks for all operations. • For details see – http://hdf.ncsa.uiuc.edu/HDF5/papers/mthdf/ - 34 - HDF
  • 35. HDF5-DODS* Server • HDF5 data available from DODS clients • Demonstrated with Ferret client • Documentation – HDF5-DODS Data Model and Mapping – The HDF5-DODS Server Prototype – Demo of HDF5-DODS Server Prototype with the DODS-Ferret Client • http://hdf.ncsa.uiuc.edu/apps/dods/ * Distributed Oceanographic Data System - 35 - HDF
  • 36. Transform architecture prototype • Provides a mechanism for operating on data as it is moved from one location to another • Inspired by HDF5 I/O operations when moving data between memory and file: – Convert numbers – size, endianness, etc. – Selection – subsetting & subsampling – Linear order – row major vs. column major • Later – Change value/Meaning of data – units, coordinate system, etc. - 36 - HDF
  • 37. Outreach • Documentation – Expanded and improved – Searchable, printable – HDF5 User’s Guide • Advanced and parallel HDF5 Tutorial – http://hdf.ncsa.uiuc.edu/HDF5/doc/Tutor/ – Presented at SC2001 http://hdf.ncsa.uiuc.edu/HDF5/papers/SC2001/SC01_tutorial/ - 37 - HDF
  • 38. szip Compression Reporting for Pen-Shu Yeh and Wei Xia - 38 - HDF
  • 39. Szip Compression in HDF 4 • CCSDS lossless compression – Fast, effective compression method for EOS data – Outperforms other common techniques in size & speed • Szip software – Implements CCSDS lossless compression – Originally developed at University of New Mexico – Implemented in HDF-4 by UNM • HDF-4 with CCSDS-szip option (HDF-4-Szip) – Delivered to GSFC and tested – Tested with MODIS Level-1B data from GSFC DAAC - 39 - HDF
  • 40. Sample results Convert 343 Megabyte MODIS file Conversion performed on PC-Linux 7.1, Pentium II, 300Mhz Compression Ratio Compression and decompresson time (seconds) 2.80 3.00 700 600 500 400 300 200 100 0 558 Huff 2.28 2.50 575 Huff 2.00 1.60 1.50 RLE szip RLE szip 1.00 86 72 42 64 0.50 0.00 CompressTime DecompressTime Rle - 40 - Huff szip HDF
  • 41. Szip-HDF4 software • HDF4 library routine SDsetcompress – compress existing dataset or create new compressed dataset • Utilities – bin2hdf – input array from binary file, compress using HDF-4-Szip, and output compressed data in HDF file – hdf2chdf – input data from HDF file, compress using HDF4-Szip, and output compressed data in new HDF file – chdf2bin – input HDF file (with or without compression), and output user-selected arrays as separated binary files - 41 - HDF
  • 42. Future Work • Compress IEEE 32 bit float data with HDF-4-Szip • Use HDF chunking routines – Especially beneficial for subsetting performance • Test HDF-4-szip on different computer platforms – Currently HDF-4-Szip has been tested under Linux only – Others to test: Sun, SGI and HP, Windows 98 and NT • Advocacy in the scientific community – Add User’s Guides and documentation to HDF docs. – Present and demo at workshops and conferences. • Resolve distribution issues with NCSA and UNM - 42 - HDF
  • 43. Thank you! Information Sources HDF • HDF website – http://hdf.ncsa.uiuc.edu/ 5 • HDF5 Information Center – http://hdf.ncsa.uiuc.edu/HDF5/ • HDF Helpdesk – hdfhelp@ncsa.uiuc.edu • HDF users mailing list – hdfnews@ncsa.uiuc.edu - 43 - HDF