It will cover features of the HDF5 library for achieving better I/O performance and efficient storage. The following HDF5 features will be discussed: chunked storage layout.
This tutorial is for persons who are already familiar with HDF5 and wish to take advantage is some of its advanced features.
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
HDF5 Advanced Topics - Chunking
1. The HDF Group
HDF5 Advanced Topics
Elena Pourmal
The HDF Group
The 13th HDF and HDF-EOS Workshop
November 3-5, 2009
November 3-5, 2009
HDF/HDF-EOS Workshop XIII
1
www.hdfgroup.org
2. The HDF Group
Chunking in HDF5
November 3-5, 2009
HDF/HDF-EOS Workshop XIII
2
www.hdfgroup.org
3. Goal
• To help you with understanding of how HDF5
chunking works, so you can efficiently store and
retrieve data from HDF5
November 3-5, 2009
HDF/HDF-EOS Workshop XIII
3
www.hdfgroup.org
4. Recall from Intro: HDF5 Dataset
Metadata
Dataset data
Dataspace
Rank Dimensions
3
Dim_1 = 4
Dim_2 = 5
Dim_3 = 7
Datatype
IEEE 32-bit float
Storage info
Attributes
Time = 32.4
Chunked
Pressure = 987
Compressed
Temp = 56
November 3-5, 2009
HDF/HDF-EOS Workshop XIII
4
www.hdfgroup.org
5. Contiguous storage layout
• Metadata header separate from dataset data
• Data stored in one contiguous block in HDF5
file
Metadata cache
Dataset header
………….
Datatype
Dataspace
………….
Attributes
…
Dataset data
Application memory
File
November 3-5, 2009
Dataset data
HDF/HDF-EOS Workshop XIII
5
www.hdfgroup.org
6. What is HDF5 Chunking?
• Data is stored in chunks of predefined size
• Two-dimensional instance may be referred to
as data tiling
• HDF5 library always writes/reads the whole
chunk
Contiguous
November 3-5, 2009
HDF/HDF-EOS Workshop XIII
Chunked
6
www.hdfgroup.org
7. What is HDF5 Chunking?
• Dataset data is divided into equally sized blocks (chunks).
• Each chunk is stored separately as a contiguous block in
HDF5 file.
Metadata cache
Dataset data
Dataset header
………….
Datatype
Dataspace
………….
Attributes
…
File
November 3-5, 2009
A
B
C
D
Chunk
index
Application memory
header
Chunk
index
A
HDF/HDF-EOS Workshop XIII
C
D
7
B
www.hdfgroup.org
8. Why HDF5 Chunking?
• Chunking is required for several HDF5
features
• Enabling compression and other filters like
checksum
• Extendible datasets
November 3-5, 2009
HDF/HDF-EOS Workshop XIII
8
www.hdfgroup.org
9. Why HDF5 Chunking?
• If used appropriately chunking improves
partial I/O for big datasets
Only two chunks are involved in I/O
November 3-5, 2009
HDF/HDF-EOS Workshop XIII
9
www.hdfgroup.org
10. Creating Chunked Dataset
1.
2.
3.
Create a dataset creation property list.
Set property list to use chunked storage layout.
Create dataset with the above property list.
dcpl_id = H5Pcreate(H5P_DATASET_CREATE);
rank = 2;
ch_dims[0] = 100;
ch_dims[1] = 200;
H5Pset_chunk(dcpl_id, rank, ch_dims);
dset_id = H5Dcreate (…, dcpl_id);
H5Pclose(dcpl_id);
November 3-5, 2009
HDF/HDF-EOS Workshop XIII
10
www.hdfgroup.org
11. Creating Chunked Dataset
• Things to remember:
• Chunk always has the same rank as a dataset
• Chunk’s dimensions do not need to be factors
of dataset’s dimensions
• Caution: May cause more I/O than desired
(see white portions of the chunks below)
November 3-5, 2009
HDF/HDF-EOS Workshop XIII
11
www.hdfgroup.org
12. Quiz time
• Why shouldn’t I make a chunk with dimension
sizes equal to one?
• Can I change chunk size after dataset was
created?
November 3-5, 2009
HDF/HDF-EOS Workshop XIII
12
www.hdfgroup.org
13. Writing or Reading Chunked Dataset
1.
2.
Chunking mechanism is transparent to application.
Use the same set of operation as for contiguous
dataset, for example,
H5Dopen(…);
H5Sselect_hyperslab (…);
H5Dread(…);
3.
Selections do not need to coincide precisely with the
chunks boundaries.
November 3-5, 2009
HDF/HDF-EOS Workshop XIII
13
www.hdfgroup.org
14. HDF5 Chunking and compression
Chunking is required for compression and
other filters
HDF5 filters modify data during I/O operations
Filters provided by HDF5:
•
•
•
•
•
•
•
Checksum (H5Pset_fletcher32)
Data transformation (in 1.8.*)
Shuffling filter (H5Pset_shuffle)
Compression (also called filters) in HDF5
•
•
•
•
November 3-5, 2009
Scale + offset (in 1.8.*) (H5Pset_scaleoffset)
N-bit (in 1.8.*) (H5Pset_nbit)
GZIP (deflate) (H5Pset_deflate)
SZIP (H5Pset_szip)
HDF/HDF-EOS Workshop XIII
14
www.hdfgroup.org
16. Creating Compressed Dataset
1.
2.
3.
4.
Create a dataset creation property list
Set property list to use chunked storage layout
Set property list to use filters
Create dataset with the above property list
crp_id = H5Pcreate(H5P_DATASET_CREATE);
rank = 2;
ch_dims[0] = 100;
ch_dims[1] = 100;
H5Pset_chunk(crp_id, rank, ch_dims);
H5Pset_deflate(crp_id, 9);
dset_id = H5Dcreate (…, crp_id);
H5Pclose(crp_id);
November 3-5, 2009
HDF/HDF-EOS Workshop XIII
16
www.hdfgroup.org
17. The HDF Group
Performance Issues
or
What everyone needs to know
about chunking, compression and
chunk cache
November 3-5, 2009
HDF/HDF-EOS Workshop XIII
17
www.hdfgroup.org
18. Accessing a row in contiguous dataset
One seek is needed to find the starting location of row of data.
Data is read/written using one disk access.
November 3-5, 2009
HDF/HDF-EOS Workshop XIII
18
www.hdfgroup.org
19. Accessing a row in chunked dataset
Five seeks is needed to find each chunk. Data is read/written
using five disk accesses. Chunking storage is less efficient
than contiguous storage.
November 3-5, 2009
HDF/HDF-EOS Workshop XIII
19
www.hdfgroup.org
20. Quiz time
• How might I improve this situation, if it is
common to access my data in this way?
November 3-5, 2009
HDF/HDF-EOS Workshop XIII
20
www.hdfgroup.org
21. Accessing data in contiguous dataset
M rows
M seeks are needed to find the starting location of the element.
Data is read/written using M disk accesses. Performance may be
very bad.
November 3-5, 2009
HDF/HDF-EOS Workshop XIII
21
www.hdfgroup.org
22. Motivation for chunking storage
M rows
Two seeks are needed to find two chunks. Data is
read/written using two disk accesses. For this pattern
chunking helps with I/O performance.
November 3-5, 2009
HDF/HDF-EOS Workshop XIII
22
www.hdfgroup.org
23. Quiz time
• If I know I shall always access a column at a time,
what size and shape should I make my chunks?
November 3-5, 2009
HDF/HDF-EOS Workshop XIII
23
www.hdfgroup.org
24. Motivation for chunk cache
A
B
H5Dwrite
H5Dwrite
Selection shown is written by two H5Dwrite calls (one for
each row).
Chunks A and B are accessed twice (one time for each
row). If both chunks fit into cache, only two I/O accesses
needed to write the shown selections.
November 3-5, 2009
HDF/HDF-EOS Workshop XIII
24
www.hdfgroup.org
25. Motivation for chunk cache
A
B
H5Dwrite
H5Dwrite
Question: What happens if there is a space for only one
chunk at a time?
November 3-5, 2009
HDF/HDF-EOS Workshop XIII
25
www.hdfgroup.org
26. HDF5 raw data chunk cache
• Improves performance whenever the same chunks are
read or written multiple times.
• Current implementation doesn’t adjust parameters
automatically (cache size, size of hash table).
• Chunks are indexed with a simple hash table.
• Hash function = (cindex mod nslots), where cindex is the
linear index into a hypothetical array of chunks and nslots
is the size of hash table.
• Only one of several chunks with the same hash value
stays in cache.
• Nslots should be a prime number to minimize the number
of hash value collisions.
November 3-5, 2009
HDF/HDF-EOS Workshop XIII
26
www.hdfgroup.org
27. HDF5 Chunk Cache APIs
• H5Pset_chunk_cache sets raw data chunk
cache parameters for a dataset
H5Pset_chunk_cache (dapl, rdcc_nslots,
rdcc_nbytes, rdcc_w0);
• H5Pset_cache sets raw data chunk cache
parameters for all datasets in a file
H5Pset_cache (fapl, 0, nslots,
5*1024*1024, rdcc_w0);
November 3-5, 2009
HDF/HDF-EOS Workshop XIII
27
www.hdfgroup.org
28. Hints for Chunk Settings
• Chunk dimension sizes should align as closely as
possible with hyperslab dimensions for read/write
• Chunk cache size (rdcc_nbytes) should be large
enough to hold all the chunks in a selection
• If this is not possible, it may be best to disable chunk
caching altogether (set rdcc_nbytes to 0)
• rdcc_slots should be a prime number that is at
least 10 to 100 times the number of chunks that can
fit into rdcc_nbytes
• rdcc_w0 should be set to 1 if chunks that have been
fully read/written will never be read/written again
November 3-5,
2009
HDF/HDF-EOS Workshop XIII
28
www.hdfgroup.org
29. The Good and The Ugly: Reading a row
A
B
M rows
Each row is read by a
separate call to H5Dread
The Good: If both chunks fit into cache, 2 disks accesses
are needed to read the data.
The Ugly: If one chunk fits into cache, 2M disks accesses
are needed to read the data (compare with M accesses for
contiguous storage).
November 3-5, 2009
HDF/HDF-EOS Workshop XIII
29
www.hdfgroup.org
30. Case study: Writing Chunked Dataset
• 1000x100x100 dataset
• 4 byte integers
• Random values 0-99
• 50x100x100 chunks (20 total)
• Chunk size: 2 MB
• Write the entire dataset using 1x100x100
slices
• Slices are written sequentially
• Chunk cache size 1MB (default) compared
with chunk cache size is 5MB
November 3-5, 2009
HDF/HDF-EOS Workshop XIII
30
www.hdfgroup.org
31. Test Setup
• 20 Chunks
• 1000 slices
• Chunk size ~ 2MB
• Total size ~ 40MB
• Each plane ~ 40KB
November 3-5, 2009
HDF/HDF-EOS Workshop XIII
31
www.hdfgroup.org
32. Aside: Writing dataset with contiguous storage
• 1000 disk accesses to write 1000 planes
• Total size written 40MB
November 3-5, 2009
HDF/HDF-EOS Workshop XIII
32
www.hdfgroup.org
33. Writing chunked dataset
• Example: Chunk fits into cache
• Chunk is filled in cache and then written to
disk
• 20 disk accesses are needed
• Total size written 40MB
1000 disk access for contiguous
November 3-5, 2009
HDF/HDF-EOS Workshop XIII
33
www.hdfgroup.org
34. Writing chunked dataset
• Example: Chunk doesn’t fit into cache
• For each chunk (20 total)
1. Fill chunk in memory with the first plane and write it
to the file
2. Write 49 new planes to file directly
• End For
• Total disk accesses 20 x(1 + 49)= 1000
• Total data written ~80MB (vs. 40MB)
B
November 3-5,
2009
B
Chunk cache
HDF/HDF-EOS Workshop XIII
34
A
A
November 3-5, 2009
A
HDF/HDF-EOS Workshop XIII
34
www.hdfgroup.org
35. Writing compressed chunked dataset
• Example: Chunk fits into cache
• For each chunk (20 total)
1. Fill chunk in memory, compress it and write it to file
• End For
• Total disk accesses 20
• Total data written less than 40MB
Chunk cache
B
November 3-5,
2009
Chunk in a file
B
B
35
A
A
A
November 3-5, 2009
HDF/HDF-EOS Workshop XIII
35
www.hdfgroup.org
36. Writing compressed chunked dataset
• Example: Chunk doesn’t fit into cache
• For each chunk (20 total)
•
•
•
•
•
•
•
Fill chunk with the first plane, compress, write to a file
For each new plane (49 planes)
• Read chunk back
• Fill chunk with the plane
• Compress
• Write chunk to a file
End For
End For
Total disk accesses 20 x(1+2x49)= 1980
Total data written and read ? (see next slide)
Note: HDF5 can probably detect such behavior and increase cache
size
November 3-5, 2009
HDF/HDF-EOS Workshop XIII
36
www.hdfgroup.org
37. Effect of Chunk Cache Size on Write
No compression, chunk size is 2MB
Cache size
I/O operations
Total data
written
File size
1 MB (default)
1002
75.54 MB
38.15 MB
5 MB
22
38.16 MB
38.15 MB
Gzip compression
Cache size
I/O operations
Total data
written
File size
1 MB (default)
1982
335.42 MB
(322.34 MB read)
13.08 MB
5 MB
22
13.08 MB
13.08 MB
November 3-5,
2009
HDF/HDF-EOS Workshop XIII
37
www.hdfgroup.org
38. Effect of Chunk Cache Size on Write
• With the 1 MB cache size, a chunk will not fit into the
cache
• All writes to the dataset must be immediately written to
disk
• With compression, the entire chunk must be read and
rewritten every time a part of the chunk is written to
• Data must also be decompressed and recompressed
each time
• Non sequential writes could result in a larger file
• Without compression, the entire chunk must be
written when it is first written to the file
• If the selection were not contiguous on disk, it could
require as much as 1 I/O disk access for each
element
November 3-5,
2009
HDF/HDF-EOS Workshop XIII
38
www.hdfgroup.org
39. Effect of Chunk Cache Size on Write
• With the 5 MB cache size, the chunk is written
only after it is full
• Drastically reduces the number of I/O
operations
• Reduces the amount of data that must be
written (and read)
• Reduces processing time, especially with the
compression filter
November 3-5,
2009
HDF/HDF-EOS Workshop XIII
39
www.hdfgroup.org
40. Conclusion
• It is important to make sure that a chunk will fit
into the raw data chunk cache
• If you will be writing to multiple chunks at once,
you should increase the cache size even more
• Try to design chunk dimensions to minimize
the number you will be writing to at once
November 3-5,
2009
HDF/HDF-EOS Workshop XIII
40
www.hdfgroup.org
41. Reading Chunked Dataset
• Read the same dataset, again by slices, but
the slices cross through all the chunks
• 2 orientations for read plane
• Plane includes fastest changing dimension
• Plane does not include fastest changing
dimension
• Measure total read operations, and total size
read
• Chunk sizes of 50x100x100, and 10x100x100
• 1 MB cache
November 3-5,
2009
HDF/HDF-EOS Workshop XIII
41
www.hdfgroup.org
42. Test Setup
• Chunks
• Read slices
100
• Vertical and horizontal
00
10
100
November 3-5,
2009
HDF/HDF-EOS Workshop XIII
42
www.hdfgroup.org
43. Aside: Reading from contiguous dataset
• Repeat 100 times for each plane
• Repeat 1000 times
• Read a row
• Seek to the beginning of the next read
• Total 105 disk accesses
00
1
0
seek
November 3-5,
2009
100
read
100
HDF/HDF-EOS Workshop XIII
43
www.hdfgroup.org
44. Reading chunked dataset
• No compression; chunk fits into cache
• For each plane (100 total)
• For each chunk (20 total)
• Read chunk
• Extract 50 rows
00
10
• End For
• End For
100
• Total 2000 disk accesses
• Chunk doesn’t fit into cache
• Data is read directly
from the file
• 105 disk accesses
November 3-5, 2009
HDF/HDF-EOS Workshop XIII
100
44
www.hdfgroup.org
45. Reading chunked dataset
• Compression
• Cache size doesn’t matter in this case
• For each plane (100 total)
• For each chunk (20 total)
• Read chunk, uncompress
• Extract 50 rows
00
10
• End
• Total 2000 disk accesses
November 3-5,
2009
100
• End
100
HDF/HDF-EOS Workshop XIII
45
www.hdfgroup.org
46. Results
• Read slice includes fastest changing
dimension
Chunk size
Compression
I/O operations
Total data read
50
Yes
2010
1307 MB
10
Yes
10012
1308 MB
50
No
100010
38 MB
10
No
10012
3814 MB
November 3-5,
2009
HDF/HDF-EOS Workshop XIII
46
www.hdfgroup.org
47. Aside: Reading from contiguous dataset
• Repeat for each plane (100 total)
• Repeat for each column (1000 total)
• Repeat for each element (100 total)
• Read element
• Seek to the next one
00
10
seek
November 3-5,
2009
100
• Total 107 disk accesses
100
HDF/HDF-EOS Workshop XIII
47
www.hdfgroup.org
48. Reading chunked dataset
• No compression; chunk fits into cache
• For each plane (100 total)
• For each chunk (20 total)
• Read chunk, uncompress
• Extract 50 columns
• End
• End
• Total 2000 disk accesses
• Chunk doesn’t fit into cache
• Data is read directly
from the file
• 7
November 3-5,10 disk operations
2009
HDF/HDF-EOS Workshop XIII
48
www.hdfgroup.org
49. Reading chunked dataset
• Compression; cache size doesn’t matter
• For each plane (100 total)
• For each chunk (20 total)
• Read chunk, uncompress
• Extract 50 columns
• End
• End
• Total 2000 disk accesses
November 3-5, 2009
HDF/HDF-EOS Workshop XIII
49
www.hdfgroup.org
50. Results (continued)
• Read slice does not include fastest changing
dimension
Chunk size
Compression
I/O operations
Total data read
50
Yes
2010
1307 MB
10
Yes
10012
1308 MB
50
No
10000010
38 MB
10
No
10012
3814 MB
November 3-5,
2009
HDF/HDF-EOS Workshop XIII
50
www.hdfgroup.org
51. Effect of Cache Size on Read
• When compression is enabled, the library must
always read entire chunk once for each call to
H5Dread (unless it is in cache)
• When compression is disabled, the library’s
behavior depends on the cache size relative to
the chunk size.
• If the chunk fits in cache, the library reads entire
chunk once for each call to H5Dread
• If the chunk does not fit in cache, the library reads
only the data that is selected
• More read operations, especially if the read plane
does not include the fastest changing dimension
• Less total data read
November 3-5,
2009
HDF/HDF-EOS Workshop XIII
51
www.hdfgroup.org
52. Conclusion
• On read cache size does not matter when
compression is enabled.
• Without compression, the cache must be large
enough to hold all of the chunks to get good
preformance.
• The optimum cache size depends on the exact
shape of the data, as well as the hardware, as
well as access pattern.
November 3-5, 2009
HDF/HDF-EOS Workshop XIII
52
www.hdfgroup.org
53. The HDF Group
Thank You!
November 3-5, 2009
HDF/HDF-EOS Workshop XIII
53
www.hdfgroup.org
54. Acknowledgements
This work was supported by cooperative agreement
number NNX08AO77A from the National
Aeronautics and Space Administration (NASA).
Any opinions, findings, conclusions, or
recommendations expressed in this material are
those of the author[s] and do not necessarily reflect
the views of the National Aeronautics and Space
Administration.
November 3-5, 2009
HDF/HDF-EOS Workshop XIII
54
www.hdfgroup.org
More data will be written in this case
Ghost zones are filled with fill value unless fill value is disabled
2MB chunk doesn’t fit to 1MB cache. Chunk is allocated in the file, and written once with the first frame in the chunk, then HDF5 writes one frame at a time, writing 20x49 + 20 times = 1000 plus small I/O for metatdata. Almost twice the size participates in I/O
2MB chunk does fit into 5MB cache, we write 50 frames at once, when we fill the chunk and do it 20 times only!
In case of compression situation is even worse: we need to write chunk every time we write a frame, then read it back to write another frame, etc.
For each plane we write chunk 1000; in order to modify plane 2-50 (49 of them) in one chunk , we have to read the chunk 49 times x 20 times (number of chunks), so we get 1000 (writes) + 980 (reads).
We do 1000 writes and 980 reads for raw data.
Bigger cache works nicely.
First case: to read one plane, we need to read each chunk (20 I/Os) and do it 100 times (number of rows) (doesn’t fit into cache)
Second case: to read one plane we need to read each chunk (100 I/O) and do it 100 times (for each row)
Third case: cache is bypassed; reads from the file 1000 x 100 (for each row) = 100000
Fourth case: chunk fits into cache, so for one plane we do 100 reads to bring in all chunks, then do it 100 times for each row
No difference for the first two cases and fourth one (for first two we always bring chunk into memory and uncompress), the third one fits into cache
In the third case, chunk doesn’t fit into cache and library reads directly from the file getting one element at a time (1000x 100 (# of rows) x 100 (# columns) = 10000000)