1. © Hortonworks Inc. 2011 - 2015
Democratizing Memory Storage
Arpit Agarwal
arp@apache.org
@aagarw
2. © Hortonworks Inc. 2011 - 2015
HDFS Heterogeneous Storage Media
3. © Hortonworks Inc. 2011 - 2015
HDFS Heterogeneous Storage (Continued)
• Introduced in Apache Hadoop 2.3
• Memory introduced as a storage medium
–RAM Disk provides retention across process restarts
• Memory is treated differently due to its transient nature
–More on this later
4. © Hortonworks Inc. 2011 - 2015
HDFS Heterogeneous Storage (Continued)
• Rich storage media policies introduced in Hadoop 2.6
• Applications can target different storage media
• Set the policy on an individual file or directory sub-tree
–setStoragePolicy API
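A minimal sketch of setting a policy through the Java API (the path is illustrative; the built-in policy names in HDFS are spelled ONE_SSD, ALL_SSD, etc., and in Hadoop 2.6 the call lives on DistributedFileSystem):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.DistributedFileSystem;

public class StoragePolicyExample {
  public static void main(String[] args) throws Exception {
    // Assumes fs.defaultFS points at an HDFS cluster with SSD-tagged volumes.
    DistributedFileSystem dfs =
        (DistributedFileSystem) FileSystem.get(new Configuration());
    // Tag a directory sub-tree; files created under it inherit the policy.
    dfs.setStoragePolicy(new Path("/warehouse/hot"), "ONE_SSD");
  }
}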
5. © Hortonworks Inc. 2011 - 2015
HDFS Heterogeneous Storage (Continued)
• Example built-in policies
– DEFAULT – All replicas on DISK
– ONESSD – One replica on SSD, rest on DISK
– ALLSSD – All replicas on SSD
– COLD – All replicas on Archival Storage
– LAZY_PERSIST – 1 replica in local memory, lazy write to disk
6. © Hortonworks Inc. 2011 - 2015
• Why not rely on the OS page cache?
7. © Hortonworks Inc. 2011 - 2015
• Scan workloads invalidate the page cache
–HDFS uses buffered IO for reads and writes
• Control the eviction scheme
• Permit further optimizations
–Checksum computation off the hot path
–Collocate data and computation
8. © Hortonworks Inc. 2011 - 2015
Centralized Cache Management (CCM)
• Introduced in Hadoop 2.3
• Pin hot data to memory
9. © Hortonworks Inc. 2011 - 2015
CCM (Continued)
• Administrator configures cache pools
• User issues commands to manage the contents of pools
• Users specify which files or directories are hot
–HDFS loads file contents into memory
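A hedged sketch of the admin and user steps through the Java API (the pool name, limit, and path below are hypothetical):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.DistributedFileSystem;
import org.apache.hadoop.hdfs.protocol.CacheDirectiveInfo;
import org.apache.hadoop.hdfs.protocol.CachePoolInfo;

public class CcmExample {
  public static void main(String[] args) throws Exception {
    DistributedFileSystem dfs =
        (DistributedFileSystem) FileSystem.get(new Configuration());
    // Admin step: a pool bounds how much memory its users may pin.
    dfs.addCachePool(new CachePoolInfo("reports")
        .setLimit(10L * 1024 * 1024 * 1024)); // 10 GB
    // User step: ask HDFS to load the directory's blocks into memory.
    long id = dfs.addCacheDirective(new CacheDirectiveInfo.Builder()
        .setPath(new Path("/warehouse/hot"))
        .setPool("reports")
        .build());
    System.out.println("Cache directive id: " + id);
  }
}

The hdfs cacheadmin CLI exposes the same pool and directive operations.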
10. © Hortonworks Inc. 2011 - 2015
CCM (Continued)
11. © Hortonworks Inc. 2011 - 2015
CCM (Continued)
• Eliminate checksum computations during read
–Checksums used to flag disk and network errors
–HDFS will pre-verify checksums when caching data from disk
• The DataNode and the HDFS client use shared memory segments to communicate which blocks are cached
12. © Hortonworks Inc. 2011 - 2015
CCM (Continued)
• Enables short-circuit and zero-copy reads from memory to avoid RPC overhead
• Short-circuit reads are transparent to applications
• Zero-copy read API
–ByteBuffer read(ByteBufferPool factory, int maxLength, EnumSet<ReadOption> opts);
–void releaseBuffer(ByteBuffer buffer);
• E.g. Apache Hive uses ZCR for ORC files
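A minimal sketch of the zero-copy read path (the file path and buffer size are illustrative; the call transparently falls back to a copying read when zero-copy is not possible):

import java.nio.ByteBuffer;
import java.util.EnumSet;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.ReadOption;
import org.apache.hadoop.io.ByteBufferPool;
import org.apache.hadoop.io.ElasticByteBufferPool;

public class ZcrExample {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    ByteBufferPool pool = new ElasticByteBufferPool();
    try (FSDataInputStream in = fs.open(new Path("/warehouse/hot/part-0"))) {
      // SKIP_CHECKSUMS is safe for cached blocks: they were pre-verified.
      ByteBuffer buf = in.read(pool, 4 * 1024 * 1024,
          EnumSet.of(ReadOption.SKIP_CHECKSUMS));
      if (buf != null) {
        // ... process the buffer, then return it to the stream ...
        in.releaseBuffer(buf);
      }
    }
  }
}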
13. © Hortonworks Inc. 2011 - 2015
HDFS Lazy Persist Writes
• HDFS feature introduced in Apache Hadoop 2.6
• Exposed via Storage Policies
–Set the LAZY_PERSIST policy on a file or directory
14. © Hortonworks Inc. 2011 - 2015
HDFS Lazy Persist Writes (continued)
• Applications can write to files in memory
• HDFS writes the data to persistent storage off the hot path
–Applications see memory-speed write latency
• Expected to be used with single replica writes
–Latency benefits negated by pipeline replication over the network
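A minimal sketch of a lazy persist write using the CreateFlag variant, an alternative to setting the policy on the parent directory (the path and sizes are illustrative, and the DataNode must have a RAM disk configured):

import java.util.EnumSet;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.CreateFlag;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.permission.FsPermission;

public class LazyPersistExample {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    try (FSDataOutputStream out = fs.create(
        new Path("/tmp/scratch/part-0"),
        FsPermission.getFileDefault(),
        EnumSet.of(CreateFlag.CREATE, CreateFlag.LAZY_PERSIST),
        4096,               // buffer size
        (short) 1,          // single replica: no pipeline over the network
        128 * 1024 * 1024,  // block size
        null)) {            // no progress callback
      out.write("written at memory speed".getBytes("UTF-8"));
    }
  }
}

If RAM disk space is exhausted, HDFS falls back to writing the replica to disk.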
15. © Hortonworks Inc. 2011 - 2015
HDFS Lazy Persist Writes (Continued)
16. © Hortonworks Inc. 2011 - 2015
HDFS Lazy Persist Writes (Continued)
• Best-effort persistence with retention across process restarts
• Data loss is rare but possible – e.g. node restart, network partition
–Recovery pushed to compute framework layers
• Adoption by Apache projects
–Hive in-memory tables
–Low latency persistence for Spark RDDs
17. © Hortonworks Inc. 2011 - 2015
Areas of Improvement
• Cache data on read as opposed to pinning on demand
• Short-circuit writes
–Eliminate Hadoop RPC overhead for writes
• Isolate applications from HDFS APIs
18. © Hortonworks Inc. 2011 - 2015
Areas of Improvement
• Challenging to modify compute frameworks to use memory storage
• Address use cases beyond intermediate data
–When to cache?
–Frameworks do not know
• The application or the user knows
• Let the user decide
–E.g. jobfoo input=memfs://… tmp=memfs://… output=hdfs://…
19. © Hortonworks Inc. 2011 - 2015
Memfs – A Layered File System
• Planned for Apache Hadoop 2.9
• A thin HCFS that can layer over any other HCFS
• Transparently uses HDFS memory features when available
• HDFS has used layered FS approach before
–ViewFS, ChecksumFS
20. © Hortonworks Inc. 2011 - 2015
• Memfs paths correspond 1:1 to underlying FS paths
–E.g. memfs://results.txt maps to hdfs://results.txt
• Reading a file via Memfs loads it into DataNode RAM
• Writing a file via Memfs transparently uses the LAZY_PERSIST storage policy for low latency writes
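A hypothetical sketch of the layered approach (not the real Memfs code; FilterFileSystem is the existing Hadoop base class for wrapping another FS):

import java.io.IOException;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.FilterFileSystem;
import org.apache.hadoop.fs.Path;

public class MemFileSystem extends FilterFileSystem {
  public MemFileSystem(FileSystem base) {
    super(base); // every operation funnels through the base FS 1:1
  }

  @Override
  public String getScheme() {
    return "memfs";
  }

  @Override
  public FSDataInputStream open(Path f, int bufferSize) throws IOException {
    // A real Memfs would also request caching of f's blocks here so
    // that subsequent reads are served from DataNode memory.
    return super.open(f, bufferSize);
  }
}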
21. © Hortonworks Inc. 2011 - 2015
Memfs Benefits
• Beyond the typical use case of intermediate data
• Isolate applications from HDFS APIs
–Lets us evolve HDFS support over time
• Lightweight - no state maintained outside of the base FS
22. © Hortonworks Inc. 2011 - 2015
Memfs Benefits (Continued)
• All IO is channeled through the base FS in the user’s security context
• Behavior can be controlled by configuration
–E.g. Administrator configures separate cache pools for Memfs
–Move the pool selection logic to Memfs
• Future Memfs implementations using other base HCFS are possible
–May not be as lightweight
23. © Hortonworks Inc. 2011 - 2015
Spark RDD
• Spark Resilient Distributed Datasets
• Lineage information for fault tolerance is recorded with the RDD
–Lost data recomputed via Lineage
• HDFS Lazy Persist writes can complement Spark RDD as a low latency
backing store (SPARK-6479)
24. © Hortonworks Inc. 2011 - 2015
Tachyon
• Tachyon is also a layered file system
–Powerful idea
• Works best when data is guaranteed to fit in memory
• Introduces the concept of Lineage
–Optional, but required for persistence and recovery
–Memfs is designed to rely on recovery built into the framework layers for rare failures
25. © Hortonworks Inc. 2011 - 2015
Credits
• Heterogeneous Storage Media
– Tsz Wo (Nicholas) Sze, Hortonworks (szetszwo@apache.org)
– Sanjay Radia, Hortonworks (sradia@apache.org)
– Suresh Srinivas, Hortonworks (suresh@apache.org)
– Junping Du, Hortonworks (junping_du@apache.org)
• Rich Storage Policies
– Jing Zhao, Hortonworks (jing9@apache.org)
– Tsz Wo (Nicholas) Sze, Hortonworks (szetszwo@apache.org)
• CCM
– Andrew Wang, Cloudera (wang@apache.org)
– Colin Mccabe, Cloudera (cmccabe@apache.org)
– Chris Nauroth, Hortonworks (cnauroth@apache.org)
• Lazy Persist Writes
– Jitendra Pandey, Hortonworks (jitendra@apache.org)
– Sanjay Radia, Hortonworks (sradia@apache.org)
– Xiaoyu Yao, Hortonworks (xyao@apache.org)
– Gopal V, Hortonworks (gopalv@apache.org)
26. © Hortonworks Inc. 2011 - 2015
Slides URL
• http://s.apache.org/mem-2015
28. © Hortonworks Inc. 2011 - 2015
Apache Hadoop File Systems primer (Bonus)
• FileSystem interface captures common FS operations
• Any conforming implementation is a Hadoop Compatible File System (HCFS)
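As a small illustration (argument handling is minimal), any HCFS is resolved from the path's URI scheme through the same interface:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HcfsExample {
  public static void main(String[] args) throws Exception {
    // e.g. hdfs://nn:8020/data or file:///tmp/data
    Path p = new Path(args[0]);
    FileSystem fs = p.getFileSystem(new Configuration());
    System.out.println(p.toUri().getScheme() + " -> " + fs.getClass().getName());
  }
}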
29. © Hortonworks Inc. 2011 - 2015
• HDFS is the canonical Hadoop FS
• Ships with Apache Hadoop and implements the complete set of features exposed by the FileSystem interface, e.g.
–Snapshots
–Heterogeneous Storage Media
–Extended Attributes
–POSIX ACLs
• Supports Kerberos Authentication in Secure Mode
Editor's Notes
• Storage policies can be set by unprivileged users. HDFS also supports quotas on storage media, which are set by the administrator.
• Memory-mapped files are another option. They work well for reads but do not work well with the existing HDFS write pipeline.
• Cache pools are analogous to HDFS quotas, but not quite the same. Cache pools allow administrators to control which users can use memory resources.
• These two problems are relatively easy to solve. We don't want to indiscriminately target all input or output data to memory.
• Frameworks lack application context, such as which data will be accessed often or the expected output size of a given job.
• Let's say we have a hypothetical file system called memfs which performs caching IO on both the read and write paths.