1. MANILA* AND SAHARA*: CROSSING
THE DESERT TO THE BIG DATA OASIS
Ethan Gafford, Red Hat
Jeff Applewhite, NetApp
Malini Bhandaru, Intel
covering for Weiting Chen
2. AGENDA
• Introduction
• Sahara Overview
• Manila Overview
• The goal for Sahara and Manila integration
• The approaches
• Manila HDFS Driver
• Manila NFS Share Mount
• Manila + NetApp NFS Connector for Hadoop
• Conclusion
• Q&A
Intel NetApp RedHat
3. Sahara: The Problem
Hadoop* (and Spark*, Storm*…) clusters are difficult to configure
Commodity hardware is cheap but requires frequent (costly) maintenance
Reliable hardware is expensive, and a fixed-size cluster will cause contention
Demand for data processing varies over time within an organization
Baremetal clusters go down, and can be a single point of failure
Hadoop dev is very difficult without a real cluster
TL;DR: Data processing clusters are harder to provision and maintain than they
should be, and it hurts.
4. Sahara: The Solution
Put it in a cloud!
Then have easy-to-use, standardized interfaces:
● To create clusters (reliably and repeatedly)
● To scale clusters
● To run data processing jobs
● On any popular data processing framework
● With sensible defaults that just work
● And sophisticated configuration management for expert users
That's OpenStack* Sahara.
9. Manila Share and Access APIs
Share operations:
• Create (manila create): Create a Manila share of the specified size; optionally a name, availability zone, share type, share network, or source snapshot
• Delete (manila delete): Delete an existing Manila share; manila force-delete may be required if the share is in an error state
• Edit (manila metadata): Set or unset metadata on a Manila share
• List (manila list): List all Manila shares
• Show (manila show): Show details about a Manila share
Access operations:
• Allow (manila access-allow): Allow access to the specified share for the specified access type and value (IP address, IP network in CIDR notation, or Windows user name)
• Deny (manila access-deny): Deny access to the specified share for the specified access type and value
• List (manila access-list): List all access rules for a Manila share
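A minimal sketch of the share lifecycle from the table above, as a shell session. The share name, size, protocol, and client IP are placeholder values, and `run` only prints each command instead of executing it, since a live Manila endpoint would otherwise be required:

```shell
# Sketch of the Manila share lifecycle. `run` prints each command
# rather than executing it, so no OpenStack deployment is needed to
# follow along; all names and addresses are placeholders.
run() { echo "+ $*"; }

# Create a 10 GB NFS share
run manila create NFS 10 --name hadoop-share

# Allow read/write access from one cluster node's IP address
run manila access-allow hadoop-share ip 10.0.0.5 --access-level rw

# Inspect the share and its access rules
run manila show hadoop-share
run manila access-list hadoop-share

# Clean up when finished
run manila delete hadoop-share
```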
11. The Goal for Sahara and Manila Integration
To support as many storage backends and protocols in Sahara as possible
12. Sahara Data Processing Model in Kilo*
[Diagram: three deployment patterns for virtual Hadoop clusters on OpenStack hosts.]
PATTERN 1: Internal HDFS in the same node. Compute and data reside together in the same instance in your Hadoop cluster.
PATTERN 2: Internal HDFS in different nodes. Compute and data reside in different instances. This is an elastic way to manage Hadoop clusters.
PATTERN 3: Swift*. To persist data, Sahara supports streaming the data directly to and from Swift.
13. Sahara Data Processing Model in Liberty* and the future
[Diagram: three further deployment patterns for virtual clusters.]
PATTERN 4: External HDFS via Manila*. Sahara can support external HDFS by using the HDFS driver in Manila.
PATTERN 5: Local Storage with a Diverse Storage Backend in Manila. Use local storage in Hadoop and remote-mount any type of file storage (e.g. a GlusterFS local volume) via Manila's extensible NFS driver.
PATTERN 6: NFS. The NetApp* Hadoop NFS Connector can bring NFS capability into Hadoop.
This feature will be implemented in Mitaka.
14. Manila HDFS Driver
Use Manila HDFS Driver as external storage in Sahara
15. Use Case: Manila HDFS Driver
[Diagram: compute VMs in Tenant A and Tenant B (Compute1-3) access an external HDFS cluster (name node and data nodes) through a Manila share via the HDFS driver; Steps 1-3.]
Use Case
● Use external HDFS either on the same node as the compute service or in a physical cluster
Rationale for Use
● Use the Manila HDFS driver to connect to HDFS
● Manila helps to create the HDFS share
Advantages
● Reuse an existing HDFS cluster
● Centralized management of HDFS via Manila
Limitations
● Only non-secured HDFS is supported, due to account management issues between OpenStack and Hadoop
Reference: https://blueprints.launchpad.net/manila/+spec/hdfs-driver
16. Enable HDFS Driver in Manila
Step 1: Set up the Manila configuration in /etc/manila/manila.conf
• Make sure the login username and password are correct
• The Manila service logs in to HDFS with this user and creates a share folder for each individual user
Step 2: Restart the Manila service

manila.conf example:
share_driver = manila.share.drivers.hdfs.hdfs_native.HDFSNativeShareDriver
hdfs_namenode_ip = <IP address of the HDFS namenode; only a single namenode is supported now>
hdfs_namenode_port = <port of the HDFS namenode service>
hdfs_ssh_port = <HDFS namenode SSH port>
hdfs_ssh_name = <HDFS namenode SSH login name>
hdfs_ssh_pw = <HDFS namenode SSH login password; not needed if hdfs_ssh_private_key is configured>
hdfs_ssh_private_key = <path to the private key for SSH login to the HDFS namenode>
…
Reference: http://docs.openstack.org/developer/manila/devref/hdfs_native_driver.html
17. Add External HDFS as a Data Source in Sahara
• Make sure the user account “hdfs” has been set up on the HDFS side
• Sahara uses the “hdfs” user to access external HDFS by default; you can still set up your own user account in Sahara as well
• Add the external HDFS location as a data source in Sahara
Limitation
• No further user account setup is possible, since only non-secured HDFS is currently supported
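As a hedged sketch of this step: the data-source URL is simply the namenode address plus an HDFS path. The namenode IP, port, and path below are placeholders, and the client invocation at the end is printed rather than executed, since the exact python-saharaclient syntax varies by release:

```shell
# Build a data-source URL for an external HDFS namenode.
# All values are illustrative placeholders.
NAMENODE_IP=192.168.1.10
HDFS_PATH=/user/hdfs/input
URL="hdfs://${NAMENODE_IP}:8020${HDFS_PATH}"
echo "$URL"    # prints hdfs://192.168.1.10:8020/user/hdfs/input

# Register it in Sahara. `run` prints the command instead of
# executing it; the flags shown are an assumption about the client.
run() { echo "+ $*"; }
run sahara data-source-create --name ext-hdfs --type hdfs --url "$URL"
```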
18. NFS Share Mounting
Binary storage and input / output data from Manila-provisioned NFS shares
19. The Feature
• Mount Manila NFS shares to:
• All nodes in cluster
• Specific node groups (NN, etc.)
• Currently NFS-only
• Extensible to other share types
• API (see right)
• Path and access defaults shown
• Only id field needed
shares: [
  {
    "id": "uuid",
    "path": "/mnt/uuid",
    "access_level": "rw"
  }
]
20. Use Case: Binary Data Storage
• “Job binaries”: *.jar, *.pig, etc.
  • Comparatively small size
  • Initial location irrelevant to performance
• Previous storage options in Sahara
  • Swift (still available)
  • Sahara DB (as blobs in a SQL table)
• Rationales for NFS storage
  • Version control directly on the storage FS
  • Long-term storage for use by transient clusters
21. Use Case: Input / Output Data
[Diagram: VMs in Tenants A and B, each with a local volume, mount a Manila share backed by a GlusterFS cluster (Gluster nodes with Gluster volumes); any Manila driver can back the share. GlusterFS is used as an example; Steps 1-3.]
Previous options in Sahara
● Cluster-internal HDFS
● External HDFS
● Swift
Rationales for use
● Standard FS access to data
● Convenient in many cases
Data copy necessary
● Similar to the built-in hadoop fs -put operation
● Irrelevant for heavily reduced output or small input cases
● For large inputs, network transfer is a consideration
Reference: https://blueprints.launchpad.net/sahara/+spec/manila-as-a-data-source
22. Workflow: NFS Binary Storage and Input Data
1. Create manila NFS share
2. Place binary file on share at /absolute/path/to/binary.jar
3. Create sahara job binary object with path reference
manila://share_uuid/absolute/path/to/binary.jar
4. Utilize job binary in job template (per normal)
5. Create sahara data source with path reference
manila://share_uuid/absolute/path/to/input_dir
6. Run job from template using data source
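The steps above can be sketched in shell. The share UUID and paths are placeholders; only the manila:// URL construction actually runs here, while the cluster-side commands are printed via `run`, since a live deployment would otherwise be assumed:

```shell
run() { echo "+ $*"; }

# Placeholder for the real share UUID returned by `manila create`
SHARE_ID=share-uuid

# 1-2. Create the share and place the job binary on it
run manila create NFS 10 --name edp-share
run cp binary.jar /mnt/share/absolute/path/to/binary.jar

# 3. Path reference used when creating the Sahara job binary
BINARY_URL="manila://${SHARE_ID}/absolute/path/to/binary.jar"
echo "$BINARY_URL"   # prints manila://share-uuid/absolute/path/to/binary.jar

# 5. Path reference used when creating the Sahara data source
INPUT_URL="manila://${SHARE_ID}/absolute/path/to/input_dir"
echo "$INPUT_URL"    # prints manila://share-uuid/absolute/path/to/input_dir
```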
23. Automatic Mounting
• Non-EDP users must set the shares API field to mount shares
• Sahara’s EDP API mounts needed shares to a long-standing cluster when a job references any data source or binary on that share
• Uses defaults of rw for permissions and /mnt/share_uuid/ for the path
24. Automatic Mounting: Under the Hood
All frameworks (universal flow, per cluster node):
Check that required shares are mounted. If not:
1) Install nfs-common (Debian*) or nfs-utils (Red Hat) if not present
2) Get the remote path for the share UUID from Manila
3) Manila: access-allow for each required IP in the cluster (if access does not already exist)
4) mount -t nfs %(access_arg)s %(remote_path)s %(local_path)s
All frameworks (universal flow):
• Job binaries: translate manila://uuid/absolute/path to /local_path/absolute/path
• Data sources: translate manila://uuid/absolute/path to file:///local_path/absolute/path
Hadoop (w/ Oozie):
• Job binaries: hadoop fs -copyFromLocal into the workflow directory; referenced as filesystem paths in the workflow
• Data sources: use the file URL in the Oozie workflow document (as a named job parameter or positional argument)
Spark:
• Job binaries: referenced by local filesystem path in the spark-submit call
• Data sources: use the file URL in the spark-submit call (as a positional argument)
Storm:
• Job binaries: referenced as filesystem paths in the storm jar call
• Data sources: use the file URL in the storm jar call (as a positional argument)
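The per-node universal flow can be sketched as a short script. The remote path, local mount point, and client IP are placeholders, and commands are printed via `run` rather than executed, since root access and a live Manila share are assumed otherwise:

```shell
# Sketch of the per-node mount flow (Debian-family shown;
# Red Hat systems would install nfs-utils instead).
run() { echo "+ $*"; }

SHARE_UUID=share-uuid                        # placeholder
LOCAL_PATH=/mnt/${SHARE_UUID}
REMOTE_PATH=10.0.0.20:/shares/${SHARE_UUID}  # looked up from Manila

# 1) Install the NFS client if not present
run sudo apt-get install -y nfs-common
# 2-3) Remote path lookup and access grant happen via the Manila API
run manila access-allow ${SHARE_UUID} ip 10.0.0.5
# 4) Mount the share at the default local path
run sudo mkdir -p "$LOCAL_PATH"
run sudo mount -t nfs "$REMOTE_PATH" "$LOCAL_PATH"
```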
30. NetApp Hadoop NFS Connector
Future Proposal: Use NetApp Hadoop NFS
Connector in Sahara
31. 31
NetApp NFS Connector - Architecture Overview
● NFS Client written in Java
● Implements the Hadoop filesystem API
● No changes to Hadoop framework
● No changes to user programs
● Eliminates copying data into HDFS
● Optimized performance for NFS access
32. Sahara + Manila + NetApp NFS Connector
[Diagram: VMs in Tenants A and B (Compute1-3), each running the NetApp NFS Connector, access NFS folders on NFS nodes through a Manila share provisioned by the NetApp NFS driver; Steps 1-3.]
Use Case
● Use the NFS protocol to access data for Hadoop
How to use
1. Use Manila to expose the NFS share
2. Use the NetApp Hadoop NFS Connector as the “interface” to shared data
Advantages
● NFS is one of the most common storage protocols used in IT
● A direct way to communicate with and process data instead of using HDFS
Reference: https://blueprints.launchpad.net/sahara/+spec/nfs-as-a-data-source
34. NetApp Hadoop NFS Plugin
Use the NetApp NFS Connector to run Hadoop on your existing data:
• $ hadoop jar <path-to-examples-jar> terasort nfs://<nfs-server-hostname>:2049/tera/in /tera/out
• $ hadoop jar <path-to-examples-jar> terasort nfs://<nfs-server-hostname>:2049/tera/in nfs://<nfs-server-hostname>:2049/tera/out
References:
1. http://www.netapp.com/us/solutions/big-data/nfs-connector-hadoop.aspx
2. https://github.com/NetApp/NetApp-Hadoop-NFS-Connector
35. Summary
● The choices:
a) Manila HDFS Driver
b) Manila NFS Share Mount
   https://www.netapp.com/us/media/tr-4464.pdf
c) NetApp NFS Connector for Hadoop
   https://github.com/NetApp/NetApp-Hadoop-NFS-Connector
Sahara and Manila:
Access the Big Data Oasis
37. Participating in the Intel Passport Program?
Are you playing? Be sure to get your Passport Stamp for attending this session! See me or my helper in the back at the end!
Not playing yet? What are you waiting for? See me or my helper in the back at the end and we can get you started!
Don’t forget to return your stamped passport to the Intel Booth #H3 to enter our raffle drawing! 3 Stamps = 1 Raffle Ticket.