Did someone just order Hadoop?
- Best practices from the field
uweseiler
About me
• Big Data Nerd
• Travelpirate
• Photography Enthusiast
• Hadoop Trainer
• NoSQL Fan Boy
About us
specializes in...
• Big Data Nerds
• Agile Ninjas
• Continuous Delivery Gurus
• Enterprise Java Specialists
• Performance Geeks
Join us!
Agenda
• Basics
– Software I
– Architecture & Rack Design
– Hardware & Cluster Sizing
– Software II
• Advanced
– Data Ingestion
– Operation & Monitoring
– Security
Deployment Options
• On Premise
• Hadoop Appliance
• Hadoop Hosting
• Hadoop as a Service
(ordered from Bare Metal to Cloud)
Hadoop Distributions
Cloudera vs. Hortonworks
• Guess what: Both will do the job!
• Which ideology do you prefer? “Closed” Source vs. Open Source
• Pricing model: Software + Support vs. Support
Create the Big Picture
• Platform for Data Exploration
• ETL
• Visualization
• Data Warehouse
• …
Pick your Hadoop Stack
• Components: HDFS, MapReduce, Tez, Spark, Pig, Hive, YARN, Solr, Sqoop, NFS Gateway, Falcon, Ambari, Oozie, Knox, Ranger, Ganglia, Nagios, ZooKeeper, Journal Nodes, MySQL
• Layers: Data Storage, Data Processing, Search, Security, Data Ingestion & Governance, Workflow Mgmt., Cluster Management Services, Monitoring
Rack Design (without HA)
• Two racks (Rack 1, Rack 2), each with 1 x ToR switch (Nexus 3K) and 1 x mgmt. network switch (Cisco Catalyst 2960)
• 5 x master nodes: NameNode, Secondary NameNode, ResourceManager, Mgmt. Server, Gateway Server
• Worker nodes: 5 in one rack, 6 in the other
Rack Design (with HA)
• Two racks (Rack 1, Rack 2), each with 1 x ToR switch (Nexus 3K) and 1 x mgmt. network switch (Cisco Catalyst 2960)
• 4 x master nodes: NameNode (Active), ResourceManager (Active), Mgmt. Server, Gateway Server
• 2 x standby HA nodes: NameNode (Passive), ResourceManager (Passive)
• Worker nodes: 5 in one rack, 6 in the other
Service Mapping
• Worker Nodes: HDFS DataNode, YARN NodeManager, Hadoop Client Libraries
• HDFS NameNode (Active): NameNode (Active), ZooKeeper Server, Journal Node, Hadoop Client Libraries
• HDFS NameNode (Passive): NameNode (Passive), ZooKeeper Server, Journal Node, Hadoop Client Libraries
• YARN ResourceManager (Active): ResourceManager, App Timeline Server, MapReduce2 History Server, ZooKeeper Server, Journal Node, Hadoop Client Libraries
• YARN ResourceManager (Passive): ResourceManager, App Timeline Server, MapReduce2 History Server, ZooKeeper Server, Journal Node, Hadoop Client Libraries
• Management Server: MySQL Server (Hive MetaStore, Oozie, Ganglia), HiveServer2, Oozie Server, Ganglia Server, Nagios Server, ZooKeeper Server, Journal Node, Kerberos, Hadoop Client Libraries
• Gateway Server: Hue Server, Ambari Server, NFS Gateway Server, WebHCat Server, WebHDFS, Falcon, Sqoop, Solr, Hadoop Client Libraries
Hardware
• Get good-quality commodity hardware!
• Buy the sweet spot in pricing: 3 TB disks, 128 GB RAM, 8-12 core CPUs
– More memory is better. Always.
• Scale horizontally first, then vertically (1U with 6 disks vs. 2U with 12 disks)
– Get to at least 30-40 machines or 3-4 racks
• Don't forget about rack size (42U) and power consumption.
• Use a pilot cluster to learn about load patterns
– Balanced workload
– Compute intensive
– I/O intensive
It's about storage
• Raw capacity per disk: 3.00 TB
• Minus intermediate data (~25%, 0.75 TB): 2.25 TB
• Divided by HDFS replication factor 3: 0.75 TB usable per disk
• x 12 disks x 11 Data Nodes = 99 TB usable
• Compression: …well, it depends…
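The sizing arithmetic on this slide can be reproduced with a few lines of Python. This is only a sketch: the function name and parameter names are made up for illustration, and the default 25% intermediate-data share and replication factor 3 are the slide's figures, not universal constants. Compression is left out because the slide does not quantify it.

```python
def usable_hdfs_capacity_tb(disk_tb, disks_per_node, data_nodes,
                            intermediate_fraction=0.25, replication=3):
    """Usable HDFS capacity in TB, following the slide's arithmetic."""
    # Reserve part of each disk for intermediate (shuffle/temp) data.
    after_intermediate = disk_tb * (1 - intermediate_fraction)
    # Every block is stored `replication` times.
    usable_per_disk = after_intermediate / replication
    return usable_per_disk * disks_per_node * data_nodes


if __name__ == "__main__":
    # 3 TB disks, 12 disks per node, 11 Data Nodes as on the slide.
    print(usable_hdfs_capacity_tb(3.0, 12, 11))
```

Changing `data_nodes` or `disks_per_node` shows quickly how much raw disk a target capacity needs.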
It's about Zen
• CPU: Xeon E5-2660 v2 (10 cores, 4 memory channels)
• Memory: 8 x 16 GB
• Storage: 12 disks
Hardware
HDFS NameNode + HDFS Secondary NN + YARN ResourceManager
• CPU: 2 x 3+ GHz with 8+ cores
• Memory: 128 GB (DDR3, ECC)
• Storage: 2 x 1+ TB (RAID 1, OS), 1 x 1 TB (Hadoop logs), 1 x 1 TB (ZooKeeper), 1 x 3 TB (HDFS)
• Network: 2 x bonded 10 GbE NICs, 1 x 1 GbE NIC (for mgmt.)
Management Server + Gateway Server
• CPU: 2 x 3+ GHz with 8+ cores
• Memory: 128 GB (DDR3, ECC)
• Storage: 2 x 1+ TB (RAID 1, OS), 1 x 1 TB (Hadoop logs), 1 x 3 TB (HDFS)
• Network: 2 x bonded 10 GbE NICs, 1 x 1 GbE NIC (for mgmt.)
Worker Nodes
• CPU: 2 x 2.6+ GHz with 8+ cores
• Memory: 128 GB (DDR3, ECC)
• Storage: 2 x 1+ TB (RAID 1, OS), 10 x 3 TB (HDFS); if the disk chassis allows, 12 x 3 TB (HDFS)
• Network: 2 x bonded 10 GbE NICs, 1 x 1 GbE NIC (for mgmt.)
Example: IBM x3650 series
Master Nodes
Data Nodes
Operating System
Linux File System
• Ext3
• Ext4
• XFS with the noatime, inode64 and nobarrier mount options
– Possibly better performance, but be aware of delayed data allocation (consider turning off the delalloc option in /etc/fstab)
OS Optimizations
• Of course, these depend on your OS choice
• Specific recommendations are available from OS vendors
• Common recommendations
– No physical I/O scheduling (it competes with virtual/HDFS I/O scheduling), e.g. use the NOOP scheduler
– Set vm.swappiness to 0
– Set the number of file handles (ulimit, soft + hard) to 16384 (Data Nodes) / 65536 (Master Nodes)
– Set the number of pending connections (net.core.somaxconn) to 1024
– Use Jumbo Frames (MTU = 9000)
– Consider network bonding (802.3ad)
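One way to keep these kernel settings from drifting is a small check script. The sketch below covers only the two sysctl values named on the slide and assumes a Linux host where sysctl keys are exposed under /proc/sys; the dictionary and function names are illustrative, not part of any Hadoop tooling.

```python
from pathlib import Path

# Values from the slide; the name of this dict is only illustrative.
RECOMMENDED_SYSCTLS = {
    "vm.swappiness": "0",
    "net.core.somaxconn": "1024",
}


def sysctl_path(key, root="/proc/sys"):
    """Map a sysctl key such as 'vm.swappiness' to its /proc/sys file."""
    return Path(root).joinpath(*key.split("."))


def check_sysctls(recommended, read=lambda p: p.read_text().strip()):
    """Return {key: (actual, wanted)} for every setting that differs."""
    mismatches = {}
    for key, wanted in recommended.items():
        try:
            actual = read(sysctl_path(key))
        except OSError:
            actual = None  # file missing or unreadable on this system
        if actual != wanted:
            mismatches[key] = (actual, wanted)
    return mismatches


if __name__ == "__main__":
    for key, (actual, wanted) in check_sysctls(RECOMMENDED_SYSCTLS).items():
        print(f"{key}: found {actual}, slide recommends {wanted}")
```

The `read` parameter exists so the check can be exercised without touching the real /proc file system.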
Java
• Oracle JDK 1.7 (64-bit)
• Oracle JDK 1.6 (64-bit)
• OpenJDK 7 (64-bit)
Java Optimizations
• Use a 64-bit JVM for all daemons
– Compressed OOPs are enabled by default (Java 6u23+)
• Java heap size
– Set Xmx == Xms
– Avoid the Java defaults for NewSize and MaxNewSize
– Use 1/8 to 1/6 of the max heap size for JVMs larger than 4 GB
– Configure -XX:PermSize=128m, -XX:MaxPermSize=256m
• Use a low-latency garbage collector
– Set -XX:+UseConcMarkSweepGC, -XX:ParallelGCThreads=<N>
– Use a high <N> on NameNode & ResourceManager
• Useful for debugging
– -verbose:gc -Xloggc:<file> -XX:+PrintGCDetails
– -XX:ErrorFile=<file>
– -XX:+HeapDumpOnOutOfMemoryError
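The heap and GC rules above can be turned into a flag string mechanically. The following is a rough sketch, assuming the JDK 6/7 versions listed in this deck (the PermSize flags no longer apply from Java 8 on, which removed the permanent generation). The function name, its parameters and the choice of 1/8 as the default young-generation share are illustrative assumptions.

```python
def jvm_opts(heap_mb, gc_threads, new_fraction=1 / 8):
    """Assemble daemon JVM flags following the slide's recommendations."""
    opts = [
        f"-Xms{heap_mb}m",
        f"-Xmx{heap_mb}m",  # Xmx == Xms, so the heap never resizes
        "-XX:PermSize=128m",
        "-XX:MaxPermSize=256m",
        "-XX:+UseConcMarkSweepGC",
        f"-XX:ParallelGCThreads={gc_threads}",
    ]
    if heap_mb > 4096:
        # Slide: 1/8 to 1/6 of the max heap for JVMs larger than 4 GB.
        new_mb = int(heap_mb * new_fraction)
        opts += [f"-XX:NewSize={new_mb}m", f"-XX:MaxNewSize={new_mb}m"]
    return " ".join(opts)


if __name__ == "__main__":
    print(jvm_opts(heap_mb=8192, gc_threads=8))
```

For heaps of 4 GB or less the sketch leaves the young-generation size untouched, since the slide only gives a ratio for larger JVMs.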
Hadoop Configuration
• Multiple redundant directories for NameNode metadata
– One of the dfs.namenode.name.dir directories should be on NFS
– Soft-mount NFS with tcp,soft,intr,timeo=20,retrans=5
• Take periodic backups of NameNode metadata
– Make copies of the entire storage directory
• Set dfs.datanode.failed.volumes.tolerated to a value above 0 (it takes a number of volumes, not true/false)
– A disk failure is then no longer a complete DataNode failure
– Especially important for high-density nodes
• Set dfs.namenode.name.dir.restore=true
– Restores the NN storage directory during checkpointing
• Reserve a lot of disk space for NameNode logs
– Hadoop logging is verbose: set aside multiple GBs
– NameNode logs roll within minutes, which makes issues hard to debug
• Use version control for configuration!
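As an illustration of how these settings end up in hdfs-site.xml, the sketch below renders a property dictionary into Hadoop's `<configuration>` XML layout using only the Python standard library. The directory paths and the tolerated-volume count of 1 are hypothetical placeholders; the property names are the ones from the slide. Generating the file from a script also fits the advice to keep configuration under version control.

```python
import xml.etree.ElementTree as ET


def hdfs_site(properties):
    """Render a dict of settings as Hadoop-style <configuration> XML."""
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = str(value)
    return ET.tostring(root, encoding="unicode")


EXAMPLE = hdfs_site({
    # Hypothetical paths: one local directory plus one NFS mount.
    "dfs.namenode.name.dir": "/data/1/dfs/nn,/mnt/nfs/dfs/nn",
    # Hypothetical count: tolerate one failed volume per DataNode.
    "dfs.datanode.failed.volumes.tolerated": 1,
    "dfs.namenode.name.dir.restore": "true",
})

if __name__ == "__main__":
    print(EXAMPLE)
```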
Options for Data Ingestion
• MapReduce
• WebHDFS
• hadoop fs -put
• NFS Gateway
• hadoop distcp
• Connectors (Oracle, Teradata, SQL Server, et al.)
• …
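Of the options listed, WebHDFS is the one that needs no Hadoop client on the sending side, because it is a plain REST API under the `/webhdfs/v1` path prefix. The sketch below only builds the request URL and does not contact a cluster. The host name, port and file path are hypothetical, and the helper name is made up for illustration. Note that creating a file is a two-step PUT: the NameNode answers the first request with a redirect to a DataNode, which then receives the data.

```python
from urllib.parse import quote, urlencode


def webhdfs_url(host, port, hdfs_path, op, **params):
    """Build a WebHDFS REST URL for the given HDFS path and operation."""
    query = urlencode({"op": op, **params})
    return f"http://{host}:{port}/webhdfs/v1{quote(hdfs_path)}?{query}"


if __name__ == "__main__":
    # Hypothetical NameNode address and target file.
    print(webhdfs_url("namenode.example.com", 50070,
                      "/landing/events.csv", "CREATE", overwrite="true"))
```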
Operation
• Apache Ambari
• Cloudera Manager
Monitoring
• The basics: Nagios, Ganglia, Ambari/Cloudera Manager, Hue
• Admins need to understand the principles behind Hadoop and learn their tool set: fsck, dfsadmin, …
• Monitor the hardware usage for your workload
– Disk I/O, network I/O, CPU and memory usage
– Use this information when expanding cluster capacity
• Monitor the usage with Hadoop metrics
– JVM metrics: GC times, memory used, thread status
– RPC metrics: especially latency to track slowdowns
– HDFS metrics: used storage, # of files & blocks, cluster load, file system operations
– Job metrics: slot utilization and job status
• Tweak configurations during upgrades & maintenance windows on an ongoing basis
• Establish regular performance tests
– Use Oozie to run standard tests like TeraSort, TestDFSIO, HiBench, …
Security today
• Authentication: control access to the cluster
– Kerberos in native Apache Hadoop
– Perimeter security with Apache Knox (LDAP, SSO)
• Authorization: restrict access to explicit data
• Audit: understand who did what
– Native in Apache Hadoop: HDFS permissions + ACLs, queues + job ACLs, process execution audit trail
– Fine-grained role-based authorization: Hive, Apache Sentry, Apache Accumulo
– Service-level authorization with Knox
– Central security policies with Ranger
• Data Protection: encrypt data at rest & in motion
– Wire encryption in native Apache Hadoop
– Wire encryption with Knox
– Orchestrated encryption with 3rd-party tools
Apache Knox
[Diagram: Client, Firewall, DMZ with Knox (SSO, LDAP), Firewall, Hadoop Cluster; access paths labelled REST and SSH]
• REST APIs: WebHDFS, WebHCat, Oozie, Hive, YARN
• Hadoop Cluster: HDFS, MapReduce, Tez, Spark, Pig, Hive, YARN, Solr, Ambari, Oozie, Knox, Ranger, Ganglia, Nagios, ZooKeeper, Journal Nodes
Data Boxing
[Diagram: a raw data layer plus separate areas for Division 1 and Division 2, annotated with the access levels "Read & Write", "Read" and "--"]
Set up data boxing using
• Users & Groups
• HDFS Permissions & ACLs
• Higher level where applicable
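To make the data boxing idea concrete, the sketch below generates `hdfs dfs -setfacl -m` commands from a small access matrix. Everything in the matrix is an assumption made for illustration: the directory layout, the group names, and the reading of the diagram as "each division reads the raw layer, writes its own box, and has no access to the other division's box". HDFS ACLs also have to be enabled on the NameNode (dfs.namenode.acls.enabled=true) before such commands take effect.

```python
def setfacl_commands(boxes):
    """Turn {path: {group: permission}} into hdfs setfacl commands."""
    commands = []
    for path, grants in boxes.items():
        # -m takes a comma-separated list of ACL entries.
        spec = ",".join(f"group:{group}:{perm}" for group, perm in grants.items())
        commands.append(f"hdfs dfs -setfacl -m {spec} {path}")
    return commands


# Hypothetical layout and group names, not taken from the slides.
BOXES = {
    "/data/raw": {"ingest": "rwx", "division1": "r-x", "division2": "r-x"},
    "/data/division1": {"division1": "rwx", "division2": "---"},
    "/data/division2": {"division2": "rwx", "division1": "---"},
}

if __name__ == "__main__":
    for command in setfacl_commands(BOXES):
        print(command)
```

Printing the commands instead of running them keeps the sketch side-effect free; they can be reviewed and checked into version control before being applied.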
Apache Ranger
• File-level access control
• Control permissions
• Supports: HDFS, Hive, HBase, Storm, Knox
Thanks for listening
Twitter: @uweseiler
Mail: uwe.seiler@codecentric.de
XING: https://www.xing.com/profile/Uwe_Seiler

Contenu connexe

Tendances

Tendances (20)

Hadoop configuration & performance tuning
Hadoop configuration & performance tuningHadoop configuration & performance tuning
Hadoop configuration & performance tuning
 
Apache Hadoop 3 updates with migration story
Apache Hadoop 3 updates with migration storyApache Hadoop 3 updates with migration story
Apache Hadoop 3 updates with migration story
 
From docker to kubernetes: running Apache Hadoop in a cloud native way
From docker to kubernetes: running Apache Hadoop in a cloud native wayFrom docker to kubernetes: running Apache Hadoop in a cloud native way
From docker to kubernetes: running Apache Hadoop in a cloud native way
 
Hadoop Operations for Production Systems (Strata NYC)
Hadoop Operations for Production Systems (Strata NYC)Hadoop Operations for Production Systems (Strata NYC)
Hadoop Operations for Production Systems (Strata NYC)
 
Hadoop Backup and Disaster Recovery
Hadoop Backup and Disaster RecoveryHadoop Backup and Disaster Recovery
Hadoop Backup and Disaster Recovery
 
Improving Hadoop Performance via Linux
Improving Hadoop Performance via LinuxImproving Hadoop Performance via Linux
Improving Hadoop Performance via Linux
 
Big data processing meets non-volatile memory: opportunities and challenges
Big data processing meets non-volatile memory: opportunities and challenges Big data processing meets non-volatile memory: opportunities and challenges
Big data processing meets non-volatile memory: opportunities and challenges
 
Hortonworks.Cluster Config Guide
Hortonworks.Cluster Config GuideHortonworks.Cluster Config Guide
Hortonworks.Cluster Config Guide
 
Hadoop Operations - Best Practices from the Field
Hadoop Operations - Best Practices from the FieldHadoop Operations - Best Practices from the Field
Hadoop Operations - Best Practices from the Field
 
Advanced Security In Hadoop Cluster
Advanced Security In Hadoop ClusterAdvanced Security In Hadoop Cluster
Advanced Security In Hadoop Cluster
 
Hoodie: Incremental processing on hadoop
Hoodie: Incremental processing on hadoopHoodie: Incremental processing on hadoop
Hoodie: Incremental processing on hadoop
 
Hadoop 3 (2017 hadoop taiwan workshop)
Hadoop 3 (2017 hadoop taiwan workshop)Hadoop 3 (2017 hadoop taiwan workshop)
Hadoop 3 (2017 hadoop taiwan workshop)
 
HDFS Tiered Storage: Mounting Object Stores in HDFS
HDFS Tiered Storage: Mounting Object Stores in HDFSHDFS Tiered Storage: Mounting Object Stores in HDFS
HDFS Tiered Storage: Mounting Object Stores in HDFS
 
Improving HDFS Availability with Hadoop RPC Quality of Service
Improving HDFS Availability with Hadoop RPC Quality of ServiceImproving HDFS Availability with Hadoop RPC Quality of Service
Improving HDFS Availability with Hadoop RPC Quality of Service
 
2013 Nov 20 Toronto Hadoop User Group (THUG) - Hadoop 2.2.0
2013 Nov 20 Toronto Hadoop User Group (THUG) - Hadoop 2.2.02013 Nov 20 Toronto Hadoop User Group (THUG) - Hadoop 2.2.0
2013 Nov 20 Toronto Hadoop User Group (THUG) - Hadoop 2.2.0
 
Hadoop 3 @ Hadoop Summit San Jose 2017
Hadoop 3 @ Hadoop Summit San Jose 2017Hadoop 3 @ Hadoop Summit San Jose 2017
Hadoop 3 @ Hadoop Summit San Jose 2017
 
Spark SQL versus Apache Drill: Different Tools with Different Rules
Spark SQL versus Apache Drill: Different Tools with Different RulesSpark SQL versus Apache Drill: Different Tools with Different Rules
Spark SQL versus Apache Drill: Different Tools with Different Rules
 
Introduction to Cloudera's Administrator Training for Apache Hadoop
Introduction to Cloudera's Administrator Training for Apache HadoopIntroduction to Cloudera's Administrator Training for Apache Hadoop
Introduction to Cloudera's Administrator Training for Apache Hadoop
 
Troubleshooting Hadoop: Distributed Debugging
Troubleshooting Hadoop: Distributed DebuggingTroubleshooting Hadoop: Distributed Debugging
Troubleshooting Hadoop: Distributed Debugging
 
Data Pipelines in Hadoop - SAP Meetup in Tel Aviv
Data Pipelines in Hadoop - SAP Meetup in Tel Aviv Data Pipelines in Hadoop - SAP Meetup in Tel Aviv
Data Pipelines in Hadoop - SAP Meetup in Tel Aviv
 

En vedette

Hadoop REST API Security with Apache Knox Gateway
Hadoop REST API Security with Apache Knox GatewayHadoop REST API Security with Apache Knox Gateway
Hadoop REST API Security with Apache Knox Gateway
DataWorks Summit
 
Integrate Big Data into Your Organization with Informatica and Perficient
Integrate Big Data into Your Organization with Informatica and PerficientIntegrate Big Data into Your Organization with Informatica and Perficient
Integrate Big Data into Your Organization with Informatica and Perficient
Perficient, Inc.
 

En vedette (20)

Apache Spark
Apache SparkApache Spark
Apache Spark
 
Deploying and Managing Hadoop Clusters with AMBARI
Deploying and Managing Hadoop Clusters with AMBARIDeploying and Managing Hadoop Clusters with AMBARI
Deploying and Managing Hadoop Clusters with AMBARI
 
Informatica Big Data Edition - Profinit - Jan Ulrych
Informatica Big Data Edition - Profinit - Jan UlrychInformatica Big Data Edition - Profinit - Jan Ulrych
Informatica Big Data Edition - Profinit - Jan Ulrych
 
Deep Learning with Apache Spark: an Introduction
Deep Learning with Apache Spark: an IntroductionDeep Learning with Apache Spark: an Introduction
Deep Learning with Apache Spark: an Introduction
 
Deep dive into spark streaming
Deep dive into spark streamingDeep dive into spark streaming
Deep dive into spark streaming
 
Hadoop & Security - Past, Present, Future
Hadoop & Security - Past, Present, FutureHadoop & Security - Past, Present, Future
Hadoop & Security - Past, Present, Future
 
Big data, Analytics and Beyond
Big data, Analytics and BeyondBig data, Analytics and Beyond
Big data, Analytics and Beyond
 
Meet the experts dwo bde vds v7
Meet the experts dwo bde vds v7Meet the experts dwo bde vds v7
Meet the experts dwo bde vds v7
 
Hadoop REST API Security with Apache Knox Gateway
Hadoop REST API Security with Apache Knox GatewayHadoop REST API Security with Apache Knox Gateway
Hadoop REST API Security with Apache Knox Gateway
 
Fully fault tolerant real time data pipeline with docker and mesos
Fully fault tolerant real time data pipeline with docker and mesos Fully fault tolerant real time data pipeline with docker and mesos
Fully fault tolerant real time data pipeline with docker and mesos
 
Hadoop Operations
Hadoop OperationsHadoop Operations
Hadoop Operations
 
Integrate Big Data into Your Organization with Informatica and Perficient
Integrate Big Data into Your Organization with Informatica and PerficientIntegrate Big Data into Your Organization with Informatica and Perficient
Integrate Big Data into Your Organization with Informatica and Perficient
 
Energy analytics with Apache Spark workshop
Energy analytics with Apache Spark workshopEnergy analytics with Apache Spark workshop
Energy analytics with Apache Spark workshop
 
Hadoop bootcamp getting started
Hadoop bootcamp getting startedHadoop bootcamp getting started
Hadoop bootcamp getting started
 
Discover Enterprise Security Features in Hortonworks Data Platform 2.1: Apach...
Discover Enterprise Security Features in Hortonworks Data Platform 2.1: Apach...Discover Enterprise Security Features in Hortonworks Data Platform 2.1: Apach...
Discover Enterprise Security Features in Hortonworks Data Platform 2.1: Apach...
 
Introduction to Apache Spark
Introduction to Apache SparkIntroduction to Apache Spark
Introduction to Apache Spark
 
Kafka at Scale: Multi-Tier Architectures
Kafka at Scale: Multi-Tier ArchitecturesKafka at Scale: Multi-Tier Architectures
Kafka at Scale: Multi-Tier Architectures
 
Hadoop Security
Hadoop SecurityHadoop Security
Hadoop Security
 
Apache Spark: What's under the hood
Apache Spark: What's under the hoodApache Spark: What's under the hood
Apache Spark: What's under the hood
 
Best practices for highly available and large scale SolrCloud
Best practices for highly available and large scale SolrCloudBest practices for highly available and large scale SolrCloud
Best practices for highly available and large scale SolrCloud
 

Similaire à Hadoop Operations - Best practices from the field

Hadoop ppt on the basics and architecture
Hadoop ppt on the basics and architectureHadoop ppt on the basics and architecture
Hadoop ppt on the basics and architecture
saipriyacoool
 
Hadoop - Disk Fail In Place (DFIP)
Hadoop - Disk Fail In Place (DFIP)Hadoop - Disk Fail In Place (DFIP)
Hadoop - Disk Fail In Place (DFIP)
mundlapudi
 
Facing enterprise specific challenges – utility programming in hadoop
Facing enterprise specific challenges – utility programming in hadoopFacing enterprise specific challenges – utility programming in hadoop
Facing enterprise specific challenges – utility programming in hadoop
fann wu
 
Infrastructure Around Hadoop
Infrastructure Around HadoopInfrastructure Around Hadoop
Infrastructure Around Hadoop
DataWorks Summit
 
Hadoop Distributed File System
Hadoop Distributed File SystemHadoop Distributed File System
Hadoop Distributed File System
elliando dias
 
P.Maharajothi,II-M.sc(computer science),Bon secours college for women,thanjavur.
P.Maharajothi,II-M.sc(computer science),Bon secours college for women,thanjavur.P.Maharajothi,II-M.sc(computer science),Bon secours college for women,thanjavur.
P.Maharajothi,II-M.sc(computer science),Bon secours college for women,thanjavur.
MaharajothiP
 
Big data processing using hadoop poster presentation
Big data processing using hadoop poster presentationBig data processing using hadoop poster presentation
Big data processing using hadoop poster presentation
Amrut Patil
 

Similaire à Hadoop Operations - Best practices from the field (20)

Hadoop ppt on the basics and architecture
Hadoop ppt on the basics and architectureHadoop ppt on the basics and architecture
Hadoop ppt on the basics and architecture
 
What's new in hadoop 3.0
What's new in hadoop 3.0What's new in hadoop 3.0
What's new in hadoop 3.0
 
Hadoop Architecture_Cluster_Cap_Plan
Hadoop Architecture_Cluster_Cap_PlanHadoop Architecture_Cluster_Cap_Plan
Hadoop Architecture_Cluster_Cap_Plan
 
Hadoop - Disk Fail In Place (DFIP)
Hadoop - Disk Fail In Place (DFIP)Hadoop - Disk Fail In Place (DFIP)
Hadoop - Disk Fail In Place (DFIP)
 
Facing enterprise specific challenges – utility programming in hadoop
Facing enterprise specific challenges – utility programming in hadoopFacing enterprise specific challenges – utility programming in hadoop
Facing enterprise specific challenges – utility programming in hadoop
 
Hadoop-Quick introduction
Hadoop-Quick introductionHadoop-Quick introduction
Hadoop-Quick introduction
 
Hadoop ppt1
Hadoop ppt1Hadoop ppt1
Hadoop ppt1
 
Hadoop, Taming Elephants
Hadoop, Taming ElephantsHadoop, Taming Elephants
Hadoop, Taming Elephants
 
Hadoop and Distributed Computing
Hadoop and Distributed ComputingHadoop and Distributed Computing
Hadoop and Distributed Computing
 
Hadoop introduction
Hadoop introductionHadoop introduction
Hadoop introduction
 
Infrastructure Around Hadoop
Infrastructure Around HadoopInfrastructure Around Hadoop
Infrastructure Around Hadoop
 
Hadoop 2.0 handout 5.0
Hadoop 2.0 handout 5.0Hadoop 2.0 handout 5.0
Hadoop 2.0 handout 5.0
 
Hadoop Distributed File System
Hadoop Distributed File SystemHadoop Distributed File System
Hadoop Distributed File System
 
Tugdual Grall - Real World Use Cases: Hadoop and NoSQL in Production
Tugdual Grall - Real World Use Cases: Hadoop and NoSQL in ProductionTugdual Grall - Real World Use Cases: Hadoop and NoSQL in Production
Tugdual Grall - Real World Use Cases: Hadoop and NoSQL in Production
 
Webinar - DreamObjects/Ceph Case Study
Webinar - DreamObjects/Ceph Case StudyWebinar - DreamObjects/Ceph Case Study
Webinar - DreamObjects/Ceph Case Study
 
P.Maharajothi,II-M.sc(computer science),Bon secours college for women,thanjavur.
P.Maharajothi,II-M.sc(computer science),Bon secours college for women,thanjavur.P.Maharajothi,II-M.sc(computer science),Bon secours college for women,thanjavur.
P.Maharajothi,II-M.sc(computer science),Bon secours college for women,thanjavur.
 
Managing Security At 1M Events a Second using Elasticsearch
Managing Security At 1M Events a Second using ElasticsearchManaging Security At 1M Events a Second using Elasticsearch
Managing Security At 1M Events a Second using Elasticsearch
 
Big data processing using hadoop poster presentation
Big data processing using hadoop poster presentationBig data processing using hadoop poster presentation
Big data processing using hadoop poster presentation
 
Hadoop training in bangalore
Hadoop training in bangaloreHadoop training in bangalore
Hadoop training in bangalore
 
List of Engineering Colleges in Uttarakhand
List of Engineering Colleges in UttarakhandList of Engineering Colleges in Uttarakhand
List of Engineering Colleges in Uttarakhand
 

Plus de Uwe Printz

First meetup of the MongoDB User Group Frankfurt
First meetup of the MongoDB User Group FrankfurtFirst meetup of the MongoDB User Group Frankfurt
First meetup of the MongoDB User Group Frankfurt
Uwe Printz
 

Plus de Uwe Printz (13)

Lightning Talk: Agility & Databases
Lightning Talk: Agility & DatabasesLightning Talk: Agility & Databases
Lightning Talk: Agility & Databases
 
MongoDB für Java Programmierer (JUGKA, 11.12.13)
MongoDB für Java Programmierer (JUGKA, 11.12.13)MongoDB für Java Programmierer (JUGKA, 11.12.13)
MongoDB für Java Programmierer (JUGKA, 11.12.13)
 
Hadoop 2 - Going beyond MapReduce
Hadoop 2 - Going beyond MapReduceHadoop 2 - Going beyond MapReduce
Hadoop 2 - Going beyond MapReduce
 
Introduction to the Hadoop Ecosystem (IT-Stammtisch Darmstadt Edition)
Introduction to the Hadoop Ecosystem (IT-Stammtisch Darmstadt Edition)Introduction to the Hadoop Ecosystem (IT-Stammtisch Darmstadt Edition)
Introduction to the Hadoop Ecosystem (IT-Stammtisch Darmstadt Edition)
 
MongoDB for Coder Training (Coding Serbia 2013)
MongoDB for Coder Training (Coding Serbia 2013)MongoDB for Coder Training (Coding Serbia 2013)
MongoDB for Coder Training (Coding Serbia 2013)
 
MongoDB für Java-Programmierer
MongoDB für Java-ProgrammiererMongoDB für Java-Programmierer
MongoDB für Java-Programmierer
 
Introduction to the Hadoop Ecosystem with Hadoop 2.0 aka YARN (Java Serbia Ed...
Introduction to the Hadoop Ecosystem with Hadoop 2.0 aka YARN (Java Serbia Ed...Introduction to the Hadoop Ecosystem with Hadoop 2.0 aka YARN (Java Serbia Ed...
Introduction to the Hadoop Ecosystem with Hadoop 2.0 aka YARN (Java Serbia Ed...
 
Introduction to Twitter Storm
Introduction to Twitter StormIntroduction to Twitter Storm
Introduction to Twitter Storm
 
Introduction to the Hadoop Ecosystem (FrOSCon Edition)
Introduction to the Hadoop Ecosystem (FrOSCon Edition)Introduction to the Hadoop Ecosystem (FrOSCon Edition)
Introduction to the Hadoop Ecosystem (FrOSCon Edition)
 
Introduction to the Hadoop Ecosystem (SEACON Edition)
Introduction to the Hadoop Ecosystem (SEACON Edition)Introduction to the Hadoop Ecosystem (SEACON Edition)
Introduction to the Hadoop Ecosystem (SEACON Edition)
 
Introduction to the Hadoop Ecosystem (codemotion Edition)
Introduction to the Hadoop Ecosystem (codemotion Edition)Introduction to the Hadoop Ecosystem (codemotion Edition)
Introduction to the Hadoop Ecosystem (codemotion Edition)
 
Map/Confused? A practical approach to Map/Reduce with MongoDB
Map/Confused? A practical approach to Map/Reduce with MongoDBMap/Confused? A practical approach to Map/Reduce with MongoDB
Map/Confused? A practical approach to Map/Reduce with MongoDB
 
First meetup of the MongoDB User Group Frankfurt
First meetup of the MongoDB User Group FrankfurtFirst meetup of the MongoDB User Group Frankfurt
First meetup of the MongoDB User Group Frankfurt
 

Dernier

Dernier (20)

What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of Brazil
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 

Hadoop Operations - Best practices from the field

  • 1. 2 Did someone just order Hadoop? - Best practices from the field uweseiler
  • 2. 2 About me Big Data Nerd TravelpiratePhotography Enthusiast Hadoop Trainer NoSQL Fan Boy
  • 3. 2 About us specializes on... Big Data Nerds Agile Ninjas Continuous Delivery Gurus Enterprise Java Specialists Performance Geeks Join us!
  • 4. Agenda • Basics • Software I • Architecture & Rack Design • Hardware & Cluster Sizing • Software II • Advanced • Data Ingestion • Operation & Monitoring • Security
  • 5. Agenda • Basics • Software I • Architecture & Rack Design • Hardware & Cluster Sizing • Software II • Advanced • Data Ingestion • Operation & Monitoring • Security
  • 8. Cloudera vs. Hortonworks Guess what: Both will do the job!
  • 9. Cloudera vs. Hortonworks Which ideology do you prefer? “Closed” Source / Open Source
  • 10. Cloudera vs. Hortonworks Pricing model: Software + Support / Support
  • 11. Agenda • Basics • Software I • Architecture & Rack Design • Hardware & Cluster Sizing • Software II • Advanced • Data Ingestion • Operation & Monitoring • Security
  • 12. Create the Big Picture … Platform for Data Exploration … … … ETL Visualization Data Warehouse
  • 13. Pick your Hadoop Stack HDFS MapReduce Tez Spark Pig Hive YARN Solr Sqoop NFSGateway Falcon Ambari Oozie Knox Ranger Ganglia Nagios Monitoring ZooKeeper Journal Nodes Cluster Management Services Data Ingestion & Governance Data Storage Data Processing Search Security Workflow Mgmt. MySQL MySQL MySQL
  • 14. Rack Design (without HA) Rack 1 Rack 2 NameNode ResourceManager Mgmt. Server 5 x Master Nodes 5 x Worker Nodes 6 x Worker Nodes Gateway Server Nexus 3 K Cisco Catalyst 2960 1 x ToR Switch Nexus 3 K 1 x ToR Switch 1 x Mgmt. Network Cisco Catalyst 2960 1 x Mgmt. Network Secondary NameNode
  • 15. Rack Design (with HA) Rack 1 Rack 2 NameNode (Active) ResourceManager (Active) Mgmt. Server 4 x Master Nodes 5 x Worker Nodes 6 x Worker Nodes NameNode (Passive) ResourceManager (Passive) 2 x Standby HA Nodes Gateway Server Nexus 3 K Cisco Catalyst 2960 1 x ToR Switch Nexus 3 K 1 x ToR Switch 1 x Mgmt. Network Cisco Catalyst 2960 1 x Mgmt. Network
  • 16. Service Mapping • Worker Nodes: HDFS DataNode, YARN NodeManager, Hadoop Client Libraries • HDFS NameNode (Active): NameNode (Active), ZooKeeper Server, Journal Node, Hadoop Client Libraries • HDFS NameNode (Passive): NameNode (Passive), ZooKeeper Server, Journal Node, Hadoop Client Libraries • YARN ResourceManager (Active): ResourceManager, App Timeline Server, MapReduce2 History Server, ZooKeeper Server, Journal Node, Hadoop Client Libraries • YARN ResourceManager (Passive): ResourceManager, App Timeline Server, MapReduce2 History Server, ZooKeeper Server, Journal Node, Hadoop Client Libraries • Management Server: MySQL Server (Hive MetaStore, Oozie, Ganglia), HiveServer2, Oozie Server, Ganglia Server, Nagios Server, ZooKeeper Server, Journal Node, Kerberos, Hadoop Client Libraries • Gateway Server: Hue Server, Ambari Server, NFS Gateway Server, WebHCat Server, WebHDFS, Falcon, Sqoop, Solr, Hadoop Client Libraries
  • 17. Agenda • Basics • Software I • Architecture & Rack Design • Hardware & Cluster Sizing • Software II • Advanced • Data Ingestion • Operation & Monitoring • Security
  • 18. Hardware • Get good-quality commodity hardware! • Buy at the pricing sweet spot: 3 TB disks, 128 GB RAM, 8-12 core CPUs – More memory is better. Always. • Scale horizontally first, then vertically (1U with 6 disks vs. 2U with 12 disks) – Get to at least 30-40 machines or 3-4 racks • Don't forget about rack size (42U) and power consumption. • Use a pilot cluster to learn about load patterns – Balanced workload – Compute intensive – I/O intensive
  • 19. It’s about storage • Total per disk: 3.00 TB • Intermediate data: ~25% (0.75 TB), leaving 2.25 TB for HDFS • HDFS replication factor 3: 2.25 TB / 3 = 0.75 TB usable per disk • 0.75 TB x 12 disks x 11 Data Nodes = 99 TB • Compression: …well, it depends…
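
The sizing arithmetic on this slide can be written out as a small calculation. The following is a minimal sketch in Python; the function and parameter names are illustrative and not part of the talk, and compression is left out because the slide gives no figure for it.

```python
def usable_hdfs_capacity_tb(disk_tb, disks_per_node, data_nodes,
                            intermediate_fraction=0.25, replication=3):
    """Estimate usable HDFS capacity in TB, before compression."""
    # Reserve a share of each disk for intermediate (non-HDFS) data.
    hdfs_tb_per_disk = disk_tb * (1 - intermediate_fraction)
    # Every block is stored `replication` times, so divide the raw space.
    usable_tb_per_disk = hdfs_tb_per_disk / replication
    return usable_tb_per_disk * disks_per_node * data_nodes


# Figures from the slide: 3 TB disks, 12 disks per node, 11 Data Nodes.
capacity = usable_hdfs_capacity_tb(3.0, 12, 11)
print(f"{capacity:.0f} TB")
```

Changing `disks_per_node` to 10 shows the effect of the smaller worker chassis on total capacity.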
  • 20. It’s about Zen: Xeon 10C Model E5-2660v2, 4 memory channels, 10 cores, 8 x 16 GB, 12 disks
  • 21. Hardware
      Component | HDFS NameNode + HDFS Secondary NN + YARN ResourceManager | Management Server + Gateway Server | Worker Nodes
      CPU | 2 x 3+ GHz with 8+ cores | 2 x 3+ GHz with 8+ cores | 2 x 2.6+ GHz with 8+ cores
      Memory | 128 GB (DDR3, ECC) | 128 GB (DDR3, ECC) | 128 GB (DDR3, ECC)
      Storage | 2 x 1+ TB (RAID 1, OS), 1 x 1 TB (Hadoop logs), 1 x 1 TB (ZooKeeper), 1 x 3 TB (HDFS) | 2 x 1+ TB (RAID 1, OS), 1 x 1 TB (Hadoop logs), 1 x 3 TB (HDFS) | 2 x 1+ TB (RAID 1, OS), 10 x 3 TB (HDFS); if disk chassis allows: 12 x 3 TB (HDFS)
      Network | 2 x bonded 10 GbE NICs, 1 x 1 GbE NIC (for mgmt.) | 2 x bonded 10 GbE NICs, 1 x 1 GbE NIC (for mgmt.) | 2 x bonded 10 GbE NICs, 1 x 1 GbE NIC (for mgmt.)
  • 22. Example: IBM x3650 series (Master Nodes, Data Nodes)
  • 23. Agenda • Basics • Architecture & Rack Design • Hardware • Software • Advanced • Data Ingestion • Operation & Monitoring • Security
  • 25. Linux File System • Ext3 • Ext4 • XFS with the noatime, inode64 and nobarrier mount options • Possibly better performance, but be aware of delayed data allocation (consider turning off the delalloc option in /etc/fstab, e.g. nodelalloc on Ext4)
  • 26. OS Optimizations • Depend on your OS choice • Specific recommendations are available from OS vendors • Common recommendations – Avoid physical I/O scheduling, which competes with virtual/HDFS I/O scheduling (e.g. use the NOOP scheduler) – Set vm.swappiness to 0 – Set the number of file handles (ulimit, soft + hard) to 16384 (Data Nodes) / 65536 (Master Nodes) – Set the number of pending connections (net.core.somaxconn) to 1024 – Use Jumbo Frames (MTU=9000) – Consider network bonding (802.3ad)
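
Two of the recommendations above are kernel parameters that can be checked mechanically. The following is a minimal sketch in Python, assuming a Linux host where these parameters are readable under `/proc/sys`; the helper names are illustrative, and the target values are simply the ones from the slide.

```python
from pathlib import Path

# Values taken from the slide; adjust to your OS vendor's guidance.
RECOMMENDED_SYSCTL = {
    "vm.swappiness": 0,
    "net.core.somaxconn": 1024,
}


def read_sysctl(key, proc_root="/proc/sys"):
    """Read one integer kernel parameter from procfs (Linux only)."""
    path = Path(proc_root, *key.split("."))
    return int(path.read_text().strip())


def find_deviations(current, recommended=RECOMMENDED_SYSCTL):
    """Return {key: (current, recommended)} for every mismatching setting."""
    return {
        key: (current.get(key), want)
        for key, want in recommended.items()
        if current.get(key) != want
    }
```

On a cluster node, `find_deviations({k: read_sysctl(k) for k in RECOMMENDED_SYSCTL})` would list the settings that still differ from the recommendation. File-handle limits, the I/O scheduler and the MTU are configured elsewhere and are not covered by this sketch.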
  • 27. Java • Oracle JDK 1.7 (64-bit) • Oracle JDK 1.6 (64-bit) • OpenJDK 7 (64-bit)
  • 28. Java Optimizations • Use a 64-bit JVM for all daemons – Compressed OOPs enabled by default (Java 6 u23+) • Java Heap Size – Set Xmx == Xms – Avoid the Java defaults for NewSize and MaxNewSize: use 1/8 to 1/6 of the max heap size for JVMs larger than 4 GB – Configure -XX:PermSize=128m, -XX:MaxPermSize=256m • Use a low-latency garbage collector – Set -XX:+UseConcMarkSweepGC, -XX:ParallelGCThreads=<N> – Use a high <N> on NameNode & ResourceManager • Useful for debugging – -verbose:gc -Xloggc:<file> -XX:+PrintGCDetails – -XX:ErrorFile=<file> – -XX:+HeapDumpOnOutOfMemoryError
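
The heap and GC rules above can be combined into one option string per daemon. The following is a minimal sketch in Python under a few assumptions: the function name and the `new_ratio` parameter are illustrative, the young generation is fixed at 1/8 of the heap (the lower end of the slide's range), and the PermSize flags only make sense on the Java 6/7 JVMs the talk targets.

```python
def jvm_options(heap_mb, parallel_gc_threads, new_ratio=8):
    """Build JVM flags following the slide's heap and GC recommendations.

    new_ratio=8 gives the young generation 1/8 of the heap; the slide
    suggests 1/8 to 1/6 for heaps larger than 4 GB.
    """
    opts = [
        f"-Xms{heap_mb}m",  # initial heap equals maximum heap
        f"-Xmx{heap_mb}m",
        "-XX:PermSize=128m",
        "-XX:MaxPermSize=256m",
        "-XX:+UseConcMarkSweepGC",
        f"-XX:ParallelGCThreads={parallel_gc_threads}",
    ]
    if heap_mb > 4096:
        # Only override the young generation size for large heaps.
        new_mb = heap_mb // new_ratio
        opts += [f"-XX:NewSize={new_mb}m", f"-XX:MaxNewSize={new_mb}m"]
    return " ".join(opts)


print(jvm_options(heap_mb=8192, parallel_gc_threads=8))
```

The resulting string could, for example, be assigned to a daemon-specific variable such as `HADOOP_NAMENODE_OPTS` in `hadoop-env.sh`.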
  • 29. Hadoop Configuration • Multiple redundant directories for NameNode metadata – One of the dfs.namenode.name.dir directories should be on NFS – Soft-mount NFS with tcp,soft,intr,timeo=20,retrans=5 • Take periodic backups of NameNode metadata – Make copies of the entire storage directory • Set dfs.datanode.failed.volumes.tolerated to a value greater than 0 (the number of volumes that may fail) – A disk failure is then no longer a complete DataNode failure – Especially important for high-density nodes • Set dfs.namenode.name.dir.restore=true – Restores a failed NN storage directory during checkpointing • Reserve a lot of disk space for NameNode logs – Hadoop logging is verbose: set aside multiple GB – NameNode logs roll within minutes, which makes issues hard to debug • Use version control for configuration!
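
The HDFS properties named on this slide end up in `hdfs-site.xml`. The following is a minimal sketch in Python that renders them as a Hadoop `<configuration>` document; the directory paths and the tolerated-volume count are placeholders chosen for illustration, and `dfs.datanode.failed.volumes.tolerated` is given as a number because the property holds a volume count.

```python
import xml.etree.ElementTree as ET


def hdfs_site_xml(properties):
    """Render a dict of settings as a Hadoop <configuration> document."""
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = str(value)
    return ET.tostring(root, encoding="unicode")


# Hypothetical paths: one local directory plus one NFS mount.
settings = {
    "dfs.namenode.name.dir": "/data/1/dfs/nn,/mnt/nfs/dfs/nn",
    "dfs.namenode.name.dir.restore": "true",
    # Number of volumes that may fail before the DataNode shuts down.
    "dfs.datanode.failed.volumes.tolerated": 2,
}
print(hdfs_site_xml(settings))
```

Generating the file from a script keeps the settings in one reviewable place, which fits the advice to put configuration under version control.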
  • 30. Agenda • Basics • Software I • Architecture & Rack Design • Hardware & Cluster Sizing • Software II • Advanced • Data Ingestion • Operation & Monitoring • Security
  • 31. Options for Data Ingestion • MapReduce • WebHDFS • hadoop fs -put • NFS Gateway • hadoop distcp • Connectors (Oracle, Teradata, SQL Server, et al.) • …
  • 32. Agenda • Basics • Architecture & Rack Design • Hardware • Software • Advanced • Data Ingestion • Operation & Monitoring • Security
  • 34. Monitoring • The basics: Nagios, Ganglia, Ambari/Cloudera Manager, Hue • Admins need to understand the principles behind Hadoop and learn their tool set: fsck, dfsadmin, … • Monitor the hardware usage for your workload – Disk I/O, network I/O, CPU and memory usage – Use this information when expanding cluster capacity • Monitor the usage with Hadoop metrics – JVM metrics: GC times, memory used, thread status – RPC metrics: especially latency, to track slowdowns – HDFS metrics: used storage, # of files & blocks, cluster load, file system operations – Job metrics: slot utilization and job status • Tweak configurations during upgrades & maintenance windows on an ongoing basis • Establish regular performance tests – Use Oozie to run standard tests such as TeraSort, TestDFSIO, HiBench, …
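
Hadoop daemons publish their metrics as JSON through a `/jmx` HTTP endpoint, which makes the JVM and HDFS figures above easy to collect with a script. The following is a minimal sketch in Python that works on a hard-coded sample instead of a live cluster. The sample numbers are made up, and the bean and attribute names are the ones commonly seen in Hadoop 2; treat them as assumptions and check them against your own version.

```python
import json

# Illustrative payload in the shape returned by a daemon's /jmx endpoint;
# the numbers are invented for this example.
SAMPLE = """
{"beans": [
  {"name": "Hadoop:service=NameNode,name=FSNamesystem",
   "CapacityUsed": 1099511627776, "FilesTotal": 120000, "BlocksTotal": 150000},
  {"name": "java.lang:type=Memory",
   "HeapMemoryUsage": {"used": 2147483648, "max": 8589934592}}
]}
"""


def bean(payload, name):
    """Return the JMX bean with the given name, or None if absent."""
    for candidate in payload.get("beans", []):
        if candidate.get("name") == name:
            return candidate
    return None


def summarize(payload):
    """Pick out a few HDFS and JVM figures worth tracking over time."""
    fs = bean(payload, "Hadoop:service=NameNode,name=FSNamesystem") or {}
    memory = bean(payload, "java.lang:type=Memory") or {}
    heap = memory.get("HeapMemoryUsage", {})
    return {
        "hdfs_used_bytes": fs.get("CapacityUsed"),
        "files": fs.get("FilesTotal"),
        "blocks": fs.get("BlocksTotal"),
        "heap_used_fraction": heap["used"] / heap["max"] if heap else None,
    }


summary = summarize(json.loads(SAMPLE))
print(summary)
```

Feeding such summaries into Ganglia or Nagios at a fixed interval gives the trend data needed for capacity planning.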
  • 35. Agenda • Basics • Architecture & Rack Design • Hardware • Software • Advanced • Data Ingestion • Operation & Monitoring • Security
  • 36. Security today Kerberos in native Apache Hadoop Perimeter Security with Apache Knox • LDAP • SSO Authentication: Control access to the cluster. Authorization: Restrict access to explicit data. Audit: Understand who did what. Data Protection: Encrypt data at rest & in motion. Native in Apache Hadoop • HDFS Permissions + ACLs • Queues + Job ACLs • Process Execution audit trail Fine-grained role-based authorization • Hive • Apache Sentry • Apache Accumulo Service-level authorization with Knox Central security policies with Ranger Wire encryption in native Apache Hadoop Wire encryption with Knox Orchestrated encryption with 3rd-party tools
  • 37. Apache Knox Knox DMZ Client SSO HDFS MapReduce Tez Spark Pig Hive YARN Solr Ambari Oozie Knox Ranger Ganglia Nagios ZooKeeper Journal Nodes Firewall Firewall Hadoop Cluster LDAP SSH REST WebHDFS WebHCat Oozie Hive YARN SSH
  • 38. Data Boxing Raw data layer Read & Write Read Division 1 -- Read & Write Division 2 -- Read & Write Read Set up data boxing using • Users & Groups • HDFS Permissions & ACLs • Higher-level mechanisms where applicable
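
One way to express such boxes is as a set of HDFS ACL grants per directory. The following is a minimal sketch in Python that prints `hdfs dfs -setfacl` commands for a layout like the one on the slide. All paths and group names are hypothetical, the read-only access of the divisions to the raw layer is an assumption about the diagram, and ACL support has to be enabled on the NameNode for the commands to take effect.

```python
def setfacl_commands(boxes):
    """Yield `hdfs dfs -setfacl` commands for {path: {group: perms}}."""
    for path, grants in boxes.items():
        for group, perms in grants.items():
            yield f"hdfs dfs -setfacl -m group:{group}:{perms} {path}"


# Hypothetical layout: paths and group names are placeholders.
boxes = {
    "/data/raw": {"ingest": "rwx", "division1": "r-x", "division2": "r-x"},
    "/data/division1": {"division1": "rwx"},
    "/data/division2": {"division2": "rwx"},
}
commands = list(setfacl_commands(boxes))
for command in commands:
    print(command)
```

Keeping the layout in a data structure like `boxes` makes it possible to review access rules before they are applied.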
  • 39. Apache Ranger • File-level access control • Control permissions • Supports: HDFS, Hive, HBase, Storm, Knox