SlideShare une entreprise Scribd logo
1  sur  20
Hadoop Cluster Management



                 Dheeraj Kapur
                 Principal Engineer, Yahoo!
                 dheerajk@yahoo-inc.com
What it is…
§  Workflow based system for cluster management.
§  Completely modular & distributed design.
§  Has its own JMX based library(can be used to monitor other
    services on cluster).
§  Fully controllable from WebUI.
§  Has command line utility for adhoc administration.




                                    2
What it does…
§    Manage clusters.
§    Break fixing.
§    Upgrades OS seamlessly.
§    Consistency/efficiency of clusters.
§    Proactive self-healing Model.
§    User Management.




                                       3
Manage Clusters
§    Its has well defined workflow to manage clusters.
§    No/Minimal human intervention required.
§    Keep up efficiency of cluster.
§    Keep track of Missing/Bad blocks on system.
§    Well defined WebUI and Command line utility




                                      4
System Overview




                  5
Workflow




           6
Contd..




          7
Command Line Utility




                   8
Web Interface




                9
Web Interface contd…




                  10
fixing bad/mal-performing nodes
These errors can lead to SLA miss or Job failures
§  Takes care of Blacklisted JT nodes.
§  Errors like high load average, wrong network speed.
§  Parse system logs at X frequency (thru workflows) and look for
    patterns.
§  Visit each node multiple times in a day and check health of node.




                                   11
Upgrade OS
§  Upgrade & rollback OS seamlessly.
§  Upgrading on production, heavily used clusters.




                                   12
Consistency & efficiency of clusters
§  Keep track of cluster MR capacity
§  Proactive Fixing of sick nodes, which can cause potential issues.




                                   13
Introducing Proactive self-healing system

Let me set the ground for it.
§  Wounded hosts Called Set A - Hosts having issues, but still in service
    (with degraded services), Which can cause potential SLA misses and
    job execution issues.(which we have seen in past)
§  Fractured Hosts Called Set B - Hosts already in Break fix cycle and
    getting fixed
§  All grid hosts Called Set X - all grid hosts healthy + fine
§  Set A & B are sub-set of set X
§  to find wounded hosts we have to scan entire infrastructure once a
    day.
§  Calculate Symmetric difference b/w Set A & B, we will get actual
    wounded hosts needs service.




                                    14
Proactive self-healing contd….
                  All Grid Hosts - X




                  Set A    Set B




                          15
Proactive self-healing contd….




                   16
User Management
§  We have one of the most complex and secure environment.
§  User access and management is a complex task, due to the
    number of users, security constraints and complexity involved in
    provisioning access.
§  Single request provisioning requires change at multiple places.
§  Well defined workflow based system, where 100% automation is
    achieved.
§  Great help during system audit and compliance.




                                   17
Q&A




 18
Thank You



    19
Sessions will resume at 4:30pm




                             Page 20

Contenu connexe

Tendances

PostgreSQL Replication in 10 Minutes - SCALE
PostgreSQL Replication in 10  Minutes - SCALEPostgreSQL Replication in 10  Minutes - SCALE
PostgreSQL Replication in 10 Minutes - SCALEPostgreSQL Experts, Inc.
 
Streaming Replication Made Easy in v9.3
Streaming Replication Made Easy in v9.3Streaming Replication Made Easy in v9.3
Streaming Replication Made Easy in v9.3Sameer Kumar
 
MySQL's new Secure by Default Install -- All Things Open October 20th 2015
MySQL's new Secure by Default Install -- All Things Open October 20th 2015MySQL's new Secure by Default Install -- All Things Open October 20th 2015
MySQL's new Secure by Default Install -- All Things Open October 20th 2015Dave Stokes
 
Windows Server 2012 R2 Hyper-V Replica
Windows Server 2012 R2 Hyper-V ReplicaWindows Server 2012 R2 Hyper-V Replica
Windows Server 2012 R2 Hyper-V ReplicaRavikanth Chaganti
 
Ora10g Rac Best Practices
Ora10g Rac Best PracticesOra10g Rac Best Practices
Ora10g Rac Best Practicesvasanthkp
 
My two cents about Mysql backup
My two cents about Mysql backupMy two cents about Mysql backup
My two cents about Mysql backupAndrejs Vorobjovs
 
Built-in Replication in PostgreSQL
Built-in Replication in PostgreSQLBuilt-in Replication in PostgreSQL
Built-in Replication in PostgreSQLMasao Fujii
 
What's New in Postgres Plus Advanced Server 9.3
What's New in Postgres Plus Advanced Server 9.3What's New in Postgres Plus Advanced Server 9.3
What's New in Postgres Plus Advanced Server 9.3EDB
 
MySQL Backup and Recovery Essentials
MySQL Backup and Recovery EssentialsMySQL Backup and Recovery Essentials
MySQL Backup and Recovery EssentialsRonald Bradford
 
Online MySQL Backups with Percona XtraBackup
Online MySQL Backups with Percona XtraBackupOnline MySQL Backups with Percona XtraBackup
Online MySQL Backups with Percona XtraBackupKenny Gryp
 
Sql server 2012 ha dr nova
Sql server 2012 ha dr novaSql server 2012 ha dr nova
Sql server 2012 ha dr novaJoseph D'Antoni
 
PGPool-II Load testing
PGPool-II Load testingPGPool-II Load testing
PGPool-II Load testingEDB
 
PostgreSQL Scaling And Failover
PostgreSQL Scaling And FailoverPostgreSQL Scaling And Failover
PostgreSQL Scaling And FailoverJohn Paulett
 
Essential Linux Commands for DBAs
Essential Linux Commands for DBAsEssential Linux Commands for DBAs
Essential Linux Commands for DBAsGokhan Atil
 
Percona XtraDB Cluster vs Galera Cluster vs MySQL Group Replication
Percona XtraDB Cluster vs Galera Cluster vs MySQL Group ReplicationPercona XtraDB Cluster vs Galera Cluster vs MySQL Group Replication
Percona XtraDB Cluster vs Galera Cluster vs MySQL Group ReplicationKenny Gryp
 
Sql server 2012 ha dr 24_hop_final
Sql server 2012 ha dr 24_hop_finalSql server 2012 ha dr 24_hop_final
Sql server 2012 ha dr 24_hop_finalJoseph D'Antoni
 

Tendances (20)

PostgreSQL Replication in 10 Minutes - SCALE
PostgreSQL Replication in 10  Minutes - SCALEPostgreSQL Replication in 10  Minutes - SCALE
PostgreSQL Replication in 10 Minutes - SCALE
 
Streaming Replication Made Easy in v9.3
Streaming Replication Made Easy in v9.3Streaming Replication Made Easy in v9.3
Streaming Replication Made Easy in v9.3
 
MySQL's new Secure by Default Install -- All Things Open October 20th 2015
MySQL's new Secure by Default Install -- All Things Open October 20th 2015MySQL's new Secure by Default Install -- All Things Open October 20th 2015
MySQL's new Secure by Default Install -- All Things Open October 20th 2015
 
Windows Server 2012 R2 Hyper-V Replica
Windows Server 2012 R2 Hyper-V ReplicaWindows Server 2012 R2 Hyper-V Replica
Windows Server 2012 R2 Hyper-V Replica
 
GlassFish v2 Clustering
GlassFish v2 ClusteringGlassFish v2 Clustering
GlassFish v2 Clustering
 
Ora10g Rac Best Practices
Ora10g Rac Best PracticesOra10g Rac Best Practices
Ora10g Rac Best Practices
 
My two cents about Mysql backup
My two cents about Mysql backupMy two cents about Mysql backup
My two cents about Mysql backup
 
SQL Server vs Postgres
SQL Server vs PostgresSQL Server vs Postgres
SQL Server vs Postgres
 
Built-in Replication in PostgreSQL
Built-in Replication in PostgreSQLBuilt-in Replication in PostgreSQL
Built-in Replication in PostgreSQL
 
What's New in Postgres Plus Advanced Server 9.3
What's New in Postgres Plus Advanced Server 9.3What's New in Postgres Plus Advanced Server 9.3
What's New in Postgres Plus Advanced Server 9.3
 
MySQL Backup and Recovery Essentials
MySQL Backup and Recovery EssentialsMySQL Backup and Recovery Essentials
MySQL Backup and Recovery Essentials
 
Online MySQL Backups with Percona XtraBackup
Online MySQL Backups with Percona XtraBackupOnline MySQL Backups with Percona XtraBackup
Online MySQL Backups with Percona XtraBackup
 
Sql server 2012 ha dr nova
Sql server 2012 ha dr novaSql server 2012 ha dr nova
Sql server 2012 ha dr nova
 
PGPool-II Load testing
PGPool-II Load testingPGPool-II Load testing
PGPool-II Load testing
 
PostgreSQL Scaling And Failover
PostgreSQL Scaling And FailoverPostgreSQL Scaling And Failover
PostgreSQL Scaling And Failover
 
Essential Linux Commands for DBAs
Essential Linux Commands for DBAsEssential Linux Commands for DBAs
Essential Linux Commands for DBAs
 
ProxySQL para mysql
ProxySQL para mysqlProxySQL para mysql
ProxySQL para mysql
 
MySQL Tuning
MySQL TuningMySQL Tuning
MySQL Tuning
 
Percona XtraDB Cluster vs Galera Cluster vs MySQL Group Replication
Percona XtraDB Cluster vs Galera Cluster vs MySQL Group ReplicationPercona XtraDB Cluster vs Galera Cluster vs MySQL Group Replication
Percona XtraDB Cluster vs Galera Cluster vs MySQL Group Replication
 
Sql server 2012 ha dr 24_hop_final
Sql server 2012 ha dr 24_hop_finalSql server 2012 ha dr 24_hop_final
Sql server 2012 ha dr 24_hop_final
 

En vedette

Hadoop Overview & Architecture
Hadoop Overview & Architecture  Hadoop Overview & Architecture
Hadoop Overview & Architecture EMC
 
빠르고쉬운대출『LG777』.『XYZ』무자본창업 운전자보험만기환급
빠르고쉬운대출『LG777』.『XYZ』무자본창업 운전자보험만기환급빠르고쉬운대출『LG777』.『XYZ』무자본창업 운전자보험만기환급
빠르고쉬운대출『LG777』.『XYZ』무자본창업 운전자보험만기환급hslkdfjs
 
New Use Cases for DAM in the Enterprise
New Use Cases for DAM in the EnterpriseNew Use Cases for DAM in the Enterprise
New Use Cases for DAM in the EnterpriseNuxeo
 
Remaining Agile with Billions of Documents: Appboy and Creative MongoDB Schemas
Remaining Agile with Billions of Documents: Appboy and Creative MongoDB SchemasRemaining Agile with Billions of Documents: Appboy and Creative MongoDB Schemas
Remaining Agile with Billions of Documents: Appboy and Creative MongoDB SchemasMongoDB
 
01_HTML - 작심10시간! 나만의 웹사이트 기획하고 만들기
01_HTML - 작심10시간! 나만의 웹사이트 기획하고 만들기01_HTML - 작심10시간! 나만의 웹사이트 기획하고 만들기
01_HTML - 작심10시간! 나만의 웹사이트 기획하고 만들기설리번 프로젝트
 
Tailings dump recovery concept
Tailings dump recovery conceptTailings dump recovery concept
Tailings dump recovery conceptphillip shambare
 
Hadoop from Hive with Stinger to Tez
Hadoop from Hive with Stinger to TezHadoop from Hive with Stinger to Tez
Hadoop from Hive with Stinger to TezJan Pieter Posthuma
 
GIS for Infrastructure Management
GIS for Infrastructure ManagementGIS for Infrastructure Management
GIS for Infrastructure ManagementDavid Puckett
 
Real-time, Sensor-based Monitoring of Shipping Containers
Real-time, Sensor-based Monitoring of Shipping ContainersReal-time, Sensor-based Monitoring of Shipping Containers
Real-time, Sensor-based Monitoring of Shipping Containersbenaam
 
Designing your Product as a Platform
Designing your Product as a PlatformDesigning your Product as a Platform
Designing your Product as a PlatformMicah Laaker
 
Hadoop & Greenplum: Why Do Such a Thing?
Hadoop & Greenplum: Why Do Such a Thing?Hadoop & Greenplum: Why Do Such a Thing?
Hadoop & Greenplum: Why Do Such a Thing?Ed Kohlwey
 
Airport Billing System for Aviation and Non-Aviation Services
Airport Billing System for Aviation and Non-Aviation Services Airport Billing System for Aviation and Non-Aviation Services
Airport Billing System for Aviation and Non-Aviation Services Ericsson
 
Web Services Automated Testing via SoapUI Tool
Web Services Automated Testing via SoapUI ToolWeb Services Automated Testing via SoapUI Tool
Web Services Automated Testing via SoapUI ToolSperasoft
 
Spend Analysis In 60 Seconds
Spend Analysis In 60 SecondsSpend Analysis In 60 Seconds
Spend Analysis In 60 SecondsClaritum
 
Surgical induced astigmatism
Surgical induced astigmatismSurgical induced astigmatism
Surgical induced astigmatismNamrata Gupta
 
Best practice strategies to clean up and maintain your database with Hether G...
Best practice strategies to clean up and maintain your database with Hether G...Best practice strategies to clean up and maintain your database with Hether G...
Best practice strategies to clean up and maintain your database with Hether G...Blackbaud Pacific
 

En vedette (20)

Hadoop Overview & Architecture
Hadoop Overview & Architecture  Hadoop Overview & Architecture
Hadoop Overview & Architecture
 
빠르고쉬운대출『LG777』.『XYZ』무자본창업 운전자보험만기환급
빠르고쉬운대출『LG777』.『XYZ』무자본창업 운전자보험만기환급빠르고쉬운대출『LG777』.『XYZ』무자본창업 운전자보험만기환급
빠르고쉬운대출『LG777』.『XYZ』무자본창업 운전자보험만기환급
 
New Use Cases for DAM in the Enterprise
New Use Cases for DAM in the EnterpriseNew Use Cases for DAM in the Enterprise
New Use Cases for DAM in the Enterprise
 
Remaining Agile with Billions of Documents: Appboy and Creative MongoDB Schemas
Remaining Agile with Billions of Documents: Appboy and Creative MongoDB SchemasRemaining Agile with Billions of Documents: Appboy and Creative MongoDB Schemas
Remaining Agile with Billions of Documents: Appboy and Creative MongoDB Schemas
 
01_HTML - 작심10시간! 나만의 웹사이트 기획하고 만들기
01_HTML - 작심10시간! 나만의 웹사이트 기획하고 만들기01_HTML - 작심10시간! 나만의 웹사이트 기획하고 만들기
01_HTML - 작심10시간! 나만의 웹사이트 기획하고 만들기
 
Tailings dump recovery concept
Tailings dump recovery conceptTailings dump recovery concept
Tailings dump recovery concept
 
Polymer optical fibers
Polymer optical fibersPolymer optical fibers
Polymer optical fibers
 
SAP Cloud for Service
SAP Cloud for ServiceSAP Cloud for Service
SAP Cloud for Service
 
Hadoop from Hive with Stinger to Tez
Hadoop from Hive with Stinger to TezHadoop from Hive with Stinger to Tez
Hadoop from Hive with Stinger to Tez
 
GIS for Infrastructure Management
GIS for Infrastructure ManagementGIS for Infrastructure Management
GIS for Infrastructure Management
 
Real-time, Sensor-based Monitoring of Shipping Containers
Real-time, Sensor-based Monitoring of Shipping ContainersReal-time, Sensor-based Monitoring of Shipping Containers
Real-time, Sensor-based Monitoring of Shipping Containers
 
Chem Lab Report (1)
Chem Lab Report (1)Chem Lab Report (1)
Chem Lab Report (1)
 
Designing your Product as a Platform
Designing your Product as a PlatformDesigning your Product as a Platform
Designing your Product as a Platform
 
Hadoop & Greenplum: Why Do Such a Thing?
Hadoop & Greenplum: Why Do Such a Thing?Hadoop & Greenplum: Why Do Such a Thing?
Hadoop & Greenplum: Why Do Such a Thing?
 
High-Density Wireless Networks for Auditoriums
High-Density Wireless Networks for AuditoriumsHigh-Density Wireless Networks for Auditoriums
High-Density Wireless Networks for Auditoriums
 
Airport Billing System for Aviation and Non-Aviation Services
Airport Billing System for Aviation and Non-Aviation Services Airport Billing System for Aviation and Non-Aviation Services
Airport Billing System for Aviation and Non-Aviation Services
 
Web Services Automated Testing via SoapUI Tool
Web Services Automated Testing via SoapUI ToolWeb Services Automated Testing via SoapUI Tool
Web Services Automated Testing via SoapUI Tool
 
Spend Analysis In 60 Seconds
Spend Analysis In 60 SecondsSpend Analysis In 60 Seconds
Spend Analysis In 60 Seconds
 
Surgical induced astigmatism
Surgical induced astigmatismSurgical induced astigmatism
Surgical induced astigmatism
 
Best practice strategies to clean up and maintain your database with Hether G...
Best practice strategies to clean up and maintain your database with Hether G...Best practice strategies to clean up and maintain your database with Hether G...
Best practice strategies to clean up and maintain your database with Hether G...
 

Similaire à Hadoop Cluster Management

Planning to Fail #phpuk13
Planning to Fail #phpuk13Planning to Fail #phpuk13
Planning to Fail #phpuk13Dave Gardner
 
Highly Available Load Balanced Galera MySql Cluster
Highly Available Load Balanced  Galera MySql ClusterHighly Available Load Balanced  Galera MySql Cluster
Highly Available Load Balanced Galera MySql ClusterAmr Fawzy
 
Planning to Fail #phpne13
Planning to Fail #phpne13Planning to Fail #phpne13
Planning to Fail #phpne13Dave Gardner
 
VMworld 2014: Virtualizing Databases
VMworld 2014: Virtualizing DatabasesVMworld 2014: Virtualizing Databases
VMworld 2014: Virtualizing DatabasesVMworld
 
1 introduction
1 introduction1 introduction
1 introductionMohd Arif
 
Run Book Automation with PlateSpin Orchestrate
Run Book Automation with PlateSpin OrchestrateRun Book Automation with PlateSpin Orchestrate
Run Book Automation with PlateSpin OrchestrateNovell
 
Run Book Automation with PlateSpin Orchestrate
Run Book Automation with PlateSpin OrchestrateRun Book Automation with PlateSpin Orchestrate
Run Book Automation with PlateSpin OrchestrateNovell
 
Run Book Automation with PlateSpin Orchestrate
Run Book Automation with PlateSpin OrchestrateRun Book Automation with PlateSpin Orchestrate
Run Book Automation with PlateSpin OrchestrateNovell
 
Run Book Automation with PlateSpin Orchestrate
Run Book Automation with PlateSpin OrchestrateRun Book Automation with PlateSpin Orchestrate
Run Book Automation with PlateSpin OrchestrateNovell
 
Run Book Automation with PlateSpin Orchestrate
Run Book Automation with PlateSpin OrchestrateRun Book Automation with PlateSpin Orchestrate
Run Book Automation with PlateSpin OrchestrateNovell
 
Nonfunctional Testing: Examine the Other Side of the Coin
Nonfunctional Testing: Examine the Other Side of the CoinNonfunctional Testing: Examine the Other Side of the Coin
Nonfunctional Testing: Examine the Other Side of the CoinTechWell
 
How netflix manages petabyte scale apache cassandra in the cloud
How netflix manages petabyte scale apache cassandra in the cloudHow netflix manages petabyte scale apache cassandra in the cloud
How netflix manages petabyte scale apache cassandra in the cloudVinay Kumar Chella
 
Practice and challenges from building IaaS
Practice and challenges from building IaaSPractice and challenges from building IaaS
Practice and challenges from building IaaSShawn Zhu
 
Resilience Testing
Resilience Testing Resilience Testing
Resilience Testing Ran Levy
 
Ansible: How to Get More Sleep and Require Less Coffee
Ansible: How to Get More Sleep and Require Less CoffeeAnsible: How to Get More Sleep and Require Less Coffee
Ansible: How to Get More Sleep and Require Less CoffeeSarah Z
 
Next-Gen Decision Making in Under 2ms
Next-Gen Decision Making in Under 2msNext-Gen Decision Making in Under 2ms
Next-Gen Decision Making in Under 2msIlya Ganelin
 
Accelerating Spark Genome Sequencing in Cloud—A Data Driven Approach, Case St...
Accelerating Spark Genome Sequencing in Cloud—A Data Driven Approach, Case St...Accelerating Spark Genome Sequencing in Cloud—A Data Driven Approach, Case St...
Accelerating Spark Genome Sequencing in Cloud—A Data Driven Approach, Case St...Spark Summit
 
IaaS - Virtualization_Cambridge.pdf
IaaS - Virtualization_Cambridge.pdfIaaS - Virtualization_Cambridge.pdf
IaaS - Virtualization_Cambridge.pdfDharavathRamesh2
 

Similaire à Hadoop Cluster Management (20)

Planning to Fail #phpuk13
Planning to Fail #phpuk13Planning to Fail #phpuk13
Planning to Fail #phpuk13
 
Highly Available Load Balanced Galera MySql Cluster
Highly Available Load Balanced  Galera MySql ClusterHighly Available Load Balanced  Galera MySql Cluster
Highly Available Load Balanced Galera MySql Cluster
 
Planning to Fail #phpne13
Planning to Fail #phpne13Planning to Fail #phpne13
Planning to Fail #phpne13
 
VMworld 2014: Virtualizing Databases
VMworld 2014: Virtualizing DatabasesVMworld 2014: Virtualizing Databases
VMworld 2014: Virtualizing Databases
 
1 introduction
1 introduction1 introduction
1 introduction
 
Run Book Automation with PlateSpin Orchestrate
Run Book Automation with PlateSpin OrchestrateRun Book Automation with PlateSpin Orchestrate
Run Book Automation with PlateSpin Orchestrate
 
Run Book Automation with PlateSpin Orchestrate
Run Book Automation with PlateSpin OrchestrateRun Book Automation with PlateSpin Orchestrate
Run Book Automation with PlateSpin Orchestrate
 
Run Book Automation with PlateSpin Orchestrate
Run Book Automation with PlateSpin OrchestrateRun Book Automation with PlateSpin Orchestrate
Run Book Automation with PlateSpin Orchestrate
 
Run Book Automation with PlateSpin Orchestrate
Run Book Automation with PlateSpin OrchestrateRun Book Automation with PlateSpin Orchestrate
Run Book Automation with PlateSpin Orchestrate
 
Run Book Automation with PlateSpin Orchestrate
Run Book Automation with PlateSpin OrchestrateRun Book Automation with PlateSpin Orchestrate
Run Book Automation with PlateSpin Orchestrate
 
Nonfunctional Testing: Examine the Other Side of the Coin
Nonfunctional Testing: Examine the Other Side of the CoinNonfunctional Testing: Examine the Other Side of the Coin
Nonfunctional Testing: Examine the Other Side of the Coin
 
How netflix manages petabyte scale apache cassandra in the cloud
How netflix manages petabyte scale apache cassandra in the cloudHow netflix manages petabyte scale apache cassandra in the cloud
How netflix manages petabyte scale apache cassandra in the cloud
 
Practice and challenges from building IaaS
Practice and challenges from building IaaSPractice and challenges from building IaaS
Practice and challenges from building IaaS
 
Resilience Testing
Resilience Testing Resilience Testing
Resilience Testing
 
Bishwambar Linux Admin
Bishwambar Linux AdminBishwambar Linux Admin
Bishwambar Linux Admin
 
Ansible for networks
Ansible for networksAnsible for networks
Ansible for networks
 
Ansible: How to Get More Sleep and Require Less Coffee
Ansible: How to Get More Sleep and Require Less CoffeeAnsible: How to Get More Sleep and Require Less Coffee
Ansible: How to Get More Sleep and Require Less Coffee
 
Next-Gen Decision Making in Under 2ms
Next-Gen Decision Making in Under 2msNext-Gen Decision Making in Under 2ms
Next-Gen Decision Making in Under 2ms
 
Accelerating Spark Genome Sequencing in Cloud—A Data Driven Approach, Case St...
Accelerating Spark Genome Sequencing in Cloud—A Data Driven Approach, Case St...Accelerating Spark Genome Sequencing in Cloud—A Data Driven Approach, Case St...
Accelerating Spark Genome Sequencing in Cloud—A Data Driven Approach, Case St...
 
IaaS - Virtualization_Cambridge.pdf
IaaS - Virtualization_Cambridge.pdfIaaS - Virtualization_Cambridge.pdf
IaaS - Virtualization_Cambridge.pdf
 

Plus de DataWorks Summit

Floating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache RatisFloating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache RatisDataWorks Summit
 
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFiTracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFiDataWorks Summit
 
HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...DataWorks Summit
 
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...DataWorks Summit
 
Managing the Dewey Decimal System
Managing the Dewey Decimal SystemManaging the Dewey Decimal System
Managing the Dewey Decimal SystemDataWorks Summit
 
Practical NoSQL: Accumulo's dirlist Example
Practical NoSQL: Accumulo's dirlist ExamplePractical NoSQL: Accumulo's dirlist Example
Practical NoSQL: Accumulo's dirlist ExampleDataWorks Summit
 
HBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at UberHBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at UberDataWorks Summit
 
Scaling Cloud-Scale Translytics Workloads with Omid and Phoenix
Scaling Cloud-Scale Translytics Workloads with Omid and PhoenixScaling Cloud-Scale Translytics Workloads with Omid and Phoenix
Scaling Cloud-Scale Translytics Workloads with Omid and PhoenixDataWorks Summit
 
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFi
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFiBuilding the High Speed Cybersecurity Data Pipeline Using Apache NiFi
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFiDataWorks Summit
 
Supporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability ImprovementsSupporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability ImprovementsDataWorks Summit
 
Security Framework for Multitenant Architecture
Security Framework for Multitenant ArchitectureSecurity Framework for Multitenant Architecture
Security Framework for Multitenant ArchitectureDataWorks Summit
 
Presto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything EnginePresto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything EngineDataWorks Summit
 
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...DataWorks Summit
 
Extending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google CloudExtending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google CloudDataWorks Summit
 
Event-Driven Messaging and Actions using Apache Flink and Apache NiFi
Event-Driven Messaging and Actions using Apache Flink and Apache NiFiEvent-Driven Messaging and Actions using Apache Flink and Apache NiFi
Event-Driven Messaging and Actions using Apache Flink and Apache NiFiDataWorks Summit
 
Securing Data in Hybrid on-premise and Cloud Environments using Apache Ranger
Securing Data in Hybrid on-premise and Cloud Environments using Apache RangerSecuring Data in Hybrid on-premise and Cloud Environments using Apache Ranger
Securing Data in Hybrid on-premise and Cloud Environments using Apache RangerDataWorks Summit
 
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...DataWorks Summit
 
Computer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near YouComputer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near YouDataWorks Summit
 
Big Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache SparkBig Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache SparkDataWorks Summit
 

Plus de DataWorks Summit (20)

Data Science Crash Course
Data Science Crash CourseData Science Crash Course
Data Science Crash Course
 
Floating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache RatisFloating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache Ratis
 
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFiTracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
 
HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...
 
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
 
Managing the Dewey Decimal System
Managing the Dewey Decimal SystemManaging the Dewey Decimal System
Managing the Dewey Decimal System
 
Practical NoSQL: Accumulo's dirlist Example
Practical NoSQL: Accumulo's dirlist ExamplePractical NoSQL: Accumulo's dirlist Example
Practical NoSQL: Accumulo's dirlist Example
 
HBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at UberHBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at Uber
 
Scaling Cloud-Scale Translytics Workloads with Omid and Phoenix
Scaling Cloud-Scale Translytics Workloads with Omid and PhoenixScaling Cloud-Scale Translytics Workloads with Omid and Phoenix
Scaling Cloud-Scale Translytics Workloads with Omid and Phoenix
 
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFi
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFiBuilding the High Speed Cybersecurity Data Pipeline Using Apache NiFi
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFi
 
Supporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability ImprovementsSupporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability Improvements
 
Security Framework for Multitenant Architecture
Security Framework for Multitenant ArchitectureSecurity Framework for Multitenant Architecture
Security Framework for Multitenant Architecture
 
Presto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything EnginePresto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything Engine
 
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
 
Extending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google CloudExtending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google Cloud
 
Event-Driven Messaging and Actions using Apache Flink and Apache NiFi
Event-Driven Messaging and Actions using Apache Flink and Apache NiFiEvent-Driven Messaging and Actions using Apache Flink and Apache NiFi
Event-Driven Messaging and Actions using Apache Flink and Apache NiFi
 
Securing Data in Hybrid on-premise and Cloud Environments using Apache Ranger
Securing Data in Hybrid on-premise and Cloud Environments using Apache RangerSecuring Data in Hybrid on-premise and Cloud Environments using Apache Ranger
Securing Data in Hybrid on-premise and Cloud Environments using Apache Ranger
 
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
 
Computer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near YouComputer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near You
 
Big Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache SparkBig Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
 

Dernier

Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Mark Goldstein
 
UiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPathCommunity
 
Generative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfGenerative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfIngrid Airi González
 
Accelerating Enterprise Software Engineering with Platformless
Accelerating Enterprise Software Engineering with PlatformlessAccelerating Enterprise Software Engineering with Platformless
Accelerating Enterprise Software Engineering with PlatformlessWSO2
 
Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Hiroshi SHIBATA
 
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality AssuranceInflectra
 
Assure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyesAssure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyesThousandEyes
 
Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesKari Kakkonen
 
React Native vs Ionic - The Best Mobile App Framework
React Native vs Ionic - The Best Mobile App FrameworkReact Native vs Ionic - The Best Mobile App Framework
React Native vs Ionic - The Best Mobile App FrameworkPixlogix Infotech
 
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...itnewsafrica
 
A Framework for Development in the AI Age
A Framework for Development in the AI AgeA Framework for Development in the AI Age
A Framework for Development in the AI AgeCprime
 
Generative AI - Gitex v1Generative AI - Gitex v1.pptx
Generative AI - Gitex v1Generative AI - Gitex v1.pptxGenerative AI - Gitex v1Generative AI - Gitex v1.pptx
Generative AI - Gitex v1Generative AI - Gitex v1.pptxfnnc6jmgwh
 
Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfNeo4j
 
Kuma Meshes Part I - The basics - A tutorial
Kuma Meshes Part I - The basics - A tutorialKuma Meshes Part I - The basics - A tutorial
Kuma Meshes Part I - The basics - A tutorialJoão Esperancinha
 
Design pattern talk by Kaya Weers - 2024 (v2)
Design pattern talk by Kaya Weers - 2024 (v2)Design pattern talk by Kaya Weers - 2024 (v2)
Design pattern talk by Kaya Weers - 2024 (v2)Kaya Weers
 
Irene Moetsana-Moeng: Stakeholders in Cybersecurity: Collaborative Defence fo...
Irene Moetsana-Moeng: Stakeholders in Cybersecurity: Collaborative Defence fo...Irene Moetsana-Moeng: Stakeholders in Cybersecurity: Collaborative Defence fo...
Irene Moetsana-Moeng: Stakeholders in Cybersecurity: Collaborative Defence fo...itnewsafrica
 
Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...
Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...
Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...BookNet Canada
 
Microsoft 365 Copilot: How to boost your productivity with AI – Part two: Dat...
Microsoft 365 Copilot: How to boost your productivity with AI – Part two: Dat...Microsoft 365 Copilot: How to boost your productivity with AI – Part two: Dat...
Microsoft 365 Copilot: How to boost your productivity with AI – Part two: Dat...Nikki Chapple
 
Glenn Lazarus- Why Your Observability Strategy Needs Security Observability
Glenn Lazarus- Why Your Observability Strategy Needs Security ObservabilityGlenn Lazarus- Why Your Observability Strategy Needs Security Observability
Glenn Lazarus- Why Your Observability Strategy Needs Security Observabilityitnewsafrica
 
2024 April Patch Tuesday
2024 April Patch Tuesday2024 April Patch Tuesday
2024 April Patch TuesdayIvanti
 

Dernier (20)

Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
 
UiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to Hero
 
Generative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfGenerative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdf
 
Accelerating Enterprise Software Engineering with Platformless
Accelerating Enterprise Software Engineering with PlatformlessAccelerating Enterprise Software Engineering with Platformless
Accelerating Enterprise Software Engineering with Platformless
 
Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024
 
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
 
Assure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyesAssure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyes
 
Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examples
 
React Native vs Ionic - The Best Mobile App Framework
React Native vs Ionic - The Best Mobile App FrameworkReact Native vs Ionic - The Best Mobile App Framework
React Native vs Ionic - The Best Mobile App Framework
 
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...
 
A Framework for Development in the AI Age
A Framework for Development in the AI AgeA Framework for Development in the AI Age
A Framework for Development in the AI Age
 
Generative AI - Gitex v1Generative AI - Gitex v1.pptx
Generative AI - Gitex v1Generative AI - Gitex v1.pptxGenerative AI - Gitex v1Generative AI - Gitex v1.pptx
Generative AI - Gitex v1Generative AI - Gitex v1.pptx
 
Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdf
 
Kuma Meshes Part I - The basics - A tutorial
Kuma Meshes Part I - The basics - A tutorialKuma Meshes Part I - The basics - A tutorial
Kuma Meshes Part I - The basics - A tutorial
 
Design pattern talk by Kaya Weers - 2024 (v2)
Design pattern talk by Kaya Weers - 2024 (v2)Design pattern talk by Kaya Weers - 2024 (v2)
Design pattern talk by Kaya Weers - 2024 (v2)
 
Irene Moetsana-Moeng: Stakeholders in Cybersecurity: Collaborative Defence fo...
Irene Moetsana-Moeng: Stakeholders in Cybersecurity: Collaborative Defence fo...Irene Moetsana-Moeng: Stakeholders in Cybersecurity: Collaborative Defence fo...
Irene Moetsana-Moeng: Stakeholders in Cybersecurity: Collaborative Defence fo...
 
Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...
Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...
Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...
 
Microsoft 365 Copilot: How to boost your productivity with AI – Part two: Dat...
Microsoft 365 Copilot: How to boost your productivity with AI – Part two: Dat...Microsoft 365 Copilot: How to boost your productivity with AI – Part two: Dat...
Microsoft 365 Copilot: How to boost your productivity with AI – Part two: Dat...
 
Glenn Lazarus- Why Your Observability Strategy Needs Security Observability
Glenn Lazarus- Why Your Observability Strategy Needs Security ObservabilityGlenn Lazarus- Why Your Observability Strategy Needs Security Observability
Glenn Lazarus- Why Your Observability Strategy Needs Security Observability
 
2024 April Patch Tuesday
2024 April Patch Tuesday2024 April Patch Tuesday
2024 April Patch Tuesday
 

Hadoop Cluster Management

  • 1. Hadoop Cluster Management Dheeraj Kapur Principal Engineer, Yahoo! dheerajk@yahoo-inc.com
  • 2. What it is… §  Workflow based system for cluster management. §  Completely modular & distributed design. §  Has its own JMX based library(can be used to monitor other services on cluster). §  Fully controllable from WebUI. §  Has command line utility for adhoc administration. 2
  • 3. What it does… §  Manage clusters. §  Break fixing. §  Upgrades OS seamlessly. §  Consistency/efficiency of clusters. §  Proactive self-healing Model. §  User Management. 3
  • 4. Manage Clusters §  Its has well defined workflow to manage clusters. §  No/Minimal human intervention required. §  Keep up efficiency of cluster. §  Keep track of Missing/Bad blocks on system. §  Well defined WebUI and Command line utility 4
  • 11. fixing bad/mal-performing nodes These errors can lead to SLA miss or Job failures §  Takes care of Blacklisted JT nodes. §  Errors like high load average, wrong network speed. §  Parse system logs at X frequency (thru workflows) and look for patterns. §  Visit each node multiple times in a day and check health of node. 11
  • 12. Upgrade OS §  Upgrade & rollback OS seamlessly. §  Upgrading on production, heavily used clusters. 12
  • 13. Consistency & efficiency of clusters §  Keep track of cluster MR capacity §  Proactive Fixing of sick nodes, which can cause potential issues. 13
  • 14. Introducing Proactive self-healing system Let me set the ground for it. §  Wounded hosts Called Set A - Hosts having issues, but still in service (with degraded services), Which can cause potential SLA misses and job execution issues.(which we have seen in past) §  Fractured Hosts Called Set B - Hosts already in Break fix cycle and getting fixed §  All grid hosts Called Set X - all grid hosts healthy + fine §  Set A & B are sub-set of set X §  to find wounded hosts we have to scan entire infrastructure once a day. §  Calculate Symmetric difference b/w Set A & B, we will get actual wounded hosts needs service. 14
  • 15. Proactive self-healing contd…. All Grid Hosts - X Set A Set B 15
  • 17. User Management §  We have one of the most complex and secure environment. §  User access and management is a complex task, due to the number of users, security constraints and complexity involved in provisioning access. §  Single request provisioning requires change at multiple places. §  Well defined workflow based system, where 100% automation is achieved. §  Great help during system audit and compliance. 17
  • 19. Thank You 19
  • 20. Sessions will resume at 4:30pm Page 20