SlideShare une entreprise Scribd logo
1  sur  19
Oozie High Availability 
Hadoop User Group meetup 10/15/14 
Robert Kanter (Cloudera) 
Ryota Egashira (Yahoo!)
Agenda 
● What is High Availability? 
● Architectural Overview 
● Security 
● Authentication 
● HCatalog Integration in HA 
● Other Challenges in HA 
● Future Work
What is High Availability? 
● A system without non-planned downtime when partial 
failures occur 
o Typically achieved by having redundancies and 
removing single-points of failure 
● Our Goals 
o Don’t change the API or usage patterns 
o User doesn’t even have to know it’s HA
Architectural Overview: 
Database 
● Oozie stores most of its state in a database 
o (submitted jobs, workflow definitions, etc) 
● Instead of a failover model, we want to run many Oozie 
servers against the same database 
o Active-Active HA 
o Also provides horizontal scalability 
● Zookeeper for coordination
Architectural Overview: 
Access 
● Users and client programs need a single address to 
connect to 
o Web UI, REST/Java API, 
JobTracker/ResourceManager callbacks, etc 
● Load balancer, Virtual IP, or DNS round-robin can be 
used to provide a single entry point to the Oozie servers 
o Technically also needs to be HA
Architectural Overview: 
Log Streaming 
● Oozie’s log files are not in the database 
o Each Oozie server only has access to its own logs 
● Jobs are not assigned to a specific Oozie server 
● What if the user asks Oozie Server 1 for logs for a job 
processed by Oozie Server 2? 
o Oozie Server 1 can ask Oozie Server 2 for its logs 
● Caveat: If an Oozie Server goes down, any logs from it 
will be unavailable
Security 
● Existing security features continue to work 
● authentication tokens 
o Signed cookies for authenticating users to Oozie server 
o Each Oozie server uses it’s own randomly generated secret 
 Problem: Won’t accept cookies signed by other Oozie 
servers 
● Hadoop-auth in Hadoop 2.6.0 will add support for pluggable secret 
providers 
o Includes a Zookeeper-backed implementation that 
synchronizes a rolling randomly generated secret across 
multiple servers 
 No locking required!
Authentication 
Load 
Balancer 
Oozie 
Server 1 
Oozie 
Server 2 
Oracle DB 
Zookeeper 
Hadoop Cluster 
user submit request 
Load balanced request 
redirection 
Inter server communication for 
log streaming, sharelib.etc 
Zookeeper for lock and 
management 
Apache 
Curator
Authentication 
Load 
Balancer 
Oozie 
Server 1 
Oozie 
Server 2 
Oracle DB 
Zookeeper 
Hadoop Cluster 
user submit request 
Security: https + kerberos / 
cookie-based auth 
Load balanced request 
redirection 
Security: https + 
kerberos / cookie-based- 
auth 
Inter server communication for 
log streaming, sharelib.etc 
Security: https+kerberos 
Zookeeper for lock and 
management 
Security: Kerberos 
Security: kerberos 
Apache 
Curator
HCatalog Integration in HA 
• HCatalog : metadata management service for HDFS datasets 
– Oozie receive notification from Hcatalog through JMS 
(e.g., ActiveMQ) 
– Start a job immediately after data becomes ready 
Oozie 1 
JMS 
1. Query/Poll Partition 
(e.g, ActiveMQ) 
HCatalog 
3. Push notification 
<New Partition> 
2. Register Topic 
4. Notify New Partition 
Oozie 2 
Job
HCatalog Integration in HA 
• To support HA 
– Keep consistency in in-memory data structure 
• store list of jobs waiting for a data partition 
– Create and cleanup topic listener for JMS 
Oozie 1 HCatalog 
3. Push notification 
<New Partition> 
1. Query/Poll Partition 
2. Register Topic 
4. Notify New Partition 
Job 
Oozie 2 
JMS 
(e.g, ActiveMQ)
Other Challenges in HA 
• SLA support 
– Oozie has in-memory data structure to track sla status for 
each job (start/duration/end met/miss and notifications) 
– add check of sla status against Database 
– use ZK lock to synchronize update on the same job from 
multiple servers. 
• Distributed Locks 
– Reentrant distributed lock using Apache Curator + 
Zookeeper
Other Challenges in HA 
● Distributed Job ID 
o Maintain distributed sequence number for Job ID 
using Apache Curator + Zookeeper 
● Zookeeper Failure Handling 
o Oozie servers automatically shutdown when 
Zookeeper is down
Future work 
• Learn from experience for stability 
– At Y!, HA running on non-prod grids >1 month, and 
prod deployment in Q4 
• Faster job fail-over 
– currently wait for a thread (Recovery Service) to pick 
non-progressing jobs every few minutes 
– Oozie server should immediately notice when other 
server is down and fail-over job (e.g, using ZK 
watcher) 
• Improve log streaming

Contenu connexe

Tendances

Oozie &amp; sqoop by pradeep
Oozie &amp; sqoop by pradeepOozie &amp; sqoop by pradeep
Oozie &amp; sqoop by pradeepPradeep Pandey
 
Data Pipeline Management Framework on Oozie
Data Pipeline Management Framework on OozieData Pipeline Management Framework on Oozie
Data Pipeline Management Framework on OozieShareThis
 
SolrCloud Cluster management via APIs
SolrCloud Cluster management via APIsSolrCloud Cluster management via APIs
SolrCloud Cluster management via APIsAnshum Gupta
 
Everything you wanted to know, but were afraid to ask about Oozie
Everything you wanted to know, but were afraid to ask about OozieEverything you wanted to know, but were afraid to ask about Oozie
Everything you wanted to know, but were afraid to ask about OozieChicago Hadoop Users Group
 
Infrastructure modeling with chef
Infrastructure modeling with chefInfrastructure modeling with chef
Infrastructure modeling with chefCharles Johnson
 
Solr security frameworks
Solr security frameworksSolr security frameworks
Solr security frameworksAnshum Gupta
 
Oracle Traffic Director - a vital part of your Oracle infrastructure
Oracle Traffic Director - a vital part of your Oracle infrastructureOracle Traffic Director - a vital part of your Oracle infrastructure
Oracle Traffic Director - a vital part of your Oracle infrastructureSimon Haslam
 
Understanding the Solr security framework - Lucene Solr Revolution 2015
Understanding the Solr security framework - Lucene Solr Revolution 2015Understanding the Solr security framework - Lucene Solr Revolution 2015
Understanding the Solr security framework - Lucene Solr Revolution 2015Anshum Gupta
 
WebLogic Scripting Tool made Cool!
WebLogic Scripting Tool made Cool!WebLogic Scripting Tool made Cool!
WebLogic Scripting Tool made Cool!Maarten Smeets
 
Ambari Meetup: What's New in Ambari
Ambari Meetup: What's New in AmbariAmbari Meetup: What's New in Ambari
Ambari Meetup: What's New in AmbariHortonworks
 
WebLogic on ODA - Oracle Open World 2013
WebLogic on ODA - Oracle Open World 2013WebLogic on ODA - Oracle Open World 2013
WebLogic on ODA - Oracle Open World 2013Michel Schildmeijer
 
WebLogic authentication debugging
WebLogic authentication debuggingWebLogic authentication debugging
WebLogic authentication debuggingMaarten Smeets
 
Ansible for large scale deployment
Ansible for large scale deploymentAnsible for large scale deployment
Ansible for large scale deploymentKarthik .P.R
 
Oracle Unified Directory. Lessons learnt. Is it ready for a move from OID? (O...
Oracle Unified Directory. Lessons learnt. Is it ready for a move from OID? (O...Oracle Unified Directory. Lessons learnt. Is it ready for a move from OID? (O...
Oracle Unified Directory. Lessons learnt. Is it ready for a move from OID? (O...Andrejs Prokopjevs
 
AWS Observability Made Simple
AWS Observability Made SimpleAWS Observability Made Simple
AWS Observability Made SimpleLuciano Mammino
 
Oozie HUG May12
Oozie HUG May12Oozie HUG May12
Oozie HUG May12mislam77
 
May 2012 HUG: Oozie: Towards a scalable Workflow Management System for Hadoop
May 2012 HUG: Oozie: Towards a scalable Workflow Management System for HadoopMay 2012 HUG: Oozie: Towards a scalable Workflow Management System for Hadoop
May 2012 HUG: Oozie: Towards a scalable Workflow Management System for HadoopYahoo Developer Network
 
Oracle database 12.2 new features
Oracle database 12.2 new featuresOracle database 12.2 new features
Oracle database 12.2 new featuresAlfredo Krieg
 

Tendances (20)

Oozie &amp; sqoop by pradeep
Oozie &amp; sqoop by pradeepOozie &amp; sqoop by pradeep
Oozie &amp; sqoop by pradeep
 
Apache Oozie
Apache OozieApache Oozie
Apache Oozie
 
Data Pipeline Management Framework on Oozie
Data Pipeline Management Framework on OozieData Pipeline Management Framework on Oozie
Data Pipeline Management Framework on Oozie
 
SolrCloud Cluster management via APIs
SolrCloud Cluster management via APIsSolrCloud Cluster management via APIs
SolrCloud Cluster management via APIs
 
Everything you wanted to know, but were afraid to ask about Oozie
Everything you wanted to know, but were afraid to ask about OozieEverything you wanted to know, but were afraid to ask about Oozie
Everything you wanted to know, but were afraid to ask about Oozie
 
Infrastructure modeling with chef
Infrastructure modeling with chefInfrastructure modeling with chef
Infrastructure modeling with chef
 
Solr security frameworks
Solr security frameworksSolr security frameworks
Solr security frameworks
 
Oracle Traffic Director - a vital part of your Oracle infrastructure
Oracle Traffic Director - a vital part of your Oracle infrastructureOracle Traffic Director - a vital part of your Oracle infrastructure
Oracle Traffic Director - a vital part of your Oracle infrastructure
 
Understanding the Solr security framework - Lucene Solr Revolution 2015
Understanding the Solr security framework - Lucene Solr Revolution 2015Understanding the Solr security framework - Lucene Solr Revolution 2015
Understanding the Solr security framework - Lucene Solr Revolution 2015
 
WebLogic Scripting Tool made Cool!
WebLogic Scripting Tool made Cool!WebLogic Scripting Tool made Cool!
WebLogic Scripting Tool made Cool!
 
Ambari Meetup: What's New in Ambari
Ambari Meetup: What's New in AmbariAmbari Meetup: What's New in Ambari
Ambari Meetup: What's New in Ambari
 
WebLogic on ODA - Oracle Open World 2013
WebLogic on ODA - Oracle Open World 2013WebLogic on ODA - Oracle Open World 2013
WebLogic on ODA - Oracle Open World 2013
 
WebLogic authentication debugging
WebLogic authentication debuggingWebLogic authentication debugging
WebLogic authentication debugging
 
Ansible for large scale deployment
Ansible for large scale deploymentAnsible for large scale deployment
Ansible for large scale deployment
 
Oracle Unified Directory. Lessons learnt. Is it ready for a move from OID? (O...
Oracle Unified Directory. Lessons learnt. Is it ready for a move from OID? (O...Oracle Unified Directory. Lessons learnt. Is it ready for a move from OID? (O...
Oracle Unified Directory. Lessons learnt. Is it ready for a move from OID? (O...
 
AWS Observability Made Simple
AWS Observability Made SimpleAWS Observability Made Simple
AWS Observability Made Simple
 
What's new in chef 12
What's new in chef 12 What's new in chef 12
What's new in chef 12
 
Oozie HUG May12
Oozie HUG May12Oozie HUG May12
Oozie HUG May12
 
May 2012 HUG: Oozie: Towards a scalable Workflow Management System for Hadoop
May 2012 HUG: Oozie: Towards a scalable Workflow Management System for HadoopMay 2012 HUG: Oozie: Towards a scalable Workflow Management System for Hadoop
May 2012 HUG: Oozie: Towards a scalable Workflow Management System for Hadoop
 
Oracle database 12.2 new features
Oracle database 12.2 new featuresOracle database 12.2 new features
Oracle database 12.2 new features
 

En vedette

Ilex beller schteitale
Ilex beller schteitaleIlex beller schteitale
Ilex beller schteitaleEugene Dvorkin
 
Oozie meetup - HA + Cron Scheduling
Oozie meetup - HA + Cron SchedulingOozie meetup - HA + Cron Scheduling
Oozie meetup - HA + Cron SchedulingMona Chitnis
 
Psicointegración (una técnica para integrar la mente)
Psicointegración (una técnica para integrar la mente)Psicointegración (una técnica para integrar la mente)
Psicointegración (una técnica para integrar la mente)0 agape
 
Grupo Elron - Psicoauditación Vera-El
Grupo Elron - Psicoauditación Vera-ElGrupo Elron - Psicoauditación Vera-El
Grupo Elron - Psicoauditación Vera-El0 agape
 
July 2012 HUG: Overview of Oozie Qualification Process
July 2012 HUG: Overview of Oozie Qualification ProcessJuly 2012 HUG: Overview of Oozie Qualification Process
July 2012 HUG: Overview of Oozie Qualification ProcessYahoo Developer Network
 
Apache Hadoop India Summit 2011 talk "Oozie - Workflow for Hadoop" by Andreas N
Apache Hadoop India Summit 2011 talk "Oozie - Workflow for Hadoop" by Andreas NApache Hadoop India Summit 2011 talk "Oozie - Workflow for Hadoop" by Andreas N
Apache Hadoop India Summit 2011 talk "Oozie - Workflow for Hadoop" by Andreas NYahoo Developer Network
 
Learning Stream Processing with Apache Storm
Learning Stream Processing with Apache StormLearning Stream Processing with Apache Storm
Learning Stream Processing with Apache StormEugene Dvorkin
 
Building and managing complex dependencies pipeline using Apache Oozie
Building and managing complex dependencies pipeline using Apache OozieBuilding and managing complex dependencies pipeline using Apache Oozie
Building and managing complex dependencies pipeline using Apache OozieDataWorks Summit/Hadoop Summit
 
A Basic Hive Inspection
A Basic Hive InspectionA Basic Hive Inspection
A Basic Hive InspectionLinda Tillman
 
Método clínico
Método clínicoMétodo clínico
Método clínicoRuTh Licea
 
HIVE: Data Warehousing & Analytics on Hadoop
HIVE: Data Warehousing & Analytics on HadoopHIVE: Data Warehousing & Analytics on Hadoop
HIVE: Data Warehousing & Analytics on HadoopZheng Shao
 
August 2016 HUG: Recent development in Apache Oozie
August 2016 HUG: Recent development in Apache OozieAugust 2016 HUG: Recent development in Apache Oozie
August 2016 HUG: Recent development in Apache OozieYahoo Developer Network
 
Hive on spark is blazing fast or is it final
Hive on spark is blazing fast or is it finalHive on spark is blazing fast or is it final
Hive on spark is blazing fast or is it finalHortonworks
 

En vedette (15)

Ilex beller schteitale
Ilex beller schteitaleIlex beller schteitale
Ilex beller schteitale
 
Oozie meetup - HA + Cron Scheduling
Oozie meetup - HA + Cron SchedulingOozie meetup - HA + Cron Scheduling
Oozie meetup - HA + Cron Scheduling
 
Psicointegración (una técnica para integrar la mente)
Psicointegración (una técnica para integrar la mente)Psicointegración (una técnica para integrar la mente)
Psicointegración (una técnica para integrar la mente)
 
Grupo Elron - Psicoauditación Vera-El
Grupo Elron - Psicoauditación Vera-ElGrupo Elron - Psicoauditación Vera-El
Grupo Elron - Psicoauditación Vera-El
 
July 2012 HUG: Overview of Oozie Qualification Process
July 2012 HUG: Overview of Oozie Qualification ProcessJuly 2012 HUG: Overview of Oozie Qualification Process
July 2012 HUG: Overview of Oozie Qualification Process
 
Apache Hadoop India Summit 2011 talk "Oozie - Workflow for Hadoop" by Andreas N
Apache Hadoop India Summit 2011 talk "Oozie - Workflow for Hadoop" by Andreas NApache Hadoop India Summit 2011 talk "Oozie - Workflow for Hadoop" by Andreas N
Apache Hadoop India Summit 2011 talk "Oozie - Workflow for Hadoop" by Andreas N
 
Learning Stream Processing with Apache Storm
Learning Stream Processing with Apache StormLearning Stream Processing with Apache Storm
Learning Stream Processing with Apache Storm
 
Building and managing complex dependencies pipeline using Apache Oozie
Building and managing complex dependencies pipeline using Apache OozieBuilding and managing complex dependencies pipeline using Apache Oozie
Building and managing complex dependencies pipeline using Apache Oozie
 
A Basic Hive Inspection
A Basic Hive InspectionA Basic Hive Inspection
A Basic Hive Inspection
 
Método clínico
Método clínicoMétodo clínico
Método clínico
 
Hive tuning
Hive tuningHive tuning
Hive tuning
 
Método clínico
Método clínicoMétodo clínico
Método clínico
 
HIVE: Data Warehousing & Analytics on Hadoop
HIVE: Data Warehousing & Analytics on HadoopHIVE: Data Warehousing & Analytics on Hadoop
HIVE: Data Warehousing & Analytics on Hadoop
 
August 2016 HUG: Recent development in Apache Oozie
August 2016 HUG: Recent development in Apache OozieAugust 2016 HUG: Recent development in Apache Oozie
August 2016 HUG: Recent development in Apache Oozie
 
Hive on spark is blazing fast or is it final
Hive on spark is blazing fast or is it finalHive on spark is blazing fast or is it final
Hive on spark is blazing fast or is it final
 

Similaire à October 2014 HUG : Oozie HA

oozieee.pdf
oozieee.pdfoozieee.pdf
oozieee.pdfwwww63
 
Workflow Engines for Hadoop
Workflow Engines for HadoopWorkflow Engines for Hadoop
Workflow Engines for HadoopJoe Crobak
 
CIS 2015- Building IAM for OpenStack- Steve Martinelli
CIS 2015- Building IAM for OpenStack- Steve MartinelliCIS 2015- Building IAM for OpenStack- Steve Martinelli
CIS 2015- Building IAM for OpenStack- Steve MartinelliCloudIDSummit
 
Alluxio 2.0 & Near Real-time Big Data Platform w/ Spark & Alluxio
Alluxio 2.0 & Near Real-time Big Data Platform w/ Spark & AlluxioAlluxio 2.0 & Near Real-time Big Data Platform w/ Spark & Alluxio
Alluxio 2.0 & Near Real-time Big Data Platform w/ Spark & AlluxioAlluxio, Inc.
 
Alluxio+Presto: An Architecture for Fast SQL in the Cloud
Alluxio+Presto: An Architecture for Fast SQL in the CloudAlluxio+Presto: An Architecture for Fast SQL in the Cloud
Alluxio+Presto: An Architecture for Fast SQL in the CloudAlluxio, Inc.
 
Shopzilla On Concurrency
Shopzilla On ConcurrencyShopzilla On Concurrency
Shopzilla On ConcurrencyRodney Barlow
 
What is New in Apache Hive 3.0?
What is New in Apache Hive 3.0?What is New in Apache Hive 3.0?
What is New in Apache Hive 3.0?DataWorks Summit
 
Hive 3 New Horizons DataWorks Summit Melbourne February 2019
Hive 3 New Horizons DataWorks Summit Melbourne February 2019Hive 3 New Horizons DataWorks Summit Melbourne February 2019
Hive 3 New Horizons DataWorks Summit Melbourne February 2019alanfgates
 
What is new in Apache Hive 3.0?
What is new in Apache Hive 3.0?What is new in Apache Hive 3.0?
What is new in Apache Hive 3.0?DataWorks Summit
 
Ahmedabad MuleSoft Meetup #4
Ahmedabad MuleSoft Meetup #4Ahmedabad MuleSoft Meetup #4
Ahmedabad MuleSoft Meetup #4Tejas Purohit
 
HLoader – Automated Incremental Hadoop Data Loader Service and Framework
HLoader – Automated Incremental Hadoop Data Loader Service and FrameworkHLoader – Automated Incremental Hadoop Data Loader Service and Framework
HLoader – Automated Incremental Hadoop Data Loader Service and FrameworkDániel Stein
 
What's New in Apache Hive 3.0?
What's New in Apache Hive 3.0?What's New in Apache Hive 3.0?
What's New in Apache Hive 3.0?DataWorks Summit
 
What's New in Apache Hive 3.0 - Tokyo
What's New in Apache Hive 3.0 - TokyoWhat's New in Apache Hive 3.0 - Tokyo
What's New in Apache Hive 3.0 - TokyoDataWorks Summit
 
Monitoring in a fast-changing world with Prometheus
Monitoring in a fast-changing world with PrometheusMonitoring in a fast-changing world with Prometheus
Monitoring in a fast-changing world with PrometheusJulien Pivotto
 
Plain english guide to drupal 8 criticals
Plain english guide to drupal 8 criticalsPlain english guide to drupal 8 criticals
Plain english guide to drupal 8 criticalsAngela Byron
 
12-Step Program for Scaling Web Applications on PostgreSQL
12-Step Program for Scaling Web Applications on PostgreSQL12-Step Program for Scaling Web Applications on PostgreSQL
12-Step Program for Scaling Web Applications on PostgreSQLKonstantin Gredeskoul
 
OTN Developer Days - GlassFish
OTN Developer Days - GlassFishOTN Developer Days - GlassFish
OTN Developer Days - GlassFishglassfish
 

Similaire à October 2014 HUG : Oozie HA (20)

Hadoop Oozie
Hadoop OozieHadoop Oozie
Hadoop Oozie
 
oozieee.pdf
oozieee.pdfoozieee.pdf
oozieee.pdf
 
Workflow Engines for Hadoop
Workflow Engines for HadoopWorkflow Engines for Hadoop
Workflow Engines for Hadoop
 
CIS 2015- Building IAM for OpenStack- Steve Martinelli
CIS 2015- Building IAM for OpenStack- Steve MartinelliCIS 2015- Building IAM for OpenStack- Steve Martinelli
CIS 2015- Building IAM for OpenStack- Steve Martinelli
 
Alluxio 2.0 & Near Real-time Big Data Platform w/ Spark & Alluxio
Alluxio 2.0 & Near Real-time Big Data Platform w/ Spark & AlluxioAlluxio 2.0 & Near Real-time Big Data Platform w/ Spark & Alluxio
Alluxio 2.0 & Near Real-time Big Data Platform w/ Spark & Alluxio
 
Alluxio+Presto: An Architecture for Fast SQL in the Cloud
Alluxio+Presto: An Architecture for Fast SQL in the CloudAlluxio+Presto: An Architecture for Fast SQL in the Cloud
Alluxio+Presto: An Architecture for Fast SQL in the Cloud
 
Shopzilla On Concurrency
Shopzilla On ConcurrencyShopzilla On Concurrency
Shopzilla On Concurrency
 
What is New in Apache Hive 3.0?
What is New in Apache Hive 3.0?What is New in Apache Hive 3.0?
What is New in Apache Hive 3.0?
 
Hive 3 New Horizons DataWorks Summit Melbourne February 2019
Hive 3 New Horizons DataWorks Summit Melbourne February 2019Hive 3 New Horizons DataWorks Summit Melbourne February 2019
Hive 3 New Horizons DataWorks Summit Melbourne February 2019
 
What is new in Apache Hive 3.0?
What is new in Apache Hive 3.0?What is new in Apache Hive 3.0?
What is new in Apache Hive 3.0?
 
Apache Oozie
Apache OozieApache Oozie
Apache Oozie
 
Ahmedabad MuleSoft Meetup #4
Ahmedabad MuleSoft Meetup #4Ahmedabad MuleSoft Meetup #4
Ahmedabad MuleSoft Meetup #4
 
GoDocker presentation
GoDocker presentationGoDocker presentation
GoDocker presentation
 
HLoader – Automated Incremental Hadoop Data Loader Service and Framework
HLoader – Automated Incremental Hadoop Data Loader Service and FrameworkHLoader – Automated Incremental Hadoop Data Loader Service and Framework
HLoader – Automated Incremental Hadoop Data Loader Service and Framework
 
What's New in Apache Hive 3.0?
What's New in Apache Hive 3.0?What's New in Apache Hive 3.0?
What's New in Apache Hive 3.0?
 
What's New in Apache Hive 3.0 - Tokyo
What's New in Apache Hive 3.0 - TokyoWhat's New in Apache Hive 3.0 - Tokyo
What's New in Apache Hive 3.0 - Tokyo
 
Monitoring in a fast-changing world with Prometheus
Monitoring in a fast-changing world with PrometheusMonitoring in a fast-changing world with Prometheus
Monitoring in a fast-changing world with Prometheus
 
Plain english guide to drupal 8 criticals
Plain english guide to drupal 8 criticalsPlain english guide to drupal 8 criticals
Plain english guide to drupal 8 criticals
 
12-Step Program for Scaling Web Applications on PostgreSQL
12-Step Program for Scaling Web Applications on PostgreSQL12-Step Program for Scaling Web Applications on PostgreSQL
12-Step Program for Scaling Web Applications on PostgreSQL
 
OTN Developer Days - GlassFish
OTN Developer Days - GlassFishOTN Developer Days - GlassFish
OTN Developer Days - GlassFish
 

Plus de Yahoo Developer Network

Developing Mobile Apps for Performance - Swapnil Patel, Verizon Media
Developing Mobile Apps for Performance - Swapnil Patel, Verizon MediaDeveloping Mobile Apps for Performance - Swapnil Patel, Verizon Media
Developing Mobile Apps for Performance - Swapnil Patel, Verizon MediaYahoo Developer Network
 
Athenz - The Open-Source Solution to Provide Access Control in Dynamic Infras...
Athenz - The Open-Source Solution to Provide Access Control in Dynamic Infras...Athenz - The Open-Source Solution to Provide Access Control in Dynamic Infras...
Athenz - The Open-Source Solution to Provide Access Control in Dynamic Infras...Yahoo Developer Network
 
Athenz & SPIFFE, Tatsuya Yano, Yahoo Japan
Athenz & SPIFFE, Tatsuya Yano, Yahoo JapanAthenz & SPIFFE, Tatsuya Yano, Yahoo Japan
Athenz & SPIFFE, Tatsuya Yano, Yahoo JapanYahoo Developer Network
 
Athenz with Istio - Single Access Control Model in Cloud Infrastructures, Tat...
Athenz with Istio - Single Access Control Model in Cloud Infrastructures, Tat...Athenz with Istio - Single Access Control Model in Cloud Infrastructures, Tat...
Athenz with Istio - Single Access Control Model in Cloud Infrastructures, Tat...Yahoo Developer Network
 
Big Data Serving with Vespa - Jon Bratseth, Distinguished Architect, Oath
Big Data Serving with Vespa - Jon Bratseth, Distinguished Architect, OathBig Data Serving with Vespa - Jon Bratseth, Distinguished Architect, Oath
Big Data Serving with Vespa - Jon Bratseth, Distinguished Architect, OathYahoo Developer Network
 
How @TwitterHadoop Chose Google Cloud, Joep Rottinghuis, Lohit VijayaRenu
How @TwitterHadoop Chose Google Cloud, Joep Rottinghuis, Lohit VijayaRenuHow @TwitterHadoop Chose Google Cloud, Joep Rottinghuis, Lohit VijayaRenu
How @TwitterHadoop Chose Google Cloud, Joep Rottinghuis, Lohit VijayaRenuYahoo Developer Network
 
The Future of Hadoop in an AI World, Milind Bhandarkar, CEO, Ampool
The Future of Hadoop in an AI World, Milind Bhandarkar, CEO, AmpoolThe Future of Hadoop in an AI World, Milind Bhandarkar, CEO, Ampool
The Future of Hadoop in an AI World, Milind Bhandarkar, CEO, AmpoolYahoo Developer Network
 
Apache YARN Federation and Tez at Microsoft, Anupam Upadhyay, Adrian Nicoara,...
Apache YARN Federation and Tez at Microsoft, Anupam Upadhyay, Adrian Nicoara,...Apache YARN Federation and Tez at Microsoft, Anupam Upadhyay, Adrian Nicoara,...
Apache YARN Federation and Tez at Microsoft, Anupam Upadhyay, Adrian Nicoara,...Yahoo Developer Network
 
Containerized Services on Apache Hadoop YARN: Past, Present, and Future, Shan...
Containerized Services on Apache Hadoop YARN: Past, Present, and Future, Shan...Containerized Services on Apache Hadoop YARN: Past, Present, and Future, Shan...
Containerized Services on Apache Hadoop YARN: Past, Present, and Future, Shan...Yahoo Developer Network
 
HDFS Scalability and Security, Daryn Sharp, Senior Engineer, Oath
HDFS Scalability and Security, Daryn Sharp, Senior Engineer, OathHDFS Scalability and Security, Daryn Sharp, Senior Engineer, Oath
HDFS Scalability and Security, Daryn Sharp, Senior Engineer, OathYahoo Developer Network
 
Hadoop {Submarine} Project: Running deep learning workloads on YARN, Wangda T...
Hadoop {Submarine} Project: Running deep learning workloads on YARN, Wangda T...Hadoop {Submarine} Project: Running deep learning workloads on YARN, Wangda T...
Hadoop {Submarine} Project: Running deep learning workloads on YARN, Wangda T...Yahoo Developer Network
 
Moving the Oath Grid to Docker, Eric Badger, Oath
Moving the Oath Grid to Docker, Eric Badger, OathMoving the Oath Grid to Docker, Eric Badger, Oath
Moving the Oath Grid to Docker, Eric Badger, OathYahoo Developer Network
 
Architecting Petabyte Scale AI Applications
Architecting Petabyte Scale AI ApplicationsArchitecting Petabyte Scale AI Applications
Architecting Petabyte Scale AI ApplicationsYahoo Developer Network
 
Introduction to Vespa – The Open Source Big Data Serving Engine, Jon Bratseth...
Introduction to Vespa – The Open Source Big Data Serving Engine, Jon Bratseth...Introduction to Vespa – The Open Source Big Data Serving Engine, Jon Bratseth...
Introduction to Vespa – The Open Source Big Data Serving Engine, Jon Bratseth...Yahoo Developer Network
 
Jun 2017 HUG: YARN Scheduling – A Step Beyond
Jun 2017 HUG: YARN Scheduling – A Step BeyondJun 2017 HUG: YARN Scheduling – A Step Beyond
Jun 2017 HUG: YARN Scheduling – A Step BeyondYahoo Developer Network
 
Jun 2017 HUG: Large-Scale Machine Learning: Use Cases and Technologies
Jun 2017 HUG: Large-Scale Machine Learning: Use Cases and Technologies Jun 2017 HUG: Large-Scale Machine Learning: Use Cases and Technologies
Jun 2017 HUG: Large-Scale Machine Learning: Use Cases and Technologies Yahoo Developer Network
 
February 2017 HUG: Slow, Stuck, or Runaway Apps? Learn How to Quickly Fix Pro...
February 2017 HUG: Slow, Stuck, or Runaway Apps? Learn How to Quickly Fix Pro...February 2017 HUG: Slow, Stuck, or Runaway Apps? Learn How to Quickly Fix Pro...
February 2017 HUG: Slow, Stuck, or Runaway Apps? Learn How to Quickly Fix Pro...Yahoo Developer Network
 
February 2017 HUG: Exactly-once end-to-end processing with Apache Apex
February 2017 HUG: Exactly-once end-to-end processing with Apache ApexFebruary 2017 HUG: Exactly-once end-to-end processing with Apache Apex
February 2017 HUG: Exactly-once end-to-end processing with Apache ApexYahoo Developer Network
 
February 2017 HUG: Data Sketches: A required toolkit for Big Data Analytics
February 2017 HUG: Data Sketches: A required toolkit for Big Data AnalyticsFebruary 2017 HUG: Data Sketches: A required toolkit for Big Data Analytics
February 2017 HUG: Data Sketches: A required toolkit for Big Data AnalyticsYahoo Developer Network
 

Plus de Yahoo Developer Network (20)

Developing Mobile Apps for Performance - Swapnil Patel, Verizon Media
Developing Mobile Apps for Performance - Swapnil Patel, Verizon MediaDeveloping Mobile Apps for Performance - Swapnil Patel, Verizon Media
Developing Mobile Apps for Performance - Swapnil Patel, Verizon Media
 
Athenz - The Open-Source Solution to Provide Access Control in Dynamic Infras...
Athenz - The Open-Source Solution to Provide Access Control in Dynamic Infras...Athenz - The Open-Source Solution to Provide Access Control in Dynamic Infras...
Athenz - The Open-Source Solution to Provide Access Control in Dynamic Infras...
 
Athenz & SPIFFE, Tatsuya Yano, Yahoo Japan
Athenz & SPIFFE, Tatsuya Yano, Yahoo JapanAthenz & SPIFFE, Tatsuya Yano, Yahoo Japan
Athenz & SPIFFE, Tatsuya Yano, Yahoo Japan
 
Athenz with Istio - Single Access Control Model in Cloud Infrastructures, Tat...
Athenz with Istio - Single Access Control Model in Cloud Infrastructures, Tat...Athenz with Istio - Single Access Control Model in Cloud Infrastructures, Tat...
Athenz with Istio - Single Access Control Model in Cloud Infrastructures, Tat...
 
CICD at Oath using Screwdriver
CICD at Oath using ScrewdriverCICD at Oath using Screwdriver
CICD at Oath using Screwdriver
 
Big Data Serving with Vespa - Jon Bratseth, Distinguished Architect, Oath
Big Data Serving with Vespa - Jon Bratseth, Distinguished Architect, OathBig Data Serving with Vespa - Jon Bratseth, Distinguished Architect, Oath
Big Data Serving with Vespa - Jon Bratseth, Distinguished Architect, Oath
 
How @TwitterHadoop Chose Google Cloud, Joep Rottinghuis, Lohit VijayaRenu
How @TwitterHadoop Chose Google Cloud, Joep Rottinghuis, Lohit VijayaRenuHow @TwitterHadoop Chose Google Cloud, Joep Rottinghuis, Lohit VijayaRenu
How @TwitterHadoop Chose Google Cloud, Joep Rottinghuis, Lohit VijayaRenu
 
The Future of Hadoop in an AI World, Milind Bhandarkar, CEO, Ampool
The Future of Hadoop in an AI World, Milind Bhandarkar, CEO, AmpoolThe Future of Hadoop in an AI World, Milind Bhandarkar, CEO, Ampool
The Future of Hadoop in an AI World, Milind Bhandarkar, CEO, Ampool
 
Apache YARN Federation and Tez at Microsoft, Anupam Upadhyay, Adrian Nicoara,...
Apache YARN Federation and Tez at Microsoft, Anupam Upadhyay, Adrian Nicoara,...Apache YARN Federation and Tez at Microsoft, Anupam Upadhyay, Adrian Nicoara,...
Apache YARN Federation and Tez at Microsoft, Anupam Upadhyay, Adrian Nicoara,...
 
Containerized Services on Apache Hadoop YARN: Past, Present, and Future, Shan...
Containerized Services on Apache Hadoop YARN: Past, Present, and Future, Shan...Containerized Services on Apache Hadoop YARN: Past, Present, and Future, Shan...
Containerized Services on Apache Hadoop YARN: Past, Present, and Future, Shan...
 
HDFS Scalability and Security, Daryn Sharp, Senior Engineer, Oath
HDFS Scalability and Security, Daryn Sharp, Senior Engineer, OathHDFS Scalability and Security, Daryn Sharp, Senior Engineer, Oath
HDFS Scalability and Security, Daryn Sharp, Senior Engineer, Oath
 
Hadoop {Submarine} Project: Running deep learning workloads on YARN, Wangda T...
Hadoop {Submarine} Project: Running deep learning workloads on YARN, Wangda T...Hadoop {Submarine} Project: Running deep learning workloads on YARN, Wangda T...
Hadoop {Submarine} Project: Running deep learning workloads on YARN, Wangda T...
 
Moving the Oath Grid to Docker, Eric Badger, Oath
Moving the Oath Grid to Docker, Eric Badger, OathMoving the Oath Grid to Docker, Eric Badger, Oath
Moving the Oath Grid to Docker, Eric Badger, Oath
 
Architecting Petabyte Scale AI Applications
Architecting Petabyte Scale AI ApplicationsArchitecting Petabyte Scale AI Applications
Architecting Petabyte Scale AI Applications
 
Introduction to Vespa – The Open Source Big Data Serving Engine, Jon Bratseth...
Introduction to Vespa – The Open Source Big Data Serving Engine, Jon Bratseth...Introduction to Vespa – The Open Source Big Data Serving Engine, Jon Bratseth...
Introduction to Vespa – The Open Source Big Data Serving Engine, Jon Bratseth...
 
Jun 2017 HUG: YARN Scheduling – A Step Beyond
Jun 2017 HUG: YARN Scheduling – A Step BeyondJun 2017 HUG: YARN Scheduling – A Step Beyond
Jun 2017 HUG: YARN Scheduling – A Step Beyond
 
Jun 2017 HUG: Large-Scale Machine Learning: Use Cases and Technologies
Jun 2017 HUG: Large-Scale Machine Learning: Use Cases and Technologies Jun 2017 HUG: Large-Scale Machine Learning: Use Cases and Technologies
Jun 2017 HUG: Large-Scale Machine Learning: Use Cases and Technologies
 
February 2017 HUG: Slow, Stuck, or Runaway Apps? Learn How to Quickly Fix Pro...
February 2017 HUG: Slow, Stuck, or Runaway Apps? Learn How to Quickly Fix Pro...February 2017 HUG: Slow, Stuck, or Runaway Apps? Learn How to Quickly Fix Pro...
February 2017 HUG: Slow, Stuck, or Runaway Apps? Learn How to Quickly Fix Pro...
 
February 2017 HUG: Exactly-once end-to-end processing with Apache Apex
February 2017 HUG: Exactly-once end-to-end processing with Apache ApexFebruary 2017 HUG: Exactly-once end-to-end processing with Apache Apex
February 2017 HUG: Exactly-once end-to-end processing with Apache Apex
 
February 2017 HUG: Data Sketches: A required toolkit for Big Data Analytics
February 2017 HUG: Data Sketches: A required toolkit for Big Data AnalyticsFebruary 2017 HUG: Data Sketches: A required toolkit for Big Data Analytics
February 2017 HUG: Data Sketches: A required toolkit for Big Data Analytics
 

Dernier

Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersThousandEyes
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...shyamraj55
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
Google AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGGoogle AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGSujit Pal
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitecturePixlogix Infotech
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure servicePooja Nehwal
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxOnBoard
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 

Dernier (20)

Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Google AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGGoogle AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAG
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC Architecture
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptx
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 

October 2014 HUG : Oozie HA

  • 1. Oozie High Availability Hadoop User Group meetup 10/15/14 Robert Kanter (Cloudera) Ryota Egashira (Yahoo!)
  • 2. Agenda ● What is High Availability? ● Architectural Overview ● Security ● Authentication ● HCatalog Integration in HA ● Other Challenges in HA ● Future Work
  • 3. What is High Availability? ● A system without non-planned downtime when partial failures occur o Typically achieved by having redundancies and removing single-points of failure ● Our Goals o Don’t change the API or usage patterns o User doesn’t even have to know it’s HA
  • 4. Architectural Overview: Database ● Oozie stores most of its state in a database o (submitted jobs, workflow definitions, etc) ● Instead of a failover model, we want to run many Oozie servers against the same database o Active-Active HA o Also provides horizontal scalability ● Zookeeper for coordination
  • 5.
  • 6. Architectural Overview: Access ● Users and client programs need a single address to connect to o Web UI, REST/Java API, JobTracker/ResourceManager callbacks, etc ● Load balancer, Virtual IP, or DNS round-robin can be used to provide a single entry point to the Oozie servers o Technically also needs to be HA
  • 7.
  • 8. Architectural Overview: Log Streaming ● Oozie’s log files are not in the database o Each Oozie server only has access to its own logs ● Jobs are not assigned to a specific Oozie server ● What if the user asks Oozie Server 1 for logs for a job processed by Oozie Server 2? o Oozie Server 1 can ask Oozie Server 2 for its logs ● Caveat: If an Oozie Server goes down, any logs from it will be unavailable
  • 9.
  • 10. Security ● Existing security features continue to work ● authentication tokens o Signed cookies for authenticating users to Oozie server o Each Oozie server uses it’s own randomly generated secret  Problem: Won’t accept cookies signed by other Oozie servers ● Hadoop-auth in Hadoop 2.6.0 will add support for pluggable secret providers o Includes a Zookeeper-backed implementation that synchronizes a rolling randomly generated secret across multiple servers  No locking required!
  • 11.
  • 12.
  • 13. Authentication Load Balancer Oozie Server 1 Oozie Server 2 Oracle DB Zookeeper Hadoop Cluster user submit request Load balanced request redirection Inter server communication for log streaming, sharelib.etc Zookeeper for lock and management Apache Curator
  • 14. Authentication Load Balancer Oozie Server 1 Oozie Server 2 Oracle DB Zookeeper Hadoop Cluster user submit request Security: https + kerberos / cookie-based auth Load balanced request redirection Security: https + kerberos / cookie-based- auth Inter server communication for log streaming, sharelib.etc Security: https+kerberos Zookeeper for lock and management Security: Kerberos Security: kerberos Apache Curator
  • 15. HCatalog Integration in HA • HCatalog : metadata management service for HDFS datasets – Oozie receive notification from Hcatalog through JMS (e.g., ActiveMQ) – Start a job immediately after data becomes ready Oozie 1 JMS 1. Query/Poll Partition (e.g, ActiveMQ) HCatalog 3. Push notification <New Partition> 2. Register Topic 4. Notify New Partition Oozie 2 Job
  • 16. HCatalog Integration in HA • To support HA – Keep consistency in in-memory data structure • store list of jobs waiting for a data partition – Create and cleanup topic listener for JMS Oozie 1 HCatalog 3. Push notification <New Partition> 1. Query/Poll Partition 2. Register Topic 4. Notify New Partition Job Oozie 2 JMS (e.g, ActiveMQ)
  • 17. Other Challenges in HA • SLA support – Oozie has in-memory data structure to track sla status for each job (start/duration/end met/miss and notifications) – add check of sla status against Database – use ZK lock to synchronize update on the same job from multiple servers. • Distributed Locks – Reentrant distributed lock using Apache Curator + Zookeeper
  • 18. Other Challenges in HA ● Distributed Job ID o Maintain distributed sequence number for Job ID using Apache Curator + Zookeeper ● Zookeeper Failure Handling o Oozie servers automatically shutdown when Zookeeper is down
  • 19. Future work • Learn from experience for stability – At Y!, HA running on non-prod grids >1 month, and prod deployment in Q4 • Faster job fail-over – currently wait for a thread (Recovery Service) to pick non-progressing jobs every few minutes – Oozie server should immediately notice when other server is down and fail-over job (e.g, using ZK watcher) • Improve log streaming