Proprietary & Confidential. Copyright © 2014.
Hado’ops’ or Had’oops’
We’re Hiring
rocketfuel.com/careers
Kishore Kumar Yellamraju
Abhijit Pol
The Web Is Monetized By Advertising
Delivery Methods
»Display
»Video
»Mobile
»Social
Overview (diagram): ad-serving flow between Browser, Publishers, Ad Exchange (some exchange partners shown), and the Rocket Fuel Platform
1. Page Request
2. Ad Request
3. Bid Request
4. Bid & Ad
5. Rocketfuel Winning Ad
6. Ad Served
Rocket Fuel Platform: Real-time Bidder, Automated Decisions, Models, Data Store, Ads & Budget, Model Scores, Events, Refresh learning, Optimize
Also shown: Data Partners, Advertisers, User Segments, User Engagements
Real Time Bidding and Serving
Always buying the best impressions & serving the best ad
(Figure: example bid prices, ranging from $0.00 to $2.78, shown alongside the signals User, Brand Affinity, Time of Day, Geo/Weather, Site/Page)
Real Time Bidding and Serving
Goals: Leads & sales; Coupon downloads; Brand awareness
Signals: Demo, Brand Affinity, Time of Day, Geo/Weather, Site/Page

Impression Scorecard    #1       #2       #3
Demo                    +100     +40      +10
Brand Affinity          +40      -70      -10
Time of Day             -20      -20      -20
Geo/Weather             +20      +10      +20
Site/Page               +15      +15      +10
Ad Position             +10      -25      -35
In-market               +40      -40      -25
Behavior                +35      -18      +10
Response                +9.7%    +0.7%    +1.4%
(Two of the three scorecards are marked X.)
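The scorecard slide can be read as "score each candidate, keep the one with the best predicted response". Below is a minimal sketch of that selection step. It assumes the factor values pair up with the factor names in the order they appear on the slide; the names `SCORECARDS`, `total_score`, and `best_scorecard` are hypothetical and not part of Rocket Fuel's system.

```python
# Factor names as listed on the slide.
FACTORS = ["Demo", "Brand Affinity", "Time of Day", "Geo/Weather",
           "Site/Page", "Ad Position", "In-market", "Behavior"]

# Values copied from the slide in order of appearance.
# Assumption: each list lines up with FACTORS position by position.
SCORECARDS = {
    "scorecard_1": {"scores": [100, 40, -20, 20, 15, 10, 40, 35], "response_pct": 9.7},
    "scorecard_2": {"scores": [40, -70, -20, 10, 15, -25, -40, -18], "response_pct": 0.7},
    "scorecard_3": {"scores": [10, -10, -20, 20, 10, -35, -25, 10], "response_pct": 1.4},
}


def total_score(card):
    """Sum the per-factor scores of one scorecard."""
    return sum(card["scores"])


def best_scorecard(cards):
    """Return the name of the scorecard with the highest predicted response."""
    return max(cards, key=lambda name: cards[name]["response_pct"])


totals = {name: total_score(card) for name, card in SCORECARDS.items()}
best = best_scorecard(SCORECARDS)
print(totals)
print("serve:", best)
```

The summed factor scores and the response percentages rank the three scorecards in the same order, but the slide does not say how one is derived from the other, so the sketch keeps them as separate fields.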
Throughput (requests per day)
» Facebook likes: 5 B
» Searches on Google: 6 B
» Bid requests considered by Rocketfuel: 45 B
Latency (time in ms)
» Blink of an eye: 400
» SF to Tokyo network round trip: 100
» One beat of a hummingbird's wing: 20
» Look up in Blackbird: 2
Architecture and Scale
»Datacenters
»Scale
»Growth
»Architecture
Data Center Expansion
»abc
Data Center Design
• Racks custom built at Rocket Fuel
• Leased space/bandwidth in colocation facilities
Hadoop Server
20 2U servers (8.5kW)
Bidders
40 2-U Twin 2 servers (17kW)
Rocket Fuel Scale
»34,474 CPU processor cores
–2655 servers
–187.4 Teraflops of computing
»188 Terabytes of memory
–13X the memory of the IBM Watson computer that played Jeopardy
»42 Petabytes of storage
–106X the data volume of the entire Library of Congress
Hadoop at Rocket Fuel
»1400 servers
»15K Disks
»15K Cores
»90 TB
»30K MR slots
»12K daily MR jobs
Growth
» 200 servers to 1400 servers
» 5 PB to 41 PB
» 8x
Data Architecture 3.0
Hadoop Setup
Nodes shown: Active NN and Standby NN (each with a ZKFC), JT, QJM / ZK quorum, and DN + TT worker nodes
» 6x2TB disks, 2x6 cores, 196 GB RAM, 2x1G NIC
» 12x3TB disks, 2x6 cores, 64 GB RAM, 10G NIC
» QJM / ZK quorum: same as DNs, with a dedicated disk for ZK or JN
Operations
» Maintenance
» Performance Tuning
» Monitoring
» BCP
» YARN
Puppet + Infradb
Automation is key
Maintenance is Not Easy
Puppet and Infradb
» Automate as much as you can
» Adding a slave node to Hadoop cluster < 120 seconds
» Bringing up a new Hadoop cluster < 500 seconds
» MR slots are automatically determined based on hardware config
Isn’t it cool?
Just define once
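As an illustration of "MR slots are automatically determined based on hardware config", here is a minimal sketch of how slot counts could be derived from a node's hardware facts. The heuristic itself (reserved cores and RAM, 2 GB per slot, two slots per core or disk, a 2:1 map-to-reduce split) is an assumption made for this example and is not the rule used in the Puppet/Infradb setup described in the slides. `NodeHardware` and `derive_mr_slots` are hypothetical names; the two property names are the standard MR1 TaskTracker slot settings.

```python
from dataclasses import dataclass


@dataclass
class NodeHardware:
    cores: int
    ram_gb: int
    disks: int


def derive_mr_slots(hw, reserved_cores=2, reserved_ram_gb=8,
                    gb_per_slot=2, map_fraction=2 / 3):
    """Derive MR1 slot counts from hardware (hypothetical heuristic).

    Reserves some cores and RAM for the DataNode/TaskTracker daemons and
    the OS, then caps the slot count by CPU, memory, and disk count.
    """
    usable_cores = max(hw.cores - reserved_cores, 1)
    usable_ram_gb = max(hw.ram_gb - reserved_ram_gb, gb_per_slot)
    total = max(1, min(usable_cores * 2,
                       usable_ram_gb // gb_per_slot,
                       hw.disks * 2))
    map_slots = max(1, round(total * map_fraction))
    reduce_slots = max(1, total - map_slots)
    return {
        "mapred.tasktracker.map.tasks.maximum": map_slots,
        "mapred.tasktracker.reduce.tasks.maximum": reduce_slots,
    }


# A worker with 2x6 cores, 64 GB RAM and 12 disks.
worker = NodeHardware(cores=12, ram_gb=64, disks=12)
worker_slots = derive_mr_slots(worker)
print(worker_slots)
```

In a configuration-management setup, a function like this would feed the values into a templated `mapred-site.xml`, so a new hardware profile only has to be described once.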
No issues when the cluster is small. Problems start when it grows.
Performance Tuning
Performance Tuning
HDFS: dfs.namenode.handler.count, dfs.image.transfer.timeout
MR: mapred.reduce.parallel.copies, mapreduce.reduce.shuffle.parallelcopies, mapred.job.tracker.handler.count, io.sort.mb, io.sort.factor
IMP: MAPREDUCE-2026
ZK: maxClientCnxns
JVM: -XX:+UseConcMarkSweepGC -XX:CMSFullGCsBeforeCompaction=1 -XX:CMSInitiatingOccupancyFraction=60
ha.*-timeout.ms
MAPREDUCE-5351
MAPREDUCE-5508
"keep.failed.task.files=true"
We Have an Issue!
JT OOM
Number of "JobInProgress" instances = number of users who submitted jobs x mapred.jobtracker.completeuserjobs.maximum
mapred.jobtracker.completeuserjobs.maximum
mapred.jobtracker.retirejob.interval
mapred.jobtracker.retiredjobs.cache.size
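The JobInProgress relationship on this slide is simple arithmetic, and a quick sketch shows why the JobTracker heap grows with the number of submitting users. The user count, the per-user limit of 100, and the per-job memory figure below are illustrative assumptions, not measurements from the talk; the function names are hypothetical.

```python
def retained_job_objects(submitting_users, completed_jobs_per_user):
    """JobInProgress instances kept in JobTracker memory.

    completed_jobs_per_user stands for the value of
    mapred.jobtracker.completeuserjobs.maximum.
    """
    return submitting_users * completed_jobs_per_user


def estimated_heap_mb(submitting_users, completed_jobs_per_user, mb_per_job):
    """Rough heap held by retained jobs; mb_per_job is an assumed average."""
    return retained_job_objects(submitting_users, completed_jobs_per_user) * mb_per_job


# Illustrative numbers only: 50 users, 100 completed jobs kept per user,
# and an assumed 0.5 MB of heap per retained job.
before = estimated_heap_mb(50, 100, 0.5)
after = estimated_heap_mb(50, 10, 0.5)
print(before, after)
```

Lowering the per-user limit shrinks the retained set linearly, which is why the slide lists that setting alongside the retire interval and retired-jobs cache size.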
Monitoring
Wall of Ops
Monitoring
hadoop.namenode.CallQueueLength
hadoop.jobtracker.jvm.memheapusedm
Don’t fly blind, you will crash!
MR Workload Monitoring
Network Monitoring
Don’t blame the network; monitor it instead.
A network mesh can be a mess.
Alerting
Monitoring is not enough; you need better alerting
Alerts
>> Checking whether NN and JT are up is a no-brainer
>> Reduce alert noise by having summary/aggregate alerts
>> We rely heavily on custom scripts that query /jmx for NN and JT
http://hostname:port/jmx?qry=Hadoop:service=NameNode,name=NameNodeInfo
NameDirStatuses, DeadNodes, NumberOfMissingBlocks
qry=Hadoop:service=NameNode,name=FSNamesystemState
FSState, CapacityRemaining, NumDeadDataNodes, UnderReplicatedBlocks
qry=hadoop:service=JobTracker,name=JobTrackerInfo
Blacklisted TTs, #jobs, #slots_used, ThreadCount
qry=java.lang:type=Memory
Used JVM, free JVM, etc.
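A custom check of the kind described above might look like the following sketch. It builds a `/jmx?qry=` URL, reads the first bean from the JSON response (the Hadoop JMX servlet returns a top-level `beans` list), and folds several conditions into one summary line to keep alert noise down. The host name, port 50070, thresholds, and function names are assumptions for illustration; this is not the presenters' script.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen


def jmx_url(host, port, bean):
    """Build the /jmx query URL for one MBean name."""
    return f"http://{host}:{port}/jmx?qry={quote(bean, safe=':=,')}"


def first_bean(payload):
    """Return the first bean of a /jmx JSON payload, or {} if none matched."""
    beans = payload.get("beans", [])
    return beans[0] if beans else {}


def fetch_bean(host, port, bean, timeout=5):
    """Query a live daemon; needs network access to the NN or JT web port."""
    with urlopen(jmx_url(host, port, bean), timeout=timeout) as resp:
        return first_bean(json.load(resp))


def namenode_alerts(state, max_dead=0, max_under_replicated=1000):
    """Compare FSNamesystemState attributes against hypothetical thresholds."""
    alerts = []
    if state.get("NumDeadDataNodes", 0) > max_dead:
        alerts.append(f"NumDeadDataNodes={state['NumDeadDataNodes']} exceeds {max_dead}")
    if state.get("UnderReplicatedBlocks", 0) > max_under_replicated:
        alerts.append(
            f"UnderReplicatedBlocks={state['UnderReplicatedBlocks']} exceeds {max_under_replicated}"
        )
    return alerts


def summarize(alerts):
    """Collapse individual findings into a single aggregate alert line."""
    if not alerts:
        return "OK"
    return f"{len(alerts)} NameNode alert(s): " + "; ".join(alerts)


# Offline example with a hand-written payload instead of a live query.
sample_payload = {"beans": [{
    "name": "Hadoop:service=NameNode,name=FSNamesystemState",
    "NumDeadDataNodes": 3,
    "UnderReplicatedBlocks": 10,
}]}
summary = summarize(namenode_alerts(first_bean(sample_payload)))
print(summary)
```

Against a real cluster, `fetch_bean("namenode-host", 50070, "Hadoop:service=NameNode,name=FSNamesystemState")` would replace the hand-written payload; the host and port here are placeholders.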
MR Workload Alerting
» Monitor the MR workload and alert
– In-house tool that uses the “houdah” Ruby gem monitors:
– Long-running jobs, jobs with many map tasks, blacklisted TTs with high failure counts, etc.
» Collect details and auto-restart blacklisted TTs
» Parse the JT logfile for rogue jobs
» Parse the JT log and collect all job-related info
» White-elephant or hraven could help
» Parse the scheduler HTML page or use the metrics page
http://<JT-hostname>:50030/scheduler?advanced
http://<JT-hostname>:50030/metrics
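The workload checks listed above (long-running jobs, jobs with many map tasks) reduce to threshold tests over per-job records. The sketch below assumes the job details have already been collected, for example from the JT log or metrics page; the `JobInfo` fields, thresholds, and job IDs are hypothetical and do not reflect the in-house tool or the houdah gem's API.

```python
from dataclasses import dataclass


@dataclass
class JobInfo:
    job_id: str
    user: str
    runtime_minutes: float
    map_tasks: int


def flag_jobs(jobs, max_runtime_minutes=240, max_map_tasks=50000):
    """Return {job_id: [reasons]} for jobs that cross either threshold."""
    flagged = {}
    for job in jobs:
        reasons = []
        if job.runtime_minutes > max_runtime_minutes:
            reasons.append("long running")
        if job.map_tasks > max_map_tasks:
            reasons.append("too many map tasks")
        if reasons:
            flagged[job.job_id] = reasons
    return flagged


# Made-up records for illustration.
jobs = [
    JobInfo("job_0001", "etl", runtime_minutes=35, map_tasks=1200),
    JobInfo("job_0002", "adhoc", runtime_minutes=600, map_tasks=80000),
    JobInfo("job_0003", "modeling", runtime_minutes=300, map_tasks=4000),
]
flagged = flag_jobs(jobs)
print(flagged)
```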
Multi Tenancy
» Modeling
» OPS
» ETL
» Ad-hoc
No Scheduler is perfect unless you understand and tune it properly
Scheduling
BCP
» BCP: Business Continuity Plan
» Near real time reporting over 15+ TB of daily data
» Freshness of models trained over petabytes of data
Data BCP Cluster (diagram)
» Data clusters: INW Data Cluster, LSV Data Cluster
» Serving clusters: US, EU, HK
» Workloads shown: Modeling, Reporting, User Queries, Research, Ad-hoc Queries
» Also shown: Processed Data, Amazon Backup
YARN
» Resource Manager
- Global resource scheduler
- Hierarchical queues
- Application management
» Node Manager
- Per-machine agent
- Manages life cycle of container
- Container resource monitoring
» Application Master
- Per-application
- Manages application scheduling and task execution
YARN at Rocket Fuel
» YARN is in production
» 700+ nodes
» 31 TB RAM, 8500 disks, 8500 cores
» Primary use case: MapReduce
» No more static slots
» Tez, Spark, and Storm are in the race
YAY !!!
Obligatory “we are hiring” slide!
http://rocketfuel.com/careers
THANKS
kishore@rocketfuel.com
apol@rocketfuel.com

 
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFi
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFiBuilding the High Speed Cybersecurity Data Pipeline Using Apache NiFi
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFi
 
Supporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability ImprovementsSupporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability Improvements
 
Security Framework for Multitenant Architecture
Security Framework for Multitenant ArchitectureSecurity Framework for Multitenant Architecture
Security Framework for Multitenant Architecture
 
Presto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything EnginePresto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything Engine
 
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
 
Extending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google CloudExtending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google Cloud
 
Event-Driven Messaging and Actions using Apache Flink and Apache NiFi
Event-Driven Messaging and Actions using Apache Flink and Apache NiFiEvent-Driven Messaging and Actions using Apache Flink and Apache NiFi
Event-Driven Messaging and Actions using Apache Flink and Apache NiFi
 
Securing Data in Hybrid on-premise and Cloud Environments using Apache Ranger
Securing Data in Hybrid on-premise and Cloud Environments using Apache RangerSecuring Data in Hybrid on-premise and Cloud Environments using Apache Ranger
Securing Data in Hybrid on-premise and Cloud Environments using Apache Ranger
 
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
 
Computer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near YouComputer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near You
 
Big Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache SparkBig Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
 

Dernier

Nanopower In Semiconductor Industry.pdf
Nanopower  In Semiconductor Industry.pdfNanopower  In Semiconductor Industry.pdf
Nanopower In Semiconductor Industry.pdfPedro Manuel
 
AI Fame Rush Review – Virtual Influencer Creation In Just Minutes
AI Fame Rush Review – Virtual Influencer Creation In Just MinutesAI Fame Rush Review – Virtual Influencer Creation In Just Minutes
AI Fame Rush Review – Virtual Influencer Creation In Just MinutesMd Hossain Ali
 
COMPUTER 10: Lesson 7 - File Storage and Online Collaboration
COMPUTER 10: Lesson 7 - File Storage and Online CollaborationCOMPUTER 10: Lesson 7 - File Storage and Online Collaboration
COMPUTER 10: Lesson 7 - File Storage and Online Collaborationbruanjhuli
 
Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...
Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...
Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...DianaGray10
 
UiPath Platform: The Backend Engine Powering Your Automation - Session 1
UiPath Platform: The Backend Engine Powering Your Automation - Session 1UiPath Platform: The Backend Engine Powering Your Automation - Session 1
UiPath Platform: The Backend Engine Powering Your Automation - Session 1DianaGray10
 
Meet the new FSP 3000 M-Flex800™
Meet the new FSP 3000 M-Flex800™Meet the new FSP 3000 M-Flex800™
Meet the new FSP 3000 M-Flex800™Adtran
 
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdfUiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdfDianaGray10
 
Using IESVE for Loads, Sizing and Heat Pump Modeling to Achieve Decarbonization
Using IESVE for Loads, Sizing and Heat Pump Modeling to Achieve DecarbonizationUsing IESVE for Loads, Sizing and Heat Pump Modeling to Achieve Decarbonization
Using IESVE for Loads, Sizing and Heat Pump Modeling to Achieve DecarbonizationIES VE
 
UiPath Studio Web workshop series - Day 6
UiPath Studio Web workshop series - Day 6UiPath Studio Web workshop series - Day 6
UiPath Studio Web workshop series - Day 6DianaGray10
 
Bird eye's view on Camunda open source ecosystem
Bird eye's view on Camunda open source ecosystemBird eye's view on Camunda open source ecosystem
Bird eye's view on Camunda open source ecosystemAsko Soukka
 
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...Aggregage
 
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...Will Schroeder
 
ADOPTING WEB 3 FOR YOUR BUSINESS: A STEP-BY-STEP GUIDE
ADOPTING WEB 3 FOR YOUR BUSINESS: A STEP-BY-STEP GUIDEADOPTING WEB 3 FOR YOUR BUSINESS: A STEP-BY-STEP GUIDE
ADOPTING WEB 3 FOR YOUR BUSINESS: A STEP-BY-STEP GUIDELiveplex
 
NIST Cybersecurity Framework (CSF) 2.0 Workshop
NIST Cybersecurity Framework (CSF) 2.0 WorkshopNIST Cybersecurity Framework (CSF) 2.0 Workshop
NIST Cybersecurity Framework (CSF) 2.0 WorkshopBachir Benyammi
 
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCost
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCostKubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCost
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCostMatt Ray
 
Basic Building Blocks of Internet of Things.
Basic Building Blocks of Internet of Things.Basic Building Blocks of Internet of Things.
Basic Building Blocks of Internet of Things.YounusS2
 
VoIP Service and Marketing using Odoo and Asterisk PBX
VoIP Service and Marketing using Odoo and Asterisk PBXVoIP Service and Marketing using Odoo and Asterisk PBX
VoIP Service and Marketing using Odoo and Asterisk PBXTarek Kalaji
 
COMPUTER 10 Lesson 8 - Building a Website
COMPUTER 10 Lesson 8 - Building a WebsiteCOMPUTER 10 Lesson 8 - Building a Website
COMPUTER 10 Lesson 8 - Building a Websitedgelyza
 
UiPath Studio Web workshop series - Day 8
UiPath Studio Web workshop series - Day 8UiPath Studio Web workshop series - Day 8
UiPath Studio Web workshop series - Day 8DianaGray10
 
UiPath Studio Web workshop series - Day 7
UiPath Studio Web workshop series - Day 7UiPath Studio Web workshop series - Day 7
UiPath Studio Web workshop series - Day 7DianaGray10
 

Dernier (20)

Nanopower In Semiconductor Industry.pdf
Nanopower  In Semiconductor Industry.pdfNanopower  In Semiconductor Industry.pdf
Nanopower In Semiconductor Industry.pdf
 
AI Fame Rush Review – Virtual Influencer Creation In Just Minutes
AI Fame Rush Review – Virtual Influencer Creation In Just MinutesAI Fame Rush Review – Virtual Influencer Creation In Just Minutes
AI Fame Rush Review – Virtual Influencer Creation In Just Minutes
 
COMPUTER 10: Lesson 7 - File Storage and Online Collaboration
COMPUTER 10: Lesson 7 - File Storage and Online CollaborationCOMPUTER 10: Lesson 7 - File Storage and Online Collaboration
COMPUTER 10: Lesson 7 - File Storage and Online Collaboration
 
Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...
Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...
Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...
 
UiPath Platform: The Backend Engine Powering Your Automation - Session 1
UiPath Platform: The Backend Engine Powering Your Automation - Session 1UiPath Platform: The Backend Engine Powering Your Automation - Session 1
UiPath Platform: The Backend Engine Powering Your Automation - Session 1
 
Meet the new FSP 3000 M-Flex800™
Meet the new FSP 3000 M-Flex800™Meet the new FSP 3000 M-Flex800™
Meet the new FSP 3000 M-Flex800™
 
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdfUiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
 
Using IESVE for Loads, Sizing and Heat Pump Modeling to Achieve Decarbonization
Using IESVE for Loads, Sizing and Heat Pump Modeling to Achieve DecarbonizationUsing IESVE for Loads, Sizing and Heat Pump Modeling to Achieve Decarbonization
Using IESVE for Loads, Sizing and Heat Pump Modeling to Achieve Decarbonization
 
UiPath Studio Web workshop series - Day 6
UiPath Studio Web workshop series - Day 6UiPath Studio Web workshop series - Day 6
UiPath Studio Web workshop series - Day 6
 
Bird eye's view on Camunda open source ecosystem
Bird eye's view on Camunda open source ecosystemBird eye's view on Camunda open source ecosystem
Bird eye's view on Camunda open source ecosystem
 
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
 
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
 
ADOPTING WEB 3 FOR YOUR BUSINESS: A STEP-BY-STEP GUIDE
ADOPTING WEB 3 FOR YOUR BUSINESS: A STEP-BY-STEP GUIDEADOPTING WEB 3 FOR YOUR BUSINESS: A STEP-BY-STEP GUIDE
ADOPTING WEB 3 FOR YOUR BUSINESS: A STEP-BY-STEP GUIDE
 
NIST Cybersecurity Framework (CSF) 2.0 Workshop
NIST Cybersecurity Framework (CSF) 2.0 WorkshopNIST Cybersecurity Framework (CSF) 2.0 Workshop
NIST Cybersecurity Framework (CSF) 2.0 Workshop
 
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCost
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCostKubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCost
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCost
 
Basic Building Blocks of Internet of Things.
Basic Building Blocks of Internet of Things.Basic Building Blocks of Internet of Things.
Basic Building Blocks of Internet of Things.
 
VoIP Service and Marketing using Odoo and Asterisk PBX
VoIP Service and Marketing using Odoo and Asterisk PBXVoIP Service and Marketing using Odoo and Asterisk PBX
VoIP Service and Marketing using Odoo and Asterisk PBX
 
COMPUTER 10 Lesson 8 - Building a Website
COMPUTER 10 Lesson 8 - Building a WebsiteCOMPUTER 10 Lesson 8 - Building a Website
COMPUTER 10 Lesson 8 - Building a Website
 
UiPath Studio Web workshop series - Day 8
UiPath Studio Web workshop series - Day 8UiPath Studio Web workshop series - Day 8
UiPath Studio Web workshop series - Day 8
 
UiPath Studio Web workshop series - Day 7
UiPath Studio Web workshop series - Day 7UiPath Studio Web workshop series - Day 7
UiPath Studio Web workshop series - Day 7
 

Hado "OPS" or Had "oops"

  • 1. Proprietary & Confidential. Copyright © 2014. Hado’ops’ or Had’oops’. Kishore Kumar Yellamraju, Abhijit Pol. We’re Hiring: rocketfuel.com/careers
  • 2. The Web Is Monetized By Advertising
  • 3. Delivery Methods: Display, Video, Mobile, Social
  • 4. Overview (diagram). Flow: 1. Page Request, 2. Ad Request, 3. Bid Request, 4. Bid & Ad, 5. Rocketfuel Winning Ad, 6. Ad Served. Labels: Browser, Publishers, Ad Exchange, Some Exchange Partners, Advertisers, Data Partners, User Segments, User Engagements. Rocket Fuel Platform: Real-time Bidder, Automated Decisions, Optimize, Models, Refresh learning, Data Store, Ads & Budget, Model Scores, Events.
  • 5. Real Time Bidding and Serving: Always buying the best impressions & serving the best ad. (Figure: dollar amounts for sample impressions, with the labels User, Brand Affinity, Time of Day, Geo/Weather, Site/Page.)
  • 6. Real Time Bidding and Serving: three Impression Scorecards for the goals Leads & sales, Coupon downloads, and Brand awareness. Scorecard labels: Demo, Brand Affinity, Time of Day, Geo/Weather, Site/Page, Ad Position, In-Market Behavior, Response. Scores as listed: +100 +40 -20 +20 +15 +10 +40 +35, Response +9.7%; +40 -70 -20 +10 +15 -25 -40 -18, Response +0.7%; +10 -10 -20 +20 +10 -35 -25 +10, Response +1.4%.
  • 7. Overview (same diagram as slide 4).
  • 8. Throughput (requests per day): Facebook likes 5 B; Searches on Google 6 B; Bid Requests Considered by Rocketfuel 45 B
  • 9. Latency (time in ms): Blink of an eye 400; SF to Tokyo network round trip 100; One beat of a hummingbird's wing 20; Look up in Blackbird 2
  • 10. Architecture and Scale » Datacenters » Scale » Growth » Architecture
  • 11. Data Center Expansion » abc
  • 12. Data Center Design: Racks custom built at Rocket Fuel; leased space/bandwidth in colocation facilities. Hadoop Server rack: 20 2U servers (8.5kW). Bidders rack: 40 2U Twin 2 servers (17kW).
  • 13. Rocket Fuel Scale » 34,474 CPU processor cores (2655 servers, 187.4 Teraflops of computing) » 188 Terabytes of memory (13X the memory of the IBM Watson computer that played Jeopardy) » 42 Petabytes of storage (106X the data volume of the entire Library of Congress)
  • 14. Hadoop at Rocket Fuel » 1400 servers » 15K Disks » 15K Cores » 90 TB » 30K MR slots » 12K daily MR jobs
  • 15. 8x Growth: 200 Servers to 1400 Servers; 5 PB to 41 PB
  • 16. Data Architecture 3.0
  • 17. Hadoop Setup (diagram). Components: Active NN and Standby NN (each with a ZKFC), JT, QJM, ZK Quorum, and DN/TT nodes. Hardware specs as listed: (1) 6x2TB Disks, 2x6 core, 196 GB RAM, 2x1G NIC; (2) 12x3TB Disks, 2x6 core, 64 GB RAM, 10G NIC; (3) same as DNs, with a dedicated disk for ZK or JN.
  • 18. Operations » Maintenance » Performance Tuning » Monitoring » BCP » YARN
  • 19. Maintenance is Not Easy. Automation is key: Puppet + Infradb
  • 20. Puppet and Infradb » Automate as much as you can » Adding a slave node to a Hadoop cluster: < 120 seconds » Bringing up a new Hadoop cluster: < 500 seconds » MR slots are automatically determined based on hardware config. Just define once. Isn’t it cool?
  • 21. Performance Tuning: No issues when the cluster is small. Problems start when it grows.
  • 22. Performance Tuning. HDFS: dfs.namenode.handler.count, dfs.image.transfer.timeout. MR: mapred.reduce.parallel.copies, mapreduce.reduce.shuffle.parallelcopies, mapred.job.tracker.handler.count, io.sort.mb, io.sort.factor. ZK: maxClientCnxns. IMP: MAPREDUCE-2026. JVM: -XX:+UseConcMarkSweepGC -XX:CMSFullGCsBeforeCompaction=1 -XX:CMSInitiatingOccupancyFraction=60. Also listed: ha.*-timeout.ms
  • 23. We Have an Issue! MAPREDUCE-5351, MAPREDUCE-5508, "keep.failed.task.files=true"
  • 24. JT OOM: number of instances of the "JobInProgress" class = number of users who submitted jobs x mapred.jobtracker.completeuserjobs.maximum. Related properties: mapred.jobtracker.completeuserjobs.maximum, mapred.jobtracker.retirejob.interval, mapred.jobtracker.retiredjobs.cache.size
  • 25. Operations » Maintenance » Performance Tuning » Monitoring » BCP » YARN
  • 26. Monitoring: Wall of Ops
  • 27. Monitoring: hadoop.namenode.CallQueueLength, hadoop.jobtracker.jvm.memheapusedm. Don’t fly blind, you will crash!
  • 28. MR Workload Monitoring
  • 29. Network Monitoring: Don’t blame the network; monitor it instead. A network mesh can be a mess.
  • 30. Alerting: Monitoring is not enough; you need better alerting.
  • 31. Alerts >> Checking whether NN and JT are up is a no-brainer >> Reduce alert noise by having summary/aggregate alerts >> We rely heavily on custom scripts that query /jmx for NN and JT, using URLs of the form http://hostname:port/jmx?qry=... Queries and the attributes checked: qry=Hadoop:service=NameNode,name=NameNodeInfo (NameDirStatuses, DeadNodes, NumberOfMissingBlocks); qry=Hadoop:service=NameNode,name=FSNamesystemState (FSState, CapacityRemaining, NumDeadDataNodes, UnderReplicatedBlocks); qry=hadoop:service=JobTracker,name=JobTrackerInfo (blacklisted TTs, #jobs, #slots_used, ThreadCount); qry=java.lang:type=Memory (used JVM memory, free JVM memory, etc.)
  • 32. MR Workload Alerting » Monitor the MR workload and alert: an in-house tool that uses the “houdah” Ruby gem monitors long-running jobs, jobs with more map tasks, blacklisted TTs with more failure counts, etc. » Collect details and auto-restart blacklisted TTs » Parse the JT log file for rogue jobs » Parse the JT log and collect all job-related info » White-elephant or hraven could help » Parse the scheduler HTML page or use the metrics page: http://<JT-hostname>:50030/scheduler?advanced http://<JT-hostname>:50030/metrics
  • 33. Multi Tenancy: Modeling, OPS, ETL, Ad-hoc
  • 34. Scheduling: No scheduler is perfect unless you understand and tune it properly
  • 35. Operations » Maintenance » Performance Tuning » Monitoring » BCP » YARN
  • 36. BCP » BCP: Business Continuity Plan » Near real-time reporting over 15+ TB of daily data » Freshness of models trained over petabytes of data
  • 37. Data BCP (diagram). Labels: BCP Cluster; INW Data Cluster; US Serving Clusters, EU Serving Clusters, HK Serving Clusters; Modeling, Reporting, User Queries; Amazon Backup; LSV Data Cluster; US/EU/HK Serving Clusters; Research, Ad-hoc Queries; Processed Data.
  • 38. YARN » Resource Manager: global resource scheduler, hierarchical queues, application management » Node Manager: per-machine agent, manages the life cycle of containers, container resource monitoring » Application Master: per-application, manages application scheduling and task execution
  • 39. YARN at Rocket Fuel » YARN is in production » 700+ nodes » 31 TB RAM, 8500 disks, 8500 cores » Primary use case: Map-Reduce » No more static slots » Tez, Spark, Storm are in the race. YAY !!!
  • 40. Obligatory “we are hiring” slide! http://rocketfuel.com/careers
  • 41. THANKS kishore@rocketfuel.com apol@rocketfuel.com
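
Slide 31 mentions custom scripts that query /jmx on the NN and JT and fold the results into summary alerts. A minimal sketch of that idea in Python follows. It assumes the endpoint returns JSON with a top-level "beans" list and that FSState reads "Operational" when healthy; the function names and thresholds are hypothetical and are not taken from the slides.

```python
import json
import urllib.parse
import urllib.request


def fetch_bean(base_url, query):
    """Fetch the first JMX bean matching `query` from http://host:port/jmx."""
    url = base_url.rstrip("/") + "/jmx?" + urllib.parse.urlencode({"qry": query})
    with urllib.request.urlopen(url, timeout=10) as resp:
        payload = json.load(resp)
    beans = payload.get("beans", [])
    return beans[0] if beans else {}


def summarize_namenode(bean, max_dead=5, max_under_replicated=1000):
    """Return one aggregate list of problems instead of one alert per metric.

    `bean` is the FSNamesystemState bean as a dict. The thresholds are
    hypothetical defaults chosen for illustration only.
    """
    problems = []
    if bean.get("FSState") != "Operational":
        problems.append("FSState=%s" % bean.get("FSState"))
    if bean.get("NumDeadDataNodes", 0) > max_dead:
        problems.append("NumDeadDataNodes=%d" % bean["NumDeadDataNodes"])
    if bean.get("UnderReplicatedBlocks", 0) > max_under_replicated:
        problems.append("UnderReplicatedBlocks=%d" % bean["UnderReplicatedBlocks"])
    return problems
```

A monitoring check could call `fetch_bean("http://hostname:port", "Hadoop:service=NameNode,name=FSNamesystemState")`, pass the result to `summarize_namenode`, and raise a single alert only when the returned list is non-empty.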

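
The JT OOM relation on slide 24 (retained JobInProgress instances = users who submitted jobs x mapred.jobtracker.completeuserjobs.maximum) can be turned into a rough capacity estimate. The sketch below is only illustrative: the user count, the property values, and the per-job heap footprint are hypothetical numbers, not measurements from the talk.

```python
def retained_jobs(num_users, completeuserjobs_maximum):
    """Upper bound on completed JobInProgress objects kept in JobTracker memory."""
    return num_users * completeuserjobs_maximum


def retained_heap_gb(num_users, completeuserjobs_maximum, mb_per_job):
    """Rough heap held by retained jobs, given an assumed footprint per job."""
    return retained_jobs(num_users, completeuserjobs_maximum) * mb_per_job / 1024.0


# Hypothetical inputs: 200 submitting users, 0.5 MB of heap per retained job.
before = retained_heap_gb(200, 100, 0.5)  # 20,000 retained jobs
after = retained_heap_gb(200, 25, 0.5)    # 5,000 retained jobs
print("heap before: %.2f GB, after: %.2f GB" % (before, after))
```

Under these assumptions, lowering the per-user limit shrinks the retained set linearly, which is why the slide lists it next to the retire-interval and retired-jobs cache properties.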
Editor's notes

  1. Most people have probably used IMDb, but they probably wouldn’t use it if they had to pay.
  2. What we do: display, video, mobile, social.
  3. Loop 1-6 is 200 ms; steps 3-4 are 100 ms for RF. We do this 45B times a day.
  4. Real-time auction: selecting the right ad for each auction.
  5. Automatically learning from every response and getting better. Nobody else is doing this as fast, precisely, and consistently for our customers.
  6. Loop 1-6 is 200 ms; steps 3-4 are 100 ms for RF. We do this 45B times a day.
  7. We are now in 8 data centers around the world.
  8. We have optimized the design of our data centers as well. We custom design our racks and get servers assembled, racked, and tested in a California facility, then ship them to the data center. We do this not just for US data centers but also for data centers in Europe and Asia. Each rack can weigh 1500 lb or more, and many racks are sent by air for the initial install. Now, let’s look at the two kinds of racks shown above. Hadoop Server (the full racks): data (Hadoop) servers are bigger because they have 12x3TB drives, and 20 servers fill the whole rack. Bidders: bidders have a lot of cores but take less space because they have only two 2.5” drives each. 40 servers fill up half the rack, but we run out of switch ports. And this is 5% of Rocket Fuel.
  9. Just say “We have amazing scale” – let the numbers speak for themselves.
  10. Managing a Hadoop cluster is not easy. Start early. We are heavy users of Puppet. Infradb is similar to Puppet's Hiera, but Infradb was written in house 4 years ago. -> Puppet and Infradb are tightly integrated. We use Puppet and Infradb to make maintenance easy. Infradb helps us populate Hadoop property values based on the hardware config we have. For example, our fair-share slot distribution is automatically handled by Infradb whenever we add new nodes.
  11. -> Here is an example: we define the formula that decides the number of MR slots per server based on memory, CPU, and disks. -> We always want homogeneous hardware for easy maintenance and planning, but that is impossible since needs change with time. -> Automation like this lets you not worry about having heterogeneous servers. -> Not just configuration: we use Infradb to define alerts once, and all newly added hosts and clusters are automatically monitored by our Nagios.
  12. A typical Hadoop problem: you start with a small cluster and want to grow. Hadoop default properties work well on small clusters; the problems start when your cluster grows, and they are big when they happen on large clusters. Aren’t we supposed to get better performance after adding nodes?
  13. -> Too many properties to change. -> Be careful when tuning any of them. -> Have metrics to compare before and after changes. -> MAPREDUCE-2026: JobTracker.getJobCounter() will lock JobTracker and call JobInProgress.getCounters(). JobInProgress.getCounters() can be very expensive because it aggregates all the task counters. We found from the JobTracker jstacks that this method is one of the bottlenecks of JobTracker performance.
  14. -> A few JIRAs talk about a memory leak in the JobTracker. -> None of them really fixed the issue, even though the bugs are marked as resolved. -> MAPREDUCE-5351 was fixed but introduced MAPREDUCE-5508. MAPREDUCE-5508 was fixed later, but there is a workaround: set keep.failed.task.files=true. -> None of them really resolved the JT OOM issue.
  15. -> Any HDFS-heavy job can impact your HDFS performance; you will not realize it unless you monitor the metrics. -> Don’t let any one engineer impact your cluster.
  16. -> Monitoring your applications is not enough when you are running at scale in multiple datacenters across the world. -> We should monitor the network mesh.
  17. -> Find bad queries immediately and kill them before they impact the cluster. -> Don’t lose capacity to mass tasktracker blacklisting caused by a single job. -> Long-running jobs should be killed. No point in letting them run.
  18. -> Understand the workload on your cluster to better tune the scheduler properties. -> Whenever you add more nodes, the new slots should be automatically distributed to the different queues. -> No scheduler is perfect unless you understand and tune it. -> Have ACLs in place; don’t let any one engineer impact your MR workload. -> Have proper accounting for teams that use more MR capacity.
  19. How we operated for the initial few years. Recently added a data BCP (Business Continuity Plan). Latency-critical and important data goes to both places; other data goes after processing. Make use of the BCP cluster to do meaningful things until disaster happens.
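
Note 11 says the number of MR slots per server is derived from a formula over memory, CPU, and disks, but the formula itself is not given. The sketch below shows one way such a rule could look in Python. Every coefficient (reserved memory, GB per slot, slots per core and per disk, the 2:1 map-to-reduce split) is a hypothetical assumption, and `mr_slots` is not Infradb's actual logic.

```python
def mr_slots(cores, mem_gb, disks, reserved_gb=8, gb_per_slot=2,
             slots_per_core=1.5, slots_per_disk=2):
    """Return (map_slots, reduce_slots) for one server.

    The tightest of the CPU, memory, and disk limits decides the total,
    so heterogeneous hardware gets a fitting slot count automatically.
    All default coefficients are illustrative assumptions.
    """
    by_cpu = int(cores * slots_per_core)
    by_mem = (mem_gb - reserved_gb) // gb_per_slot
    by_disk = disks * slots_per_disk
    total = max(2, min(by_cpu, by_mem, by_disk))
    map_slots = max(1, total * 2 // 3)        # roughly two thirds for maps
    reduce_slots = max(1, total - map_slots)  # the remainder for reduces
    return map_slots, reduce_slots


# Data node spec from slide 17: 2x6 cores, 64 GB RAM, 12 disks.
print(mr_slots(cores=12, mem_gb=64, disks=12))
```

Defining the rule once and evaluating it per host is what lets newly added servers with different hardware pick up sensible slot counts without hand-edited configuration.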