1
Kannan Muthukkaruppan & Karthik Ranganathan
Jun/20/2013
How Big Data Technologies Power Facebook
Karthik Ranganathan
September, 2013
2
Introduction
Email: karthik@nutanix.com
Twitter: @KarthikR
Current: Member of Technical Staff, Nutanix
Background: Technical Engineering Lead at Facebook. Co-built
Cassandra for Facebook Inbox Search and improved the
performance and resiliency of HBase for Facebook
Messages and Search Indexing.
3
Agenda
 Big data at Facebook
 HBase use cases
• OLTP
• Analytics
 Operating at scale
 The Nutanix solution
4
Big Data at Facebook
 OLTP
• User databases (MySQL)
• Photos (Haystack)
• Facebook Messages, Operational Data Store (HBase)
 Warehouse
• Hive Analytics
• Graph Search Indexing
5
HBase in a nutshell
 Apache project, modeled after BigTable
 Distributed, large scale data store
 Built on top of Hadoop DFS (HDFS)
 Efficient at random reads and writes
6
FB’s Largest HBase Application
Facebook Messages
7
The New Facebook Messages
8
Why HBase?
 Evaluated a bunch of different options
• MySQL, Cassandra, building a custom storage system for messages
 Horizontal Scalability
 Automatic failover and load balancing
 Optimized for write-heavy workloads
 HDFS already battle-tested at Facebook
 HBase’s strong consistency model
9
Quick stats (as of Nov 2011)
 Traffic to HBase
• Billions of messages per day
• 75B+ RPCs per day
 Usage pattern
• 55% reads, 45% writes
• Average write: 16 KVs (key-values) to multiple CFs (column families)
10
Data Sizes
 7PB+ online data
• ~21PB with replication
• LZO compressed
• Excludes backups
 Growth rate
• 500TB+ per month
• ~20PB of raw disk per year!
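The figures on this slide can be cross-checked with a little arithmetic. The sketch below assumes a replication factor inferred from 7PB versus ~21PB (the slide does not state it) and decimal units (1 PB = 1000 TB).

```python
# Rough consistency check of the slide's figures.
# Assumption (not stated on the slide): the replication factor is
# inferred as 21 PB / 7 PB, and 1 PB is taken as 1000 TB.
online_pb = 7
replicated_pb = 21
replication_factor = replicated_pb / online_pb

growth_tb_per_month = 500
raw_pb_per_year = growth_tb_per_month * 12 * replication_factor / 1000

print(replication_factor)  # 3.0
print(raw_pb_per_year)     # 18.0
```

With 500TB+ of monthly growth this gives about 18PB of raw disk per year, in line with the slide's "~20PB".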
11
Growing with size
 Constant need for new features with growth
 Read and write path improvements
• Performance optimizations
• IOPS reduction
• New database file format
 Intelligent data and compute placement
• Shard level block placement
• Locality based load-balancing
12
Other OLTP use cases of HBase
 Operational Data Store
 Multi-tenant KeyValue store
 Site integrity – fighting spam
13
Warehouse use cases of HBase
 Graph Search Indexing
• Complex application logic
• Multiple verticals
 Hive over HBase
• Realtime data ingest
• Enables real-time analytics
14
Real-time monitoring and anomaly detection
Operational Data Store
15
ODS: Facebook’s #1 Debugging Tool
 Collects metrics from production servers
 Supports complex aggregations and transformations
 Really well-designed UI
16
Quick stats
 Traffic to HBase
• 150B+ ops per day
 Usage pattern
• Heavy reads of recent data
• Frequent MR jobs for rollups
• TTL to expire older data
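The rollup and TTL bullets above can be illustrated with a small in-memory sketch. The sample data, the hourly bucket size, and the function names `hourly_rollup` and `expire` are assumptions made for illustration. In ODS itself the rollups run as MR jobs and expiry is handled by the store's TTL, not by application code like this.

```python
from collections import defaultdict

# Hypothetical raw samples: (unix_timestamp_seconds, metric_name, value).
samples = [
    (3600 * 10 + 5, "cpu", 40.0),
    (3600 * 10 + 65, "cpu", 60.0),
    (3600 * 11 + 5, "cpu", 80.0),
]


def hourly_rollup(rows):
    """Average each metric per hour bucket, as a batch rollup job might."""
    sums = defaultdict(lambda: [0.0, 0])
    for ts, metric, value in rows:
        bucket = ts - ts % 3600  # start of the hour containing ts
        sums[(bucket, metric)][0] += value
        sums[(bucket, metric)][1] += 1
    return {k: total / count for k, (total, count) in sums.items()}


def expire(rows, now, ttl_seconds):
    """Keep only rows younger than the TTL."""
    return [r for r in rows if now - r[0] < ttl_seconds]


rollup = hourly_rollup(samples)
print(rollup[(36000, "cpu")])  # 50.0
print(rollup[(39600, "cpu")])  # 80.0
print(len(expire(samples, now=3600 * 12, ttl_seconds=3600)))  # 1
```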
17
Real-time Analytics
Facebook Insights
18
Real-time URL/Domain Insights
 Deep analytics for websites
• Facebook widgets
 Massive scale
• Billions of URLs
• Millions of increments/sec
19
Detailed Insights
 Tracks many metrics
• Clicks, likes, shares, impressions
• Referral traffic
 Detailed breakdown
• Age buckets, gender, location
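One way to picture the increment-heavy workload is a set of counters keyed by URL, metric, and breakdown dimension. The key layout and the `record` helper below are hypothetical, and a `Counter` stands in for the store. The slides do not describe the actual schema.

```python
from collections import Counter

# In-memory stand-in for the counter store; keys are hypothetical.
counters = Counter()


def record(url, metric, age_bucket, gender):
    """Increment the total and each breakdown counter for one event."""
    counters[(url, metric)] += 1
    counters[(url, metric, "age", age_bucket)] += 1
    counters[(url, metric, "gender", gender)] += 1


record("example.com/a", "likes", "18-24", "f")
record("example.com/a", "likes", "25-34", "f")
record("example.com/a", "clicks", "18-24", "m")

print(counters[("example.com/a", "likes")])                  # 2
print(counters[("example.com/a", "likes", "gender", "f")])   # 2
print(counters[("example.com/a", "likes", "age", "18-24")])  # 1
```

Each event touches several counters, which is one reason the increment rate can far exceed the event rate.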
20
Controlled Multi-tenancy
Generic KeyValue Store
21
A Multi-tenant solution on HBase
 Generic Key-Value store
• Multiple apps on the same cluster
• Transparent schema design
• Simple API
put(appid, key, value)
value = get(appid, key)
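A minimal sketch of the two calls above, assuming tenants are separated by a composite (appid, key) key. The slide does not say how apps are isolated from one another. The class name `MultiTenantStore` and the dict standing in for the HBase table are illustrative only.

```python
from typing import Dict, Optional, Tuple


class MultiTenantStore:
    """Toy stand-in for a shared cluster; a dict plays the role of the table."""

    def __init__(self) -> None:
        # Assumption: a composite (appid, key) key keeps each app's keyspace separate.
        self._rows: Dict[Tuple[str, str], bytes] = {}

    def put(self, appid: str, key: str, value: bytes) -> None:
        self._rows[(appid, key)] = value

    def get(self, appid: str, key: str) -> Optional[bytes]:
        return self._rows.get((appid, key))


store = MultiTenantStore()
store.put("photos", "user42", b"album-7")
store.put("chat", "user42", b"thread-9")
print(store.get("photos", "user42"))  # b'album-7'
print(store.get("chat", "user42"))    # b'thread-9'
```

Two apps can use the same key without seeing each other's data, which is the property the shared-cluster design depends on.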
22
Architecture
[Diagram: Memcache and HBase. Write path: put(appid, key, value). Read path: get(appid, key).]
23
Multi-tenancy Issues
 Not a self-service model
• Each app is reviewed
 Global and per-app metrics
• Monitor RPCs by type, latencies, errors
• Friendly names for apps
 If things went wrong
• Per-app kill switch
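The per-app metrics and kill switch could sit in a thin wrapper in front of the store. The sketch below is one possible shape, not the implementation described in the talk. `TenantGuard`, `AppDisabledError`, and the counters are hypothetical names.

```python
from collections import Counter


class AppDisabledError(Exception):
    """Raised when a request arrives for an app whose kill switch is on."""


class TenantGuard:
    def __init__(self) -> None:
        self.killed = set()          # app ids whose kill switch is on
        self.rpc_counts = Counter()  # (appid, rpc_type) -> number of calls
        self.errors = Counter()      # appid -> rejected calls

    def kill(self, appid: str) -> None:
        self.killed.add(appid)

    def call(self, appid: str, rpc_type: str, fn):
        """Run fn() on behalf of appid, recording per-app metrics."""
        self.rpc_counts[(appid, rpc_type)] += 1
        if appid in self.killed:
            self.errors[appid] += 1
            raise AppDisabledError(appid)
        return fn()


guard = TenantGuard()
guard.call("chat", "get", lambda: "ok")
guard.kill("spammy-app")
try:
    guard.call("spammy-app", "put", lambda: "ok")
except AppDisabledError:
    pass
print(guard.rpc_counts[("chat", "get")])  # 1
print(guard.errors["spammy-app"])         # 1
```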
24
Powering FB’s Semantic Search Engine
Graph Search Indexing
25
Framework to build search indexes
 Multiple, independent input sources
 HBase stores document info
 Output is the search index image
rowKey = document id
value = terms, document data
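To make the row layout concrete, the sketch below builds a tiny index from rows shaped like the slide's rowKey/value description. The slides do not state that the index image is an inverted index; that shape, the sample rows, and `build_index` are assumptions.

```python
from collections import defaultdict

# Hypothetical rows keyed by document id, mirroring
# rowKey = document id, value = terms, document data.
rows = {
    "doc1": {"terms": ["hbase", "search"], "data": "HBase search notes"},
    "doc2": {"terms": ["search", "index"], "data": "Index builder"},
}


def build_index(documents):
    """Invert doc -> terms into term -> sorted list of doc ids."""
    index = defaultdict(set)
    for doc_id, value in documents.items():
        for term in value["terms"]:
            index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}


index = build_index(rows)
print(index["search"])  # ['doc1', 'doc2']
print(index["hbase"])   # ['doc1']
```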
26
Architecture
[Diagram labels: document source 1, document source 2, …; HBase cluster; MR cluster; image files, …]
27
Dos and Don’ts From Experience
Operating at Scale
28
Design for failures(!)
 Architect for failures and manageability
 No single point of failure
• Killing any process is legit
 Minimize manual intervention
• Especially for frequent failures
 Uptime is important
• Rolling upgrades are the norm
• Need to survive rack failures
29
Dashboard and Metrics
 Single place to graph/report everything
 RPC calls
 SLA misses
• Latencies, p99, Errors
• Per-request profiling
 Cluster and node health
 Network Utilization
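For readers unfamiliar with p99, here is a small sketch of a nearest-rank percentile and an SLA-miss count. The sample latencies and the 95 ms threshold are made up for illustration. The talk does not say which percentile method or SLA values were used.

```python
def percentile(values, pct):
    """Nearest-rank percentile for an integer pct between 1 and 100."""
    ordered = sorted(values)
    # Integer ceiling of pct * n / 100, avoiding float rounding.
    rank = (pct * len(ordered) + 99) // 100
    return ordered[max(rank, 1) - 1]


latencies_ms = list(range(1, 101))  # hypothetical samples: 1..100 ms
sla_ms = 95                         # hypothetical SLA threshold

p99 = percentile(latencies_ms, 99)
sla_misses = sum(1 for v in latencies_ms if v > sla_ms)

print(p99)         # 99
print(sla_misses)  # 5
```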
30
Health Checks
 Constantly monitor nodes
 Auto-exclude nodes on failure
• Machine not ssh-able
• Hardware failures (HDD failure, etc.)
• Do NOT exclude on rack failures
 Auto-include nodes once repaired
 Rate limit remediation of nodes
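The exclusion rules above can be sketched as a planning function. How a rack failure is detected is not described in the slides. The heuristic here (more than half of a rack unhealthy), the per-round cap, and the name `plan_exclusions` are all assumptions.

```python
def plan_exclusions(unhealthy, rack_of, max_per_round, rack_failure_fraction=0.5):
    """Pick unhealthy nodes to exclude this round.

    unhealthy: set of node names failing health checks.
    rack_of: dict mapping every node name to its rack.
    """
    rack_sizes = {}
    for node, rack in rack_of.items():
        rack_sizes[rack] = rack_sizes.get(rack, 0) + 1

    down_per_rack = {}
    for node in unhealthy:
        rack = rack_of[node]
        down_per_rack[rack] = down_per_rack.get(rack, 0) + 1

    candidates = []
    for node in sorted(unhealthy):
        rack = rack_of[node]
        # Assumed heuristic: a mostly-down rack is a rack failure; leave its nodes alone.
        if down_per_rack[rack] / rack_sizes[rack] > rack_failure_fraction:
            continue
        candidates.append(node)
    # Rate limit: act on at most max_per_round nodes per pass.
    return candidates[:max_per_round]


rack_of = {"a1": "A", "a2": "A", "a3": "A", "a4": "A",
           "b1": "B", "b2": "B", "b3": "B", "b4": "B"}
unhealthy = {"a1", "a2", "b1", "b2", "b3", "b4"}
print(plan_exclusions(unhealthy, rack_of, max_per_round=1))  # ['a1']
```

Skipping a failed rack avoids excluding a large share of the cluster at once, and the cap keeps a faulty health check from draining nodes faster than operators can react.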
31
In a nutshell…
 Use commodity hardware
 Scaling out is #1
 Efficiency is #2
• though pretty close behind scale-out
 Design for failures
• Frequent failures must be auto handled
 Metrics, Metrics, Metrics!
32
Overview through comparison
The Nutanix Solution
33
Nutanix compared with HBase
 Evaluated a bunch of different options
• MySQL, Cassandra, building a custom storage system for messages
 Horizontal Scalability
 Just add more nodes to scale out
 Automatic failover and load balancing
 When a node goes down, others take its place automatically
 Load of node that went down is distributed to many others
34
Nutanix compared with HBase philosophy
 Optimized for write-heavy workloads
 Optimized for virtualized environments
 Read and write heavy workloads
 Transparent use of flash to boost perf
 HDFS already battle-tested at Facebook
 Nutanix is also quite battle-tested
 HBase’s strong consistency model
 Nutanix is also strongly consistent
35
Other aspects of Nutanix
 Architected for failures and manageability
 No single point of failure
 Minimal manual intervention for frequent failures
 Uptime is important
 Rolling upgrades are the norm
• Need to survive rack failures
 Single place to graph/report everything
 Prism UI to report and manage the entire cluster
 Constantly monitor nodes
 Auto-exclude nodes on failure
36
In a nutshell about Nutanix…
 Runs on commodity hardware
 Scaling out is #1
 Drop in scale out for nodes
 Efficiency is #2
 Constant work on perf improvements
 Design for failures
 Frequent failures auto handled
 Alerts in UI for many other states
 Metrics, Metrics, Metrics!
 Prism UI gives insights into the cluster health
37
Questions?
38
NUTANIX INC. – CONFIDENTIAL AND PROPRIETARY

Xing LearningZ: Digitale Geschäftsmodelle entwickeln
 
Swiss IPv6 Council: The Cisco-Journey to an IPv6-only Building
Swiss IPv6 Council: The Cisco-Journey to an IPv6-only BuildingSwiss IPv6 Council: The Cisco-Journey to an IPv6-only Building
Swiss IPv6 Council: The Cisco-Journey to an IPv6-only Building
 
UX – Schlüssel zum Erfolg im Digital Business
UX – Schlüssel zum Erfolg im Digital BusinessUX – Schlüssel zum Erfolg im Digital Business
UX – Schlüssel zum Erfolg im Digital Business
 
Minenfeld IPv6
Minenfeld IPv6Minenfeld IPv6
Minenfeld IPv6
 
Was ist design thinking
Was ist design thinkingWas ist design thinking
Was ist design thinking
 
Die IPv6 Journey der ETH Zürich
Die IPv6 Journey der ETH Zürich Die IPv6 Journey der ETH Zürich
Die IPv6 Journey der ETH Zürich
 
Xing LearningZ: Die 10 + 1 Trends im (E-)Commerce
Xing LearningZ: Die 10 + 1 Trends im (E-)CommerceXing LearningZ: Die 10 + 1 Trends im (E-)Commerce
Xing LearningZ: Die 10 + 1 Trends im (E-)Commerce
 
Zahlen Battle: klassische werbung vs.online-werbung-somexcloud
Zahlen Battle: klassische werbung vs.online-werbung-somexcloudZahlen Battle: klassische werbung vs.online-werbung-somexcloud
Zahlen Battle: klassische werbung vs.online-werbung-somexcloud
 
General data protection regulation-slides
General data protection regulation-slidesGeneral data protection regulation-slides
General data protection regulation-slides
 

Dernier

VIP Call Girls Gandi Maisamma ( Hyderabad ) Phone 8250192130 | ₹5k To 25k Wit...
VIP Call Girls Gandi Maisamma ( Hyderabad ) Phone 8250192130 | ₹5k To 25k Wit...VIP Call Girls Gandi Maisamma ( Hyderabad ) Phone 8250192130 | ₹5k To 25k Wit...
VIP Call Girls Gandi Maisamma ( Hyderabad ) Phone 8250192130 | ₹5k To 25k Wit...Suhani Kapoor
 
Call Girls Pune Just Call 9907093804 Top Class Call Girl Service Available
Call Girls Pune Just Call 9907093804 Top Class Call Girl Service AvailableCall Girls Pune Just Call 9907093804 Top Class Call Girl Service Available
Call Girls Pune Just Call 9907093804 Top Class Call Girl Service AvailableDipal Arora
 
Pharma Works Profile of Karan Communications
Pharma Works Profile of Karan CommunicationsPharma Works Profile of Karan Communications
Pharma Works Profile of Karan Communicationskarancommunications
 
Value Proposition canvas- Customer needs and pains
Value Proposition canvas- Customer needs and painsValue Proposition canvas- Customer needs and pains
Value Proposition canvas- Customer needs and painsP&CO
 
Cracking the Cultural Competence Code.pptx
Cracking the Cultural Competence Code.pptxCracking the Cultural Competence Code.pptx
Cracking the Cultural Competence Code.pptxWorkforce Group
 
Call Girls in Delhi, Escort Service Available 24x7 in Delhi 959961-/-3876
Call Girls in Delhi, Escort Service Available 24x7 in Delhi 959961-/-3876Call Girls in Delhi, Escort Service Available 24x7 in Delhi 959961-/-3876
Call Girls in Delhi, Escort Service Available 24x7 in Delhi 959961-/-3876dlhescort
 
It will be International Nurses' Day on 12 May
It will be International Nurses' Day on 12 MayIt will be International Nurses' Day on 12 May
It will be International Nurses' Day on 12 MayNZSG
 
Call Girls Jp Nagar Just Call 👗 7737669865 👗 Top Class Call Girl Service Bang...
Call Girls Jp Nagar Just Call 👗 7737669865 👗 Top Class Call Girl Service Bang...Call Girls Jp Nagar Just Call 👗 7737669865 👗 Top Class Call Girl Service Bang...
Call Girls Jp Nagar Just Call 👗 7737669865 👗 Top Class Call Girl Service Bang...amitlee9823
 
Yaroslav Rozhankivskyy: Три складові і три передумови максимальної продуктивн...
Yaroslav Rozhankivskyy: Три складові і три передумови максимальної продуктивн...Yaroslav Rozhankivskyy: Три складові і три передумови максимальної продуктивн...
Yaroslav Rozhankivskyy: Три складові і три передумови максимальної продуктивн...Lviv Startup Club
 
Insurers' journeys to build a mastery in the IoT usage
Insurers' journeys to build a mastery in the IoT usageInsurers' journeys to build a mastery in the IoT usage
Insurers' journeys to build a mastery in the IoT usageMatteo Carbone
 
Understanding the Pakistan Budgeting Process: Basics and Key Insights
Understanding the Pakistan Budgeting Process: Basics and Key InsightsUnderstanding the Pakistan Budgeting Process: Basics and Key Insights
Understanding the Pakistan Budgeting Process: Basics and Key Insightsseri bangash
 
Call Girls In Panjim North Goa 9971646499 Genuine Service
Call Girls In Panjim North Goa 9971646499 Genuine ServiceCall Girls In Panjim North Goa 9971646499 Genuine Service
Call Girls In Panjim North Goa 9971646499 Genuine Serviceritikaroy0888
 
The Path to Product Excellence: Avoiding Common Pitfalls and Enhancing Commun...
The Path to Product Excellence: Avoiding Common Pitfalls and Enhancing Commun...The Path to Product Excellence: Avoiding Common Pitfalls and Enhancing Commun...
The Path to Product Excellence: Avoiding Common Pitfalls and Enhancing Commun...Aggregage
 
0183760ssssssssssssssssssssssssssss00101011 (27).pdf
0183760ssssssssssssssssssssssssssss00101011 (27).pdf0183760ssssssssssssssssssssssssssss00101011 (27).pdf
0183760ssssssssssssssssssssssssssss00101011 (27).pdfRenandantas16
 
Famous Olympic Siblings from the 21st Century
Famous Olympic Siblings from the 21st CenturyFamous Olympic Siblings from the 21st Century
Famous Olympic Siblings from the 21st Centuryrwgiffor
 
A305_A2_file_Batkhuu progress report.pdf
A305_A2_file_Batkhuu progress report.pdfA305_A2_file_Batkhuu progress report.pdf
A305_A2_file_Batkhuu progress report.pdftbatkhuu1
 
Regression analysis: Simple Linear Regression Multiple Linear Regression
Regression analysis:  Simple Linear Regression Multiple Linear RegressionRegression analysis:  Simple Linear Regression Multiple Linear Regression
Regression analysis: Simple Linear Regression Multiple Linear RegressionRavindra Nath Shukla
 
Creating Low-Code Loan Applications using the Trisotech Mortgage Feature Set
Creating Low-Code Loan Applications using the Trisotech Mortgage Feature SetCreating Low-Code Loan Applications using the Trisotech Mortgage Feature Set
Creating Low-Code Loan Applications using the Trisotech Mortgage Feature SetDenis Gagné
 
Unlocking the Secrets of Affiliate Marketing.pdf
Unlocking the Secrets of Affiliate Marketing.pdfUnlocking the Secrets of Affiliate Marketing.pdf
Unlocking the Secrets of Affiliate Marketing.pdfOnline Income Engine
 

Dernier (20)

VIP Call Girls Gandi Maisamma ( Hyderabad ) Phone 8250192130 | ₹5k To 25k Wit...
VIP Call Girls Gandi Maisamma ( Hyderabad ) Phone 8250192130 | ₹5k To 25k Wit...VIP Call Girls Gandi Maisamma ( Hyderabad ) Phone 8250192130 | ₹5k To 25k Wit...
VIP Call Girls Gandi Maisamma ( Hyderabad ) Phone 8250192130 | ₹5k To 25k Wit...
 
Call Girls Pune Just Call 9907093804 Top Class Call Girl Service Available
Call Girls Pune Just Call 9907093804 Top Class Call Girl Service AvailableCall Girls Pune Just Call 9907093804 Top Class Call Girl Service Available
Call Girls Pune Just Call 9907093804 Top Class Call Girl Service Available
 
Pharma Works Profile of Karan Communications
Pharma Works Profile of Karan CommunicationsPharma Works Profile of Karan Communications
Pharma Works Profile of Karan Communications
 
Value Proposition canvas- Customer needs and pains
Value Proposition canvas- Customer needs and painsValue Proposition canvas- Customer needs and pains
Value Proposition canvas- Customer needs and pains
 
Cracking the Cultural Competence Code.pptx
Cracking the Cultural Competence Code.pptxCracking the Cultural Competence Code.pptx
Cracking the Cultural Competence Code.pptx
 
Call Girls in Delhi, Escort Service Available 24x7 in Delhi 959961-/-3876
Call Girls in Delhi, Escort Service Available 24x7 in Delhi 959961-/-3876Call Girls in Delhi, Escort Service Available 24x7 in Delhi 959961-/-3876
Call Girls in Delhi, Escort Service Available 24x7 in Delhi 959961-/-3876
 
It will be International Nurses' Day on 12 May
It will be International Nurses' Day on 12 MayIt will be International Nurses' Day on 12 May
It will be International Nurses' Day on 12 May
 
Call Girls Jp Nagar Just Call 👗 7737669865 👗 Top Class Call Girl Service Bang...
Call Girls Jp Nagar Just Call 👗 7737669865 👗 Top Class Call Girl Service Bang...Call Girls Jp Nagar Just Call 👗 7737669865 👗 Top Class Call Girl Service Bang...
Call Girls Jp Nagar Just Call 👗 7737669865 👗 Top Class Call Girl Service Bang...
 
Yaroslav Rozhankivskyy: Три складові і три передумови максимальної продуктивн...
Yaroslav Rozhankivskyy: Три складові і три передумови максимальної продуктивн...Yaroslav Rozhankivskyy: Три складові і три передумови максимальної продуктивн...
Yaroslav Rozhankivskyy: Три складові і три передумови максимальної продуктивн...
 
Insurers' journeys to build a mastery in the IoT usage
Insurers' journeys to build a mastery in the IoT usageInsurers' journeys to build a mastery in the IoT usage
Insurers' journeys to build a mastery in the IoT usage
 
Understanding the Pakistan Budgeting Process: Basics and Key Insights
Understanding the Pakistan Budgeting Process: Basics and Key InsightsUnderstanding the Pakistan Budgeting Process: Basics and Key Insights
Understanding the Pakistan Budgeting Process: Basics and Key Insights
 
Call Girls In Panjim North Goa 9971646499 Genuine Service
Call Girls In Panjim North Goa 9971646499 Genuine ServiceCall Girls In Panjim North Goa 9971646499 Genuine Service
Call Girls In Panjim North Goa 9971646499 Genuine Service
 
unwanted pregnancy Kit [+918133066128] Abortion Pills IN Dubai UAE Abudhabi
unwanted pregnancy Kit [+918133066128] Abortion Pills IN Dubai UAE Abudhabiunwanted pregnancy Kit [+918133066128] Abortion Pills IN Dubai UAE Abudhabi
unwanted pregnancy Kit [+918133066128] Abortion Pills IN Dubai UAE Abudhabi
 
The Path to Product Excellence: Avoiding Common Pitfalls and Enhancing Commun...
The Path to Product Excellence: Avoiding Common Pitfalls and Enhancing Commun...The Path to Product Excellence: Avoiding Common Pitfalls and Enhancing Commun...
The Path to Product Excellence: Avoiding Common Pitfalls and Enhancing Commun...
 
0183760ssssssssssssssssssssssssssss00101011 (27).pdf
0183760ssssssssssssssssssssssssssss00101011 (27).pdf0183760ssssssssssssssssssssssssssss00101011 (27).pdf
0183760ssssssssssssssssssssssssssss00101011 (27).pdf
 
Famous Olympic Siblings from the 21st Century
Famous Olympic Siblings from the 21st CenturyFamous Olympic Siblings from the 21st Century
Famous Olympic Siblings from the 21st Century
 
A305_A2_file_Batkhuu progress report.pdf
A305_A2_file_Batkhuu progress report.pdfA305_A2_file_Batkhuu progress report.pdf
A305_A2_file_Batkhuu progress report.pdf
 
Regression analysis: Simple Linear Regression Multiple Linear Regression
Regression analysis:  Simple Linear Regression Multiple Linear RegressionRegression analysis:  Simple Linear Regression Multiple Linear Regression
Regression analysis: Simple Linear Regression Multiple Linear Regression
 
Creating Low-Code Loan Applications using the Trisotech Mortgage Feature Set
Creating Low-Code Loan Applications using the Trisotech Mortgage Feature SetCreating Low-Code Loan Applications using the Trisotech Mortgage Feature Set
Creating Low-Code Loan Applications using the Trisotech Mortgage Feature Set
 
Unlocking the Secrets of Affiliate Marketing.pdf
Unlocking the Secrets of Affiliate Marketing.pdfUnlocking the Secrets of Affiliate Marketing.pdf
Unlocking the Secrets of Affiliate Marketing.pdf
 

Datacenter@Night: How Big Data Technologies Power Facebook

  • 1. How Big Data Technologies Power Facebook. Kannan Muthukkaruppan & Karthik Ranganathan, Jun/20/2013; Karthik Ranganathan, September 2013.
  • 2. Introduction
      • Email: karthik@nutanix.com
      • Twitter: @KarthikR
      • Current: Member of Technical Staff, Nutanix
      • Background: Technical Engineering Lead at Facebook. Co-built Cassandra for Facebook Inbox Search and improved the performance and resiliency of HBase for Facebook Messages and Search Indexing.
  • 3. Agenda
      • Big data at Facebook
      • HBase use cases: OLTP; analytics
      • Operating at scale
      • The Nutanix solution
  • 4. Big Data at Facebook
      • OLTP: user databases (MySQL); photos (Haystack); Facebook Messages and Operational Data Store (HBase)
      • Warehouse: Hive analytics; Graph Search indexing
  • 5. HBase in a nutshell
      • Apache project, modeled after BigTable
      • Distributed, large-scale data store
      • Built on top of Hadoop DFS (HDFS)
      • Efficient at random reads and writes
  • 6. FB's largest HBase application: Facebook Messages
  • 8. Why HBase?
      • Evaluated several options: MySQL, Cassandra, and building a custom storage system for messages
      • Horizontal scalability
      • Automatic failover and load balancing
      • Optimized for write-heavy workloads
      • HDFS already battle-tested at Facebook
      • HBase's strong consistency model
  • 9. Quick stats (as of Nov 2011)
      • Traffic to HBase: billions of messages per day; 75B+ RPCs per day
      • Usage pattern: 55% reads, 45% writes; an average write is 16 KVs to multiple CFs
  • 10. Data Sizes
      • 7PB+ online data: ~21PB with replication; LZO compressed; excludes backups
      • Growth rate: 500TB+ per month; ~20PB of raw disk per year
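
The figures on slide 10 are consistent with 3-way replication. That factor is an inference from 7PB online versus ~21PB replicated, not something the slide states. A rough back-of-the-envelope check, using decimal units and hypothetical variable names:

```python
# Rough check of the slide's storage numbers (illustrative only).
online_pb = 7.0        # "7PB+ online data"
replicated_pb = 21.0   # "~21PB with replication"

# Inferred, not stated on the slide: copies kept of each block.
replication_factor = replicated_pb / online_pb

growth_tb_per_month = 500  # "500TB+ per month" (a lower bound)

# Raw disk consumed per year = logical growth * replication factor.
raw_pb_per_year = growth_tb_per_month * 12 * replication_factor / 1000

print(replication_factor, raw_pb_per_year)
```

With the lower-bound growth figure this gives 18PB of raw disk per year, in line with the slide's "~20PB" once the "+" on 500TB is taken into account.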
  • 11. Growing with size
      • Constant need for new features with growth
      • Read and write path improvements: performance optimizations; IOPS reduction; new database file format
      • Intelligent data and compute placement: shard-level block placement; locality-based load balancing
  • 12. Other OLTP use cases of HBase
      • Operational Data Store
      • Multi-tenant key-value store
      • Site integrity (fighting spam)
  • 13. Warehouse use cases of HBase
      • Graph Search indexing: complex application logic; multiple verticals
      • Hive over HBase: real-time data ingest; enables real-time analytics
  • 14. Operational Data Store: real-time monitoring and anomaly detection
  • 15. ODS: Facebook's #1 debugging tool
      • Collects metrics from production servers
      • Supports complex aggregations and transformations
      • Really well-designed UI
  • 16. Quick stats
      • Traffic to HBase: 150B+ ops per day
      • Usage pattern: heavy reads of recent data; frequent MR jobs for rollups; TTL to expire older data
  • 18. Real-time URL/Domain Insights
      • Deep analytics for websites: Facebook widgets
      • Massive scale: billions of URLs; millions of increments/sec
  • 19. Detailed Insights
      • Tracks many metrics: clicks, likes, shares, impressions; referral traffic
      • Detailed breakdown: age buckets, gender, location
  • 21. A multi-tenant solution on HBase
      • Generic key-value store: multiple apps on the same cluster; transparent schema design
      • Simple API: put(appid, key, value) and value = get(appid, key)
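
The put/get API on slide 21 can be sketched as below. This is a minimal in-memory stand-in, not the HBase client API and not Facebook's implementation. The class name `MultiTenantKV` and the row-key layout (a length-prefixed app id) are assumptions made for illustration.

```python
class MultiTenantKV:
    """In-memory stand-in for a shared cluster serving several apps."""

    def __init__(self):
        self._rows = {}

    @staticmethod
    def _row_key(appid, key):
        # Assumed layout: length-prefix the app id so that
        # ("a", "b:c") and ("a:b", "c") map to different rows.
        return f"{len(appid)}:{appid}:{key}"

    def put(self, appid, key, value):
        self._rows[self._row_key(appid, key)] = value

    def get(self, appid, key):
        # Returns None when the app has no value for this key.
        return self._rows.get(self._row_key(appid, key))


store = MultiTenantKV()
store.put("chat", "user1", b"hello")
store.put("ads", "user1", b"campaign-7")
print(store.get("chat", "user1"), store.get("ads", "user1"))
```

Scoping every row by app id is one way to let applications share a cluster without seeing each other's keys; the schema stays hidden behind the two calls.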
  • 23. Multi-tenancy issues
      • Not a self-service model: each app is reviewed
      • Global and per-app metrics: monitor RPCs by type, latencies, errors; friendly names for apps
      • If things go wrong: per-app kill switch
  • 24. Graph Search Indexing: powering FB's semantic search engine
  • 25. Framework to build search indexes
      • Multiple, independent input sources
      • HBase stores document info
      • Output is the search index image
      • Schema: rowKey = document id; value = terms, document data
  • 27. Operating at Scale: do's and do-not's from experience
  • 28. Design for failures(!)
      • Architect for failures and manageability
      • No single point of failure: killing any process is legit
      • Minimize manual intervention, especially for frequent failures
      • Uptime is important: rolling upgrades are the norm; need to survive rack failures
  • 29. Dashboard and Metrics
      • Single place to graph/report everything
      • RPC calls
      • SLA misses: latencies, p99, errors; per-request profiling
      • Cluster and node health
      • Network utilization
  • 30. Health Checks
      • Constantly monitor nodes
      • Auto-exclude nodes on failure: machine not ssh-able; hardware failures (HDD failure, etc.); do NOT exclude on rack failures
      • Auto-include nodes once repaired
      • Rate-limit remediation of nodes
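
The exclusion rules on slide 30 might be sketched as follows. This is a hedged illustration, not Facebook's tooling. The function name `plan_exclusions`, the "every node in a rack is failing" test for a rack failure, and the per-round cap are all assumptions.

```python
from collections import defaultdict


def plan_exclusions(failed, rack_of, max_actions):
    """Pick failed nodes to exclude in this remediation round.

    failed:      set of node names currently failing health checks
    rack_of:     dict mapping every cluster node to its rack
    max_actions: cap on exclusions per round (the rate limit)
    """
    rack_members = defaultdict(set)
    for node, rack in rack_of.items():
        rack_members[rack].add(node)

    to_exclude = []
    for node in sorted(failed):
        members = rack_members[rack_of[node]]
        # Assumed heuristic: if the whole rack is failing, treat it as a
        # rack-level failure and do not exclude its nodes individually.
        if len(members) > 1 and members <= failed:
            continue
        to_exclude.append(node)

    # Rate limit: act on at most max_actions nodes per round.
    return to_exclude[:max_actions]


racks = {"a1": "A", "a2": "A", "b1": "B", "b2": "B", "c1": "C", "c2": "C"}
print(plan_exclusions({"a1", "a2", "b1", "c1"}, racks, max_actions=1))
```

Capping the number of actions per round keeps a faulty health check from draining a large share of the cluster at once, which is the usual motivation for rate-limited remediation.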
  • 31. In a nutshell…
      • Use commodity hardware
      • Scaling out is #1
      • Efficiency is #2, though pretty close behind scale-out
      • Design for failures: frequent failures must be handled automatically
      • Metrics, Metrics, Metrics!
  • 33. Nutanix compared with HBase
      • Evaluated several options: MySQL, Cassandra, and building a custom storage system for messages
      • Horizontal scalability: just add more nodes to scale out
      • Automatic failover and load balancing: when a node goes down, others take its place automatically; the load of the node that went down is distributed to many others
  • 34. Nutanix compared with HBase philosophy
      • Optimized for write-heavy workloads: Nutanix is optimized for virtualized environments; read- and write-heavy workloads; transparent use of flash to boost performance
      • HDFS already battle-tested at Facebook: Nutanix is also quite battle-tested
      • HBase's strong consistency model: Nutanix is also strongly consistent
  • 35. Other aspects of Nutanix
      • Architected for failures and manageability: no single point of failure; minimal manual intervention for frequent failures
      • Uptime is important: rolling upgrades are the norm; need to survive rack failures
      • Single place to graph/report everything: Prism UI to report on and manage the entire cluster
      • Constantly monitor nodes: auto-exclude nodes on failure
  • 36. In a nutshell about Nutanix…
      • Runs on commodity hardware
      • Scaling out is #1: drop-in scale-out for nodes
      • Efficiency is #2: constant work on performance improvements
      • Design for failures: frequent failures are handled automatically; alerts in the UI for many other states
      • Metrics, Metrics, Metrics! Prism UI gives insights into cluster health