SlideShare une entreprise Scribd logo
1  sur  50
Télécharger pour lire hors ligne
Gian Merlino, Matt Herman, Vivek Pasari, Samarth Jain
Druid Meetup
11/14/18 - Los Gatos, CA
@
Agenda
● Druid Deployment @ Netflix
● Scaling & Sketch Strings
● Druid Roadmap
● Q&A
Druid
Deployment &
Use Cases
@Netflix
Overview
● Druid in Netflix D/W
● Data ingestion
● Deploying Druid @ Netflix
● Use Cases
Netflix Data Warehouse Pipeline
Druid Ingestion
● Batch ingestion
● Druid hadoop indexer
● Input - Hive Text/Parquet tables
● S3 deep storage
Druid Ingestion
Druid Ingestion
import BigDataApi as bda
tbl = bda.Table("hive/table_name")
# build spec
spec = bda.druid.DruidSpec.from_table(tbl)
.spec(ingestion_spec.json)
# create a job from the spec
job = bda.genie.DruidIndexerJob(spec)
.cluster(kg.druid.clusters.DRUID_CLUSTER_NAME)
# submit the job…
job.execute()
Druid Cluster @ Netflix
● r 4.16 x large instance type
● 0.12.2 version
● ~100s nodes
Multitenancy
● Single Tier
● Router
○ Ad hoc
○ Experimental - broker downtime acceptable. Used
for query fine tuning etc.
○ Reporting - pre-defined queries /dashboards
Autoscale
● Favor segments in memory
● Autoscale up - cluster disk utilization beyond 80%
● Handle large data ingestion without having to worry
about cluster tripping over
Deployment Pipeline
● Spinnaker (https://www.spinnaker.io/)
● Clusters upgraded using red black
○ Jenkins jobs - druid tar ball and debian package
○ Deploy components with new code line
○ Wait for segments to load
○ Switch dns records
○ Scale down old cluster
● Rollback
○ Switch dns back to old cluster
Deployment Pipeline
Use Cases
● Dashboard backend
● Sub second query times
○ User interactive slice and dice
○ Longer data retention vs Redshift
○ More dimensions vs Redshift
● Custom UI
AWS Capacity Planning
AWS Capacity Planning
AWS Capacity Planning
Other use cases
● Payments analysis
● Algorithms comparison
● Security
● Quality of Experience (QoE)
Future work
● Real time ingestion
○ Tranquility or Kafka indexing
● Open source T-Digest based Histogram module
● Investigate tiering
● Change auto-scaling policy considering EBS
Scaling & Sketch
Strings
How Netflix Processes 160B
Daily Customer Actions to
Monitor Client Performance
#netflixeverywhere
“With this launch, consumers around the world
will be able to enjoy TV shows and movies
simultaneously -- no more waiting. With the help
of the Internet, we are putting power in
consumers’ hands to watch whenever, wherever
and on whatever device.”
“With this launch, consumers around the world
will be able to enjoy TV shows and movies
simultaneously -- no more waiting. With the help
of the Internet, we are putting power in
consumers’ hands to watch whenever, wherever
and on whatever device.”
“With this launch, consumers around the world
will be able to enjoy TV shows and movies
simultaneously -- no more waiting. With the help
of the Internet, we are putting power in
consumers’ hands to watch whenever, wherever
and on whatever device.”
“With this launch, consumers around the world
will be able to enjoy TV shows and movies
simultaneously -- no more waiting. With the help
of the Internet, we are putting power in
consumers’ hands to watch whenever, wherever
and on whatever device.”
“With this launch, consumers around the world
will be able to enjoy TV shows and movies
simultaneously -- no more waiting. With the help
of the Internet, we are putting power in
consumers’ hands to watch whenever, wherever
and on whatever device.”
● 160 Billion client side
data points daily
● 135+ million members
● 190 countries
● 300 million devices
● 4 major UI platforms
TVUI, Web, iOS,
Android
Measure Everything Consistently
Client Performance
● Metrics
○ App launch times
○ Play delay
○ Details page time
● 29 dimensions
○ Geo
○ Network
○ Device
○ AB test cell
Client Performance Metrics
Architecture
Show Me
The Data
Summarize Instead
Anscombe’s Quartet
Saved by Sketch Strings
Box & Whisker Plots
Median application load times are similar
Country B has a larger IQR and long tail
Cumulative Distribution Functions
Recap
● Ingesting consistent and
highly dimensional data
● Analyzing data via custom
web visualizations
● Summarizing responsibly via
sketch strings
● Druid helps us provide the
best customer experience
Druid Roadmap
roadmap and community update
Gian Merlino
gian@imply.io
Who am I?
Gian Merlino
Committer & PMC member on
Cofounder at
>10 years working on scalable systems
40
Druid 0.13.0
…and beyond!!
Druid 0.13.0
400 new features and bug fixes from 81 contributors!
42
Druid 0.13.0
Our first Apache release!
(After years as an independent project.)
43
Druid 0.13.0
● Native parallel batch indexing (phase 1)
● Automatic compaction (phase 1)
● Ingestion statistics and errors via API
● SQL system tables: segments, tasks, servers
● SQL standard-compliant null handling option
● Additional aggregators (stringFirst/stringLast, new HllSketch)
● Support for multiple grouping specs in groupBy query
● Backpressure, compact result formats for large result sets
44
…and beyond!!
● Native parallel batch indexing (phase 2)
● Automatic compaction (phase 2)
● Smaller, faster compression (FastPFOR, etc)
● Faster quantiles: Fixed-bin histograms, moments sketches
● Dynamic prioritization
● Simpler, self-configuring deployment
● … your item here!!
45
Try this at home
46
Download
Druid community site (current): http://druid.io/
Druid community site (new): https://druid.apache.org/
Imply distribution: https://imply.io/get-started
47
Contribute
48
https://github.com/apache/druid
Stay in touch
49
@druidio
http://druid.io/community
Q&A

Contenu connexe

Tendances

Apache Druid Design and Future prospect
Apache Druid Design and Future prospectApache Druid Design and Future prospect
Apache Druid Design and Future prospectc-bslim
 
What does Netflix, NTT and Rubicon Project have in common? Apache Druid.
What does Netflix, NTT and Rubicon Project have in common? Apache Druid.What does Netflix, NTT and Rubicon Project have in common? Apache Druid.
What does Netflix, NTT and Rubicon Project have in common? Apache Druid.Rommel Garcia
 
Advanced Change Data Streaming Patterns in Distributed Systems | Gunnar Morli...
Advanced Change Data Streaming Patterns in Distributed Systems | Gunnar Morli...Advanced Change Data Streaming Patterns in Distributed Systems | Gunnar Morli...
Advanced Change Data Streaming Patterns in Distributed Systems | Gunnar Morli...HostedbyConfluent
 
Building a Cross Cloud Data Protection Engine
Building a Cross Cloud Data Protection EngineBuilding a Cross Cloud Data Protection Engine
Building a Cross Cloud Data Protection EngineDatabricks
 
Redis for Fast Data Ingest
Redis for Fast Data IngestRedis for Fast Data Ingest
Redis for Fast Data IngestRedis Labs
 
Elephants in the cloud or how to become cloud ready
Elephants in the cloud or how to become cloud readyElephants in the cloud or how to become cloud ready
Elephants in the cloud or how to become cloud readyKrzysztof Adamski
 
MongoDB .local Chicago 2019: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local Chicago 2019: MongoDB Atlas Data Lake Technical Deep DiveMongoDB .local Chicago 2019: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local Chicago 2019: MongoDB Atlas Data Lake Technical Deep DiveMongoDB
 
ARCHITECTING INFLUXENTERPRISE FOR SUCCESS
ARCHITECTING INFLUXENTERPRISE FOR SUCCESSARCHITECTING INFLUXENTERPRISE FOR SUCCESS
ARCHITECTING INFLUXENTERPRISE FOR SUCCESSInfluxData
 
RedisConf17 - Real-time Intelligence with Redis-ML and Apache Spark
RedisConf17 - Real-time Intelligence with Redis-ML and Apache SparkRedisConf17 - Real-time Intelligence with Redis-ML and Apache Spark
RedisConf17 - Real-time Intelligence with Redis-ML and Apache SparkRedis Labs
 
Cornami Accelerates Performance on SPARK: Spark Summit East talk by Paul Master
Cornami Accelerates Performance on SPARK: Spark Summit East talk by Paul MasterCornami Accelerates Performance on SPARK: Spark Summit East talk by Paul Master
Cornami Accelerates Performance on SPARK: Spark Summit East talk by Paul MasterSpark Summit
 
Elastic Stack roadmap deep dive
Elastic Stack roadmap deep diveElastic Stack roadmap deep dive
Elastic Stack roadmap deep diveElasticsearch
 
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]MongoDB
 
Natalie Godec - AirFlow and GCP: tomorrow's health service data platform
Natalie Godec - AirFlow and GCP: tomorrow's health service data platformNatalie Godec - AirFlow and GCP: tomorrow's health service data platform
Natalie Godec - AirFlow and GCP: tomorrow's health service data platformmatteo mazzeri
 
Opensource Frameworks and BigData Processing
Opensource Frameworks and BigData ProcessingOpensource Frameworks and BigData Processing
Opensource Frameworks and BigData ProcessingAmir Sedighi
 
MongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
MongoDB .local San Francisco 2020: MongoDB Atlas JumpstartMongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
MongoDB .local San Francisco 2020: MongoDB Atlas JumpstartMongoDB
 
Redis Day TLV 2018 - Redis & BioCatch
Redis Day TLV 2018 - Redis & BioCatchRedis Day TLV 2018 - Redis & BioCatch
Redis Day TLV 2018 - Redis & BioCatchRedis Labs
 
Taking Your Database Global with Kubernetes
Taking Your Database Global with KubernetesTaking Your Database Global with Kubernetes
Taking Your Database Global with KubernetesChristopher Bradford
 
Virtual training intro to InfluxDB - June 2021
Virtual training  intro to InfluxDB  - June 2021Virtual training  intro to InfluxDB  - June 2021
Virtual training intro to InfluxDB - June 2021InfluxData
 

Tendances (20)

Apache Druid Design and Future prospect
Apache Druid Design and Future prospectApache Druid Design and Future prospect
Apache Druid Design and Future prospect
 
What does Netflix, NTT and Rubicon Project have in common? Apache Druid.
What does Netflix, NTT and Rubicon Project have in common? Apache Druid.What does Netflix, NTT and Rubicon Project have in common? Apache Druid.
What does Netflix, NTT and Rubicon Project have in common? Apache Druid.
 
Big Data Applications
Big Data ApplicationsBig Data Applications
Big Data Applications
 
Advanced Change Data Streaming Patterns in Distributed Systems | Gunnar Morli...
Advanced Change Data Streaming Patterns in Distributed Systems | Gunnar Morli...Advanced Change Data Streaming Patterns in Distributed Systems | Gunnar Morli...
Advanced Change Data Streaming Patterns in Distributed Systems | Gunnar Morli...
 
Building a Cross Cloud Data Protection Engine
Building a Cross Cloud Data Protection EngineBuilding a Cross Cloud Data Protection Engine
Building a Cross Cloud Data Protection Engine
 
Elastic Stack Roadmap
Elastic Stack RoadmapElastic Stack Roadmap
Elastic Stack Roadmap
 
Redis for Fast Data Ingest
Redis for Fast Data IngestRedis for Fast Data Ingest
Redis for Fast Data Ingest
 
Elephants in the cloud or how to become cloud ready
Elephants in the cloud or how to become cloud readyElephants in the cloud or how to become cloud ready
Elephants in the cloud or how to become cloud ready
 
MongoDB .local Chicago 2019: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local Chicago 2019: MongoDB Atlas Data Lake Technical Deep DiveMongoDB .local Chicago 2019: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local Chicago 2019: MongoDB Atlas Data Lake Technical Deep Dive
 
ARCHITECTING INFLUXENTERPRISE FOR SUCCESS
ARCHITECTING INFLUXENTERPRISE FOR SUCCESSARCHITECTING INFLUXENTERPRISE FOR SUCCESS
ARCHITECTING INFLUXENTERPRISE FOR SUCCESS
 
RedisConf17 - Real-time Intelligence with Redis-ML and Apache Spark
RedisConf17 - Real-time Intelligence with Redis-ML and Apache SparkRedisConf17 - Real-time Intelligence with Redis-ML and Apache Spark
RedisConf17 - Real-time Intelligence with Redis-ML and Apache Spark
 
Cornami Accelerates Performance on SPARK: Spark Summit East talk by Paul Master
Cornami Accelerates Performance on SPARK: Spark Summit East talk by Paul MasterCornami Accelerates Performance on SPARK: Spark Summit East talk by Paul Master
Cornami Accelerates Performance on SPARK: Spark Summit East talk by Paul Master
 
Elastic Stack roadmap deep dive
Elastic Stack roadmap deep diveElastic Stack roadmap deep dive
Elastic Stack roadmap deep dive
 
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
 
Natalie Godec - AirFlow and GCP: tomorrow's health service data platform
Natalie Godec - AirFlow and GCP: tomorrow's health service data platformNatalie Godec - AirFlow and GCP: tomorrow's health service data platform
Natalie Godec - AirFlow and GCP: tomorrow's health service data platform
 
Opensource Frameworks and BigData Processing
Opensource Frameworks and BigData ProcessingOpensource Frameworks and BigData Processing
Opensource Frameworks and BigData Processing
 
MongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
MongoDB .local San Francisco 2020: MongoDB Atlas JumpstartMongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
MongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
 
Redis Day TLV 2018 - Redis & BioCatch
Redis Day TLV 2018 - Redis & BioCatchRedis Day TLV 2018 - Redis & BioCatch
Redis Day TLV 2018 - Redis & BioCatch
 
Taking Your Database Global with Kubernetes
Taking Your Database Global with KubernetesTaking Your Database Global with Kubernetes
Taking Your Database Global with Kubernetes
 
Virtual training intro to InfluxDB - June 2021
Virtual training  intro to InfluxDB  - June 2021Virtual training  intro to InfluxDB  - June 2021
Virtual training intro to InfluxDB - June 2021
 

Similaire à Druid meetup @ Netflix (11/14/2018 )

Data Science in the Cloud @StitchFix
Data Science in the Cloud @StitchFixData Science in the Cloud @StitchFix
Data Science in the Cloud @StitchFixC4Media
 
Netflix Open Source Meetup Season 4 Episode 1
Netflix Open Source Meetup Season 4 Episode 1Netflix Open Source Meetup Season 4 Episode 1
Netflix Open Source Meetup Season 4 Episode 1aspyker
 
Easy Microservices with JHipster - Devoxx BE 2017
Easy Microservices with JHipster - Devoxx BE 2017Easy Microservices with JHipster - Devoxx BE 2017
Easy Microservices with JHipster - Devoxx BE 2017Deepu K Sasidharan
 
Devoxx Belgium 2017 - easy microservices with JHipster
Devoxx Belgium 2017 - easy microservices with JHipsterDevoxx Belgium 2017 - easy microservices with JHipster
Devoxx Belgium 2017 - easy microservices with JHipsterJulien Dubois
 
node.js on Google Compute Engine
node.js on Google Compute Enginenode.js on Google Compute Engine
node.js on Google Compute EngineArun Nagarajan
 
Triangle Devops Meetup 10/2015
Triangle Devops Meetup 10/2015Triangle Devops Meetup 10/2015
Triangle Devops Meetup 10/2015aspyker
 
Netflix Open Source: Building a Distributed and Automated Open Source Program
Netflix Open Source:  Building a Distributed and Automated Open Source ProgramNetflix Open Source:  Building a Distributed and Automated Open Source Program
Netflix Open Source: Building a Distributed and Automated Open Source Programaspyker
 
Building a Distributed & Automated Open Source Program at Netflix
Building a Distributed & Automated Open Source Program at NetflixBuilding a Distributed & Automated Open Source Program at Netflix
Building a Distributed & Automated Open Source Program at NetflixAll Things Open
 
Big data in action
Big data in actionBig data in action
Big data in actionTu Pham
 
BISSA: Empowering Web gadget Communication with Tuple Spaces
BISSA: Empowering Web gadget Communication with Tuple SpacesBISSA: Empowering Web gadget Communication with Tuple Spaces
BISSA: Empowering Web gadget Communication with Tuple SpacesSrinath Perera
 
Setting up InfluxData for IoT
Setting up InfluxData for IoTSetting up InfluxData for IoT
Setting up InfluxData for IoTInfluxData
 
DockerCon EU 2015: Day 1 General Session
DockerCon EU 2015: Day 1 General SessionDockerCon EU 2015: Day 1 General Session
DockerCon EU 2015: Day 1 General SessionDocker, Inc.
 
Netflix Architecture and Open Source
Netflix Architecture and Open SourceNetflix Architecture and Open Source
Netflix Architecture and Open SourceAll Things Open
 
Google Cloud - Scale With A Smile (Dec 2014)
Google Cloud - Scale With A Smile (Dec 2014)Google Cloud - Scale With A Smile (Dec 2014)
Google Cloud - Scale With A Smile (Dec 2014)Ido Green
 
The Netflix Way to deal with Big Data Problems
The Netflix Way to deal with Big Data ProblemsThe Netflix Way to deal with Big Data Problems
The Netflix Way to deal with Big Data ProblemsMonal Daxini
 
Data Platform in the Cloud
Data Platform in the CloudData Platform in the Cloud
Data Platform in the CloudAmihay Zer-Kavod
 
[Public] 7 archetipi della tecnologia moderna [italy]
[Public] 7 archetipi della tecnologia moderna [italy][Public] 7 archetipi della tecnologia moderna [italy]
[Public] 7 archetipi della tecnologia moderna [italy]Nicolas Bortolotti
 
Understanding Hadoop
Understanding HadoopUnderstanding Hadoop
Understanding HadoopAhmed Ossama
 
JHipster Code 2020 keynote
JHipster Code 2020 keynoteJHipster Code 2020 keynote
JHipster Code 2020 keynoteJulien Dubois
 

Similaire à Druid meetup @ Netflix (11/14/2018 ) (20)

Data Science in the Cloud @StitchFix
Data Science in the Cloud @StitchFixData Science in the Cloud @StitchFix
Data Science in the Cloud @StitchFix
 
Netflix Open Source Meetup Season 4 Episode 1
Netflix Open Source Meetup Season 4 Episode 1Netflix Open Source Meetup Season 4 Episode 1
Netflix Open Source Meetup Season 4 Episode 1
 
Easy Microservices with JHipster - Devoxx BE 2017
Easy Microservices with JHipster - Devoxx BE 2017Easy Microservices with JHipster - Devoxx BE 2017
Easy Microservices with JHipster - Devoxx BE 2017
 
Devoxx Belgium 2017 - easy microservices with JHipster
Devoxx Belgium 2017 - easy microservices with JHipsterDevoxx Belgium 2017 - easy microservices with JHipster
Devoxx Belgium 2017 - easy microservices with JHipster
 
node.js on Google Compute Engine
node.js on Google Compute Enginenode.js on Google Compute Engine
node.js on Google Compute Engine
 
Triangle Devops Meetup 10/2015
Triangle Devops Meetup 10/2015Triangle Devops Meetup 10/2015
Triangle Devops Meetup 10/2015
 
Netflix Open Source: Building a Distributed and Automated Open Source Program
Netflix Open Source:  Building a Distributed and Automated Open Source ProgramNetflix Open Source:  Building a Distributed and Automated Open Source Program
Netflix Open Source: Building a Distributed and Automated Open Source Program
 
Building a Distributed & Automated Open Source Program at Netflix
Building a Distributed & Automated Open Source Program at NetflixBuilding a Distributed & Automated Open Source Program at Netflix
Building a Distributed & Automated Open Source Program at Netflix
 
Big data in action
Big data in actionBig data in action
Big data in action
 
BISSA: Empowering Web gadget Communication with Tuple Spaces
BISSA: Empowering Web gadget Communication with Tuple SpacesBISSA: Empowering Web gadget Communication with Tuple Spaces
BISSA: Empowering Web gadget Communication with Tuple Spaces
 
Setting up InfluxData for IoT
Setting up InfluxData for IoTSetting up InfluxData for IoT
Setting up InfluxData for IoT
 
DockerCon EU 2015: Day 1 General Session
DockerCon EU 2015: Day 1 General SessionDockerCon EU 2015: Day 1 General Session
DockerCon EU 2015: Day 1 General Session
 
Data-Driven @ Netflix
Data-Driven @ NetflixData-Driven @ Netflix
Data-Driven @ Netflix
 
Netflix Architecture and Open Source
Netflix Architecture and Open SourceNetflix Architecture and Open Source
Netflix Architecture and Open Source
 
Google Cloud - Scale With A Smile (Dec 2014)
Google Cloud - Scale With A Smile (Dec 2014)Google Cloud - Scale With A Smile (Dec 2014)
Google Cloud - Scale With A Smile (Dec 2014)
 
The Netflix Way to deal with Big Data Problems
The Netflix Way to deal with Big Data ProblemsThe Netflix Way to deal with Big Data Problems
The Netflix Way to deal with Big Data Problems
 
Data Platform in the Cloud
Data Platform in the CloudData Platform in the Cloud
Data Platform in the Cloud
 
[Public] 7 archetipi della tecnologia moderna [italy]
[Public] 7 archetipi della tecnologia moderna [italy][Public] 7 archetipi della tecnologia moderna [italy]
[Public] 7 archetipi della tecnologia moderna [italy]
 
Understanding Hadoop
Understanding HadoopUnderstanding Hadoop
Understanding Hadoop
 
JHipster Code 2020 keynote
JHipster Code 2020 keynoteJHipster Code 2020 keynote
JHipster Code 2020 keynote
 

Dernier

The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...ranjana rawat
 
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Dr.Costas Sachpazis
 
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...ranjana rawat
 
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur High Profile
 
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...Call Girls in Nagpur High Profile
 
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Serviceranjana rawat
 
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escortsranjana rawat
 
Introduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptxIntroduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptxupamatechverse
 
Coefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptxCoefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptxAsutosh Ranjan
 
Porous Ceramics seminar and technical writing
Porous Ceramics seminar and technical writingPorous Ceramics seminar and technical writing
Porous Ceramics seminar and technical writingrakeshbaidya232001
 
KubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghlyKubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghlysanyuktamishra911
 
UNIT - IV - Air Compressors and its Performance
UNIT - IV - Air Compressors and its PerformanceUNIT - IV - Air Compressors and its Performance
UNIT - IV - Air Compressors and its Performancesivaprakash250
 
IMPLICATIONS OF THE ABOVE HOLISTIC UNDERSTANDING OF HARMONY ON PROFESSIONAL E...
IMPLICATIONS OF THE ABOVE HOLISTIC UNDERSTANDING OF HARMONY ON PROFESSIONAL E...IMPLICATIONS OF THE ABOVE HOLISTIC UNDERSTANDING OF HARMONY ON PROFESSIONAL E...
IMPLICATIONS OF THE ABOVE HOLISTIC UNDERSTANDING OF HARMONY ON PROFESSIONAL E...RajaP95
 
Software Development Life Cycle By Team Orange (Dept. of Pharmacy)
Software Development Life Cycle By  Team Orange (Dept. of Pharmacy)Software Development Life Cycle By  Team Orange (Dept. of Pharmacy)
Software Development Life Cycle By Team Orange (Dept. of Pharmacy)Suman Mia
 
Introduction to Multiple Access Protocol.pptx
Introduction to Multiple Access Protocol.pptxIntroduction to Multiple Access Protocol.pptx
Introduction to Multiple Access Protocol.pptxupamatechverse
 
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...Christo Ananth
 
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...ranjana rawat
 
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur EscortsCall Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur High Profile
 
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...Soham Mondal
 

Dernier (20)

The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
 
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
 
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 
Roadmap to Membership of RICS - Pathways and Routes
Roadmap to Membership of RICS - Pathways and RoutesRoadmap to Membership of RICS - Pathways and Routes
Roadmap to Membership of RICS - Pathways and Routes
 
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
 
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
 
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
 
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
 
Introduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptxIntroduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptx
 
Coefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptxCoefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptx
 
Porous Ceramics seminar and technical writing
Porous Ceramics seminar and technical writingPorous Ceramics seminar and technical writing
Porous Ceramics seminar and technical writing
 
KubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghlyKubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghly
 
UNIT - IV - Air Compressors and its Performance
UNIT - IV - Air Compressors and its PerformanceUNIT - IV - Air Compressors and its Performance
UNIT - IV - Air Compressors and its Performance
 
IMPLICATIONS OF THE ABOVE HOLISTIC UNDERSTANDING OF HARMONY ON PROFESSIONAL E...
IMPLICATIONS OF THE ABOVE HOLISTIC UNDERSTANDING OF HARMONY ON PROFESSIONAL E...IMPLICATIONS OF THE ABOVE HOLISTIC UNDERSTANDING OF HARMONY ON PROFESSIONAL E...
IMPLICATIONS OF THE ABOVE HOLISTIC UNDERSTANDING OF HARMONY ON PROFESSIONAL E...
 
Software Development Life Cycle By Team Orange (Dept. of Pharmacy)
Software Development Life Cycle By  Team Orange (Dept. of Pharmacy)Software Development Life Cycle By  Team Orange (Dept. of Pharmacy)
Software Development Life Cycle By Team Orange (Dept. of Pharmacy)
 
Introduction to Multiple Access Protocol.pptx
Introduction to Multiple Access Protocol.pptxIntroduction to Multiple Access Protocol.pptx
Introduction to Multiple Access Protocol.pptx
 
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
 
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur EscortsCall Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
 
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
 

Druid meetup @ Netflix (11/14/2018 )

  • 1. Gian Merlino, Matt Herman, Vivek Pasari, Samarth Jain Druid Meetup 11/14/18 - Los Gatos, CA @
  • 2. Agenda ● Druid Deployment @ Netflix ● Scaling & Sketch Strings ● Druid Roadmap ● Q&A
  • 4. Overview ● Druid in Netflix D/W ● Data ingestion ● Deploying Druid @ Netflix ● Use Cases
  • 6.
  • 7. Druid Ingestion ● Batch ingestion ● Druid hadoop indexer ● Input - Hive Text/Parquet tables ● S3 deep storage
  • 9. Druid Ingestion import BigDataApi as bda tbl = bda.Table("hive/table_name") # build spec spec = bda.druid.DruidSpec.from_table(tbl) .spec(ingestion_spec.json) # create a job from the spec job = bda.genie.DruidIndexerJob(spec) .cluster(kg.druid.clusters.DRUID_CLUSTER_NAME) # submit the job… job.execute()
  • 10. Druid Cluster @ Netflix ● r 4.16 x large instance type ● 0.12.2 version ● ~100s nodes
  • 11. Multitenancy ● Single Tier ● Router ○ Ad hoc ○ Experimental - broker downtime acceptable. Used for query fine tuning etc. ○ Reporting - pre-defined queries /dashboards
  • 12. Autoscale ● Favor segments in memory ● Autoscale up - cluster disk utilization beyond 80% ● Handle large data ingestion without having to worry about cluster tripping over
  • 13. Deployment Pipeline ● Spinnaker (https://www.spinnaker.io/) ● Clusters upgraded using red black ○ Jenkins jobs - druid tar ball and debian package ○ Deploy components with new code line ○ Wait for segments to load ○ Switch dns records ○ Scale down old cluster ● Rollback ○ Switch dns back to old cluster
  • 15. Use Cases ● Dashboard backend ● Sub second query times ○ User interactive slice and dice ○ Longer data retention vs Redshift ○ More dimensions vs Redshift ● Custom UI
  • 19.
  • 20.
  • 21.
  • 22. Other use cases ● Payments analysis ● Algorithms comparison ● Security ● Quality of Experience (QoE)
  • 23. Future work ● Real time ingestion ○ Tranquility or Kafka indexing ● Open source T-Digest based Histogram module ● Investigate tiering ● Change auto-scaling policy considering EBS
  • 24. Scaling & Sketch Strings How Netflix Processes 160B Daily Customer Actions to Monitor Client Performance
  • 26. “With this launch, consumers around the world will be able to enjoy TV shows and movies simultaneously -- no more waiting. With the help of the Internet, we are putting power in consumers’ hands to watch whenever, wherever and on whatever device.” “With this launch, consumers around the world will be able to enjoy TV shows and movies simultaneously -- no more waiting. With the help of the Internet, we are putting power in consumers’ hands to watch whenever, wherever and on whatever device.” “With this launch, consumers around the world will be able to enjoy TV shows and movies simultaneously -- no more waiting. With the help of the Internet, we are putting power in consumers’ hands to watch whenever, wherever and on whatever device.” “With this launch, consumers around the world will be able to enjoy TV shows and movies simultaneously -- no more waiting. With the help of the Internet, we are putting power in consumers’ hands to watch whenever, wherever and on whatever device.” “With this launch, consumers around the world will be able to enjoy TV shows and movies simultaneously -- no more waiting. With the help of the Internet, we are putting power in consumers’ hands to watch whenever, wherever and on whatever device.”
  • 27. ● 160 Billion client side data points daily ● 135+ million members ● 190 countries ● 300 million devices ● 4 major UI platforms TVUI, Web, iOS, Android Measure Everything Consistently
  • 29. ● Metrics ○ App launch times ○ Play delay ○ Details page time ● 29 dimensions ○ Geo ○ Network ○ Device ○ AB test cell Client Performance Metrics
  • 34. Saved by Sketch Strings
  • 35. Box & Whisker Plots Median application load times are similar Country B has a larger IQR and long tail
  • 37. Recap ● Ingesting consistent and highly dimensional data ● Analyzing data via custom web visualizations ● Summarizing responsibly via sketch strings ● Druid helps us provide the best customer experience
  • 39. roadmap and community update Gian Merlino gian@imply.io
  • 40. Who am I? Gian Merlino Committer & PMC member on Cofounder at >10 years working on scalable systems 40
  • 42. Druid 0.13.0 400 new features and bug fixes from 81 contributors! 42
  • 43. Druid 0.13.0 Our first Apache release! (After years as an independent project.) 43
  • 44. Druid 0.13.0 ● Native parallel batch indexing (phase 1) ● Automatic compaction (phase 1) ● Ingestion statistics and errors via API ● SQL system tables: segments, tasks, servers ● SQL standard-compliant null handling option ● Additional aggregators (stringFirst/stringLast, new HllSketch) ● Support for multiple grouping specs in groupBy query ● Backpressure, compact result formats for large result sets 44
  • 45. …and beyond!! ● Native parallel batch indexing (phase 2) ● Automatic compaction (phase 2) ● Smaller, faster compression (FastPFOR, etc) ● Faster quantiles: Fixed-bin histograms, moments sketches ● Dynamic prioritization ● Simpler, self-configuring deployment ● … your item here!! 45
  • 46. Try this at home 46
  • 47. Download Druid community site (current): http://druid.io/ Druid community site (new): https://druid.apache.org/ Imply distribution: https://imply.io/get-started 47
  • 50. Q&A