Copyright © 2016 kt NexR. All rights reserved.
1
Druid
Overview of real-time indexing
Azrael Seoeun Park
seoeun25@gmail.com
2
Introduction
• Indexing Service
• Design Architecture
• Tranquility
• Task Spec
• Firehose
• Plumber
• Tranquility Configs
• Flow of Realtime Indexing
3
Indexing Service
• Indexing Service
– Runs indexing tasks that create Druid segments
• Indexing task types
– index_realtime
– index_hadoop: batch ingestion
– index
[Figure: a task is submitted to the Indexing Service, which reads from the data source and produces a segment]
4
Design Architecture
[Figure: architecture diagram. Kafka, Spark Streaming, and Storm feed Tranquility instances; Tranquility submits index_realtime tasks to the Indexing Service (Overlord, MiddleManager, Peons); segments are stored in Deep Storage and loaded into the segment cache of Historical nodes. Coordinator, Broker, and ZooKeeper are also shown.]
5
Tranquility
• Sends event streams to Druid in real time
• Written in Scala
• Integrates with Samza, Spark, Storm, Kafka, Flink
• Tranquility Kafka
– Submits a realtime indexing task to the Overlord
• POSTs an HTTP request with the task spec
– Pulls the data from Kafka
– Pushes the data to the realtime indexing task
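The submit step amounts to an HTTP POST of the task spec to the Overlord. A minimal sketch of the request shape, assuming the Overlord listens on localhost:8090 (a placeholder address) and accepts tasks at /druid/indexer/v1/task. Tranquility does this internally in Scala, so the helper names here are hypothetical and only illustrate the call.

```python
import json
import urllib.request

# Assumed Overlord address; adjust host and port to your cluster.
OVERLORD_URL = "http://localhost:8090/druid/indexer/v1/task"


def build_submit_request(task_spec, url=OVERLORD_URL):
    """Build (but do not send) the POST request that submits a task spec."""
    body = json.dumps(task_spec).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def submit_task(task_spec, url=OVERLORD_URL):
    """Send the request and return the Overlord's JSON response."""
    with urllib.request.urlopen(build_submit_request(task_spec, url)) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling submit_task(spec) requires a running Overlord; build_submit_request can be inspected offline.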
6
Task Spec
• "type" : "index_realtime",
• "id" : "index_realtime_sip_2016-05-17T04:00:00.000Z_0_0",
• "spec" :
– "dataSchema" :
• "dataSource" : "sip",
• "parser" :
– "parseSpec" :
» "format" : "json",
» "timestampSpec"
» "dimensionsSpec"
• "metricsSpec"
• "granularitySpec" :
– "segmentGranularity" : "TEN_MINUTE"
– "queryGranularity" : "MINUTE"
– "ioConfig" :
– "tuningConfig" :
[Figure: callouts on the spec for data ingestion (index), firehose, and plumber, grouped under RealtimeIndexTask]
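The task id in the spec appears to follow the pattern index_realtime_<dataSource>_<segment start>_<partition>_<replicant>. A small sketch that reproduces the id shown on the slide; the meaning of the two trailing numbers is an assumption read off the example, and make_task_id is a hypothetical helper name.

```python
from datetime import datetime, timezone


def make_task_id(data_source, segment_start, partition, replicant):
    """Format an id like the one on the slide (field meanings are assumed)."""
    utc = segment_start.astimezone(timezone.utc)
    # ISO-8601 with millisecond precision and a literal Z, as in the example.
    ts = utc.strftime("%Y-%m-%dT%H:%M:%S") + f".{utc.microsecond // 1000:03d}Z"
    return f"index_realtime_{data_source}_{ts}_{partition}_{replicant}"


start = datetime(2016, 5, 17, 4, 0, tzinfo=timezone.utc)
task_id = make_task_id("sip", start, 0, 0)
print(task_id)  # index_realtime_sip_2016-05-17T04:00:00.000Z_0_0
```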
7
RealtimeIndexTask
[Figure: RealtimeIndexTask reads from the data source through the Firehose; the Plumber builds the segment using IncrementalIndex and IndexMerger]
8
Firehose
• Pipeline for reading data
• Types
– LocalFirehose
• Connects to a local file
– IngestSegmentFirehose
• Reads an existing Druid segment
– CombiningFirehose
– EventReceiverFirehose
• Ingests events through an HTTP endpoint
– TimedShutoffFirehose
• Shuts down at a specified time
9
Firehose – real time index example
• Firehose
– Shutoff time: end of the segmentGranularity interval + windowPeriod + firehoseGracePeriod
– Buffer size: firehoseBufferSize
"ioConfig" : {
  "type" : "realtime",
  "firehose" : {
    "type" : "clipped",
    "delegate" : {
      "type" : "timed",
      "delegate" : {
        "type" : "receiver",
        "serviceName" : "firehose:druid:overlord:sip-00-0000-0000",
        "bufferSize" : 100000
      },
      "shutoffTime" : "2016-05-17T04:15:00.000Z"
    },
    "interval" : "2016-05-17T04:00:00.000Z/2016-05-17T04:10:00.000Z"
  }
}
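The shutoff-time rule can be checked with a little date arithmetic. A sketch assuming a zero windowPeriod, chosen only so the result matches the 04:15 shutoffTime in the ioConfig above, together with the PT5M grace period listed later in the deck.

```python
from datetime import datetime, timedelta, timezone


def shutoff_time(interval_end, window_period, firehose_grace_period):
    """End of the segment interval plus the two configured periods."""
    return interval_end + window_period + firehose_grace_period


interval_end = datetime(2016, 5, 17, 4, 10, tzinfo=timezone.utc)
result = shutoff_time(
    interval_end,
    window_period=timedelta(seconds=0),          # assumed for this illustration
    firehose_grace_period=timedelta(minutes=5),  # PT5M
)
print(result.isoformat())  # 2016-05-17T04:15:00+00:00
```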
[Figure: Tranquility pushes events into the EventReceiver buffer; the Timed (shutoffTime) and Clip firehoses wrap it; the Plumber pulls rows with nextRow]
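The nested delegates read naturally as a chain of wrappers around the receiver. A simplified model using Python generators, not Druid's implementation: the clock is passed in as a callable, and the interval is treated as start-inclusive and end-exclusive, which is an assumption.

```python
from datetime import datetime, timezone


def receiver(buffered_events):
    """Stand-in for the receiver firehose: yields buffered events in order."""
    yield from buffered_events


def timed(delegate, shutoff_time, now):
    """Stop reading once the clock (a callable) reaches shutoff_time."""
    for event in delegate:
        if now() >= shutoff_time:
            return
        yield event


def clipped(delegate, start, end):
    """Pass through only events whose timestamp falls in [start, end)."""
    for event in delegate:
        if start <= event["timestamp"] < end:
            yield event


def at(minute):
    return datetime(2016, 5, 17, 4, minute, tzinfo=timezone.utc)


events = [
    {"timestamp": at(1), "sip": "a"},
    {"timestamp": at(12), "sip": "b"},  # outside the interval, so clipped
    {"timestamp": at(9), "sip": "c"},
]
chain = clipped(timed(receiver(events), at(15), now=lambda: at(5)), at(0), at(10))
rows = [event["sip"] for event in chain]
print(rows)  # ['a', 'c']
```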
10
Plumber
• Generates segments
– Intermediate persist
• Saves the index as a segment while the indexing task is running
• Stores the segment in the base directory on local disk
• When a task is run again with the same id, segments saved by intermediate persists can be recovered.
– When the task finishes
• When the indexing task ends, the whole index is saved as a segment
• Pushes the segment to deep storage
• Type
– YeOldePlumber
• This plumber creates single historical segments.
– RealtimePlumber
• This plumber creates real-time/mutable segments.
11
How indexing works
[Figure: the Firehose feeds rows to the Plumber, which adds them to the index of the Sink's currentHydrant; a Sink holds several FireHydrants, each with its own index]
Intermediate Persist
…/task/index_realtime_sip_2016-05-18T05:10:00.000Z_0_0/work/persist/sip/2016-05-18T05:10:00.000Z_2016-05-18T05:20:00.000Z/0/v8-tmp
Persist and Merge
…/task/index_realtime_sip_2016-05-18T05:10:00.000Z_0_0/work/persist/sip/2016-05-18T05:10:00.000Z_2016-05-18T05:20:00.000Z/merged/v8-tmp
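The relationship between a Sink and its FireHydrants can be sketched as a toy model. This is not Druid's code: hydrants are plain dictionaries, "persisting" only moves the current index into a list (standing in for the numbered directories such as 0), and merge() stands in for the step that writes the merged directory.

```python
class Sink:
    """Toy model of a sink: one segment interval, a list of hydrants."""

    def __init__(self):
        self.persisted = []  # hydrants already frozen by a persist
        self.current = {}    # in-memory index of the current hydrant

    def add(self, key, value):
        """Aggregate a row into the current hydrant (sum per key)."""
        self.current[key] = self.current.get(key, 0) + value

    def persist(self):
        """Intermediate persist: freeze the current hydrant, start a new one."""
        if self.current:
            self.persisted.append(self.current)
            self.current = {}

    def merge(self):
        """Persist and merge: combine all hydrants into a single index."""
        self.persist()
        merged = {}
        for hydrant in self.persisted:
            for key, value in hydrant.items():
                merged[key] = merged.get(key, 0) + value
        return merged


sink = Sink()
sink.add(("04:30:00", "a"), 10)
sink.persist()                   # first hydrant frozen
sink.add(("04:30:00", "a"), 10)
sink.add(("04:30:00", "b"), 3)
merged = sink.merge()            # second hydrant frozen, then both merged
print(merged)
```

The same key can appear in more than one hydrant, which is why the final merge has to aggregate again.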
12
Abstraction of index
• Structure
– TimeAndDims
– Aggregators
{"timestamp": "2016-05-18T04:31:39Z", "sip": "a", "packet_total": "10"}
{"timestamp": "2016-05-18T04:31:39Z", "sip": "b", "packet_total": "3"}
{"timestamp": "2016-05-18T04:33:42Z", "sip": "a", "packet_total": "10"}
{"timestamp": "2016-05-18T04:33:42Z", "sip": "c", "packet_total": "5"}
{"timestamp": "2016-05-18T04:37:55Z", "sip": "a", "packet_total": "10"}
{"timestamp": "2016-05-18T04:45:11Z", "sip": "a", "packet_total": "7"}
{"timestamp": "2016-05-18T04:45:11Z", "sip": "b", "packet_total": "8"}
{"timestamp": "2016-05-18T04:45:22Z", "sip": "b", "packet_total": "8"}
time      sip  sum
04:30:00  a    10 + 10 + 10 = 30
04:30:00  b    3
04:30:00  c    5
04:40:00  a    7
04:40:00  b    8 + 8 = 16
Time unit: queryGranularity.
- Each queryGranularity bucket corresponds to one row (per distinct combination of dimension values)
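The rollup in the table can be reproduced in a few lines. A sketch assuming ten-minute buckets, which is what the table shows, with sip as the only dimension and a sum aggregator over packet_total.

```python
from collections import defaultdict
from datetime import datetime

events = [
    {"timestamp": "2016-05-18T04:31:39Z", "sip": "a", "packet_total": "10"},
    {"timestamp": "2016-05-18T04:31:39Z", "sip": "b", "packet_total": "3"},
    {"timestamp": "2016-05-18T04:33:42Z", "sip": "a", "packet_total": "10"},
    {"timestamp": "2016-05-18T04:33:42Z", "sip": "c", "packet_total": "5"},
    {"timestamp": "2016-05-18T04:37:55Z", "sip": "a", "packet_total": "10"},
    {"timestamp": "2016-05-18T04:45:11Z", "sip": "a", "packet_total": "7"},
    {"timestamp": "2016-05-18T04:45:11Z", "sip": "b", "packet_total": "8"},
    {"timestamp": "2016-05-18T04:45:22Z", "sip": "b", "packet_total": "8"},
]


def truncate(ts, minutes=10):
    """Floor a timestamp to the start of its bucket (ten minutes here)."""
    return ts.replace(minute=ts.minute - ts.minute % minutes, second=0, microsecond=0)


def rollup(rows):
    """Sum packet_total per (time bucket, sip) key."""
    sums = defaultdict(int)
    for row in rows:
        ts = datetime.strptime(row["timestamp"], "%Y-%m-%dT%H:%M:%SZ")
        key = (truncate(ts).strftime("%H:%M:%S"), row["sip"])
        sums[key] += int(row["packet_total"])
    return dict(sums)


table = rollup(events)
for (bucket, sip), total in sorted(table.items()):
    print(bucket, sip, total)
```

Eight input events collapse into five rows, one per (bucket, sip) pair.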
13
Plumber – real time index example
• Plumber
– maxRowsInMemory: maximum number of rows held in memory before a persist
– intermediatePersistPeriod: interval between persists
– maxPendingPersists: number of persists that may be pending. 0 = only one persist runs at a time
• How to set the persist period?
– A large maxRowsInMemory increases memory usage
– A short intermediatePersistPeriod increases memory usage
• A long period means more data is lost on recovery
• Recovery for stream processing is complemented by batch ingestion
"tuningConfig" : {
"type" : "realtime",
"maxRowsInMemory" : 1000,
"intermediatePersistPeriod" : "PT2M",
"windowPeriod" : "PT1S",
"basePersistDirectory" : "/Users/seoeun/libs/druid-0.9.0/var/tmp/1463447509447-0",
"maxPendingPersists" : 0,
….}
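The two settings act as alternative triggers: a persist runs when either the row limit or the period is reached. A simplified model of that decision, not Druid's actual code; PersistPolicy is a hypothetical name and time is advanced by hand rather than read from a clock.

```python
from datetime import timedelta


class PersistPolicy:
    """Decide when an intermediate persist should run."""

    def __init__(self, max_rows_in_memory, intermediate_persist_period):
        self.max_rows = max_rows_in_memory
        self.period = intermediate_persist_period
        self.rows = 0
        self.elapsed = timedelta(0)

    def on_row(self, since_last_row):
        """Record one ingested row; return True if a persist should run now."""
        self.rows += 1
        self.elapsed += since_last_row
        if self.rows >= self.max_rows or self.elapsed >= self.period:
            # A persist empties the in-memory index and restarts the timer.
            self.rows = 0
            self.elapsed = timedelta(0)
            return True
        return False


policy = PersistPolicy(max_rows_in_memory=3,
                       intermediate_persist_period=timedelta(minutes=2))
busy = [policy.on_row(timedelta(seconds=1)) for _ in range(3)]   # row limit hit
quiet = [policy.on_row(timedelta(minutes=1)) for _ in range(2)]  # period hit
print(busy, quiet)  # [False, False, True] [False, True]
```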
14
Segment
• Files that store the index, partitioned by time
• One segment is created for each time interval configured in "segmentGranularity"
• Recommended file size: 300 MB to 700 MB
• Number of rows: about 5 million per segment
• Components
– version.bin
– meta.smoosh
– XXXXX.smoosh
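Of the three components, version.bin is the simplest to inspect: Druid's segment documentation describes it as 4 bytes holding the segment version as an integer. A sketch that reads it, assuming big-endian byte order; the demo writes its own version.bin into a temporary directory rather than touching a real segment.

```python
import struct
import tempfile
from pathlib import Path


def read_segment_version(segment_dir):
    """Return the integer stored in version.bin (assumed 4-byte big-endian)."""
    data = Path(segment_dir, "version.bin").read_bytes()
    (version,) = struct.unpack(">i", data[:4])
    return version


# Demo with a synthetic file: bytes 0x00 0x00 0x00 0x09 stand for version 9.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "version.bin").write_bytes(struct.pack(">i", 9))
    version = read_segment_version(tmp)
print(version)  # 9
```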
15
Tranquility Kafka config – about Druid
• druid.discovery.curator.path
– Curator service discovery path
– Default: /druid/discovery
• druid.selectors.indexing.serviceName
– Service name of the Overlord node
– Default: druid/overlord
• druidBeam.firehoseBufferSize
– Size of buffer used by firehose to store events.
– Default: 100,000
• druidBeam.firehoseChunkSize
– Maximum number of events to send to Druid in one HTTP request.
– Default: 1,000
• druidBeam.firehoseGracePeriod
– Druid indexing tasks will shut down this long after the windowPeriod has elapsed.
– Default: PT5M
• task.partitions
– Number of Druid partitions to create.
– Default: 1
• task.replicants
– Number of instances of each Druid partition to create.
– Default: 1
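These settings determine how many indexing tasks run per segment interval: each partition is created task.replicants times. A sketch with the values from this slide held in a plain dictionary; the dictionary layout is only for illustration and is not Tranquility's configuration file format.

```python
druid_properties = {
    "druid.discovery.curator.path": "/druid/discovery",
    "druid.selectors.indexing.serviceName": "druid/overlord",
    "druidBeam.firehoseBufferSize": "100000",
    "druidBeam.firehoseChunkSize": "1000",
    "druidBeam.firehoseGracePeriod": "PT5M",
    "task.partitions": "1",
    "task.replicants": "1",
}


def tasks_per_interval(properties):
    """Number of indexing tasks per segment interval: partitions x replicants."""
    return int(properties["task.partitions"]) * int(properties["task.replicants"])


print(tasks_per_interval(druid_properties))  # 1
print(tasks_per_interval({"task.partitions": "3", "task.replicants": "2"}))  # 6
```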
16
Tranquility Kafka config – about Tranquility
• tranquility.blockOnFull
– Whether "send" will block (true) or throw an exception (false) when called while the outgoing queue is full.
– Default: true
• tranquility.lingerMillis
– Wait this long for batches to collect more messages (up to maxBatchSize) before sending them.
– Default: 0 (disable waiting)
• tranquility.maxBatchSize
– Maximum number of messages to send at once.
– Default: 2,000
• tranquility.maxPendingBatches
– Maximum number of batches that can be pending.
– Default: 5
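The blockOnFull choice is the usual trade-off for a bounded queue: wait for space or fail fast. A toy illustration with Python's standard queue module; OutgoingQueue and its capacity parameter are hypothetical and do not mirror Tranquility's internals.

```python
import queue


class OutgoingQueue:
    """Toy bounded queue illustrating the blockOnFull choice."""

    def __init__(self, capacity, block_on_full=True):
        self._queue = queue.Queue(maxsize=capacity)
        self._block_on_full = block_on_full

    def send(self, message):
        # block=True waits for free space; block=False raises queue.Full.
        self._queue.put(message, block=self._block_on_full)

    def size(self):
        return self._queue.qsize()


q = OutgoingQueue(capacity=1, block_on_full=False)
q.send("event-1")
try:
    q.send("event-2")  # queue is full and blocking is disabled
    overflowed = False
except queue.Full:
    overflowed = True
print(overflowed)  # True
```

With block_on_full=True the second send would instead wait until a consumer drains the queue, which applies back-pressure to the producer.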
17
Tranquility Kafka config – about Kafka
• kafka.group.id
– Group ID for Kafka consumers. It must be the same for every Tranquility Kafka instance that shares the load.
– Default: tranquility-kafka
• consumer.numThreads
– The number of threads that will be made available to the Kafka consumer for fetching messages.
– Default: -1
– Partitions are distributed across consumers (number of partitions / number of consumers)
• commit.periodMillis
– The frequency with which consumer offsets will be committed to ZooKeeper to track processed messages.
– Default: 15000
• kafka.*
– Properties prefixed with kafka. are passed to the underlying Kafka consumer with the kafka. prefix removed.
– kafka.fetch.message.max.bytes → fetch.message.max.bytes
• The number of bytes of messages to attempt to fetch for each topic-partition in each fetch request.
• This is multiplied by the number of partitions, so watch memory usage
• Set it at least as large as the broker's message.max.bytes
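Both points about kafka.* can be made concrete: the prefix stripping and the memory estimate. A sketch with hypothetical helper names; the 1 MiB fetch size and 8 partitions are illustrative numbers, not values from the deck.

```python
def kafka_consumer_properties(properties):
    """Keep only kafka.* keys and strip the prefix."""
    prefix = "kafka."
    return {
        key[len(prefix):]: value
        for key, value in properties.items()
        if key.startswith(prefix)
    }


def fetch_buffer_bytes(fetch_message_max_bytes, partitions):
    """Rough upper bound on fetch memory: one buffer per topic-partition."""
    return fetch_message_max_bytes * partitions


props = {
    "kafka.group.id": "tranquility-kafka",
    "kafka.fetch.message.max.bytes": "1048576",  # illustrative value
    "commit.periodMillis": "15000",              # not passed to the consumer
}
consumer_props = kafka_consumer_properties(props)
print(consumer_props)
print(fetch_buffer_bytes(1048576, partitions=8))  # 8388608
```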
18
Flow of real time indexing
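[Figure: Tranquility submits a task to the Overlord; the task reaches a MiddleManager via ZooKeeper; the MiddleManager forks the task (RealtimeIndexTask with firehose and plumber); Tranquility pushes data to the task's firehose; the plumber produces the segment, which is stored in Deep Storage]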
19
Q&A

➥🔝 7737669865 🔝▻ Ongole Call-girls in Women Seeking Men 🔝Ongole🔝 Escorts S...amitlee9823
 
Call Girls In Doddaballapur Road ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Doddaballapur Road ☎ 7737669865 🥵 Book Your One night StandCall Girls In Doddaballapur Road ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Doddaballapur Road ☎ 7737669865 🥵 Book Your One night Standamitlee9823
 
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...SUHANI PANDEY
 
Discover Why Less is More in B2B Research
Discover Why Less is More in B2B ResearchDiscover Why Less is More in B2B Research
Discover Why Less is More in B2B Researchmichael115558
 
Detecting Credit Card Fraud: A Machine Learning Approach
Detecting Credit Card Fraud: A Machine Learning ApproachDetecting Credit Card Fraud: A Machine Learning Approach
Detecting Credit Card Fraud: A Machine Learning ApproachBoston Institute of Analytics
 
Thane Call Girls 7091864438 Call Girls in Thane Escort service book now -
Thane Call Girls 7091864438 Call Girls in Thane Escort service book now -Thane Call Girls 7091864438 Call Girls in Thane Escort service book now -
Thane Call Girls 7091864438 Call Girls in Thane Escort service book now -Pooja Nehwal
 
SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...
SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...
SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...Elaine Werffeli
 
Call Girls In Attibele ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Attibele ☎ 7737669865 🥵 Book Your One night StandCall Girls In Attibele ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Attibele ☎ 7737669865 🥵 Book Your One night Standamitlee9823
 

Recently uploaded (20)

DATA SUMMIT 24 Building Real-Time Pipelines With FLaNK
DATA SUMMIT 24  Building Real-Time Pipelines With FLaNKDATA SUMMIT 24  Building Real-Time Pipelines With FLaNK
DATA SUMMIT 24 Building Real-Time Pipelines With FLaNK
 
➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men 🔝Bangalore🔝 Esc...
➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men  🔝Bangalore🔝   Esc...➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men  🔝Bangalore🔝   Esc...
➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men 🔝Bangalore🔝 Esc...
 
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
 
Just Call Vip call girls Bellary Escorts ☎️9352988975 Two shot with one girl ...
Just Call Vip call girls Bellary Escorts ☎️9352988975 Two shot with one girl ...Just Call Vip call girls Bellary Escorts ☎️9352988975 Two shot with one girl ...
Just Call Vip call girls Bellary Escorts ☎️9352988975 Two shot with one girl ...
 
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
 
👉 Amritsar Call Girl 👉📞 6367187148 👉📞 Just📲 Call Ruhi Call Girl Phone No Amri...
👉 Amritsar Call Girl 👉📞 6367187148 👉📞 Just📲 Call Ruhi Call Girl Phone No Amri...👉 Amritsar Call Girl 👉📞 6367187148 👉📞 Just📲 Call Ruhi Call Girl Phone No Amri...
👉 Amritsar Call Girl 👉📞 6367187148 👉📞 Just📲 Call Ruhi Call Girl Phone No Amri...
 
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...
 
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service BangaloreCall Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
 
Just Call Vip call girls Erode Escorts ☎️9352988975 Two shot with one girl (E...
Just Call Vip call girls Erode Escorts ☎️9352988975 Two shot with one girl (E...Just Call Vip call girls Erode Escorts ☎️9352988975 Two shot with one girl (E...
Just Call Vip call girls Erode Escorts ☎️9352988975 Two shot with one girl (E...
 
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
 
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
 
➥🔝 7737669865 🔝▻ Ongole Call-girls in Women Seeking Men 🔝Ongole🔝 Escorts S...
➥🔝 7737669865 🔝▻ Ongole Call-girls in Women Seeking Men  🔝Ongole🔝   Escorts S...➥🔝 7737669865 🔝▻ Ongole Call-girls in Women Seeking Men  🔝Ongole🔝   Escorts S...
➥🔝 7737669865 🔝▻ Ongole Call-girls in Women Seeking Men 🔝Ongole🔝 Escorts S...
 
Call Girls In Doddaballapur Road ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Doddaballapur Road ☎ 7737669865 🥵 Book Your One night StandCall Girls In Doddaballapur Road ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Doddaballapur Road ☎ 7737669865 🥵 Book Your One night Stand
 
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
 
Discover Why Less is More in B2B Research
Discover Why Less is More in B2B ResearchDiscover Why Less is More in B2B Research
Discover Why Less is More in B2B Research
 
Anomaly detection and data imputation within time series
Anomaly detection and data imputation within time seriesAnomaly detection and data imputation within time series
Anomaly detection and data imputation within time series
 
Detecting Credit Card Fraud: A Machine Learning Approach
Detecting Credit Card Fraud: A Machine Learning ApproachDetecting Credit Card Fraud: A Machine Learning Approach
Detecting Credit Card Fraud: A Machine Learning Approach
 
Thane Call Girls 7091864438 Call Girls in Thane Escort service book now -
Thane Call Girls 7091864438 Call Girls in Thane Escort service book now -Thane Call Girls 7091864438 Call Girls in Thane Escort service book now -
Thane Call Girls 7091864438 Call Girls in Thane Escort service book now -
 
SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...
SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...
SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...
 
Call Girls In Attibele ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Attibele ☎ 7737669865 🥵 Book Your One night StandCall Girls In Attibele ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Attibele ☎ 7737669865 🥵 Book Your One night Stand
 

Druid realtime indexing

  • 1. Druid: Overview of real time indexing. Azrael Seoeun Park, seoeun25@gmail.com. Copyright © 2016 kt NexR. All rights reserved.
  • 2. Introduction
      • Indexing Service
      • Design Architecture
      • Tranquility
      • Task Spec
      • Firehose
      • Plumber
      • Tranquility Configs
      • Flow of Realtime Indexing
  • 3. Indexing Service
      • Indexing Service
          – Runs indexing tasks that create Druid segments
      • Indexing task types
          – index_realtime
          – index_hadoop: batch ingestion
          – index
      [Diagram labels: Submit Task, Indexing Service, Data Source, Segment]
  • 4. Design Architecture
      [Diagram labels: Kafka, SparkStreaming, Storm; Tranquility; Indexing Service (Overlord, MiddleManager, Peons running realtime index tasks); ZooKeeper; Coordinator; Broker; Historical nodes with segment caches; Deep Storage; segments]
  • 5. Tranquility
      • Sends event streams to Druid in real time
      • Written in Scala
      • Works with Samza, Spark, Storm, Kafka, Flink
      • Tranquility Kafka
          – Submits a realtime indexing task to the Overlord
              • Posts an HTTP request containing the task spec
          – Pulls the data from Kafka
          – Pushes the data to the realtime indexing task
  • 6. Task Spec
      • "type" : "index_realtime",
      • "id" : "index_realtime_sip_2016-05-17T04:00:00.000Z_0_0",
      • "spec" :
          – "dataSchema" :
              • "dataSource" : "sip",
              • "parser" :
                  – "parseSpec"
                      » "format" : "json",
                      » "timestampSpec"
                      » "dimensionsSpec"
              • "metricsSpec"
              • "granularitySpec"
                  – "segmentGranularity" : "TEN_MINUTE"
                  – "queryGranularity" : "MINUTE"
          – "ioConfig" :
          – "tuningConfig" :
      [Diagram labels: Data Ingestion: index, firehose, plumber, RealtimeIndexTask]
  • 7. RealtimeIndexTask
      [Diagram labels: RealtimeIndexTask, Firehose, Plumber, Data Source, Segment, IncrementalIndex, IndexMerger]
  • 8. Firehose
      • Pipeline for reading data
      • Types
          – LocalFirehose
              • Connects to a local file
          – IngestSegmentFirehose
              • Reads an existing Druid segment
          – CombiningFirehose
          – EventReceiverFirehose
              • Ingests events through an HTTP endpoint
          – TimedShutoffFirehose
              • Shuts down at a specified time
  • 9. Firehose: real time index example
      • Firehose
          – Shutoff time: segmentGranularity + windowPeriod + firehoseGracePeriod
          – Buffer size: firehoseBufferSize

      "ioConfig" : {
        "type" : "realtime",
        "firehose" : {
          "type" : "clipped",
          "delegate" : {
            "type" : "timed",
            "delegate" : {
              "type" : "receiver",
              "serviceName" : "firehose:druid:overlord:sip-00-0000-0000",
              "bufferSize" : 100000
            },
            "shutoffTime" : "2016-05-17T04:15:00.000Z"
          },
          "interval" : "2016-05-17T04:00:00.000Z/2016-05-17T04:10:00.000Z"
        }
      }

      [Diagram labels: EventReceiver (buffer), Timed (shutoffTime), Clip, Tranquility, nextRow, Plumber, push]
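The shutoff-time rule on the Firehose example slide can be sketched as plain date arithmetic. This is only an illustration, not Tranquility's code: the function name is hypothetical, "segmentGranularity" is read here as the end of the segment interval, and the windowPeriod of zero and grace period of five minutes are assumed values picked so that the result lines up with the "shutoffTime" in the example ioConfig.

```python
from datetime import datetime, timedelta, timezone


def firehose_shutoff_time(interval_end, window_period, grace_period):
    """Hypothetical helper: segment interval end + windowPeriod + firehoseGracePeriod."""
    return interval_end + window_period + grace_period


# End of the example interval 2016-05-17T04:00:00.000Z/2016-05-17T04:10:00.000Z.
interval_end = datetime(2016, 5, 17, 4, 10, tzinfo=timezone.utc)

# Assumed values: windowPeriod = 0, firehoseGracePeriod = PT5M.
shutoff = firehose_shutoff_time(
    interval_end,
    window_period=timedelta(0),
    grace_period=timedelta(minutes=5),
)
print(shutoff.isoformat())
```

A larger windowPeriod pushes the shutoff later by the same amount, which is why late-arriving events can still be accepted after the segment interval has ended.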
  • 10. Plumber
      • Generates segments
          – Intermediate persist
              • While the indexing task is running, the index is saved as a segment
              • The segment is saved under the local base directory
              • When a task is run again with the same id, the segments saved by intermediate persists can be recovered
          – When the task finishes
              • The entire index is saved as a segment
              • The segment is pushed to deep storage
      • Types
          – YeOldePlumber
              • This plumber creates single historical segments.
          – RealtimePlumber
              • This plumber creates real-time/mutable segments.
  • 11. How indexing works
      [Diagram labels: Firehose, row, currentHydrant (Index), FireHydrant (Index), FireHydrant (Index), Sink, Plumber]
      • Intermediate persist
          – …/task/index_realtime_sip_2016-05-18T05:10:00.000Z_0_0/work/persist/sip/2016-05-18T05:10:00.000Z_2016-05-18T05:20:00.000Z/0/v8-tmp
      • Persist and merge
          – …/task/index_realtime_sip_2016-05-18T05:10:00.000Z_0_0/work/persist/sip/2016-05-18T05:10:00.000Z_2016-05-18T05:20:00.000Z/merged/v8-tmp
  • 12. Abstraction of index
      • Structure
          – TimeAndDims
          – Aggregators
      • Input events
          {"timestamp": "2016-05-18T04:31:39Z", "sip": "a", "packet_total": "10"}
          {"timestamp": "2016-05-18T04:31:39Z", "sip": "b", "packet_total": "3"}
          {"timestamp": "2016-05-18T04:33:42Z", "sip": "a", "packet_total": "10"}
          {"timestamp": "2016-05-18T04:33:42Z", "sip": "c", "packet_total": "5"}
          {"timestamp": "2016-05-18T04:37:55Z", "sip": "a", "packet_total": "10"}
          {"timestamp": "2016-05-18T04:45:11Z", "sip": "a", "packet_total": "7"}
          {"timestamp": "2016-05-18T04:45:11Z", "sip": "b", "packet_total": "8"}
          {"timestamp": "2016-05-18T04:45:22Z", "sip": "b", "packet_total": "8"}
      • Resulting index
          time      sip  sum
          04:30:00  a    10 + 10 + 10 = 30
          04:30:00  b    3
          04:30:00  c    5
          04:40:00  a    7
          04:40:00  b    8 + 8 = 16
      • Time unit: queryGranularity
          – Each queryGranularity bucket, per combination of dimension values, becomes one row
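The rollup shown on the "Abstraction of index" slide can be reproduced with a few lines of Python. This is a minimal sketch of the idea (truncate the timestamp to the granularity bucket, group by time and dimension, sum the metric), not Druid's IncrementalIndex. The ten-minute bucket is an assumption inferred from the 04:30:00 and 04:40:00 rows in the slide's table, and the function names are hypothetical.

```python
from collections import defaultdict
from datetime import datetime

events = [
    {"timestamp": "2016-05-18T04:31:39Z", "sip": "a", "packet_total": "10"},
    {"timestamp": "2016-05-18T04:31:39Z", "sip": "b", "packet_total": "3"},
    {"timestamp": "2016-05-18T04:33:42Z", "sip": "a", "packet_total": "10"},
    {"timestamp": "2016-05-18T04:33:42Z", "sip": "c", "packet_total": "5"},
    {"timestamp": "2016-05-18T04:37:55Z", "sip": "a", "packet_total": "10"},
    {"timestamp": "2016-05-18T04:45:11Z", "sip": "a", "packet_total": "7"},
    {"timestamp": "2016-05-18T04:45:11Z", "sip": "b", "packet_total": "8"},
    {"timestamp": "2016-05-18T04:45:22Z", "sip": "b", "packet_total": "8"},
]


def truncate(timestamp, bucket_minutes):
    """Floor an ISO timestamp to the start of its bucket."""
    t = datetime.strptime(timestamp, "%Y-%m-%dT%H:%M:%SZ")
    return t.replace(minute=t.minute - t.minute % bucket_minutes, second=0)


def rollup(rows, bucket_minutes=10):
    """Sum packet_total per (truncated time, sip) key."""
    totals = defaultdict(int)
    for row in rows:
        bucket = truncate(row["timestamp"], bucket_minutes).strftime("%H:%M:%S")
        totals[(bucket, row["sip"])] += int(row["packet_total"])
    return dict(totals)


for (time, sip), total in sorted(rollup(events).items()):
    print(time, sip, total)
```

Eight input events collapse into five stored rows, which is the storage saving that rollup at ingestion time is meant to provide.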
  • 13. Plumber: real time index example
      • Plumber
          – maxRowsInMemory: maximum number of rows held in memory before a persist
          – intermediatePersistPeriod: interval between persists
          – maxPendingPersists: number of persists that may be pending. 0 = only one persist runs at a time
      • How to set the persist period?
          – A larger maxRowsInMemory increases memory usage
          – A shorter intermediatePersistPeriod increases memory usage
              • With a longer period, more data is lost on recovery
              • Recovery gaps in stream processing are compensated for by batch ingestion

      "tuningConfig" : {
        "type" : "realtime",
        "maxRowsInMemory" : 1000,
        "intermediatePersistPeriod" : "PT2M",
        "windowPeriod" : "PT1S",
        "basePersistDirectory" : "/Users/seoeun/libs/druid-0.9.0/var/tmp/1463447509447-0",
        "maxPendingPersists" : 0,
        …
      }
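How maxRowsInMemory and intermediatePersistPeriod interact can be illustrated with a toy model in which a persist fires when either limit is reached. This is a hedged sketch of the two triggers described on the slide, not RealtimePlumber's actual logic; the class and method names are hypothetical, and time is passed in as plain seconds to keep the example deterministic.

```python
class PersistPolicy:
    """Toy model: persist when the row limit or the persist period is reached."""

    def __init__(self, max_rows_in_memory, intermediate_persist_period_s):
        self.max_rows = max_rows_in_memory
        self.period_s = intermediate_persist_period_s
        self.rows_in_memory = 0
        self.last_persist_s = 0

    def add_row(self, now_s):
        """Record one row; return True if this row triggers a persist."""
        self.rows_in_memory += 1
        row_limit_hit = self.rows_in_memory >= self.max_rows
        period_elapsed = now_s - self.last_persist_s >= self.period_s
        if row_limit_hit or period_elapsed:
            # After a persist the in-memory index starts empty again.
            self.rows_in_memory = 0
            self.last_persist_s = now_s
            return True
        return False


# Values from the example tuningConfig: 1000 rows, PT2M = 120 seconds.
policy = PersistPolicy(max_rows_in_memory=1000, intermediate_persist_period_s=120)
persists = sum(policy.add_row(now_s=second) for second in range(600))
print(persists)  # one row per second for ten minutes: the period, not the row limit, triggers
```

Under this model, at most one period's worth of rows (or maxRowsInMemory rows, whichever is smaller) is held only in memory, which is the data at risk if the task dies before the next persist.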
  • 14. Segment
      • Files that store the index, partitioned by time
      • One is created for each time interval configured in "segmentGranularity"
      • File size: 300 MB to 700 MB
      • Row count: 5 million
      • Components
          – version.bin
          – meta.smoosh
          – XXXXX.smoosh
  • 15. Tranquility Kafka config: Druid
      • druid.discovery.curator.path
          – Curator service discovery path
          – Default: /druid/discovery
      • druid.selectors.indexing.serviceName
          – Service name of the Overlord node
          – Default: druid/overlord
      • druidBeam.firehoseBufferSize
          – Size of the buffer used by the firehose to store events
          – Default: 100,000
      • druidBeam.firehoseChunkSize
          – Maximum number of events to send to Druid in one HTTP request
          – Default: 1,000
      • druidBeam.firehoseGracePeriod
          – Druid indexing tasks will shut down this long after the windowPeriod has elapsed
          – Default: PT5M
      • task.partitions
          – Number of Druid partitions to create
          – Default: 1
      • task.replicants
          – Number of instances of each Druid partition to create
          – Default: 1
  • 16. Tranquility Kafka config: Tranquility
      • tranquility.blockOnFull
          – Whether "send" will block (true) or throw an exception (false) when called while the outgoing queue is full
          – Default: true
      • tranquility.lingerMillis
          – Wait this long for batches to collect more messages (up to maxBatchSize) before sending them
          – Default: 0 (waiting disabled)
      • tranquility.maxBatchSize
          – Maximum number of messages to send at once
          – Default: 2,000
      • tranquility.maxPendingBatches
          – Maximum number of batches that can be pending
          – Default: 5
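The difference between blocking and throwing when the outgoing queue is full, and the role of maxBatchSize, can be shown with a small bounded buffer. This is a hypothetical model built on Python's standard `queue` module, not Tranquility's sender; the class name and the explicit `capacity` parameter are assumptions, since the slide does not say how the queue capacity is derived.

```python
import queue


class OutgoingBuffer:
    """Toy bounded buffer mirroring blockOnFull and maxBatchSize."""

    def __init__(self, capacity, max_batch_size, block_on_full):
        self._queue = queue.Queue(maxsize=capacity)  # capacity is a made-up parameter
        self.max_batch_size = max_batch_size
        self.block_on_full = block_on_full

    def send(self, message):
        # block_on_full=True waits for space; False raises queue.Full immediately.
        self._queue.put(message, block=self.block_on_full)

    def next_batch(self):
        """Drain up to max_batch_size messages without waiting for more."""
        batch = []
        while len(batch) < self.max_batch_size:
            try:
                batch.append(self._queue.get_nowait())
            except queue.Empty:
                break
        return batch


buffer = OutgoingBuffer(capacity=3, max_batch_size=2, block_on_full=False)
for event in ("e1", "e2", "e3"):
    buffer.send(event)

try:
    buffer.send("e4")
    overflowed = False
except queue.Full:
    overflowed = True

print(overflowed, buffer.next_batch())
```

Draining without waiting corresponds to a lingerMillis of 0: a batch goes out with whatever is available, even if it is smaller than maxBatchSize.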
  • 17. Tranquility Kafka config: Kafka
      • kafka.group.id
          – Group ID for Kafka consumers. This must be the same!
          – Default: tranquility-kafka
      • consumer.numThreads
          – The number of threads that will be made available to the Kafka consumer for fetching messages
          – Default: -1
          – Work is distributed by dividing the number of partitions by the number of consumers
      • commit.periodMillis
          – The frequency with which consumer offsets will be committed to ZooKeeper to track processed messages
          – Default: 15000
      • kafka.*
          – Properties prefixed with kafka. are passed to the underlying Kafka consumer with the kafka. prefix removed
          – kafka.fetch.message.max.bytes → fetch.message.max.bytes
              • The number of bytes of messages to attempt to fetch for each topic-partition in each fetch request
              • This is multiplied by the number of partitions, so watch memory usage
              • Set it larger than the broker's message.max.bytes
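The kafka.* pass-through rule and the memory warning for fetch.message.max.bytes can be sketched as follows. Both helper names are hypothetical, the fetch size of 1048576 bytes and the partition count of 8 are made-up example values, and the multiplication is only the rough upper bound implied by the slide, not a measurement of real consumer memory.

```python
def kafka_consumer_properties(config):
    """Keep only kafka.* keys and strip the prefix, as the slide describes."""
    prefix = "kafka."
    return {
        key[len(prefix):]: value
        for key, value in config.items()
        if key.startswith(prefix)
    }


def fetch_buffer_bytes(fetch_message_max_bytes, partitions):
    """Rough bound: one fetch buffer per topic-partition."""
    return fetch_message_max_bytes * partitions


config = {
    "kafka.group.id": "tranquility-kafka",
    "kafka.fetch.message.max.bytes": "1048576",  # assumed example value
    "consumer.numThreads": "-1",                 # not kafka.*: not passed through
}

consumer_props = kafka_consumer_properties(config)
print(consumer_props)
print(fetch_buffer_bytes(int(consumer_props["fetch.message.max.bytes"]), partitions=8))
```

With these example numbers, eight partitions already account for 8 MiB of fetch buffers, so raising the fetch size to exceed the broker's message.max.bytes should be weighed against the partition count.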
  • 18. Flow of real time indexing
      [Diagram labels: Tranquility, Overlord, ZooKeeper, MiddleManager, forked tasks, RealtimeIndexTask (firehose, plumber), segment, Deep Storage; arrows: submit task, push data]
  • 19. Q&A