End to End Processing of 3.7 Million Telemetry Events per Second using Lambda Architecture

End to End Processing of 3.7 Million Telemetry Events per
Second Using Lambda Architecture
Saurabh Mishra Raghavendra Nandagopal

Who are We?
Saurabh Mishra
Solution Architect, Hortonworks Professional Services
@draftsperson
smishra@hortonworks.com
Raghavendra Nandagopal
Cloud Data Services Architect, Symantec
@speaktoraghav
raghavendra_nandagop@symantec.com

Cloud Platform Engineering
Symantec - Global Leader in Cyber Security
- Symantec is the world leader in providing security software for both enterprises and end users
- There are 1000’s of Enterprise Customers and more than 400 million devices (Pcs, Tablets and
Phones) that rely on Symantec to help them secure their assets from attacks, including their
data centers, emails and other sensitive data
Cloud Platform Engineering (CPE)
- Build consolidated cloud infrastructure and platform services for next generation data powered
Symantec applications
- A big data platform for batch and stream analytics integrated with both private and public cloud.
- Open source components as building blocks
- Hadoop and Openstack
- Bridge feature gaps and contribute back

Agenda
• Security Data Lake @ Global Scale
• Infrastructure At Scale
• Telemetry Data Processing Architecture
• Tunable Targets
• Performance Benchmarks
• Service Monitoring

Security Data Lake @ Global Scale

Security Data Lake @ Global Scale
Products
HDFS
Analytic Applications, Workload Management (YARN)
Stream Processing
(Storm)
Real-time Results
(HBase, ElasticSearch)
Query
(Hive, Spark SQL)
Device
Agents
Telemetry Data
Data Transfer
Threat Protection
Inbound Messaging
(Data Import, Kafka)
Physical Machine , Public Cloud, Private Cloud

Lambda Architecture
7
Speed Layer
Batch Layer
Serving
Layer
Complexity Isolation
Once Data Makes to Serving Layer via Batch , Speed Layer can be Neglected
Compensate for high latency of updates to serving layer
Fast and incremental algorithms on real time data
Batch layer eventually overrides speed layer
Random access to batch views
Updated by batch layer
Stores master dataset
Batch layer stores the master copy of the Serving layer
Computes arbitrary Views

Yarn Applications in Production
- 669388 submitted
- 646632 completed
- 4640 killed
- 401 failed
Hive in Production
- 25887 Tables
- 306 Databases
- 98493 Partitions
Storm in Production
- 210 Nodes 50+ Topologies
Kafka in Production
- 80 Nodes
Hbase in Production
- 135 Nodes
ElasticSearch in Production
- 62 Nodes
Ambari
Infrastructure At Scale
Centralize
d Logging
and
Metering
Ironic
Ansible
Cloudbreak
Hybrid Data Lake
OpenStack
(Dev)
350 Nodes
Metal
(Production)
600 Nodes
Public
Cloud
(Production
)
200 Nodes

Telemetry Data Processing Architecture

Telemetry Data Processing Architecture
Telemetry Data
Collector
Telemetry Gateway
Raw
Events
Data Centers
Avro Serialized
Telemetry Avro Serialized
Telemetry
Opaque Trident
Kafka Spout
Deserialized
Objects
Transformations Functions
Transformation Topology
Trident Streams (Micro batch implementation, Exactly Once semantics)
Persist Avro
Objects
Avro Serialized
Transformed Objects
ElasticSearc
h Ingestion
Topology
Opaque
Trident
Kafka Spout
Trident
ElasticSearch
Writer Bolt
HBase Ingestion
Topology
Trident
HBase Bolt
Trident
Hive
Streamin
g
Opaque
Trident
Kafka
Spout
YARN
Hive
Ingestion
Topology
Identity Topology

Operating System
Tuning Targets
● Operating System
● Disk
● Network
Tunables
● Disable Transparent Huge Pages
echo never > defrag and > enabled
● Disable Swap
● Configure VM Cache flushing
● Configure IO Scheduler as deadline
● Disk Jbod Ext4
Mount Options- inode_readahead_blks=128,
data=writeback,noatime,nodiratime
● Network
Dual Bonded 10gbps
rx-checksumming: on, tx-checksumming: on, scatter-gather:
on, tcp-segmentation-offload: on

Kafka
Tuning Targets
● Broker
● Producer
● Consumer
Tunables
● replica.fetch.max.bytes
● socket.send.buffer.bytes
● socket.receive.buffer.bytes
● replica.socket.receive.buffer.bytes
● num.network.threads
● num.io.threads
● zookeeper.*.timeout.ms
Type
Metal 2.6 GHz
E5-2660 v3
12 * 4TB
JBOD
128 GB
DDR4 ECC
Cloud AWS: D2*8xlarge
v 0.8.2.1

Kafka
Tuning Targets
● Broker
● Producer
● Consumer
Tunables
● buffer.memory
● batch.size
● linger.ms
● compression.type
● socket.send.buffer.bytes
Type
Metal 2.6 GHz
E5-2660 v3
12 * 4TB
JBOD
128 GB
DDR4 ECC
v 0.8.2.1

Kafka
Tuning Targets
● Broker
● Producer
● Consumer
Tunables
● num.consumer.fetchers
● socket.receive.buffer.bytes
● fetch.message.max.bytes
● fetch.min.bytes
Type
Metal 2.6 GHz
E5-2660 v3
12 * 4TB
JBOD
128 GB
DDR4 ECC
v 0.8.2.1

Storm
Tuning Targets
● Nimbus
● Supervisors
● Workers and Executors
● Topology
Tunables
● Nimbus High Availability - 4 Nimbus Servers
Avoid downtime and performance degradation.
● Reduce workload on Zookeeper and Decreased
topology Submission time.
storm.codedistributor.class = HDFSCodeDistributor
● topology.min.replication.count = 3
floor(number_of_nimbus_hosts/2 + 1)
● max.replication.wait.time.sec = -1
● code.sync.freq.secs = 2 mins
● storm.messaging.netty.buffer_size = 10 mb
● nimbus.thrift.threads = 256
Type
Metal 2.6 GHz
E5-2660 v3
2 * 500GB
SSD
256 GB
DDR4 ECC
Cloud AWS: r3*8xlarge
v 0.10.0.2.4

Storm
Tuning Targets
● Nimbus
● Supervisors
● Topology
Tunables
● Use supervisord to control Supervisors
● Supervisor.slots.ports = Min (No of HT Cores ,
TotalMem of Server/Worker heap size)
● Supervisor.childopts = -Xms4096m -Xmx4096m -
verbose:gc-
Xloggc:/var/log/storm/supervisor_%ID%_gc.log
Type
Metal 2.6 GHz
E5-2660 v3
2 * 500GB
SSD
256 GB
DDR4 ECC
v 0.10.0.2.4

Storm
Tuning Targets
● Nimbus
● Supervisors
● Topology
Tunables
Rule of Thumb! - Use Case of Storm – Telemetry
Processing
● CPU bound tasks 1 Executor Per worker
● IO Bound tasks 8 Executors Per worker,
● Fixed the JVM Memory for each Worker Based on
Fetch Size of Kafka Trident Spout and Split Size of
Bolt
-Xms8g -Xmx8g -XX:MaxDirectMemorySize=2048m
-XX:NewSize=2g -XX:MaxNewSize=2g
-XX:+UseParNewGC -XX:MaxTenuringThreshold=2 -XX:SurvivorRatio=8 -
XX:+UnlockDiagnosticVMOptions -XX:ParGCCardsPerStrideChunk=32768 -
XX:+UseConcMarkSweepGC
-XX:+CMSParallelRemarkEnabled -XX:+ParallelRefProcEnabled -
XX:+CMSClassUnloadingEnabled -XX:CMSInitiatingOccupancyFraction=80 -
XX:+UseCMSInitiatingOccupancyOnly -XX:-CMSConcurrentMTEnabled -
XX:+AlwaysPreTouch
Type
Metal 2.6 GHz
E5-2660 v3
2 * 500GB
SSD
256 GB
DDR4 ECC
v 0.10.0.2.4

Storm
Tuning Targets
● Nimbus
● Supervisors
● Topology
Tunables
● Topology.optimize = true
● Topology.message.timeout.secs = 110
● Topology.max.spout.pending = 3
● Remove Topology.metrics.consumer.register- AMBARI-13237
Incoming queue and Outgoing queue
● Topology.transfer.buffer.size = 64 – Batch Size
● Topology.receiver.buffer.size = 16 – Queue Size
● Topology.executor.receive.buffer.size = 32768
● Topology.executor.send.buffer.size = 32768
Type
Metal 2.6 GHz
E5-2660 v3
2 * 500GB
SSD
256 GB
DDR4 ECC
v 0.10.0.2.4

Storm
Tuning Targets
● Nimbus
● Supervisors
● Topology
Tunables
● topology.trident.parallelism.hint = (number of worker
nodes in cluster * number cores per worker node) -
(number of acker tasks)
● Kafka.consumer.fetch.size.byte = 209715200
( 200MB - Yes! We process large batches)
● Kafka.consumer.buffer.size.byte = 209715200
● Kafka.consumer.min.fetch.byte = 100428800
Type
Metal 2.6 GHz
E5-2660 v3
2 * 500GB
SSD
256 GB
DDR4 ECC
v 0.10.0.2.4

ZooKeeper
Tunables
● Keep data and log directories separately and on different
mounts
● Separate Zookeeper quorum of 5 Servers each for Kafka,
Storm, Hbase, HA quorum.
● Zookeeper GC Configurations
-Xms4192m -Xmx4192m -XX:ParallelGCThreads=8 -XX:+UseConcMarkSweepGC
-Xloggc:gc.log -XX:+PrintGCApplicationStoppedTime -
XX:+PrintGCApplicationConcurrentTime -XX:+PrintGC -XX:+PrintGCTimeStamps -
XX:+PrintGCDetails -verbose:gc -Xloggc:/var/log/zookeeper/zookeeper_gc.log
Type
Metal 2.6 GHz
E5-2660 v3
2*400 GB
SSD
128 GB
DDR4 ECC
Cloud AWS: r3.2xlarge
v 3.4.6
Tuning Targets
● Data and Log directory
● Garbage Collection

Elasticsearch
Type
Metal 2.6 GHz
E5-2660 v3
14 * 400 GB
SSD
256 GB
DDR4 ECC
Cloud AWS: i2.4xlarge
Tunables
● bootstrap.mlockall: true
● indices.fielddata.cache.size: 25%
● threadpool.bulk.queue_size: 5000
● index.refresh_interval: 30s
● Index.memory.index_buffer_size: 10%
● index.store.type: mmapfs
● GC settings: -verbose:gc -Xloggc:/var/log/elasticsearch/elasticsearch_gc.log -
Xss256k -Djava.awt.headless=true -XX:+UseCompressedOops -XX:+UseG1GC -
XX:MaxGCPauseMillis=200 -XX:+DisableExplicitGC -XX:+PrintGCDateStamps -
XX:+PrintGCDetails -XX:+PrintGCTimeStamps -
XX:ErrorFile=/var/log/elasticsearch_err.log -XX:ParallelGCThreads=8
● Bulk api
● Client node
v 1.7.5
Tuning Targets
● Index Parameters
● Garbage Collection

Hbase
Tuning Targets
● Region server GC
● Hbase Configurations
Tunables
export
HBASE_REGIONSERVER_OPTS="$HBASE_REGIONSERVE
R_OPTS -XX:+UseConcMarkSweepGC -Xmn2500m -
XX:SurvivorRatio=4 -XX:CMSInitiatingOccupancyFraction=50
-XX:+UseCMSInitiatingOccupancyOnly
-Xmx{{regionserver_heapsize}}
-Xms{{regionserver_heapsize}}
-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -
XX:+PrintPromotionFailure -XX:+PrintTenuringDistribution -
XX:+PrintGCApplicationStoppedTime
-Xloggc:${HBASE_LOG_DIR}/hbase-gc-regionserver.log.`date
+'%Y%m%d%H%M'`
Type
Metal 2.6 GHz
E5-2660 v3
14 * 400 GB
SSD
256 GB
DDR4 ECC
Cloud AWS: i2.8xlarge
v 1.1.0

Hive
Tuning Targets
● Table Structure
● Partition and Bucketing
Scheme
● Orc Tuning
Tunables
● Use strings instead of binaries.
● Use Integer fields.
Type
Metal 2.6 GHz
E5-2660 v3
14 * 6TB
HDD
256 GB
DDR4 ECC
Cloud AWS: d2*8xlarge
v 1.2.1

Hive
Tuning Targets
● Table Structure
Scheme
● Orc Tuning
Tunables
● Partitioning by Date Timestamp.
● Additional partitioning - Resulted in explosion of number of
partitions, small file size and inefficient ORC compression.
● Bucketing: If two tables bucket on the same column, they should
use the same number of buckets to support joining
● Sorting : Each table should optimize its sorting. The bucket
column typically should be the first sorted column.
Type
Metal 2.6 GHz
E5-2660 v3
14 * 6TB
HDD
256 GB
DDR4 ECC
v 1.2.1

Hive
Tuning Targets
● Table Structure
Scheme
● Orc Tuning
Tunables
● Table Structure , Bucketing and Partition and Sorting
Impact ORC Performance.
● ORC Stripe Size default 128MB Balanced Insert and
Query Optimized.
● ORC use ZLIB Compression. Smaller data size
improves any query.
● Predicate Push Down.
Type
Metal 2.6 GHz
E5-2660 v3
14 * 6TB
HDD
256 GB
DDR4 ECC
v 1.2.1
No of Yarn Containers Per Query
orc.compress ZLIB high level compression (one of NONE, ZLIB, SNAPPY)
orc.compress.size 262144 number of bytes in each compression chunk
orc.stripe.size 130023424 number of bytes in each stripe
orc.row.index.stride 64,000 number of rows between index entries (must be >= 1000)
orc.create.index true whether to create row indexes
orc.bloom.filter.column
s
"file_sha2" comma separated list of column names for which bloom filter
created
orc.bloom.filter.fpp 0.05 false positive probability for bloom filter (must >0.0 and <1.0)

Hive Streaming
Tuning Targets
● Hive Metastore Stability
● Evaluate BatchSize & TxnsPerBatch
Tunables
● No Hive Shell Access only Hiveserver2.
● Multiple Hive Metastore Process
○ Compaction Metastore - 5 - 10 Compaction
thread
○ Streaming Metastore - 5 - Connection pool
● 16 GB Heap Size.
● Metastore Mysql Database Scalability.
● Maximum EPS was achieved by Increasing Batch Size
and keeping TxnPerBatch as Smaller.
Type
Metal 2.6 GHz
E5-2660 v3
14 * 6TB
HDD
256 GB
DDR4 ECC
v 1.2.1

Benchmarking Suite
Kafka Producer Consumer Throughput Test
Storm Core and Trident Topologies
Standard Platform Test Suite
Hive TPC

Kafka Producer and Consumer Tests
The benchmark set contains Producer and Consumer test executing at various message size.
Producer and Consumer Together
● 100 bytes
● 1000 bytes - Average Telemetry Event Size
● 10000 bytes
● 100000 bytes
● 500000 bytes
● 1000000 bytes
Type of Tests
● Single thread no replication
● Single-thread, async 3x replication
● Single-thread, sync 3x replication
● Throughput Versus Stored Data
Ingesting 10 Telemetry Sources in Parallel

Storm Topology
The benchmark set custom topologies for processing telemetry data source transformation and ingestion
which simulates end to end use cases for real time streaming of Telemetry.
● Storm Trident HDFS Telemetry Transformation and Ingestion
● Storm Trident Hive Telemetry Ingestion
● Storm Trident Hbase Telemetry Ingestion
● Storm Trident Elasticsearch Telemetry Ingestion
Ingesting 10 Telemetry Sources in Parallel

Standard Platform Tests
TeraSort benchmark suite ( 2TB , 5TB, 10TB)
RandomWriter(Write and Sort) - 10GB of random data per node
DFS-IO Write and Read - TestDFSIO
NNBench (Write, Read, Rename and Delete)
MRBench
Data Load (Upload and Download)

TPC-DS 20TB
TPC-DS: Decision Support Performance Benchmarks
● Classic EDW Dimensional model
● Large fact tables
● Complex queries
Scale: 20TB
TPC-DS Benchmarking

Service Monitoring Architecture

Storm Kafka Logging Collection

End to End Processing of 3.7 Million Telemetry Events per Second using Lambda Architecture

Recommended

Recommended

More Related Content

What's hot

What's hot (20)

Viewers also liked

Viewers also liked (20)

Similar to End to End Processing of 3.7 Million Telemetry Events per Second using Lambda Architecture

Similar to End to End Processing of 3.7 Million Telemetry Events per Second using Lambda Architecture (20)

More from DataWorks Summit/Hadoop Summit

More from DataWorks Summit/Hadoop Summit (20)

Recently uploaded

Recently uploaded (20)

End to End Processing of 3.7 Million Telemetry Events per Second using Lambda Architecture