SignalFx
Scaling ingest pipelines with
high performance computing principles
Rajiv Kurian, Software Engineer
rajiv@signalfx.com
Agenda
1. Why we need to scale ingest
2. Basic properties and limitations of modern
hardware
3. Optimization techniques inspired by HPC
4. Results!
5. Q&A (hopefully!)
Why we need to scale ingest
• High resolution:
• Up to 1 sec
• Streaming analytics:
• Charts/analytics update @1sec
• Real time
• Multidimensional metrics:
• Dimensions : representing customer, server etc
• Filter, aggregate : 99th-pct-latency-by-service,customer
SignalFx is an advanced monitoring platform for modern applications
Ingest pipeline
[Diagram: raw time series data enters the pipeline, whose stages are labeled REST/rate control, rollups, and persist; processed data flows on to analytics.]
SignalFx ingest library
[Diagram: raw data in, rollup data out; the library keeps one rollup per time series, shown as TimeSeries 0 rollup through TimeSeries 8 rollup.]
Issues identified (before applying HPC techniques)
• Expensive - too many servers

• Exhibits parallel slowdown
• More threads = worse performance

• What did the profile say?
• Death by a thousand cuts
• The core library = 35% of profile
Basic properties and limitations
of modern hardware
[Diagram: two cores (Core 1, Core 2), each with its own L1 data cache, L1 instruction cache, and L2 cache, sharing an L3 cache in front of main memory.]
Cache Lines
• Data is transferred between memory and cache in blocks of
fixed size, called cache lines. Usually 64 bytes
• When the processor needs to read or write a location in
main memory, it first checks for a corresponding entry in the
cache. In the case of:
• a cache hit, the processor immediately reads or writes
the data in the cache line
• a cache miss, the cache allocates a new entry and
copies in data from main memory, then the request (read
or write) is fulfilled from the contents of the cache
• The memory subsystem makes two kinds of bets to help us:
• Temporal locality
• Spatial locality
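The spatial-locality bet can be illustrated with a small sketch. It is not from the talk: the class and method names are hypothetical, and the timings it prints depend entirely on the machine and JVM. Both methods compute the same sum over a matrix stored in one flat array; the first walks memory in layout order, so each fetched cache line is fully used, while the second jumps a whole row between reads.

```java
final class TraversalOrder {
    // Visits elements in the order they are laid out in memory (row by row).
    static long sumRowMajor(int[] data, int rows, int cols) {
        long sum = 0;
        for (int r = 0; r < rows; r++) {
            for (int c = 0; c < cols; c++) {
                sum += data[r * cols + c];
            }
        }
        return sum;
    }

    // Visits the same elements column by column: consecutive reads are
    // `cols` ints apart, so they usually land on different cache lines.
    static long sumColumnMajor(int[] data, int rows, int cols) {
        long sum = 0;
        for (int c = 0; c < cols; c++) {
            for (int r = 0; r < rows; r++) {
                sum += data[r * cols + c];
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        int rows = 2048, cols = 2048; // illustrative size, larger than typical caches
        int[] data = new int[rows * cols];
        java.util.Arrays.fill(data, 1);

        long t0 = System.nanoTime();
        long a = sumRowMajor(data, rows, cols);
        long t1 = System.nanoTime();
        long b = sumColumnMajor(data, rows, cols);
        long t2 = System.nanoTime();

        System.out.println("row-major:    " + a + " in " + (t1 - t0) / 1_000 + " us");
        System.out.println("column-major: " + b + " in " + (t2 - t1) / 1_000 + " us");
    }
}
```

A single run like this is itself a micro benchmark, so treat the numbers as a rough indication only.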
Reference latency numbers for comparison
By Jeff Dean: http://research.google.com/people/jeff/
Operation                             Latency (ns)      Latency (ms)   Comparison
L1 Cache                              0.5 ns
Branch mispredict                     5 ns
L2 Cache                              7 ns                             14x L1 Cache
Mutex lock/unlock                     25 ns
Main memory                           100 ns                           20x L2 Cache, 200x L1 Cache
Compress 1K bytes (Zippy)             3,000 ns
Send 1K bytes over 1Gbps              10,000 ns         0.01 ms
Read 4K randomly from SSD             150,000 ns        0.15 ms
Read 1MB sequentially from memory     250,000 ns        0.25 ms
Round trip within same DC             500,000 ns        0.5 ms
Read 1MB sequentially from SSD        1,000,000 ns      1 ms           4x memory
Disk seek                             10,000,000 ns     10 ms          20x DC roundtrip
Read 1MB sequentially from disk       20,000,000 ns     20 ms          80x memory, 20x SSD
Send packet CA->Netherlands->CA       150,000,000 ns    150 ms
[Diagram sequence: the core shown next to L1, then L2, then main memory.]
Our optimization goal
Convert a memory bandwidth bound
application to a CPU bound application
Things we kept in mind
• Measure, measure, measure!

• Don’t rely on micro benchmarks alone
Benchmark
SignalFx library benchmark
Raw data in, in random order, one per time series. 50x
Rollup data out:
Key       Value
ID 0      TimeSeries rollup 0
ID 1      TimeSeries rollup 1
ID 2      TimeSeries rollup 2
ID 3      TimeSeries rollup 3
ID 4      TimeSeries rollup 4
…..       …..
Key 1M    TimeSeries rollup 1M
35% of the profile of the entire application
Techniques inspired by HPC that have
improved our pipeline
Single threaded, event based architectures:
parallelize by running multiple copies of
single threaded code
Single threaded event based architectures
• Threads work on their own private
data (as much as possible)

• Communicate with other threads
using events/messages
[Diagram: a network-in thread receives data, processor thread(s) process it against their own local key/value data, and a network-out thread writes batched data. The threads hand work to each other as events.]
[Diagram: the same pipeline with a ring buffer between each pair of stages.]
Single threaded event based architectures advantages
• It enables many other optimal choices like
• Compact array based data structures
• Buffer/object re-use

• Loosely coupled - easy to test

• Run multiple copies for parallelism
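As a rough illustration of threads passing events through a ring buffer instead of sharing data, here is a minimal single-producer, single-consumer sketch. It is an assumption-laden example rather than the code used at SignalFx: the class name `SpscRingBuffer`, the `long` payload, and the power-of-two capacity rule are all choices made here for brevity. Exactly one thread may call `offer` and exactly one thread may call `drain`.

```java
import java.util.concurrent.atomic.AtomicLong;

final class SpscRingBuffer {
    private final long[] slots;
    private final int mask;
    // head: next sequence to read, written only by the consumer.
    private final AtomicLong head = new AtomicLong();
    // tail: next sequence to write, written only by the producer.
    private final AtomicLong tail = new AtomicLong();

    SpscRingBuffer(int capacityPowerOfTwo) {
        if (capacityPowerOfTwo <= 0 || Integer.bitCount(capacityPowerOfTwo) != 1) {
            throw new IllegalArgumentException("capacity must be a positive power of two");
        }
        slots = new long[capacityPowerOfTwo];
        mask = capacityPowerOfTwo - 1;
    }

    /** Producer only. Returns false instead of blocking when the buffer is full. */
    boolean offer(long value) {
        long t = tail.get();
        if (t - head.get() == slots.length) {
            return false;
        }
        slots[(int) (t & mask)] = value;
        tail.set(t + 1); // volatile write publishes the slot to the consumer
        return true;
    }

    /** Consumer only. Copies a batch of pending entries into out and returns the count. */
    int drain(long[] out) {
        long h = head.get();
        int n = (int) Math.min(tail.get() - h, out.length);
        for (int i = 0; i < n; i++) {
            out[i] = slots[(int) ((h + i) & mask)];
        }
        head.set(h + n); // volatile write frees the slots for the producer
        return n;
    }
}
```

Draining in batches lets the consumer pay the cross-thread coordination cost once per batch rather than once per event, which fits the batching theme of the talk.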
[Diagram: network-in thread, worker thread(s), and network-out thread connected by ring buffers, with steps numbered 1 to 4; each worker keeps its own local key/value data.]
[Diagram: three worker threads, each receiving data using async IO, processing it synchronously, and writing data using async IO; each thread owns a separate slice of the local data (keys 1 to 4, 5 to 8, and 9 to 12).]
[Diagram: network thread, processor thread(s), and async IO thread connected by ring buffers, with steps numbered 1 to 7; IO calls are batched.]
Advice for threaded applications
• Threads should ideally reflect the actual
parallelism of the system.
• Avoid gratuitous oversubscription
• Exception: IO threads?

• DO NOT communicate unless you have to
Techniques inspired by HPC that have
improved our pipeline
Use compact, cache-conscious, array based
data structures with minimal indirection
[Diagram: two cores (Core 1, Core 2), each with its own L1 data cache, L1 instruction cache, and L2 cache, sharing an L3 cache in front of main memory.]
Basic principles
• Strive for smaller data structures
• Extra computation is ok
• E.g. Compressing network data

• Design data structures that facilitate
processing multiple entries—big
arrays!

• Layout should reflect access patterns
Hash maps
• Hash map lookups are NOT free!

• A lookup in a well-implemented hash map is by definition a cache miss: a good hash spreads keys evenly, so consecutive lookups land on unrelated cache lines

• Popular implementations like
java.util.HashMap can cause multiple
cache misses
Typical hash map implemented as an array of lists of key* | value*
[Diagram sequence: cache misses in a typical hash-map implementation. A bucket array points to lists of key* | value* entries, which in turn point to the key and the value; the misses incurred by one lookup are numbered 1 to 4.]
Array of co-located key/value
Key      Value
Key 0    Value 0
Key 1    Value 1
Key 2    Value 2
Key 3    Value 3
Key 4    Value 4
Key 5    Value 5
Key 6    Value 6
Key 7    Value 7
Cache misses with no collision: one (marked 1 on the diagram)
Cache misses with collisions: two (marked 1 and 2 on the diagram)
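The co-located layout above can be sketched as an open-addressing table in which each key sits right next to its value in one primitive array. This is an illustrative sketch, not the talk's implementation: `ColocatedLongMap`, the fixed capacity, the multiplicative hash, and the reserved `EMPTY` key are all assumptions made here. With linear probing, a collision usually means reading the neighbouring entry, which is often on the same cache line.

```java
final class ColocatedLongMap {
    // Sentinel for an unused slot; assumed never to be used as a real key.
    private static final long EMPTY = Long.MIN_VALUE;

    private final long[] table; // [key0, value0, key1, value1, ...]
    private final int mask;
    private int size;

    ColocatedLongMap(int slotsPowerOfTwo) {
        if (slotsPowerOfTwo <= 0 || Integer.bitCount(slotsPowerOfTwo) != 1) {
            throw new IllegalArgumentException("slot count must be a positive power of two");
        }
        table = new long[slotsPowerOfTwo * 2];
        for (int i = 0; i < table.length; i += 2) {
            table[i] = EMPTY;
        }
        mask = slotsPowerOfTwo - 1;
    }

    private int firstSlot(long key) {
        long h = key * 0x9E3779B97F4A7C15L; // simple multiplicative mixing
        return (int) (h ^ (h >>> 32)) & mask;
    }

    void put(long key, long value) {
        if (key == EMPTY) throw new IllegalArgumentException("reserved key");
        int slot = firstSlot(key);
        for (int probes = 0; probes <= mask; probes++) {
            int i = slot << 1;
            if (table[i] == EMPTY) {
                table[i] = key;
                table[i + 1] = value;
                size++;
                return;
            }
            if (table[i] == key) {
                table[i + 1] = value; // overwrite existing entry
                return;
            }
            slot = (slot + 1) & mask; // linear probe to the adjacent entry
        }
        throw new IllegalStateException("table is full");
    }

    long get(long key, long missing) {
        if (key == EMPTY) return missing;
        int slot = firstSlot(key);
        for (int probes = 0; probes <= mask; probes++) {
            int i = slot << 1;
            if (table[i] == key) return table[i + 1];
            if (table[i] == EMPTY) return missing;
            slot = (slot + 1) & mask;
        }
        return missing;
    }

    int size() {
        return size;
    }
}
```

The sketch never resizes or deletes; a production map would need both, which is one reason the talk points to existing libraries.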
Hash map of key to index to an array of structs
Key      Index
Key 0    1
Key 1    6
Key 2    4
Key 3    8
Array of structs: Value 0 through Value 8
Cache misses with collision: two (marked 1 and 2 on the diagram)
New library memory layout
ID      Index
ID 0    1
ID 1    6
ID 2    4
ID 3    8
Array of rollups: TimeSeries rollup 0 through TimeSeries rollup 8
Raw data in, rollup out
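One way to picture this layout is a single ID-to-index lookup followed by plain array accesses. The sketch below is hypothetical: the class `RollupStore`, the choice of count, sum, and max as rollup fields, the fixed capacity, and the hash function are assumptions made for illustration, since the talk does not show the library's actual fields.

```java
import java.util.Arrays;

final class RollupStore {
    // Open-addressed index: time series ID -> row in the rollup arrays.
    private final long[] slotIds;
    private final int[] slotRows; // -1 marks an empty slot
    private final int mask;

    // Rollup state stored as flat arrays, one entry per time series.
    private final long[] counts;
    private final double[] sums;
    private final double[] maxes;
    private int rows;

    RollupStore(int slotsPowerOfTwo) {
        if (slotsPowerOfTwo < 2 || Integer.bitCount(slotsPowerOfTwo) != 1) {
            throw new IllegalArgumentException("slot count must be a power of two >= 2");
        }
        slotIds = new long[slotsPowerOfTwo];
        slotRows = new int[slotsPowerOfTwo];
        Arrays.fill(slotRows, -1);
        mask = slotsPowerOfTwo - 1;
        int maxRows = slotsPowerOfTwo / 2; // at most half full, so probing always terminates
        counts = new long[maxRows];
        sums = new double[maxRows];
        maxes = new double[maxRows];
    }

    /** Returns the row for an ID, assigning the next free row on first sight. */
    int rowFor(long id) {
        long h = id * 0x9E3779B97F4A7C15L;
        int slot = (int) (h ^ (h >>> 32)) & mask;
        while (slotRows[slot] != -1) {
            if (slotIds[slot] == id) return slotRows[slot];
            slot = (slot + 1) & mask;
        }
        if (rows == counts.length) throw new IllegalStateException("store is full");
        slotIds[slot] = id;
        slotRows[slot] = rows;
        maxes[rows] = Double.NEGATIVE_INFINITY;
        return rows++;
    }

    /** Raw data in: one hash lookup, then array updates only. */
    void add(long id, double value) {
        int row = rowFor(id);
        counts[row]++;
        sums[row] += value;
        if (value > maxes[row]) maxes[row] = value;
    }

    long count(int row) { return counts[row]; }
    double sum(int row) { return sums[row]; }
    double max(int row) { return maxes[row]; }
}
```

Because rollups live in dense arrays, a periodic pass over all time series becomes a linear scan rather than a walk over scattered objects.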
Changing hash map implementations
• java.util.HashMap (uses separate chaining and boxes
primitives) to make a long -> int lookup
• Allocations galore

• net.openhft.koloboke primitive open hash map
• 45% improvement

For the JVM use libraries like https://github.com/OpenHFT/Koloboke.
For C++ try https://github.com/preshing/CompareIntegerMaps or similar.
Access patterns
[Diagram: objects 0 to 3, each stored as Field 0 to Field 4 in one row, with hot data and cold data interleaved.]
Group fields accessed together
[Diagram: the hot fields (Field 0, Field 1, Field 2) of every object stored together, and the cold fields (Field 3, Field 4) stored together in a separate region.]
Results of separating hot and cold data
A hot loop run about once every 500 ms
• Old - Hot and cold data kept together
• 5 cache lines per time series
• Took anywhere between 62-70 ms

• New - Hot and cold data kept separate
• 3 cache lines of hot data per time series
• Took anywhere between 40-45 ms

• 35% improvement
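A hedged sketch of the hot/cold split follows. The field names (`runningSum`, `lastTimestamp`, `name`, `createdAt`) and the class `HotColdSplit` are invented for illustration and are not the library's real fields. The point is only that the periodic hot loop touches arrays of hot fields, so cold fields never occupy cache lines during that loop.

```java
final class HotColdSplit {
    // Hot fields: read and written by the periodic loop for every time series.
    final double[] runningSum;
    final long[] lastTimestamp;

    // Cold fields: only touched on rare paths such as creation or reporting.
    final String[] name;
    final long[] createdAt;

    HotColdSplit(int seriesCount) {
        runningSum = new double[seriesCount];
        lastTimestamp = new long[seriesCount];
        name = new String[seriesCount];
        createdAt = new long[seriesCount];
    }

    /**
     * Hot loop: folds one new value per series into its running sum and
     * returns the total across all series. Only the hot arrays are touched.
     */
    double tick(double[] values, long now) {
        double total = 0;
        for (int i = 0; i < runningSum.length; i++) {
            runningSum[i] += values[i];
            lastTimestamp[i] = now;
            total += runningSum[i];
        }
        return total;
    }
}
```

Whether a field counts as hot or cold is an access-pattern question, so it should be decided from measurements of the real loop rather than from how the fields are grouped conceptually.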
Results (library)!
Old vs New
• Concurrent -> single threaded
• Locks gone
• Array based data structures
• Zero allocations
• Extensive batching and hardware prefetching
• Multiple hash maps -> a single hash map lookup
Old vs New
76 K/sec vs 2.1 M/sec
27x
35 %
Results (application)!
[Chart: Amdahl's law for a component that is 35% of the profile; overall speedup (axis 0 to 1.6) plotted against library speedup (axis 1 to ∞).]
[Chart: CPU, 3.4x.]
35% of the profile but 3.4x improvement?
• Amdahl’s law
• Max 1.54x improvement if 35% => 0%

• Why 3.4x ?
• When you use less cache, you leave more for
others - thus speeding up other code too

• Lesson
• A profiler is a necessary tool, but not a substitute for
informed design
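The 1.54x ceiling quoted above follows directly from Amdahl's law. The symbols p (fraction of the profile spent in the library) and s (speedup of the library alone) are notation introduced here, not taken from the slides; the 27x figure is the library speedup reported earlier in the talk.

```latex
S_{\text{overall}}(s) = \frac{1}{(1 - p) + \dfrac{p}{s}}, \qquad p = 0.35

\lim_{s \to \infty} S_{\text{overall}}(s) = \frac{1}{1 - 0.35} = \frac{1}{0.65} \approx 1.54

S_{\text{overall}}(27) = \frac{1}{0.65 + \dfrac{0.35}{27}} \approx \frac{1}{0.663} \approx 1.51
```

A measured 3.4x therefore cannot come from the library's own 35% alone, which is consistent with the explanation that the rest of the application also ran faster.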
[Chart: heap growth.]
Closing remarks / rant
• “Write code first, profile later” = BAD

• Excessive encapsulation leads to
myopic decisions being made re: perf
• allocations
• “thread safe” code

• Beware of micro benchmarks
Thank You!
Rajiv Kurian
rajiv@signalfx.com
@rzidane360
WE’RE HIRING
jobs@signalfx.com
@SignalFx - signalfx.com
Bonus slides
Composition in C
[Diagram: Struct A and Struct B. Struct B contains Struct B1 and Struct B2 as embedded structs, so the ints of B1 and B2 sit inside B itself.]
Composition in Java
[Diagram: Object A and Object B. Object B holds two fields, B1 and B2, which refer to Object B1 and Object B2, each containing two ints.]
Actual layout?
[Diagram sequence: B (header), B1*, B2*; then, elsewhere, B1 (header), int, int; then, elsewhere, B2 (header), int, int.]
Potential layout after GC
[Diagram: B (header), B1*, B2*; other data; B1 (header), int, int; other data; B2 (header), int, int.]
Techniques inspired by HPC that have
improved our pipeline
Separate the control and data planes
A networking concept
[Diagram: packets in and packets out pass through a routing table (key/value) on the frequent path; routing data updates the table on the infrequent path.]
What the control and data planes do
In networking terminology:
• Data plane - Defines the part that
decides what to do with packets
arriving on an inbound interface—
Frequent
• Control plane - Defines the part that is
concerned with drawing the network
map or routing table—Infrequent
The goal of control and data plane separation
DO NOT slow the frequent path because
of the infrequent path
Runtime configuration variables
[Diagram: a setter thread writes the configuration variables (volatile/atomic), Flag 0 to Flag 3, and the worker thread reads them.]
Worker thread:
while (1) {
    /* re-reads the volatile/atomic configuration variables while processing */
    process_data_using_configuration_variables();
}
Runtime configuration variables
[Diagram: a setter thread writes the configuration variables (volatile/atomic), Flag 0 to Flag 3; the worker thread copies them into cached configuration variables.]
Worker thread:
while (1) {
    /* refresh the local copy once per loop iteration */
    cache_configuration_variables();
    process_a_ton_of_stuff();
}
Volatile/atomic flag vs cached local flag
• Before: all run time flags (used on every data point) are volatile/atomic loads
• After: all run time flags are cached and refreshed on each run loop
• About 8% improvement in datapoints/second. Others might see more or less
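The cached-flag pattern might look like the following in Java. The flags `dropNegative` and `scale` and the class `CachedFlags` are made up for illustration; the talk does not name its configuration variables. The setter thread writes volatile fields, while the worker copies them into locals once per batch so the inner loop performs no volatile loads.

```java
final class CachedFlags {
    // Control plane: written rarely by a setter thread.
    private volatile boolean dropNegative;
    private volatile int scale = 1;

    void setDropNegative(boolean value) { dropNegative = value; }
    void setScale(int value) { scale = value; }

    /** Data plane: processes one batch using a snapshot of the flags. */
    long processBatch(long[] batch) {
        // One volatile load per flag per batch, instead of one per data point.
        final boolean drop = dropNegative;
        final int s = scale;

        long sum = 0;
        for (long v : batch) {
            if (drop && v < 0) continue;
            sum += v * s;
        }
        return sum;
    }
}
```

The trade-off is latency of configuration changes: a new flag value takes effect at the next batch rather than at the next data point, which is usually acceptable for settings that change infrequently.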

 
Intro to Apache Apex - Next Gen Platform for Ingest and Transform
Intro to Apache Apex - Next Gen Platform for Ingest and TransformIntro to Apache Apex - Next Gen Platform for Ingest and Transform
Intro to Apache Apex - Next Gen Platform for Ingest and Transform
 
FiloDB: Reactive, Real-Time, In-Memory Time Series at Scale
FiloDB: Reactive, Real-Time, In-Memory Time Series at ScaleFiloDB: Reactive, Real-Time, In-Memory Time Series at Scale
FiloDB: Reactive, Real-Time, In-Memory Time Series at Scale
 
Taking Splunk to the Next Level - Architecture Breakout Session
Taking Splunk to the Next Level - Architecture Breakout SessionTaking Splunk to the Next Level - Architecture Breakout Session
Taking Splunk to the Next Level - Architecture Breakout Session
 



Scaling ingest pipelines with high performance computing principles - Rajiv Kurian

  • 2. SignalFx Scaling ingest pipelines with high performance computing principles Rajiv Kurian, Software Engineer rajiv@signalfx.com
  • 3. Agenda 1. Why we need to scale ingest 2. Basic properties and limitations of modern hardware 3. Optimization techniques inspired by HPC 4. Results! 5. Q&A (hopefully!)
  • 4. SignalFx Why we need to scale ingest
  • 5. • High resolution: • Up to 1 sec • Streaming analytics: • Charts/analytics update @1sec • Real time • Multidimensional metrics: • Dimensions : representing customer, server etc • Filter, aggregate : 99th-pct-latency-by-service,customer SignalFx is an advanced monitoring platform for modern applications
  • 6. Ingest pipeline ROLLUPS PERSIST REST/RATE CONTROL Raw time series data Processed data to analytics
  • 7. SignalFx ingest library Raw data in Rollup data out TimeSeries 0 rollup TimeSeries 1 rollup TimeSeries 2 rollup TimeSeries 3 rollup TimeSeries 4 rollup TimeSeries 5 rollup TimeSeries 6 rollup TimeSeries 7 rollup TimeSeries 8 rollup
  • 8. Issues identified (before applying HPC techniques) • Expensive - too many servers
 • Exhibits parallel slow down • More threads = worse performance
 • What did the profile say? • Death by a thousand cuts • The core library = 35% of profile
  • 9. SignalFx Basic properties and limitations of modern hardware
  • 11. Cache Lines • Data is transferred between memory and cache in blocks of fixed size, called cache lines. Usually 64 bytes • When the processor needs to read or write a location in main memory, it first checks for a corresponding entry in the cache. In the case of: • a cache hit, the processor immediately reads or writes the data in the cache line • a cache miss, the cache allocates a new entry and copies in data from main memory, then the request (read or write) is fulfilled from the contents of the cache • The memory subsystem makes two kinds of bets to help us: • Temporal locality • Spatial locality
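The spatial-locality bet can be illustrated with a small sketch. The class name, array size and stride below are assumptions chosen for illustration, not anything from the talk. Both methods add up the same elements; the strided version jumps 16 ints (64 bytes, the usual cache line size) per step, so it uses only one int from each cache line it touches before moving on.

```java
import java.util.Arrays;

// Illustrative sketch only: names and sizes are assumptions, not from the talk.
final class LocalityDemo {
    // Walks the array in memory order, so consecutive ints share a cache line.
    static long sumSequential(int[] data) {
        long sum = 0;
        for (int i = 0; i < data.length; i++) {
            sum += data[i];
        }
        return sum;
    }

    // Visits every element exactly once, but jumps `stride` ints per step.
    // With stride >= 16 (64 bytes / 4-byte int) each access lands on a different line.
    static long sumStrided(int[] data, int stride) {
        long sum = 0;
        for (int start = 0; start < stride; start++) {
            for (int i = start; i < data.length; i += stride) {
                sum += data[i];
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        int[] data = new int[1 << 24]; // 64 MB of ints, larger than typical caches
        Arrays.fill(data, 1);
        long t0 = System.nanoTime();
        long a = sumSequential(data);
        long t1 = System.nanoTime();
        long b = sumStrided(data, 16);
        long t2 = System.nanoTime();
        System.out.printf("sequential: %d in %d ms%n", a, (t1 - t0) / 1_000_000);
        System.out.printf("strided:    %d in %d ms%n", b, (t2 - t1) / 1_000_000);
    }
}
```

Timings depend on the machine, the hardware prefetcher and JIT warm-up, so treat the printed numbers as a rough indication rather than a benchmark.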
  • 12. Reference latency numbers for comparison. By Jeff Dean: http://research.google.com/people/jeff/
      Operation                             Latency (ns)   Latency (ms)   Notes
      L1 Cache                                       0.5
      Branch mispredict                                5
      L2 Cache                                         7                  14x L1 Cache
      Mutex lock/unlock                               25
      Main memory                                    100                  20x L2 Cache, 200x L1 Cache
      Compress 1K bytes (Zippy)                    3,000
      Send 1K bytes over 1Gbps                    10,000           0.01
      Read 4K randomly from SSD                  150,000           0.15
      Read 1MB sequentially from memory          250,000           0.25
      Round trip within same DC                  500,000           0.5
      Read 1MB sequentially from SSD           1,000,000           1      4x memory
      Disk seek                               10,000,000          10      20x DC roundtrip
      Read 1MB sequentially from disk         20,000,000          20      80x memory, 20x SSD
      Send packet CA->Netherlands->CA        150,000,000         150
  • 16. Our optimization goal Convert a memory bandwidth bound application to a CPU bound application
  • 17. Things we kept in mind • Measure, measure, measure!
 • Don’t rely on micro benchmarks alone
  • 19. SignalFx library benchmark Rollup data out Key Value ID 0 TimeSeries rollup 0 ID 1 TimeSeries rollup 1 ID 2 TimeSeries rollup 2 ID 3 TimeSeries rollup 3 ID 4 TimeSeries rollup 4 ….. ….. ….. ….. Key 1M TimeSeries rollup 1M Raw data in, in random order, one per Time Series. 50x
  • 20. SignalFx library benchmark Rollup data out Key Value ID 0 TimeSeries rollup 0 ID 1 TimeSeries rollup 1 ID 2 TimeSeries rollup 2 ID 3 TimeSeries rollup 3 ID 4 TimeSeries rollup 4 ….. ….. ….. ….. Key 1M TimeSeries rollup 1M Raw data in, in random order, one per Time Series. 50x 35% of the profile of the entire application
  • 21. SignalFx Techniques inspired by HPC that have improved our pipeline Single threaded, event based architectures: parallelize by running multiple copies of single threaded code
  • 22. Single threaded event based architectures • Threads work on their own private data (as much as possible)
 • Communicate with other threads using events/messages
  • 23. SignalFx local data Network In thread Processor thread(s) Network out thread Receive data Process data Write batched data Events Events Key Value key 1 value 1 key 2 value 2 key 3 value 3 key 4 value 4
  • 24. SignalFx Network In thread Processor thread(s) Network out thread Receive data Process data Write batched data local data Ring Buffer Ring Buffer Key Value key 1 value 1 key 2 value 2 key 3 value 3 key 4 value 4
  • 25. Single threaded event based architectures advantages • It enables many other optimal choices like • Compact array based data structures • Buffer/object re-use
 • Loosely coupled - easy to test
 • Run multiple copies for parallelism
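As a rough sketch of how two such threads might hand events to each other, here is a minimal single-producer/single-consumer ring buffer. The class and method names are hypothetical and this is not the SignalFx implementation; it assumes exactly one producer thread and one consumer thread, and it carries plain `long` events to stay allocation-free.

```java
import java.util.concurrent.atomic.AtomicLong;

// Minimal single-producer/single-consumer ring buffer of long events.
// Illustrative only: names are assumptions, not SignalFx code.
final class SpscRingBuffer {
    private final long[] slots;
    private final int mask;
    private final AtomicLong head = new AtomicLong(); // next slot to read, advanced by the consumer
    private final AtomicLong tail = new AtomicLong(); // next slot to write, advanced by the producer

    SpscRingBuffer(int capacityPowerOfTwo) {
        if (Integer.bitCount(capacityPowerOfTwo) != 1) {
            throw new IllegalArgumentException("capacity must be a power of two");
        }
        slots = new long[capacityPowerOfTwo];
        mask = capacityPowerOfTwo - 1;
    }

    // Called only by the producer thread. Returns false when the buffer is full.
    boolean offer(long event) {
        long t = tail.get();
        if (t - head.get() == slots.length) {
            return false;
        }
        slots[(int) (t & mask)] = event;
        tail.set(t + 1); // volatile write publishes the slot to the consumer
        return true;
    }

    // Called only by the consumer thread. Copies up to `max` events into `out`
    // and returns how many were copied, so the consumer can work on a batch.
    int drain(long[] out, int max) {
        long h = head.get();
        long available = tail.get() - h;
        int n = (int) Math.min(Math.min(available, max), out.length);
        for (int i = 0; i < n; i++) {
            out[i] = slots[(int) ((h + i) & mask)];
        }
        head.set(h + n); // volatile write hands the slots back to the producer
        return n;
    }
}
```

Draining in batches keeps cross-thread communication to two counter updates per batch rather than a lock per event, and the buffer itself is a reused, array-based structure.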
  • 26. SignalFx Ring Buffer Network In thread Worker thread(s) Network out thread Receive data Process data Write batched data local data 1 2 3 4 Ring Buffer Ring Buffer Key Value key 1 value 1 key 2 value 2 key 3 value 3 key 4 value 4
  • 27. SignalFx Worker thread Receive data using Async IO Process data synchronously Write data using Async IO Receive data using Async IO Process data synchronously Write data using Async IO Receive data using Async IO Process data synchronously Write data using Async IO Worker thread Worker thread local data local data local data Key Value key 5 value 5 key 6 value 6 key 7 value 7 key 8 value 8 Key Value key 1 value 1 key 2 value 2 key 3 value 3 key 4 value 4 Key Value key 9 value 9 key 10 value 10 key 11 value 11 key 12 value 12
  • 28. SignalFx Ring Buffer Network thread Processor thread(s) Async IO thread Receive data Process data Batched IO calls local data 1 2 3 4 Ring Buffer Ring Buffer 5 Ring Buffer 6 7 Key Value key 1 value 1 key 2 value 2 key 3 value 3 key 4 value 4
  • 29. Advice for threaded applications • Threads should ideally reflect the actual parallelism of the system. • Avoid gratuitous oversubscription • Exception: IO threads?
 • DO NOT communicate unless you have to
  • 30. SignalFx Techniques inspired by HPC that have improved our pipeline Use compact, cache-conscious, array based data structures with minimal indirection
  • 32. Basic principles • Strive for smaller data structures • Extra computation is ok • E.g. Compressing network data
 • Design data structures that facilitate processing multiple entries—big arrays!
 • Layout should reflect access patterns
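  • 33. Hash maps • Hash map lookups are NOT free!
 • A lookup in a well-implemented hash map is by definition a cache miss: a good hash function scatters keys across the table, so consecutive lookups land on unrelated cache lines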
 • Popular implementations like java.util.HashMap can cause multiple cache misses
  • 34-40. [Figure, built up over several slides: a typical hash map implemented as an array of lists of key* | value* entries. The animation numbers the cache misses in a typical hash-map implementation, 1 through 4, for a single lookup.]
  • 41. Array of co-located key/value Key Value Key 0 Value 0 Key 1 Value 1 Key 2 Value 2 Key 3 Value 3 Key 4 Value 4 Key 5 Value 5 Key 6 Value 6 Key 7 Value 7
  • 42. Cache misses with no collision Key Value Key 0 Value 0 Key 1 Value 1 Key 2 Value 2 Key 3 Value 3 Key 4 Value 4 Key 5 Value 5 Key 6 Value 6 Key 7 Value 7 1
  • 43. Cache misses with collisions Key Value Key 0 Value 0 Key 1 Value 1 Key 2 Value 2 Key 3 Value 3 Key 4 Value 4 Key 5 Value 5 Key 6 Value 6 Key 7 Value 7 1 2
  • 44. Hash map of key to index to an array of structs Value 0 Value 1 Value 2 Value 3 Value 4 Value 5 Value 6 Value 7 Value 8 Key Index Key 0 1 Key 1 6 Key 2 4 Key 3 8
  • 45. Cache misses with collision Value 0 Value 1 Value 2 Value 3 Value 4 Value 5 Value 6 Value 7 Value 8 Key Index Key 0 1 Key 1 6 Key 2 4 Key 3 8 1
  • 46. Cache misses with collision Value 0 Value 1 Value 2 Value 3 Value 4 Value 5 Value 6 Value 7 Value 8 Key Index Key 0 1 Key 1 6 Key 2 4 Key 3 8 1 2
  • 47. New library memory layout TimeSeries rollup 0 TimeSeries rollup 1 TimeSeries rollup 2 TimeSeries rollup 3 TimeSeries rollup 4 TimeSeries rollup 5 TimeSeries rollup 6 TimeSeries rollup 7 TimeSeries rollup 8 ID Index ID 0 1 ID 1 6 ID 2 4 ID 3 8 Raw data in Rollup out
  • 48. Changing hash map implementations • java.util.HashMap (uses separate chaining and boxes primitives) to make a long -> int lookup • Allocations galore
 • net.openhft.koloboke primitive open hash map • 45% improvement
 For the JVM use libraries like https://github.com/OpenHFT/Koloboke. For C++ try https://github.com/preshing/CompareIntegerMaps or similar.
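To make the open-addressing idea concrete, here is a minimal sketch of a primitive `long -> int` map with linear probing, with each key stored next to its value in one `long[]`. It is a hypothetical class written for illustration and is not the Koloboke API. It assumes `Long.MIN_VALUE` is never used as a key and does not support removal.

```java
// Sketch of an open-addressing (linear probing) long -> int map.
// Hypothetical class, not the Koloboke API. Long.MIN_VALUE is reserved as "empty".
final class LongToIntMap {
    private static final long EMPTY = Long.MIN_VALUE;
    static final int MISSING = -1;

    private long[] table;  // interleaved: [key0, value0, key1, value1, ...]
    private int capacity;  // number of slots, always a power of two
    private int size;

    LongToIntMap(int expectedSize) {
        capacity = 8;
        while (capacity < expectedSize * 2) capacity <<= 1;
        table = newTable(capacity);
    }

    private static long[] newTable(int capacity) {
        long[] t = new long[capacity * 2];
        for (int i = 0; i < t.length; i += 2) t[i] = EMPTY;
        return t;
    }

    private int slot(long key) {
        long h = key * 0x9E3779B97F4A7C15L; // multiplicative hash spreads sequential IDs
        return (int) (h >>> 32) & (capacity - 1);
    }

    void put(long key, int value) {
        if (key == EMPTY) throw new IllegalArgumentException("reserved key");
        if ((size + 1) * 2 > capacity) grow(); // keep the load factor at or below 0.5
        int i = slot(key);
        while (table[2 * i] != EMPTY && table[2 * i] != key) {
            i = (i + 1) & (capacity - 1); // the next probe is adjacent in memory
        }
        if (table[2 * i] == EMPTY) {
            table[2 * i] = key;
            size++;
        }
        table[2 * i + 1] = value;
    }

    int get(long key) {
        int i = slot(key);
        while (table[2 * i] != EMPTY) {
            if (table[2 * i] == key) return (int) table[2 * i + 1];
            i = (i + 1) & (capacity - 1);
        }
        return MISSING;
    }

    private void grow() {
        long[] old = table;
        capacity *= 2;
        table = newTable(capacity);
        size = 0;
        for (int i = 0; i < old.length; i += 2) {
            if (old[i] != EMPTY) put(old[i], (int) old[i + 1]);
        }
    }

    public static void main(String[] args) {
        LongToIntMap idToIndex = new LongToIntMap(4);
        long[] rollupSum = new long[4]; // flat array standing in for per-time-series rollups
        idToIndex.put(1001L, 0);
        idToIndex.put(1002L, 1);
        rollupSum[idToIndex.get(1002L)] += 42; // one map probe, then one array access
        System.out.println(rollupSum[1]);
    }
}
```

Unlike `java.util.HashMap<Long, Integer>`, nothing here is boxed and no entry objects are allocated per key, so a lookup touches the table and then the flat array it indexes into.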
  • 49. Access patterns Hot data Cold data object 0 object 1 object 2 object 3 Field 0 Field 1 Field 2 Field 3 Field 4 Field 0 Field 1 Field 2 Field 3 Field 4 Field 0 Field 1 Field 2 Field 3 Field 4 Field 0 Field 1 Field 2 Field 3 Field 4
  • 50. Group fields accessed together Hot fields Cold fields object 1 Field 0 Field 1 Field 2 Field 0 Field 1 Field 2 Field 0 Field 1 Field 2 Field 0 Field 1 Field 2 Field 3 Field 4 Field 3 Field 4 Field 3 Field 4 Field 3 Field 4
  • 51. Results of separating hot and cold data A hot loop run about once every 500 ms • Old - Hot and cold data kept together • 5 cache lines per time series • Took anywhere between 62-70 ms
 • New - Hot and cold data kept separate • 3 cache lines of hot data per time series • Took anywhere between 40-45 ms
 • 35% improvement
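One way to express the hot/cold split on the JVM is to keep each group of fields in its own primitive array, indexed by the time series slot. The sketch below is an assumption-laden illustration: the field names (`sum`, `count`, `createdAtMillis`, `debugName`) are made up and are not the fields of the SignalFx library.

```java
// Sketch of hot/cold field splitting with parallel arrays.
// Field names are hypothetical, not taken from the SignalFx library.
final class TimeSeriesStore {
    // Hot fields: touched for every data point, packed contiguously.
    private final double[] sum;
    private final long[] count;
    // Cold fields: rarely read, kept apart so they do not occupy hot cache lines.
    private final long[] createdAtMillis;
    private final String[] debugName;

    TimeSeriesStore(int capacity) {
        sum = new double[capacity];
        count = new long[capacity];
        createdAtMillis = new long[capacity];
        debugName = new String[capacity];
    }

    // Infrequent path: fills in the cold fields once per time series.
    void register(int index, String name, long nowMillis) {
        debugName[index] = name;
        createdAtMillis[index] = nowMillis;
    }

    // Frequent path: reads and writes only the hot arrays.
    void record(int index, double value) {
        sum[index] += value;
        count[index]++;
    }

    // Hot loop over all series: streams through the hot arrays sequentially.
    double totalOfMeans() {
        double total = 0;
        for (int i = 0; i < sum.length; i++) {
            if (count[i] > 0) {
                total += sum[i] / count[i];
            }
        }
        return total;
    }

    String describe(int index) {
        return debugName[index] + " (created " + createdAtMillis[index] + ")";
    }
}
```

The hot loop never loads `debugName` or `createdAtMillis`, so fewer cache lines are pulled in per time series.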
  • 53. Old vs New • Concurrent -> single threaded • Locks gone • Array based data structures • Zero allocations • Extensive batching and hardware prefetching
 
 • Multiple hash maps -> a single hash map look up
  • 55. Old vs New 76 K/sec VS 2.1 M/sec
  • 59. Amdahl’s law - 35% [Chart: overall speedup (y-axis, 0 to 1.6) against library speedup (x-axis, 1 to ∞)]
  • 60. CPU
  • 62. 35% of the profile but 3.4x improvement? • Amdahl’s law • Max 1.54x improvement if 35% => 0%
 • Why 3.4x ? • When you use less cache, you leave more for others - thus speeding up other code too
 • Lesson • A profiler is a necessary tool, but not a substitute for informed design
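The 1.54x ceiling follows directly from Amdahl's law: overall speedup = 1 / ((1 - p) + p / s), where p is the fraction of run time being optimized and s is how much faster that part becomes. The short sketch below evaluates it for p = 0.35; the class and method names are made up for illustration.

```java
// Amdahl's law: overall speedup when a fraction of the run time is sped up.
// Class and method names are illustrative.
final class Amdahl {
    // fraction: share of total run time spent in the optimized part (0..1)
    // partSpeedup: how many times faster that part becomes
    static double overallSpeedup(double fraction, double partSpeedup) {
        return 1.0 / ((1.0 - fraction) + fraction / partSpeedup);
    }

    public static void main(String[] args) {
        double[] librarySpeedups = {1, 4, 8, 28, Double.POSITIVE_INFINITY};
        for (double s : librarySpeedups) {
            System.out.printf("library speedup %.0f -> overall %.2fx%n",
                    s, overallSpeedup(0.35, s));
        }
    }
}
```

With an infinite library speedup the formula reduces to 1 / 0.65, roughly 1.54. The formula assumes the other 65% of the run time is unaffected, which is exactly the assumption the measured 3.4x result violates.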
  • 64. Closing remarks / rant • “Write code first, profile later” = BAD
 • Excessive encapsulation leads to myopic decisions being made re: perf • allocations • “thread safe” code
 • Beware of micro benchmarks
  • 65. SignalFx Thank You! Rajiv Kurian rajiv@signalfx.com @rzidane360 WE’RE HIRING jobs@signalfx.com @SignalFx - signalfx.com
  • 67. Composition in C Struct A Struct B Struct B1 embedded Struct B2 embedded int int int int int int int int
  • 68. int int int int Composition in Java Object A Object B Object B1 Object B2 int int int int B1 B2
  • 71-76. Actual layout? [Figure, built up over several slides: Object B is a header followed by two references, B1* and B2*. Object B1 and Object B2 are separate objects, each a header followed by two ints.]
  • 77. Potential layout after GC [Figure: B (header), B1*, B2* | other data | B1 (header), int, int | other data | B2 (header), int, int]
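Since Java offers no embedded structs, one common workaround is to flatten records into a single primitive array and address fields by offset. The sketch below is a hypothetical illustration of that idea; the record layout (four ints per record, named after the B1/B2 example) is an assumption, not code from the talk.

```java
// Sketch: emulating C-style embedded structs on the JVM with one int[].
// The layout (4 ints per record: b1.x, b1.y, b2.x, b2.y) is a made-up example.
final class FlatRecords {
    static final int STRIDE = 4; // ints per record
    static final int B1_X = 0;
    static final int B1_Y = 1;
    static final int B2_X = 2;
    static final int B2_Y = 3;

    private final int[] data; // all records back to back, no headers or pointers

    FlatRecords(int count) {
        data = new int[count * STRIDE];
    }

    int get(int record, int field) {
        return data[record * STRIDE + field];
    }

    void set(int record, int field, int value) {
        data[record * STRIDE + field] = value;
    }

    // Sums one field across all records with a single sequential pass.
    long sumField(int field) {
        long sum = 0;
        for (int i = field; i < data.length; i += STRIDE) {
            sum += data[i];
        }
        return sum;
    }
}
```

All four ints of a record sit next to each other and the garbage collector cannot move them apart, at the cost of giving up ordinary object syntax for field access.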
  • 78. SignalFx Techniques inspired by HPC that have improved our pipeline Separate the control and data planes
  • 79. Frequent Infrequent A networking concept Routing table Packets in Packets out Routing data Key Value
  • 80. What the control and data planes do. In networking terminology: • Data plane: the part that decides what to do with packets arriving on an inbound interface (frequent) • Control plane: the part that draws the network map or routing table (infrequent)
  • 81. The goal of control and data plane separation DO NOT slow the frequent path because of the infrequent path
  • 82. Runtime configuration variables [Figure: a setter thread writes the configuration variables (volatile/atomic: Flag 0, Flag 1, Flag 2, Flag 3); the worker thread reads them while processing] while (1) { process_data_using_configuration_variables(); }
  • 83. Runtime configuration variables [Figure: a setter thread writes the configuration variables (volatile/atomic: Flag 0, Flag 1, Flag 2, Flag 3); the worker thread copies them into cached configuration variables once per loop] while (1) { cache_configuration_variables(); process_a_ton_of_stuff(); }
  • 84. Volatile/atomic flag vs cached local flag • Old: all run time flags (used on every data point) are volatile/atomic loads • New: all run time flags are cached and refreshed on each run loop • About 8% improvement in datapoints/second. Others might see more or less
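The two variants can be sketched in Java as follows. The flag (`dropNegatives`) and the processing step (summing a batch) are invented for illustration and stand in for whatever the real worker does. The only point is where the volatile load happens.

```java
// Sketch of reading a volatile flag per data point vs once per run loop.
// The flag and the "processing" are hypothetical stand-ins.
final class Worker {
    // Control plane: written rarely by a setter thread.
    private volatile boolean dropNegatives = false;

    void setDropNegatives(boolean value) {
        dropNegatives = value;
    }

    // Old style: one volatile load for every data point.
    long sumReadingFlagEveryTime(long[] batch) {
        long sum = 0;
        for (long v : batch) {
            if (dropNegatives && v < 0) continue; // volatile load on each iteration
            sum += v;
        }
        return sum;
    }

    // New style: load the flag once per run loop, then use the local copy.
    long sumWithCachedFlag(long[] batch) {
        final boolean drop = dropNegatives; // single volatile load per batch
        long sum = 0;
        for (long v : batch) {
            if (drop && v < 0) continue;
            sum += v;
        }
        return sum;
    }
}
```

The trade-off is that a flag change only takes effect at the start of the next batch, which is usually acceptable for configuration that changes rarely.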