© Hortonworks Inc. 2011 – 2016. All Rights Reserved
Next Generation Execution Engine for Apache Storm
Roshan Naik, Hortonworks
DataWorks Summit
Sept 20th 2017, San Jose

Present: Storm 1.x
→ Has matured into a stable and reliable system
→ Widely deployed and holding up well in production
→ Scales well horizontally
→ Lots of new competition
  – Differentiating on features, performance, ease of use, etc.
Storm 2.x
→ High performance execution engine
→ All Java code (transitioning away from Clojure)
→ Improved backpressure and metrics subsystems
→ Lots more...
  – Streams API, UI improvements, RAS scheduler improvements, …

Execution Engine - Planned Enhancements for
→ Umbrella Jira: STORM-2284
  – https://issues.apache.org/jira/browse/STORM-2284

Performance

Use Cases - Latency centric
→ 100 ms+: Factory automation
→ 10 ms - 100 ms: Real-time gaming, scoring shopping carts to print coupons
→ 0 - 10 ms: Network threat detection
→ Java-based high frequency trading systems
  – fast: under 100 microseconds 90% of the time, no GC during trading hours
  – medium: under 1 ms 95% of the time, and rare minor GC
  – slow: under 10 ms 99% or 99.9% of the time, minor GC every few minutes
  – Cost of being slow
    • Better to turn it off than lose money by leaving it running

Performance in 2.0
→ How do we know if a streaming system is “fast”?
  – Faster than another system?
  – What about hardware potential?
    • More on this later
→ Dimensions
  – Throughput
  – Latency
  – Resource utilization: CPU/Network/Memory/Disk/Power

Areas critical to Performance
→ Messaging System
  – Need bounded concurrent queues that operate as fast as hardware allows
  – Lock-based queues are not an option
  – Lock-free queues or, preferably, wait-free queues
→ Threading & Execution Model
  – Avoid unnecessary threads. Less synchronization.
  – Dedicated threads for spouts and bolts instead of pooled threads.
  – CPU pinning.
  – Reduce inter-thread, inter-process and inter-host communication
→ Memory Model
  – Lowering GC pressure: recycling objects in the critical path.
  – Reducing CPU cache faults: control object layout (contiguous allocation), avoid false sharing
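To make the "bounded, lock-free or wait-free queue" requirement concrete, here is a minimal sketch of a single-producer/single-consumer ring buffer that publishes with `lazySet()`. The class name `SpscRingBuffer` and its methods are hypothetical and chosen for illustration; this is not Storm's or JCTools' implementation, and it assumes exactly one producer thread and one consumer thread.

```java
import java.util.concurrent.atomic.AtomicLong;

/**
 * Minimal bounded single-producer/single-consumer ring buffer.
 * Hypothetical illustration only; not the queue used by Storm or JCTools.
 */
final class SpscRingBuffer<T> {
    private final Object[] slots;
    private final int mask;
    // Next position to read; written only by the consumer thread.
    private final AtomicLong head = new AtomicLong();
    // Next position to write; written only by the producer thread.
    private final AtomicLong tail = new AtomicLong();

    SpscRingBuffer(int capacity) {
        if (capacity <= 0 || Integer.bitCount(capacity) != 1) {
            throw new IllegalArgumentException("capacity must be a positive power of two");
        }
        slots = new Object[capacity];
        mask = capacity - 1;
    }

    /** Producer side: returns false instead of blocking when the queue is full. */
    boolean offer(T item) {
        long t = tail.get();
        if (t - head.get() == slots.length) {
            return false; // bounded: the caller decides how to back off
        }
        slots[(int) (t & mask)] = item;
        tail.lazySet(t + 1); // ordered store: publishes the slot without a full fence
        return true;
    }

    /** Consumer side: returns null when the queue is empty. */
    @SuppressWarnings("unchecked")
    T poll() {
        long h = head.get();
        if (h == tail.get()) {
            return null;
        }
        int idx = (int) (h & mask);
        T item = (T) slots[idx];
        slots[idx] = null;   // drop the reference so the element can be collected
        head.lazySet(h + 1); // hands the slot back to the producer
        return item;
    }
}
```

Neither side takes a lock or retries in a loop, so each call finishes in a bounded number of steps. Supporting several producers would need a different design, for example a CAS on the tail index.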

Messaging Subsystem
(STORM-2306)

Understanding “Fast”

Component                 Configuration      Throughput (million/sec)
AKKA                      90-100 threads     50
Flink                     per core           1.5
Apex 3.0                  container local    4.3
                          v3.0
Gear Pump                 4 nodes            18
InfoSphere Streams v3.0

Huge gap!

Notes                   Component            Configuration      Throughput (million/sec)
Not thread safe         ArrayDeque           1 thread rd+wr     1063
Lock based              ArrayBlockingQueue   1 thread rd+wr     30
                                             1 prod, 1 cons     4
SleepingWaitStrategy    Disruptor 3.3.x      1P, 1C             25
(ProducerMode=MULTI)
lazySet()               FastQ                1P, 1C             31
                        JCTools MPSC         1P, 1C             74
                                             2P                 59
                                             3P                 43
                                             4P                 40
                                             6P                 56
                                             8P                 65
                                             10P                66
                                             15P                68
                                             20P                68
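The "1 thread rd+wr" rows can be approximated with a loop like the one below. This is a rough sketch under stated assumptions: the class `QueueThroughputSketch` and the method `measure` are hypothetical, the slides do not describe the actual benchmark method, and a naive `System.nanoTime()` loop is sensitive to JIT warm-up, so a harness such as JMH would be the better tool for real measurements.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.ArrayBlockingQueue;

/** Naive single-thread offer+poll throughput probe (hypothetical illustration). */
final class QueueThroughputSketch {

    /** Offers and polls one element per iteration; returns million iterations per second. */
    static double measure(Queue<Integer> queue, int ops) {
        Integer item = 1;
        long checksum = 0;
        long start = System.nanoTime();
        for (int i = 0; i < ops; i++) {
            queue.offer(item);
            checksum += queue.poll(); // consume the result so the loop is not optimized away
        }
        long elapsedNanos = Math.max(1, System.nanoTime() - start);
        if (checksum != ops) {
            throw new IllegalStateException("lost elements");
        }
        return ops * 1_000.0 / elapsedNanos; // ops per ns * 1000 = million ops per second
    }

    public static void main(String[] args) {
        int ops = 20_000_000;
        // First pass warms up the JIT; only the second pass is reported.
        measure(new ArrayDeque<>(), ops);
        measure(new ArrayBlockingQueue<>(1024), ops);
        System.out.printf("ArrayDeque:         %.1f million/sec%n",
                measure(new ArrayDeque<>(), ops));
        System.out.printf("ArrayBlockingQueue: %.1f million/sec%n",
                measure(new ArrayBlockingQueue<>(1024), ops));
    }
}
```

Absolute numbers depend on the machine and JVM, so they are not expected to match the table; the point of interest is the gap between the unsynchronized and the lock-based queue.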

Messaging - Current Architecture
[Diagram: Worker Process, high-level view. Components shown: Worker Recv Thread (Recv Q, Network), Bolt/Spout Executor (Recv Q, Bolt Executor Thread running user logic, Send Q, Send Thread), and Worker Send Thread (Send Q, Network).]

Bolt/Spout Executor - Detailed
[Diagram: RECEIVE Q and SEND Q, each built from a BATCHER (ArrayList: current batch; CLQ: overflow; 1 batcher per publisher), a Disruptor Q and a Flusher Thread. The Bolt Executor Thread (user logic) publishes into the SEND Q. The Send Thread groups messages into ArrayLists per DestID and routes them to the local executor's RECEIVE Q (local) or to the worker's Outbound Q (remote).]

New Architecture
[Diagram: same components as the detailed Bolt/Spout Executor diagram: RECEIVE Q and SEND Q with BATCHER (ArrayList: current batch; CLQ: overflow; 1 batcher per publisher), Disruptor Q and Flusher Thread; Bolt Executor Thread (user logic) publishing; Send Thread routing per-DestID ArrayLists of messages to the local executor's RECEIVE Q (local) or the worker's Outbound Q (remote).]

Messaging - New Architecture
(STORM-2306)
[Diagram: RECEIVE Q built from a BATCHER (ArrayList: current batch) in front of a JCTools Q. The Bolt Executor Thread (user logic) publishes either to a local executor's RECEIVE Q (local) or to the worker's Outbound Q (remote).]
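The batcher-in-front-of-a-queue idea can be sketched as follows. This is an illustration under assumptions, not Storm's code: the class `BatchingPublisher` is hypothetical, and a `java.util.concurrent.ArrayBlockingQueue` stands in for the JCTools queue so that the example needs only the JDK.

```java
import java.util.ArrayList;
import java.util.Queue;
import java.util.concurrent.ArrayBlockingQueue;

/**
 * Per-publisher batcher that accumulates messages in an ArrayList and
 * flushes them into a bounded receive queue (hypothetical illustration).
 */
final class BatchingPublisher<T> {
    private final Queue<T> receiveQueue;
    private final ArrayList<T> currentBatch;
    private final int batchSize;

    BatchingPublisher(Queue<T> receiveQueue, int batchSize) {
        this.receiveQueue = receiveQueue;
        this.batchSize = batchSize;
        this.currentBatch = new ArrayList<>(batchSize);
    }

    /** Returns false when both the batch and the queue are full (backpressure). */
    boolean publish(T msg) {
        if (currentBatch.size() == batchSize) {
            flush();
            if (currentBatch.size() == batchSize) {
                return false; // nothing could be delivered; caller must retry later
            }
        }
        currentBatch.add(msg);
        if (currentBatch.size() == batchSize) {
            flush();
        }
        return true;
    }

    /** Moves as many batched messages as the queue accepts, preserving order. */
    int flush() {
        int sent = 0;
        while (sent < currentBatch.size() && receiveQueue.offer(currentBatch.get(sent))) {
            sent++;
        }
        currentBatch.subList(0, sent).clear(); // keep only the undelivered tail
        return sent;
    }

    public static void main(String[] args) {
        Queue<String> recvQ = new ArrayBlockingQueue<>(8); // stand-in for a JCTools queue
        BatchingPublisher<String> publisher = new BatchingPublisher<>(recvQ, 4);
        for (int i = 0; i < 6; i++) {
            publisher.publish("tuple-" + i);
        }
        publisher.flush(); // deliver the partially filled batch
        System.out.println(recvQ);
    }
}
```

Because each publisher owns its batch, the only shared structure is the receive queue itself, and a full queue surfaces to the caller as a `false` return rather than as an unbounded overflow list.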

Preliminary Numbers
LATENCY
→ 1 spout --> 1 bolt with 1 ACKer (all in same worker)
  – v1.0.1: 3.4 milliseconds
  – v2.0 master: ~7 milliseconds
  – v2.0 redesigned: under 100 microseconds (116x improvement)

Preliminary Numbers
THROUGHPUT
→ 1 spout --> 1 bolt [without ACKing]
  – v1.0.1: ?
  – v2.0 master: ~4 million/sec
  – v2.0 redesigned: 7 - 8 million/sec (~2x, but can be much better)
→ 1 spout --> 1 bolt [with ACKing]
  – v1.0: 233 K/sec
  – v2.0 master: 900 K/sec
  – v2.0 redesigned: 1.5 million/sec (again, can be much better)

Observations
→ Latency: Dramatically improved.
→ Throughput: Discovered multiple bottlenecks preventing significantly higher throughput.
  – Grouping: If bottlenecks in LocalShuffle and FieldsGrouping are addressed along with some others, throughput can reach ~7 million/sec.
  – TupleImpl: If inefficiencies here are addressed, throughput can reach ~15 million/sec.
  – ACK-ing: The ACKer bolt currently maxes out at ~2.5 million ACKs/sec. This is a limitation of the implementation, not of the concept. I see room for ACKer-specific fixes that can also substantially improve its throughput.

CPU Pinning
(STORM-2313)

CPU cache access
→ Approximate access costs
  – L1 cache: 1x
  – L2 cache: 2.5x
  – Local L3 cache: 10-20x
  – Remote L3 cache: 25-75x

CPU Affinity
→ For inter-thread communication
  – Cache fault distance matters
  – Faster between cores on the same socket
    • 20% latency hit when threads are pinned to different sockets
→ Pinning threads to CPUs
  – If done right, minimizes cache fault distance
  – Threads that move between cores need their caches refreshed
  – Unrelated threads running on the same core thrash each other's cache
→ Helps performance on NUMA machines
  – Pinning long-running tasks reduces NUMA effects
  – NUMA-aware allocator introduced in Java SE 6u2

CPU Pinning Strategy
→ Pin executors to physical cores.
→ Pin each executor to a separate physical core
  – For high throughput / very low latency topologies
  – Not economical for other topologies
→ Try to fit subsequent executor threads on the same socket
→ Logical cores, i.e. hyperthreading?
  – Avoid hyperthreading: threads sharing a core thrash each other's cache
  – Could provide it as an option in future?
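The placement rule above (one executor per physical core, filling a socket before moving to the next) can be sketched as a small planning function. Everything here is an assumption for illustration: the class `PinningPlanSketch`, the uniform sockets-by-cores topology, and the numbering of cores are hypothetical. The JDK itself has no thread-affinity API, so the actual pinning would need an OS-level mechanism or a native library.

```java
import java.util.ArrayList;
import java.util.List;

/** Computes an executor-to-core placement plan (hypothetical illustration). */
final class PinningPlanSketch {

    /** One executor pinned to one physical core on one socket. */
    static final class Assignment {
        final int executor;
        final int socket;
        final int core;

        Assignment(int executor, int socket, int core) {
            this.executor = executor;
            this.socket = socket;
            this.core = core;
        }

        @Override
        public String toString() {
            return "executor " + executor + " -> socket " + socket + ", core " + core;
        }
    }

    /**
     * Gives every executor its own physical core and fills one socket
     * completely before using the next, so that consecutive executors
     * stay on the same socket where possible.
     */
    static List<Assignment> plan(int executors, int sockets, int physicalCoresPerSocket) {
        if (executors > sockets * physicalCoresPerSocket) {
            throw new IllegalArgumentException("not enough physical cores for one core per executor");
        }
        List<Assignment> result = new ArrayList<>(executors);
        for (int e = 0; e < executors; e++) {
            int socket = e / physicalCoresPerSocket;
            int core = e % physicalCoresPerSocket;
            result.add(new Assignment(e, socket, core));
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical machine: 2 sockets with 4 physical cores each.
        for (Assignment a : plan(5, 2, 4)) {
            System.out.println(a);
        }
    }
}
```

Only physical cores are counted, which matches the advice to avoid placing two executors on hyperthread siblings of the same core.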

Threading & Execution Model
(STORM-2307)

New Threading & Execution Model
WORKER PROCESS
• Start/Stop/Monitor Executors
• Manage Metrics
• Topology Reconfig
• Heartbeat
[Diagram: the worker process contains several Executor threads. Regular executors hold a Q, counters, a grouper and a Task (Spout/Bolt). System tasks run as executors too: inter-host input, intra-host input, and outbound messages (with its own Q and counters).]

Memory Management
Can be decomposed into two key areas
  – Object recycling in the critical path
    • Avoids dynamic allocation cost
    • Minimizes stop-the-world GC pauses
  – Contiguous allocation: arrays, data members
    • CPU likes it.
    • Pre-fetch friendly.
    • Fewer cache faults per object.
    • Natural in C++, very painful in Java.
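Object recycling in the critical path can be as simple as a per-thread free list. The sketch below is a hypothetical illustration (the class `RecyclingPool` is not a Storm API) and assumes the pool is used by a single executor thread, so it needs no synchronization.

```java
import java.util.ArrayDeque;
import java.util.function.Supplier;

/**
 * Single-threaded object pool: reuses instances instead of allocating
 * a new one per message (hypothetical illustration).
 */
final class RecyclingPool<T> {
    private final ArrayDeque<T> free;
    private final Supplier<T> factory;
    private final int maxPooled;

    RecyclingPool(Supplier<T> factory, int maxPooled) {
        this.factory = factory;
        this.maxPooled = maxPooled;
        this.free = new ArrayDeque<>(maxPooled);
        // Allocate up front so the steady state allocates nothing.
        for (int i = 0; i < maxPooled; i++) {
            free.addFirst(factory.get());
        }
    }

    /** Hands out a pooled instance, or a fresh one if the pool is empty. */
    T acquire() {
        T obj = free.pollFirst();
        return obj != null ? obj : factory.get();
    }

    /** Returns an instance to the pool; the caller must reset its state first. */
    void release(T obj) {
        if (free.size() < maxPooled) {
            free.addFirst(obj); // most recently used first: likely still in cache
        }
        // Otherwise drop it and let the GC reclaim the surplus instance.
    }

    public static void main(String[] args) {
        RecyclingPool<StringBuilder> pool = new RecyclingPool<>(StringBuilder::new, 2);
        StringBuilder sb = pool.acquire();
        sb.append("tuple payload");
        sb.setLength(0);  // reset before recycling
        pool.release(sb);
        System.out.println(pool.acquire() == sb); // the same instance is reused
    }
}
```

Reuse keeps short-lived garbage off the hot path, at the cost of having to reset and return every object explicitly.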

Thank You!
Tomorrow: Data Guarantees And Fault Tolerance In Streaming Systems
5:10 pm, Room: C4.5
Questions?
References
https://issues.apache.org/jira/browse/STORM-2284
