Copyright © 2017, Oracle and/or its affiliates. All rights reserved. |
Streaming solutions for real-time problems
Aparna Gaonkar @AparnaMoi
Senior Principal Product Manager
Abhishek Gupta @abhi_tweeter
Senior Product Manager
Aug 10, 2017
Safe Harbor Statement
The following is intended to outline our general product direction. It is intended for
information purposes only, and may not be incorporated into any contract. It is not a
commitment to deliver any material, code, or functionality, and should not be relied upon
in making purchasing decisions. The development, release, and timing of any features or
functionality described for Oracle’s products remains at the sole discretion of Oracle.
The agenda..
1. Problem domain
2. Tech stack deep dive
3. Solution & Demo
4. Q&A
No time like the present..
How quickly are you able to respond to an event in your system?
– A fraud happening in your payment system
– A user viewing an expensive product
– A component being “stressed” due to load
… and how do you do this at scale?
The (traditional) Batch solution
[Diagram: events are aggregated into a data warehouse (DWH) by batch processing, which yields a static view of insights]
The (traditional) Queue based solution
[Diagram: events enter a QUEUE and are picked up by a LISTENER; a DB and an App complete the flow, with polling etc.]
Problems
[Diagram: the queue based solution, annotated with its weak points]
• Queue: a bottleneck – it has finite storage capacity, because it is designed not for data storage but for queuing data for short periods
• Scalability issues
• Bottleneck – a component can get overwhelmed by too many concurrent requests (polling etc.)
Hello Streaming: a (potential) solution
• Streaming – continuously process infinite/unbounded data as it arrives, rather than in a batch at a specific time interval
• Data has both volume and velocity: not just Big Data, but Fast Data
• OLTP < Streaming < Batch
• At their core, streaming solutions deal with these problems
– Storage for events
– Processing for events
– (Optionally) a system for maintaining state
http://www.capturearkansas.com/photos/550197
Use Case: Data center monitoring application
• Collect: metrics from multiple machines
• Crunch: statistics over a time window
• Monitor: using a dashboard
data: {"machine":"machine-1","metrics":["8","20","36","65","2","20","73","67"]}
data: {"machine":"machine-2","metrics":["1","54","42","61","40","35","26","78"]}
. . . .
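As a rough illustration of the Collect and Crunch steps, the sketch below parses lines shaped like the samples above and computes a few statistics per machine. The function names, the handling of the `data:` prefix and the choice of min/max/mean are assumptions made for illustration; they are not taken from the demo application.

```python
import json
from statistics import mean

def parse_event(line):
    """Parse one 'data: {...}' line into (machine, list of int metrics)."""
    payload = line.split("data:", 1)[1].strip()
    event = json.loads(payload)
    # The sample payloads carry metrics as strings, so convert them to ints.
    return event["machine"], [int(m) for m in event["metrics"]]

def summarize(lines):
    """Collect all metrics per machine, then compute min, max and mean."""
    collected = {}
    for line in lines:
        machine, metrics = parse_event(line)
        collected.setdefault(machine, []).extend(metrics)
    return {
        machine: {"min": min(values), "max": max(values), "mean": mean(values)}
        for machine, values in collected.items()
    }

sample = [
    'data: {"machine":"machine-1","metrics":["8","20","36","65","2","20","73","67"]}',
    'data: {"machine":"machine-2","metrics":["1","54","42","61","40","35","26","78"]}',
]
stats = summarize(sample)
```

In a streaming setup the same summary would be recomputed for each time window rather than once over a fixed list.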
A Streaming solution
[Architecture diagram: a simulated producer writes events to Kafka (EVENT STORE, split into partitions); a receiver (PROCESSOR) consumes them and writes through a connector to Redis (STATE STORE: Lists, Sorted Set); a service app polls Redis and pushes updates to the UI (VISUALIZATION) over SSE]
Kafka: the Event Store
Kafka – publish-subscribe messaging….
“…. rethought as a distributed commit log”
• Core - Scala
• Clients – Java (included) and many more
– https://cwiki.apache.org/confluence/display/KAFKA/Clients
• Used by – Netflix, Twitter, Yelp, Uber etc.
– https://kafka.apache.org/powered-by
Topics
• Stores events/records
– Records are simple KV pairs (with timestamp)
– Keys and Values are treated as bytes by Kafka
• Producers write to & consumers read from it
– Multiple consumers possible
• Compacted topics
– Latest value (for a key) retained
– Keys cannot be null
• Divided into partitions
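The retention rule behind compacted topics (latest value per key, no null keys) can be sketched in a few lines. This is only a toy model over an in-memory list of (key, value) records; the function name is hypothetical, and real Kafka compaction runs in the background on the broker rather than as a single pass like this.

```python
def compact(log):
    """Keep only the latest record for each key, preserving offset order."""
    latest_offset = {}
    for offset, (key, _value) in enumerate(log):
        if key is None:
            # Mirrors the slide: a compacted topic cannot hold null keys.
            raise ValueError("compacted topics require non-null keys")
        latest_offset[key] = offset
    return [log[offset] for offset in sorted(latest_offset.values())]

log = [
    ("machine-1", "10"),
    ("machine-2", "7"),
    ("machine-1", "12"),
    ("machine-2", "9"),
    ("machine-1", "15"),
]
compacted = compact(log)
```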
Partitions
https://kafka.apache.org
On disk
Replication (and partitioning) in action
Humble beginning – single node
Replication (and partitioning) in action
Scale out…
https://simplydistributed.wordpress.com/2016/12/13/kafka-partitioning/
https://svn.apache.org/repos/asf/zookeeper/logo/zookeeper.jpg
Zookeeper
Producers
https://kafka.apache.org
What goes where?
Consumers
https://kafka.apache.org
Pub-sub Queue
Kafka
Scale: end-to-end
App
• Producer data is routed to the appropriate partition based on strategy/configuration
• Partitions are ‘automatically’ distributed across the cluster
• Pub-sub and queue combination in action
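A minimal sketch of the two ideas on this slide: key-based routing of records to partitions, and consumer groups giving queue and pub-sub behaviour at once. CRC32 is only a stand-in for whatever hash the real client uses, the round-robin assignment is one simple strategy among several, and all names here are hypothetical.

```python
import zlib

def choose_partition(key, num_partitions):
    """Map a record key to a partition using a stable hash (CRC32 as a stand-in)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions

def assign_partitions(num_partitions, consumers):
    """Spread the partitions over the consumers of one group, round-robin."""
    assignment = {consumer: [] for consumer in consumers}
    for partition in range(num_partitions):
        assignment[consumers[partition % len(consumers)]].append(partition)
    return assignment

NUM_PARTITIONS = 4

# The same key always lands in the same partition, which preserves per-key order.
first = choose_partition("machine-1", NUM_PARTITIONS)
second = choose_partition("machine-1", NUM_PARTITIONS)

# Queue-like: inside one group, each partition is read by exactly one consumer.
group_a = assign_partitions(NUM_PARTITIONS, ["a1", "a2"])
# Pub-sub-like: a second group independently receives every partition.
group_b = assign_partitions(NUM_PARTITIONS, ["b1"])
```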
Managed Kafka cluster: Oracle Event Hub Cloud
Kafka Topic: Oracle Event Hub Cloud
Simulated event producer: Oracle Application Container Cloud
Spark Streaming: processing engine
Hello Spark!
• Unified platform for cluster computing
– Order of magnitude faster than Hadoop MapReduce
– Configurable persistence and parallelism
– Multiple language support
• One Driver, multiple Executors
– Can use cluster managers (YARN, Mesos, Standalone) for resource management
[Diagram: Spark Core; languages: Java, Scala, R, Python; APIs: RDD, DataFrame; libraries: Streaming, SQL, MLlib, GraphX]
High Level Architecture of a Spark application
[Diagram: three numbered steps, including packaging the application and running spark-submit <jar file>]
The evergreen WordCount
Input: Good Evening to all Good weather
Words: Good, Evening, To, All, Good, Weather
Pairs (RDD): (Good, 1) (Evening, 1) (To, 1) (All, 1) (Good, 1) (Weather, 1)
Counts: (Good, 2) (Evening, 1) (To, 1) (All, 1) (Weather, 1)
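The WordCount flow above can be mimicked in plain Python to show what each step contributes. The helpers `flat_map` and `reduce_by_key` are hypothetical local functions named after the Spark operations they imitate; this is not Spark code and runs on a single machine.

```python
from itertools import chain

def flat_map(func, items):
    """Apply func to every item and flatten the resulting lists into one."""
    return list(chain.from_iterable(func(item) for item in items))

def reduce_by_key(func, pairs):
    """Combine the values of pairs that share a key using func."""
    result = {}
    for key, value in pairs:
        result[key] = func(result[key], value) if key in result else value
    return result

lines = ["Good Evening to all Good weather"]
words = flat_map(lambda line: line.split(" "), lines)   # split lines into words
pairs = [(word, 1) for word in words]                   # map each word to (word, 1)
counts = reduce_by_key(lambda a, b: a + b, pairs)       # add up the ones per word
```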
Resilient Distributed DataSet (RDD)
• Basic abstraction
• Immutable
• Distributed collection (with partitions)
• Fault tolerant
• Lazy
– Transformations (map, flatMap)
– Actions (saveAsTextFile)
[Diagram: representation of data in an RDD, with four partitions spread across Node 1, Node 2 and Node 3]
Partition 1: Greetings to All, Random Text 1, Random Text 5
Partition 2: Welcome to Oracle Code, Random Text 2, Random Text 6
Partition 3: Good evening, Random Text 3, Random Text 7
Partition 4: #oraclecode, Random Text 4, Random Text 8
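To make "immutable" and "lazy" concrete, here is a toy stand-in for an RDD. The class `LazyDataset` and its methods are hypothetical and only imitate the idea: transformations return a new object that records what to do, and nothing is computed until an action is called.

```python
class LazyDataset:
    """Toy stand-in for an RDD: records transformations, runs them on an action."""

    def __init__(self, partitions, ops=()):
        self._partitions = partitions  # list of lists, never modified
        self._ops = ops                # pending transformations, in order

    def map(self, func):
        # Transformation: returns a new dataset, computes nothing yet.
        return LazyDataset(self._partitions, self._ops + (("map", func),))

    def flat_map(self, func):
        return LazyDataset(self._partitions, self._ops + (("flat_map", func),))

    def collect(self):
        """Action: apply the pending transformations partition by partition."""
        out = []
        for partition in self._partitions:
            records = list(partition)
            for kind, func in self._ops:
                if kind == "map":
                    records = [func(r) for r in records]
                else:
                    records = [x for r in records for x in func(r)]
            out.extend(records)
        return out

calls = []

def to_upper(word):
    calls.append(word)  # lets us observe when the function really runs
    return word.upper()

base = LazyDataset([["Greetings to All"], ["Welcome to Oracle Code"],
                    ["Good evening"], ["#oraclecode"]])
words = base.flat_map(str.split).map(to_upper)
calls_before_action = len(calls)   # still zero: nothing has executed
result = words.collect()           # the action triggers the work
```

Because each dataset only stores its source partitions and the recorded operations, a lost result can be recomputed from them, which is the intuition behind the fault tolerance bullet.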
RDD Operations
[Diagram: WordCount as a sequence of RDD operations, labelled Narrow Transformation, Wide Transformation and Action]
Input lines – Partition 1: Greetings to all; Partition 2: Welcome to Oracle Code; Partition 3: Good evening
Words – Partition 1: Greetings, To, all; Partition 2: Welcome, To, Oracle, Code; Partition 3: Good, Evening
Pairs – Partition 1: (Greetings,1) (To,1) (all,1); Partition 2: (Welcome,1) (To,1) (Oracle,1) (Code,1); Partition 3: (Good,1) (Evening,1)
Reduced – Partition 1: (Greetings,1) (To,2) (all,1); Partition 2: (Welcome,1) (Oracle,1) (Code,1) (Evening,1)
Result – Partition 1: (Greetings,1) (To,2) (all,1); Partition 2: (Welcome,1) (Oracle,1) (Code,1) (Evening,1)
RDD Operations
[The same diagram, now divided into Stage 1 and Stage 2]
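The difference between narrow and wide transformations in the RDD Operations slides can be sketched with plain lists standing in for partitions. The helper names and the byte-sum hash are assumptions chosen to keep the example deterministic; Spark's own partitioning is more involved.

```python
def shuffle_by_key(partitions, num_output_partitions):
    """Wide step: send every (key, value) pair to the partition chosen by its key."""
    out = [[] for _ in range(num_output_partitions)]
    for partition in partitions:
        for key, value in partition:
            index = sum(key.encode("utf-8")) % num_output_partitions  # toy hash
            out[index].append((key, value))
    return out

def sum_within_partition(partition):
    """Add up the values per key inside a single partition."""
    totals = {}
    for key, value in partition:
        totals[key] = totals.get(key, 0) + value
    return totals

lines = [["Greetings to all"], ["Welcome to Oracle Code"], ["Good evening"]]

# Narrow: each output partition is built from exactly one input partition.
words = [[w for line in part for w in line.split()] for part in lines]
pairs = [[(w, 1) for w in part] for part in words]

# Wide: pairs with the same key may sit in different partitions and must be
# moved so that they meet in one place before they can be combined.
shuffled = shuffle_by_key(pairs, 2)
reduced = [sum_within_partition(part) for part in shuffled]
```

The data movement in `shuffle_by_key` is the costly part, which is why a wide transformation is a natural place to separate one stage from the next.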
Lazy evaluation, DAG, Stages, Tasks
[Diagram: the job as a DAG of five steps, A to E (textFile, flatMap, map, reduceByKey, saveAsTextFile), split into Stage 1 with Tasks 1–3 and Stage 2 with Tasks 1–2]
Hello Spark Streaming!
• Continuous consumption and processing of streams
• Micro-batching concept
• The abstraction is the DStream (Discretized Stream)
• Checkpointing support
[Diagram: input streams (Kafka, TCP sockets, Flume, etc.) are cut into micro batches, forming a DStream of RDDs: RDD @ T, RDD @ T + 1, RDD @ T + 2]
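The micro-batching idea can be illustrated without Spark: timestamped events are grouped into consecutive fixed-length intervals, and each interval becomes one small batch. The function name, the (timestamp, value) event shape and the choice to emit empty batches for quiet intervals are assumptions for this sketch.

```python
def discretize(events, batch_interval):
    """Group (timestamp, value) events into consecutive batches of batch_interval seconds."""
    batches = {}
    for timestamp, value in events:
        batch_index = int(timestamp // batch_interval)
        batches.setdefault(batch_index, []).append(value)
    if not batches:
        return []
    # Intervals without events still yield an (empty) batch in this sketch.
    return [batches.get(i, []) for i in range(min(batches), max(batches) + 1)]

events = [(0.2, "a"), (0.9, "b"), (1.1, "c"), (3.5, "d")]
micro_batches = discretize(events, batch_interval=1.0)
```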
Streaming WordCount
Source: Apache Spark Streaming Documentation
DStream
Common patterns in Streaming
• Windowed calculations – show the count of the word “Code” in the last 15 minutes, refreshed every 10 seconds
– Window length (15 mins)
– Slide interval (10 secs)
• Cumulative calculations – show the total number of words received since the job started
– updateStateByKey
– mapWithState (new since Spark 1.6)
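Both patterns can be sketched over a list of micro batches. To keep the example small, window length and slide interval are expressed in number of batches rather than in minutes and seconds, and the function names are hypothetical; the cumulative version only imitates the spirit of updateStateByKey by carrying state from one batch to the next.

```python
from collections import Counter

def windowed_counts(batches, window_length, slide_interval):
    """Count words over the last window_length batches, every slide_interval batches."""
    results = []
    for end in range(slide_interval, len(batches) + 1, slide_interval):
        window = batches[max(0, end - window_length):end]
        results.append(Counter(word for batch in window for word in batch))
    return results

def cumulative_counts(batches):
    """Running total per word since the start, carried across batches as state."""
    state = Counter()
    snapshots = []
    for batch in batches:
        state.update(batch)
        snapshots.append(dict(state))
    return snapshots

batches = [["Code", "Oracle"], ["Code"], [], ["Code", "Code"]]
win = windowed_counts(batches, window_length=2, slide_interval=1)
totals = cumulative_counts(batches)
```

A windowed count can go down again as old batches fall out of the window, whereas the cumulative count never decreases.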
Spark Cluster : Oracle Big Data Cloud – Compute Edition
Cluster console: Oracle Big Data Cloud – Compute Edition
Spark Job: Oracle Big Data Cloud – Compute Edition
Redis: state store
Hello Redis
• RE(mote) DI(ctionary) S(erver)
• Versatile data structure server (written in C)
• Focus on in-memory with (tunable) persistence
• Not just any KV store
• Keys
– From a simple string to binary
– Max 512 MB (same for values)
– Can be expired
• Values – any of the following
– String, List, Hash
– Set, Sorted Set
– Geospatial, HyperLogLog
Redis data structures
• Sorted Sets
– Each element has an associated score (basis for sort)
– Basic Ops: ZADD, ZINCRBY, ZREM, ZSCORE, ZCARD
– View: ZRANGEBYSCORE, ZREVRANGEBYSCORE
– Ranking: ZRANK, ZREVRANK
• Lists
– To be specific: a Linked List
– Operations at head (LPUSH) & tail (RPUSH) are O(1), search by index is O(N)
– LRANGE, RPOP, LPOP to extract data & LTRIM to cap the size
– Blocking ops: BLPOP, BRPOP
https://redis.io/commands
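The behaviour of a few of these commands can be modelled in memory to show why they suit a dashboard: a sorted set keeps a ranking, and a capped list keeps only the most recent values. This is a toy model, not a Redis client; the class and function names are hypothetical, and sorting on every call is done for brevity rather than efficiency.

```python
class SortedSetModel:
    """Toy in-memory model of ZADD, ZINCRBY, ZSCORE and ZREVRANK."""

    def __init__(self):
        self._scores = {}

    def zadd(self, member, score):
        self._scores[member] = float(score)

    def zincrby(self, member, increment):
        self._scores[member] = self._scores.get(member, 0.0) + increment
        return self._scores[member]

    def zscore(self, member):
        return self._scores.get(member)

    def zrevrank(self, member):
        """Rank with the highest score at position 0, or None if absent."""
        if member not in self._scores:
            return None
        ordered = sorted(self._scores, key=lambda m: (self._scores[m], m), reverse=True)
        return ordered.index(member)

def lpush_capped(items, value, max_len):
    """Model of LPUSH followed by LTRIM 0 max_len-1: newest first, bounded length."""
    items.insert(0, value)
    del items[max_len:]
    return items

board = SortedSetModel()
board.zadd("machine-1", 36)
board.zadd("machine-2", 42)
new_score = board.zincrby("machine-1", 10)

recent = []
for reading in [1, 2, 3, 4]:
    lpush_capped(recent, reading, max_len=3)
```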
etc……
• Good stuff – Redis Sentinel (HA), Master-Slave replication, Redis Cluster for partitioning, Pub Sub, Transactions, Lua scripting
• Use cases: messaging, cache, job queue, live leaderboard, counting stuff (efficiently), analytics, location based (Geospatial) etc.
• Client libraries – Java, Scala, Go, Python, C++…
– https://redis.io/clients
• Who’s using it? Twitter, GitHub, Pinterest etc.
Redis : Oracle Container Cloud
Tech stack: powered by Oracle Cloud
[Architecture diagram annotated with the Oracle Cloud services involved: Developer Cloud Service; Application Container Cloud (event source); Event Hub Cloud (Kafka); Big Data – Compute Edition (Spark); Container Cloud (Redis); Container Cloud (service app and UI)]
Demo
Dashboard: powered by Kafka, Spark & Redis
Oracle Application Container Cloud - overview
Oracle Big Data Cloud – Compute Edition – Cluster overview
Oracle Big Data Cloud – Compute Edition – Cluster console
Oracle Big Data Cloud – Compute Edition – Jobs
Oracle Big Data Cloud – Compute Edition – Jobs
Oracle Container Cloud – the application Stack config
Oracle Container Cloud – dashboard app in action

Contenu connexe

Tendances

Video Transcoding on Hadoop
Video Transcoding on HadoopVideo Transcoding on Hadoop
Video Transcoding on HadoopDataWorks Summit
 
Performance tuning your Hadoop/Spark clusters to use cloud storage
Performance tuning your Hadoop/Spark clusters to use cloud storagePerformance tuning your Hadoop/Spark clusters to use cloud storage
Performance tuning your Hadoop/Spark clusters to use cloud storageDataWorks Summit
 
Accelerating TensorFlow with RDMA for high-performance deep learning
Accelerating TensorFlow with RDMA for high-performance deep learningAccelerating TensorFlow with RDMA for high-performance deep learning
Accelerating TensorFlow with RDMA for high-performance deep learningDataWorks Summit
 
Zero ETL analytics with LLAP in Azure HDInsight
Zero ETL analytics with LLAP in Azure HDInsightZero ETL analytics with LLAP in Azure HDInsight
Zero ETL analytics with LLAP in Azure HDInsightDataWorks Summit
 
Streaming Solutions for Real time problems
Streaming Solutions for Real time problemsStreaming Solutions for Real time problems
Streaming Solutions for Real time problemsAbhishek Gupta
 
YARN Ready: Apache Spark
YARN Ready: Apache Spark YARN Ready: Apache Spark
YARN Ready: Apache Spark Hortonworks
 
Improving HDFS Availability with IPC Quality of Service
Improving HDFS Availability with IPC Quality of ServiceImproving HDFS Availability with IPC Quality of Service
Improving HDFS Availability with IPC Quality of ServiceDataWorks Summit
 
Apache Hadoop YARN: Present and Future
Apache Hadoop YARN: Present and FutureApache Hadoop YARN: Present and Future
Apache Hadoop YARN: Present and FutureDataWorks Summit
 
Low latency high throughput streaming using Apache Apex and Apache Kudu
Low latency high throughput streaming using Apache Apex and Apache KuduLow latency high throughput streaming using Apache Apex and Apache Kudu
Low latency high throughput streaming using Apache Apex and Apache KuduDataWorks Summit
 
#GeodeSummit - Apex & Geode: In-memory streaming, storage & analytics
#GeodeSummit - Apex & Geode: In-memory streaming, storage & analytics#GeodeSummit - Apex & Geode: In-memory streaming, storage & analytics
#GeodeSummit - Apex & Geode: In-memory streaming, storage & analyticsPivotalOpenSourceHub
 
Schema Registry - Set Your Data Free
Schema Registry - Set Your Data FreeSchema Registry - Set Your Data Free
Schema Registry - Set Your Data FreeDataWorks Summit
 
Building Reliable Cloud Storage with Riak and CloudStack - Andy Gross, Chief ...
Building Reliable Cloud Storage with Riak and CloudStack - Andy Gross, Chief ...Building Reliable Cloud Storage with Riak and CloudStack - Andy Gross, Chief ...
Building Reliable Cloud Storage with Riak and CloudStack - Andy Gross, Chief ...buildacloud
 
SQL and Machine Learning on Hadoop using HAWQ
SQL and Machine Learning on Hadoop using HAWQSQL and Machine Learning on Hadoop using HAWQ
SQL and Machine Learning on Hadoop using HAWQpivotalny
 
HAWQ Meets Hive - Querying Unmanaged Data
HAWQ Meets Hive - Querying Unmanaged DataHAWQ Meets Hive - Querying Unmanaged Data
HAWQ Meets Hive - Querying Unmanaged DataDataWorks Summit
 
Data Security at Scale through Spark and Parquet Encryption
Data Security at Scale through Spark and Parquet EncryptionData Security at Scale through Spark and Parquet Encryption
Data Security at Scale through Spark and Parquet EncryptionDatabricks
 
Stream Processing Everywhere - What to use?
Stream Processing Everywhere - What to use?Stream Processing Everywhere - What to use?
Stream Processing Everywhere - What to use?MapR Technologies
 
Query Engines for Hive: MR, Spark, Tez with LLAP – Considerations!
Query Engines for Hive: MR, Spark, Tez with LLAP – Considerations!Query Engines for Hive: MR, Spark, Tez with LLAP – Considerations!
Query Engines for Hive: MR, Spark, Tez with LLAP – Considerations!Mich Talebzadeh (Ph.D.)
 
An Apache Hive Based Data Warehouse
An Apache Hive Based Data WarehouseAn Apache Hive Based Data Warehouse
An Apache Hive Based Data WarehouseDataWorks Summit
 

Tendances (20)

Video Transcoding on Hadoop
Video Transcoding on HadoopVideo Transcoding on Hadoop
Video Transcoding on Hadoop
 
Performance tuning your Hadoop/Spark clusters to use cloud storage
Performance tuning your Hadoop/Spark clusters to use cloud storagePerformance tuning your Hadoop/Spark clusters to use cloud storage
Performance tuning your Hadoop/Spark clusters to use cloud storage
 
LLAP: Sub-Second Analytical Queries in Hive
LLAP: Sub-Second Analytical Queries in HiveLLAP: Sub-Second Analytical Queries in Hive
LLAP: Sub-Second Analytical Queries in Hive
 
Accelerating TensorFlow with RDMA for high-performance deep learning
Accelerating TensorFlow with RDMA for high-performance deep learningAccelerating TensorFlow with RDMA for high-performance deep learning
Accelerating TensorFlow with RDMA for high-performance deep learning
 
Zero ETL analytics with LLAP in Azure HDInsight
Zero ETL analytics with LLAP in Azure HDInsightZero ETL analytics with LLAP in Azure HDInsight
Zero ETL analytics with LLAP in Azure HDInsight
 
Streaming Solutions for Real time problems
Streaming Solutions for Real time problemsStreaming Solutions for Real time problems
Streaming Solutions for Real time problems
 
YARN Ready: Apache Spark
YARN Ready: Apache Spark YARN Ready: Apache Spark
YARN Ready: Apache Spark
 
Intro to Apache Spark
Intro to Apache SparkIntro to Apache Spark
Intro to Apache Spark
 
Improving HDFS Availability with IPC Quality of Service
Improving HDFS Availability with IPC Quality of ServiceImproving HDFS Availability with IPC Quality of Service
Improving HDFS Availability with IPC Quality of Service
 
Apache Hadoop YARN: Present and Future
Apache Hadoop YARN: Present and FutureApache Hadoop YARN: Present and Future
Apache Hadoop YARN: Present and Future
 
Low latency high throughput streaming using Apache Apex and Apache Kudu
Low latency high throughput streaming using Apache Apex and Apache KuduLow latency high throughput streaming using Apache Apex and Apache Kudu
Low latency high throughput streaming using Apache Apex and Apache Kudu
 
#GeodeSummit - Apex & Geode: In-memory streaming, storage & analytics
#GeodeSummit - Apex & Geode: In-memory streaming, storage & analytics#GeodeSummit - Apex & Geode: In-memory streaming, storage & analytics
#GeodeSummit - Apex & Geode: In-memory streaming, storage & analytics
 
Schema Registry - Set Your Data Free
Schema Registry - Set Your Data FreeSchema Registry - Set Your Data Free
Schema Registry - Set Your Data Free
 
Building Reliable Cloud Storage with Riak and CloudStack - Andy Gross, Chief ...
Building Reliable Cloud Storage with Riak and CloudStack - Andy Gross, Chief ...Building Reliable Cloud Storage with Riak and CloudStack - Andy Gross, Chief ...
Building Reliable Cloud Storage with Riak and CloudStack - Andy Gross, Chief ...
 
SQL and Machine Learning on Hadoop using HAWQ
SQL and Machine Learning on Hadoop using HAWQSQL and Machine Learning on Hadoop using HAWQ
SQL and Machine Learning on Hadoop using HAWQ
 
HAWQ Meets Hive - Querying Unmanaged Data
HAWQ Meets Hive - Querying Unmanaged DataHAWQ Meets Hive - Querying Unmanaged Data
HAWQ Meets Hive - Querying Unmanaged Data
 
Data Security at Scale through Spark and Parquet Encryption
Data Security at Scale through Spark and Parquet EncryptionData Security at Scale through Spark and Parquet Encryption
Data Security at Scale through Spark and Parquet Encryption
 
Stream Processing Everywhere - What to use?
Stream Processing Everywhere - What to use?Stream Processing Everywhere - What to use?
Stream Processing Everywhere - What to use?
 
Query Engines for Hive: MR, Spark, Tez with LLAP – Considerations!
Query Engines for Hive: MR, Spark, Tez with LLAP – Considerations!Query Engines for Hive: MR, Spark, Tez with LLAP – Considerations!
Query Engines for Hive: MR, Spark, Tez with LLAP – Considerations!
 
An Apache Hive Based Data Warehouse
An Apache Hive Based Data WarehouseAn Apache Hive Based Data Warehouse
An Apache Hive Based Data Warehouse
 

Similaire à Streaming solutions for real time problems

AUSOUG - NZOUG-GroundBreakers-Jun 2019 - 19c RAC
AUSOUG - NZOUG-GroundBreakers-Jun 2019 - 19c RACAUSOUG - NZOUG-GroundBreakers-Jun 2019 - 19c RAC
AUSOUG - NZOUG-GroundBreakers-Jun 2019 - 19c RACSandesh Rao
 
AIOUG-GroundBreakers-Jul 2019 - 19c RAC
AIOUG-GroundBreakers-Jul 2019 - 19c RACAIOUG-GroundBreakers-Jul 2019 - 19c RAC
AIOUG-GroundBreakers-Jul 2019 - 19c RACSandesh Rao
 
New availability features in oracle rac 12c release 2 anair ss
New availability features in oracle rac 12c release 2 anair   ssNew availability features in oracle rac 12c release 2 anair   ss
New availability features in oracle rac 12c release 2 anair ssAnil Nair
 
Coherence RoadMap 2018
Coherence RoadMap 2018Coherence RoadMap 2018
Coherence RoadMap 2018harvraja
 
Exadata x4 for_sap
Exadata x4 for_sapExadata x4 for_sap
Exadata x4 for_sapFran Navarro
 
20190615 hkos-mysql-troubleshootingandperformancev2
20190615 hkos-mysql-troubleshootingandperformancev220190615 hkos-mysql-troubleshootingandperformancev2
20190615 hkos-mysql-troubleshootingandperformancev2Ivan Ma
 
Intelligent Integration OOW2017 - Jeff Pollock
Intelligent Integration OOW2017 - Jeff PollockIntelligent Integration OOW2017 - Jeff Pollock
Intelligent Integration OOW2017 - Jeff PollockJeffrey T. Pollock
 
Solution Use Case Demo: The Power of Relationships in Your Big Data
Solution Use Case Demo: The Power of Relationships in Your Big DataSolution Use Case Demo: The Power of Relationships in Your Big Data
Solution Use Case Demo: The Power of Relationships in Your Big DataInfiniteGraph
 
2017 OpenWorld Keynote for Data Integration
2017 OpenWorld Keynote for Data Integration2017 OpenWorld Keynote for Data Integration
2017 OpenWorld Keynote for Data IntegrationJeffrey T. Pollock
 
Octo and the DevSecOps Evolution at Oracle by Ian Van Hoven
Octo and the DevSecOps Evolution at Oracle by Ian Van HovenOcto and the DevSecOps Evolution at Oracle by Ian Van Hoven
Octo and the DevSecOps Evolution at Oracle by Ian Van HovenInfluxData
 
Times ten 18.1_overview_meetup
Times ten 18.1_overview_meetupTimes ten 18.1_overview_meetup
Times ten 18.1_overview_meetupByung Ho Lee
 
Understanding the impact of Linux scheduler on your application
Understanding the impact of Linux scheduler on your applicationUnderstanding the impact of Linux scheduler on your application
Understanding the impact of Linux scheduler on your applicationAtish Patra
 
Replicate data between environments
Replicate data between environmentsReplicate data between environments
Replicate data between environmentsDLT Solutions
 
Simplify IT: Oracle SuperCluster
Simplify IT: Oracle SuperCluster Simplify IT: Oracle SuperCluster
Simplify IT: Oracle SuperCluster Fran Navarro
 
MySQL InnoDB Cluster - A complete High Availability solution for MySQL
MySQL InnoDB Cluster - A complete High Availability solution for MySQLMySQL InnoDB Cluster - A complete High Availability solution for MySQL
MySQL InnoDB Cluster - A complete High Availability solution for MySQLOlivier DASINI
 
Oracle NoSQL Database release 3.0 overview
Oracle NoSQL Database release 3.0 overviewOracle NoSQL Database release 3.0 overview
Oracle NoSQL Database release 3.0 overviewPaulo Fagundes
 
Oracle Code Online: Building a Serverless State Service for the Cloud
Oracle Code Online: Building a Serverless State Service for the CloudOracle Code Online: Building a Serverless State Service for the Cloud
Oracle Code Online: Building a Serverless State Service for the CloudEd Burns
 
Ceph Day Seoul - AFCeph: SKT Scale Out Storage Ceph
Ceph Day Seoul - AFCeph: SKT Scale Out Storage Ceph Ceph Day Seoul - AFCeph: SKT Scale Out Storage Ceph
Ceph Day Seoul - AFCeph: SKT Scale Out Storage Ceph Ceph Community
 

Similaire à Streaming solutions for real time problems (20)

AUSOUG - NZOUG-GroundBreakers-Jun 2019 - 19c RAC
AUSOUG - NZOUG-GroundBreakers-Jun 2019 - 19c RACAUSOUG - NZOUG-GroundBreakers-Jun 2019 - 19c RAC
AUSOUG - NZOUG-GroundBreakers-Jun 2019 - 19c RAC
 
AIOUG-GroundBreakers-Jul 2019 - 19c RAC
AIOUG-GroundBreakers-Jul 2019 - 19c RACAIOUG-GroundBreakers-Jul 2019 - 19c RAC
AIOUG-GroundBreakers-Jul 2019 - 19c RAC
 
Novinky v Oracle Database 18c
Novinky v Oracle Database 18cNovinky v Oracle Database 18c
Novinky v Oracle Database 18c
 
New availability features in oracle rac 12c release 2 anair ss
New availability features in oracle rac 12c release 2 anair   ssNew availability features in oracle rac 12c release 2 anair   ss
New availability features in oracle rac 12c release 2 anair ss
 
Coherence RoadMap 2018
Coherence RoadMap 2018Coherence RoadMap 2018
Coherence RoadMap 2018
 
Exadata x4 for_sap
Exadata x4 for_sapExadata x4 for_sap
Exadata x4 for_sap
 
20190615 hkos-mysql-troubleshootingandperformancev2
20190615 hkos-mysql-troubleshootingandperformancev220190615 hkos-mysql-troubleshootingandperformancev2
20190615 hkos-mysql-troubleshootingandperformancev2
 
Intelligent Integration OOW2017 - Jeff Pollock
Intelligent Integration OOW2017 - Jeff PollockIntelligent Integration OOW2017 - Jeff Pollock
Intelligent Integration OOW2017 - Jeff Pollock
 
Solution Use Case Demo: The Power of Relationships in Your Big Data
Solution Use Case Demo: The Power of Relationships in Your Big DataSolution Use Case Demo: The Power of Relationships in Your Big Data
Solution Use Case Demo: The Power of Relationships in Your Big Data
 
2017 OpenWorld Keynote for Data Integration
2017 OpenWorld Keynote for Data Integration2017 OpenWorld Keynote for Data Integration
2017 OpenWorld Keynote for Data Integration
 
Octo and the DevSecOps Evolution at Oracle by Ian Van Hoven
Octo and the DevSecOps Evolution at Oracle by Ian Van HovenOcto and the DevSecOps Evolution at Oracle by Ian Van Hoven
Octo and the DevSecOps Evolution at Oracle by Ian Van Hoven
 
Times ten 18.1_overview_meetup
Times ten 18.1_overview_meetupTimes ten 18.1_overview_meetup
Times ten 18.1_overview_meetup
 
Understanding the impact of Linux scheduler on your application
Understanding the impact of Linux scheduler on your applicationUnderstanding the impact of Linux scheduler on your application
Understanding the impact of Linux scheduler on your application
 
Replicate data between environments
Replicate data between environmentsReplicate data between environments
Replicate data between environments
 
Simplify IT: Oracle SuperCluster
Simplify IT: Oracle SuperCluster Simplify IT: Oracle SuperCluster
Simplify IT: Oracle SuperCluster
 
MySQL InnoDB Cluster - A complete High Availability solution for MySQL
MySQL InnoDB Cluster - A complete High Availability solution for MySQLMySQL InnoDB Cluster - A complete High Availability solution for MySQL
MySQL InnoDB Cluster - A complete High Availability solution for MySQL
 
Ceph's journey at SUSE
Ceph's journey at SUSECeph's journey at SUSE
Ceph's journey at SUSE
 
Oracle NoSQL Database release 3.0 overview
Oracle NoSQL Database release 3.0 overviewOracle NoSQL Database release 3.0 overview
Oracle NoSQL Database release 3.0 overview
 
Oracle Code Online: Building a Serverless State Service for the Cloud
Oracle Code Online: Building a Serverless State Service for the CloudOracle Code Online: Building a Serverless State Service for the Cloud
Oracle Code Online: Building a Serverless State Service for the Cloud
 
Ceph Day Seoul - AFCeph: SKT Scale Out Storage Ceph
Ceph Day Seoul - AFCeph: SKT Scale Out Storage Ceph Ceph Day Seoul - AFCeph: SKT Scale Out Storage Ceph
Ceph Day Seoul - AFCeph: SKT Scale Out Storage Ceph
 

Dernier

Using IESVE for Loads, Sizing and Heat Pump Modeling to Achieve Decarbonization
Using IESVE for Loads, Sizing and Heat Pump Modeling to Achieve DecarbonizationUsing IESVE for Loads, Sizing and Heat Pump Modeling to Achieve Decarbonization
Using IESVE for Loads, Sizing and Heat Pump Modeling to Achieve DecarbonizationIES VE
 
UiPath Studio Web workshop series - Day 6
UiPath Studio Web workshop series - Day 6UiPath Studio Web workshop series - Day 6
UiPath Studio Web workshop series - Day 6DianaGray10
 
COMPUTER 10: Lesson 7 - File Storage and Online Collaboration
COMPUTER 10: Lesson 7 - File Storage and Online CollaborationCOMPUTER 10: Lesson 7 - File Storage and Online Collaboration
COMPUTER 10: Lesson 7 - File Storage and Online Collaborationbruanjhuli
 
Designing A Time bound resource download URL
Designing A Time bound resource download URLDesigning A Time bound resource download URL
Designing A Time bound resource download URLRuncy Oommen
 
Building AI-Driven Apps Using Semantic Kernel.pptx
Building AI-Driven Apps Using Semantic Kernel.pptxBuilding AI-Driven Apps Using Semantic Kernel.pptx
Building AI-Driven Apps Using Semantic Kernel.pptxUdaiappa Ramachandran
 

Streaming solutions for real time problems

  • 1. Streaming solutions for real time problems. Aparna Gaonkar (@AparnaMoi), Senior Principal Product Manager; Abhishek Gupta (@abhi_tweeter), Senior Product Manager. Aug 10, 2017. Copyright © 2017, Oracle and/or its affiliates. All rights reserved.
  • 2. Safe Harbor Statement: The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle’s products remains at the sole discretion of Oracle.
  • 3. The agenda: (1) Problem domain, (2) Tech stack deep dive, (3) Solution & Demo, (4) Q&A
  • 4. No time like the present: how quickly are you able to respond to an event in your system?
      – A fraud happening in your payment system
      – A user viewing an expensive product
      – A component being “stressed” due to load
      … and how do you do this at scale?
  • 5. The (traditional) Batch solution [Diagram labels: Events, Aggregate, DWH, Batch processing, Static view of insights]
  • 6. The (traditional) Queue based solution [Diagram labels: Events, Queue, Listener, App, DB, Polling etc.]
  • 7. Problems [Queue based diagram with annotations]
      – Bottleneck: finite storage capacity, because a queue is designed to hold data for short periods, not to store it
      – Scalability issues
      – Bottleneck: can get overwhelmed by too many concurrent requests
      – Polling etc.
  • 8. Hello Streaming: a (potential) solution
      • Streaming: continuously process infinite/unbounded data as it arrives, rather than in a batch at a specific time interval
      • Data has both volume and velocity. Not just Big Data, but Fast Data
      • OLTP < Streaming < Batch
      • Streaming solutions at the core deal with these problems
        – Storage for events
        – Processing for events
        – (Optionally) a system for maintaining state
      Image: http://www.capturearkansas.com/photos/550197
  • 9. Use Case: Data center monitoring application
      • Collect: metrics from multiple machines
      • Crunch: statistics over a time window
      • Monitor: using a dashboard
      Sample events:
      data: {"machine":"machine-1","metrics":["8","20","36","65","2","20","73","67"]}
      data: {"machine":"machine-2","metrics":["1","54","42","61","40","35","26","78"]}
      ...
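The "collect" and "crunch" steps of this use case can be sketched in plain Python. This is only an illustration under stated assumptions: the function names `parse_event` and `summarize` are hypothetical, the choice of min/max/mean as the statistics is an assumption (the slides do not say which statistics are computed), and no time window is applied here.

```python
import json

def parse_event(line):
    """Parse one 'data: {...}' line into (machine, metrics as ints)."""
    # Everything after the first colon is the JSON payload.
    payload = json.loads(line.split(":", 1)[1])
    return payload["machine"], [int(m) for m in payload["metrics"]]

def summarize(lines):
    """Return {machine: (min, max, mean)} over all metrics seen per machine."""
    seen = {}
    for line in lines:
        machine, metrics = parse_event(line)
        seen.setdefault(machine, []).extend(metrics)
    return {m: (min(v), max(v), sum(v) / len(v)) for m, v in seen.items()}

# The two sample events from the slide.
SAMPLE = [
    'data: {"machine":"machine-1","metrics":["8","20","36","65","2","20","73","67"]}',
    'data: {"machine":"machine-2","metrics":["1","54","42","61","40","35","26","78"]}',
]

if __name__ == "__main__":
    for machine, stats in summarize(SAMPLE).items():
        print(machine, stats)
```

Note that the metrics arrive as strings in the sample payload, so they have to be converted before any arithmetic.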
  • 10. A Streaming solution [Architecture diagram. Stages: Simulated producer, Event store, Processor, State store, Visualization. Labels: Partitions, Kafka Receiver, Redis Connector, Lists, Sorted Set, Service App UI, <polls>, SSE]
  • 11. Kafka: the Event Store [Architecture diagram: Event store, Processor, State store, Visualization]
  • 12. Kafka: publish-subscribe messaging “… rethought as a distributed commit log”
      • Core: Scala
      • Clients: Java (included) and many more
        – https://cwiki.apache.org/confluence/display/KAFKA/Clients
      • Used by: Netflix, Twitter, Yelp, Uber etc.
        – https://kafka.apache.org/powered-by
  • 13. Topics
      • Store events/records
        – Records are simple KV pairs (with timestamp)
        – Keys and values are treated as bytes by Kafka
      • Producers write to a topic and consumers read from it
        – Multiple consumers possible
      • Compacted topics
        – Latest value (for a key) retained
        – Keys cannot be null
      • Divided into partitions
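The "latest value (for a key) retained" rule for compacted topics can be illustrated with a small in-memory model. This is a sketch of the idea only, not Kafka's implementation or API: the `compact` function and the sample log are hypothetical, and real compaction runs in the background on the broker rather than as a single pass.

```python
def compact(log):
    """Keep only the latest record per key, in the order of the surviving offsets."""
    latest = {}
    for offset, (key, value) in enumerate(log):
        if key is None:
            # Mirrors the slide: keys cannot be null in a compacted topic.
            raise ValueError("compacted topics require non-null keys")
        latest[key] = (offset, value)
    survivors = sorted(latest.items(), key=lambda item: item[1][0])
    return [(key, value) for key, (_offset, value) in survivors]

# Keys and values are bytes, as Kafka treats them.
log = [(b"machine-1", b"10"), (b"machine-2", b"7"), (b"machine-1", b"42")]

if __name__ == "__main__":
    print(compact(log))
```

Here the first record for `machine-1` disappears because a later record with the same key supersedes it.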
  • 14. Partitions: on disk [Figure from https://kafka.apache.org]
  • 15. Replication (and partitioning) in action: humble beginning, single node
  • 16. Replication (and partitioning) in action: scale out… [Diagram includes Zookeeper. Sources: https://simplydistributed.wordpress.com/2016/12/13/kafka-partitioning/ and https://svn.apache.org/repos/asf/zookeeper/logo/zookeeper.jpg]
  • 17. Producers: what goes where? [Figure from https://kafka.apache.org]
  • 18. Consumers [Figure from https://kafka.apache.org. Labels: Pub-sub, Queue, Kafka]
  • 19. Scale: end-to-end [Diagram label: App]
      – Producer data routed to the appropriate partition based on strategy/configuration
      – Partitions ‘automatically’ distributed across the cluster
      – Pub-Sub & Queue combo in action
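A rough model of the two ideas on this slide, key-based routing on the producer side and partitions shared out among consumers in one group, might look as follows. It is a sketch under assumptions: `zlib.crc32` stands in for the hash (Kafka's default partitioner uses a murmur2 hash for keyed records), the round-robin assignment is just one possible strategy, and both function names are hypothetical.

```python
import zlib

def partition_for(key, num_partitions):
    """Route a record to a partition by hashing its key (crc32 as a stand-in hash)."""
    return zlib.crc32(key) % num_partitions

def assign(partitions, consumers):
    """Spread partitions round-robin over the consumers of a single group."""
    assignment = {consumer: [] for consumer in consumers}
    for i, partition in enumerate(partitions):
        assignment[consumers[i % len(consumers)]].append(partition)
    return assignment

if __name__ == "__main__":
    # The same key always lands on the same partition, which preserves per-key ordering.
    print(partition_for(b"machine-1", 3), partition_for(b"machine-1", 3))
    # Within one group each partition has exactly one reader (queue behaviour);
    # a second group would get its own full copy of the stream (pub-sub behaviour).
    print(assign([0, 1, 2, 3], ["c1", "c2"]))
```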
  • 20. Managed Kafka cluster: Oracle Event Hub Cloud
  • 21. Kafka Topic: Oracle Event Hub Cloud
  • 22. Simulated event producer: Oracle Application Container Cloud
  • 23. Spark Streaming: processing engine [Architecture diagram: Event store, Processor, State store, Visualization]
  • 24. Hello Spark!
      • Unified platform for cluster computing
        – Order of magnitude faster than Hadoop MapReduce
        – Configurable persistence and parallelism
        – Multiple language support
      • One Driver, multiple Executors
        – Can use cluster managers (YARN, Mesos, Standalone) for resource management
      [Diagram labels: Spark Core; Java, Scala, R, Python; RDD API, DataFrame, Streaming, SQL, MLlib, GraphX]
  • 25. High Level Architecture of a Spark application [Diagram with steps 1 to 3. Labels: package, spark-submit <jar file>]
  • 26. The evergreen WordCount [Diagram label: RDD]
      – Input: Good Evening to all Good weather
      – Words: Good, Evening, To, All, Good, Weather
      – Pairs: (Good, 1) (Evening, 1) (To, 1) (All, 1) (Good, 1) (Weather, 1)
      – Counts: (Good, 2) (Evening, 1) (To, 1) (All, 1) (Weather, 1)
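The WordCount steps on this slide can be traced with a plain-Python sketch. It deliberately does not use Spark; the comments only name the RDD operations each step corresponds to. Splitting the input into two lines and the function name `word_count` are assumptions for illustration.

```python
from itertools import chain

def word_count(lines):
    """Count words, mirroring the flatMap -> map -> reduceByKey pipeline."""
    # flatMap: split every line into words and flatten into one sequence.
    words = chain.from_iterable(line.split() for line in lines)
    # map: normalize the word and pair it with a count of 1.
    pairs = ((word.capitalize(), 1) for word in words)
    # reduceByKey: add up the counts of identical words.
    counts = {}
    for word, one in pairs:
        counts[word] = counts.get(word, 0) + one
    return counts

if __name__ == "__main__":
    print(word_count(["Good Evening to all", "Good weather"]))
```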
  • 27. Resilient Distributed DataSet (RDD)
      • Basic abstraction
      • Immutable
      • Distributed collection (with partitions)
      • Fault tolerant
      • Lazy
        – Transformations (map, flatMap)
        – Actions (saveAsTextFile)
      Representation of data in an RDD (across Node 1, Node 2, Node 3):
        – Partition 1: Greetings to All; Random Text 1; Random Text 5
        – Partition 2: Welcome to Oracle Code; Random Text 2; Random Text 6
        – Partition 3: Good evening; Random Text 3; Random Text 7
        – Partition 4: #oraclecode; Random Text 4; Random Text 8
  • 28. RDD Operations [Diagram labels: Narrow Transformation (three times), Wide Transformation, Action]
      – Step 1: Partition 1: Greetings to all | Partition 2: Welcome to Oracle Code | Partition 3: Good evening
      – Step 2: Partition 1: Greetings To all | Partition 2: Welcome To Oracle Code | Partition 3: Good Evening
      – Step 3: Partition 1: (Greetings,1) (To,1) (all,1) | Partition 2: (Welcome,1) (To,1) (Oracle,1) (Code,1) | Partition 3: (Good,1) (Evening,1)
      – Step 4: Partition 1: (Greetings,1) (To,2) (all,1) | Partition 2: (Welcome,1) (Oracle,1) (Code,1) (Evening,1)
      – Step 5: Partition 1: (Greetings,1) (To,2) (all,1) | Partition 2: (Welcome,1) (Oracle,1) (Code,1) (Evening,1)
  • 29. RDD Operations [Same partition data as slide 28, grouped into Stage 1 and Stage 2]
  • 30. Lazy evaluation, DAG, Stages, Tasks [Diagram: operations textFile, flatMap, map, reduceByKey, saveAsTextFile; nodes A, B, C, D, E; Stage 1 (Task 1, Task 2, Task 3) and Stage 2 (Task 1, Task 2)]
  • 31. Hello Spark Streaming!
      • Continuous consumption and processing of streams
      • Micro batching concept
      • Abstraction is DStream (Discretized Stream)
      • Checkpointing support
      [Diagram: sources (Kafka, TCP sockets, Flume, etc.), input stream, micro batches, DStream of RDDs: RDD @ T, RDD @ T + 1, RDD @ T + 2]
  • 32. Streaming WordCount [DStream figure. Source: Apache Spark Streaming Documentation]
  • 33. Common patterns in Streaming
      • Windowed calculations: show the count of the word “Code” in the last 15 minutes, refreshed every 10 seconds
        – Window length (15 mins)
        – Slide interval (10 secs)
      • Cumulative calculations: show the total number of words received since the job started
        – updateStateByKey
        – mapWithState (new since 1.6)
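The difference between the two patterns can be shown with a small model that does not use Spark. It assumes one micro batch per slide interval, so a 15 minute window refreshed every 10 seconds would span 90 batches; the `WindowedCounter` class and its method names are hypothetical and only illustrate the bookkeeping that the windowed and stateful operations perform.

```python
from collections import Counter, deque

class WindowedCounter:
    """Tracks word counts over the last N micro batches and since the start."""

    def __init__(self, window_batches):
        # Windowed state: the oldest batch falls out once the window is full.
        self.batches = deque(maxlen=window_batches)
        # Cumulative state: never forgets, similar in spirit to updateStateByKey.
        self.total = Counter()

    def add_batch(self, words):
        batch = Counter(words)
        self.batches.append(batch)
        self.total.update(batch)

    def window_count(self, word):
        return sum(batch[word] for batch in self.batches)

if __name__ == "__main__":
    counter = WindowedCounter(window_batches=3)
    for words in (["Code", "Code"], ["Spark"], ["Code"], ["Redis"]):
        counter.add_batch(words)
    print(counter.window_count("Code"), counter.total["Code"])
```

After four batches with a window of three, the first batch is no longer part of the windowed count but still contributes to the cumulative total.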
  • 34. Spark Cluster: Oracle Big Data Cloud – Compute Edition
  • 35. Cluster console: Oracle Big Data Cloud – Compute Edition
  • 36. Spark Job: Oracle Big Data Cloud – Compute Edition
  • 37. Redis: state store [Architecture diagram: Event store, Processor, State store, Visualization]
  • 38. Hello [Redis]
      • RE(mote) DI(ctionary) S(erver)
      • Versatile data structure server (written in C)
      • Focus on in-memory with (tunable) persistence
      • Not just any KV store
      • Keys
        – From a simple string to binary
        – Max 512 MB (same for values)
        – Can be expired
      • Values: any of the following
        – String, List, Hash
        – Set, Sorted Set
        – Geospatial, HyperLogLog
  • 39. Redis data structures (https://redis.io/commands)
      • Sorted Sets
        – Each element has an associated score (basis for sort)
        – Basic ops: ZADD, ZINCRBY, ZREM, ZSCORE, ZCARD
        – View: ZRANGEBYSCORE, ZREVRANGEBYSCORE
        – Ranking: ZRANK, ZREVRANK
      • Lists
        – To be specific: a linked list
        – Operations at head (LPUSH) & tail (RPUSH) are O(1), search by index is O(N)
        – LRANGE, RPOP, LPOP to extract data & LTRIM to cap the size
        – Blocking ops: BLPOP, BRPOP
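To make the semantics of these commands concrete, here is a pure-Python emulation of a few of them. It is not a Redis client and talks to no server; the classes `SortedSet` and `CappedList` are hypothetical, their methods only borrow the command names, and only non-negative indexes are handled.

```python
from collections import deque

class SortedSet:
    """Emulates a handful of sorted set commands with a member -> score dict."""

    def __init__(self):
        self.scores = {}

    def zadd(self, member, score):
        self.scores[member] = float(score)

    def zincrby(self, member, amount):
        self.scores[member] = self.scores.get(member, 0.0) + amount
        return self.scores[member]

    def zrevrange(self, start, stop):
        # Highest score first; the stop index is inclusive, as in Redis.
        ranked = sorted(self.scores, key=lambda m: (self.scores[m], m), reverse=True)
        return ranked[start:stop + 1]

class CappedList:
    """Emulates LPUSH followed by LTRIM: new items at the head, fixed maximum size."""

    def __init__(self, cap):
        self.items = deque(maxlen=cap)

    def lpush(self, value):
        # When the deque is full, the item at the tail is dropped.
        self.items.appendleft(value)

    def lrange(self, start, stop):
        return list(self.items)[start:stop + 1]

if __name__ == "__main__":
    board = SortedSet()
    board.zadd("machine-1", 36)
    board.zadd("machine-2", 42)
    board.zincrby("machine-1", 10)
    print(board.zrevrange(0, 1))

    recent = CappedList(cap=2)
    for value in ("a", "b", "c"):
        recent.lpush(value)
    print(recent.lrange(0, 1))
```

A sorted set of this kind suits a "top machines" view on a dashboard, while a capped list keeps only the most recent readings.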
  • 40. etc…
      • Good stuff: Redis Sentinel (HA), Master-Slave replication, Redis Cluster for partitioning, Pub Sub, Transactions, Lua scripting
      • Use cases: messaging, cache, job queue, live leader board, counting stuff (efficiently), analytics, location based (Geospatial) etc.
      • Client libraries: Java, Scala, Go, Python, C++…
        – https://redis.io/clients
      • Who’s using it? Twitter, Github, Pinterest etc.
  • 41. Redis: Oracle Container Cloud
  • 42. Tech stack: powered by Oracle Cloud [Architecture diagram annotated with: Developer Cloud Service, Event Hub Cloud, Application Container Cloud, Event Source, Big Data – Compute Edition, Container Cloud (twice)]
  • 43. Demo
  • 44. [No text]
  • 45. Dashboard: powered by Kafka, Spark & Redis
  • 46. Oracle Application Container Cloud: overview
  • 47. Oracle Big Data Cloud – Compute Edition: Cluster overview
  • 48. Oracle Big Data Cloud – Compute Edition: Cluster console
  • 49. Oracle Big Data Cloud – Compute Edition: Jobs
  • 50. Oracle Big Data Cloud – Compute Edition: Jobs
  • 51. Oracle Container Cloud: the application Stack config
  • 52. Oracle Container Cloud: dashboard app in action