Spotting the trends by looking
at the big picture
Streaming Cyber Security into Graph:
Accelerating Data into Datastax Graph
and Blazegraph
Silicon Valley
• Digital Experiences
• Artificial Intelligence
• Platforms & Systems
Washington DC
• Security
Dublin
• Artificial Intelligence
Sophia Antipolis
• Industry Innovation (FS & Resources)
Beijing
• Industrial Internet
Bangalore
• Software Engineering
Tel-Aviv
• Security
For more than 20 years, Accenture Labs has served as the tip of the spear for technology innovation at Accenture.
Over the last 5 years Accenture Labs has:
• Supported 300+ client engagements and hosted 1100+ client workshops
• Published 200+ thought leadership pieces, filed 110+ patent applications, and garnered 350+ Tier-1 media hits
Expanding Global Presence
Copyright © 2016 Accenture. All rights reserved.
Security Data Science is Hard
Once the security community moves beyond the mantras “encrypt
everything” and “secure the perimeter,” it can begin developing intelligent
prioritization and response plans to various kinds of breaches – with a strong
focus on integrity.
http://www.wired.com/2015/12/the-cia-secret-to-cybersecurity-that-no-one-seems-to-get/
Right now, financial services firms report that it takes an average of 98 days to detect an
advanced threat, while retailers say it can take about seven months.
The challenge lies in efficiently scaling these technologies for practical
deployment, and making them reliable for large networks. This is where the
security community should focus its efforts.
Research Hypotheses - Architecting the Next Generation Cyber Hunting
Cyber security is a big data problem: the volume and velocity of data from devices require a
new approach that combines all data sources to allow for more intelligent, advanced cyber
security hunting through analytics and exploration at scale across enterprise data.
Visualization will be a key part of cyber hunting because our human eyes and brains are
really good at detecting changes — what’s wrong or different — enabling us to follow the
threat.
Indication of compromise needs to evolve as attacks are becoming more sophisticated,
subtle, and hidden in the massive volume and velocity of data. Combining machine learning,
graph analysis, applied statistics, and deep learning is essential to reduce false positives,
detect threats faster, and empower cyber analysts to be more efficient.
Proprietary and Confidential Property of Accenture
Discover / Incubate / Enable
• Intellectual asset licensing
• Joint Ventures
• Products insourced for scale up
• Intellectual assets insourced for development
• Insourced ideas & technologies
• Out to Market
• Scale
ASGARD
ASGARD Rethinking Cyber Security Analytics Hunting
Streaming
Storage
Analytics
Visualization
Interaction
Innovation Cycle
Architecture
Data
Visualization
Analytics
DATA SCIENCE ARCHITECTURE
Customize,
create, and iterate
STINGER
Project ASGARD – Advanced Security Graph Analytics for Real-time Defense
Building a data driven platform to advance cyber defense beyond any one traditional technology
Accenture Labs ASGARD V1 Platform
Ingest
Event
Processing
Storage
Notebooks
Query Layer
Data
Sources
Visualizations
SQL
Streaming
py
Big Data Cyber Defense is Hard… Really Hard
• Cost Efficiencies
• Lack of Agile Model Development
• Threats disguised as legitimate
• Interconnected Data Problem
• Expanding Attack Surfaces
• Out of Order Events
• Ongoing Privacy Concerns
• Multi-Model Approach
Benchmark: SIEM vs Big Data Solution
SIEM
• Production SIEM of Fortune 500 Enterprise
Big Data Solution
• 10 node cluster - ~$60k in hardware
• Spark 1.6.0
• A query counted as done once its results were available as a Pandas DataFrame
Data
• 450+ columns
• ~250 million events per day
Benchmark: Cost Efficiencies
# | Typical Scenario | Time Period | SIEM | Big Data | Speed Up
1 | Show all network communication from one host (IP) to multiple hosts (IPs) | 1 Day | 3h 20m 13s | 1m 44s | 114 Times Faster
1 | (same) | 1 Week | Not Feasible* | 4m 05s | -
2 | Retrieve failed logon attempts in Active Directory | 1 Day | 18m 26s | 1m 37s | 10 Times Faster
2 | (same) | 1 Week | 2h 13m 45s | 3m 10s | 41 Times Faster
3 | Search for Malware (exe) in Symantec logs | 1 Day | 3h 24m 36s | 1m 37s | 125 Times Faster
3 | (same) | 1 Week | Not Feasible* | 3m 22s | -
4 | View all proxy logs for a specific domain | 1 Day | 4h 30m 13s | 2m 54s | 92 Times Faster
4 | (same) | 1 Week | Not Feasible* | 1m 09s** | -
Notes:
* Client team was unable to run the benchmarks without splitting the query by time units and allocating more
resources to run it; they estimate it would take 20+ hours to complete
** Due to over 1.6 million results, the number of fields returned was reduced from 466 to 10 key fields, resulting in a 5x
speed-up over returning all fields; however, the other columns are still searchable and available within this time
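The Speed Up figures in the benchmark can be reproduced from the reported durations. The sketch below assumes that "N Times Faster" means (SIEM time / Big Data time) minus 1, truncated to a whole number. That reading is an inference from the numbers, not something the slides state, and the helper names are illustrative.

```python
import math


def to_seconds(hours=0, minutes=0, seconds=0):
    """Convert a duration such as 3h 20m 13s to seconds."""
    return hours * 3600 + minutes * 60 + seconds


def times_faster(siem_seconds, big_data_seconds):
    # Assumption: "N times faster" = (old / new) - 1, truncated.
    return math.floor(siem_seconds / big_data_seconds - 1)


# (SIEM duration, Big Data duration) for the rows that report a speed-up.
benchmarks = {
    "scenario 1, 1 day": (to_seconds(3, 20, 13), to_seconds(0, 1, 44)),
    "scenario 2, 1 day": (to_seconds(0, 18, 26), to_seconds(0, 1, 37)),
    "scenario 2, 1 week": (to_seconds(2, 13, 45), to_seconds(0, 3, 10)),
    "scenario 3, 1 day": (to_seconds(3, 24, 36), to_seconds(0, 1, 37)),
    "scenario 4, 1 day": (to_seconds(4, 30, 13), to_seconds(0, 2, 54)),
}

speed_ups = {name: times_faster(*pair) for name, pair in benchmarks.items()}
for name, value in speed_ups.items():
    print(f"{name}: {value} times faster")
```

Under that assumption the computed values match the five speed-ups reported on the slide.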
Multi-Model Approach
No Silver Bullet!!!
Cyber Security is a Connected Data Problem
[Figure: example property graph. Vertices: users (Bob, Jane, John, Fred), IPs (10.0.0.1, 10.1.0.1, 10.0.0.2), hostnames (Comp_1, Comp_2, Comp_3) and a malware signature (Package_1). Edge labels: Assigned_IP, Auth_Success, Auth_Failure, Communicates_With, Associated_With, Detected_On.]
Why Graph Analysis
Graphs represent Cyber Security Data Well
Traversals Faster Than SQL Joins (Efficiency)
More effective at detecting certain types of threat than
other analytical methods for example:
• Fast-flux Botnet Detection
• Lateral Movement within Networks
• Low and slow port scans
• Attack Surface Management Risk
Graph Analysis is Computationally Expensive
On extremely large graphs, Graph Analytics computations can be costly
• Cyber Security needs near real-time graph analytics measures
• Many Graph analytics computations scale nicely to GPUs
Micro-batching data to GPUs gives us regular updates of graph features
• CPU Graph Analytics tools: GraphX
• GPU Analytics Accelerators: cuSTINGER, nvGraph
Out of Order Data made prior graph analysis less accurate (Parquet Limitations)
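The micro-batching idea can be sketched on the CPU. The example below assumes edges arrive as (source, destination) pairs and uses vertex degree as a stand-in graph feature. On the platform described here the per-batch update would run on a GPU library instead. The function names and sample edges are illustrative, not taken from the talk.

```python
from collections import Counter
from itertools import islice


def micro_batches(edges, batch_size):
    """Yield consecutive lists of at most batch_size edges."""
    iterator = iter(edges)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def update_degrees(degrees, batch):
    """Fold one micro-batch of edges into the running degree counts."""
    for source, destination in batch:
        degrees[source] += 1
        degrees[destination] += 1
    return degrees


# Hypothetical edge stream between IP addresses.
edge_stream = [
    ("10.0.0.1", "10.0.0.2"),
    ("10.0.0.1", "10.1.0.1"),
    ("10.0.0.2", "10.1.0.1"),
    ("10.0.0.1", "10.0.0.2"),
    ("10.1.0.1", "10.0.0.1"),
]

degrees = Counter()
snapshots = []  # one feature snapshot per micro-batch
for batch in micro_batches(edge_stream, batch_size=2):
    update_degrees(degrees, batch)
    snapshots.append(dict(degrees))
```

Each entry in `snapshots` is the state of the feature after one batch, which is the "regular updates of graph features" pattern in miniature.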
Why GPU Accelerated Graph Analytics
Use CPU cluster resources for random ad-hoc analysis, streaming, etc.
Repetitive tasks should be more heavily optimized (cyber security is very repetitive)
GPUs scale better with a smaller footprint; more green
Future R&D goals: more ensemble analytical methods on GPU
• Time series => graph analysis => graph feature time series
Accenture Labs ASGARD Platform v2
[Architecture diagram: Data Sources, Ingest, Event Processing, Storage, Query Layer, Notebooks and Visualizations, with SQL, Streaming and py labels, plus a new GPU Layer]
DASL
DataStax Enterprise Graph
[Graphic: logos, including TitanDB, combined into DSE Graph]
DSE Graph with Spark and Gremlin
• OLAP: see what every user did
• OLTP: see what a specific user(s) did
DSE Graph Deployment
Configuration
• 32GB of memory per node
• 4 cores per node
45 Nodes
Spark Streaming v. Flink Deployment
Configuration
• 4GB of memory per executor
• 1 core per executor
18 Executors + 1 Master
Spark Streaming v. Flink Schema
Vertex labels: IP, USER, MSG
g.V().hasLabel("label").has("key", "value").tryNext()
.orElseGet{g.addV("label").property("key", "value").next()}
• 1 Read for Existing Vertex
• Up to 3 Reads for Existing Vertices
• 2 Writes to Add New Vertices
• Up to 3 Writes to Add New Vertices
• 6 New Edges Written for Each Event
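The get-or-create pattern in the Gremlin snippet, and the read and write counts it implies per event, can be modeled with a small in-memory stand-in. This is a simplified sketch, not DSE Graph code. `InMemoryGraph`, `ingest_event`, the property keys, and the `linked_to` edge label are hypothetical. It also assumes the six edges per event are the directed pairs among the three vertices, which the slides do not spell out.

```python
from itertools import permutations


class InMemoryGraph:
    """Toy graph store that counts vertex reads and writes."""

    def __init__(self):
        self.vertices = {}
        self.edges = []
        self.reads = 0
        self.writes = 0

    def get_or_create(self, label, key, value):
        # One lookup per vertex, like tryNext() in the Gremlin snippet.
        self.reads += 1
        vertex_id = (label, key, value)
        if vertex_id not in self.vertices:
            # Only a miss costs a write, like orElseGet { addV(...) }.
            self.writes += 1
            self.vertices[vertex_id] = {"label": label, key: value}
        return vertex_id

    def add_edge(self, source, target, label="linked_to"):
        self.edges.append((source, label, target))


def ingest_event(graph, ip, user, msg):
    """Resolve the three vertices of an event, then connect them."""
    ids = [
        graph.get_or_create("IP", "address", ip),
        graph.get_or_create("USER", "name", user),
        graph.get_or_create("MSG", "id", msg),
    ]
    # Assumption: six edges = every ordered pair of the three vertices.
    for source, target in permutations(ids, 2):
        graph.add_edge(source, target)


graph = InMemoryGraph()
ingest_event(graph, "10.0.0.1", "Bob", "msg-1")  # all three vertices are new
ingest_event(graph, "10.0.0.1", "Bob", "msg-2")  # only the MSG vertex is new
```

In this model the first event costs three reads and three writes, while the second costs three reads and a single write, which is why a warm graph is cheaper to stream into than an empty one.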
Spark Streaming => DSE Graph
• Initialize Accumulators
• kafkaDirectStream
• map: data transformations
• foreachPartition
  • Initialize DSE
  • Initialize Semaphore
  • foreach: async graph queries
  • Check Semaphore
  • Close DSE
Flink => DSE Graph
• KafkaConsumer
• map: data transformations
• DataSink
  • open: Initialize DSE, Initialize Semaphore, Initialize Accumulators
  • invoke: async graph queries
  • close: Check Semaphore, Close DSE
Asynchronous Graph Queries
Pre-Query
• Structured Data
• Create Graph Query
• Acquire Semaphore Permit
Query
• Execute Query Asynchronously
Post-Query
• Success Callback: Release Semaphore Permit, Increment Success Accumulator
• Failure Callback: Release Semaphore Permit, Increment Failure Accumulator
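The acquire, execute, callback, release flow can be sketched with the Python standard library. This is an illustration of the pattern only, not the DSE driver API. `BoundedAsyncWriter` and `fake_query` are hypothetical names, and a thread pool stands in for the driver's asynchronous execution.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class BoundedAsyncWriter:
    """Run queries asynchronously with at most max_in_flight pending."""

    def __init__(self, execute_query, max_in_flight, workers=4):
        self._execute_query = execute_query
        self._max_in_flight = max_in_flight
        self._permits = threading.Semaphore(max_in_flight)
        self._pool = ThreadPoolExecutor(max_workers=workers)
        self._lock = threading.Lock()  # counters are shared across threads
        self.successes = 0
        self.failures = 0

    def submit(self, query):
        # Pre-query: block here once max_in_flight queries are pending.
        self._permits.acquire()
        future = self._pool.submit(self._execute_query, query)
        future.add_done_callback(self._on_done)

    def _on_done(self, future):
        # Post-query: count the outcome, then always return the permit.
        try:
            failed = future.exception() is not None
            with self._lock:
                if failed:
                    self.failures += 1
                else:
                    self.successes += 1
        finally:
            self._permits.release()

    def close(self):
        # "Check Semaphore": wait until every permit has been returned.
        for _ in range(self._max_in_flight):
            self._permits.acquire()
        self._pool.shutdown(wait=True)


def fake_query(query):
    """Stand-in for a graph query; rejects anything marked 'bad'."""
    if "bad" in query:
        raise RuntimeError("query rejected")
    return "ok"


writer = BoundedAsyncWriter(fake_query, max_in_flight=3)
for i in range(10):
    writer.submit("bad query" if i % 5 == 0 else f"query {i}")
writer.close()
```

The semaphore provides back-pressure so that a fast stream cannot flood the graph database, and the lock keeps the success and failure counters consistent when callbacks fire on different threads.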
Spark Streaming v. Flink Benchmark
DSE Graph Insertion with Semaphores (~9.7 million events), RunTime (min)
Semaphores | 1 | 25 | 50 | 100 | 200 | 300 | 1000
Spark v1.6.2 | 79.3 | 11.8 | 9.5 | 9.6 | 10.5 | 13.6 | 10.3
Flink v1.1.1 | 77.3 | 10.0 | 9.4 | 8.8 | 10.1 | 10.0 | 10.4
[Chart annotations: regions marked SLOW and FAILURES; DSE Graph Query Execution marked Synchronous vs. Asynchronous]
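The run times translate into insertion throughput. The sketch below assumes the event count is exactly 9.7 million (the slide says "~9.7 million") and reads the column values 1 to 1000 as the number of semaphore permits, which is an interpretation of the chart rather than something it states.

```python
EVENTS = 9_700_000  # assumption: "~9.7 million events" taken as exact

# Run time in minutes, keyed by the chart's x-axis value.
run_time_min = {
    "Spark v1.6.2": {1: 79.3, 25: 11.8, 50: 9.5, 100: 9.6,
                     200: 10.5, 300: 13.6, 1000: 10.3},
    "Flink v1.1.1": {1: 77.3, 25: 10.0, 50: 9.4, 100: 8.8,
                     200: 10.1, 300: 10.0, 1000: 10.4},
}


def events_per_second(minutes):
    """Average insertion rate for a run that took the given minutes."""
    return EVENTS / (minutes * 60)


throughput = {
    engine: {setting: events_per_second(minutes)
             for setting, minutes in runs.items()}
    for engine, runs in run_time_min.items()
}

# The x-axis value with the shortest run time for each engine.
fastest = {engine: min(runs, key=runs.get)
           for engine, runs in run_time_min.items()}

for engine, setting in fastest.items():
    rate = throughput[engine][setting]
    print(f"{engine}: fastest at {setting} ({rate:,.0f} events/s)")
```

On these numbers both engines go from roughly 2,000 events per second at a setting of 1 to well over 15,000 events per second once queries overlap.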
DSE Graph Troubleshooting
Use DataStax Studio and profiling to build your insertions / lookups
• DataStax Studio profiling is much nicer to work with than the Gremlin shell
• Ensure you’re hitting indexes wherever possible
• Scanning thousands of vertices / edges kills performance
Monitor CPU and Disk I/O
• OpsCenter gives a great view on resource usage
• Our workload was CPU heavy, and when CPU load was too high we saw reduced throughput
and/or failed queries
Streaming DSE Graph Limitations and Lessons Learned
DSE Graph does not yet support prepared statements or batches
Spark & Flink Accumulators are non-atomic within executors
Flink Kafka Consumer pushes offsets to Zookeeper
• To read from the beginning of the stream, set "auto.offset.reset" to "smallest" and set
"group.id" to a random value
Spark Kafka Consumer doesn’t push offsets to Zookeeper
• Treat offset data as time series and store data in Cassandra
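The replay-from-the-beginning tip can be expressed as a small property builder. This is a sketch only: the two property keys and the "smallest" value come from the slide, while the function name, the group prefix, and the use of a UUID as the random value are assumptions. Connection settings are left out on purpose.

```python
import uuid


def replay_from_start_properties(group_prefix="replay"):
    """Build consumer properties that force a read from the earliest offset.

    A fresh random group.id means no committed offsets exist for the group,
    so the auto.offset.reset policy decides where reading starts.
    """
    return {
        "auto.offset.reset": "smallest",
        "group.id": f"{group_prefix}-{uuid.uuid4()}",
    }


first = replay_from_start_properties()
second = replay_from_start_properties()
print(first["group.id"], second["group.id"])
```

Generating a new group id for every run trades offset tracking for repeatability, which suits benchmarking but not a production consumer.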
Blazegraph DASL
Out of Order Data - The Problem
Date | Events
Prior to 8/6/16 | 9,411
8/6/16 | 1
8/7/16 | 56
8/8/16 | 4,484
8/9/16 | 83,465
8/10/16 | 498,093
8/11/16 | 163,179,386
8/12/16 | 35,569,614
Total Events | 199,344,510
[Slide callout: Received On]
• Build a Daily Parquet
• One Day of Data Spread Across Several Days
• Need to Scan Week of Data for One Day
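The scan problem can be illustrated with a few lines of Python. The counts below are copied from the slide. The sample events, the (event date, received date) layout, and the helper name are hypothetical, and the sketch assumes daily files are built by received date.

```python
from collections import defaultdict
from datetime import date

# Event counts per date, as listed on the slide.
slide_counts = {
    "Prior to 8/6/16": 9_411,
    "8/6/16": 1,
    "8/7/16": 56,
    "8/8/16": 4_484,
    "8/9/16": 83_465,
    "8/10/16": 498_093,
    "8/11/16": 163_179_386,
    "8/12/16": 35_569_614,
}
total_events = sum(slide_counts.values())


def files_needed(events, event_day):
    """Return the daily files (keyed by received date) holding event_day."""
    event_days_by_file = defaultdict(set)
    for event_date, received_date in events:
        event_days_by_file[received_date].add(event_date)
    return sorted(received for received, days in event_days_by_file.items()
                  if event_day in days)


# Hypothetical (event date, received date) pairs with late arrivals.
sample_events = [
    (date(2016, 8, 10), date(2016, 8, 10)),
    (date(2016, 8, 10), date(2016, 8, 11)),
    (date(2016, 8, 10), date(2016, 8, 12)),
    (date(2016, 8, 11), date(2016, 8, 11)),
    (date(2016, 8, 11), date(2016, 8, 12)),
]

needed = files_needed(sample_events, date(2016, 8, 10))
print(f"{total_events:,} events in total; files to scan: {len(needed)}")
```

A query for a single event day has to open every daily file that might contain a late arrival, which is the cost a store with in-place inserts avoids.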
Out of Order Data – The Solution
Apache Kudu
• Columnar storage manager developed by Cloudera for the Hadoop ecosystem
• Supports
• Fast inserts/updates
• Efficient columnar scans
Benefits
• Allows us to insert data in real time
• Allows us to scan a true day of data more efficiently than Parquet
Development Time
• Started with Kudu 0.10; now at 1.0.0
• 2 weeks to get up and running
Where Does Kudu Fit?
[Figure: storage options plotted by scan speed against random access speed]
Kudu - Deployment and Data Size
18 Tablet Servers + 1 Master
Configuration
• 16GB of memory
• 8GB of block caching*
• Co-located with Spark/HDFS
Data
• 7 days of events
• 1.11 billion rows
• 70 columns
• ~155GB raw
Kudu - Schema
Encoding
• BIT_SHUFFLE encoding on all INTEGER, DOUBLE, FLOAT columns
• DICT_ENCODING on all STRING columns*
Compression
• DEFAULT_COMPRESSION set to NO_COMPRESSION for testing
• Experiments with SNAPPY_COMPRESSION successfully reduced storage costs
Partitioning
• Hash bucketing on 5 PRIMARY KEYS
  • YEAR, MONTH, DAY combination to reduce tablets scanned
  • HOUR to further distribute a day of data and allow for search
  • EVENT_ID unique identifier to distribute data evenly
• 144 tablets per day; plan to test more partitioning in the future
Kudu - Upsert Performance with 1440 Tablets
[Chart: upserts per second over time (y-axis 0 to 900,000) with an exponential trend line; sampled every 10 seconds with Impala]
• 1400 Spark Tasks
• 160 Executors
• ~90 mins
Kudu v. Parquet Performance in Spark
Avg. query time (s) per scenario; every query also filters on year, month, day
# | Query scenario | Predicate | Size | Parquet | Kudu
1 | One source IP to three destination IPs | ( s = IP AND ( d = IP OR d = IP OR d = IP )) | 330 | 61.8 | 7.9
2 | Failed logins of a particular user | ( et = FL AND user = Bob ) | 180 | 55.3 | 5.0
3 | Specific firewall server events | ( dv = FW AND dcs = Server ) | 150 | 55.2 | 2.9
4 | All traffic to a specific domain | ( hst = Host ) | 3k | 55.0 | 2.4
Kudu - Impact of Operators within Spark (as of v1.0.0)
Supported Predicate Pushdown
• =, <=, >=, BETWEEN
• Scans tablets within Kudu = FAST results
Unsupported Predicate Pushdown
• NULL, NOT NULL, <>, OR, LIKE, IN
• Optimizer tries to use any supported predicates to limit table scans
• Then pushes all data to the Spark tasks for comparison
• Data movement slows down query response time
Source: https://github.com/cloudera/kudu/blob/master/docs/developing.adoc
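The effect of the operator lists can be sketched as a simple classifier. This is not the Spark or Kudu optimizer. It only mirrors the supported list on the slide for Kudu 1.0.0, and the tuple format and function name are assumptions made for illustration.

```python
# Operators the slide lists as pushed down to Kudu (as of v1.0.0).
PUSHED_DOWN = {"=", "<=", ">=", "BETWEEN"}


def split_predicates(predicates):
    """Split (column, operator, value) triples by where they get evaluated."""
    kudu_side, spark_side = [], []
    for column, operator, value in predicates:
        target = kudu_side if operator.upper() in PUSHED_DOWN else spark_side
        target.append((column, operator, value))
    return kudu_side, spark_side


# Hypothetical query: one day of failed logins for admin-like users.
query = [
    ("year", "=", 2016),
    ("month", "=", 8),
    ("day", "=", 11),
    ("et", "=", "FL"),
    ("user", "LIKE", "Ad%"),
]

kudu_side, spark_side = split_predicates(query)
print(f"{len(kudu_side)} predicates limit the Kudu scan")
print(f"{len(spark_side)} predicates are evaluated in Spark after data movement")
```

Every predicate that lands in the second list forces the matching rows of the scan to travel to Spark before they can be filtered.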
Testing Limitations of Kudu in Spark
Avg. query time (s) per scenario; every query also filters on year, month, day
# | Query scenario | Predicate | Size | Parquet | Kudu = | Kudu LIKE
1 | Subnet source IP to three destination IPs | ( s LIKE IP% AND ( d = IP OR d = IP OR d = IP )) | 18k | 61.8 | 7.9 | 224.1
2 | Failed logins of admin users | ( et = FL AND user LIKE Ad% ) | 500 | 55.3 | 5.0 | 132.0
3 | Any firewall server events | ( dv = FW AND dcs LIKE %Server% ) | 260k | 55.2 | 2.9 | 174.8
4 | All traffic to domain & subdomains | ( hst LIKE %Host ) | 6k | 55.0 | 2.4 | 194.5
Kudu with LIKE takes 2.4x - 3.6x longer than Parquet
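The ratios behind the charts can be checked directly from the reported averages. The sketch below uses only the numbers shown on the slides, and the variable names are illustrative.

```python
# Average query times in seconds for scenarios 1 to 4, from the charts.
parquet = [61.8, 55.3, 55.2, 55.0]
kudu_equals = [7.9, 5.0, 2.9, 2.4]
kudu_like = [224.1, 132.0, 174.8, 194.5]

# How many times faster Kudu is than Parquet with equality predicates.
speed_up = [round(p / k, 1) for p, k in zip(parquet, kudu_equals)]

# How many times longer Kudu with LIKE takes compared with Parquet.
slow_down = [round(like / p, 1) for like, p in zip(kudu_like, parquet)]

for scenario, (fast, slow) in enumerate(zip(speed_up, slow_down), start=1):
    print(f"Scenario {scenario}: Kudu = is {fast}x faster, "
          f"Kudu LIKE is {slow}x slower than Parquet")
```

The slow-down range agrees with the 2.4x to 3.6x figure on the slide, while equality predicates come out roughly 8x to 23x faster than Parquet.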
Stream Data Pipeline
• Ingest
• Stream Processing
• Streaming Temporary Storage
• GPU Graph
• DASL
• Historical Storage
Kudu Brings Big Data to GPU Analytics (cuSTINGER & Blazegraph DASL)
• Kudu provides Spark Data Frames for Blazegraph DASL
• Kudu has low-level C++ APIs
• Support micro-batching use case for updates to GPUs
Accenture Labs ASGARD Platform v2
[Architecture diagram repeated, with the GPU Layer labeled cuSTINGER]
More Lessons Learned
Initial thought: Alluxio would serve as a caching layer
• Spark support was lacking
• Little performance improvement vs. Parquet on HDFS
• Kudu was the better choice for our needs
KUDU-1651
• All-NULL values in dictionary-encoded blocks throw an error in Kudu 1.0.0
• Fixed in master branch!
Cannot change partitioning
• Planned to be supported in future releases
Future Research Work
Tackling
• Streaming Analytics (out-of-order)
• Complex Event Processing
• Managing Graph Insertions and Deletions in real time
GPU analytics
• Unified memory and page faulting on the new NVIDIA Pascal
GPUs solve a previous problem with maximum graph size
• NVIDIA DGX-1
Deep Learning
Thanks!
Cloudera
• Dan Burkert
• Todd Lipcon
DataStax
• Rob Murphy
• Jonathan Shook
Questions?
• Josh Patterson, Principal Data Scientist, @datametrician
• Keith Kraus, Associate Principal Engineer, @keithjkraus
• Mike Wendt, Principal Engineer, @mike_wendt
[CONFidence 2016] Gaweł Mikołajczyk - Making sense out of the Security Operat...
 
Bridging the Gap: Analyzing Data in and Below the Cloud
Bridging the Gap: Analyzing Data in and Below the CloudBridging the Gap: Analyzing Data in and Below the Cloud
Bridging the Gap: Analyzing Data in and Below the Cloud
 
Preparing for the Cybersecurity Renaissance
Preparing for the Cybersecurity RenaissancePreparing for the Cybersecurity Renaissance
Preparing for the Cybersecurity Renaissance
 
2019 Performance Monitoring and Management Trends and Insights
2019 Performance Monitoring and Management Trends and Insights2019 Performance Monitoring and Management Trends and Insights
2019 Performance Monitoring and Management Trends and Insights
 
There are 250 Database products, are you running the right one?
There are 250 Database products, are you running the right one?There are 250 Database products, are you running the right one?
There are 250 Database products, are you running the right one?
 
2024 February 28 - NYC - Meetup Unlocking Financial Data with Real-Time Pipel...
2024 February 28 - NYC - Meetup Unlocking Financial Data with Real-Time Pipel...2024 February 28 - NYC - Meetup Unlocking Financial Data with Real-Time Pipel...
2024 February 28 - NYC - Meetup Unlocking Financial Data with Real-Time Pipel...
 
Big Data - A Real Life Revolution
Big Data - A Real Life RevolutionBig Data - A Real Life Revolution
Big Data - A Real Life Revolution
 
D3SF17- Improving Our China Clients Performance
D3SF17- Improving Our China Clients PerformanceD3SF17- Improving Our China Clients Performance
D3SF17- Improving Our China Clients Performance
 

Dernier

Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdfKantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdfSocial Samosa
 
From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...Florian Roscheck
 
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝DelhiRS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhijennyeacort
 
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...limedy534
 
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档208367051
 
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...Jack DiGiovanna
 
9654467111 Call Girls In Munirka Hotel And Home Service
9654467111 Call Girls In Munirka Hotel And Home Service9654467111 Call Girls In Munirka Hotel And Home Service
9654467111 Call Girls In Munirka Hotel And Home ServiceSapana Sha
 
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一fhwihughh
 
Top 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In QueensTop 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In Queensdataanalyticsqueen03
 
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样vhwb25kk
 
RadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdfRadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdfgstagge
 
GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]📊 Markus Baersch
 
2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING
2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING
2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSINGmarianagonzalez07
 
MK KOMUNIKASI DATA (TI)komdat komdat.docx
MK KOMUNIKASI DATA (TI)komdat komdat.docxMK KOMUNIKASI DATA (TI)komdat komdat.docx
MK KOMUNIKASI DATA (TI)komdat komdat.docxUnduhUnggah1
 
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.pptdokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.pptSonatrach
 
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...Boston Institute of Analytics
 
Call Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts ServiceCall Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts ServiceSapana Sha
 
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改yuu sss
 
Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024Colleen Farrelly
 

Dernier (20)

Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdfKantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
 
From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...
 
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝DelhiRS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
 
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
 
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
 
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
 
9654467111 Call Girls In Munirka Hotel And Home Service
9654467111 Call Girls In Munirka Hotel And Home Service9654467111 Call Girls In Munirka Hotel And Home Service
9654467111 Call Girls In Munirka Hotel And Home Service
 
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
 
Top 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In QueensTop 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In Queens
 
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
 
RadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdfRadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdf
 
GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]
 
2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING
2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING
2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING
 
MK KOMUNIKASI DATA (TI)komdat komdat.docx
MK KOMUNIKASI DATA (TI)komdat komdat.docxMK KOMUNIKASI DATA (TI)komdat komdat.docx
MK KOMUNIKASI DATA (TI)komdat komdat.docx
 
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.pptdokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
 
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
 
Call Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts ServiceCall Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts Service
 
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
 
Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024
 
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
 

Streaming Cyber Security into Graph: Accelerating Data into DataStax Graph and Blazegraph

  • 1. Spotting the trends by looking at the big picture. Streaming Cyber Security into Graph: Accelerating Data into DataStax Graph and Blazegraph
  • 2. Expanding Global Presence
      • Silicon Valley: Digital Experiences; Artificial Intelligence; Platforms & Systems
      • Washington DC: Security
      • Dublin: Artificial Intelligence
      • Sophia Antipolis: Industry Innovation (FS & Resources)
      • Beijing: Industrial Internet
      • Bangalore: Software Engineering
      • Tel-Aviv: Security
      For more than 20 years, Accenture Labs has served as the tip of the spear for technology innovation at Accenture. Over the last 5 years Accenture Labs has:
      • Supported 300+ client engagements and hosted 1100+ client workshops
      • Published 200+ thought leadership pieces, filed 110+ patent applications, and garnered 350+ Tier-1 media hits
      Copyright © 2016 Accenture All rights reserved.
  • 3. Security Data Science is Hard. Once the security community moves beyond the mantras “encrypt everything” and “secure the perimeter,” it can begin developing intelligent prioritization and response plans to various kinds of breaches, with a strong focus on integrity. (Source: http://www.wired.com/2015/12/the-cia-secret-to-cybersecurity-that-no-one-seems-to-get/) Right now, financial services firms report that it takes an average of 98 days to detect an advanced threat, while retailers say it can take about seven months.
  • 4. Security Data Science is Hard. The challenge lies in efficiently scaling these technologies for practical deployment, and making them reliable for large networks. This is where the security community should focus its efforts. (Source: http://www.wired.com/2015/12/the-cia-secret-to-cybersecurity-that-no-one-seems-to-get/) Right now, financial services firms report that it takes an average of 98 days to detect an advanced threat, while retailers say it can take about seven months.
  • 5. Research Hypotheses - Architecting the Next Generation Cyber Hunting
      • Cyber security is a big data problem: the volume and velocity of data from devices require a new approach that combines all data sources to allow for more intelligent, advanced cyber security hunting through analytics and exploration at scale across enterprise data.
      • Visualization will be a key part of cyber hunting because our human eyes and brains are really good at detecting changes (what's wrong or different), enabling us to follow the threat.
      • Indicators of compromise need to evolve as attacks are becoming more sophisticated, subtle, and hidden in the massive volume and velocity of data.
      • Combining machine learning, graph analysis, applied statistics, and deep learning is essential to reduce false positives, detect threats faster, and empower cyber analysts to be more efficient.
      Proprietary and Confidential Property of Accenture
  • 6. ASGARD: Rethinking Cyber Security Analytics. [Diagram labels: Discover, Incubate, Enable, Scale, Out to Market; insourced ideas & technologies; intellectual assets insourced for development; products insourced for scale-up; joint ventures; intellectual asset licensing.] Areas: Hunting, Streaming, Storage, Analytics, Visualization, Interaction.
  • 7. Innovation Cycle. [Diagram labels: Architecture, Data, Visualization, Analytics; Data Science Architecture.] Customize, create, and iterate.
  • 8. Project ASGARD: Advanced Security Graph Analytics for Real-time Defense. Building a data-driven platform to advance cyber defense beyond any one traditional technology. [Logo: STINGER]
  • 9. Accenture Labs ASGARD V1 Platform. [Architecture diagram labels: Ingest, Event Processing, Storage, Notebooks, Query Layer, Data Sources, Visualizations, SQL, Streaming, py.]
  • 10. Big Data Cyber Defense is Hard… Really Hard. Challenges: Cost Efficiencies; Lack of Agile Model Development; Threats Disguised as Legitimate; Interconnected Data Problem; Expanding Attack Surfaces; Out of Order Events; Ongoing Privacy Concerns; Multi-Model Approach.
  • 13. SIEM Benchmark (Cost Efficiencies): production SIEM of a Fortune 500 enterprise vs. Big Data solution
      • Data: 450+ columns; ~250 million events per day
      • Big Data solution: 10-node cluster (~$60k in hardware); Spark 1.6.0; a query was counted as done when its data was available as a Pandas DataFrame
  • 14. Benchmark: typical scenarios, SIEM vs. Big Data
      • Scenario 1: Show all network communication from one host (IP) to multiple hosts (IPs)
          1 Day: SIEM 3h 20m 13s; Big Data 1m 44s; 114 times faster
          1 Week: SIEM not feasible*; Big Data 4m 05s
      • Scenario 2: Retrieve failed logon attempts in Active Directory
          1 Day: SIEM 18m 26s; Big Data 1m 37s; 10 times faster
          1 Week: SIEM 2h 13m 45s; Big Data 3m 10s; 41 times faster
      • Scenario 3: Search for malware (exe) in Symantec logs
          1 Day: SIEM 3h 24m 36s; Big Data 1m 37s; 125 times faster
          1 Week: SIEM not feasible*; Big Data 3m 22s
      • Scenario 4: View all proxy logs for a specific domain
          1 Day: SIEM 4h 30m 13s; Big Data 2m 54s; 92 times faster
          1 Week: SIEM not feasible*; Big Data 1m 09s**
      Notes:
      * The client team was unable to run the benchmarks without splitting the query by time units and allocating more resources to run it; they estimate it would take 20+ hours to complete.
      ** Because there were over 1.6 million results, the number of fields returned was reduced from 466 to 10 key fields, resulting in a 5x speed-up over returning all fields; the other columns are still searchable and available within this time.
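The speed-up column on slide 14 can be reproduced from the timings. The sketch below is a minimal Python check, assuming (this is inferred from the numbers, not stated on the slide) that "N times faster" means the SIEM time minus the Big Data time, divided by the Big Data time, rounded down. The function names are illustrative only.

```python
import re

def to_seconds(duration: str) -> int:
    """Convert a string such as '3h 20m 13s' to whole seconds."""
    units = {"h": 3600, "m": 60, "s": 1}
    return sum(int(n) * units[u] for n, u in re.findall(r"(\d+)([hms])", duration))

def times_faster(siem: str, big_data: str) -> int:
    """Assumed definition: floor((t_siem - t_big_data) / t_big_data)."""
    slow, fast = to_seconds(siem), to_seconds(big_data)
    return (slow - fast) // fast

# (SIEM time, Big Data time) for the rows where both timings were reported.
ROWS = [
    ("3h 20m 13s", "1m 44s"),   # scenario 1, 1 day
    ("18m 26s", "1m 37s"),      # scenario 2, 1 day
    ("2h 13m 45s", "3m 10s"),   # scenario 2, 1 week
    ("3h 24m 36s", "1m 37s"),   # scenario 3, 1 day
    ("4h 30m 13s", "2m 54s"),   # scenario 4, 1 day
]

speedups = [times_faster(siem, big) for siem, big in ROWS]
print(speedups)
```

Under that assumed definition the computed values agree with the five speed-up figures reported on the slide; a plain ratio of the two times would be one higher in each case.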
  • 15. Multi-Model Approach
  • 16. Multi-Model Approach: No Silver Bullet!!!
  • 18. Cyber Security is a Connected Data Problem (Interconnected Data Problem). [Graph diagram. Vertices: User: Bob, User: Jane, User: John, User: Fred; IP: 10.0.0.1, IP: 10.1.0.1, IP: 10.0.0.2; Hostname: Comp_1, Comp_2, Comp_3; Malware Sig: Package_1. Edge labels: Assigned_IP, Auth_Success, Auth_Failure, Communicates_With, Associated_With, Detected_On.]
  • 19. Why Graph Analysis
      • Graphs represent cyber security data well
      • Traversals are faster than SQL joins (efficiency)
      • More effective at detecting certain types of threat than other analytical methods, for example: fast-flux botnet detection; lateral movement within networks; low and slow port scans; attack surface management
      [Diagram labels: Risk, Infection, Signatures, IP Address, Users.]
  • 20. Graph Analysis is Computationally Expensive (Out of Order Events)
      • On extremely large graphs, graph analytics computations can be costly
      • Cyber security needs near real-time graph analytics measures
      • Many graph analytics computations scale nicely to GPUs
      • Micro-batching data to GPUs gives us regular updates of graph features
      • CPU graph analytics tools: GraphX. GPU analytics accelerators: cuSTINGER, nvGraph
      • Out-of-order data made prior graph analysis less accurate (Parquet limitations)
  • 21. Why GPU Accelerated Graph Analytics
      • Use CPU cluster resources for random ad-hoc analysis, streaming, etc.
      • Repetitive tasks should be more optimized (cyber security is very repetitive)
      • GPUs scale better with a smaller footprint; more green
      • Future R&D goals: more ensemble analytical methods on GPU (time series => graph analysis => graph feature time series)
  • 23. Accenture Labs ASGARD Platform v2. [Architecture diagram labels: Ingest, Event Processing, Storage, Notebooks, Query Layer, Data Sources, Visualizations, SQL, Streaming, py, GPU Layer, DASL.]
  • 24. [Logo slide: DataStax Enterprise Graph, TitanDB Graph, "+ =".]
  • 25. DSE Graph with Spark and Gremlin
      • OLAP: see what every user did
      • OLTP: see what a specific user (or users) did
  • 27. DSE Graph Deployment: 45 nodes. Configuration: 32GB of memory per node; 4 cores per node.
  • 28. Spark Streaming v. Flink Deployment: 1 master + 18 executors. Configuration: 4GB of memory per executor; 1 core per executor.
  • 29. Spark Streaming v. Flink Schema. [Schema diagram: USER.] Up to 3 reads for existing vertices.
  • 30. Spark Streaming v. Flink Schema. [Schema diagram: USER.] g.V().hasLabel("label").has("key", "value").tryNext().orElseGet{g.addV("label").property("key", "value").next()} 1 read for an existing vertex; up to 3 reads for existing vertices.
  • 31. Spark Streaming v. Flink Schema. [Schema diagram: IP, USER, MSG.] g.V().hasLabel("label").has("key", "value").tryNext().orElseGet{g.addV("label").property("key", "value").next()} Up to 3 writes to add new vertices; 2 writes to add new vertices.
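The Gremlin pattern on slides 30 and 31 is a get-or-create: look the vertex up once, and write only when the lookup misses. The sketch below illustrates that idea with a hypothetical in-memory stand-in written in Python; `TinyGraph` and its counters are not part of DSE Graph or TinkerPop, and exist only to make the read and write counts visible.

```python
class TinyGraph:
    """Hypothetical in-memory stand-in for a graph store; counts reads and writes."""

    def __init__(self):
        self.index = {}   # (label, key, value) -> vertex id
        self.reads = 0
        self.writes = 0

    def get_or_create(self, label, key, value):
        """One indexed lookup; a write happens only when the vertex is missing."""
        self.reads += 1
        vertex_id = self.index.get((label, key, value))
        if vertex_id is None:
            self.writes += 1
            vertex_id = len(self.index) + 1
            self.index[(label, key, value)] = vertex_id
        return vertex_id


graph = TinyGraph()
first = graph.get_or_create("user", "name", "bob")    # miss: 1 read + 1 write
second = graph.get_or_create("user", "name", "bob")   # hit: 1 read, no write
print(first == second, graph.reads, graph.writes)
```

In a real store the lookup is only cheap if it hits an index, and two concurrent writers can still race between the read and the write; this sketch ignores both concerns.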
  • 32. Spark Streaming v. Flink Schema. [Schema diagram: IP, USER, MSG.] 6 new edges written for each event.
  • 33. Spark Streaming => DSE Graph. [Flow diagram labels: kafkaDirectStream; map (data transformations); foreachPartition (Initialize DSE, Initialize Semaphore, Check Semaphore, Close DSE); foreach (async graph queries); Initialize Accumulators.]
  • 34. Flink => DSE Graph. [Flow diagram labels: KafkaConsumer; map (data transformations); DataSink: open (Initialize DSE, Initialize Semaphore, Initialize Accumulators); invoke (async graph queries); close (Check Semaphore, Close DSE).]
  • 35. Asynchronous Graph Queries
      • Pre-Query: structured data; create graph query; acquire semaphore permit
      • Query: execute query asynchronously
      • Post-Query: success callback (release semaphore permit, increment success accumulator) or failure callback (release semaphore permit, increment failure accumulator)
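The pre-query, query, post-query flow on slide 35 can be sketched in plain Python. This is a minimal illustration using only the standard library, not the Spark or Flink code from the talk: `run_queries`, `fake_execute`, and the plain dictionary standing in for the accumulators are all hypothetical names, and a thread pool stands in for the asynchronous graph driver.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

def run_queries(events, execute, max_in_flight=50, workers=8):
    """Submit one query per event, keeping at most max_in_flight pending."""
    permits = threading.BoundedSemaphore(max_in_flight)
    lock = threading.Lock()
    counts = {"success": 0, "failure": 0}   # stand-in for the accumulators

    def on_done(future):
        # Post-query: record the outcome, then release the permit.
        outcome = "failure" if future.exception() else "success"
        with lock:
            counts[outcome] += 1
        permits.release()

    with ThreadPoolExecutor(max_workers=workers) as pool:
        for event in events:
            query = f"upsert {event}"   # pre-query: build the graph query
            permits.acquire()           # blocks while the limit is reached
            pool.submit(execute, query).add_done_callback(on_done)
    # Leaving the with-block waits for all submitted queries to finish.
    return counts

def fake_execute(query):
    """Hypothetical query executor that fails for events marked 'bad'."""
    if "bad" in query:
        raise RuntimeError("simulated query failure")
    return "ok"

result = run_queries(["e1", "bad", "e3", "e4"], fake_execute, max_in_flight=2)
print(result)
```

The semaphore size is the tuning knob benchmarked on the following slides: too few permits serializes the inserts, while too many lets the client outrun the cluster.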
  • 36. Spark Streaming v. Flink Benchmark: DSE Graph Insertion with Semaphores (~9.7 million events). Run time in minutes for each x-axis value (presumably the semaphore permit count):
      x-axis value:   1     25    50    100   200   300   1000
      Spark v1.6.2:   79.3  11.8  9.5   9.6   10.5  13.6  10.3
      Flink v1.1.1:   77.3  10.0  9.4   8.8   10.1  10.0  10.4
  • 37. Spark Streaming v. Flink Benchmark: same chart as slide 36, annotated with SLOW and FAILURES regions.
  • 38. DSE Graph Query Execution. [Diagram labels: Asynchronous, Synchronous*.]
  • 39. DSE Graph Troubleshooting: use DataStax Studio and profiling to build your insertions and lookups
      • DataStax Studio profiling is much nicer to work with than the Gremlin shell
      • Ensure you're hitting indexes wherever possible
      • Scanning thousands of vertices or edges kills performance
  • 40. DSE Graph Troubleshooting: monitor CPU and disk I/O
      • OpsCenter gives a great view of resource usage
      • Our workload was CPU heavy; when CPU load was too high we saw reduced throughput and/or failed queries
  • 41. Streaming DSE Graph Limitations and Lessons Learned
      • DSE Graph does not yet support prepared statements or batches
      • Spark and Flink accumulators are non-atomic within executors
      • The Flink Kafka consumer pushes offsets to ZooKeeper. To read from the beginning of the stream, set "auto.offset.reset" to "smallest" and set "group.id" to a random value
      • The Spark Kafka consumer doesn't push offsets to ZooKeeper. Treat offset data as a time series and store it in Cassandra
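The replay tip on slide 41 works because a consumer group that has never committed offsets falls back to the reset policy. A minimal sketch of building those two properties in Python follows; only the two keys and the "smallest" value come from the slide, while the helper name and the group-id prefix are hypothetical. Note that "smallest" is the value used by the older Kafka consumer; newer consumers name the same policy "earliest".

```python
import uuid

def replay_from_start_properties(prefix="replay"):
    """Consumer properties that force a read from the beginning of the stream."""
    return {
        # Applied only when the group has no committed offset.
        "auto.offset.reset": "smallest",
        # A fresh random group id guarantees there is no committed offset.
        "group.id": f"{prefix}-{uuid.uuid4()}",
    }

props = replay_from_start_properties()
other = replay_from_start_properties()
print(props["auto.offset.reset"], props["group.id"] != other["group.id"])
```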
  • 42. DSE Graph with Spark and Gremlin
      • OLAP: see what every user did
      • OLTP: see what a specific user (or users) did
  • 43. Graph Analysis is Computationally Expensive
      • On extremely large graphs, graph analytics computations can be costly
      • Cyber security needs near real-time graph analytics measures
      • Many graph analytics computations scale nicely to GPUs
      • Micro-batching data to GPUs gives us regular updates of graph features
      • CPU graph analytics tools: GraphX. GPU analytics accelerators: cuSTINGER, nvGraph
  • 45. [Logo slide: Blazegraph, "+ =".]
  • 46. [Logo slide: Blazegraph, DASL, "+ =", DASL.]
  • 47. Out of Order Data - The Problem
      Date (Received On)    Events
      Prior to 8/6/16       9,411
      8/6/16                1
      8/7/16                56
      8/8/16                4,484
      8/9/16                83,465
      8/10/16               498,093
      8/11/16               163,179,386
      8/12/16               35,569,614
      Total Events          199,344,510
  • 49. Out of Order Data - The Problem (same table as slide 47)
      • Build a daily Parquet
      • One day of data is spread across several days
      • Need to scan a week of data for one day
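The table on slides 47 to 49 can be checked with a few lines of arithmetic. The Python sketch below copies the counts from the slide, confirms the stated total, and computes how much of the data sits in the largest date bucket. The variable names are illustrative, and the reading that each bucket corresponds to one daily Parquet build is an assumption based on the slide's bullet points.

```python
# Event counts per date bucket, copied from the slide.
BUCKETS = {
    "prior to 8/6/16": 9_411,
    "8/6/16": 1,
    "8/7/16": 56,
    "8/8/16": 4_484,
    "8/9/16": 83_465,
    "8/10/16": 498_093,
    "8/11/16": 163_179_386,
    "8/12/16": 35_569_614,
}

total = sum(BUCKETS.values())
peak_day, peak_events = max(BUCKETS.items(), key=lambda kv: kv[1])
peak_share = peak_events / total          # fraction in the largest bucket
outside_peak = total - peak_events        # events that landed elsewhere
buckets_to_scan = len(BUCKETS)            # buckets needed for full coverage

print(total, peak_day, f"{peak_share:.1%}", outside_peak, buckets_to_scan)
```

Even though most events fall in one bucket, more than 36 million land elsewhere, which is why a complete view of a single day requires scanning about a week of daily files.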
  • 50. Out of Order Data - The Solution: Apache Kudu
      • Columnar storage manager developed by Cloudera for the Hadoop ecosystem
      • Supports fast inserts/updates and efficient columnar scans
      • Benefits: allows us to insert data in real time; allows us to scan a true day of data more efficiently than Parquet
      • Development time: started with Kudu 0.10, now at 1.0.0; 2 weeks to get up and running
  • 51. Where Does Kudu Fit? [Chart axes: Scan Speed vs. Random Access Speed.]
  • 53. Kudu - Deployment and Data Size: 1 master + 18 tablet servers
      • Configuration: 16GB of memory; 8GB of block caching*; co-located with Spark/HDFS
      • Data: 7 days of events; 1.11 billion rows; 70 columns; ~155GB raw
  • 54. Kudu - Schema
      • Encoding: BIT_SHUFFLE encoding on all INTEGER, DOUBLE, and FLOAT columns; DICT_ENCODING encoding on all STRING columns*
      • Compression: DEFAULT_COMPRESSION set to NO_COMPRESSION for testing; experiments with SNAPPY_COMPRESSION successfully reduced storage costs
      • Partitioning: hash bucketing on 5 PRIMARY KEYS. The YEAR, MONTH, DAY combination reduces the tablets scanned; HOUR further distributes a day of data and allows for search; EVENT_ID is a unique identifier that distributes data evenly
      • 144 tablets per day; plan to test more partitioning in the future
  • 55. Kudu - Upsert Performance with 1440 Tablets. [Chart: upserts per second (y-axis 0 to 900,000) with an exponential trend line; sampled every 10 seconds with Impala.] 1400 Spark tasks; 160 executors; ~90 mins.
  • 56. Kudu v. Parquet Performance in Spark. Every scenario filters on year, month, day; average query time in seconds:
      • Scenario 1: one source IP to three destination IPs ( s = IP AND ( d = IP OR d = IP OR d = IP )); size 330; Parquet 61.8; Kudu 7.9
      • Scenario 2: failed logins of a particular user ( et = FL AND user = Bob ); size 180; Parquet 55.3; Kudu 5.0
      • Scenario 3: specific firewall server events ( dv = FW AND dcs = Server ); size 150; Parquet 55.2; Kudu 2.9
      • Scenario 4: all traffic to a specific domain ( hst = Host ); size 3k; Parquet 55.0; Kudu 2.4
  • 57. Kudu - Impact of Operators within Spark (as of v1.0.0)
      • Supported predicate pushdown: =, <=, >=, BETWEEN. Scans tablets within Kudu = FAST results
      • Unsupported predicate pushdown: NULL, NOT NULL, <>, OR, LIKE, IN. The optimizer tries to use any supported predicates to limit table scans, then pushes all data to the Spark task for comparison; data movement slows down query response time
      Source: https://github.com/cloudera/kudu/blob/master/docs/developing.adoc
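The pushdown rule on slide 57 amounts to splitting a query's filters into two groups: those the storage layer can evaluate, and those left for Spark. The Python sketch below models that split for a simple AND-only filter list. It is an illustration of the idea, not Kudu or Spark code; the operator set is the one listed on the slide for Kudu 1.0.0, and the function and tuple layout are hypothetical.

```python
# Operators the slide lists as supported for pushdown (Kudu 1.0.0).
PUSHDOWN_OPS = {"=", "<=", ">=", "BETWEEN"}

def split_predicates(predicates):
    """Split AND-ed (column, operator, value) filters into pushed and residual."""
    pushed = [p for p in predicates if p[1] in PUSHDOWN_OPS]
    residual = [p for p in predicates if p[1] not in PUSHDOWN_OPS]
    return pushed, residual

# Failed logins of admin users on one day, as in the LIKE test scenarios.
query = [
    ("year", "=", 2016),
    ("month", "=", 8),
    ("day", "=", 11),
    ("et", "=", "FL"),
    ("user", "LIKE", "Ad%"),
]

pushed, residual = split_predicates(query)
print(len(pushed), residual)
```

Every row that survives the pushed filters has to travel to a Spark task before the residual LIKE can be applied, which is the data movement the slide identifies as the cost.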
  • 58. Testing Limitations of Kudu in Spark. Every scenario filters on year, month, day; average query time in seconds for Parquet, Kudu with =, and Kudu with LIKE:
      • Scenario 1: subnet source IP to three destination IPs ( s LIKE IP% AND ( d = IP OR d = IP OR d = IP )); size 18k; Parquet 61.8; Kudu = 7.9; Kudu LIKE 224.1
      • Scenario 2: failed logins of admin users ( et = FL AND user LIKE Ad% ); size 500; Parquet 55.3; Kudu = 5.0; Kudu LIKE 132.0
      • Scenario 3: any firewall server events ( dv = FW AND dcs LIKE %Server% ); size 260k; Parquet 55.2; Kudu = 2.9; Kudu LIKE 174.8
      • Scenario 4: all traffic to a domain and its subdomains ( hst LIKE %Host ); size 6k; Parquet 55.0; Kudu = 2.4; Kudu LIKE 194.5
  • 59. Testing Limitations of Kudu in Spark: same scenarios and timings as slide 58. Kudu with LIKE takes 2.4x - 3.6x longer than Parquet.
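The "2.4x - 3.6x" figure on slide 59 follows directly from the chart values. The short Python check below uses only the timings shown on slides 56 to 59; the variable names are illustrative.

```python
# Average query times in seconds for scenarios 1 to 4, copied from the slides.
parquet = [61.8, 55.3, 55.2, 55.0]
kudu_eq = [7.9, 5.0, 2.9, 2.4]           # Kudu with = predicates
kudu_like = [224.1, 132.0, 174.8, 194.5]  # Kudu with LIKE predicates

# How much longer Kudu with LIKE takes than Parquet, per scenario.
like_vs_parquet = [round(k / p, 1) for k, p in zip(kudu_like, parquet)]

# How much faster Kudu with = is than Parquet, per scenario.
parquet_vs_kudu = [round(p / k, 1) for p, k in zip(parquet, kudu_eq)]

print(min(like_vs_parquet), max(like_vs_parquet))
print(min(parquet_vs_kudu), max(parquet_vs_kudu))
```

The same data shows the other side of the trade-off: with pushdown-friendly equality predicates, Kudu is roughly 8 to 23 times faster than Parquet on these scenarios.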
  • 60. Stream Data Pipeline. [Pipeline diagram labels: Ingest, Stream Processing, Streaming, Temporary Storage, GPU, Graph, DASL, Historical Storage.]
  • 61. Kudu Brings Big Data to GPU Analytics
      • Kudu provides Spark DataFrames for Blazegraph DASL
      • Kudu has low-level C++ APIs, which support the micro-batching use case for updates to GPUs (cuSTINGER)
  • 62. Accenture Labs ASGARD Platform v2. [Architecture diagram labels: Ingest, Event Processing, Storage, Notebooks, Query Layer, Data Sources, Visualizations, SQL, Streaming, py, GPU Layer, cuSTINGER.]
  • 63. More Lessons Learned
      • Initial thought: Alluxio serves as a caching layer. Spark support was lacking; little performance improvement vs. Parquet on HDFS; Kudu was the better choice for our needs
      • KUDU-1651: all-NULL values in dictionary-encoded blocks throw an error in Kudu 1.0.0; fixed in the master branch
      • Cannot change partitioning; planned to be supported in future releases
  • 64. Future Research Work
      • Tackling: streaming analytics (out-of-order); complex event processing; managing graph insertions and deletions in real time
      • GPU analytics: unified memory and page faulting on the new NVIDIA Pascal GPUs solve a previous problem with maximum graph size; NVIDIA DGX-1
      • Deep learning
  • 65. Thanks! Cloudera: Dan Burkert, Todd Lipcon. DataStax: Rob Murphy, Jonathan Shook.
  • 66. Questions? Josh Patterson, Principal Data Scientist (@datametrician); Keith Kraus, Associate Principal Engineer (@keithjkraus); Mike Wendt, Principal Engineer (@mike_wendt)