© 2014 MapR Technologies
Objective
• Advanced Persistent Threat (APT)
• Big Data + Threat Intelligence
• Hadoop + Spark Solution
• Example Detection Algorithm Development Scenarios (most of them are still open problems)
Topics covered in this talk
Advanced Persistent Threat
APT
• Advanced Persistent Threats (APTs) are one of the biggest headaches for IT departments
– Target Compromise
– Countless DDoS attacks (thousands a day, according to Arbor Networks)
– These are only the known cases, which could be just the tip of the iceberg
• Why is APT so prevalent?
– Hacking is no longer just a hobby for smart hackers
– Huge sums of money are involved, some of it backed by organized crime
– Political tool (the recent conflict between Ukraine and Russia sparked malware warfare between them)
– Cyber warfare (Stuxnet)
Overview
APT
• Hard to Detect
– New software stacks appear every day without thorough vulnerability testing
• Storm, Spark, YARN, Grails, Play, Spring, Flask, …
– The mobile space is even worse
• Particularly Android
• By some estimates, 30% or more of devices worldwide are already compromised
• Anti-virus is useful only up to a certain point
– It takes months to years to define a malware signature
– Zero-day attacks are still unpreventable
– It has become almost a placebo
• Firewalls are not much use anymore
– A device can be infected when the user takes it outside the firewall perimeter
• Botnets themselves are becoming more complex, with many hierarchies
– Minimal binary delivery
– Surreptitious C&C connections with a complex hierarchy, or even headless peer-to-peer bots (Gameover Zeus botnet)
Status
APT
• Snort / Suricata
– Rule based system
– Community support, pre/post-compromise detection
– Constant updates are needed; cannot detect zero-day attacks
– Sourcefire provides paid service
• Sandbox Technology
– Firewall + on-premises detection
– FireEye
• Poly-morphing technology
– ShapeSecurity
• Log data mining based methods
– Splunk / Sumo Logic, Solutionary
Defense, state of the art
APT
• Many security labs worldwide run malware labs and generate threat reports
• The analysis takes from 2 weeks to months
• Involves
– Decoding binary execution and decrypting load / config parameters
– Complete timeline analysis, from infection to exploit
– Which devices, IPs, and domain names are involved
• Sometimes, analyzing IRC data, or even social network data
– Trace botnet connections and verify the command and control
• Can we automate this with Big Data?
Threat Report
APT
Example Annual Threat Report (from FireEye, 2013, Europe)
Top two industries in threat findings were healthcare and finance
APT
• Configuration (Decrypted)
• ID: F16 08-07-2013
Group:
DNS/Port: Direct: toornt.servegame.com:443,
Proxy DNS/Port:
Proxy Hijack: No
ActiveX Startup Key:
HKLM Startup Entry:
File Name:
Install Path: C:\Documents and Settings\Admin\Local Settings\Temp\morse.exe
Keylog Path: C:\Documents and Settings\Admin\Local Settings\Temp\morse
Inject: No
Process Mutex: gdfgdfgdg
Key Logger Mutex:
ActiveX Startup: No
HKLM Startup: No
Copy To: No
Melt: No
Persistence: No
Keylogger: No
Password: !@#GooD#@!
Example Threat Report (from FireEye)
C&C Servers
toornt.servegame.com
updateo.servegame.com
egypttv.sytes.net
skype.servemp3.com
natco2.no-ip.net
Why does it need a password?
APT
• CHAIN OF EVENTS
• ASSOCIATED DOMAINS
• 192.81.171.13 - www.toonzone.net - Compromised website
• 190.123.47.198 - ilinsting.com - Redirect
• 64.202.116.124 - bgbyhn.in.ua - Fiesta EK
• INFECTION CHAIN OF EVENTS
• 06:40:07 UTC - www.toonzone.net - GET /forums/adult-swim-toonami-forum/
• 06:40:08 UTC - ilinsting.com - GET /szjhmucw.js?3ad1359a5153d640
• 06:40:09 UTC - bgbyhn.in.ua - GET /hdjng94/?2
• 06:40:11 UTC - bgbyhn.in.ua - GET /hdjng94/?25b6d1b1cb76ec625b500e0d560a50040703520d5053520a0706510355090109
• 06:40:12 UTC - bgbyhn.in.ua - GET /hdjng94/?2d8a97d01a056fdd41084e5a0b0c56050752085a0d55540b07570b54080f0708;5110411
• 06:40:14 UTC - bgbyhn.in.ua - GET /hdjng94/?02bb88c62d7306c8534209590a035103050452590c5a530d0501515709000057;5
• 06:40:15 UTC - bgbyhn.in.ua - GET /hdjng94/?02bb88c62d7306c8534209590a035103050452590c5a530d0501515709000057;5;1
• 06:40:42 UTC - bgbyhn.in.ua - GET /hdjng94/?2ad5cdef3fc4ef9851110f0e515f57530757540e5706555d07525700525c065e;6
• 06:40:43 UTC - bgbyhn.in.ua - GET /hdjng94/?2ad5cdef3fc4ef9851110f0e515f57530757540e5706555d07525700525c065e;6;1
• 06:40:43 UTC - bgbyhn.in.ua - GET /hdjng94/?5998786b9c7a1ffe544b580305030457000f0903035a0659000a0a0d0600555a
• 06:40:49 UTC - bgbyhn.in.ua - GET /hdjng94/?59576b00f4cfd03e5641500c04590205000f050c0200000b000a0602075a5308;1;2
• 06:40:49 UTC - bgbyhn.in.ua - GET /hdjng94/?59576b00f4cfd03e5641500c04590205000f050c0200000b000a0602075a5308;1;2;1
Another Example, Fiesta EK, from malware-traffic-analysis.net
Big Data + Threat Intelligence
Big Data + Threat Intelligence
• Tom Brady + Gisele Bundchen
– An Ideal Marriage
• With all the advances in computing and data resources, why can’t we automate malware detection?
• Big Data is an ideal platform for malware study
– Simple packet capture can easily generate petabytes of data, even from small offices
– Huge storage plus fast processing is essential for malware study
• Various aspects of Big Data fit well with malware analysis
– Streaming analysis (Storm, Spark Streaming)
– Volumetric data analysis (Spark)
– Graph analysis
• View network devices as nodes, discover command and control role
• Each URL can be a node and the basis of graph analysis
– Visualization for intuitive analysis
Pros
Big Data + Threat Intelligence
• Anomaly detection
– Typical log analysis
– Routers and switches have built-in alarm settings
• Simple level-based detection
– Is this going to be useful?
• How much can you tell?
• Machine Learning
– Not very useful
• Not easy to get labeled data
• Even with labeled data, it is very hard to develop a feature set
– If the feature set is known, hackers will revise their code
• Zero-day attacks do not come with a label
– Modeling needs a complete understanding of criminal minds
Cons (e.g., Gwyneth Paltrow and Chris Martin)
Big Data + Threat Intelligence
An Example Architecture
[Figure: Storm and Spark pipeline]
• Storm Spout: packet stream or binary downloads
• Storm Bolt: packet analysis; alert and store packet data
• Storm Bolt: metadata extraction
• Store to HDFS
• Spark Analysis: connect the dots with strong in-memory processing
• Packet stream truly reveals malware expression compared to logs
Big Data + Threat Intelligence
• Reduce False Positives
– Mantra in the malware detection business
• Big Data is a great resource for reducing false positives (Type I error)
– As soon as an algorithm is updated, test it against the Big Data test cases
– The test can even be applied to old cases, greatly reducing false positives
• Previously, we typically had to sample test data, weighting old data lower
False Positives
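The regression test described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `archive` of labeled cases, the `conn_per_min` feature, and the two threshold rules are hypothetical stand-ins, and in practice the archive would be read from HDFS rather than held in a list. The false positive rate is computed as FP / (FP + TN) over the benign cases.

```python
from typing import Callable, Iterable, Tuple

Case = Tuple[dict, bool]  # (features, is_malicious): hypothetical labeled test case


def false_positive_rate(detector: Callable[[dict], bool], cases: Iterable[Case]) -> float:
    """Fraction of benign cases that the detector flags: FP / (FP + TN)."""
    false_pos = 0
    true_neg = 0
    for features, is_malicious in cases:
        if is_malicious:
            continue  # only benign cases contribute to the false positive rate
        if detector(features):
            false_pos += 1
        else:
            true_neg += 1
    benign = false_pos + true_neg
    return false_pos / benign if benign else 0.0


# Hypothetical archive of old and new cases (assumption, not real data).
archive = [
    ({"conn_per_min": 5}, False),
    ({"conn_per_min": 40}, False),
    ({"conn_per_min": 90}, False),
    ({"conn_per_min": 300}, True),
    ({"conn_per_min": 120}, True),
]


def old_rule(features: dict) -> bool:
    return features["conn_per_min"] > 30


def new_rule(features: dict) -> bool:
    return features["conn_per_min"] > 100


old_fpr = false_positive_rate(old_rule, archive)
new_fpr = false_positive_rate(new_rule, archive)
print(f"old rule FPR = {old_fpr:.2f}, new rule FPR = {new_fpr:.2f}")
```

Running every algorithm update against the full archive, rather than a weighted sample, is the point of the slide; the same function would simply be mapped over a much larger data set.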
Big Data + Threat Intelligence
• Wireshark (tshark) is the go-to software for packet analysis
– It is a huge memory hog
• Need to put packet data onto HDFS
• Packetpig has been developed from Hortonworks
– A lot more has to be done to come anywhere near the strength of Wireshark
• Need to design efficient metadata collection and storage mechanisms
– Use Snort or a custom C library to extract essential flow data
• A flow is a 5-tuple: source IP, destination IP, source port, destination port, protocol
• Flow is the de facto unit of network malware expression analysis
Packet to HDFS
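As a rough illustration of flow metadata extraction, the sketch below groups packet records by 5-tuple and keeps only per-flow counters and timestamps. The packet record layout (`ts`, `length`, and the address fields) is an assumption made for this example; a real pipeline would get these fields from a capture library. Flows here are unidirectional, so the reply direction forms a separate flow.

```python
from collections import defaultdict
from typing import NamedTuple


class FlowKey(NamedTuple):
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    protocol: str


def aggregate_flows(packets):
    """Group packet records by 5-tuple and keep only per-flow metadata."""
    flows = defaultdict(lambda: {"packets": 0, "bytes": 0, "first_ts": None, "last_ts": None})
    for p in packets:
        key = FlowKey(p["src_ip"], p["dst_ip"], p["src_port"], p["dst_port"], p["protocol"])
        f = flows[key]
        f["packets"] += 1
        f["bytes"] += p["length"]
        f["first_ts"] = p["ts"] if f["first_ts"] is None else min(f["first_ts"], p["ts"])
        f["last_ts"] = p["ts"] if f["last_ts"] is None else max(f["last_ts"], p["ts"])
    return dict(flows)


# Hypothetical packet records (documentation address ranges, not real traffic).
packets = [
    {"ts": 1.0, "src_ip": "10.0.0.5", "dst_ip": "192.0.2.10", "src_port": 51000,
     "dst_port": 443, "protocol": "tcp", "length": 120},
    {"ts": 1.5, "src_ip": "10.0.0.5", "dst_ip": "192.0.2.10", "src_port": 51000,
     "dst_port": 443, "protocol": "tcp", "length": 1400},
    {"ts": 2.0, "src_ip": "10.0.0.7", "dst_ip": "192.0.2.53", "src_port": 40000,
     "dst_port": 53, "protocol": "udp", "length": 80},
]

flows = aggregate_flows(packets)
for key, meta in flows.items():
    print(key, meta)
```

Storing these compact flow records instead of raw packets is one way to keep the HDFS footprint manageable while preserving who talked to whom, for how long, and how much.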
Big Data + Threat Intelligence
• Big Data provides an opportunity to map out all the IP addresses used on a particular network
• Through graph analysis, find rogue IP addresses
• Use geographical information with IP to find abnormal connection behavior
• DNS provides many insights into malware connections
– Static IPs cannot be used for malware control purposes
– Fast Flux
– Awkward names
IP based analysis
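One simple way to look for fast flux in DNS data is to flag domains that resolve to many distinct addresses with short TTLs. The sketch below assumes DNS answer records shaped as `domain`, `ip`, `ttl`; the record layout and both thresholds (`min_distinct_ips`, `max_ttl`) are illustrative assumptions, not tuned values. Legitimate CDNs can show the same pattern, so this is a candidate filter rather than a verdict.

```python
from collections import defaultdict


def fast_flux_candidates(dns_answers, min_distinct_ips=5, max_ttl=300):
    """Return domains that resolve to many distinct IPs, all with short TTLs."""
    ips = defaultdict(set)
    ttls = defaultdict(list)
    for rec in dns_answers:
        ips[rec["domain"]].add(rec["ip"])
        ttls[rec["domain"]].append(rec["ttl"])
    suspects = []
    for domain, addrs in ips.items():
        if len(addrs) >= min_distinct_ips and max(ttls[domain]) <= max_ttl:
            suspects.append(domain)
    return sorted(suspects)


# Hypothetical DNS answers using reserved example names and addresses.
dns_answers = [
    {"domain": "flux.example", "ip": f"203.0.113.{i}", "ttl": 60} for i in range(1, 7)
] + [
    {"domain": "static.example", "ip": "198.51.100.20", "ttl": 3600},
    {"domain": "static.example", "ip": "198.51.100.20", "ttl": 3600},
]

suspects = fast_flux_candidates(dns_answers)
print(suspects)
```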
Big Data + Threat Intelligence
• Flow is an essential malware analysis unit
• Flow identifies
– Who’s connecting to whom
• Frequency, duration, communication bandwidth
• App can be identified from flow
– Port, actual content
– Palo Alto Networks
• Normal flow vs Abnormal flow
– With enough data, we could potentially identify normal flow
• Use first 16 bytes?
– Cluster analysis, detect anomaly
Flow to detect malware expression
Spark on Hadoop
Apache Spark
• spark.apache.org
• github.com/apache/spark
• user@spark.apache.org
• Originally developed in 2009 at UC Berkeley’s AMPLab
• Fully open sourced in 2010; now at the Apache Software Foundation
Easy: Example – IP Count
• Hadoop MapReduce
import java.io.IOException;
import java.util.Iterator;
import java.util.StringTokenizer;
import org.apache.hadoop.io.*;
import org.apache.hadoop.mapred.*;

// Both classes are static nested classes of the job's driver class.
// Mapper: emits (token, 1) for every whitespace-separated token in a line.
public static class WordCountMapClass extends MapReduceBase
    implements Mapper<LongWritable, Text, Text, IntWritable> {
  private final static IntWritable one = new IntWritable(1);
  private Text word = new Text();

  public void map(LongWritable key, Text value,
                  OutputCollector<Text, IntWritable> output,
                  Reporter reporter) throws IOException {
    String line = value.toString();
    StringTokenizer itr = new StringTokenizer(line);
    while (itr.hasMoreTokens()) {
      word.set(itr.nextToken());
      output.collect(word, one);
    }
  }
}

// Reducer: sums the counts collected for each key.
public static class WordCountReduce extends MapReduceBase
    implements Reducer<Text, IntWritable, Text, IntWritable> {
  public void reduce(Text key, Iterator<IntWritable> values,
                     OutputCollector<Text, IntWritable> output,
                     Reporter reporter) throws IOException {
    int sum = 0;
    while (values.hasNext()) {
      sum += values.next().get();
    }
    output.collect(key, new IntWritable(sum));
  }
}
• Spark
import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._  // pair-RDD operations such as reduceByKey

// sparkHome and jars are optional constructor arguments
val spark = new SparkContext(master, appName)
val file = spark.textFile("hdfs://...")
// The first comma-separated field of each line is taken to be the IP
val counts = file.map(line => line.split(",")(0))
  .map(ip => (ip, 1))
  .reduceByKey(_ + _)
counts.saveAsTextFile("hdfs://...")
Fast: Using RAM, Operator Graphs
• In-memory Caching
• Data partitions read from RAM instead of disk
• Operator Graphs
• Scheduling Optimizations
• Fault Tolerance
[Figure: operator graph over RDDs A to F, with join, filter, groupBy, and map operators grouped into Stages 1 to 3; the legend marks RDDs and cached partitions]
SPARK RDD
• The Resilient Distributed Dataset (RDD) is the key (potentially) in-memory data structure
• An RDD is distributed over Hadoop nodes and typically resides in memory
• Transform an RDD, then get data from it; evaluation is lazy
– Two sets of interfaces are provided: one for transformations, the other for actions (e.g., count, save)
• Most of the interface is quite similar to Lisp operations and SQL operations
• Use persist (cache) to keep the RDD in memory
RDD
Working With RDDs
[Figure: a chain of RDDs linked by transformations; an action on the final RDD returns a value]
textFile = sc.textFile("SomeFile.txt")
linesWithSpark = textFile.filter(lambda line: "Spark" in line)
linesWithSpark.count()
74
linesWithSpark.first()
# Apache Spark
Spark, Hadoop Malware Analysis
Why useful
• Packet stream
• Construct group of suspected flows in RDD (e.g., suspected DNS tunnels, IRC communications)
• Analyze with Spark on RDD, in memory
• Connect the dots: flows, syslogs, and events
• Huge advantage over Wireshark!
• Store in HDFS for easy access and use HBase for database support
• Real-time event processing
• Fast classification or anomaly detection
SPARK and Hadoop
• Connecting the dots needs huge storage and fast access
– Potential need to go back in time to find correlating events
• DDoS attack found today + spotty IRC chat 10 days ago + NXDomain events 20 days ago, all by the suspected infected machine
– Sometimes it takes months to learn that a domain the machine contacted is suspicious (e.g., scored in VirusTotal)
– Then see if these patterns match known malware expressions
– Approximate matching technology is quite important here
» HMM and Correlation Modeling
– HDFS + HBase would be a good solution
• Store relevant temporal data
• Retrieve fast according to the criteria
• Spark + Hadoop provides a fast development cycle
– From prototype to evaluation
Why Hadoop
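The "go back in time" correlation can be illustrated with a small sketch that looks, per host, for an ordered sequence of event kinds inside a lookback window. The event record layout, the event kind names, and the 30-day window are assumptions for this example. It uses exact ordered matching, which is simpler than the approximate (HMM or correlation) matching the slide calls for, and it only shows the shape of the query that HDFS and HBase would need to serve.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Hypothetical ordered pattern: early warning signs first, the visible attack last.
PATTERN = ["nxdomain_burst", "irc_chat", "ddos"]


def hosts_matching_pattern(events, pattern=PATTERN, lookback=timedelta(days=30)):
    """Return hosts whose events contain the pattern, in order, within the lookback window."""
    by_host = defaultdict(list)
    for e in events:
        by_host[e["host"]].append(e)
    matches = []
    for host, evs in by_host.items():
        evs.sort(key=lambda e: e["ts"])
        newest = evs[-1]["ts"]
        idx = 0
        for e in evs:
            if newest - e["ts"] > lookback:
                continue  # too old to correlate with the latest event
            if e["kind"] == pattern[idx]:
                idx += 1
                if idx == len(pattern):
                    matches.append(host)
                    break
    return sorted(matches)


today = datetime(2014, 6, 30)
# Hypothetical event history for three hosts.
events = [
    {"host": "pc-a", "kind": "nxdomain_burst", "ts": today - timedelta(days=20)},
    {"host": "pc-a", "kind": "irc_chat", "ts": today - timedelta(days=10)},
    {"host": "pc-a", "kind": "ddos", "ts": today},
    {"host": "pc-b", "kind": "ddos", "ts": today},
    {"host": "pc-c", "kind": "irc_chat", "ts": today - timedelta(days=20)},
    {"host": "pc-c", "kind": "nxdomain_burst", "ts": today - timedelta(days=10)},
    {"host": "pc-c", "kind": "ddos", "ts": today},
]

matched = hosts_matching_pattern(events)
print(matched)
```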
Example Detection Algorithm Development Scenarios
Introduction to Botnet (Terminology)
[Figure: old-style botnet operation, shown for reference only. Elements: Bot Master, Bots, Code Server (updates), IRC Server with IRC channels (C&C traffic), and the Victim under attack]
Companies are interested in finding these on their premises
(Malware Expression) Detection Phases
• Pre Infection Detection
– Intrusion Detection System
• Active Infection Detection
– Recruit and Reconnaissance in the internal network
• Post Infection Detection
– Exploit and Monetize
Pre Infection Detection
• Detect suspicious URLs
– When a device tries to contact or download from a suspicious URL, block it
• How it works
– If suspicious or unknown content is detected, send it to the back-end Big Data deep analysis engine
– Update suspicious IPs / domain names / URLs
– Update the hash of the binary
– Regularly remove old hashes and suspicious URLs
CAMP
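The update-and-expire cycle for suspicious URLs and binary hashes might look like the following sketch. It is not the CAMP algorithm; `ExpiringBlocklist`, the seven-day `max_age`, and the sample indicators are hypothetical names and values chosen for illustration. Times are plain seconds so the example stays deterministic.

```python
import hashlib


class ExpiringBlocklist:
    """Set of indicators (URLs, domains, binary hashes) that age out after max_age seconds."""

    def __init__(self, max_age):
        self.max_age = max_age
        self._seen = {}  # indicator -> time it was last reported

    def add(self, indicator, now):
        self._seen[indicator] = now

    def is_blocked(self, indicator, now):
        ts = self._seen.get(indicator)
        return ts is not None and now - ts <= self.max_age

    def purge(self, now):
        """Drop entries older than max_age and return how many were removed."""
        stale = [k for k, ts in self._seen.items() if now - ts > self.max_age]
        for k in stale:
            del self._seen[k]
        return len(stale)


def binary_hash(payload: bytes) -> str:
    """SHA-256 hex digest used as the lookup key for a downloaded binary."""
    return hashlib.sha256(payload).hexdigest()


DAY = 86400
blocklist = ExpiringBlocklist(max_age=7 * DAY)
blocklist.add("bad.example/payload.js", now=0)          # hypothetical suspicious URL
blocklist.add(binary_hash(b"fake binary body"), now=0)  # hypothetical binary

blocked_next_day = blocklist.is_blocked("bad.example/payload.js", now=1 * DAY)
blocked_after_expiry = blocklist.is_blocked("bad.example/payload.js", now=8 * DAY)
removed = blocklist.purge(now=8 * DAY)
print(blocked_next_day, blocked_after_expiry, removed)
```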
Ongoing Infection Detection
• How it works
– Detect suspicious internal behavior
– Develop a normal behavioral model for the target customer site
– Detect abnormal authentication behavior, e.g., Kerberos, LDAP, etc.
– Detect suspicious data movement
– Detect suspicious port usage
– Detect tunnels
• It is highly important to leverage Big Data to develop a sustainable normal behavioral model and keep it updated, because network data and models are constantly changing
• Consult with security experts to define the measurement points
In-network infection propagation
Post Infection Detection
• HTTP and DNS are the most frequently abused protocols
– Firewalls let traffic on these ports through
– If needed, play man in the middle for SSL data inspection
• Ill-formed HTTP header detection
– Abnormal location
– Abnormal referrer
– Abnormal User Agent
– Abnormal Size
• Abnormal HTTP POST detection (e.g., entropy analysis)
• Ill-formed XML / HTML
• SQL Injection
– SELECT * FROM users WHERE name = '' OR '1'='1';
• LDAP Code Injection
Protocol Abnormality
Collect malware expression samples
Develop feature set with Hadoop and SME
Deploy and continually update the model
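Entropy analysis of HTTP POST bodies, mentioned above, can be sketched with Shannon entropy over byte frequencies. A body that is encrypted or compressed tends toward 8 bits per byte, while ordinary form data sits much lower. The 7.0 threshold and the helper name `looks_encrypted` are assumptions for illustration; short bodies cannot reach high entropy at all, so a real detector would also need a minimum length.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of a byte string, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    counts = Counter(data)
    total = len(data)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def looks_encrypted(body: bytes, threshold: float = 7.0) -> bool:
    """Flag bodies whose byte distribution is close to uniform (threshold is an assumption)."""
    return shannon_entropy(body) >= threshold


form_post = b"user=alice&action=view"  # hypothetical ordinary form submission
uniform_body = bytes(range(256))       # stand-in for an encrypted or compressed payload

print(round(shannon_entropy(form_post), 2), looks_encrypted(form_post))
print(shannon_entropy(uniform_body), looks_encrypted(uniform_body))
```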
Post Infection Detection
• Click Fraud
• Like Fraud
• DDoS
• SPAM
Volumetric Abnormality
Post Infection Detection
• Cadence
• Weird domain name resolution
• Fast Fluxing domain names
• Abnormal IRC traffic behavior
• Abnormal Twitter behavior
• Abnormal Facebook behavior
Command and Control Contact
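Cadence, the first item above, can be approximated by checking how regular the gaps between a host's contacts to one destination are. The sketch below uses the coefficient of variation (standard deviation divided by mean) of inter-arrival times; values near zero suggest machine-driven beaconing. The function names and the 0.1 cutoff are assumptions, and bots that add heavy jitter would need a more tolerant model.

```python
from statistics import mean, pstdev


def beacon_score(timestamps):
    """Coefficient of variation of inter-arrival gaps; near 0 means very regular contact."""
    if len(timestamps) < 3:
        return None  # too few contacts to judge regularity
    ts = sorted(timestamps)
    gaps = [b - a for a, b in zip(ts, ts[1:])]
    avg = mean(gaps)
    if avg == 0:
        return None
    return pstdev(gaps) / avg


def is_beacon_like(timestamps, max_cv=0.1):
    """True when contacts are regular enough to look automated (cutoff is an assumption)."""
    score = beacon_score(timestamps)
    return score is not None and score <= max_cv


# Hypothetical contact times in seconds to a single destination.
regular = [0, 60, 120, 180, 240]    # exactly once a minute
jittered = [0, 61, 119, 181, 240]   # once a minute with small jitter
bursty = [0, 5, 300, 310, 2000]     # irregular, human-like browsing

for name, series in [("regular", regular), ("jittered", jittered), ("bursty", bursty)]:
    print(name, is_beacon_like(series))
```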
DGA
ClickSecurity.com
What features would you use?
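As one possible starting point for the question above, the sketch below computes a few lexical features of a domain's leftmost label: length, digit ratio, vowel ratio, and the longest run of consonants. These particular features are an assumption for illustration, not the set used by any specific product, and the input is assumed to be a registered domain without a host prefix such as `www`. The two sample names are taken from the Fiesta EK example earlier in this deck.

```python
def domain_features(domain: str) -> dict:
    """Simple lexical features of the leftmost label of a domain name."""
    label = domain.lower().split(".")[0]
    letters = [c for c in label if c.isalpha()]
    vowels = sum(1 for c in letters if c in "aeiou")
    longest_run = run = 0
    for c in label:
        if c.isalpha() and c not in "aeiou":
            run += 1
            longest_run = max(longest_run, run)
        else:
            run = 0  # a vowel, digit, or hyphen ends the consonant run
    n = len(label)
    return {
        "length": n,
        "digit_ratio": sum(c.isdigit() for c in label) / n if n else 0.0,
        "vowel_ratio": vowels / len(letters) if letters else 0.0,
        "longest_consonant_run": longest_run,
    }


odd = domain_features("bgbyhn.in.ua")
readable = domain_features("toonzone.net")
print(odd)
print(readable)
```

Features like these would feed a classifier or clustering step; on their own they only separate pronounceable names from random-looking ones.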
Conclusion
Conclusion
• Threat Intelligence and Big Data are very HOT
• Big Data is the ideal analysis platform for Malware expression analysis
– Caution, Remember the Cons
– Useful for efficiently connecting the dots
• Big Data enables
– Persistent model building and updating
– Reducing false positives through exhaustive data check compared to spot check
• Hadoop / Spark provides an ideal platform for malware expression analysis
– Spark provides strong in-memory processing power for complex malware data analysis with simpler, scripting-level coding
• Scala
– MapR provides fastest data access on Hadoop nodes
• M7
• MapR is the better Hadoop
• Don’t underestimate NFS and volume convenience
• Questions are welcome; send to syoon@maprtech.com, mvasquez@maprtech.com, nestrada@maprtech.com

Contenu connexe

Tendances

Using Canary Honeypots for Network Security Monitoring
Using Canary Honeypots for Network Security MonitoringUsing Canary Honeypots for Network Security Monitoring
Using Canary Honeypots for Network Security Monitoringchrissanders88
 
Threat Hunting with Data Science
Threat Hunting with Data ScienceThreat Hunting with Data Science
Threat Hunting with Data ScienceAustin Taylor
 
Leveraging DNS to Surface Attacker Activity
Leveraging DNS to Surface Attacker ActivityLeveraging DNS to Surface Attacker Activity
Leveraging DNS to Surface Attacker ActivitySqrrl
 
PinTrace Advanced AWS meetup
PinTrace Advanced AWS meetup PinTrace Advanced AWS meetup
PinTrace Advanced AWS meetup Suman Karumuri
 
Splunking configfiles 20211208_daniel_wilson
Splunking configfiles 20211208_daniel_wilsonSplunking configfiles 20211208_daniel_wilson
Splunking configfiles 20211208_daniel_wilsonBecky Burwell
 
Apache metron meetup presentation at capital one
Apache metron meetup presentation at capital oneApache metron meetup presentation at capital one
Apache metron meetup presentation at capital onegvetticaden
 
Big Data for Security - DNS Analytics
Big Data for Security - DNS AnalyticsBig Data for Security - DNS Analytics
Big Data for Security - DNS AnalyticsMarco Casassa Mont
 
Apache metron - An Introduction
Apache metron - An IntroductionApache metron - An Introduction
Apache metron - An IntroductionBaban Gaigole
 
Listening at the Cocktail Party with Deep Neural Networks and TensorFlow
Listening at the Cocktail Party with Deep Neural Networks and TensorFlowListening at the Cocktail Party with Deep Neural Networks and TensorFlow
Listening at the Cocktail Party with Deep Neural Networks and TensorFlowDatabricks
 
A Practical Guide to Anomaly Detection for DevOps
A Practical Guide to Anomaly Detection for DevOpsA Practical Guide to Anomaly Detection for DevOps
A Practical Guide to Anomaly Detection for DevOpsBigPanda
 
Applied machine learning defeating modern malicious documents
Applied machine learning defeating modern malicious documentsApplied machine learning defeating modern malicious documents
Applied machine learning defeating modern malicious documentsPriyanka Aash
 
SPO2-T11_Automated-Prevention-of-Ransomware-with-Machine-Learning-and-GPOs
SPO2-T11_Automated-Prevention-of-Ransomware-with-Machine-Learning-and-GPOsSPO2-T11_Automated-Prevention-of-Ransomware-with-Machine-Learning-and-GPOs
SPO2-T11_Automated-Prevention-of-Ransomware-with-Machine-Learning-and-GPOsRod Soto
 
Design Patterns For Real Time Streaming Data Analytics
Design Patterns For Real Time Streaming Data AnalyticsDesign Patterns For Real Time Streaming Data Analytics
Design Patterns For Real Time Streaming Data AnalyticsDataWorks Summit
 
Providence Future of Data Meetup - Apache Metron Open Source Cybersecurity Pl...
Providence Future of Data Meetup - Apache Metron Open Source Cybersecurity Pl...Providence Future of Data Meetup - Apache Metron Open Source Cybersecurity Pl...
Providence Future of Data Meetup - Apache Metron Open Source Cybersecurity Pl...Carolyn Duby
 
What's Next for Google's BigTable
What's Next for Google's BigTableWhat's Next for Google's BigTable
What's Next for Google's BigTableSqrrl
 
Threat Hunting for Command and Control Activity
Threat Hunting for Command and Control ActivityThreat Hunting for Command and Control Activity
Threat Hunting for Command and Control ActivitySqrrl
 
Managing your Black Friday Logs NDC Oslo
Managing your  Black Friday Logs NDC OsloManaging your  Black Friday Logs NDC Oslo
Managing your Black Friday Logs NDC OsloDavid Pilato
 
Managing your black friday logs Voxxed Luxembourg
Managing your black friday logs Voxxed LuxembourgManaging your black friday logs Voxxed Luxembourg
Managing your black friday logs Voxxed LuxembourgDavid Pilato
 

Tendances (20)

Using Canary Honeypots for Network Security Monitoring
Using Canary Honeypots for Network Security MonitoringUsing Canary Honeypots for Network Security Monitoring
Using Canary Honeypots for Network Security Monitoring
 
Threat Hunting with Data Science
Threat Hunting with Data ScienceThreat Hunting with Data Science
Threat Hunting with Data Science
 
Leveraging DNS to Surface Attacker Activity
Leveraging DNS to Surface Attacker ActivityLeveraging DNS to Surface Attacker Activity
Leveraging DNS to Surface Attacker Activity
 
PinTrace Advanced AWS meetup
PinTrace Advanced AWS meetup PinTrace Advanced AWS meetup
PinTrace Advanced AWS meetup
 
Splunking configfiles 20211208_daniel_wilson
Splunking configfiles 20211208_daniel_wilsonSplunking configfiles 20211208_daniel_wilson
Splunking configfiles 20211208_daniel_wilson
 
Apache metron meetup presentation at capital one
Apache metron meetup presentation at capital oneApache metron meetup presentation at capital one
Apache metron meetup presentation at capital one
 
Big Data for Security - DNS Analytics
Big Data for Security - DNS AnalyticsBig Data for Security - DNS Analytics
Big Data for Security - DNS Analytics
 
Apache metron - An Introduction
Apache metron - An IntroductionApache metron - An Introduction
Apache metron - An Introduction
 
Listening at the Cocktail Party with Deep Neural Networks and TensorFlow
Listening at the Cocktail Party with Deep Neural Networks and TensorFlowListening at the Cocktail Party with Deep Neural Networks and TensorFlow
Listening at the Cocktail Party with Deep Neural Networks and TensorFlow
 
A Practical Guide to Anomaly Detection for DevOps
A Practical Guide to Anomaly Detection for DevOpsA Practical Guide to Anomaly Detection for DevOps
A Practical Guide to Anomaly Detection for DevOps
 
Applied machine learning defeating modern malicious documents
Applied machine learning defeating modern malicious documentsApplied machine learning defeating modern malicious documents
Applied machine learning defeating modern malicious documents
 
Big Data for Security
Big Data for SecurityBig Data for Security
Big Data for Security
 
SPO2-T11_Automated-Prevention-of-Ransomware-with-Machine-Learning-and-GPOs
SPO2-T11_Automated-Prevention-of-Ransomware-with-Machine-Learning-and-GPOsSPO2-T11_Automated-Prevention-of-Ransomware-with-Machine-Learning-and-GPOs
SPO2-T11_Automated-Prevention-of-Ransomware-with-Machine-Learning-and-GPOs
 
Design Patterns For Real Time Streaming Data Analytics
Design Patterns For Real Time Streaming Data AnalyticsDesign Patterns For Real Time Streaming Data Analytics
Design Patterns For Real Time Streaming Data Analytics
 
Providence Future of Data Meetup - Apache Metron Open Source Cybersecurity Pl...
Providence Future of Data Meetup - Apache Metron Open Source Cybersecurity Pl...Providence Future of Data Meetup - Apache Metron Open Source Cybersecurity Pl...
Providence Future of Data Meetup - Apache Metron Open Source Cybersecurity Pl...
 
Solving Cyber at Scale
Solving Cyber at ScaleSolving Cyber at Scale
Solving Cyber at Scale
 
What's Next for Google's BigTable
What's Next for Google's BigTableWhat's Next for Google's BigTable
What's Next for Google's BigTable
 
Threat Hunting for Command and Control Activity
Threat Hunting for Command and Control ActivityThreat Hunting for Command and Control Activity
Threat Hunting for Command and Control Activity
 
Managing your Black Friday Logs NDC Oslo
Managing your  Black Friday Logs NDC OsloManaging your  Black Friday Logs NDC Oslo
Managing your Black Friday Logs NDC Oslo
 
Managing your black friday logs Voxxed Luxembourg
Managing your black friday logs Voxxed LuxembourgManaging your black friday logs Voxxed Luxembourg
Managing your black friday logs Voxxed Luxembourg
 

En vedette

Apache kafka performance(throughput) - without data loss and guaranteeing dat...
Apache kafka performance(throughput) - without data loss and guaranteeing dat...Apache kafka performance(throughput) - without data loss and guaranteeing dat...
Apache kafka performance(throughput) - without data loss and guaranteeing dat...SANG WON PARK
 
Monitor all the cloud things - security monitoring for everyone
Monitor all the cloud things - security monitoring for everyoneMonitor all the cloud things - security monitoring for everyone
Monitor all the cloud things - security monitoring for everyoneDuncan Godfrey
 
Stormshield Visibility Center
Stormshield Visibility CenterStormshield Visibility Center
Stormshield Visibility CenterNRC
 
What's new in oracle ORAchk & EXAchk 12.2.0.1.2
What's new in oracle ORAchk & EXAchk 12.2.0.1.2What's new in oracle ORAchk & EXAchk 12.2.0.1.2
What's new in oracle ORAchk & EXAchk 12.2.0.1.2Gareth Chapman
 
Docker in Production, Look No Hands! by Scott Coulton
Docker in Production, Look No Hands! by Scott CoultonDocker in Production, Look No Hands! by Scott Coulton
Docker in Production, Look No Hands! by Scott CoultonDocker, Inc.
 
150430 regiosessie corv_almelo
150430 regiosessie corv_almelo150430 regiosessie corv_almelo
150430 regiosessie corv_almeloKING
 
Performance monitoring and call tracing in microservice environments
Performance monitoring and call tracing in microservice environmentsPerformance monitoring and call tracing in microservice environments
Performance monitoring and call tracing in microservice environmentsMartin Gutenbrunner
 
Performance Benchmarking of Clouds Evaluating OpenStack
Performance Benchmarking of Clouds                Evaluating OpenStackPerformance Benchmarking of Clouds                Evaluating OpenStack
Performance Benchmarking of Clouds Evaluating OpenStackPradeep Kumar
 
Exponentiële groei v2
Exponentiële groei v2Exponentiële groei v2
Exponentiële groei v2guest6b41899
 
A BRIEF OVERVIEW ON WILDLIFE MANAGEMENT
A BRIEF OVERVIEW ON WILDLIFE MANAGEMENTA BRIEF OVERVIEW ON WILDLIFE MANAGEMENT
A BRIEF OVERVIEW ON WILDLIFE MANAGEMENTPintu Kabiraj
 
E-commerce Berlin Expo - Divante - Anna Lankauf
E-commerce Berlin Expo - Divante - Anna LankaufE-commerce Berlin Expo - Divante - Anna Lankauf
E-commerce Berlin Expo - Divante - Anna LankaufE-Commerce Berlin EXPO
 
(ARC401) Cloud First: New Architecture for New Infrastructure
(ARC401) Cloud First: New Architecture for New Infrastructure(ARC401) Cloud First: New Architecture for New Infrastructure
(ARC401) Cloud First: New Architecture for New InfrastructureAmazon Web Services
 
Elks for analysing performance test results - Helsinki QA meetup
Elks for analysing performance test results - Helsinki QA meetupElks for analysing performance test results - Helsinki QA meetup
Elks for analysing performance test results - Helsinki QA meetupAnoop Vijayan
 
Docker experience @inbotapp
Docker experience @inbotappDocker experience @inbotapp
Docker experience @inbotappJilles van Gurp
 
Cloud Expo New York: OpenFlow Is SDN Yet SDN Is Not Only OpenFlow
Cloud Expo New York: OpenFlow Is SDN Yet SDN Is Not Only OpenFlowCloud Expo New York: OpenFlow Is SDN Yet SDN Is Not Only OpenFlow
Cloud Expo New York: OpenFlow Is SDN Yet SDN Is Not Only OpenFlowCohesive Networks
 
Evolutions et nouveaux outils SEO
Evolutions et nouveaux outils SEOEvolutions et nouveaux outils SEO
Evolutions et nouveaux outils SEODimitri Brunel
 
Cloud adoption patterns April 11 2016
Cloud adoption patterns April 11 2016Cloud adoption patterns April 11 2016
Cloud adoption patterns April 11 2016Kyle Brown
 
How to Build a High Performance Application Using Cloud Foundry and Redis (Cl...
How to Build a High Performance Application Using Cloud Foundry and Redis (Cl...How to Build a High Performance Application Using Cloud Foundry and Redis (Cl...
How to Build a High Performance Application Using Cloud Foundry and Redis (Cl...VMware Tanzu
 

En vedette (20)

Diabetes mellitus
Diabetes mellitusDiabetes mellitus
Diabetes mellitus
 
Apache kafka performance(throughput) - without data loss and guaranteeing dat...
Apache kafka performance(throughput) - without data loss and guaranteeing dat...Apache kafka performance(throughput) - without data loss and guaranteeing dat...
Apache kafka performance(throughput) - without data loss and guaranteeing dat...
 
Monitor all the cloud things - security monitoring for everyone
Monitor all the cloud things - security monitoring for everyoneMonitor all the cloud things - security monitoring for everyone
Monitor all the cloud things - security monitoring for everyone
 
Stormshield Visibility Center
Stormshield Visibility CenterStormshield Visibility Center
Stormshield Visibility Center
 
What's new in oracle ORAchk & EXAchk 12.2.0.1.2
What's new in oracle ORAchk & EXAchk 12.2.0.1.2What's new in oracle ORAchk & EXAchk 12.2.0.1.2
What's new in oracle ORAchk & EXAchk 12.2.0.1.2
 
Veselík 1
Veselík 1Veselík 1
Veselík 1
 
Docker in Production, Look No Hands! by Scott Coulton
Docker in Production, Look No Hands! by Scott CoultonDocker in Production, Look No Hands! by Scott Coulton
Docker in Production, Look No Hands! by Scott Coulton
 
150430 regiosessie corv_almelo
150430 regiosessie corv_almelo150430 regiosessie corv_almelo
150430 regiosessie corv_almelo
 
Performance monitoring and call tracing in microservice environments
Performance monitoring and call tracing in microservice environmentsPerformance monitoring and call tracing in microservice environments
Performance monitoring and call tracing in microservice environments
 
Performance Benchmarking of Clouds Evaluating OpenStack
Performance Benchmarking of Clouds                Evaluating OpenStackPerformance Benchmarking of Clouds                Evaluating OpenStack
Performance Benchmarking of Clouds Evaluating OpenStack
 
Exponentiële groei v2
Exponentiële groei v2Exponentiële groei v2
Exponentiële groei v2
 
A BRIEF OVERVIEW ON WILDLIFE MANAGEMENT
A BRIEF OVERVIEW ON WILDLIFE MANAGEMENTA BRIEF OVERVIEW ON WILDLIFE MANAGEMENT
A BRIEF OVERVIEW ON WILDLIFE MANAGEMENT
 
E-commerce Berlin Expo - Divante - Anna Lankauf
E-commerce Berlin Expo - Divante - Anna LankaufE-commerce Berlin Expo - Divante - Anna Lankauf
E-commerce Berlin Expo - Divante - Anna Lankauf
 
(ARC401) Cloud First: New Architecture for New Infrastructure
(ARC401) Cloud First: New Architecture for New Infrastructure(ARC401) Cloud First: New Architecture for New Infrastructure
(ARC401) Cloud First: New Architecture for New Infrastructure
 
Elks for analysing performance test results - Helsinki QA meetup
Elks for analysing performance test results - Helsinki QA meetupElks for analysing performance test results - Helsinki QA meetup
Elks for analysing performance test results - Helsinki QA meetup
 
Docker experience @inbotapp
Docker experience @inbotappDocker experience @inbotapp
Docker experience @inbotapp
 
Cloud Expo New York: OpenFlow Is SDN Yet SDN Is Not Only OpenFlow
Cloud Expo New York: OpenFlow Is SDN Yet SDN Is Not Only OpenFlowCloud Expo New York: OpenFlow Is SDN Yet SDN Is Not Only OpenFlow
Cloud Expo New York: OpenFlow Is SDN Yet SDN Is Not Only OpenFlow
 
Evolutions et nouveaux outils SEO
Evolutions et nouveaux outils SEOEvolutions et nouveaux outils SEO
Evolutions et nouveaux outils SEO
 
Cloud adoption patterns April 11 2016
Cloud adoption patterns April 11 2016Cloud adoption patterns April 11 2016
Cloud adoption patterns April 11 2016
 
How to Build a High Performance Application Using Cloud Foundry and Redis (Cl...
How to Build a High Performance Application Using Cloud Foundry and Redis (Cl...How to Build a High Performance Application Using Cloud Foundry and Redis (Cl...
How to Build a High Performance Application Using Cloud Foundry and Redis (Cl...
 

Hadoop / Spark on Malware Expression

  • 1. © 2014 MapR Technologies
  • 2. Objective: Topics covered in this talk
      • Advanced Persistent Threat (APT)
      • Big Data + Threat Intelligence
      • Hadoop + Spark Solution
      • Example Detection Algorithm Development Scenarios (most of them are still open problems)
  • 3. Advanced Persistent Threat
  • 4. APT: Overview
      • Advanced Persistent Threat (APT) is one of the biggest headaches in IT departments
          – The Target compromise
          – Countless DDoS attacks (thousands a day, according to Arbor Networks)
          – These are only the known cases, which could be just the tip of the iceberg
      • Why is APT so prevalent?
          – Hacking is no longer just a hobby for smart hackers
          – Huge money is involved, sometimes with organized crime behind it
          – Political tool (the recent conflict between Ukraine and Russia sparked malware warfare between them)
          – Cyber warfare (Stuxnet)
  • 5. APT: Status
      • Hard to detect
          – More software stacks appear every day without thorough vulnerability testing
              • Storm, Spark, YARN, Grails, Play, Spring, Flask, …
          – The mobile area is even worse
              • Particularly Android
              • Some estimates say 30% or more of devices worldwide are already compromised
      • Anti-virus is useful only up to a certain point
          – It takes months to years to define a malware signature
          – Zero-day attacks are still unpreventable
          – It has become almost a placebo
      • Firewalls are not very useful anymore
          – A device can be infected when the user takes it outside the firewall perimeter
      • Botnets themselves are becoming more complex, with many hierarchies
          – Minimal binary delivery
          – Surreptitious C&C connections with a complex hierarchy, or even headless peer-to-peer bots (Gameover Zeus botnet)
  • 6. APT: Defense, state of the art
      • Snort / Suricata
          – Rule-based system
          – Community support, pre/post-compromise detection
          – Constant updates are needed; cannot detect zero-day attacks
          – Sourcefire provides a paid service
      • Sandbox technology
          – Firewall + on-premise detection
          – FireEye
      • Polymorphing technology
          – ShapeSecurity
      • Log data mining based methods
          – Splunk / Sumo Logic, Solutionary
  • 7. APT: Threat Report
      • Many security labs worldwide run malware labs and generate threat reports
      • The analysis takes from 2 weeks to months
      • It involves
          – Decoding binary execution and decrypting load / config parameters
          – Complete timeline analysis, from infection to exploit
          – Identifying which devices, IPs, and domain names are involved
              • Sometimes analyzing IRC data, or even social network data
          – Tracing the botnet connection and verifying the command and control
      • Can we automate this with Big Data?
  • 8. APT: Example Annual Threat Report (from FireEye, 2013, Europe)
      • The top two industries in threat findings were healthcare and finance
  • 9. APT: Example Threat Report (from FireEye)
      • Configuration (decrypted)
          – ID: F16 08-07-2013
          – Group:
          – DNS/Port: Direct: toornt.servegame.com:443,
          – Proxy DNS/Port:
          – Proxy Hijack: No
          – ActiveX Startup Key:
          – HKLM Startup Entry:
          – File Name:
          – Install Path: C:\Documents and Settings\Admin\Local Settings\Temp\morse.exe
          – Keylog Path: C:\Documents and Settings\Admin\Local Settings\Temp\morse
          – Inject: No
          – Process Mutex: gdfgdfgdg
          – Key Logger Mutex:
          – ActiveX Startup: No
          – HKLM Startup: No
          – Copy To: No
          – Melt: No
          – Persistence: No
          – Keylogger: No
          – Password: !@#GooD#@!
      • C&C servers
          – toornt.servegame.com
          – updateo.servegame.com
          – egypttv.sytes.net
          – skype.servemp3.com
          – natco2.no-ip.net
      • Why does it need a password?
  • 10. APT: Another Example, Fiesta EK, from malware-traffic-analysis.net
      • CHAIN OF EVENTS
      • ASSOCIATED DOMAINS
          – 192.81.171.13 - www.toonzone.net - Compromised website
          – 190.123.47.198 - ilinsting.com - Redirect
          – 64.202.116.124 - bgbyhn.in.ua - Fiesta EK
      • INFECTION CHAIN OF EVENTS
          – 06:40:07 UTC - www.toonzone.net - GET /forums/adult-swim-toonami-forum/
          – 06:40:08 UTC - ilinsting.com - GET /szjhmucw.js?3ad1359a5153d640
          – 06:40:09 UTC - bgbyhn.in.ua - GET /hdjng94/?2
          – 06:40:11 UTC - bgbyhn.in.ua - GET /hdjng94/?25b6d1b1cb76ec625b500e0d560a50040703520d5053520a0706510355090109
          – 06:40:12 UTC - bgbyhn.in.ua - GET /hdjng94/?2d8a97d01a056fdd41084e5a0b0c56050752085a0d55540b07570b54080f0708;5110411
          – 06:40:14 UTC - bgbyhn.in.ua - GET /hdjng94/?02bb88c62d7306c8534209590a035103050452590c5a530d0501515709000057;5
          – 06:40:15 UTC - bgbyhn.in.ua - GET /hdjng94/?02bb88c62d7306c8534209590a035103050452590c5a530d0501515709000057;5;1
          – 06:40:42 UTC - bgbyhn.in.ua - GET /hdjng94/?2ad5cdef3fc4ef9851110f0e515f57530757540e5706555d07525700525c065e;6
          – 06:40:43 UTC - bgbyhn.in.ua - GET /hdjng94/?2ad5cdef3fc4ef9851110f0e515f57530757540e5706555d07525700525c065e;6;1
          – 06:40:43 UTC - bgbyhn.in.ua - GET /hdjng94/?5998786b9c7a1ffe544b580305030457000f0903035a0659000a0a0d0600555a
          – 06:40:49 UTC - bgbyhn.in.ua - GET /hdjng94/?59576b00f4cfd03e5641500c04590205000f050c0200000b000a0602075a5308;1;2
          – 06:40:49 UTC - bgbyhn.in.ua - GET /hdjng94/?59576b00f4cfd03e5641500c04590205000f050c0200000b000a0602075a5308;1;2;1
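A chain of events like the one on slide 10 is regular enough to be parsed mechanically, which is a first step toward automating the timeline part of a threat report. The following is a minimal sketch, assuming each event is written as "HH:MM:SS UTC - host - METHOD path"; the names EVENT_RE, parse_event, and summarize are hypothetical helpers and not part of the talk.

```python
import re
from collections import Counter
from datetime import datetime

# Assumed line layout: "HH:MM:SS UTC - host - METHOD path" (as on slide 10).
EVENT_RE = re.compile(r"^(\d{2}:\d{2}:\d{2}) UTC - (\S+) - (GET|POST) (\S+)$")


def parse_event(line):
    """Return (time, host, method, path), or None if the line does not match."""
    match = EVENT_RE.match(line.strip())
    if match is None:
        return None
    time_text, host, method, path = match.groups()
    return datetime.strptime(time_text, "%H:%M:%S"), host, method, path


def summarize(lines):
    """Return hosts in order of first contact, request counts per host, and the time span in seconds."""
    events = [e for e in map(parse_event, lines) if e is not None]
    hosts_in_order = []
    for _, host, _, _ in events:
        if host not in hosts_in_order:
            hosts_in_order.append(host)
    counts = Counter(host for _, host, _, _ in events)
    span = (events[-1][0] - events[0][0]).total_seconds() if events else 0.0
    return hosts_in_order, counts, span


# The first three events from the slide.
SAMPLE = [
    "06:40:07 UTC - www.toonzone.net - GET /forums/adult-swim-toonami-forum/",
    "06:40:08 UTC - ilinsting.com - GET /szjhmucw.js?3ad1359a5153d640",
    "06:40:09 UTC - bgbyhn.in.ua - GET /hdjng94/?2",
]

hosts, counts, span = summarize(SAMPLE)
print(hosts)   # compromised site, then redirect, then exploit kit host
print(span)    # seconds between the first and last event
```

The order of first contact reproduces the compromised site, redirect, exploit kit sequence listed under the associated domains, and the short time span shows how quickly the redirect chain completes.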
  • 11. Big Data + Threat Intelligence
  • 12. Big Data + Threat Intelligence: Pros
      • Tom Brady + Gisele Bundchen
          – An ideal marriage
      • With all the advances in computing and data resources, why can't we automate malware detection?
      • Big Data is an ideal platform for malware study
          – Simple packet capture can easily produce petabytes of data, even from small offices
          – Huge storage + fast processing is essential for malware study
      • Various aspects of Big Data fit well with malware analysis
          – Streaming analysis (Storm, Spark Streaming)
          – Volumetric data analysis (Spark)
          – Graph analysis
              • View network devices as nodes, discover command and control roles
              • Each URL can be a node and the basis of graph analysis
          – Visualization for intuitive analysis
  • 13. Big Data + Threat Intelligence: Cons (e.g., Gwyneth Paltrow and Chris Martin)
      • Anomaly detection
          – Typical log analysis
          – Routers / switches have built-in alarm settings
              • Simple level-based detection
          – Is this going to be useful?
              • How much can you tell?
      • Machine learning
          – Not very useful
              • Not easy to get labeled data
              • Even with labeled data, it is very hard to develop a feature set
                  – If the feature set is known, hackers will revise their code
              • Zero-day attacks do not come with a label
          – Modeling needs a complete understanding of criminal minds
  • 14. Big Data + Threat Intelligence: An Example Architecture • Storm Spout: packet stream or binary downloads • Storm Bolt: packet analysis; alert and store packet data • Storm Bolt: metadata extraction • Store to HDFS • Spark analysis • A packet stream reveals malware expression far more faithfully than logs do • Connect the dots with strong in-memory processing
  • 15. Big Data + Threat Intelligence: False Positives • Reduce false positives – The mantra of the malware detection business • Big Data is a great resource for reducing false positives (Type I errors) – As soon as an algorithm is updated, test it against the Big Data test cases – The test can even be applied to old cases, greatly reducing false positives • Without that capacity, we typically had to sample the test data, weighting old data lower
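The "test every algorithm update against the archived cases" idea can be sketched as a small regression check. Everything here is an assumption made for illustration: the two detector versions, their thresholds, and the tiny benign archive are hypothetical, and a real archive would live in HDFS rather than in a list.

```python
import re


def false_positive_rate(detector, benign_samples):
    """Fraction of known-benign samples that the detector flags."""
    if not benign_samples:
        return 0.0
    flagged = sum(1 for sample in benign_samples if detector(sample))
    return flagged / len(benign_samples)


def detector_v1(url):
    # Hypothetical first version: flags any long URL.
    return len(url) > 40


def detector_v2(url):
    # Hypothetical update: a long URL must also carry a long hexadecimal run.
    return len(url) > 40 and re.search(r"[0-9a-f]{32,}", url) is not None


# Hypothetical archive of traffic already confirmed benign.
benign_archive = [
    "http://example.com/index.html",
    "http://example.com/search?q=" + "x" * 50,
    "http://example.org/",
    "http://example.net/a?b=1",
]
# One synthetic malicious-looking URL, to check that detection is not lost.
malicious_url = "http://malicious.example/ek/?" + "0a" * 32

print("v1 false positive rate:", false_positive_rate(detector_v1, benign_archive))
print("v2 false positive rate:", false_positive_rate(detector_v2, benign_archive))
print("v2 still detects the sample:", detector_v2(malicious_url))
```

Running the full archive on every change, rather than a weighted sample, is what makes the comparison between versions meaningful.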
  • 16. Big Data + Threat Intelligence: Packet to HDFS • Wireshark (tshark) is the go-to software for packet analysis – But it is a huge memory hog • Need to put packet data onto HDFS • Packetpig (introduced on the Hortonworks blog) is a start – Much more has to be done before it comes anywhere near the strength of Wireshark • Need to design efficient metadata collection and storage mechanisms – Use Snort or a custom C library to extract the essential flow data • A flow is identified by a 5-tuple: source IP, destination IP, source port, destination port, protocol • The flow is the de facto unit of network malware expression analysis
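A minimal sketch of the flow idea on this slide: packets are grouped by the 5-tuple and reduced to a few counters, which is the kind of metadata one might store instead of raw captures. The packet record layout (`ts`, `size`, and the address fields) is an assumption for illustration, and flows are treated as unidirectional.

```python
from collections import namedtuple

# The 5-tuple that identifies a (unidirectional) flow.
FlowKey = namedtuple("FlowKey", "src_ip dst_ip src_port dst_port protocol")


def aggregate_flows(packets):
    """Group packet records into flows keyed by the 5-tuple and sum basic statistics."""
    flows = {}
    for pkt in packets:
        key = FlowKey(pkt["src_ip"], pkt["dst_ip"],
                      pkt["src_port"], pkt["dst_port"], pkt["protocol"])
        stats = flows.get(key)
        if stats is None:
            flows[key] = {"packets": 1, "bytes": pkt["size"],
                          "first_ts": pkt["ts"], "last_ts": pkt["ts"]}
        else:
            stats["packets"] += 1
            stats["bytes"] += pkt["size"]
            stats["first_ts"] = min(stats["first_ts"], pkt["ts"])
            stats["last_ts"] = max(stats["last_ts"], pkt["ts"])
    return flows


def packet(ts, src_ip, src_port, dst_ip, dst_port, protocol, size):
    """Build a hypothetical packet record (timestamp in seconds, size in bytes)."""
    return {"ts": ts, "src_ip": src_ip, "src_port": src_port,
            "dst_ip": dst_ip, "dst_port": dst_port,
            "protocol": protocol, "size": size}


packets = [
    packet(0.0, "10.0.0.5", 51000, "192.0.2.10", 80, "tcp", 400),
    packet(0.2, "10.0.0.5", 51000, "192.0.2.10", 80, "tcp", 1500),
    packet(0.3, "192.0.2.10", 80, "10.0.0.5", 51000, "tcp", 900),   # reverse direction
    packet(1.0, "10.0.0.5", 53123, "192.0.2.53", 53, "udp", 70),
]

flows = aggregate_flows(packets)
for key, stats in flows.items():
    print(key, stats)
```

The per-flow duration (`last_ts - first_ts`), packet count, and byte count correspond to the frequency, duration, and bandwidth measures discussed later in the deck.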
  • 17. Big Data + Threat Intelligence: IP-Based Analysis • Big Data provides the opportunity to map out all the IP addresses used on a particular network • Through graph analysis, find rogue IP addresses • Combine geographical information with IP addresses to find abnormal connection behavior • DNS provides many insights into malware connections – A static IP is impractical for malware control – Fast flux – Awkward domain names
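One way to look for the fast-flux behavior mentioned on this slide is to count how many distinct addresses a domain resolves to while its TTLs stay short. The sketch below assumes DNS answers are available as hypothetical (domain, ip, ttl) tuples; the `min_ips` and `max_ttl` thresholds are illustrative guesses, not recommended values.

```python
from collections import defaultdict


def fast_flux_candidates(dns_answers, min_ips=4, max_ttl=300):
    """Return domains seen with many distinct IPs and only short TTLs, sorted by name."""
    ips = defaultdict(set)
    longest_ttl = defaultdict(int)
    for domain, ip, ttl in dns_answers:
        ips[domain].add(ip)
        longest_ttl[domain] = max(longest_ttl[domain], ttl)
    return sorted(
        domain for domain in ips
        if len(ips[domain]) >= min_ips and longest_ttl[domain] <= max_ttl
    )


# Hypothetical DNS answers: (domain, resolved IP, TTL in seconds).
dns_answers = [
    ("flux.example", "203.0.113.1", 60),
    ("flux.example", "203.0.113.2", 60),
    ("flux.example", "203.0.113.3", 120),
    ("flux.example", "203.0.113.4", 60),
    ("flux.example", "203.0.113.5", 60),
    ("www.example.com", "192.0.2.80", 3600),
    ("www.example.com", "192.0.2.80", 3600),
    # Many addresses but long TTLs: not flagged by this rule.
    ("static.example.net", "198.51.100.1", 86400),
    ("static.example.net", "198.51.100.2", 86400),
    ("static.example.net", "198.51.100.3", 86400),
    ("static.example.net", "198.51.100.4", 86400),
]

print(fast_flux_candidates(dns_answers))
```

Content delivery networks also rotate addresses with short TTLs, so a rule like this would need additional evidence (for example, geography or address ownership) before raising an alert.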
  • 18. Big Data + Threat Intelligence: Flow to Detect Malware Expression • The flow is an essential unit of malware analysis • A flow identifies – Who is connecting to whom • Frequency, duration, communication bandwidth • The application can be identified from the flow – Port, actual content – E.g., Palo Alto Networks • Normal flow vs. abnormal flow – With enough data, we could potentially identify normal flows • Use the first 16 bytes? – Cluster analysis to detect anomalies
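As a simple stand-in for the cluster analysis named on this slide, the sketch below flags flows whose byte count sits far from the bulk of the data, using the median absolute deviation (MAD). This is a swapped-in, one-dimensional technique chosen for brevity, not the deck's algorithm; the sample byte counts and the 3.5 cutoff are assumptions.

```python
from statistics import median


def mad_outliers(values, threshold=3.5):
    """Return indices of values whose modified z-score exceeds the threshold.

    The modified z-score is 0.6745 * |x - median| / MAD, where MAD is the
    median of the absolute deviations from the median.
    """
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # All (or most) values are identical; the score is undefined.
        return []
    return [
        i for i, v in enumerate(values)
        if 0.6745 * abs(v - med) / mad > threshold
    ]


# Hypothetical bytes-per-flow for one host talking to one service.
bytes_per_flow = [500, 520, 480, 510, 495, 505, 50000]

print(mad_outliers(bytes_per_flow))
```

A fuller treatment would cluster several flow features at once (duration, packet count, first payload bytes) and treat points far from every cluster as anomalies.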
  • 19. Spark on Hadoop
  • 20. Apache Spark • spark.apache.org • github.com/apache/spark • user@spark.apache.org • Originally developed in 2009 at UC Berkeley’s AMPLab • Fully open sourced in 2010; now at the Apache Software Foundation
  • 21. Easy: Example – IP Count • Hadoop MapReduce public static class WordCountMapClass extends MapReduceBase implements Mapper<LongWritable, Text, Text, IntWritable> { private final static IntWritable one = new IntWritable(1); private Text word = new Text(); public void map(LongWritable key, Text value, OutputCollector<Text, IntWritable> output, Reporter reporter) throws IOException { String line = value.toString(); StringTokenizer itr = new StringTokenizer(line); while (itr.hasMoreTokens()) { word.set(itr.nextToken()); output.collect(word, one); } } } public static class WordCountReduce extends MapReduceBase implements Reducer<Text, IntWritable, Text, IntWritable> { public void reduce(Text key, Iterator<IntWritable> values, OutputCollector<Text, IntWritable> output, Reporter reporter) throws IOException { int sum = 0; while (values.hasNext()) { sum += values.next().get(); } output.collect(key, new IntWritable(sum)); } } • Spark val spark = new SparkContext(master, appName /* optionally: sparkHome, jars */); val file = spark.textFile("hdfs://..."); /* the IP is the first comma-separated field of each line */ val counts = file.map(line => line.split(",")(0)).map(ip => (ip, 1)).reduceByKey(_ + _); counts.saveAsTextFile("hdfs://...")
  • 22. Fast: Using RAM, Operator Graphs • In-memory caching – Data partitions are read from RAM instead of disk • Operator graphs – Scheduling optimizations – Fault tolerance • [Figure: operator graph of RDDs A to F linked by map, filter, groupBy, and join, split into Stages 1 to 3; the legend marks RDDs and cached partitions]
  • 23. Spark RDD • The Resilient Distributed Dataset (RDD) is Spark’s key data structure, held in memory where possible • An RDD is distributed over the Hadoop nodes and typically resides in memory • Transform an RDD, then get data out of it; evaluation is lazy – Two sets of interfaces are provided: one for transformations, the other for actions (e.g., count, save) • Most of the interface is quite similar to Lisp and SQL operations • Use persist (cache) to keep an RDD in memory
  • 24. RDD
  • 25. Working With RDDs • [Diagram: transformations turn RDDs into further RDDs; an action turns an RDD into a value] • textFile = sc.textFile("SomeFile.txt") • linesWithSpark = textFile.filter(lambda line: "Spark" in line) • linesWithSpark.count() returns 74 • linesWithSpark.first() returns "# Apache Spark"
  • 26. Spark, Hadoop Malware Analysis: Why Useful • Packet stream • Construct groups of suspected flows in an RDD – E.g., suspected DNS tunnels, IRC communications • Analyze with Spark on the RDD, in memory – Connect the dots across flows, syslogs, and events – A huge advantage over Wireshark • Store in HDFS for easy access and use HBase for database support • Real-time event processing • Fast classification or anomaly detection
  • 27. Spark and Hadoop: Why Hadoop • Connecting the dots needs huge storage and fast access – Potential need to go back in time to find correlated events • E.g., a DDoS attack found today + spotty IRC chat 10 days ago + NXDOMAIN events from the suspected infected machine 20 days ago – Sometimes it takes months to learn that a domain the machine contacted is suspicious (e.g., scored in VirusTotal) – Then see whether these patterns match known malware expressions – Approximate matching technology is quite important here » HMM and correlation modeling – HDFS + HBase would be a good solution • Store relevant temporal data • Retrieve it quickly according to the criteria • Spark + Hadoop provide a fast development cycle – From prototype to evaluation
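The "go back in time" step on this slide can be sketched as a look-back query: given a triggering event, collect earlier events from the same host within a window. The event layout (`ts`, `host`, `kind`), the event names, and the 30-day window are assumptions for illustration; in the architecture described here the events would come from HDFS or HBase rather than a list.

```python
from datetime import datetime, timedelta


def look_back(events, trigger, window_days=30):
    """Return earlier events from the trigger's host within the window, oldest first."""
    window_start = trigger["ts"] - timedelta(days=window_days)
    related = [
        e for e in events
        if e["host"] == trigger["host"] and window_start <= e["ts"] < trigger["ts"]
    ]
    return sorted(related, key=lambda e: e["ts"])


today = datetime(2014, 6, 30, 12, 0)

# Hypothetical stored events for two internal hosts.
events = [
    {"ts": today - timedelta(days=45), "host": "10.0.0.7", "kind": "port_scan"},
    {"ts": today - timedelta(days=20), "host": "10.0.0.7", "kind": "nxdomain_burst"},
    {"ts": today - timedelta(days=10), "host": "10.0.0.7", "kind": "irc_chat"},
    {"ts": today - timedelta(days=10), "host": "10.0.0.8", "kind": "irc_chat"},
    {"ts": today, "host": "10.0.0.7", "kind": "ddos"},
]

trigger = events[-1]
timeline = look_back(events, trigger)
print([e["kind"] for e in timeline])
```

The resulting timeline is what would then be compared, approximately, against known malware expression patterns.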
  • 28. Example Detection Algorithm Development Scenarios
  • 29. Introduction to Botnet (Terminology) • [Diagram: Bot Master, Bots, Code Server, IRC Server, Victim; labels: IRC Channel, Attack, C&C Traffic, Updates] • Old-days botnet operation, just for reference • Companies are interested in finding these on their premises
  • 30. (Malware Expression) Detection Phases • Pre-Infection Detection – Intrusion Detection System • Active Infection Detection – Recruitment and reconnaissance in the internal network • Post-Infection Detection – Exploit and monetize
  • 31. Pre-Infection Detection: CAMP • Detect suspicious URLs – When a device tries to contact or download from a suspicious URL, block it • How it works – If suspicious or unknown content is detected, send it to the back-end Big Data deep analysis engine – Update the suspicious IPs / domain names / URLs – Update the hashes of binaries – Regularly remove old hashes and suspicious URLs
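The update-and-expire cycle listed on this slide can be sketched as a small in-memory blocklist. This is an illustration under stated assumptions, not the CAMP design: the `Blocklist` class, its method names, the use of SHA-256 for binary hashes, and the 30-day retention are all hypothetical choices.

```python
import hashlib


class Blocklist:
    """Indicators (URLs, domains, IPs, binary hashes) with last-seen timestamps."""

    def __init__(self, max_age_seconds):
        self.max_age_seconds = max_age_seconds
        self._last_seen = {}

    def add(self, indicator, now):
        """Insert or refresh an indicator; now is a timestamp in seconds."""
        self._last_seen[indicator] = now

    def add_binary(self, content, now):
        """Store the SHA-256 digest of a suspicious binary and return it."""
        digest = hashlib.sha256(content).hexdigest()
        self.add(digest, now)
        return digest

    def is_blocked(self, indicator):
        return indicator in self._last_seen

    def purge(self, now):
        """Remove indicators not refreshed within max_age_seconds; return how many."""
        stale = [k for k, seen in self._last_seen.items()
                 if now - seen > self.max_age_seconds]
        for k in stale:
            del self._last_seen[k]
        return len(stale)


DAY = 86400
blocklist = Blocklist(max_age_seconds=30 * DAY)
blocklist.add("http://malicious.example/ek/", now=0)
digest = blocklist.add_binary(b"hypothetical payload bytes", now=20 * DAY)

removed = blocklist.purge(now=40 * DAY)
print("removed:", removed)
print("URL still blocked:", blocklist.is_blocked("http://malicious.example/ek/"))
print("hash still blocked:", blocklist.is_blocked(digest))
```

Refreshing the timestamp on every new sighting keeps active indicators alive while letting abandoned infrastructure age out.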
  • 32. Ongoing Infection Detection: In-Network Infection Propagation • How it works – Detect suspicious internal behavior – Develop a normal behavioral model for the target customer site – Detect abnormal authentication behavior, e.g., Kerberos, LDAP – Detect suspicious data movement – Detect suspicious port usage – Detect tunnels • It is highly important to leverage Big Data to develop a sustainable normal behavioral model and keep it constantly updated, because network data and models are constantly changing • Consult security experts to define the measurement points
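A very small version of a "normal behavioral model" for port usage: record which destination ports each host used during a baseline period, then report anything new. The record layout of (host, destination port) pairs and the sample values are assumptions; a real model would also need the constant updating the slide calls for.

```python
from collections import defaultdict


def build_baseline(history):
    """Map each host to the set of destination ports seen during the baseline period."""
    baseline = defaultdict(set)
    for host, port in history:
        baseline[host].add(port)
    return dict(baseline)


def unusual_ports(baseline, observations):
    """Return (host, port) pairs absent from the baseline, in first-seen order."""
    seen = set()
    result = []
    for host, port in observations:
        pair = (host, port)
        if port not in baseline.get(host, set()) and pair not in seen:
            seen.add(pair)
            result.append(pair)
    return result


# Hypothetical baseline traffic: (host, destination port).
history = [
    ("10.0.0.5", 80), ("10.0.0.5", 443),
    ("10.0.0.6", 88), ("10.0.0.6", 389),   # Kerberos and LDAP
]

# Hypothetical new observations.
observations = [
    ("10.0.0.5", 443),
    ("10.0.0.5", 6667),   # IRC port, never seen for this host
    ("10.0.0.5", 6667),
    ("10.0.0.9", 22),     # host with no baseline at all
]

baseline = build_baseline(history)
print(unusual_ports(baseline, observations))
```

Hosts with no baseline are reported for every port they use, which is usually the desired behavior for devices that appear on the network unannounced.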
  • 33. Post-Infection Detection: Protocol Abnormality • HTTP and DNS are the most frequently abused protocols – Firewalls let traffic on these ports through – If needed, act as a man in the middle to inspect SSL data • Ill-formed HTTP header detection – Abnormal Location – Abnormal Referer – Abnormal User-Agent – Abnormal size • Abnormal HTTP POST detection (e.g., entropy analysis) • Ill-formed XML / HTML • SQL injection – SELECT * FROM users WHERE name = '' OR '1'='1'; • LDAP code injection • Workflow: collect malware expression samples, develop a feature set with Hadoop and SMEs, then deploy and continually update the model
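The entropy analysis mentioned for HTTP POST bodies can be sketched with Shannon entropy over byte frequencies: ordinary form data scores low, while compressed or encrypted payloads approach 8 bits per byte. The 7.0 threshold and the sample bodies are assumptions for illustration only.

```python
import math
from collections import Counter


def shannon_entropy(data):
    """Shannon entropy of a bytes object, in bits per byte (0.0 for empty input)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def looks_high_entropy(body, threshold=7.0):
    """Flag a POST body whose byte entropy exceeds an (assumed) threshold."""
    return shannon_entropy(body) > threshold


# Hypothetical POST bodies.
form_body = b"user=alice&comment=hello+world"
# Every byte value exactly once: a stand-in for encrypted or compressed data.
opaque_body = bytes(range(256))

print(round(shannon_entropy(form_body), 3), looks_high_entropy(form_body))
print(round(shannon_entropy(opaque_body), 3), looks_high_entropy(opaque_body))
```

Short bodies cap out well below 8 bits (the entropy of n bytes cannot exceed log2 n), and legitimate uploads such as images are also high-entropy, so this works best as one feature among several.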
  • 34. Post-Infection Detection: Volumetric Abnormality • Click fraud • Like fraud • DDoS • Spam
  • 35. Post-Infection Detection: Command and Control Contact • Cadence (regularly timed check-ins) • Weird domain name resolution • Fast-fluxing domain names • Abnormal IRC traffic behavior • Abnormal Twitter behavior • Abnormal Facebook behavior
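Cadence can be measured as the regularity of the gaps between a host's contacts with one destination. The sketch below uses the coefficient of variation (standard deviation divided by mean) of the inter-arrival times: values near zero suggest machine-like periodic check-ins. The timestamps and the 0.1 cutoff are assumptions, and real bots often add jitter precisely to defeat simple tests like this one.

```python
from statistics import mean, pstdev


def cadence_score(timestamps):
    """Coefficient of variation of inter-arrival times, or None if undefined.

    Lower scores mean more regular (beacon-like) contact.
    """
    if len(timestamps) < 3:
        return None
    ordered = sorted(timestamps)
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    average = mean(gaps)
    if average == 0:
        return None
    return pstdev(gaps) / average


def looks_like_beacon(timestamps, max_score=0.1):
    score = cadence_score(timestamps)
    return score is not None and score <= max_score


# Hypothetical contact times in seconds.
periodic = [0, 60, 120, 180, 240]       # a check-in every minute
human_like = [0, 5, 200, 210, 900]      # bursty, irregular browsing

print(cadence_score(periodic), looks_like_beacon(periodic))
print(round(cadence_score(human_like), 2), looks_like_beacon(human_like))
```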
  • 36. DGA (Domain Generation Algorithm) • What features would you use? • ClickSecurity.com
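As one possible answer to the question on this slide, the sketch below computes a few commonly used lexical features of a domain name: length, character entropy, digit ratio, and vowel ratio. The feature choice is an assumption, not a list taken from the talk, and the function looks only at the leftmost label, so callers are assumed to pass the registered domain without a host prefix such as "www".

```python
import math
from collections import Counter

VOWELS = set("aeiou")


def domain_features(domain):
    """Lexical features of the leftmost label of a domain name."""
    label = domain.lower().split(".")[0]
    n = len(label)
    if n == 0:
        raise ValueError("empty domain label")
    counts = Counter(label)
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    return {
        "length": n,
        "entropy": entropy,
        "digit_ratio": sum(ch.isdigit() for ch in label) / n,
        "vowel_ratio": sum(ch in VOWELS for ch in label) / n,
    }


# One domain from the Fiesta EK example earlier in the deck, one ordinary name.
for name in ["bgbyhn.in.ua", "example.com"]:
    print(name, domain_features(name))
```

Features like these would feed a classifier trained on labeled domains; on their own they only separate names that look pronounceable from names that do not.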
  • 37. Conclusion
  • 38. Conclusion • Threat Intelligence and Big Data are very HOT • Big Data is the ideal analysis platform for malware expression analysis – Caution: remember the cons – Useful for efficiently connecting the dots • Big Data enables – Persistent model building and updating – Reducing false positives through exhaustive data checks rather than spot checks • Hadoop / Spark provide an ideal platform for malware expression analysis – Spark provides strong in-memory processing power for complex malware data analysis, with simpler scripting-level coding • Scala – MapR provides the fastest data access on Hadoop nodes • M7 • MapR is the better Hadoop • Don’t underestimate the convenience of NFS and volumes • Questions are welcome; send them to syoon@maprtech.com, mvasquez@maprtech.com, nestrada@maprtech.com