CHARACTERIZING SIGNATURE
SETS FOR TESTING DPI
SYSTEMS
The 3rd IEEE International Workshop on
Management of Emerging Networks and
Services (IEEE MENS 2011)
Rafael Antonello, Stenio Fernandes, Djamel Sadok, Judith
Kelner
Federal University of Pernambuco - UFPE
Recife, Brazil
Outline
• Introduction
• Motivation
• Contribution
• Signature Set Analyzer Framework
• Experimental Results
• Concluding Remarks
Introduction
• Deep Packet Inspection (DPI) systems
  • A key component for accurate network management
  • Look inside the packet payload to find application signatures
  • Signatures are recognizable patterns (similar to those used by an anti-virus system)
• High computational requirements are mainly due to
  • the large number of regular expressions (REs) in the signature sets of modern DPI systems
RegEx to FA
• Analyze the DFA created for recognizing the regular expression (regex) ^\x01[\x08\x09][\x03\x04]
• The size and complexity of signature sets can lead to state-space explosion of the FA
  • This degrades performance
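As a rough illustration of the DFA mentioned on this slide, the sketch below hand-codes a four-state automaton for ^\x01[\x08\x09][\x03\x04] and walks a payload through it one byte at a time. The state numbering, the `TRANSITIONS` table, and the `dfa_matches` helper are illustrative assumptions and do not come from the slides.

```python
# Hand-built DFA for ^\x01[\x08\x09][\x03\x04] (illustrative sketch).
# States: 0 = start, 1 = saw \x01, 2 = saw [\x08\x09], 3 = accept.
TRANSITIONS = {
    (0, 0x01): 1,
    (1, 0x08): 2,
    (1, 0x09): 2,
    (2, 0x03): 3,
    (2, 0x04): 3,
}
ACCEPT = 3


def dfa_matches(payload: bytes) -> bool:
    """Return True if the payload starts with a match for the regex."""
    state = 0
    for byte in payload:
        # A missing entry means the DFA has no transition: reject.
        state = TRANSITIONS.get((state, byte))
        if state is None:
            return False
        if state == ACCEPT:
            return True
    # Payload ended before the accepting state was reached.
    return False


if __name__ == "__main__":
    print(dfa_matches(b"\x01\x08\x03payload"))
    print(dfa_matches(b"\x01\x07\x03payload"))
```

One table lookup per input byte is what makes DFAs attractive for DPI; the cost is the memory needed for the table, which grows with the number of states.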
Introduction
• Challenges:
  • Growing link speeds
    • 40-100 Gbps and beyond
  • Ever-increasing number of Internet applications
• Research effort on optimizing DPI systems
  • New packet capture methods
  • Building efficient automata for representing REs
  • Efficient classifiers
Motivation
• Performance analysis of DPI engines has been done without a common ground
  • That is where the problem arises
• The selected signature bases present
  • Different sizes. Examples:
    • 1.8 Gbps over a set of 268 signatures [17]
    • 1.6 Gbps over a set of 2 signatures [7]
  • Variable complexity
    • For REs, dot-stars (.*) and count constraints (c{n} constructions) can generate very complex DFAs
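To make the last point concrete, here is a minimal sketch of the textbook subset construction applied to the pattern family (a|b)*a(a|b){n}, which combines an unanchored prefix with a count constraint. The pattern, the two-symbol alphabet, and the helper names are assumptions chosen for illustration; they are not taken from the slides. The NFA has n + 2 states, while the number of reachable DFA states doubles with every increment of n.

```python
def nfa_for_counted_pattern(n):
    """NFA transitions for (a|b)*a(a|b){n}; states 0..n+1, state n+1 accepts."""
    delta = {
        (0, "a"): {0, 1},  # stay in the loop or guess that this 'a' is the marker
        (0, "b"): {0},
    }
    for state in range(1, n + 1):
        # Each counted position accepts any symbol and advances by one.
        delta[(state, "a")] = {state + 1}
        delta[(state, "b")] = {state + 1}
    return delta


def count_dfa_states(n):
    """Count DFA states reachable through the subset construction."""
    delta = nfa_for_counted_pattern(n)
    start = frozenset({0})
    seen = {start}
    frontier = [start]
    while frontier:
        current = frontier.pop()
        for symbol in "ab":
            target = frozenset(
                nxt for state in current for nxt in delta.get((state, symbol), ())
            )
            if target not in seen:
                seen.add(target)
                frontier.append(target)
    return len(seen)


if __name__ == "__main__":
    for n in range(6):
        print(n, n + 2, count_dfa_states(n))  # n, NFA states, DFA states
```

The DFA has to remember which of the last n + 1 symbols were an 'a', so a constraint such as {100} or {512} cannot be expanded naively.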
Contribution
• A framework for
  • characterizing the signature sets commonly used to evaluate DPI systems
• An in-depth analysis of signature sets
  • from well-known applications, protocols, and intrusion detection systems
• A classification mechanism for signature sets
  • according to their size, number of sub-patterns, and complexity
SSA Framework - SSAF
Sig-Set Analyzer
SSAF Overview
• First
  • Select representative signature sets
  • Extract the REs
  • Then apply normalization
web-cgi.rules.pcres1 \Wfrom=[^\x3b&\n]{100}
web-cgi.rules.pcres2 pwd=(!|%21)CRYPT(!|%21)[A-Z0-9]{512}
web-cgi.rules.pcres3 evtdump\x3f.*?\x2525[^\x20]*?\x20HTTP
web-cgi.rules.pcres4 ShellExample.cgi?[^\n\r&]*\x2a
web-cgi.rules.pcres5 update=[^\r\n\x26]+
web-cgi.rules.pcres6 awstats.pl?[^\r\n]*configdir=\x7C
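The extraction step could look roughly like the sketch below, which pulls the regular expression out of each `pcre` option in Snort rule text. The `PCRE_OPTION` pattern and `extract_pcres` helper are hypothetical and simplified: they assume the common `/regex/flags` form, skip the alternative `m<delim>` form, and do not reproduce the normalization step, which the slides do not describe. The sample rule is invented for the example.

```python
import re

# Matches options such as  pcre:"/update=[^\r\n\x26]+/i";  (optionally negated with !).
# Simplified assumption: only the /regex/flags delimiter form is handled.
PCRE_OPTION = re.compile(r'pcre:\s*!?\s*"/(.*?)/([A-Za-z]*)"\s*;')


def extract_pcres(rule_text):
    """Return the regex bodies of all pcre options found in the rule text."""
    return [match.group(1) for match in PCRE_OPTION.finditer(rule_text)]


if __name__ == "__main__":
    # Hypothetical rule, written only to exercise the helper.
    sample_rule = (
        r'alert tcp any any -> any 80 (msg:"example"; '
        r'pcre:"/update=[^\r\n\x26]+/i"; sid:1000001;)'
    )
    for regex in extract_pcres(sample_rule):
        print(regex)
```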
• SSA generates:
  • Number of signatures
  • Signature size (avg): average signature size
  • Signature max size: maximum signature size
  • Signature min size: minimum signature size
  • DotStars .* (count): number of dot-star (.*) constructions
  • DotStars (avg): average number of dot-stars per signature
  • Char Ranges (count): number of character ranges ([a-d])
  • Char Ranges (avg): average number of character ranges per signature
• SSA generates (continued):
  • Count constraints c{n} or c{m,n} (count): number of count constraints
  • Count constraints (avg): average number of count constraints per signature
  • Count constraints on ranges [a-d]{n} or [a-d]{m,n} (count): number of count constraints on character ranges
  • Count constraints on ranges (avg): average number of count constraints on character ranges per signature
  • OR operators | (count): number of OR operators in a signature set
  • OR operators (avg): average number of OR operators per signature
  • Number of sps (count): number of sub-patterns present in a signature set
  • Number of sps (avg): average number of sub-patterns per signature
  • Sp min length: minimum sub-pattern length
  • Sp max length: maximum sub-pattern length
  • Sp avg. length: average sub-pattern length
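A minimal sketch of how several of these metrics might be computed is shown below. It is not the authors' tool: the detector patterns and the `analyze` helper are assumptions. Every bracket expression is counted as a character range, count constraints on ranges are also included in the general count-constraint total, escaped metacharacters are not special-cased, and sub-pattern metrics are left out because the slides do not define how sub-patterns are delimited.

```python
import re

# Simplified detectors (assumptions): escaped brackets, braces, and pipes
# are not treated specially.
DOT_STAR = re.compile(r"\.\*")
CHAR_RANGE = re.compile(r"\[[^\]]*\]")
COUNT_CONSTRAINT = re.compile(r"\{\d+(?:,\d*)?\}")
RANGE_COUNT_CONSTRAINT = re.compile(r"\[[^\]]*\]\{\d+(?:,\d*)?\}")


def analyze(signatures):
    """Return set-level counts and per-signature averages for regex strings."""
    if not signatures:
        raise ValueError("signature set is empty")
    total = len(signatures)
    sizes = [len(sig) for sig in signatures]
    counts = {
        "dot_stars": sum(len(DOT_STAR.findall(s)) for s in signatures),
        "char_ranges": sum(len(CHAR_RANGE.findall(s)) for s in signatures),
        "count_constraints": sum(len(COUNT_CONSTRAINT.findall(s)) for s in signatures),
        "range_count_constraints": sum(
            len(RANGE_COUNT_CONSTRAINT.findall(s)) for s in signatures
        ),
        "or_operators": sum(s.count("|") for s in signatures),
    }
    report = {
        "signatures": total,
        "size_avg": sum(sizes) / total,
        "size_max": max(sizes),
        "size_min": min(sizes),
    }
    for name, value in counts.items():
        report[name + "_count"] = value
        report[name + "_avg"] = value / total  # average per signature
    return report


if __name__ == "__main__":
    sample = [
        r"\Wfrom=[^\x3b&\n]{100}",
        r"pwd=(!|%21)CRYPT(!|%21)[A-Z0-9]{512}",
        r"evtdump\x3f.*?\x2525[^\x20]*?\x20HTTP",
    ]
    for key, value in analyze(sample).items():
        print(key, value)
```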
Logistic Function
• Normalization of
  • Size
  • Sub-patterns
  • Complexity
• x: signature set size, number of sub-patterns, or complexity metric
• y: normalized value in [0, 1]
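For reference, a generic logistic function can be sketched as follows. The `midpoint` and `steepness` parameters are assumptions: the slides state only that a logistic function maps each metric x to a value y between 0 and 1, not which parameters are used for each metric.

```python
import math


def logistic(x, midpoint=0.0, steepness=1.0):
    """Map x to (0, 1); midpoint and steepness are hypothetical parameters."""
    z = steepness * (x - midpoint)
    # Use the numerically stable form so large |z| does not overflow exp().
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


if __name__ == "__main__":
    # Illustrative parameters only: a metric centered at 300 signatures.
    for size in (50, 300, 2000):
        print(size, round(logistic(size, midpoint=300, steepness=0.01), 3))
```

The output equals 0.5 at the midpoint and saturates toward 0 or 1 on either side, which keeps very large sets from dominating the scale.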
Complexity
• x is the sum of three variables:
  • the average number of count constraints on ranges per signature,
  • the average number of count constraints per signature, and
  • the average number of dot-star constructions per signature
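As a worked example, the sum can be evaluated with the per-signature averages reported in the Experimental Results tables. In this sketch, averages that a table omits because the corresponding count is 0 are taken as 0.0, which is an assumption, and the `complexity_input` name is illustrative.

```python
# (count constraints on ranges avg, count constraints avg, dot-stars avg),
# copied from the per-set tables; omitted averages are assumed to be 0.0.
REPORTED_AVERAGES = {
    "L7-Filter": (0.0, 0.0, 0.284553),
    "Bro": (0.014925, 0.037313, 0.029851),
    "Snort-Web": (0.053571, 0.693452, 0.166667),
    "Snort-ActiveX": (0.0, 0.0, 0.67044),
    "Snort-Spyware": (0.00232, 0.058005, 0.085847),
}


def complexity_input(averages):
    """Sum the three per-signature averages to obtain x for the logistic function."""
    return sum(averages)


if __name__ == "__main__":
    ranked = sorted(
        REPORTED_AVERAGES.items(),
        key=lambda item: complexity_input(item[1]),
        reverse=True,
    )
    for name, averages in ranked:
        print(f"{name}: x = {complexity_input(averages):.6f}")
```

Sorting by x gives Snort-Web, Snort-ActiveX, L7-Filter, Snort-Spyware, Bro, the same order as the complexity scores in the summary table (0.84, 0.71, 0.38, 0.27, 0.22).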
Metric Levels
Metric                         Levels
Base Size                      Small    Medium     Large
Avg. Number of Sub-Patterns    Low      Medium     High
Complexity                     Low      Moderate   High
Signature Sets’ Characterization:
Based on the output of the logistic function (for normalization purposes)
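The mapping from a normalized score to a level might be sketched as below. The cut points at 1/3 and 2/3 are an assumption, since the slides do not state the thresholds; they do reproduce the labels shown in the summary table of main characteristics. The `level` helper and label tuples are illustrative names.

```python
SIZE_LABELS = ("Small", "Medium", "Large")
SUBPATTERN_LABELS = ("Low", "Medium", "High")
COMPLEXITY_LABELS = ("Low", "Moderate", "High")

# Assumed thresholds: equal thirds of the [0, 1] logistic output.
ASSUMED_CUTS = (1 / 3, 2 / 3)


def level(score, labels, cuts=ASSUMED_CUTS):
    """Return the label for a normalized score in [0, 1]."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be in [0, 1]")
    if score < cuts[0]:
        return labels[0]
    if score < cuts[1]:
        return labels[1]
    return labels[2]


if __name__ == "__main__":
    # Scores reported for Snort-Web in the summary table.
    print(level(0.37, SIZE_LABELS))
    print(level(0.41, SUBPATTERN_LABELS))
    print(level(0.84, COMPLEXITY_LABELS))
```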
Experimental Results
Signature Bases
• L7-Filter
• Bro
• Snort-Web
• Snort-ActiveX
• Snort-Spyware
L7-Filter
Metric                                      Value
Number of signatures                        123
Signature size (avg)                        61.756096
Signature max size                          438
Signature min size                          6
DotStars .* (count)                         35
DotStars (avg)                              0.284553
Char Ranges [a-d] (count)                   265
Char Ranges (avg)                           2.154472
Count constraints c{n} or c{m,n} (count)    0
Count constraints on ranges (count)         0
OR operators | (count)                      150
OR operators (avg)                          1.219512
Number of sps (count)                       470
Number of sps (avg)                         3.821138
Sp min length                               1
Sp max length                               46
Sp avg. length                              5.859574
Bro
Metric                                      Value
Number of signatures                        268
Signature size (avg)                        30.772388
Signature max size                          211
Signature min size                          1
DotStars (count)                            8
DotStars (avg)                              0.029851
Char Ranges (count)                         0
Count constraints (count)                   10
Count constraints (avg)                     0.037313
Count constraints on ranges (count)         4
Count constraints on ranges (avg)           0.014925
OR operators (count)                        6
OR operators (avg)                          0.022388
Number of sps (count)                       382
Number of sps (avg)                         1.425373
Sp min length                               1
Sp max length                               46
Sp avg. length                              4.028796
Snort-Web
Metric                                      Value
Number of signatures                        336
Signature size (avg)                        57.327381
Signature max size                          486
Signature min size                          3
DotStars (count)                            56
DotStars (avg)                              0.166667
Char Ranges (count)                         103
Char Ranges (avg)                           0.306548
Count constraints (count)                   233
Count constraints (avg)                     0.693452
Count constraints on ranges (count)         18
Count constraints on ranges (avg)           0.053571
OR operators (count)                        402
OR operators (avg)                          1.196429
Number of sps (count)                       1668
Number of sps (avg)                         4.964286
Sp min length                               1
Sp max length                               64
Sp avg. length                              4.573741
Snort-ActiveX
Metric                                      Value
Number of signatures                        2385
Signature size (avg)                        321.137115
Signature max size                          867
Signature min size                          34
DotStars (count)                            1599
DotStars (avg)                              0.67044
Char Ranges (count)                         2
Char Ranges (avg)                           0.000839
Count constraints (count)                   0
Count constraints on ranges (count)         0
OR operators (count)                        10654
OR operators (avg)                          4.467086
Number of sps (count)                       54981
Number of sps (avg)                         23.05283
Sp min length                               1
Sp max length                               83
Sp avg. length                              6.119805
Snort-Spyware
Metric                                      Value
Number of signatures                        431
Signature size (avg)                        48.308586
Signature max size                          324
Signature min size                          12
DotStars (count)                            37
DotStars (avg)                              0.085847
Char Ranges (count)                         18
Char Ranges (avg)                           0.041763
Count constraints (count)                   25
Count constraints (avg)                     0.058005
Count constraints on ranges (count)         1
Count constraints on ranges (avg)           0.00232
OR operators (count)                        72
OR operators (avg)                          0.167053
Number of sps (count)                       1315
Number of sps (avg)                         3.051044
Sp min length                               1
Sp max length                               175
Sp avg. length                              9.01673
Signature Sets’ Main Characteristics
Sig-Set          Base Size       Sub-Pattern Number   Overall Complexity
L7-Filter        Small (0.31)    Medium (0.37)        Moderate (0.38)
Bro              Medium (0.35)   Low (0.30)           Low (0.22)
Snort-Web        Medium (0.37)   Medium (0.41)        High (0.84)
Snort-ActiveX    Large (0.9)     High (0.9)           High (0.71)
Snort-Spyware    Medium (0.4)    Medium (0.35)        Low (0.27)
Concluding Remarks
• Using different signature sets to compare different DPI techniques might lead to inaccurate results
• We developed a mechanism for characterizing signature sets according to their
  • size
  • number of sub-patterns
  • overall complexity
• Knowing the characteristics of the signature sets (size and complexity)
  • makes it possible to put DFA-based DPI engines under different stress conditions
  • allows comparable performance analysis
Globecom - MENS 2011 - Characterizing Signature Sets for Testing DPI Systems

  • 1. CHARACTERIZING SIGNATURE SETS FOR TESTING DPI SYSTEMS The 3rd IEEE International Workshop on Management of Emerging Networks and Services (IEEE MENS 2011) Rafael Antonello, Stenio Fernandes, Djamel Sadok, Judith Kelner Federal University of Pernambuco - UFPE Recife, Brazil
  • 2. Outline  Introduction  Motivation  Contribution  Signature Set Analyzer Framework  Experimental Results  Concluding Remarks
  • 4. Introduction  Deep Packet Inspection (DPI) Systems  key component for accurate network management  Look inside the packet payload trying to find application signatures  Recognizable patterns (similar to an anti-virus system)  High computational requirements are mainly due  high number of regular expressions (RE) in the signature sets in modern DPI
  • 5. RegEx to FA  analyze the DFA created for recognizing the regular expression (regex) ^x01[x08x09][x03x04]  Size and complexity of signatures sets can lead to space state explosion of the FA  It degrades performance
  • 6. Introduction  Challenges:  Growing link speed  40-100 Gbps and beyond  Ever increasing number of Internet applications  Research effort on optimizing DPI systems  new packet capture methods  Building efficient automata for representing REs  Efficient classifiers
  • 7. Motivation  Performance analysis for DPI engines has been done without a common ground  That’s where the problem arises  Selected signature bases present  Different sizes. Example:  1.8Gbps over a 268 signatures set [17]  1.6Gbps over a 2 signatures set [7]  Variable complexity  For RE, dot stars (.*) and count constraints (c{n} constructions) can generate very complex DFAs
  • 8. Contribution…  A framework for  Characterizing the signature sets commonly used to evaluate DPI systems  An in-depth analysis of signature sets  from well-known applications, protocols, and intrusion detection systems  A classification mechanism for signature sets  according to their size, number of sub- patterns, and complexity
  • 9. SSA Framework - SSAF Sig-Set Analyzer
  • 11.  Firstly  Select representative signature sets  Extract REs  And then apply normalization web-cgi.rules.pcres1 Wfrom=[^x3b&n]{100} web-cgi.rules.pcres2 pwd=(!|%21)CRYPT(!|%21)[A-Z0-9]{512} web-cgi.rules.pcres3 evtdumpx3f.*?x2525[^x20]*?x20HTTP web-cgi.rules.pcres4 ShellExample.cgi?[^nr&]*x2a web-cgi.rules.pcres5 update=[^rnx26]+ web-cgi.rules.pcres6 awstats.pl?[^rn]*configdir=x7C
  • 12.
  • 13.  SSA generates:  Number of signatures  Signature size (avg): Average size of signatures  Signature max size: Maximum signature size;  Signature min size: Minimum signature size;  DotStars .* - (count): Number of dot stars (.*) constructions;  DotStars (avg): Average of dot stars per signature;  Char Ranges (count): Number of character ranges ([a-d])  Char Ranges (avg): Average number of character ranges per signature;
  • 14.  SSA:  Count constraints c{n} or c{m.n} (count)  Average number of count constraints per signature;  Count constraints on ranges [a-d]{n}or{m,n} (count): Number of count constraints on character ranges.  Count constraints on ranges (avg): Average number of count constraints on character ranges;  OR operators | (count): Number of OR operators in a signature set;  OR operators (avg): Average number of OR operators per signature;  Number of sps (count): Number of sub-patterns present in a signature set;  Number of sps (avg): Average number of sub-patterns per signature;  Sp min length: Sub-patterns’ minimum length;  Sp max length: Sub-patterns’ maximum length;  Sp avg. length: Sub-patterns’ average length.
  • 15. Logistic Function
      - Normalization
          - Size
          - Sub-patterns
          - Complexity
      - x: signature set size, number of sub-patterns, or complexity metric
      - y: normalized value in [0, 1]
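The normalization can be illustrated with the standard logistic function. The transcript does not preserve the exact formula or its parameters, so the `midpoint` and `steepness` arguments below are assumptions introduced only for this sketch.

```python
import math

def logistic(x, midpoint, steepness):
    """Map a raw metric x to the open interval (0, 1).

    midpoint and steepness are hypothetical parameters; the slides do not
    state which values were used for size, sub-patterns, or complexity.
    """
    return 1.0 / (1.0 + math.exp(-steepness * (x - midpoint)))

# A raw value equal to the midpoint is mapped to exactly 0.5.
half = logistic(400, midpoint=400, steepness=0.01)
```

Because the function is monotonic, larger raw values always map to larger normalized values, which is what makes the later Small/Medium/Large style labels meaningful.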
  • 16. Complexity
      - x is the sum of three variables:
          - the average number of count constraints on ranges,
          - the average number of count constraints, and
          - the average number of dot star constructions per signature
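As a worked example of this sum, the snippet below plugs in the per-signature averages reported on the Snort-Web and Bro slides. The function name `complexity_input` is hypothetical. The result is the raw input x to the normalization, not the normalized complexity score.

```python
def complexity_input(avg_counts_on_ranges, avg_counts, avg_dot_stars):
    """Raw complexity metric x: the sum of the three per-signature averages."""
    return avg_counts_on_ranges + avg_counts + avg_dot_stars

# Averages taken from the Snort-Web and Bro metric slides.
snort_web_x = complexity_input(0.053571, 0.693452, 0.166667)  # about 0.91369
bro_x = complexity_input(0.014925, 0.037313, 0.029851)        # about 0.082089
```

The roughly tenfold gap between the two raw values reflects how much more often count constraints appear in the Snort-Web set.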
  • 17. Metric Levels
      Metric                         Levels
      Base Size                      Small, Medium, Large
      Avg. Number of Sub-Patterns    Low, Medium, High
      Complexity                     Low, Moderate, High
      Signature Sets’ Characterization: based on the output of the logistic function (for normalization purposes)
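Mapping a normalized score to one of these levels can be sketched as below. The cutoffs are not given in the slides; the values 1/3 and 2/3 are an assumption chosen for illustration, although they do reproduce every label shown on the summary slide (slide 25). The names `LEVELS` and `level` are hypothetical.

```python
LEVELS = {
    "size": ("Small", "Medium", "Large"),
    "sub_patterns": ("Low", "Medium", "High"),
    "complexity": ("Low", "Moderate", "High"),
}

def level(metric, y, low_cut=1 / 3, high_cut=2 / 3):
    """Return the level name for a normalized score y in [0, 1].

    low_cut and high_cut are assumed thresholds, not values from the slides.
    """
    low, mid, high = LEVELS[metric]
    if y < low_cut:
        return low
    if y < high_cut:
        return mid
    return high

# Scores taken from the summary slide.
l7_size = level("size", 0.31)
snort_web_complexity = level("complexity", 0.84)
```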
  • 19. Signature Bases
      - L7-Filter
      - Bro
      - Snort-Web
      - Snort-ActiveX
      - Snort-Spyware
  • 20. L7-Filter
      Metric                                    Values
      Number of signatures                      123
      Signature size (avg)                      61.756096
      Signature max size                        438
      Signature min size                        6
      DotStars .* (count)                       35
      DotStars (avg)                            0.284553
      Char Ranges [a-d] (count)                 265
      Char Ranges (avg)                         2.154472
      Count constraints c{n} or c{m,n} (count)  0
      Count constraints on ranges (count)       0
      OR operators | (count)                    150
      OR operators (avg)                        1.219512
      Number of sps (count)                     470
      Number of sps (avg)                       3.821138
      Sp min length                             1
      Sp max length                             46
      Sp avg. length                            5.859574
  • 21. Bro
      Metric                               Values
      Number of signatures                 268
      Signature size (avg)                 30.772388
      Signature max size                   211
      Signature min size                   1
      DotStars (count)                     8
      DotStars (avg)                       0.029851
      Char Ranges (count)                  0
      Count constraints (count)            10
      Count constraints (avg)              0.037313
      Count constraints on ranges (count)  4
      Count constraints on ranges (avg)    0.014925
      OR operators (count)                 6
      OR operators (avg)                   0.022388
      Number of sps (count)                382
      Number of sps (avg)                  1.425373
      Sp min length                        1
      Sp max length                        46
      Sp avg. length                       4.028796
  • 22. Snort-Web
      Metric                               Values
      Number of signatures                 336
      Signature size (avg)                 57.327381
      Signature max size                   486
      Signature min size                   3
      DotStars (count)                     56
      DotStars (avg)                       0.166667
      Char Ranges (count)                  103
      Char Ranges (avg)                    0.306548
      Count constraints (count)            233
      Count constraints (avg)              0.693452
      Count constraints on ranges (count)  18
      Count constraints on ranges (avg)    0.053571
      OR operators (count)                 402
      OR operators (avg)                   1.196429
      Number of sps (count)                1668
      Number of sps (avg)                  4.964286
      Sp min length                        1
      Sp max length                        64
      Sp length (avg)                      4.573741
  • 23. Snort-ActiveX
      Metric                               Values
      Number of signatures                 2385
      Signature size (avg)                 321.137115
      Signature max size                   867
      Signature min size                   34
      DotStars (count)                     1599
      DotStars (avg)                       0.67044
      Char Ranges (count)                  2
      Char Ranges (avg)                    0.000839
      Count constraints (count)            0
      Count constraints on ranges (count)  0
      OR operators (count)                 10654
      OR operators (avg)                   4.467086
      Number of sps (count)                54981
      Number of sps (avg)                  23.05283
      Sp min length                        1
      Sp max length                        83
      Sp avg. length                       6.119805
  • 24. Snort-Spyware
      Metric                               Values
      Number of signatures                 431
      Signature size (avg)                 48.308586
      Signature max size                   324
      Signature min size                   12
      DotStars (count)                     37
      DotStars (avg)                       0.085847
      Char Ranges (count)                  18
      Char Ranges (avg)                    0.041763
      Count constraints (count)            25
      Count constraints (avg)              0.058005
      Count constraints on ranges (count)  1
      Count constraints on ranges (avg)    0.00232
      OR operators (count)                 72
      OR operators (avg)                   0.167053
      Number of sps (count)                1315
      Number of sps (avg)                  3.051044
      Sp min length                        1
      Sp max length                        175
      Sp length (avg)                      9.01673
  • 25. Signature Sets’ Main Characteristics
      Sig-Set          Base Size        Sub-Pattern Number   Overall Complexity
      L7-Filter        Small (0.31)     Medium (0.37)        Moderate (0.38)
      Bro              Medium (0.35)    Low (0.30)           Low (0.22)
      Snort-Web        Medium (0.37)    Medium (0.41)        High (0.84)
      Snort-ActiveX    Large (0.9)      High (0.9)           High (0.71)
      Snort-Spyware    Medium (0.4)     Medium (0.35)        Low (0.27)
  • 27. Concluding Remarks
      - Using different signature sets to compare different DPI techniques might lead to inaccurate results
      - We developed a mechanism for characterizing signature sets
          - according to their size
          - number of sub-patterns
          - overall complexity
      - Knowing the characteristics of the signature sets (size and complexity)
          - makes it possible to put DFA-based DPI engines under different stress conditions
          - allows comparable performance analysis

Editor's Notes

  1. We propose to evaluate complexity as the sum of the three most important metrics:
  2. Performance analysis of DPI systems will be more carefully designed.