Networking measurements and
                monitoring
      1st assignment: Oral Presentation

                Classification
      Patrick Herbeuval                    Valentin Thirion
      University of Liège                University of Liège
1st Master in Computer Science    1st Master in Computer Science
p.herbeuval@student.ulg.ac.be    valentin.thirion@student.ulg.ac.be


                       Teacher: B. DONNET
                     benoit.donnet@ulg.ac.be
Plan
I.     Introduction

Four papers
     II.    Early Application Identification
     III.   Multilevel classifier: BLINC
     IV.    Statistical: The ADSL Case
     V.     Application specific: Skype

VI. Comparative

VII. Conclusion
I - Introduction
The Internet is used more and more today

We want to keep the network comfortable enough to use

The quality of service expected by consumers increases as
fast as the bandwidth consumed by applications

ISPs, companies and universities want to ban P2P

Port-based classifiers worked well years ago but are quite
ineffective now
Why classify?
Classification is a key issue for today’s network
administrators and companies for the following reasons:

• Improve the network infrastructure

• Ban undesired traffic

• Protect the network against potential attacks

• Global knowledge of trends
How to classify?
Deep Packet Inspection (DPI): very precise technique but
lots of drawbacks:
   Huge computation power needed
   Ineffective if packets are encrypted
   Continuous need for database updates

Statistical analysis

Social
II - Early Application
             Identification
Goal: determine the application from the first few packets

Advantage: knowing the kind of traffic from the
beginning makes it possible to block or redirect it

DPI consumes too many resources, and flows must have
ended before they can be analysed

Statistical: uses mean sizes, durations, …; these
values are not available for the first few packets
Clustering the flows
Techniques used: K-Means, Gaussian Mixture
Model, spectral clustering

Values used:
  Size of the first few packets
  Duration of the first few packets (negotiation phase)
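As a rough illustration of clustering flows on the sizes of their first packets, here is a minimal K-Means sketch in plain Python. The flow values, the sign convention for direction and the evenly spaced initial centroids are assumptions made for this example, not details taken from the paper.

```python
def nearest(point, centroids):
    """Index of the centroid closest to `point` (squared Euclidean distance)."""
    return min(
        range(len(centroids)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(point, centroids[i])),
    )


def kmeans(points, k, iterations=20):
    """Plain K-Means on equal-length numeric tuples.

    Initial centroids are evenly spaced input points (a simplification;
    practical implementations use random restarts).
    """
    step = max(1, len(points) // k)
    centroids = points[::step][:k]
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            clusters[nearest(p, centroids)].append(p)
        # New centroid = mean of its members; keep the old one if it is empty.
        centroids = [
            tuple(sum(dim) / len(members) for dim in zip(*members)) if members else c
            for members, c in zip(clusters, centroids)
        ]
    return centroids


# Hypothetical flows: signed sizes (bytes) of the first 4 packets,
# positive = client to server, negative = server to client.
flows = [
    (120, -1460, -1460, 80), (110, -1400, -1460, 90), (130, -1460, -1380, 70),
    (40, -60, 45, -55), (38, -64, 50, -60), (42, -58, 47, -52),
]
centroids = kmeans(flows, k=2)
labels = [nearest(f, centroids) for f in flows]
print(labels)
```

Each flow becomes a point in a 4-dimensional space, so flows with a similar opening exchange end up in the same cluster.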
Data set
4 packet traces
  3 from a University network
  1 from an enterprise network

Keep only TCP packets and discard those belonging to flows
that began before the trace capture

Features analysed: need for an efficient metric
  Size and direction of the first 4 packets
     We can observe that the range of these values is very similar
     across traces, see graph next slide
Size & Direction
Classification, 2 phases
Training phase: offline at management sites.
  Apply clustering techniques to samples of TCP connections
  for all target applications
  Creation of a spatial representation based on the sizes of the
  first P packets (vector of P dimensions or HMM)
  Then find applications that have the same behaviour
  Best results: 40 clusters and the first 4 packets
  Creation of two sets:
     One with the description of each cluster
     One with applications present in each cluster
Classification, 2 phases (continued)
Classification phase: online at management hosts
  Extract the 5-tuple and analyse the sizes of the packets in
  both directions
  With these sizes, the assignment module associates the
  connection with a cluster
  With the cluster, the labelling module selects the application
  associated with the connection
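The two online modules can be sketched as follows. This is only an illustration: the cluster centres, the application names and the "one dominant application per cluster" labelling rule are assumptions, and the function names are hypothetical.

```python
def assign_cluster(sizes, centroids):
    """Assignment module: index of the nearest cluster centre (Euclidean)."""
    return min(
        range(len(centroids)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(sizes, centroids[i])),
    )


def label_flow(sizes, centroids, cluster_apps):
    """Labelling module: here, simply the dominant application of the cluster."""
    return cluster_apps[assign_cluster(sizes, centroids)]


# Hypothetical output of the training phase (values are made up).
centroids = [(120, -1450, -1430, 80), (40, -60, 47, -55)]
cluster_apps = ["HTTP", "SSH"]

new_flow = (115, -1460, -1400, 85)  # signed sizes of the first 4 packets
print(label_flow(new_flow, centroids, cluster_apps))
```

A cluster that contains several applications would need an extra tie-breaker at the labelling step, for example the port number.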
Evaluation & Conclusion
   Evaluation
      Assignment accuracy: above 95% for all heuristics
      Labelling accuracy: between 85% and 98%

   The size of the first few packets is a good metric

   Quality of clustering is richer with HMM but comparable
   with Euclidean

   GMM clustering with TCP ports classifies over 98% of
   known applications

   Limitation: need the first 4 packets in the correct order

Heuristic: (Wikipedia) Where the exhaustive search is impractical (NP-
complete for instance), heuristic methods are used to speed up the
process of finding a satisfactory solution.
III – The BLINC Classifier
Stands for BLINd Classification

Avoid reading the whole content of the packet
   Privacy, performance, encrypted packets

3 levels of classification
   Social level
   Functional level
   Application level
The Social level
Finding host communities
  Client-server, P2P, …

Analyse these communities
  Perfect match : likely malicious
  Partial overlap : P2P sources, websites, gaming, …
  Partial overlap within the same subnet : farms
The Social level (2)
The functional level
Find if a host offers a service, uses it or both

Mostly depending on the port range used by this host

Works better when a host is connected to many servers

Typical schemes:
   HTTP server: 1-2 ports
   P2P: many ports (up to 1 per host)
   Mail server: depending on services available
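A minimal sketch of the idea behind the functional level: count how many distinct source ports a host uses. The threshold of two ports and the role names are assumptions chosen for the example; BLINC's real heuristics are richer.

```python
from collections import defaultdict


def functional_role(flows, max_server_ports=2):
    """Guess a role per source host from the number of distinct source ports.

    flows: iterable of (src_ip, src_port, dst_ip, dst_port) tuples.
    """
    ports = defaultdict(set)
    counts = defaultdict(int)
    for src_ip, src_port, _dst_ip, _dst_port in flows:
        ports[src_ip].add(src_port)
        counts[src_ip] += 1
    roles = {}
    for host, used in ports.items():
        if counts[host] > len(used) and len(used) <= max_server_ports:
            roles[host] = "server-like"  # many flows reuse one or two ports
        else:
            roles[host] = "client-like"  # roughly one fresh port per flow
    return roles


# Hypothetical flows: a web server answering three clients, and one client.
flows = [
    ("10.0.0.1", 80, "10.0.1.1", 50001),
    ("10.0.0.1", 80, "10.0.1.2", 50002),
    ("10.0.0.1", 80, "10.0.1.3", 50003),
    ("10.0.0.2", 40001, "10.0.0.1", 80),
    ("10.0.0.2", 40002, "10.0.0.9", 25),
    ("10.0.0.2", 40003, "10.0.0.8", 443),
]
roles = functional_role(flows)
print(roles)
```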
The application level
Using the connection's 4-tuple (+ maybe other
characteristics)

Create a model for every application type

Models are represented by little graphs called
« graphlets »
BLINC : Results
Uses 2 metrics to evaluate the classifier
  Completeness (% classified traffic)
  Accuracy (% correctly classified traffic)

Some parameters can be used to tune the classifier
  Changing a threshold can improve the results for one of the
  metrics, but significantly degrade the other one
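The two metrics can be computed as in the sketch below. It assumes accuracy is measured over the classified flows only and that an unclassified flow is represented by `None`; the labels are made up.

```python
def completeness_and_accuracy(predicted, truth):
    """predicted: labels or None (unclassified); truth: reference labels."""
    classified = [(p, t) for p, t in zip(predicted, truth) if p is not None]
    completeness = len(classified) / len(truth)
    if not classified:
        return completeness, 0.0
    accuracy = sum(p == t for p, t in classified) / len(classified)
    return completeness, accuracy


predicted = ["web", "p2p", None, "mail", "web"]
truth = ["web", "p2p", "p2p", "ftp", "web"]
completeness, accuracy = completeness_and_accuracy(predicted, truth)
print(completeness, accuracy)
```

A stricter threshold typically leaves more flows unclassified (lower completeness) while the remaining decisions are more reliable (higher accuracy), which is the trade-off mentioned above.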
Global results




GN : Genome campus (~1000 users), UN : university network (~20,000 users)
Tuning




Td : minimal # of destination IPs needed to classify the flow as P2P
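The effect of the Td threshold can be illustrated with a toy rule that flags a host once it has contacted at least Td distinct destination IPs. This is a simplified stand-in, not BLINC's actual P2P heuristic; the function name and addresses are hypothetical.

```python
def p2p_candidates(flows, td):
    """Hosts that contact at least `td` distinct destination IPs.

    flows: iterable of (src_ip, dst_ip) pairs.
    """
    destinations = {}
    for src, dst in flows:
        destinations.setdefault(src, set()).add(dst)
    return {src for src, dsts in destinations.items() if len(dsts) >= td}


# Hypothetical traffic: one host talks to 5 peers, another to a single server.
flows = [("10.0.0.1", f"192.0.2.{i}") for i in range(1, 6)]
flows += [("10.0.0.2", "192.0.2.1")] * 5

print(p2p_candidates(flows, td=4))
print(p2p_candidates(flows, td=6))
```

Raising Td classifies fewer hosts, so completeness drops while the hosts that are still flagged are more likely to be real P2P sources.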
Results (2)
Good detection rate without reading any byte of the payload
   Flows without payload are classified as well
   Payload encryption is not a problem
   Low resource consumption

Good detection of unknown flows
Difficult to distinguish applications of the same type (e.g. all VoIP
protocols are grouped together)
Doesn’t work if the headers are encrypted
Hard to identify multiple sources behind NATs
Results come from the edge of the network; the classifier may behave
differently at the backbone of the network
BLINC : conclusion
BLINC has a good detection rate without costing a lot of
processing and without being intrusive

It can detect attacks and unknown protocols

It can be improved in some situations
IV – The ADSL Case
Test a statistical classifier on different sites, after having
trained it on some others.

Dataset:
   4 packet traces collected at 3 different ADSL POPs from
   Orange
   2 traces at the same time, different locations
   2 traces at the same location, 17 days between
   Reference used: ODP tool (provided by Orange)
Classification methodology
3 algorithms used to classify the traces
   Naïve Bayes Kernel Estimation
   Bayesian Network
   C4.5 Decision Tree

Traces analysed on two feature sets
   SET_A: Packet Level Information
   SET_B: Flow Level Statistics

3 filters:
   S/S: flows with 3-way handshake
   S/S+4D: same as S/S + at least 4 data packets
   S/S+F/R: same as S/S + FIN or RST flag at the end
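The three filters could be expressed as in this sketch. The packet representation (a dict with a set of TCP flags and a payload length) and the function name are hypothetical, and "FIN or RST at the end" is approximated here as "some packet of the flow carries FIN or RST".

```python
def passes_filter(packets, name):
    """Check a flow against one of the filters 'S/S', 'S/S+4D', 'S/S+F/R'.

    packets: list of dicts with 'flags' (set of str) and 'payload' (int, bytes),
    in capture order.
    """
    if len(packets) < 3:
        return False
    handshake = (
        packets[0]["flags"] == {"SYN"}
        and packets[1]["flags"] == {"SYN", "ACK"}
        and "ACK" in packets[2]["flags"]
    )
    if not handshake:
        return False
    if name == "S/S":
        return True
    if name == "S/S+4D":
        return sum(p["payload"] > 0 for p in packets) >= 4
    if name == "S/S+F/R":
        return any(p["flags"] & {"FIN", "RST"} for p in packets)
    raise ValueError(f"unknown filter: {name}")


# Hypothetical flow: handshake, four data packets, then a FIN.
handshake = [
    {"flags": {"SYN"}, "payload": 0},
    {"flags": {"SYN", "ACK"}, "payload": 0},
    {"flags": {"ACK"}, "payload": 0},
]
data = [{"flags": {"ACK", "PSH"}, "payload": 512} for _ in range(4)]
fin = [{"flags": {"FIN", "ACK"}, "payload": 0}]
flow = handshake + data + fin

print([passes_filter(flow, f) for f in ("S/S", "S/S+4D", "S/S+F/R")])
```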
Classification, 2 cases
Static case: classification on each site independently
  Ideal number of packets: 4
  Accuracy: about 90%
  Great classification of WEB and EDONKEY flows

Cross-site case:
  SET_A: EDONKEY results are immune to the change of site; spatial
  similarity seems more important than temporal similarity.
     Classifier very sensitive to the context in which it is trained
     MAIL is often mistaken for FTP due to similar packet sizes
     Using the port number improves the quality of the results
Classification, 2 cases
           (continued)
  SET_B: some degradations
     Focus on a single feature: Port number
        Results are the opposite of the static case
        Prediction of traffic using non-legacy ports is not effective
        Due to the heavy hitters (typically P2P)

Global results: the C4.5 algorithm is the best in terms of overall
accuracy for almost all cases (static + cross-site)
  Degradation: C4.5 is comparable with the other algorithms
  (≤17%)

Data overfitting problem
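The overfitting on port numbers can be illustrated with a toy cross-site experiment. The classifier below is a simple "majority application per port" lookup, not one of the three algorithms from the paper, and both traces are made up.

```python
from collections import Counter, defaultdict


def train_port_classifier(flows):
    """flows: iterable of (dst_port, app). Returns a port -> majority app map."""
    votes = defaultdict(Counter)
    for port, app in flows:
        votes[port][app] += 1
    return {port: c.most_common(1)[0][0] for port, c in votes.items()}


def accuracy(model, flows, default="UNKNOWN"):
    """Fraction of flows whose port maps to the right application."""
    flows = list(flows)
    hits = sum(model.get(port, default) == app for port, app in flows)
    return hits / len(flows)


# Made-up traces: on site B the P2P heavy hitters use other ports.
site_a = [(80, "WEB")] * 5 + [(4662, "EDONKEY")] * 3 + [(31337, "EDONKEY")] * 2
site_b = [(80, "WEB")] * 5 + [(4662, "EDONKEY")] * 1 + [(40123, "EDONKEY")] * 4

model = train_port_classifier(site_a)
print(accuracy(model, site_a), accuracy(model, site_b))
```

The model memorises the ports seen on the training site, so it scores perfectly there and degrades as soon as the same application shows up on ports it has never seen.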
Unknown class + Conclusion
Looking at the flows marked as unknown
  3-way handshake
  Apply the classifiers and get a confidence level; this value is then
  compared to the one returned by C4.5
  Useful to detect malicious traffic and P2P
  Should be integrated into existing DPI tools

Conclusion:
  Statistical tools are very useful to identify unknown traffic
  Good performance if used on the same site as training
  Can detect applications among protocols
  Really suffers from data overfitting (same behaviour from different
  apps)
  Great thing about this analysis: it used commercial traffic, so very
  diverse
V – Skype case
We want to detect Skype traffic

It’s already possible to detect VoIP traffic with other
classifiers, but how do we distinguish Skype from the rest?

Skype is a closed and encrypted protocol, which has to be
analysed before starting the classification
Skype model
Skype traffic characteristics were identified in a
controlled environment

2 kinds of connections : E2E and E2O
  E2E : End 2 End, Skype to Skype
  E2O : End 2 Out, Skype to telephone network

Skype works on TCP and UDP

Skype can carry text, voice, video and files
  Everything multiplexed in 1 packet
  In this case, only voice traffic is treated
Skype SoM
TCP packets are entirely encrypted, so they cannot be
analysed

UDP has a small unencrypted header, called Start of
Message (SoM)

E2E : id and message type (signaling or data)

E2O : unique connection identifier

Skype also always uses the same port number in UDP
(12340)
Classifiers
Chi-Square Classifier (CSC)
  Based on the randomness of bits in packets
  Doesn’t work on TCP since encrypted packets seem to be
  completely random.

Naive Bayes Classifier (NBC)
  Real-time voice protocol classifier
  Based on message size (depending on the audio codec) and
  on average inter-packet gap
  Used on a short window of samples to cope with variability in
  packet size

Payload based classifier
  Used in the controlled environment to check if CSC and NBC
  work well
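The intuition behind the Chi-Square Classifier can be sketched with Pearson's statistic computed on one 4-bit group across the packets of a flow. The payloads below are synthetic and the helper names are hypothetical; the paper's classifier combines several bit groups and compares the result with a threshold.

```python
from collections import Counter


def chi_square_uniform(values, n_symbols):
    """Pearson chi-square statistic of `values` against a uniform
    distribution over the symbols 0 .. n_symbols - 1."""
    expected = len(values) / n_symbols
    counts = Counter(values)
    return sum(
        (counts.get(s, 0) - expected) ** 2 / expected for s in range(n_symbols)
    )


def high_nibble_at(payloads, index):
    """High 4 bits of byte `index` in every payload."""
    return [p[index] >> 4 for p in payloads]


# Synthetic flow of 160 packets: byte 0 cycles through all high nibbles
# (looks random), byte 2 is a constant field (looks deterministic).
payloads = [bytes([(i * 16) % 256, 0, 0x02]) for i in range(160)]

random_like = chi_square_uniform(high_nibble_at(payloads, 0), 16)
constant = chi_square_uniform(high_nibble_at(payloads, 2), 16)
print(random_like, constant)
```

An evenly spread field gives a statistic near zero, while a constant field gives a very large one, which is how fixed header fields stand out from encrypted content.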
Experiments
NBC detects all kinds of VoIP traffic

CSC detects all kinds of Skype traffic
  Using both of them should detect Skype voice traffic
Results
Table 3: Results for UDP flows, CAMPUS dataset.

                        N     OK     FP    FP%    FN    FN%
PBC        E2E       1014      -      -      -     -      -
           E2O        163      -      -      -     -      -
NBC        E2E       1236    726    510   0.68   288  28.40
           E2O        441    153    288   0.38    10   6.13
CSC        E2E       2781    984   1797   2.40    30   2.96
           E2O        161    157      4   0.01     6   3.68
NBC ∧ CSC  E2E        716    710      6   0.01   304  29.98
           E2O        147    147      0   0.00    16   9.82
TOT        ≥ 100    76025      -      -      -     -      -
                   487729      -      -      -     -      -

Table 4: Results for UDP flows, ISP dataset.

                        N     OK     FP    FP%    FN    FN%
PBC        E2E         65      -      -      -     -      -
           E2O        125      -      -      -     -      -
NBC        E2E      27437     50  27387  73.73    15  23.08
           E2O        295    124    171   0.46     1   0.80
CSC        E2E        191     57    134   0.36     8  12.31
           E2O        190    123     67   0.18     2   1.60
NBC ∧ CSC  E2E         51     49      2   0.01    16  24.62
           E2O        163    122     41   0.11     3   2.40
TOT        ≥ 100    37212      -      -      -     -      -
                   258634      -      -      -     -      -

Table 5: Results for TCP flows, both datasets.

                        CAMPUS       ISP
NBC        E2E           20910        60
           E2O            2034       646
CSC        E2E + E2O    403996     46876
NBC ∧ CSC  E2E             621        12
           E2O             313         0
TOT        ≥ 100       1646424    108831
                      23856424   1614553

Very low false positive rate
Higher false negative rate

Paper excerpt visible on the slide (cut off at both ends):
… PBC as oracle, so that flows that pass the PBC classification form
a reliable dataset. We refer to this set as the benchmark dataset.
In particular, this dataset is built by Skype voice flows considering
the E2O case. In the E2E case, voice, video, data and chat flows
are present, since it is impossible to distinguish among them from
packet inspection. Our tests are the NBC, the CSC and the joint
NBC-CSC classifiers. Notice that the NBC test is expected to fail
when a video/data/chat benchmark E2E flow is tested.
From a preliminary set of experiments on the testbed traces, containing
more than 50 Skype voice calls, we tuned the PBC and CSC
classifier thresholds to B_min = −5 and χ²(Thr) = 150, respectively.
Using such choices, further discussed in Sec. 5.2, all flows
were correctly identified as E2E or E2O, and neither FP nor FN
were identified. Using the same threshold setting, we then apply
the classification to real traffic traces: the results are summarized …
… noticing that the NBC (correctly) identifies 27437 voice flows, most
of which correspond to actual ISP’s VoIP flows carried over RTP.
Only combining the CSC allows to detect the true Skype voice
flows. These results confirm that the NBC-FP may be due to non- …
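The percentage columns of the UDP tables can be cross-checked from the counts. The formulas below are inferred from the numbers (they reproduce the printed values) and are not stated on the slides: FN% appears to be FN divided by the PBC benchmark size, and FP% appears to be FP divided by the remaining flows with at least 100 packets.

```python
def rate(count, total):
    """Percentage rounded to two decimals, as printed in the tables."""
    return round(100 * count / total, 2)


# CAMPUS dataset, UDP: PBC benchmark sizes and flows with >= 100 packets.
pbc_e2e, pbc_e2o, total_100 = 1014, 163, 76025

fn_nbc_e2e = rate(288, pbc_e2e)              # FN / benchmark flows
fp_nbc_e2e = rate(510, total_100 - pbc_e2e)  # FP / non-benchmark flows
fn_joint_e2o = rate(16, pbc_e2o)             # NBC and CSC combined, E2O

print(fn_nbc_e2e, fp_nbc_e2e, fn_joint_e2o)
```

Read this way, combining NBC and CSC almost removes false positives but still misses part of the benchmark flows.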
Skype : Conclusion
Skype is hard to classify because its protocol is
encrypted, which makes it hard to analyse

But with this classifier, we have good results on UDP
  The false positive rate is almost zero, good if the ISP wants to
  prioritise its traffic
  The false negative rate is higher, but not really a problem as long as
  the ISP doesn’t want to block Skype
VI - Comparative
All these classifiers have good results, but each of them has its
strengths and weaknesses

ADSL needs site-specific training, but has the best detection rate

BLINC and Early are less precise but more flexible
   They are also faster and good at detecting attacks

BLINC detects unknown protocols but cannot distinguish individual apps

Early needs the first 4 packets in order, ADSL the 3-way handshake

Skype is more specific and cannot be compared directly
   Good false positive rate but higher false negative rate
VII – Conclusion
We now have solutions that can replace DPI

Each classifier is good in its domain
  Important network: early app detection (detect attacks soon)
  ADSL and commercial: statistical (user trends, adapt
  infrastructure)
  University or academia: BLINC (statistics, trends)
  Everywhere we want to improve it: Skype classifier

Remarks:
  Traces and classifiers are quite old (4 to 6 years)
  What about mobile usage ? Multimedia over 3/4G networks ?
References:
T. Karagiannis, K. Papagiannaki, M. Faloutsos. BLINC: Multilevel Traffic
Classification in the Dark. In Proc. ACM SIGCOMM. August 2005.
L. Bernaille, R. Teixeira, K. Salamatian. Early Application Identification. In Proc.
ACM CoNEXT. December 2006.
M. Pietrzyk, J.-L. Costeux, G. Urvoy-Keller, T. En-Najjary. Challenging Statistical
Classification for Operational Usage: the ADSL Case. In Proc. ACM/USENIX
Internet Measurement Conference (IMC). November 2009.
D. Bonfiglio, M. Mellia, M. Meo, D. Rossi, P. Tofanelli. Revealing Skype Traffic: When
Randomness Plays with You. In Proc. ACM SIGCOMM. August 2007.

            Thanks for your attention

                      Any questions ?

Network Measurement and Monitori - Assigment 1, Group3, "Classification"

  • 1. Networking measurements and monitoring 1st assigment: Oral Presentation Classification Patrick Herbeuval Valentin Thirion University of Liège University of Liège 1st Master in Computer Science 1st Master in Computer Science p.herbeuval@student.ulg.ac.be valentin.thirion@student.ulg.ac.be Teacher: B. DONNET benoit.donnet@ulg.ac.be
  • 2. Plan I. Introduction Four papers II. Early Application Identification III. Multilevel classifier: BLINC IV. Statistical: The ADSL Case V. Application specific: Skype VI. Comparative VII. Conclusion
  • 3. I - Introduction Internet is more and more used today We want to keep the network comfortable enough The quality of service asked by consumers increases as fast as applications consumes more bandwidth ISPs, companies and universities want to ban P2P Port based classifiers were good years ago, quite inefficient now
  • 4. Why classify? Classification is today a key issue for today’s network administrators and companies for the following reasons: • Improve the network infrastructure • Ban undesired traffic • Protect the network against potential attacks • Global knowledge of trends
  • 5. How classify? Deep Packet Inspection (DPI): verry precise technique but lots of drawbacks: Huge computation power needed Unneficient if packets are crypted Continuous need of database updates Statistical analysis Social
  • 6. II - Early Application Identification Goal: determine the app with the first few packets Advantage: knowing the kind of traffic in the beginning, ability to block, redirect it DPI consumes too much ressources and flows need to be ended to be analysed Statistical: usage of the mean sizes, durations, … these are values that are not available for the first few packets
  • 7. Clustering the flows Techniques used: K-Means, Gaussian Mixture Model, special Values used: Size of the first few packets Duration of the first few packets (negociation phase)
  • 8. Data set 4 packet traces 3 from a University network 1 from an enterprise network Keep only TCP packets and trash the ones that flow began before the trace capture Features analysed: need for an efficient metric Size and direction of the first 4 packets We can observe that the range of theses values is very similar across traces, see graph next slide
  • 10. Classification, 2 phases Training phase: offline at management sites. Apply clustering techniques to samples of TCP connections for all target applications Creation of a spatial representation based on the sizes of the first P packets (vector of P dimensions or HMM) Then find applications that have the same behaviour Best results: 40 clusters and the 4 first packets Creation of two sets: One with the description of each cluster One with applications present in each cluster
  • 11. Classification, 2 phases Classification phase: online at management hosts Extract the 5-tuple and analysis of the size of packets in all directions With this size, use the assigment module (associates a connection to a cluster) With the clusters, the labelling module selects the application associated with the connection
  • 12. Evaluation & Conclusion Evaluation Assigment accuracy: above 95% for all heuristics Labbeling accuracy: between 85% and 98% The size of first few packet is a good metric Quality of clustering is richer with HMM but comparable with Euclidean GMM Clustering with TCP ports classifies over 98% of know applications Limitation: need the first 4 packets in the correct order Heuristic: (Wikipedia) Where the exhaustive search is impractical (NP- complete for instance), heuristic methods are used to speed up the process of finding a satisfactory solution.
  • 13. III – The BLINC Classifier Stands for BLINd Classification Avoid reading the whole content of the packet Privacy, performance, cyphered packets 3 levels of classification Social level Functional level Application level
  • 14. The Social level Finding host communities Client-server, P2P, … Analyse these communities Perfect match : likely malicious Partial overlap : P2P sources, websites, gaming, … Partial overlap within the same subnet : farms
  • 16. The functional level Find if a host offers a service, uses it or both Mostly depending on the port range used by this host Works better when a host is connected to many servers Typical schemes: HTTP server: 1-2 ports P2P: many ports (to 1 per host) Mail server: depending on services available
  • 17. The application level Using the connections 4-tuple (+ maybe other characteristics) Create a model for every application type Models are represented by little graphs called « graphlets »
  • 18. BLINC : Results Uses 2 metrics to evaluate the classifier Completeness (% classified traffic) Accuracy (% correctly classified traffic) Some parameters can be used to tune the classifier Changing a threshold can improve the results for one of the metrics, but significantly degrade the other one
  • 19. Global results GN : Genome campus (~1000 users), UN : university network (~20.000 users)
  • 20. Tuning Td : minimal # of destination IPs needed to classify the flow as P2P
  • 21. Results (2) Good detection rate without reading any byte of the payload Non payload flows classified as well. Cyphering is not a problem Low resource consumption Good detection of unknown flows Difficult to distinguish applications of the same type (e.g.a ll VoIP protocols grouped as the same one) Doesn’t work if the header are encrypted Hard to identify multiple sources behind NATs Results from the edge of the network, the classifier may work differently at the backbone of the network
  • 22. BLINC : conclusion BLINC has a good detection rate without costing a lot of processing and without being intrusive It can detect attacks and unknown protocols It can be improved in some situations
  • 23. IV – The ADSL Case. Test a statistical classifier on different sites, after it has been trained on some others. Dataset: 4 packet traces collected at 3 different ADSL POPs from Orange: 2 traces at the same time, different locations; 2 traces at the same location, 17 days apart. Reference used: ODP tool (provided by Orange)
  • 24. Classification methodology. 3 algorithms used to classify the traces: Naïve Bayes Kernel Estimation, Bayesian Network, C4.5 Decision Tree. Traces analysed on two feature sets: SET_A: packet-level information; SET_B: flow-level statistics. 3 filters: S/S: flows with a 3-way handshake; S/S+4D: same as S/S + at least 4 data packets; S/S+F/R: same as S/S + FIN or RST flag at the end
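The three flow filters can be sketched as predicates over a simplified flow record. The `Flow` fields and the `passes_filter` helper are hypothetical names introduced for this illustration; only the filter definitions come from the slide.

```python
from dataclasses import dataclass

@dataclass
class Flow:
    handshake: bool      # complete 3-way handshake observed
    data_packets: int    # number of packets carrying payload
    fin_or_rst: bool     # FIN or RST flag seen at the end of the flow

def passes_filter(flow, name):
    """Return True if the flow is kept by the named filter."""
    if name == "S/S":
        return flow.handshake
    if name == "S/S+4D":
        return flow.handshake and flow.data_packets >= 4
    if name == "S/S+F/R":
        return flow.handshake and flow.fin_or_rst
    raise ValueError(f"unknown filter: {name}")

# Hypothetical flow: handshake seen, 2 data packets, closed with FIN.
short_flow = Flow(handshake=True, data_packets=2, fin_or_rst=True)
kept_by = [n for n in ("S/S", "S/S+4D", "S/S+F/R") if passes_filter(short_flow, n)]
```

Each stricter filter keeps a subset of the S/S flows, so the choice of filter changes which flows the classifiers are trained and tested on.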
  • 25. Classification, 2 cases. Static case: classification on each site independently. Ideal number of packets: 4. Accuracy: about 90%. Great classification of WEB and EDONKEY flows. Cross-site case: SET_A: EDONKEY results are immune; spatial similarity seems more important than temporal similarity. The classifier is very sensitive to the context in which it is trained. MAIL is often taken for FTP due to packet size similarities. Using the port number increases the quality of the results
  • 26. Classification, 2 cases (continued). SET_B: some degradation. Focus on a single feature: the port number. Results are the opposite of the static case. Prediction of traffic using non-legacy ports is inefficient, due to the heavy hitters (typically P2P). Global results: the C4.5 algorithm is the best in terms of overall accuracy for almost all cases (static + cross-site). Degradation: C4.5 is comparable with the other algorithms (≤17%). Data overfitting problem
  • 27. Unknown class + Conclusion. Looking at the flows marked as unknown that have a 3-way handshake: apply the classifiers and get a confidence level; this value is then compared to the one returned by C4.5. Useful to detect malicious traffic and P2P. Should be integrated into an existing DPI tool. Conclusion: statistical tools are very useful to identify unknown traffic. Good performance if used on the same site as the training. Can detect applications among protocols. Really suffers from data overfitting (same behaviour from different apps). Great thing about this analysis: it used commercial traffic, so very diverse
  • 28. V – Skype case We want to detect Skype traffic It’s already possible to detect VoIP traffic with other classifiers, but how to distinguish it ? Skype is a closed and cyphered protocol, which has to be analysed before starting the classification
  • 29. Skype model Using a controlled environment, detection of Skype traffic characteristics 2 kinds of connections : E2E and E2O E2E : End 2 End, Skype to Skype E2O : End 2 Out, Skype to telephone network Skype works on TCP and UDP Skype can carry text, voice, video and files Everything multiplexed in 1 packet In this case, only voice traffic is treated
  • 30. Skype SoM. TCP packets are entirely encrypted, so they cannot be analysed. UDP has a small unencrypted overhead, called Start of Message (SoM). E2E: id and message type (signaling or data). E2O: unique connection identifier. Skype also always uses the same port number in UDP (12340)
  • 31. Classifiers. Chi-Square Classifier (CSC): based on the randomness of bits in packets. Doesn’t work on TCP since encrypted packets seem to be completely random. Naive Bayes Classifier (NBC): real-time voice protocol classifier. Based on message size (depending on the audio codec) and on the average inter-packet gap. Used on a short window of samples to cope with variability in packet size. Payload-based classifier: used in the controlled environment to check whether CSC and NBC work well
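The idea behind the Chi-Square Classifier can be sketched with Pearson's chi-square statistic against a uniform distribution: values taken from an encrypted field look random and give a low statistic, while a fixed or structured field gives a high one. The 4-bit group size (16 symbols) and the function name are assumptions made for this sketch, not a description of the paper's exact procedure.

```python
from collections import Counter

def chi_square_uniform(values, n_symbols=16):
    """Pearson chi-square statistic of `values` against a uniform
    distribution over the symbols 0 .. n_symbols - 1.

    values: one observation per packet, e.g. the same 4-bit group
    extracted from each packet of a flow (assumption of this sketch).
    """
    if not values:
        raise ValueError("need at least one observation")
    expected = len(values) / n_symbols
    counts = Counter(values)
    return sum((counts.get(s, 0) - expected) ** 2 / expected
               for s in range(n_symbols))

# A perfectly uniform field versus a constant field, 160 packets each.
random_like = chi_square_uniform(list(range(16)) * 10)
constant = chi_square_uniform([3] * 160)
```

Comparing the statistic with a threshold then separates random-looking fields from deterministic ones; on fully encrypted TCP payloads every field looks random, which is why the test does not help there.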
  • 32. Experiments NBC detects all kinds of VoIP traffic CSC detects all kinds of Skype traffic Using both of them should detect Skype voice traffic
  • 33. Results. [Tables from the Skype paper shown on the slide: Table 3: Results for UDP flows, CAMPUS dataset. Table 4: Results for UDP flows, ISP dataset. Table 5: Results for TCP flows, both datasets.] Slide annotations: very low false positive rate; bigger false negative rate.
  • 34. Skype: Conclusion. Skype is hard to classify because its protocol is encrypted, which makes its analysis hard to do. But with this classifier, we have good results on UDP. The false positive rate is almost zero, which is good if the ISP wants to prioritize this traffic. The false negative rate is bigger, but this is not really a problem as long as the ISP doesn’t want to block Skype
  • 35. VI - Comparative. All these classifiers have good results, but each of them has its strengths and weaknesses. The ADSL classifier needs site-specific training, but has the best detection rate. BLINC and Early are less precise but more flexible. They are also faster and good at detecting attacks. BLINC detects unknown protocols but cannot discern apps. Early needs the first 4 packets in order, ADSL the 3-way handshake. Skype is more specific and cannot be compared directly: good false positive rate but higher false negative rate
  • 36. VII – Conclusion. We now have solutions that can replace DPI. Each classifier is good in its domain. Important network: early app detection (detect attacks soon). ADSL and commercial: statistical (user trends, adapt infrastructure). University or academy: BLINC (statistics, trends). Everywhere we want to improve it: Skype classifier. Remarks: traces and classifiers are quite old (4 to 6 years). What about mobile usage? Multimedia over 3G/4G networks?
  • 37. References: T. Karagiannis, K. Papagiannaki, M. Faloutsos. BLINC: Multilevel Traffic Classification in the Dark. In Proc. ACM SIGCOMM. August 2005. L. Bernaille, R. Teixeira, K. Salamatian. Early Application Identification. In Proc. ACM CoNEXT. December 2006. M. Pietrzyk, J.-L. Costeux, G. Urvoy-Keller, T. En-Najjary. Challenging Statistical Classification for Operational Usage: the ADSL Case. In Proc. ACM/USENIX Internet Measurement Conference (IMC). November 2009. D. Bonfiglio, M. Mellia, M. Meo, D. Rossi, P. Tofanelli. Revealing Skype Traffic: When Randomness Plays with You. In Proc. ACM SIGCOMM. August 2007. Thanks for your attention. Any questions?