EXPLAIN-IT: Towards Explainable AI for Unsupervised Network Traffic Analysis
Andrea Morichetta★, Pedro Casas*, Marco Mellia★
Politecnico di Torino★, Austrian Institute of Technology*
3rd ACM CoNEXT Workshop on Big DAta, Machine Learning and Artificial
Intelligence for Data Communication Networks
The Gap
• Scenario: Rising popularity of ML applications for solving
specific problems in network traffic analysis.
• Ground truth is systematically missing: it is difficult to obtain because of structural complexity and large data volumes.
• Labeled datasets are frequently simplistic representations of real-world phenomena, and are often outdated.
Unsupervised learning to fill the gap
• Unsupervised techniques give a better understanding of the data by exploring its shape and patterns.
• However, their results are difficult to analyze.
• Typical solutions:
  • manual inspection → impractical when the data are too many or too complex
  • unsupervised quality metrics → the "why" is missing
  • supervised quality metrics → unreliable if the ground truth is inherently wrong or biased
Knowledge extraction from the clusters
Goal: obtain an interpretable representation of feature relevance within the clusters
• To understand the content of the clusters
• To better explain how the data were aggregated
Knowledge extraction – a supervised approach
A possible solution: white-box classifiers (e.g., linear regression and decision trees)
+ Also lets us evaluate the cluster assignment (via classification)
+ Clear and algorithmically grounded
+ Provides an "interpretation" that can be used in the analysis
- Limits the set of applicable techniques
How can this approach be made more general, extending the set of usable algorithms?
Explainable AI – extending the supervised approach
• Explainable AI makes it easier to understand why certain decisions or predictions have been made.
• Achieved by:
  • restricting the complexity of the machine learning model (intrinsic), or
  • applying methods that analyze the model after training (post hoc)
• e.g., LIME (Local Interpretable Model-agnostic Explanations) [1] can explain the predictions of any classifier or regressor by approximating it locally with an interpretable model.
[1] Ribeiro, Marco Tulio, Sameer Singh, and Carlos Guestrin. "'Why Should I Trust You?': Explaining the Predictions of Any Classifier." Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2016.
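The local-approximation idea behind LIME can be sketched in a few lines. This is a simplified illustration and not the LIME library itself: the `black_box` function, the Gaussian perturbation scale, the kernel width, and the plain weighted least-squares fit are all assumptions chosen for the example (the real method also discretizes and selects features).

```python
import math
import random


def black_box(x):
    """Hypothetical stand-in for a trained classifier: returns P(class = 1)."""
    return 1.0 / (1.0 + math.exp(-(3.0 * x[0] - 2.0 * x[1])))


def solve(a, b):
    """Solve a @ w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    w = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * w[j] for j in range(i + 1, n))
        w[i] = (m[i][n] - tail) / m[i][i]
    return w


def explain(instance, predict, n_samples=500, scale=0.5, width=0.75, seed=0):
    """Fit a proximity-weighted linear model around `instance`.

    Returns one coefficient per feature (the intercept is dropped).
    """
    rng = random.Random(seed)
    d = len(instance)
    xtwx = [[0.0] * (d + 1) for _ in range(d + 1)]
    xtwy = [0.0] * (d + 1)
    for _ in range(n_samples):
        # Perturb the instance to sample its neighborhood.
        z = [v + rng.gauss(0.0, scale) for v in instance]
        dist2 = sum((zi - vi) ** 2 for zi, vi in zip(z, instance))
        # Closer samples get a larger weight (exponential kernel).
        weight = math.exp(-dist2 / width ** 2)
        row = [1.0] + z
        y = predict(z)  # only the model's inputs and outputs are used
        for i in range(d + 1):
            xtwy[i] += weight * row[i] * y
            for j in range(d + 1):
                xtwx[i][j] += weight * row[i] * row[j]
    return solve(xtwx, xtwy)[1:]


coef = explain([0.0, 0.0, 0.0], black_box)
```

In this toy setup the first coefficient comes out positive, the second negative, and the third close to zero, mirroring how the hypothetical black box uses its inputs.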
Approach
[Figure: pipeline diagram with the labels: Data (exploration space); Unsupervised techniques, e.g., clustering; Step 2: advanced knowledge extraction; Splitting model (SVM); Identification of XAI features with LIME (explainable AI); Knowledge (summary space).]
Use case
• 10,654 YouTube video sessions from different sources: smartphone (HTML player and YouTube app) and desktop (HTML player)
• Set of ~500 features:
  • at the full video session level (e.g., session downlink throughput)
  • at different time resolutions, with time slots of ∆t = 1, 5, and 10 seconds
• We focus on the average video quality (AVGQ) metric and group video resolution into three classes:
  • 0: Low Definition (LD), with AVGQ < 480
  • 1: Standard Definition (SD), with 480 ≤ AVGQ < 720
  • 2: High Definition (HD), with AVGQ ≥ 720
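The class thresholds above translate directly into a small labeling function. The function name and the sample AVGQ values below are made up for illustration; only the thresholds come from the slide.

```python
CLASS_NAMES = {0: "LD", 1: "SD", 2: "HD"}


def quality_class(avgq: float) -> int:
    """Map an average video quality (vertical resolution) to class 0, 1, or 2."""
    if avgq < 480:
        return 0  # Low Definition
    if avgq < 720:
        return 1  # Standard Definition
    return 2  # High Definition


# Hypothetical AVGQ values, including the two boundary cases.
sessions = [360.0, 480.0, 719.9, 720.0, 1080.0]
labels = [quality_class(q) for q in sessions]
names = [CLASS_NAMES[label] for label in labels]
```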
Clustering phase
• Goal: obtain 3 clusters as output:
  a. Low Definition (LD)
  b. Standard Definition (SD)
  c. High Definition (HD)
• Algorithms used:
  • Agglomerative (1): clustering with Ward linkage (Ward minimizes the variance of the clusters being merged)
  • Agglomerative (2): clustering with single linkage (single linkage uses the minimum of the distances between all observations of the two sets)
  • K-Means
  • BIRCH (Balanced Iterative Reducing and Clustering using Hierarchies)
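To make the difference between the two agglomerative variants concrete, the sketch below computes the merge criterion each one uses. It is a plain-Python illustration with made-up 2-D points, not the implementation used in the study; the Ward cost is written in its standard form, as the increase in within-cluster sum of squares when two clusters are merged.

```python
from itertools import product
from math import dist


def centroid(points):
    """Mean point of a cluster."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def single_link(a, b):
    """Single linkage: the smallest pairwise distance between two clusters."""
    return min(dist(p, q) for p, q in product(a, b))


def ward_cost(a, b):
    """Ward linkage: increase in within-cluster variance (sum of squares) on merge."""
    size_factor = len(a) * len(b) / (len(a) + len(b))
    return size_factor * dist(centroid(a), centroid(b)) ** 2


# Hypothetical clusters of 2-D points.
a = [(0.0, 0.0), (0.0, 1.0)]
b = [(3.0, 0.0), (3.0, 1.0)]
c = [(0.0, 2.0)]

# Both criteria prefer merging a with c over merging a with b.
single_ab, single_ac = single_link(a, b), single_link(a, c)
ward_ab, ward_ac = ward_cost(a, b), ward_cost(a, c)
```

At each step an agglomerative algorithm merges the pair of clusters with the lowest criterion value, so the choice of linkage determines the shape of the resulting clusters.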
Clustering results – quality metrics
[Figure: clustering quality scores (Adjusted Mutual Info, Adjusted Rand, Completeness, Fowlkes-Mallows, Homogeneity, Silhouette, V-Measure), on an axis from 0.0 to 0.6, for Agglomerative (1), Agglomerative (2), K-Means, and BIRCH.]
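As an example of how one of the supervised scores in this comparison is computed, here is a plain-Python sketch of the Adjusted Rand score, which compares cluster assignments against reference labels (such as the LD/SD/HD classes) while correcting for chance agreement. The label vectors are invented for the example; this is the standard formula, not code from the study.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(truth, pred):
    """Adjusted Rand index between two labelings of the same items."""
    n = len(truth)
    # Pairs of items grouped together in both labelings.
    sum_cells = sum(comb(c, 2) for c in Counter(zip(truth, pred)).values())
    # Pairs grouped together in each labeling on its own.
    sum_rows = sum(comb(c, 2) for c in Counter(truth).values())
    sum_cols = sum(comb(c, 2) for c in Counter(pred).values())
    expected = sum_rows * sum_cols / comb(n, 2)
    max_index = (sum_rows + sum_cols) / 2
    if max_index == expected:
        return 1.0  # degenerate case: both labelings are trivially identical
    return (sum_cells - expected) / (max_index - expected)


# Cluster ids are arbitrary, so a relabeled but identical partition scores 1.
relabeled = adjusted_rand_index([0, 0, 1, 1, 2, 2], [2, 2, 0, 0, 1, 1])
# A partition unrelated to the reference scores around 0 or below.
unrelated = adjusted_rand_index([0, 0, 0, 1, 1, 1], [0, 1, 0, 1, 0, 1])
```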
Clustering results – label distribution
[Figure: label distribution after agglomerative Ward clustering.]
Clustering results – feature inspection
[Figure: example of feature inspection in the results of agglomerative Ward clustering, with panels for Cluster 0, Cluster 1, and Cluster 2.]
Interpret with model – using Support Vector Machines
• Hyperplane-based classifiers
• The SVM selects the maximum-margin separating hyperplane
• A kernel function maps the points into a high-dimensional space
• However, it is a black-box classifier
• Thus, explainable AI can help
Interpret with model – using SVM
[Figure: results of the SVM applied to the Agglomerative (1) clustering (Ward linkage).]
Results with LIME – an example
Instance classified as belonging to cluster 2.
Feature                                              Feature importance
uplink_bytes_second_slot_1s > 10468.5                0.10
dist_packet_length_downlink_p25 > 1379               0.09
dist_slotted_uplink_bytes_p97_1s > 18445.9           0.08
uplink_packets_first_slot_5s > 861.3                 0.07
420628.7 < dist_slotted_bytes_p97_1s <= 902383.7     0.07
dist_slotted_downlink_bytes_p97_5s > 2711876.9       0.06
dist_slotted_downlink_bytes_h_1s > 0.7               0.05
335.4 < dist_slotted_uplink_packets_p99_1s <= 502.0  0.04
dist_slotted_uplink_bytes_p90_1s > 7627.6            0.04
dist_slotted_bytes_mean_5s > 845017.6                0.04
Conclusion and future work
• Interesting approach for improving the interpretation of clustering
results by relying on XAI principles
• Is explainable AI an advantage in the YouTube case, where features
are complex?
• Is LIME always a good choice? Alternatives such as SHAP are worth examining
• Is it possible to avoid the classification step?
• Extend it to other scenarios
• Expand the research on different clustering algorithms
• Use different classification techniques

Editor's notes

  1. Why did our model predict a specific label (e.g., whether traffic is malicious or not)? LIME's intuition is to look closely at the area around the predicted decision, where the boundaries are simpler. LIME relies only on the inputs and outputs of the model. It randomly generates data points, by perturbation, in the neighborhood of the target data point. The result is a new dataset in the neighborhood of the target that can be interpreted with a white-box model. Points closer to the target are assigned higher weights, so that the local linear model fits them well.
  2. Packet-level video traffic measurements: the only information extracted from the network traffic for each captured packet is the packet time and the packet size. From these two values, a full set of 477 different features is derived, covering overall (full-session) traffic, downlink traffic, and uplink traffic, together with sampled empirical distributions of each. These features, extracted from the analyzed network video traffic packets, are mapped to relevant Video Quality Metrics (VQMs). Six VQMs are considered: initial delay, frequency of stallings, number of stalling events, number of quality switches, average video quality (video vertical resolution, e.g., 480p, 720p, 1080p), and average video bitrate.
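As a rough illustration of how slotted features could be derived from nothing more than packet time and packet size, the sketch below sums bytes into consecutive time slots of 1, 5, and 10 seconds. The function name, the packet list, and the slot alignment (slots start at the first packet) are assumptions for the example; the actual feature extraction used by the authors is not described in this deck.

```python
def slotted_bytes(packets, dt):
    """Sum packet sizes into consecutive time slots of `dt` seconds.

    `packets` is a list of (time_in_seconds, size_in_bytes) tuples.
    """
    if not packets:
        return []
    start = min(t for t, _ in packets)
    end = max(t for t, _ in packets)
    slots = [0] * (int((end - start) // dt) + 1)
    for t, size in packets:
        slots[int((t - start) // dt)] += size
    return slots


# Hypothetical packets: (arrival time, size).
packets = [(0.0, 100), (0.4, 200), (1.2, 50), (5.1, 1500), (9.9, 40)]
per_slot = {dt: slotted_bytes(packets, dt) for dt in (1, 5, 10)}
```

Statistics such as the mean or a high percentile of each slotted series would then give features of the kind named in the LIME example (for instance, a 97th percentile of bytes per 1-second slot).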