Here is the anomalow-down!
Sevvandi Kandanaarachchi
RMIT University
Joint work with Rob Hyndman
Why anomalies?
• They tell a different story
• Fraudulent credit card transactions amongst billions of legitimate transactions
• Computer network intrusions
• Astronomical anomalies – solar flares
• Weather anomalies – tsunamis
• Stock market anomalies – heralding a crash?
Anomaly detection – why?
• Take fraud and network intrusions for example
• Training a model only on known types of fraud, intrusions, or cyber attacks is not optimal, because new types of fraud and attacks keep appearing.
• You want to be alerted when weird things happen.
• Anomaly detection is used in these applications.
Is everything rosy?
Some Current Challenges
High dimensionality of data
• Finding anomalies in high dimensional data is hard
• Anomalies and normal points look similar
High false positives
• Do not want an “alarm factory” – confidence in the system goes down
Parameters need to be defined by the user
• But expert knowledge is needed
Overview
lookout – an anomaly detection method
• Low false positives
• User does not need to specify parameters
• On CRAN
dobin – a dimension reduction method for anomaly detection
• Addresses the high dimensionality challenge
• On CRAN
dobin – dimension reduction for outlier detection
Sevvandi Kandanaarachchi, Rob Hyndman
Journal of Computational and Graphical Statistics (2021), 30(1), 204-219
What is it?
Original anomalies are still anomalies in the reduced-dimensional space
It is a preprocessing technique, not an anomaly detection method
What does it do?
Finds a set of new axes (basis vectors) that preserves anomalies
The first basis vector points in the direction of greatest anomalousness (largest kNN distances); the second basis vector points in the direction of the second-largest kNN distances
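The kNN distances that guide the choice of basis vectors can be illustrated with a small sketch. This is not the dobin algorithm or its R implementation. It only computes, in plain Python, each point's distance to its k-th nearest neighbour on a made-up 2D data set. The function name `knn_distance`, the choice k = 2, and the data are assumptions for illustration.

```python
import math

def knn_distance(points, k):
    """Return, for each point, the Euclidean distance to its k-th nearest neighbour."""
    result = []
    for i, p in enumerate(points):
        # Distances from p to every other point, smallest first.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        result.append(dists[k - 1])
    return result

# Hypothetical data: a 3 x 3 grid of ordinary points plus one far-away point.
points = [(float(x), float(y)) for x in range(3) for y in range(3)]
points.append((10.0, 10.0))

scores = knn_distance(points, k=2)
most_anomalous = max(range(len(points)), key=lambda i: scores[i])
print(most_anomalous, round(scores[most_anomalous], 3))
```

In this toy data every grid point has a second-nearest neighbour at distance 1, while the far-away point (index 9) has a much larger kNN distance, so it stands out.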
Example
• Uniform distribution in 20 dimensions
• One point at (0.9, 0.9, 0.9, ...)
• This is the outlier
• In R:
• > dobin(X)
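A data matrix like the one in this example could be built as in the sketch below, written in Python rather than R. The sample size of 500, the range [0, 1) for the uniform distribution, and the fixed seed are assumptions, since the slide does not state them. The sketch only constructs the data and does not reproduce what `dobin(X)` computes.

```python
import random

rng = random.Random(1)  # fixed seed so the sketch is repeatable (an assumption)

n_points, n_dims = 500, 20  # 500 is a hypothetical sample size; 20 is from the slide

# Ordinary points: each coordinate drawn uniformly from [0, 1) (assumed range).
X = [[rng.random() for _ in range(n_dims)] for _ in range(n_points)]

# The outlier from the slide: 0.9 in every coordinate.
outlier = [0.9] * n_dims
X.append(outlier)

def sq_dist_to_centre(row):
    """Squared Euclidean distance from the centre (0.5, ..., 0.5) of the unit cube."""
    return sum((v - 0.5) ** 2 for v in row)

print(len(X), len(X[0]), round(sq_dist_to_centre(outlier), 2))
```

Under these assumptions a uniformly drawn point has an expected squared distance of 20/12 ≈ 1.67 from the centre of the cube, while the outlier sits at 20 × 0.16 = 3.2. No single coordinate of the outlier is unusual on its own.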
lookout – leave one out KDE for outlier detection
Sevvandi Kandanaarachchi, Rob Hyndman
Preprint - https://bit.ly/lookoutliers
lookout
Outlier detection method
Low false positives
• Because of Extreme Value Theory (EVT)
• EVT is used to model 100-year floods
• Uses a Generalized Pareto Distribution
• Not an “alarm factory”
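For reference, the Generalized Pareto Distribution named above has a simple closed-form CDF. The sketch below evaluates it in plain Python. The parameter names `mu` (location), `sigma` (scale) and `xi` (shape) follow the usual textbook convention and are not taken from the slides, and the sketch does not show how lookout fits these parameters.

```python
import math

def gpd_cdf(x, mu=0.0, sigma=1.0, xi=0.0):
    """CDF of the Generalized Pareto Distribution with location mu, scale sigma, shape xi."""
    z = (x - mu) / sigma
    if z <= 0:
        return 0.0
    if xi == 0.0:
        # Limiting case: exponential tail.
        return 1.0 - math.exp(-z)
    base = 1.0 + xi * z
    if base <= 0.0:
        # Only reachable when xi < 0: x lies beyond the upper end point mu - sigma / xi.
        return 1.0
    return 1.0 - base ** (-1.0 / xi)

def exceedance_probability(x, **params):
    """Probability that a GPD-distributed value exceeds x."""
    return 1.0 - gpd_cdf(x, **params)

print(exceedance_probability(5.0, xi=0.0))
print(exceedance_probability(5.0, xi=0.5))
```

A larger shape parameter gives a heavier tail, so the same value is less surprising. An EVT-based detector can use that tail probability to decide whether a point is extreme enough to flag.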
lookout
User does not need to specify parameters
• Uses kernel density estimates (KDE), which need a bandwidth parameter
• But a general-purpose bandwidth is not appropriate for anomaly detection
• Selects the bandwidth using topological data analysis (TDA)
• bw(TDA) → KDE → EVT → outliers
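The KDE step of this pipeline, in its leave-one-out form, can be sketched as follows. This is a minimal 1D illustration in plain Python, not the lookout implementation. The Gaussian kernel, the hand-picked bandwidth of 0.3 (no TDA-based selection is done here) and the data are assumptions.

```python
import math

def gaussian_kernel(u):
    """Standard normal density, used here as the smoothing kernel."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def loo_kde(data, bandwidth):
    """Leave-one-out kernel density estimate at each data point (1D)."""
    n = len(data)
    densities = []
    for i, x in enumerate(data):
        # Sum kernel contributions from every point except x itself.
        total = sum(gaussian_kernel((x - y) / bandwidth)
                    for j, y in enumerate(data) if j != i)
        densities.append(total / ((n - 1) * bandwidth))
    return densities

# Hypothetical data: eleven evenly spaced points plus one isolated point.
data = [i / 10 for i in range(11)] + [5.0]
densities = loo_kde(data, bandwidth=0.3)  # bandwidth fixed by hand, not by TDA
lowest = min(range(len(data)), key=lambda i: densities[i])
print(lowest)
```

Leaving each point out of its own estimate matters for isolated points. With an ordinary KDE a point always contributes its own kernel, which props up the density at its location. The leave-one-out value for an isolated point is close to zero.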
Anomaly persistence
• Which anomalies are consistently identified as the bandwidth changes?
• Visual representation of anomaly persistence
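The idea of persistence across bandwidths can be sketched as a simple count: for each point, how many of the tried bandwidths flag it? In the sketch below the flagging rule (density below a fixed fraction of the median) is a stand-in assumption, not lookout's EVT-based rule. The bandwidth grid, the `ratio` parameter and the data are likewise hypothetical.

```python
import math
import statistics

def loo_density(data, i, h):
    """Leave-one-out Gaussian KDE at data[i] with bandwidth h (1D)."""
    total = sum(math.exp(-0.5 * ((data[i] - y) / h) ** 2)
                for j, y in enumerate(data) if j != i)
    return total / ((len(data) - 1) * h * math.sqrt(2.0 * math.pi))

def persistence_counts(data, bandwidths, ratio=0.1):
    """Count, per point, the number of bandwidths at which it is flagged.

    Stand-in flagging rule (an assumption, not lookout's EVT-based rule):
    a point is flagged when its leave-one-out density is below
    `ratio` times the median leave-one-out density.
    """
    counts = [0] * len(data)
    for h in bandwidths:
        dens = [loo_density(data, i, h) for i in range(len(data))]
        cutoff = ratio * statistics.median(dens)
        for i, d in enumerate(dens):
            if d < cutoff:
                counts[i] += 1
    return counts

# Hypothetical data: 21 evenly spaced points in [-1, 1] plus one isolated point.
data = [i / 10 for i in range(-10, 11)] + [6.0]
bandwidths = [0.25, 0.5, 1.0, 2.0]
counts = persistence_counts(data, bandwidths)
print(counts)
```

In this toy data the isolated point is flagged at every bandwidth and the evenly spaced points at none. A persistence plot shows the same kind of information as a picture, with a persistent anomaly flagged across a wide range of bandwidths.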
Example 1
2D normal distribution, with outliers at the far end.
The outliers have indices 501-505.
Persistence diagram: the outliers are identified over a large range of bandwidth values.
Example 2
2D bimodal distribution, with outliers in the trough.
The outliers have indices 1001-1005.
Persistence diagram: again, the outliers are identified over a large range of bandwidth values.
Example 3
Points in 3 normally distributed clusters, with anomalies away from them.
The anomalies have indices 701-703.
Persistence diagram: the anomalies are identified over a broad range of bandwidth values.
Example 4
Points in an annulus, with anomalies in the middle.
The anomalies have indices 1001-1010.
Persistence diagram.
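A data set of this shape could be generated as in the sketch below, in plain Python. The count of 1000 ordinary points follows from the anomaly indices 1001-1010. The annulus radii, the size of the central region and the seed are assumptions, since the slide does not give them.

```python
import math
import random

rng = random.Random(4)  # fixed seed so the sketch is repeatable

R_INNER, R_OUTER = 1.0, 2.0  # hypothetical annulus radii
R_MIDDLE = 0.2               # hypothetical radius of the central anomaly region

def sample_between_radii(r_min, r_max):
    """Draw a point uniformly by area from the ring r_min <= r <= r_max."""
    # Taking the square root of a uniform draw on [r_min^2, r_max^2]
    # makes the points uniform in area rather than bunched near r_min.
    r = math.sqrt(rng.uniform(r_min ** 2, r_max ** 2))
    theta = rng.uniform(0.0, 2.0 * math.pi)
    return (r * math.cos(theta), r * math.sin(theta))

# Indices 1-1000 (1-based): ordinary points in the annulus.
points = [sample_between_radii(R_INNER, R_OUTER) for _ in range(1000)]
# Indices 1001-1010 (1-based): anomalies in the middle of the annulus.
points += [sample_between_radii(0.0, R_MIDDLE) for _ in range(10)]

radii = [math.hypot(x, y) for x, y in points]
print(len(points), max(radii[1000:]) < min(radii[:1000]))
```

The anomalies here lie near the centre of the data, so a method that only looks for points far from the mean would miss them. A density-based approach sees that the middle of the annulus is sparsely populated.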
Summary
• dobin - a dimension reduction method for anomaly detection
• lookout - an EVT-based method to find anomalies
• Paper (dobin) and preprint (lookout) available
• https://doi.org/10.1080/10618600.2020.1807353
• https://bit.ly/lookoutliers
• Both packages on CRAN
Thank you!
