IT'S ABOUT TIME !! 
Presented By- 
P.SHANMUKHA SREENIVAS 
M.MGT 1
AN OVERVIEW ON TIME SERIES DATA MINING 
OUTLINE 
1. Introduction 
2. Similarity Search in Time Series Data 
3. Feature-based Dimensionality Reduction 
4. Discretization 
5. Other Time Series Data Mining Tasks 
6. Conclusions
Introduction 
A time series is a collection of observations made sequentially in time.
Examples: financial time series, scientific time series.
CNX IT returns (sample values): 6145.45, 6128.75, 6142.7, 6201.2, 6151.9, 6050.95, 5917.75, 5855.95, 5984, 5993.9, 5934.8, 5920.05, 5950, 5950.7, 5963.8, 6141.15, ..., 6471.4, 6511.7, 6563.25, 6558.45, 6492.7, 6546.75
TIME SERIES SIMILARITY SEARCH 
Some examples:
- Identifying companies with similar patterns of growth.
- Determining products with similar selling patterns.
- Discovering stocks with similar movement in stock prices.
- Finding out whether a musical score is similar to one of a set of copyrighted scores.
Major Time Series Data Mining Tasks 
• Indexing 
• Clustering 
• Classification 
• Prediction 
• Anomaly Detection 
Indexing and clustering make explicit use of a distance measure 
The others make implicit use of a distance measure
TIME SERIES SIMILARITY SEARCH 
DISTANCE MEASURES
• Euclidean distance
• Dynamic Time Warping
• Other distance measures
  o Threshold query based similarity search (TQuEST)
  o Minkowski Distance
Euclidean Distance Metric 
Given two time series
Q = q1…qn
and
C = c1…cn
their Euclidean distance is defined as:
\[ D(Q, C) = \sqrt{\sum_{i=1}^{n} (q_i - c_i)^2} \]
[Figure: time series C and Q with the distance D(Q,C)]
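The definition above can be sketched in a few lines of Python. The function name `euclidean_distance` and the sample values are illustrative assumptions, not part of the original slides.

```python
import math

def euclidean_distance(q, c):
    """Euclidean distance between two sequences of equal length n."""
    if len(q) != len(c):
        raise ValueError("Euclidean distance needs sequences of equal length")
    # Sum the squared point-by-point differences, then take the square root.
    return math.sqrt(sum((qi - ci) ** 2 for qi, ci in zip(q, c)))

q = [1.0, 2.0, 3.0]   # made-up sample values
c = [1.0, 4.0, 3.0]
print(euclidean_distance(q, c))  # 2.0
```

Note that the points are compared strictly one to one, which is why both sequences must have the same length.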
What’s wrong with Euclidean Distance?
Similar sequences may be shifted and have different scales.
Normalize the time series before measuring the distance between them (Goldin and Kanellakis, 1995):
\[ x_i' = \frac{x_i - \mu}{\sigma} \]
What if a sequence is stretched or compressed along the time axis?
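A minimal Python sketch of this normalization follows. It assumes the population standard deviation (dividing by n) and maps a constant series to zeros; both choices, the function name `z_normalize` and the sample values are assumptions made for illustration.

```python
import math

def z_normalize(x):
    """Subtract the mean and divide by the standard deviation."""
    n = len(x)
    mu = sum(x) / n
    # Population standard deviation (assumption: divide by n, not n - 1).
    sigma = math.sqrt(sum((v - mu) ** 2 for v in x) / n)
    if sigma == 0:
        return [0.0] * n  # constant series: nothing to scale
    return [(v - mu) / sigma for v in x]

a = [1.0, 2.0, 3.0]        # made-up series
b = [110.0, 120.0, 130.0]  # same shape, shifted and scaled
print(z_normalize(a))
print(z_normalize(b))
```

After normalization the two made-up series coincide up to floating-point error, so their Euclidean distance is close to zero even though the raw values are far apart.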
Dynamic Time Warping (Berndt et al.) 
Dynamic Time Warping is a technique that finds the optimal
alignment between two time series when one of them may be
“warped” non-linearly by stretching or shrinking it along its time
axis.
This warping can then be used to determine the similarity
between the two time series.
Fixed Time Axis 
Sequences are aligned “one to one”. 
“Warped” Time Axis 
Nonlinear alignments are possible.
DYNAMIC TIME WARPING
[BERNDT, CLIFFORD, 1994]
• Allows acceleration-deceleration of signals along the time dimension
• Basic idea
X = (x1, x2, ..., xN), N ∈ ℕ; Y = (y1, y2, ..., yM), M ∈ ℕ
*Data sequences should be sampled at equidistant points in time
• The algorithm starts by building the distance matrix C ∈ ℝ^(N×M) representing all pairwise distances between X and Y. This distance matrix is also called the local cost matrix:
c(i,j) = ||xi - yj||, i ∈ [1 : N], j ∈ [1 : M]
• Once the local cost matrix is built, the algorithm finds the alignment path which runs through the low-cost areas (‘valleys’) on the augmented cost matrix
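The construction of the local cost matrix can be sketched as below. This is a minimal illustration for one-dimensional sequences, where the norm reduces to an absolute difference; the function name `local_cost_matrix` and the sample sequences are assumptions.

```python
def local_cost_matrix(x, y):
    """C[i][j] = |x[i] - y[j]| for every pair of points (0-based indices)."""
    return [[abs(xi - yj) for yj in y] for xi in x]

x = [1, 3, 4]      # made-up sequence of length N = 3
y = [1, 2, 4, 4]   # made-up sequence of length M = 4
for row in local_cost_matrix(x, y):
    print(row)
```

The result has N rows and M columns, so the two sequences do not need to have the same length.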
HOW IS DTW CALCULATED?
[Figure: sequences C and Q with the warping path w]
γ(i,j) = d(qi,cj) + min{ γ(i-1,j-1), γ(i-1,j), γ(i,j-1) }
Warping path w
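One way to turn the recurrence into a warping path is to fill the cumulative matrix and then walk back from the last cell to the first, always stepping to the cheapest predecessor. The sketch below assumes d(qi, cj) is the absolute difference and uses 1-based path indices; the name `dtw_path` and the sample sequences are illustrative.

```python
def dtw_path(q, c):
    """Return (DTW distance, warping path) for two numeric sequences."""
    n, m = len(q), len(c)
    inf = float("inf")
    # Border row and column of infinities; only the corner (0, 0) is free.
    gamma = [[inf] * (m + 1) for _ in range(n + 1)]
    gamma[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(q[i - 1] - c[j - 1])  # assumed point-wise distance
            gamma[i][j] = d + min(gamma[i - 1][j - 1],
                                  gamma[i - 1][j],
                                  gamma[i][j - 1])
    # Backtrack from (n, m) to (1, 1) through the cheapest predecessor.
    i, j = n, m
    path = [(i, j)]
    while (i, j) != (1, 1):
        candidates = [(gamma[i - 1][j - 1], i - 1, j - 1),
                      (gamma[i - 1][j], i - 1, j),
                      (gamma[i][j - 1], i, j - 1)]
        _, i, j = min(candidates)
        path.append((i, j))
    path.reverse()
    return gamma[n][m], path

distance, path = dtw_path([1, 2, 3], [1, 2, 2, 3])
print(distance)  # 0.0
print(path)      # [(1, 1), (2, 2), (2, 3), (3, 4)]
```

In this made-up example the second sequence repeats the value 2, so the path stays on row 2 for two columns and the distance is zero.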
CONSTRAINTS
Shanmukha Sreenivas P , DoMS
• Boundary condition
The starting and ending points of the warping path must be the first and the last points of the aligned sequences, i.e. w1 = (1,1) and wK = (N,M).
• Monotonicity condition
n1 ≤ n2 ≤ ... ≤ nK and m1 ≤ m2 ≤ ... ≤ mK, where wk = (nk, mk).
This condition preserves the time-ordering of points.
• Step size condition
This criterion prevents long jumps (shifts in time) in the warping path while aligning sequences, i.e. from cell (i,j) we look only at the values w(i-1,j-1), w(i-1,j), w(i,j-1).
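The three conditions can be checked mechanically for a candidate path. The sketch below is an illustration only: the function name `is_valid_warping_path` is an assumption, the path is a list of 1-based (i, j) pairs, and n and m are the lengths of the two sequences.

```python
def is_valid_warping_path(path, n, m):
    """Check boundary, monotonicity and step size conditions."""
    # Boundary: start at the first cell, end at the last cell.
    if not path or path[0] != (1, 1) or path[-1] != (n, m):
        return False
    for (i0, j0), (i1, j1) in zip(path, path[1:]):
        di, dj = i1 - i0, j1 - j0
        # Monotonicity: indices never move backwards in time.
        if di < 0 or dj < 0:
            return False
        # Step size: only one step right, down, or diagonally.
        if (di, dj) not in {(1, 1), (1, 0), (0, 1)}:
            return False
    return True

print(is_valid_warping_path([(1, 1), (2, 2), (2, 3), (3, 4)], 3, 4))  # True
print(is_valid_warping_path([(1, 1), (3, 3), (3, 4)], 3, 4))          # False
```

The second path is rejected because the jump from (1, 1) to (3, 3) skips a point, which violates the step size condition.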
CONSTRAINT VISUALIZATION 
a) Admissible path satisfying constraints
b) Violation of boundary condition
c) Violation of monotonicity
d) Violation of step size
STEP SIZE CONDITION 
A global constraint constrains the indices of the warping path wk = (i,j)k such that
j - r ≤ i ≤ j + r
where r is a term defining the allowed range of warping for a given point in a sequence.
r = 
[Figure panels: Sakoe-Chiba Band, Itakura Parallelogram]
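For a constant r the condition j - r ≤ i ≤ j + r describes a band around the diagonal, as in the Sakoe-Chiba case. The sketch below lists and prints the allowed cells; it does not cover the Itakura Parallelogram, where the allowed range varies along the sequence. The function names and the matrix size are illustrative assumptions.

```python
def sakoe_chiba_cells(n, m, r):
    """1-based cells (i, j) allowed by the constraint j - r <= i <= j + r."""
    return [(i, j)
            for i in range(1, n + 1)
            for j in range(1, m + 1)
            if j - r <= i <= j + r]

def print_band(n, m, r):
    """Draw the band: '#' marks allowed cells, '.' marks excluded ones."""
    allowed = set(sakoe_chiba_cells(n, m, r))
    for i in range(1, n + 1):
        print("".join("#" if (i, j) in allowed else "." for j in range(1, m + 1)))

print_band(5, 5, 1)
print(len(sakoe_chiba_cells(5, 5, 1)))  # 13
```

With r = 0 only the diagonal survives, which corresponds to the one-to-one alignment of the Euclidean distance.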
DYNAMIC TIME WARPING 
Advantages:
EXAMPLE 
s1 s2 s3 s4 s5 s6 s7 s8 s9 
q1 3.76 8.07 1.64 1.08 2.86 0.00 0.06 1.88 1.25 
q2 2.02 5.38 0.58 2.43 4.88 0.31 0.59 3.57 2.69 
q3 6.35 11.70 3.46 0.21 1.23 0.29 0.11 0.62 0.29 
q4 16.8 25.10 11.90 1.28 0.23 4.54 3.69 0.64 1.10 
q5 3.20 7.24 1.28 1.42 3.39 0.04 0.16 2.31 1.61 
q6 3.39 7.51 1.39 1.30 3.20 0.02 0.12 2.16 1.49 
q7 4.75 9.49 2.31 0.64 2.10 0.04 0.00 1.28 0.77 
q8 0.96 3.53 0.10 4.00 7.02 1.00 1.46 5.43 4.33 
q9 0.02 1.08 0.27 8.07 12.18 3.39 4.20 10.05 8.53 
Matrix of the pair-wise distances for element si with qj
EXAMPLE 
s1 s2 s3 s4 s5 s6 s7 s8 s9 
q1 3.76 11.83 13.47 14.55 17.41 17.41 17.47 19.35 20.60 
q2 5.78 9.14 9.72 12.15 17.03 17.34 17.93 21.04 22.04 
q3 12.13 17.48 12.60 9.93 11.16 11.45 11.56 12.18 12.47 
q4 29.02 37.23 24.50 11.21 10.16 14.70 15.14 12.20 13.28 
q5 32.22 36.26 25.78 12.63 13.55 10.20 10.36 12.67 13.81 
q6 35.61 39.73 27.17 13.93 15.83 10.22 10.32 12.48 13.97 
q7 40.36 45.10 29.48 14.57 16.03 10.26 10.22 11.50 12.27 
q8 41.32 43.89 29.58 18.57 21.59 11.26 11.68 15.65 15.83 
q9 41.34 42.40 29.85 26.64 30.75 14.65 15.46 21.73 24.18 
Window size = 2 
Matrix computed with dynamic programming based on the recurrence:
dist(i,j) = dist(si, qj) + min{dist(i-1,j-1), dist(i,j-1), dist(i-1,j)}
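As a check on the table, the accumulation step can be sketched in Python and applied to the top-left 3 × 3 block of the pairwise distance matrix from the previous slide (rows q1 to q3, columns s1 to s3). The function name `accumulate` is an assumption, and the stated window size of 2 is not applied in this sketch.

```python
def accumulate(local):
    """Cumulative cost matrix from a matrix of pairwise distances."""
    rows, cols = len(local), len(local[0])
    acc = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            if i == 0 and j == 0:
                best = 0.0                 # first cell has no predecessor
            elif i == 0:
                best = acc[0][j - 1]       # first row: only the left cell
            elif j == 0:
                best = acc[i - 1][0]       # first column: only the cell above
            else:
                best = min(acc[i - 1][j - 1], acc[i][j - 1], acc[i - 1][j])
            acc[i][j] = local[i][j] + best
    return acc

# Top-left block of the pairwise distance matrix shown on the slide.
local = [[3.76, 8.07, 1.64],
         [2.02, 5.38, 0.58],
         [6.35, 11.70, 3.46]]
for row in accumulate(local):
    print([round(v, 2) for v in row])
```

The rounded values agree with the corresponding entries of the cumulative matrix above (3.76, 11.83, 13.47 in the first row, and so on).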
FORMULATION
• Let D(i, j) refer to the dynamic time warping distance between the subsequences
x1, x2, …, xi
y1, y2, …, yj
D(i, j) = | xi – yj | + min{ D(i – 1, j), D(i – 1, j – 1), D(i, j – 1) }
SOLUTION BY DYNAMIC PROGRAMMING
• Basic implementation = O(n²) where n is the length of the sequences
  o Will have to solve the problem for each (i, j) pair
• If a warping window w is specified, then O(nw)
  o Only solve for the (i, j) pairs where | i – j | <= w
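The effect of the warping window on the amount of work can be illustrated by counting the cells that are actually solved. This is a sketch under assumptions: the function name `dtw_distance`, the returned cell counter and the sample sequences are illustrative, and the window is widened to at least the length difference so that the last cell stays reachable.

```python
def dtw_distance(x, y, w=None):
    """Return (DTW distance, number of (i, j) cells solved)."""
    n, m = len(x), len(y)
    inf = float("inf")
    if w is not None:
        w = max(w, abs(n - m))  # keep the end cell (n, m) inside the window
    D = [[inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    solved = 0
    for i in range(1, n + 1):
        lo = 1 if w is None else max(1, i - w)
        hi = m if w is None else min(m, i + w)
        for j in range(lo, hi + 1):  # only cells with |i - j| <= w
            D[i][j] = abs(x[i - 1] - y[j - 1]) + min(D[i - 1][j],
                                                     D[i - 1][j - 1],
                                                     D[i][j - 1])
            solved += 1
    return D[n][m], solved

x = [1, 2, 3, 4, 5, 6, 7, 8]   # made-up sequences of length 8
y = [1, 1, 1, 2, 3, 4, 5, 8]
print(dtw_distance(x, y))        # all 64 cells are solved
print(dtw_distance(x, y, w=1))   # only the 22 cells near the diagonal
```

A narrow window can only remove candidate paths, so the windowed distance is never smaller than the unconstrained one.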
FEATURE-BASED DIMENSIONALITY 
REDUCTION 
• Time series databases are often extremely large. Searching directly on these data is complex and inefficient.
• To overcome this problem, we can use transformation methods that reduce the dimensionality of the time series.
• These transformation methods are called dimensionality reduction techniques.
An Example of a Dimensionality Reduction Technique I
[Figure: time series C, x-axis 0 to 140]
The graphic shows a time series with 128 points (n = 128).
The raw data used to produce the graphic is also reproduced as a column of numbers (just the first 30 or so points are shown).
Raw Data: 0.4995, 0.5264, 0.5523, 0.5761, 0.5973, 0.6153, 0.6301, 0.6420, 0.6515, 0.6596, 0.6672, 0.6751, 0.6843, 0.6954, 0.7086, 0.7240, 0.7412, 0.7595, 0.7780, 0.7956, 0.8115, 0.8247, 0.8345, 0.8407, 0.8431, 0.8423, 0.8387, …
An Example of a Dimensionality Reduction Technique II
[Figure: time series C, x-axis 0 to 140]
Raw Data: 0.4995, 0.5264, 0.5523, 0.5761, 0.5973, 0.6153, 0.6301, 0.6420, … (same series as in the previous example)
Fourier Coefficients: 1.5698, 1.0485, 0.7160, 0.8406, 0.3709, 0.4670, 0.2667, 0.1928, 0.1635, 0.1602, 0.0992, 0.1282, 0.1438, 0.1416, 0.1400, 0.1412, 0.1530, 0.0795, 0.1013, 0.1150, 0.1801, 0.1082, 0.0812, 0.0347, 0.0052, 0.0017, 0.0002, ...
Truncated Fourier Coefficients: 1.5698, 1.0485, 0.7160, 0.8406, 0.3709, 0.4670, 0.2667, 0.1928
n = 128
N = 8
Cratio = 1/16
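The idea of keeping only the first few Fourier coefficients can be sketched with a naive discrete Fourier transform from the standard library. Everything specific here is an assumption for illustration: the synthetic series (not the data on the slide), the function names, and the choice to count N as the number of leading complex coefficients kept.

```python
import cmath
import math

def dft(x):
    """Naive discrete Fourier transform, O(n^2), fine for n = 128."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def reconstruct_from_first(coeffs, n_keep, n):
    """Rebuild a real series from coefficients 0..n_keep-1 (n_keep <= n / 2)."""
    out = []
    for t in range(n):
        value = coeffs[0].real
        for k in range(1, n_keep):
            # A real series has mirrored coefficients, hence the factor 2.
            value += 2 * (coeffs[k] * cmath.exp(2j * math.pi * k * t / n)).real
        out.append(value / n)
    return out

n, n_keep = 128, 8
# Synthetic smooth series built from two low frequencies (assumption).
series = [0.7 + 0.2 * math.sin(2 * math.pi * 2 * t / n)
          + 0.05 * math.cos(2 * math.pi * 5 * t / n) for t in range(n)]
coeffs = dft(series)
approx = reconstruct_from_first(coeffs, n_keep, n)
max_error = max(abs(a - b) for a, b in zip(series, approx))
print(n_keep / n)   # 0.0625, i.e. a compression ratio of 1/16
print(max_error)    # tiny, because this series only contains low frequencies
```

For real data the truncation is lossy; the reconstruction error depends on how much energy the discarded high-frequency coefficients carry.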
excellent approximation, with 
only 2 frequencies!
Fourier Analysis of Time Series using R 
No. observations(n) = 11 
Max freq = (n-1)/2 =5w 
No. of cosines = {(n-1)/2}+1=6
Fourier Analysis of Time Series using R 
No. observations(n) = 11 
Max freq = (n-1)/2 =5w 
No. of sines = {(n-1)/2}=5
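The counts on these two slides can be reproduced with a small harmonic decomposition. The sketch below is written in Python rather than as the R session the slides refer to, and the 11 observations, the function names and the coefficient scaling are assumptions for illustration. For an odd n, the mean plus (n-1)/2 cosine terms and (n-1)/2 sine terms give exactly n coefficients.

```python
import math

def harmonic_coefficients(x):
    """Cosine and sine amplitudes for a series with an odd number of points."""
    n = len(x)
    if n % 2 == 0:
        raise ValueError("this sketch assumes an odd number of observations")
    max_freq = (n - 1) // 2
    a = [sum(x) / n]   # frequency 0 cosine term, i.e. the mean
    b = []
    for k in range(1, max_freq + 1):
        a.append(2 / n * sum(x[t] * math.cos(2 * math.pi * k * t / n) for t in range(n)))
        b.append(2 / n * sum(x[t] * math.sin(2 * math.pi * k * t / n) for t in range(n)))
    return a, b

def rebuild(a, b, n):
    """Sum the cosine and sine terms back into a series of length n."""
    return [a[0] + sum(a[k] * math.cos(2 * math.pi * k * t / n)
                       + b[k - 1] * math.sin(2 * math.pi * k * t / n)
                       for k in range(1, len(a)))
            for t in range(n)]

x = [3.0, 5.0, 4.0, 6.0, 8.0, 7.0, 9.0, 6.0, 5.0, 4.0, 2.0]  # 11 made-up observations
a, b = harmonic_coefficients(x)
print(len(a), len(b))  # 6 5
rebuilt = rebuild(a, b, len(x))
print(max(abs(u - v) for u, v in zip(x, rebuilt)))  # close to zero
```

Using all 6 cosines and 5 sines reproduces the series exactly up to rounding; dropping the higher frequencies gives an approximation.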
[Figure: the same time series (x-axis 0 to 120) approximated with six techniques: DFT, DWT, SVD, APCA, PAA, PLA]
DISCRETIZATION 
• Discretization of a time series means transforming it into a symbolic string.
• The main benefit of discretization is that there is an enormous wealth of existing algorithms and data structures that allow efficient manipulation of symbolic representations.
• Lin, Keogh et al. (2003) proposed a method called Symbolic Aggregate Approximation (SAX), which allows the discretization of the original time series into symbolic strings.
SYMBOLIC AGGREGATE 
APPROXIMATION (SAX) [LIN ET AL. 2003] 
baabccbc 
The first symbolic representation 
of time series, that allows 
discretization of time series into 
symbolic strings
HOW DO WE OBTAIN SAX 
[Figure: time series C and its PAA approximation, x-axis 0 to 120, with the segments labelled by the symbols a, b, c; resulting word: baabccbc]
First convert the time series to PAA representation, then convert the PAA to symbols.
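The two steps can be sketched as follows. This is a minimal illustration under assumptions: the series is z-normalized first, its length is divisible by the word size, and the breakpoints split a standard normal distribution into equally probable regions. The function names and the sample series are made up.

```python
import math
from statistics import NormalDist

def z_normalize(x):
    """Zero mean, unit (population) standard deviation."""
    mu = sum(x) / len(x)
    sigma = math.sqrt(sum((v - mu) ** 2 for v in x) / len(x))
    return [0.0] * len(x) if sigma == 0 else [(v - mu) / sigma for v in x]

def paa(x, word_size):
    """Piecewise Aggregate Approximation: the mean of each segment."""
    if len(x) % word_size != 0:
        raise ValueError("this sketch needs len(x) divisible by word_size")
    seg = len(x) // word_size
    return [sum(x[i * seg:(i + 1) * seg]) / seg for i in range(word_size)]

def sax(x, word_size, alphabet_size):
    """Convert a series to a SAX word over the letters a, b, c, ..."""
    # Breakpoints that cut N(0, 1) into equally probable regions.
    breakpoints = [NormalDist().inv_cdf(k / alphabet_size)
                   for k in range(1, alphabet_size)]
    letters = "abcdefghijklmnopqrstuvwxyz"
    word = []
    for value in paa(z_normalize(x), word_size):
        index = sum(value > bp for bp in breakpoints)  # region the mean falls in
        word.append(letters[index])
    return "".join(word)

print(sax([-1, -1, 0, 0, 1, 1, 0, 0], word_size=4, alphabet_size=3))  # abcb
```

Low segment means map to "a", middle ones to "b" and high ones to "c", which is how a word such as baabccbc arises from a longer series.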
TWO PARAMETER CHOICES 
[Figure: the SAX example (x-axis 0 to 120) with the segments numbered 1 to 8 and the symbol levels numbered 1 to 3]
The word size, in this case 8.
The alphabet size (cardinality), in this case 3.
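The alphabet size determines where the value axis is cut. In SAX the breakpoints are chosen so that each symbol is equally probable under a standard normal distribution; the sketch below computes them for two alphabet sizes. The function name `sax_breakpoints` is an assumption.

```python
from statistics import NormalDist

def sax_breakpoints(alphabet_size):
    """Cut points that split N(0, 1) into alphabet_size equal-probability regions."""
    return [NormalDist().inv_cdf(k / alphabet_size)
            for k in range(1, alphabet_size)]

for size in (3, 4):
    print(size, [round(b, 2) for b in sax_breakpoints(size)])
```

An alphabet of size 3 needs two breakpoints, at roughly -0.43 and 0.43. The word size is independent of this choice: it only sets how many segments, and therefore how many symbols, the word contains.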
• Structural representations help in understanding time series through
  o Data analysis + Visualization
• SAX is claimed to be a landmark representation of time series
  o Symbolic and therefore allows use of discrete data structures and their corresponding algorithms for analysis
  o Also helps with visualization
THANK YOU
Datasets and code used in this presentation can be found at:
www.cs.ucr.edu/~eamonn/TSDMA/index.html

VidaXL dropshipping via API with DroFx.pptxVidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxolyaivanovalion
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130Suhani Kapoor
 
B2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docxB2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docxStephen266013
 

Dernier (20)

Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
 
Midocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxMidocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFx
 
Ravak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxRavak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptx
 
Smarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxSmarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptx
 
代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改
代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改
代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改
 
FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdf
 
BigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxBigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptx
 
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
 
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
 
Log Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxLog Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptx
 
Introduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxIntroduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptx
 
Ukraine War presentation: KNOW THE BASICS
Ukraine War presentation: KNOW THE BASICSUkraine War presentation: KNOW THE BASICS
Ukraine War presentation: KNOW THE BASICS
 
Mature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxMature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptx
 
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
 
Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signals
 
VidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxVidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptx
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
 
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Kishangarh
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in  KishangarhDelhi 99530 vip 56974 Genuine Escort Service Call Girls in  Kishangarh
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Kishangarh
 
B2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docxB2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docx
 

Time series data mining techniques

  • 1. IT'S ABOUT TIME !! Presented By- P.SHANMUKHA SREENIVAS M.MGT 1
  • 2. AN OVERVIEW ON TIME SERIES DATA MINING. OUTLINE: 1. Introduction 2. Similarity Search in Time Series Data 3. Feature-based Dimensionality Reduction 4. Discretization 5. Other Time Series Data Mining Tasks 6. Conclusions
  • 3. Introduction. A time series is a collection of observations made sequentially in time. Examples: financial time series, scientific time series. [Figure: CNX IT returns; sample values 6145.45, 6128.75, 6142.7, 6201.2, 6151.9, 6050.95, 5917.75, 5855.95, 5984, 5993.9, 5934.8, 5920.05, 5950, 5950.7, 5963.8, 6141.15, ..., 6471.4, 6511.7, 6563.25, 6558.45, 6492.7, 6546.75]
  • 4. TIME SERIES SIMILARITY SEARCH. Some examples: identifying companies with similar patterns of growth; determining products with similar selling patterns; discovering stocks with similar movements in stock prices; finding out whether a musical score is similar to one of a set of copyrighted scores.
  • 5. Major Time Series Data Mining Tasks: Indexing, Clustering, Classification, Prediction, Anomaly Detection. Indexing and clustering make explicit use of a distance measure; the others make implicit use of a distance measure.
  • 6. TIME SERIES SIMILARITY SEARCH: DISTANCE MEASURES. Euclidean distance; Dynamic Time Warping; other distance measures, such as threshold-query-based similarity search (TQuEST) and Minkowski distance.
  • 7. Euclidean Distance Metric. Given two time series $Q = q_1 \ldots q_n$ and $C = c_1 \ldots c_n$, their Euclidean distance is defined as $D(Q, C) = \sqrt{\sum_{i=1}^{n} (q_i - c_i)^2}$.
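The Euclidean distance on this slide can be sketched in a few lines of Python. The function name and the sample values are illustrative assumptions, not taken from the slides.

```python
import math


def euclidean_distance(q, c):
    """Euclidean distance between two equal-length sequences."""
    if len(q) != len(c):
        raise ValueError("sequences must have the same length")
    return math.sqrt(sum((qi - ci) ** 2 for qi, ci in zip(q, c)))


# Two short made-up series (illustrative values only).
q = [1.0, 2.0, 3.0]
c = [1.0, 4.0, 3.0]
print(euclidean_distance(q, c))  # 2.0
```

Because the points are compared index by index, the two series must have the same length and be aligned in time.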
  • 8. What's wrong with Euclidean Distance? Two sequences may be similar but shifted and on different scales. Remedy: normalize each time series before measuring the distance between them, $x_i' = (x_i - \mu) / \sigma$, where $\mu$ and $\sigma$ are the mean and standard deviation of the series (Goldin and Kanellakis, 1995). But what if a sequence is stretched or compressed along the time axis?
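A minimal sketch of this normalization, assuming the population standard deviation is used (the slide does not say which estimator is meant). The function name and sample series are hypothetical.

```python
import statistics


def z_normalize(series):
    """Rescale a series to zero mean and unit standard deviation."""
    mu = statistics.fmean(series)
    sigma = statistics.pstdev(series)  # population std: an assumption
    if sigma == 0:
        raise ValueError("a constant series cannot be z-normalized")
    return [(x - mu) / sigma for x in series]


a = [1.0, 2.0, 3.0]
b = [10.0, 20.0, 30.0]  # same shape as a, but shifted and rescaled
print(z_normalize(a))
print(z_normalize(b))
```

After normalization the two series coincide up to floating-point error, so their Euclidean distance becomes (almost exactly) zero.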
  • 9. Dynamic Time Warping (Berndt et al.). Dynamic Time Warping is a technique that finds the optimal alignment between two time series when one of them may be "warped" non-linearly by stretching or shrinking it along its time axis. This warping can be used to determine the similarity between the two time series. Fixed time axis: sequences are aligned "one to one". "Warped" time axis: nonlinear alignments are possible.
  • 10. DYNAMIC TIME WARPING [BERNDT, CLIFFORD, 1994]. Allows acceleration and deceleration of signals along the time dimension. Basic idea: given $X = (x_1, x_2, \ldots, x_N)$, $N \in \mathbb{N}$, and $Y = (y_1, y_2, \ldots, y_M)$, $M \in \mathbb{N}$ (the data sequences should be sampled at equidistant points in time), the algorithm starts by building the distance matrix $C \in \mathbb{R}^{N \times M}$ representing all pairwise distances between X and Y. This distance matrix is also called the local cost matrix: $c(i, j) = \lVert x_i - y_j \rVert$, $i \in [1 : N]$, $j \in [1 : M]$. Once the local cost matrix is built, the algorithm finds the alignment path that runs through the low-cost areas ("valleys") on the augmented cost matrix.
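The local cost matrix described above could be built as follows for scalar-valued series, where the norm reduces to an absolute difference. The function name and input values are illustrative assumptions.

```python
def local_cost_matrix(x, y):
    """Return the N x M matrix with entries c[i][j] = |x[i] - y[j]|."""
    return [[abs(xi - yj) for yj in y] for xi in x]


x = [1.0, 3.0, 4.0]        # N = 3
y = [1.0, 2.0, 4.0, 4.0]   # M = 4
for row in local_cost_matrix(x, y):
    print(row)
```

Note that the code uses 0-based indices, while the slide's formula is 1-based.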
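  • 11. HOW IS DTW CALCULATED? $\gamma(i, j) = d(q_i, c_j) + \min\{\gamma(i-1, j-1),\ \gamma(i-1, j),\ \gamma(i, j-1)\}$, where $\gamma(i, j)$ is the cumulative distance up to cell (i, j). [Figure: sequences Q and C with the warping path w]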
  • 12. CONSTRAINTS. Boundary condition: the starting and ending points of the warping path must be the first and the last points of the aligned sequences, i.e. $w_1 = (1, 1)$ and $w_K = (N, M)$. Monotonicity condition: writing $w_k = (n_k, m_k)$, $n_1 \le n_2 \le \ldots \le n_K$ and $m_1 \le m_2 \le \ldots \le m_K$; this condition preserves the time-ordering of points. Step size condition: this criterion prevents the warping path from making long jumps (shifts in time) while aligning sequences, i.e. the predecessor of cell (i, j) is one of (i-1, j-1), (i-1, j), (i, j-1). (Shanmukha Sreenivas P, DoMS)
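The three conditions can be checked mechanically for a candidate path. The sketch below assumes 1-based index pairs, a first sequence of length n and a second of length m; the function name is hypothetical.

```python
def is_valid_warping_path(path, n, m):
    """Check the boundary, monotonicity and step-size conditions.

    path is a list of 1-based (i, j) pairs; n and m are the lengths
    of the two aligned sequences.
    """
    # Boundary condition: start at (1, 1), end at (n, m).
    if not path or path[0] != (1, 1) or path[-1] != (n, m):
        return False
    # Restricting every move to these three steps enforces both
    # monotonicity (no step backwards) and the step-size condition.
    allowed_steps = {(1, 1), (1, 0), (0, 1)}
    for (i0, j0), (i1, j1) in zip(path, path[1:]):
        if (i1 - i0, j1 - j0) not in allowed_steps:
            return False
    return True


print(is_valid_warping_path([(1, 1), (2, 2), (2, 3), (3, 4)], 3, 4))  # True
print(is_valid_warping_path([(1, 1), (3, 3), (3, 4)], 3, 4))          # False
```

The second path fails because the move from (1, 1) to (3, 3) skips a point, which is the kind of long jump the step-size condition rules out.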
  • 13. CONSTRAINT VISUALIZATION. [Figure panels: (a) admissible path satisfying the constraints; (b) violation of the boundary condition; (c) violation of monotonicity; (d) violation of step size]
  • 14. STEP SIZE CONDITION. A global constraint constrains the indices of the warping path $w_k = (i, j)_k$ such that $j - r \le i \le j + r$, where r is a term defining the allowed range of warping for a given point in a sequence. [Figure: Sakoe-Chiba band and Itakura parallelogram]
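As a rough illustration, the cells admitted by a band of constant width r (the Sakoe-Chiba case) can be enumerated directly from the inequality. The function name is an assumption; the Itakura parallelogram, where r varies with the position, is not covered by this sketch.

```python
def sakoe_chiba_cells(n, m, r):
    """List the 1-based cells (i, j) that satisfy j - r <= i <= j + r."""
    return [(i, j)
            for i in range(1, n + 1)
            for j in range(1, m + 1)
            if j - r <= i <= j + r]


cells = sakoe_chiba_cells(5, 5, 1)
print(len(cells))  # 13 of the 25 cells remain
```

Only the cells near the diagonal are kept, which is what later reduces the cost of the dynamic program.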
  • 15. DYNAMIC TIME WARPING. Advantages:
  • 16. EXAMPLE. Matrix of the pairwise distances for element si with qj:
             s1     s2     s3     s4     s5     s6     s7     s8     s9
      q1   3.76   8.07   1.64   1.08   2.86   0.00   0.06   1.88   1.25
      q2   2.02   5.38   0.58   2.43   4.88   0.31   0.59   3.57   2.69
      q3   6.35  11.70   3.46   0.21   1.23   0.29   0.11   0.62   0.29
      q4   16.8  25.10  11.90   1.28   0.23   4.54   3.69   0.64   1.10
      q5   3.20   7.24   1.28   1.42   3.39   0.04   0.16   2.31   1.61
      q6   3.39   7.51   1.39   1.30   3.20   0.02   0.12   2.16   1.49
      q7   4.75   9.49   2.31   0.64   2.10   0.04   0.00   1.28   0.77
      q8   0.96   3.53   0.10   4.00   7.02   1.00   1.46   5.43   4.33
      q9   0.02   1.08   0.27   8.07  12.18   3.39   4.20  10.05   8.53
  • 17. EXAMPLE. Window size = 2. Matrix computed with dynamic programming, based on dist(i, j) = dist(si, qj) + min{dist(i-1, j-1), dist(i, j-1), dist(i-1, j)}:
             s1     s2     s3     s4     s5     s6     s7     s8     s9
      q1   3.76  11.83  13.47  14.55  17.41  17.41  17.47  19.35  20.60
      q2   5.78   9.14   9.72  12.15  17.03  17.34  17.93  21.04  22.04
      q3  12.13  17.48  12.60   9.93  11.16  11.45  11.56  12.18  12.47
      q4  29.02  37.23  24.50  11.21  10.16  14.70  15.14  12.20  13.28
      q5  32.22  36.26  25.78  12.63  13.55  10.20  10.36  12.67  13.81
      q6  35.61  39.73  27.17  13.93  15.83  10.22  10.32  12.48  13.97
      q7  40.36  45.10  29.48  14.57  16.03  10.26  10.22  11.50  12.27
      q8  41.32  43.89  29.58  18.57  21.59  11.26  11.68  15.65  15.83
      q9  41.34  42.40  29.85  26.64  30.75  14.65  15.46  21.73  24.18
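The accumulation step can be sketched as below. To keep the example small it uses only the top-left 2 x 3 block of the pairwise-distance matrix from slide 16 (rows q1, q2; columns s1 to s3) and applies no window; the function name is an assumption.

```python
def accumulate(cost):
    """Cumulative DTW cost computed from a local cost matrix (list of rows)."""
    rows, cols = len(cost), len(cost[0])
    inf = float("inf")
    acc = [[inf] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            if i == 0 and j == 0:
                best = 0.0  # the first cell has no predecessor
            else:
                best = min(
                    acc[i - 1][j - 1] if i > 0 and j > 0 else inf,
                    acc[i - 1][j] if i > 0 else inf,
                    acc[i][j - 1] if j > 0 else inf,
                )
            acc[i][j] = cost[i][j] + best
    return acc


# Top-left block of the slide-16 matrix (rows q1, q2; columns s1..s3).
local = [[3.76, 8.07, 1.64],
         [2.02, 5.38, 0.58]]
for row in accumulate(local):
    print([round(v, 2) for v in row])
# [3.76, 11.83, 13.47]
# [5.78, 9.14, 9.72]
```

These six values agree with the corresponding entries of the cumulative matrix shown on this slide.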
  • 18. FORMULATION. Let D(i, j) refer to the dynamic time warping distance between the subsequences $x_1, x_2, \ldots, x_i$ and $y_1, y_2, \ldots, y_j$. Then $D(i, j) = |x_i - y_j| + \min\{D(i-1, j),\ D(i-1, j-1),\ D(i, j-1)\}$.
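A minimal sketch of this recurrence, extended with a backtracking pass that recovers one optimal warping path. The padding row and column, the tie-breaking order and the sample series are choices made for this illustration, not details given on the slide.

```python
def dtw(x, y):
    """Return (distance, path) for two scalar sequences x and y.

    Row 0 and column 0 of D are padding, so the remaining indices
    are 1-based like the formula.
    """
    n, m = len(x), len(y)
    inf = float("inf")
    D = [[inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            D[i][j] = abs(x[i - 1] - y[j - 1]) + min(
                D[i - 1][j], D[i - 1][j - 1], D[i][j - 1])
    # Backtrack from (n, m) to (1, 1), always moving to the cheapest
    # predecessor (ties go to the diagonal move).
    path, i, j = [(n, m)], n, m
    while (i, j) != (1, 1):
        i, j = min(((i - 1, j - 1), (i - 1, j), (i, j - 1)),
                   key=lambda p: D[p[0]][p[1]])
        path.append((i, j))
    return D[n][m], path[::-1]


x = [1.0, 2.0, 3.0]
y = [1.0, 2.0, 2.0, 3.0]  # y repeats one value of x
print(dtw(x, y))  # (0.0, [(1, 1), (2, 2), (2, 3), (3, 4)])
```

The distance is zero because warping lets the second point of x match both repeated values in y.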
  • 19. SOLUTION BY DYNAMIC PROGRAMMING. Basic implementation: $O(n^2)$, where n is the length of the sequences, because the problem has to be solved for each (i, j) pair. If a warping window w is specified: $O(nw)$, because only the (i, j) pairs with $|i - j| \le w$ are solved.
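The windowed variant might look like the sketch below. Widening the window to at least the length difference is an assumption added so that the last cell stays reachable; for simplicity the table is still allocated in full, so only the running time, not the memory, is reduced to O(nw).

```python
def dtw_windowed(x, y, w):
    """DTW distance restricted to the cells with |i - j| <= w."""
    n, m = len(x), len(y)
    w = max(w, abs(n - m))  # the end cell (n, m) must stay reachable
    inf = float("inf")
    D = [[inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        # Only visit the columns inside the band around the diagonal.
        for j in range(max(1, i - w), min(m, i + w) + 1):
            D[i][j] = abs(x[i - 1] - y[j - 1]) + min(
                D[i - 1][j], D[i - 1][j - 1], D[i][j - 1])
    return D[n][m]


x = [0.0, 0.0, 1.0, 2.0, 1.0, 0.0, 0.0]
y = [0.0, 1.0, 2.0, 1.0, 0.0, 0.0, 0.0]  # same bump, one step earlier
print(dtw_windowed(x, y, 0))  # 4.0, point-by-point comparison
print(dtw_windowed(x, y, 1))  # 0.0, a one-step shift is absorbed
```

With w = 0 the computation degenerates to a sum of pointwise differences along the diagonal, while w = 1 is already enough to align the shifted bump.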
  • 20. FEATURE-BASED DIMENSIONALITY REDUCTION. Time series databases are often extremely large, so searching directly on the raw data is complex and inefficient. To overcome this problem, we use transformation methods that reduce the size (dimensionality) of the time series. These transformation methods are called dimensionality reduction techniques.
  • 21. Dimensionality Reduction: An Example of a Technique I. The graphic shows a time series C with 128 points (n = 128). The raw data used to produce the graphic is also reproduced as a column of numbers (just the first 30 or so points are shown). Raw data: 0.4995 0.5264 0.5523 0.5761 0.5973 0.6153 0.6301 0.6420 0.6515 0.6596 0.6672 0.6751 0.6843 0.6954 0.7086 0.7240 0.7412 0.7595 0.7780 0.7956 0.8115 0.8247 0.8345 0.8407 0.8431 0.8423 0.8387 … [Figure: plot of C, time axis from 0 to 140]
  • 22. Dimensionality Reduction: An Example of a Technique II. Raw data (n = 128): 0.4995 0.5264 0.5523 … (as on slide 21). Fourier coefficients: 1.5698 1.0485 0.7160 0.8406 0.3709 0.4670 0.2667 0.1928 0.1635 0.1602 0.0992 0.1282 0.1438 0.1416 0.1400 0.1412 0.1530 0.0795 0.1013 0.1150 0.1801 0.1082 0.0812 0.0347 0.0052 0.0017 0.0002 ... Truncated Fourier coefficients (N = 8): 1.5698 1.0485 0.7160 0.8406 0.3709 0.4670 0.2667 0.1928. Compression ratio: $C_{ratio} = N / n = 8 / 128 = 1/16$. [Figure: plot of C and its approximation, time axis from 0 to 140]
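The truncation idea can be sketched with a naive discrete Fourier transform in plain Python. The synthetic 128-point series, the function names and the choice to keep the first 8 complex coefficients are assumptions for illustration; the slide's own data and its exact coefficient convention are not reproduced here.

```python
import cmath
import math


def dft(x):
    """Naive O(n^2) discrete Fourier transform of a real series."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def reconstruct(coeffs, n):
    """Rebuild a real series of length n from its first len(coeffs) DFT
    coefficients, relying on the conjugate symmetry of real input."""
    out = []
    for t in range(n):
        value = coeffs[0].real
        for k in range(1, len(coeffs)):
            value += 2 * (coeffs[k] * cmath.exp(2j * cmath.pi * k * t / n)).real
        out.append(value / n)
    return out


n, keep = 128, 8
# Synthetic smooth series built from two low frequencies (made-up data).
series = [0.5 + 0.3 * math.sin(2 * math.pi * t / n)
          + 0.1 * math.cos(2 * math.pi * 3 * t / n) for t in range(n)]
truncated = dft(series)[:keep]
approx = reconstruct(truncated, n)
print(keep / n)  # 0.0625, i.e. 1/16
print(max(abs(u - v) for u, v in zip(series, approx)))
```

Because this synthetic series contains only frequencies below the cut-off, the reconstruction error is at floating-point level; real data would generally lose some detail.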
  • 23. [Figure] Excellent approximation, with only 2 frequencies!
  • 24. Fourier Analysis of Time Series using R. No. of observations (n) = 11; max freq = (n-1)/2 = 5w; no. of cosines = {(n-1)/2} + 1 = 6
  • 25. Fourier Analysis of Time Series using R. No. of observations (n) = 11; max freq = (n-1)/2 = 5w; no. of sines = {(n-1)/2} = 5
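The counts on these two slides can be checked with a small sketch. The slides use R; the Python below is a stand-in, and it assumes that "w" denotes the fundamental frequency and that the constant term is counted as the cosine at frequency zero. The data values and function names are made up.

```python
import math


def fourier_terms(x):
    """Cosine and sine coefficients of an odd-length series."""
    n = len(x)
    if n % 2 == 0:
        raise ValueError("this sketch assumes an odd number of observations")
    half = (n - 1) // 2
    a = [sum(x) / n]  # constant term (cosine at frequency 0)
    b = []
    for k in range(1, half + 1):
        a.append(2 / n * sum(v * math.cos(2 * math.pi * k * t / n)
                             for t, v in enumerate(x)))
        b.append(2 / n * sum(v * math.sin(2 * math.pi * k * t / n)
                             for t, v in enumerate(x)))
    return a, b


def evaluate(a, b, t, n):
    """Value of the fitted trigonometric sum at time index t."""
    return a[0] + sum(a[k] * math.cos(2 * math.pi * k * t / n)
                      + b[k - 1] * math.sin(2 * math.pi * k * t / n)
                      for k in range(1, len(a)))


x = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0, 5.0, 3.0, 5.0]  # 11 made-up values
a, b = fourier_terms(x)
print(len(a), len(b))  # 6 5
```

The 6 cosine and 5 sine coefficients together give 11 numbers, exactly as many as there are observations, so the trigonometric sum reproduces the series at every sample point.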
  • 26. [Figure: six representations of the same time series: DFT, DWT, SVD, APCA, PAA, PLA]
  • 27. DISCRETIZATION. Discretization of a time series means transforming it into a symbolic string. The main benefit of discretization is that there is an enormous wealth of existing algorithms and data structures that allow efficient manipulation of symbolic representations. Lin, Keogh et al. (2003) proposed a method called Symbolic Aggregate Approximation (SAX), which discretizes the original time series into symbolic strings.
  • 28. SYMBOLIC AGGREGATE APPROXIMATION (SAX) [LIN ET AL. 2003]. The first symbolic representation of time series that allows discretization of time series into symbolic strings. [Figure: a time series and its SAX word "baabccbc"]
  • 29. HOW DO WE OBTAIN SAX? First convert the time series to its PAA representation, then convert the PAA to symbols. [Figure: time series C, its PAA approximation and the resulting SAX word "baabccbc"]
  • 30. TWO PARAMETER CHOICES. The word size, in this case 8, and the alphabet size (cardinality), in this case 3. [Figure: time series C split into 8 segments, each mapped to one of the 3 symbols a, b, c]
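The first step, Piecewise Aggregate Approximation (PAA), replaces each equal-width frame of the series by its mean. A minimal sketch, assuming the series length is a multiple of the number of segments; the function name and data are illustrative.

```python
def paa(series, segments):
    """Piecewise Aggregate Approximation: the mean of each equal-width frame."""
    n = len(series)
    if n % segments != 0:
        raise ValueError("this sketch assumes len(series) is a multiple of segments")
    width = n // segments
    return [sum(series[k * width:(k + 1) * width]) / width
            for k in range(segments)]


series = [1.0, 3.0, 2.0, 4.0, 10.0, 12.0, 7.0, 9.0]
print(paa(series, 4))  # [2.0, 3.0, 11.0, 8.0]
```

Eight points are reduced to four frame means, which are the values that get mapped to symbols in the second step.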
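Both parameters appear explicitly in the following sketch of the whole pipeline. It assumes the usual SAX recipe: z-normalize, take frame means, then cut the standard normal distribution into equiprobable regions to pick a letter. The function name and the 16-point series are made up for illustration.

```python
from bisect import bisect_right
from statistics import NormalDist, fmean, pstdev


def sax(series, word_size, alphabet_size):
    """Convert a series into a SAX word of word_size letters."""
    n = len(series)
    if n % word_size != 0:
        raise ValueError("this sketch assumes len(series) is a multiple of word_size")
    if not 2 <= alphabet_size <= 26:
        raise ValueError("alphabet_size must be between 2 and 26")
    mu, sigma = fmean(series), pstdev(series)
    if sigma == 0:
        raise ValueError("a constant series cannot be z-normalized")
    z = [(v - mu) / sigma for v in series]
    width = n // word_size
    frames = [fmean(z[k * width:(k + 1) * width]) for k in range(word_size)]
    # Breakpoints that split the standard normal into equiprobable regions.
    cuts = [NormalDist().inv_cdf(k / alphabet_size)
            for k in range(1, alphabet_size)]
    letters = "abcdefghijklmnopqrstuvwxyz"
    return "".join(letters[bisect_right(cuts, f)] for f in frames)


series = [0, 0, 5, 5, 10, 10, 5, 5, 0, 0, 10, 10, 5, 5, 0, 0]
print(sax(series, word_size=8, alphabet_size=3))  # abcbacba
```

A larger word size keeps more temporal detail, while a larger alphabet keeps more amplitude detail; both make the resulting string less compact.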
  • 31. Structural representations help in understanding time series through data analysis and visualization. SAX is claimed to be a landmark representation of time series: it is symbolic and therefore allows the use of discrete data structures and their corresponding algorithms for analysis, and it also helps with visualization.
  • 32. THANK YOU. Datasets and code used in this presentation can be found at www.cs.ucr.edu/~eamonn/TSDMA/index.html