K. Yoshida, M. Inui, T. Yairi, K. Machida, M. Shioya, and Y. Masukawa, "Identification of Causal Variables for Building Energy Fault Detection by Semi-supervised LDA and Decision Boundary Analysis", in Proc. ICDM Workshops, 2008, pp. 164-173.
Identification of Causal Variables for Building Energy Fault Detection by Semi-supervised LDA
1. 2nd Workshop on Domain Driven Data Mining, Session I : S2208
Dec. 15, 2008
Palazzo dei Congressi, Pisa, Italy
Identification of Causal Variables
for Building Energy Fault Detection
by Semi-supervised LDA
&
Decision Boundary Analysis
Keigo Yoshida, Minoru Inui, Takehisa Yairi, Kazuo Machida
(Dept. of Aeronautics & Astronautics, the Univ. of Tokyo)
Masaki Shioya, and Yoshio Masukawa
(Kajima Corp.)
2. 2
Main Point of the Presentation
We propose …
A Supportive Method for Anomaly Cause Identification
by
Combining Traditional Data Analysis
and Domain Knowledge
Applied to Real Building Energy Management System (BEMS)
The root cause of energy waste was found successfully
3. 3
Outline
Introduction
Theories
Experiments for Real Data
Conclusions
4. 4
Introduction: What is BEMS ?
Building Energy Management Systems
Collect/Monitor Sensor Data in BLDG
(temperature, heat consumption etc…)
Energy-efficient Control
Discover Energy Faults (wastes)
[Diagram: BEMS interface (I/F) to building equipment]
5. 5
Introduction: Problem of BEMS
Hard to identify root causes of Energy Faults (EF)
Complex relations between equipment
Data deluge from numerous sensors
(approx. 2,000 sensors for a 20-story building)
Current EF detection:
heuristics based on experts' empirical knowledge, usually fuzzy "IF-THEN" rules
"Heuristic diagnostics is incomplete"
Fuzziness → false-negative errors
Detection-only → cannot improve systems
6. 6
Early Fault Diagnosis Methods
[Diagram: methods arranged from experts (knowledge-based) to source data (data-driven), with performance on the vertical axis]
Knowledge-based: expert systems, fuzzy logic, FTA/FMEA
Modeling-based: feature extraction, Bayesian filtering
Data-driven: supervised/unsupervised learning, data mining, neural networks, FDA, …
Trade-offs (knowledge-based ↔ data-driven):
Interpretation: easy ↔ hard
Modeling cost: expensive ↔ low
Versatility: poor ↔ high
Pitfall: knowledge-acquisition bottleneck ↔ neglecting useful knowledge
7. 7
Proposed Method
[Diagram: the proposal sits between the knowledge-based and data-driven corners, combining domain knowledge with data analysis]
Proposal: domain knowledge + data analysis
- Characteristics -
Interpretation: easy; exploits domain knowledge
Cost: not so high; requires empirical knowledge only
Versatility: easy to apply to various domains & problems
Performance: better than heuristics
8. 8
Conceptual Diagram
[Diagram: the experts' detection rule is used to acquire reliable labels from the data distribution; semi-supervised LDA learns a boundary; DBA identifies each variable's contribution to the EF (per variable #); the findings are fed back to the experts]
* Assumption *
The incomplete heuristics surely represent abnormal phenomena
9. 9
Outline
Introduction
Theories
Semi-Supervised Linear Discriminant Analysis
Decision Boundary Analysis
Experiments for Real Data
Conclusions
10. 10
Semi-supervised LDA
Learning Boundary
Data Distribution
Acquire Reliable Labels with Given Rule
11. 11
Manifold Regularization [M. Belkin et al. 05]
Labeled data only: Regularized Least Squares (RLS)
f* = argmin_f (1/l) Σ_{i=1..l} (y_i − f(x_i))² + γ_A ||f||²_K
(squared loss for labeled data + penalty term, usually a squared function norm)
12. 12
Manifold Regularization [M. Belkin et al. 05]
Laplacian RLS adds a term for the intrinsic geometry, using labeled & unlabeled data:
f* = argmin_f (1/l) Σ_{i=1..l} (y_i − f(x_i))² + γ_A ||f||²_K + (γ_I / (u+l)²) fᵀ L f
(squared loss for labeled data + penalty term + additional term for intrinsic geometry)
Assumption: geometrically close ⇒ similar label; L: graph Laplacian
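The Laplacian RLS objective above has a closed-form kernel solution. A minimal numpy sketch; the RBF kernel, the kNN heat-kernel graph, and all parameter values here are our own choices, not taken from the slides:

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    # Gaussian (RBF) kernel matrix between row sets A and B
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def knn_graph_laplacian(X, k=5, gamma=1.0):
    # symmetric kNN graph with heat-kernel weights; L = D - W
    n = len(X)
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.zeros((n, n))
    for i in range(n):
        nbrs = np.argsort(d2[i])[1:k + 1]       # skip self
        W[i, nbrs] = np.exp(-gamma * d2[i, nbrs])
    W = np.maximum(W, W.T)
    return np.diag(W.sum(1)) - W

def lap_rls(X, y, labeled, gamma_A=1e-2, gamma_I=1e-2, k=5):
    """Laplacian RLS: alpha = (J K + gA*l*I + gI*l/(l+u)^2 * L K)^{-1} Y,
    where J zeroes out the loss on unlabeled points (Belkin et al.)."""
    n, l = len(X), int(labeled.sum())
    K = rbf_kernel(X, X)
    L = knn_graph_laplacian(X, k)
    J = np.diag(labeled.astype(float))
    Y = np.where(labeled, y, 0.0)
    A = J @ K + gamma_A * l * np.eye(n) + gamma_I * l / n**2 * (L @ K)
    alpha = np.linalg.solve(A, Y)
    return lambda Xq: rbf_kernel(Xq, X) @ alpha  # f(x) = sum_i alpha_i k(x_i, x)
```

With two well-separated clusters and a single labeled point per class, the graph term propagates the labels across each cluster.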
13. 13
Semi-Supervised Linear Discriminant Analysis (SS-LDA)
LDA seeks a projection with small within-class and large between-class covariance:
max_w (wᵀ S_b w) / (wᵀ S_w w)
S_b: between-class scatter, S_w: within-class scatter
Regularized Discriminant Analysis [Friedman 89] adds a regularizer to the within-class scatter: S_w + βI
SS-LDA further adds a graph-Laplacian term built from labeled & unlabeled data:
max_w (wᵀ S_b w) / (wᵀ (S_w + βI + α Xᵀ L X) w)
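A minimal numpy sketch of one common semi-supervised discriminant formulation (scatter matrices from the labeled points, a graph-Laplacian regularizer over all points); the exact regularizer on the slide may differ, and `rbf_laplacian`, `alpha`, and `eps` are our own choices:

```python
import numpy as np

def rbf_laplacian(X, gamma=1.0):
    # dense heat-kernel graph Laplacian L = D - W over all points
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.exp(-gamma * d2)
    np.fill_diagonal(W, 0.0)
    return np.diag(W.sum(1)) - W

def ss_lda_direction(X, y, labeled, L, alpha=0.1, eps=1e-6):
    """SS-LDA sketch: maximize w^T Sb w / w^T (Sw + alpha X^T L X) w.
    Sb, Sw come from labeled points only; L uses labeled & unlabeled data."""
    Xl, yl = X[labeled], y[labeled]
    mu = Xl.mean(0)
    d = X.shape[1]
    Sb = np.zeros((d, d))
    Sw = np.zeros((d, d))
    for c in np.unique(yl):
        Xc = Xl[yl == c]
        mc = Xc.mean(0)
        Sb += len(Xc) * np.outer(mc - mu, mc - mu)   # between-class scatter
        Sw += (Xc - mc).T @ (Xc - mc)                # within-class scatter
    reg = Sw + alpha * X.T @ L @ X + eps * np.eye(d)
    # generalized eigenproblem: leading eigenvector of reg^{-1} Sb
    vals, vecs = np.linalg.eig(np.linalg.solve(reg, Sb))
    w = np.real(vecs[:, np.argmax(np.real(vals))])
    return w / np.linalg.norm(w)
```

On data where the classes separate along one dimension and a second dimension is pure noise, the recovered direction should load on the discriminative dimension.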
14. 14
Decision Boundary Analysis
Learning Boundary
Data Distribution
Acquire Reliable Labels with Given Rule
Semi-supervised LDA
15. 15
Decision Boundary Analysis
Feature extraction method proposed by Lee & Landgrebe:
C. Lee and D. A. Landgrebe, "Feature Extraction Based on Decision Boundaries", IEEE Trans. Pattern Anal. Mach. Intell., 15(4):388-400, 1993.
[Figure: learned boundary between class 1 and class 2, top and cross-section views with normal vectors; directions normal to the boundary are discriminantly informative, directions along it are discriminantly redundant]
Extract informative features from normal vectors on the boundary
16. 16
Decision Boundary Feature Matrix (DBFM)
Σ_DBFM = (1/K) ∫_S N(x) N(x)ᵀ p(x) dx, where S is the decision boundary, N(x) the unit normal at x, and K = ∫_S p(x) dx
Linear boundary: N is constant, so Σ_DBFM = N Nᵀ
Nonlinear boundary: estimate the integral from sampled boundary points
The DBFM defines the responsibility of each variable for discrimination
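For a linear boundary the DBFM collapses to a rank-1 matrix, and per-variable responsibility can be read off its diagonal. A small sketch (the function names and the percentage normalization are ours):

```python
import numpy as np

def linear_dbfm(w):
    """DBFM for a linear boundary w^T x + b = 0: the unit normal
    n = w/||w|| is constant along the boundary, so Sigma_DBFM = n n^T."""
    n = w / np.linalg.norm(w)
    return np.outer(n, n)

def contribution_scores(dbfm):
    """Per-variable responsibility: diagonal of the DBFM, normalized to 100%."""
    d = np.diag(dbfm)
    return 100.0 * d / d.sum()
```

For w = (3, 4, 0) the normal is (0.6, 0.8, 0), so the scores are 36%, 64%, and 0%: the third variable plays no role in the discrimination.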
17. 17
Outline
Introduction
Theories
Experiments
Application to Energy Fault Analysis
Conclusions
18. 18
Energy Fault Diagnosis Problem
EF: inverter overloaded
Detection rule: 6-hour moving average of inverter output = 100% → EF
… but the rule does not tell us the cause
[Figure: air handling unit with cold/hot water coils, an inverter-driven fan, and a humidity sensor]
19. 19
Energy Fault Diagnosis Problem
EF: inverter overloaded
Detection rule: 6-hour moving average of inverter output = 100% → EF
… but the rule does not tell us the cause
Goal: from DATA & RULE, find out the root cause of the inverter overload
[Figure: same air handling unit schematic]
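The heuristic rule on this slide is a simple moving-average threshold. A sketch of how such a detector could be coded (the function name and the valid-window handling are our own; the slides only state the rule itself):

```python
import numpy as np

def detect_energy_fault(inverter_output, window=6, threshold=100.0):
    """Heuristic EF rule from the slides: flag hours where the 6-hour
    moving average of inverter output (%) reaches 100 (overload).
    A flag is placed at the last hour of each full window."""
    ma = np.convolve(inverter_output, np.ones(window), mode="valid") / window
    flags = np.zeros(len(inverter_output), dtype=bool)
    flags[window - 1:] = ma >= threshold  # rule fires only once a full window exists
    return flags
```

With 6 hours at 50% followed by 10 hours at 100%, the rule first fires at hour 11, the first hour whose trailing 6-hour window is entirely at 100%.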
20. 20
Energy Fault Diagnosis - Settings
Air-conditioning time-series sensor data for 1 unit
Instances: 744
Labeled samples: 10 for each class (3% of all),
drawn with probability proportional to distance from the boundary
Hyper-parameters: nearest neighbors = 5
13 attributes, all continuous:
1. Supply Air (SA) Temp.           8. Humidifier Valve Opening
2. Room Temp.                      9. Return Air Temp.
3. SA Temp. Setting               10. Pressure Diff. between In-Outside
4. Room Humidity                  11. Moving Avg. of Pressure Difference
5. Inverter Output                12. Outside Air Temp.
6. Cooling Water Valve Opening    13. Outside Humidity
7. Hot Water Valve Opening
22. 22
Results (100-run average)
[Bar chart: contribution score (%) per attribute, LDA]
<LDA>
Inverter (96%): trivial
23. 23
Results (100-run average)
[Bar chart: contribution score (%) per attribute, LDA vs. SSLDA]
<LDA>: Inverter (96%)
<SSLDA>: Cooling water (75%), SA temp. (12%)
24. 24
Results (100-run average)
[Bar chart: contribution score (%) per attribute, LDA / SSLDA / KDA; the KDA scores are not distinctive]
<LDA>: Inverter (96%)
<SSLDA>: Cooling water (75%), SA temp. (12%)
<KDA>: Cooling water (19%), MA pressure (15%), Inverter (15%), …
25. 25
Results (100-run average)
[Bar chart: contribution score (%) per attribute, LDA / SSLDA / KDA / SSKDA; key attributes marked: [1] SA temp., [2] SA setting, [3] cooling water]
<LDA>: Inverter (96%)
<SSLDA>: Cooling water (75%), SA temp. (12%)
<KDA>: Cooling water (19%), MA pressure (15%), Inverter (15%), …
<SSKDA>: Inverter (33%), SA temp. (19%), Cooling water (17%), SA setting (13%), …
26. 26
Energy Fault Diagnosis: Examine Raw Data
Cooling water valve opening [3]:
the valve opens completely, but this is a result of the EF, not the cause
27. 27
Energy Fault Diagnosis: Examine Raw Data (cont.)
Cooling water valve opening: the valve opens completely, but this is a result of the EF, not the cause.
SSLDA/SSKDA show that SA temp. [1] & SA setting [2] are responsible: the supply air temperature deviates from its setting. To reduce this deviation, the system
• operates the inverter at peak power
• opens the cooling water valve
29. 29
Outline
Introduction
Theories
Experiments for Real Data
Conclusions
30. 30
Conclusions
Introduced an identification method for causal variables by combining semi-supervised LDA & DBA
Labels are acquired from an imperfect domain-specific rule
SS-LDA/SS-KDA: reflect domain knowledge & avoid over-fitting
DBA: extracts informative features from the normal direction of the boundary
Applied to energy fault cause diagnosis
Succeeded in extracting responsible features,
beginning with fuzzy heuristics based on domain knowledge
31. 31
Room for improvements
Consider temporal continuity: time-series data are not i.i.d.
Find the true cause among correlated variables
34. 34
Extension to Multiple Energy Faults
In real systems, various faults take place
The fault cause varies among phenomena
Need to separate the phenomena and diagnose them respectively
<Our Approach>
1. Extract points detected by existing heuristics
2. Reduce dimensionality and visualize the data in a low-dim. space
3. Cluster the data and give the clusters labels
4. Identify the variables discriminating each cluster from normal data
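The four steps above can be sketched as a small pipeline. This is an illustrative sketch only: the function name is ours, scikit-learn's Isomap and k-means stand in for the slides' dimensionality reduction and clustering, and step 4 (SS-KDA per cluster) is left downstream:

```python
import numpy as np
from sklearn.manifold import Isomap
from sklearn.cluster import KMeans

def separate_fault_phenomena(X, detected, n_neighbors=5, n_clusters=2, seed=0):
    """Sketch of the multi-fault approach:
    1) keep points flagged by the existing heuristics,
    2) embed them in 2-D with Isomap,
    3) cluster the embedding so each cluster ~ one fault phenomenon,
    4) (downstream) discriminate each cluster from normal data, e.g. with SS-KDA.
    Returns the 2-D embedding and a cluster label per detected point."""
    X_det = X[detected]
    emb = Isomap(n_neighbors=n_neighbors, n_components=2).fit_transform(X_det)
    labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit_predict(emb)
    return emb, labels
```

Each resulting cluster can then be labeled as one fault phenomenon and diagnosed separately against the normal data.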
35. 35
Experimental Condition & Results
Air-conditioning sensor data, 13 attributes, same heuristics
748 instances, operating time only (hourly data for 2 months)
137 points are detected by the heuristics
Reduce dimensionality by Isomap [J.B. Tenenbaum 00] (kNN = 5)
Contribution score is given by SS-KDA (kNN = 5)
<2D representation>
2 major clusters, 4 anomalies
36. 36
Contribution Score for the Red Cluster
(same setting: 748 instances, 13 attributes, Isomap + SS-KDA)
Room Air Temp. scores highest, but this is superficial:
the deviation of Room Air Temp. around the detected points is what the heuristics detect; this is the EF itself
[2-D representation: 2 major clusters, 4 anomalies]
38. 38
Data Distribution
[Scatter plot of cooling water valve opening (%): the data are linearly separable for the cooling water valve variable [3]]
39. 39
Probabilistic Labeling
Points distant from the rule boundary are reliable as class labels; points near the boundary, and outliers, are unreliable.
Points are stochastically given labels based on this reliability, with probability proportional to the distance of each point from the boundary.
This keeps the labeling robust against outliers.
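A minimal sketch of this labeling scheme, assuming the rule reports a tentative class and a distance from its boundary for each point (the function name, the `-1 = unlabeled` convention, and the sampling details are ours):

```python
import numpy as np

def probabilistic_labels(X, rule_fn, n_per_class=10, seed=0):
    """Probabilistic labeling sketch: the heuristic rule gives each point a
    tentative class and a distance from its boundary; labeled samples are
    drawn with probability proportional to that distance, so near-boundary
    (unreliable) points rarely receive labels."""
    rng = np.random.default_rng(seed)
    cls, dist = rule_fn(X)               # tentative class + distance from boundary
    labeled = np.full(len(X), -1)        # -1 = unlabeled
    for c in (0, 1):
        idx = np.where(cls == c)[0]
        p = dist[idx] / dist[idx].sum()  # sampling probability ~ distance
        pick = rng.choice(idx, size=min(n_per_class, len(idx)), replace=False, p=p)
        labeled[pick] = c
    return labeled
```

With a simple threshold rule (class = sign of the first feature, distance = its magnitude), exactly `n_per_class` points per class get labels and the rest stay unlabeled.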
40. 40
Estimating the DBFM
Linear case: the normal vector is constant and is read off directly from the discriminant weights.
Nonlinear case: it is difficult to acquire points on the boundary and to calculate gradient vectors in the input space.
However, the discriminant function is linear in feature space, so we kernelize SS-LDA (SS-KDA) and work there.
41. 41
DBFM for Nonlinear Distribution (1)
1. Generate points on the boundary in feature space
2. Compute the gradient vector at the corresponding input point (closed form for the Gaussian kernel)
Finding the pre-image of a feature-space point is generally difficult, but by the kernel trick the pre-image problem is avoidable.
42. 42
DBFM for Nonlinear Distribution (2)
Finally we have gradient vectors on the boundary for each point.
3. Construct the estimated DBFM from the unit normals; the eigenvectors with the largest eigenvalues define the responsibility of each variable for discrimination.
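A numerical sketch of the nonlinear estimate for a Gaussian-kernel discriminant f(x) = Σᵢ αᵢ k(xᵢ, x) + b. Instead of working in feature space, this version finds boundary points by bisecting between opposite-sign training pairs in input space; the pairing and bisection are our own simplification of the boundary-sampling step:

```python
import numpy as np

def gauss_k(A, B, sigma=1.0):
    # Gaussian kernel matrix; grad wrt x of k(x_i, x) is (x_i - x)/sigma^2 * k
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma**2))

def estimate_dbfm(X, alpha, b, sigma=1.0, n_steps=30):
    """Estimated DBFM sketch: for each pair of training points with opposite
    signs of f, bisect along the segment until f ~ 0, take the unit gradient
    there as the boundary normal, and average the outer products n n^T."""
    f = lambda x: gauss_k(x[None, :], X, sigma)[0] @ alpha + b
    def grad(x):
        k = gauss_k(x[None, :], X, sigma)[0]
        return ((X - x) / sigma**2 * (alpha * k)[:, None]).sum(0)
    s = np.array([f(x) for x in X])
    normals = []
    for i in np.where(s > 0)[0]:
        for j in np.where(s < 0)[0]:
            lo, hi = X[i], X[j]
            for _ in range(n_steps):            # bisection toward f = 0
                mid = (lo + hi) / 2
                lo, hi = (mid, hi) if f(mid) > 0 else (lo, mid)
            n = grad((lo + hi) / 2)
            normals.append(n / np.linalg.norm(n))
    N = np.array(normals)
    return N.T @ N / len(N)                     # estimated DBFM
```

For two clusters separated along the first dimension, the boundary normals all point along that dimension, so the estimated DBFM concentrates its mass on the first diagonal entry.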
43. 43
Verification on Benchmark Data - Wine Discrimination
UCI Machine Learning Repository: Wine Dataset
Consider a 2-class problem (the original data contain 3 classes)
Number of instances: wine A: 59, wine B: 71
13 attributes, all continuous:
1. Alcohol               8. Nonflavonoid phenols
2. Malic acid            9. Proanthocyanins
3. Ash                  10. Color intensity
4. Alkalinity of ash    11. Hue
5. Magnesium            12. OD280/OD315 of diluted wines
6. Phenols              13. Proline
7. Flavonoids
Ad hoc rule: color intensity > 4 → wine A, otherwise wine B
[Histogram: frequency vs. color intensity for the two classes]
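The benchmark setup can be reproduced with scikit-learn's copy of the same UCI data. A sketch, assuming sklearn's class 0/1 correspond to wine A/B (the 59/71 instance counts match); the function name and returned distance are ours:

```python
import numpy as np
from sklearn.datasets import load_wine

def adhoc_wine_labels():
    """Sketch of the benchmark setup: keep only the first two classes of the
    UCI wine data (59 + 71 instances) and apply the ad hoc rule
    'color intensity > 4 -> wine A, otherwise wine B' as the imperfect
    domain rule; also return each point's distance from the rule boundary,
    for the probabilistic labeling step."""
    data = load_wine()
    mask = data.target < 2                       # restrict to a 2-class problem
    X, y = data.data[mask], data.target[mask]
    ci = X[:, data.feature_names.index("color_intensity")]
    rule = np.where(ci > 4, 0, 1)                # 0 = wine A, 1 = wine B
    dist = np.abs(ci - 4.0)                      # distance from the rule boundary
    return X, y, rule, dist
```

The rule is deliberately imperfect: it agrees with the true classes on most but not all instances, which is exactly the setting the semi-supervised step is meant to handle.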
44. 44
Result on Benchmark Data
Acquire only 3 labels for each class, based on probability proportional to distance from the boundary (color intensity = 4)
Hyper-parameters: nearest neighbors = 3
100-run average; 3 most responsible attributes:
<LDA>
1. Flavonoids (7): 18.0%
2. Color intensity (10): 13.2%
3. Phenols (6): 11.6%
[total: 42.8%]
<SS-LDA>
1. Proline (13): 26.5%
2. Color intensity (10): 22.1%
3. Alcohol (1): 14.2%
[total: 62.8%]
45. 45
Comparison of SSLDA with LDA
Plot the data in the space spanned by the 3 most responsible features
[Scatter plots: LDA vs. SSLDA]
SSLDA clearly gives more effective features for discrimination