K. Yoshida, M. Inui, T. Yairi, K. Machida, M. Shioya, and Y. Masukawa, "Identification of Causal Variables for Building Energy Fault Detection by Semi-supervised LDA and Decision Boundary Analysis", in Proc. ICDM Workshops, 2008, pp. 164-173.
Identification of Causal Variables for Building Energy Fault Detection by Semi-supervised LDA
1. 2nd Workshop on Domain Driven Data Mining, Session I : S2208
Dec. 15, 2008
Palazzo dei Congressi, Pisa, Italy
Identification of Causal Variables
for Building Energy Fault Detection
by Semi-supervised LDA
&
Decision Boundary Analysis
Keigo Yoshida, Minoru Inui, Takehisa Yairi, Kazuo Machida
(Dept. of Aeronautics & Astronautics, the Univ. of Tokyo)
Masaki Shioya, and Yoshio Masukawa
(Kajima Corp.)
2. 2
Main Point of the Presentation
We propose …
A Supportive Method for Anomaly Cause Identification
by
Combining Traditional Data Analysis
and Domain Knowledge
Applied to Real Building Energy Management System (BEMS)
The root cause of energy waste was found successfully
3. 3
Outline
Introduction
Theories
Experiments for Real Data
Conclusions
4. 4
Introduction: What is BEMS ?
Building Energy Management Systems
Collect/Monitor Sensor Data in BLDG
(temperature, heat consumption etc…)
Energy-efficient Control
Discover Energy Faults (wastes)
[Diagram: BEMS interface (I/F) to building equipment]
5. 5
Introduction: Problem of BEMS
Hard to identify root causes of Energy Faults (EF)
Complex relations between equipment
Data deluge from numerous sensors
(approx. 2,000 sensors for a 20-story building)
Current EF detection:
heuristics based on experts' empirical knowledge, usually fuzzy "IF-THEN" rules
"Heuristic diagnostics is incomplete"
Fuzziness → false-negative errors
Detection-only → cannot improve systems
6. 6
Early Fault Diagnosis Methods
[Diagram: methods arranged from experts (knowledge-based) to source data (data-driven), with performance on the vertical axis]
Knowledge-based: expert systems, fuzzy logic, FTA/FMEA
Modeling-based: feature extraction, Bayesian filtering
Data-driven: supervised/unsupervised learning, data mining, neural networks, FDA, …
Trade-offs (knowledge-based ↔ data-driven):
Interpretation: easy ↔ hard
Modeling cost: expensive ↔ low
Versatility: poor ↔ high
Pitfall: knowledge-acquisition bottleneck ↔ neglecting useful knowledge
7. 7
Proposed Method
[Diagram: the proposal sits between the knowledge-based and data-driven corners, combining domain knowledge with data analysis]
Proposal: domain knowledge + data analysis
- Characteristics -
Interpretation: easy; exploits domain knowledge
Cost: not so high; requires empirical knowledge only
Versatility: easy to apply to various domains & problems
Performance: better than heuristics
8. 8
Conceptual Diagram
[Diagram: the experts' detection rule is used to acquire reliable labels from the data distribution; semi-supervised LDA learns a boundary; DBA identifies each variable's contribution to the EF (per variable #); the findings are fed back to the experts]
* Assumption *
The incomplete heuristics surely represent abnormal phenomena
9. 9
Outline
Introduction
Theories
Semi-Supervised Linear Discriminant Analysis
Decision Boundary Analysis
Experiments for Real Data
Conclusions
10. 10
Semi-supervised LDA
Learning Boundary
Data Distribution
Acquire Reliable Labels with Given Rule
11. 11
Manifold Regularization [M. Belkin et al. 05]
Labeled data only: Regularized Least Squares (RLS)
f* = argmin_f (1/l) Σ_{i=1..l} (y_i − f(x_i))² + γ_A ||f||²_K
(squared loss for labeled data + penalty term, usually a squared function norm)
12. 12
Manifold Regularization [M. Belkin et al. 05]
Laplacian RLS adds a term for the intrinsic geometry, using labeled & unlabeled data:
f* = argmin_f (1/l) Σ_{i=1..l} (y_i − f(x_i))² + γ_A ||f||²_K + (γ_I / (u+l)²) fᵀ L f
(squared loss for labeled data + penalty term + additional term for intrinsic geometry)
Assumption: geometrically close ⇒ similar label; L: graph Laplacian
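The Laplacian RLS objective above has a closed-form kernel solution. A minimal numpy sketch; the RBF kernel, the kNN heat-kernel graph, and all parameter values here are our own choices, not taken from the slides:

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    # Gaussian (RBF) kernel matrix between row sets A and B
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def knn_graph_laplacian(X, k=5, gamma=1.0):
    # symmetric kNN graph with heat-kernel weights; L = D - W
    n = len(X)
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.zeros((n, n))
    for i in range(n):
        nbrs = np.argsort(d2[i])[1:k + 1]       # skip self
        W[i, nbrs] = np.exp(-gamma * d2[i, nbrs])
    W = np.maximum(W, W.T)
    return np.diag(W.sum(1)) - W

def lap_rls(X, y, labeled, gamma_A=1e-2, gamma_I=1e-2, k=5):
    """Laplacian RLS: alpha = (J K + gA*l*I + gI*l/(l+u)^2 * L K)^{-1} Y,
    where J zeroes out the loss on unlabeled points (Belkin et al.)."""
    n, l = len(X), int(labeled.sum())
    K = rbf_kernel(X, X)
    L = knn_graph_laplacian(X, k)
    J = np.diag(labeled.astype(float))
    Y = np.where(labeled, y, 0.0)
    A = J @ K + gamma_A * l * np.eye(n) + gamma_I * l / n**2 * (L @ K)
    alpha = np.linalg.solve(A, Y)
    return lambda Xq: rbf_kernel(Xq, X) @ alpha  # f(x) = sum_i alpha_i k(x_i, x)
```

With two well-separated clusters and a single labeled point per class, the graph term propagates the labels across each cluster.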
13. 13
Semi-Supervised Linear Discriminant Analysis (SS-LDA)
LDA seeks a projection with small within-class and large between-class covariance:
max_w (wᵀ S_b w) / (wᵀ S_w w)
S_b: between-class scatter, S_w: within-class scatter
Regularized Discriminant Analysis [Friedman 89] adds a regularizer to the within-class scatter: S_w + βI
SS-LDA further adds a graph-Laplacian term built from labeled & unlabeled data:
max_w (wᵀ S_b w) / (wᵀ (S_w + βI + α Xᵀ L X) w)
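A minimal numpy sketch of one common semi-supervised discriminant formulation (scatter matrices from the labeled points, a graph-Laplacian regularizer over all points); the exact regularizer on the slide may differ, and `rbf_laplacian`, `alpha`, and `eps` are our own choices:

```python
import numpy as np

def rbf_laplacian(X, gamma=1.0):
    # dense heat-kernel graph Laplacian L = D - W over all points
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.exp(-gamma * d2)
    np.fill_diagonal(W, 0.0)
    return np.diag(W.sum(1)) - W

def ss_lda_direction(X, y, labeled, L, alpha=0.1, eps=1e-6):
    """SS-LDA sketch: maximize w^T Sb w / w^T (Sw + alpha X^T L X) w.
    Sb, Sw come from labeled points only; L uses labeled & unlabeled data."""
    Xl, yl = X[labeled], y[labeled]
    mu = Xl.mean(0)
    d = X.shape[1]
    Sb = np.zeros((d, d))
    Sw = np.zeros((d, d))
    for c in np.unique(yl):
        Xc = Xl[yl == c]
        mc = Xc.mean(0)
        Sb += len(Xc) * np.outer(mc - mu, mc - mu)   # between-class scatter
        Sw += (Xc - mc).T @ (Xc - mc)                # within-class scatter
    reg = Sw + alpha * X.T @ L @ X + eps * np.eye(d)
    # generalized eigenproblem: leading eigenvector of reg^{-1} Sb
    vals, vecs = np.linalg.eig(np.linalg.solve(reg, Sb))
    w = np.real(vecs[:, np.argmax(np.real(vals))])
    return w / np.linalg.norm(w)
```

On data where the classes separate along one dimension and a second dimension is pure noise, the recovered direction should load on the discriminative dimension.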
14. 14
Decision Boundary Analysis
Learning Boundary
Data Distribution
Acquire Reliable Labels with Given Rule
Semi-supervised LDA
15. 15
Decision Boundary Analysis
Feature extraction method proposed by Lee & Landgrebe:
C. Lee and D. A. Landgrebe, "Feature Extraction Based on Decision Boundaries", IEEE Trans. Pattern Anal. Mach. Intell., 15(4):388-400, 1993.
[Figure: learned boundary between class 1 and class 2, top and cross-section views with normal vectors; directions normal to the boundary are discriminantly informative, directions along it are discriminantly redundant]
Extract informative features from normal vectors on the boundary
16. 16
Decision Boundary Feature Matrix (DBFM)
Σ_DBFM = (1/K) ∫_S N(x) N(x)ᵀ p(x) dx, where S is the decision boundary, N(x) the unit normal at x, and K = ∫_S p(x) dx
Linear boundary: N is constant, so Σ_DBFM = N Nᵀ
Nonlinear boundary: estimate the integral from sampled boundary points
The DBFM defines the responsibility of each variable for discrimination
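For a linear boundary the DBFM collapses to a rank-1 matrix, and per-variable responsibility can be read off its diagonal. A small sketch (the function names and the percentage normalization are ours):

```python
import numpy as np

def linear_dbfm(w):
    """DBFM for a linear boundary w^T x + b = 0: the unit normal
    n = w/||w|| is constant along the boundary, so Sigma_DBFM = n n^T."""
    n = w / np.linalg.norm(w)
    return np.outer(n, n)

def contribution_scores(dbfm):
    """Per-variable responsibility: diagonal of the DBFM, normalized to 100%."""
    d = np.diag(dbfm)
    return 100.0 * d / d.sum()
```

For w = (3, 4, 0) the normal is (0.6, 0.8, 0), so the scores are 36%, 64%, and 0%: the third variable plays no role in the discrimination.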
17. 17
Outline
Introduction
Theories
Experiments
Application to Energy Fault Analysis
Conclusions
18. 18
Energy Fault Diagnosis Problem
EF: inverter overloaded
Detection rule: 6-hour moving average of inverter output = 100% → EF
… but the rule does not tell us the cause
[Figure: air handling unit with cold/hot water coils, an inverter-driven fan, and a humidity sensor]
19. 19
Energy Fault Diagnosis Problem
EF: inverter overloaded
Detection rule: 6-hour moving average of inverter output = 100% → EF
… but the rule does not tell us the cause
Goal: from DATA & RULE, find out the root cause of the inverter overload
[Figure: same air handling unit schematic]
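The heuristic rule on this slide is a simple moving-average threshold. A sketch of how such a detector could be coded (the function name and the valid-window handling are our own; the slides only state the rule itself):

```python
import numpy as np

def detect_energy_fault(inverter_output, window=6, threshold=100.0):
    """Heuristic EF rule from the slides: flag hours where the 6-hour
    moving average of inverter output (%) reaches 100 (overload).
    A flag is placed at the last hour of each full window."""
    ma = np.convolve(inverter_output, np.ones(window), mode="valid") / window
    flags = np.zeros(len(inverter_output), dtype=bool)
    flags[window - 1:] = ma >= threshold  # rule fires only once a full window exists
    return flags
```

With 6 hours at 50% followed by 10 hours at 100%, the rule first fires at hour 11, the first hour whose trailing 6-hour window is entirely at 100%.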
20. 20
Energy Fault Diagnosis - Settings
Air-conditioning time-series sensor data for 1 unit
Instances: 744
Labeled samples: 10 for each class (3% of all),
drawn with probability proportional to distance from the boundary
Hyper-parameters: nearest neighbors = 5
13 attributes, all continuous:
1. Supply Air (SA) Temp.           8. Humidifier Valve Opening
2. Room Temp.                      9. Return Air Temp.
3. SA Temp. Setting               10. Pressure Diff. between In-Outside
4. Room Humidity                  11. Moving Avg. of Pressure Difference
5. Inverter Output                12. Outside Air Temp.
6. Cooling Water Valve Opening    13. Outside Humidity
7. Hot Water Valve Opening
22. 22
Results (100-run average)
[Bar chart: contribution score (%) per attribute, LDA]
<LDA>
Inverter (96%): trivial
23. 23
Results (100-run average)
[Bar chart: contribution score (%) per attribute, LDA vs. SSLDA]
<LDA>: Inverter (96%)
<SSLDA>: Cooling water (75%), SA temp. (12%)
24. 24
Results (100-run average)
[Bar chart: contribution score (%) per attribute, LDA / SSLDA / KDA; the KDA scores are not distinctive]
<LDA>: Inverter (96%)
<SSLDA>: Cooling water (75%), SA temp. (12%)
<KDA>: Cooling water (19%), MA pressure (15%), Inverter (15%), …
25. 25
Results (100-run average)
[Bar chart: contribution score (%) per attribute, LDA / SSLDA / KDA / SSKDA; key attributes marked: [1] SA temp., [2] SA setting, [3] cooling water]
<LDA>: Inverter (96%)
<SSLDA>: Cooling water (75%), SA temp. (12%)
<KDA>: Cooling water (19%), MA pressure (15%), Inverter (15%), …
<SSKDA>: Inverter (33%), SA temp. (19%), Cooling water (17%), SA setting (13%), …
26. 26
Energy Fault Diagnosis: Examine Raw Data
Cooling water valve opening [3]:
the valve opens completely, but this is a result of the EF, not the cause
27. 27
Energy Fault Diagnosis: Examine Raw Data (cont.)
Cooling water valve opening: the valve opens completely, but this is a result of the EF, not the cause.
SSLDA/SSKDA show that SA temp. [1] & SA setting [2] are responsible: the supply air temperature deviates from its setting. To reduce this deviation, the system
• operates the inverter at peak power
• opens the cooling water valve
29. 29
Outline
Introduction
Theories
Experiments for Real Data
Conclusions
30. 30
Conclusions
Introduced an identification method for causal variables by combining semi-supervised LDA & DBA
Labels are acquired from an imperfect domain-specific rule
SS-LDA/SS-KDA: reflect domain knowledge & avoid over-fitting
DBA: extracts informative features from the normal direction of the boundary
Applied to energy fault cause diagnosis
Succeeded in extracting responsible features,
beginning with fuzzy heuristics based on domain knowledge
31. 31
Room for improvements
Consider temporal continuity: time-series data are not i.i.d.
Find the true cause among correlated variables
34. 34
Extension to Multiple Energy Faults
In real systems, various faults take place
The fault cause varies among phenomena
Need to separate the phenomena and diagnose them respectively
<Our Approach>
1. Extract points detected by existing heuristics
2. Reduce dimensionality and visualize the data in a low-dim. space
3. Cluster the data and give the clusters labels
4. Identify the variables discriminating each cluster from normal data
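The four steps above can be sketched as a small pipeline. This is an illustrative sketch only: the function name is ours, scikit-learn's Isomap and k-means stand in for the slides' dimensionality reduction and clustering, and step 4 (SS-KDA per cluster) is left downstream:

```python
import numpy as np
from sklearn.manifold import Isomap
from sklearn.cluster import KMeans

def separate_fault_phenomena(X, detected, n_neighbors=5, n_clusters=2, seed=0):
    """Sketch of the multi-fault approach:
    1) keep points flagged by the existing heuristics,
    2) embed them in 2-D with Isomap,
    3) cluster the embedding so each cluster ~ one fault phenomenon,
    4) (downstream) discriminate each cluster from normal data, e.g. with SS-KDA.
    Returns the 2-D embedding and a cluster label per detected point."""
    X_det = X[detected]
    emb = Isomap(n_neighbors=n_neighbors, n_components=2).fit_transform(X_det)
    labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit_predict(emb)
    return emb, labels
```

Each resulting cluster can then be labeled as one fault phenomenon and diagnosed separately against the normal data.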
35. 35
Experimental Condition & Results
Air-conditioning sensor data, 13 attributes, same heuristics
748 instances, operating time only (hourly data for 2 months)
137 points are detected by the heuristics
Reduce dimensionality by Isomap [J.B. Tenenbaum 00] (kNN = 5)
Contribution score is given by SS-KDA (kNN = 5)
<2D representation>
2 major clusters, 4 anomalies
36. 36
Contribution Score for the Red Cluster
(same setting: 748 instances, 13 attributes, Isomap + SS-KDA)
Room Air Temp. scores highest, but this is superficial:
the deviation of Room Air Temp. around the detected points is what the heuristics detect; this is the EF itself
[2-D representation: 2 major clusters, 4 anomalies]
38. 38
Data Distribution
[Scatter plot of cooling water valve opening (%): the data are linearly separable for the cooling water valve variable [3]]
39. 39
Probabilistic Labeling
Points distant from the rule boundary are reliable as class labels; points near the boundary, and outliers, are unreliable.
Points are stochastically given labels based on this reliability, with probability proportional to the distance of each point from the boundary.
This keeps the labeling robust against outliers.
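A minimal sketch of this labeling scheme, assuming the rule reports a tentative class and a distance from its boundary for each point (the function name, the `-1 = unlabeled` convention, and the sampling details are ours):

```python
import numpy as np

def probabilistic_labels(X, rule_fn, n_per_class=10, seed=0):
    """Probabilistic labeling sketch: the heuristic rule gives each point a
    tentative class and a distance from its boundary; labeled samples are
    drawn with probability proportional to that distance, so near-boundary
    (unreliable) points rarely receive labels."""
    rng = np.random.default_rng(seed)
    cls, dist = rule_fn(X)               # tentative class + distance from boundary
    labeled = np.full(len(X), -1)        # -1 = unlabeled
    for c in (0, 1):
        idx = np.where(cls == c)[0]
        p = dist[idx] / dist[idx].sum()  # sampling probability ~ distance
        pick = rng.choice(idx, size=min(n_per_class, len(idx)), replace=False, p=p)
        labeled[pick] = c
    return labeled
```

With a simple threshold rule (class = sign of the first feature, distance = its magnitude), exactly `n_per_class` points per class get labels and the rest stay unlabeled.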
40. 40
Estimating the DBFM
Linear case: the normal vector is constant and is read off directly from the discriminant weights.
Nonlinear case: it is difficult to acquire points on the boundary and to calculate gradient vectors in the input space.
However, the discriminant function is linear in feature space, so we kernelize SS-LDA (SS-KDA) and work there.
41. 41
DBFM for Nonlinear Distribution (1)
1. Generate points on the boundary in feature space
2. Compute the gradient vector at the corresponding input point (closed form for the Gaussian kernel)
Finding the pre-image of a feature-space point is generally difficult, but by the kernel trick the pre-image problem is avoidable.
42. 42
DBFM for Nonlinear Distribution (2)
Finally we have gradient vectors on the boundary for each point.
3. Construct the estimated DBFM from the unit normals; the eigenvectors with the largest eigenvalues define the responsibility of each variable for discrimination.
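A numerical sketch of the nonlinear estimate for a Gaussian-kernel discriminant f(x) = Σᵢ αᵢ k(xᵢ, x) + b. Instead of working in feature space, this version finds boundary points by bisecting between opposite-sign training pairs in input space; the pairing and bisection are our own simplification of the boundary-sampling step:

```python
import numpy as np

def gauss_k(A, B, sigma=1.0):
    # Gaussian kernel matrix; grad wrt x of k(x_i, x) is (x_i - x)/sigma^2 * k
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma**2))

def estimate_dbfm(X, alpha, b, sigma=1.0, n_steps=30):
    """Estimated DBFM sketch: for each pair of training points with opposite
    signs of f, bisect along the segment until f ~ 0, take the unit gradient
    there as the boundary normal, and average the outer products n n^T."""
    f = lambda x: gauss_k(x[None, :], X, sigma)[0] @ alpha + b
    def grad(x):
        k = gauss_k(x[None, :], X, sigma)[0]
        return ((X - x) / sigma**2 * (alpha * k)[:, None]).sum(0)
    s = np.array([f(x) for x in X])
    normals = []
    for i in np.where(s > 0)[0]:
        for j in np.where(s < 0)[0]:
            lo, hi = X[i], X[j]
            for _ in range(n_steps):            # bisection toward f = 0
                mid = (lo + hi) / 2
                lo, hi = (mid, hi) if f(mid) > 0 else (lo, mid)
            n = grad((lo + hi) / 2)
            normals.append(n / np.linalg.norm(n))
    N = np.array(normals)
    return N.T @ N / len(N)                     # estimated DBFM
```

For two clusters separated along the first dimension, the boundary normals all point along that dimension, so the estimated DBFM concentrates its mass on the first diagonal entry.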
43. 43
Verification on Benchmark Data - Wine Discrimination
UCI Machine Learning Repository: Wine Dataset
Consider a 2-class problem (the original data contain 3 classes)
Number of instances: wine A: 59, wine B: 71
13 attributes, all continuous:
1. Alcohol               8. Nonflavonoid phenols
2. Malic acid            9. Proanthocyanins
3. Ash                  10. Color intensity
4. Alkalinity of ash    11. Hue
5. Magnesium            12. OD280/OD315 of diluted wines
6. Phenols              13. Proline
7. Flavonoids
Ad hoc rule: color intensity > 4 → wine A, otherwise wine B
[Histogram: frequency vs. color intensity for the two classes]
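The benchmark setup can be reproduced with scikit-learn's copy of the same UCI data. A sketch, assuming sklearn's class 0/1 correspond to wine A/B (the 59/71 instance counts match); the function name and returned distance are ours:

```python
import numpy as np
from sklearn.datasets import load_wine

def adhoc_wine_labels():
    """Sketch of the benchmark setup: keep only the first two classes of the
    UCI wine data (59 + 71 instances) and apply the ad hoc rule
    'color intensity > 4 -> wine A, otherwise wine B' as the imperfect
    domain rule; also return each point's distance from the rule boundary,
    for the probabilistic labeling step."""
    data = load_wine()
    mask = data.target < 2                       # restrict to a 2-class problem
    X, y = data.data[mask], data.target[mask]
    ci = X[:, data.feature_names.index("color_intensity")]
    rule = np.where(ci > 4, 0, 1)                # 0 = wine A, 1 = wine B
    dist = np.abs(ci - 4.0)                      # distance from the rule boundary
    return X, y, rule, dist
```

The rule is deliberately imperfect: it agrees with the true classes on most but not all instances, which is exactly the setting the semi-supervised step is meant to handle.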
44. 44
Result on Benchmark Data
Acquire only 3 labels for each class, based on probability proportional to distance from the boundary (color intensity = 4)
Hyper-parameters: nearest neighbors = 3
100-run average; 3 most responsible attributes:
<LDA>
1. Flavonoids (7): 18.0%
2. Color intensity (10): 13.2%
3. Phenols (6): 11.6%
[total: 42.8%]
<SS-LDA>
1. Proline (13): 26.5%
2. Color intensity (10): 22.1%
3. Alcohol (1): 14.2%
[total: 62.8%]
45. 45
Comparison of SSLDA with LDA
Plot the data in the space spanned by the 3 most responsible features
[Scatter plots: LDA vs. SSLDA]
SSLDA clearly gives more effective features for discrimination