Wastian, Brunmeir - Data Analyses in Industrial Applications: From Predictive Maintenance to Image Understanding
1. Data Analysis in Industrial Applications:
From Predictive Maintenance
to Image Understanding
2016 Taipei Tech Workshop, Technikum Wien, 22.11.2016
DI Matthias Wastian (dwh GmbH), DI Dominik Brunmeir (dwh OG)
matthias.wastian@dwh.at, dominik.brunmeir@dwh.at
2. Presentation Outline
• Who We Are
• Some Definitions
– Machine Learning
– Data Mining
– Deep Learning
• Natural Language Processing
– Telecommunication Patent Classification
– Speech Analysis of Austrian Politicians
• Predictive Maintenance
– Server Outage Prediction
• Image Understanding
– Object Detection Using HOG Features
– Deep Inspection
– Automatic Optical Inspection of Humidity Sensors
3. dwh GmbH
• Founded 2004, GmbH since 2010
• 16 employees
• 17 master's theses
• 6 finished dissertations
• 6 current dissertations
• >90 publications
• Bosses:
– Niki Popper
– Michael Landsiedl
5. Definitions
Machine Learning
• is a field of study that gives computers the ability to learn without being
explicitly programmed (Arthur Samuel, 1959).
• The field of machine learning is concerned with the question of how to
construct computer programs that automatically improve with experience.
• A computer program is said to learn from experience E with respect to
some class of tasks T and performance measure P, if its performance at
tasks in T, as measured by P, improves with experience E (Tom Mitchell,
1997).
Data Mining
• is the analysis of (often large) observational data sets to find unsuspected
relationships and to summarize the data in novel ways that are both
understandable and useful to the data owner (David Hand, 2001).
6. Definitions
Deep Learning
• is learning using one of a set of algorithms that
attempt to model high-level abstractions in data
by using model architectures composed of
multiple non-linear transformations.
• One of the promises of deep learning is replacing
handcrafted features with efficient algorithms for
unsupervised or semi-supervised feature learning
and hierarchical feature extraction.
7. Telecommunication Patent
Classification
Example EP1696821B1
• Method and device for automatically detecting mating of animals
• Abstract: The inventive device (110, 210, 310, 510) for
automatically detecting the mating of animals is wearable by an
animal (100) and comprises means (105, 505) for fixing to an
animal, means (140) for detecting an attempt of mating a female
animal (120) by said animal, means (145, 180, 345, 580) for
identifying an electronic label which is introduced in the body of
said female animal and actuated by said detection means and/or
by the female animal identification means by processing the image
of at least one part of the female animal triggered by said detection
means. In the preferred embodiment, means for identifying said
other animal comprises means for communicating with the
electronic label (130) carried by a female animal conspecific with
the animal triggered by said detection means. In one of the
embodiments, communications means is embodied in such a way
that it reads the electronic label identifier of each female animal
which said animal attempts to mate and storing means (160)
memorises each displayed identifier. In the other embodiment,
communication means is provided with a device for storing
representative information on the attempted mating in the random
access memory of the electronic label carried by the conspecific
female animal.
8. Telecommunication Patent
Classification
• Several thousand classified patents were used
to derive a classification model for millions of
3GPP patents from Korea, Japan, China, the US
and Europe.
• The data used included the publication
number, the abstract, the abstract of the DWPI
and the claims of the patent.
10. Natural Language Processing
Word Counts
• Measuring similarity: scalar product
• Problem: document length, solution: normalize
Tf-idf
• Common words (stop words: a, the, in...)
vs rare words (names, technical terms,...)
• Important words: common locally, rare globally
• Term frequency times $\log \frac{\#\text{docs}}{1 + \#\text{docs using word}}$
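The two ideas on this slide can be sketched in a few lines of plain Python. This is a minimal illustration, not the pipeline used in the project: tokenization by whitespace and the function names `word_counts`, `cosine_similarity` and `tf_idf` are assumptions made for the example. The weighting follows the formula above, log(#docs / (1 + #docs using word)).

```python
import math
from collections import Counter

def word_counts(text):
    """Bag-of-words vector: word -> number of occurrences (whitespace tokens)."""
    return Counter(text.lower().split())

def cosine_similarity(u, v):
    """Scalar product of two sparse vectors, normalized by their lengths,
    so that long and short documents become comparable."""
    dot = sum(u[w] * v[w] for w in u if w in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def tf_idf(docs):
    """Weight each term count by log(#docs / (1 + #docs using the word))."""
    counts = [word_counts(d) for d in docs]
    n_docs = len(docs)
    # Document frequency: in how many documents does each word occur?
    doc_freq = Counter(w for c in counts for w in c)
    return [
        {w: tf * math.log(n_docs / (1 + doc_freq[w])) for w, tf in c.items()}
        for c in counts
    ]
```

With this variant of the formula, a word that occurs in all but one document gets weight zero, and a word that occurs in every document gets a slightly negative weight; rare words that are frequent within one document get the largest weights.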
11. Speech Analysis of Austrian
Politicians
• How rude are Austrian politicians?
• Have they become ruder over time?
Data acquisition via
web scraping
Human labelling of
selected sentences
Word2vec or similar
models
12. Predictive Maintenance
Server Outage Prediction
• NOBODY likes server outages.
• Is there an exact
definition of the term outage?
• Is it measurable?
• Downtime minutes per user
13. Server Outage Prediction
• Definition (Event): An event shall be defined as an
occurrence happening at a determinable time and place
with a certain duration. It may be a part of a chain of
occurrences as an effect of a preceding occurrence and as
the cause of a succeeding occurrence. It is possible that
more than one event occurs at the same time and/or place.
• Definition (Abnormal Event): An abnormal event shall be
defined as an outlier in a chain of events, an event that
deviates so much from the other events as to arouse
suspicion that it was caused by something that does not
follow the usual behavior of the considered system and
that it could change the entire system behavior.
16. Server Outage Prediction
Data:
• Up to 1439 features per server, sampling interval 1–15 min
• historic data sets, IBM Lotus Domino Server.Load
Preprocessing:
• Reduction of data using expert knowledge
• Differentiating accumulative features
• Checking for wrong or missing data
• Normalizing the data (min-max mapping)
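Two of the preprocessing steps above are easy to illustrate. The sketch below assumes equally spaced samples and uses hypothetical helper names (`differentiate`, `min_max_map`); it is not the code used in the project.

```python
def differentiate(cumulative):
    """Turn an accumulating counter (e.g. total requests since start)
    into per-interval increments."""
    return [b - a for a, b in zip(cumulative, cumulative[1:])]

def min_max_map(values):
    """Map values linearly onto [0, 1]; a constant feature maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

# Example: a counter that only ever grows becomes a normalized load signal.
increments = differentiate([10, 15, 15, 40])
normalized = min_max_map(increments)
```

With a sampling interval that varies between 1 and 15 minutes, the increments would additionally have to be divided by the length of each interval.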
17. Server Outage Prediction
“I have seen the
future and it is very
much like the
present, only
longer.”
Kehlog Albran, The Profit
21. Outlier Detection
• Threshold
• Angle-based outlier detection
• One-class support vector machine
• Why do we use these methods?
– 1 + 𝑥 -classification problem
– Unsupervised
– Range of dimensions
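As an illustration of angle-based outlier detection, the sketch below computes a simplified angle-based outlier factor (ABOF) for a point: the variance, over all pairs of other points, of the scalar product of the two difference vectors divided by their squared lengths. The distance weighting of the original ABOD formulation is left out here for brevity, and the function name `abof` is an assumption. A small value means that all other points lie in a similar direction, which suggests an outlier.

```python
from itertools import combinations
from statistics import pvariance

def abof(points, i):
    """Simplified (unweighted) angle-based outlier factor of points[i].
    Needs at least three distinct points."""
    a = points[i]
    others = [p for j, p in enumerate(points) if j != i]
    terms = []
    for b, c in combinations(others, 2):
        ab = [x - y for x, y in zip(b, a)]
        ac = [x - y for x, y in zip(c, a)]
        dot = sum(x * y for x, y in zip(ab, ac))
        sq_ab = sum(x * x for x in ab)
        sq_ac = sum(x * x for x in ac)
        if sq_ab == 0 or sq_ac == 0:
            continue  # skip exact duplicates of the point itself
        terms.append(dot / (sq_ab * sq_ac))
    return pvariance(terms)

# Five points in a small cluster and one point far away.
points = [(0, 0), (1, 0), (0, 1), (1, 1), (0.5, 0.5), (10, 10)]
scores = [abof(points, i) for i in range(len(points))]
```

The loop over all pairs makes the cost per point quadratic in the number of points, which is why approximations that only use nearest neighbours are common for larger data sets.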
23. Server Outage Prediction
• The outlier detection delivers a score that can be used to calculate a
fuzzy value of outageness.
• Thus a partition based on the relevance of outages is possible – traffic
light system
• A combination of outageness
scores delivered by various
anomaly detectors is possible.
• By saving all these scores in a
database, a classification of
outages is possible (e.g. with
an ANN or some clustering
method).
[Figure: ABOF scores of the time points (red: start of load, green: end of load); x-axis: time, y-axis: ABOF factor]
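The idea of a fuzzy outageness value with a traffic light partition can be sketched as follows. Everything concrete here is an assumption for illustration: the piecewise-linear membership function, the maximum as combination rule, the thresholds 0.3 and 0.7, and the convention that a higher detector score means "more anomalous" (a raw ABOF score would have to be inverted first).

```python
def outageness(score, normal, critical):
    """Piecewise-linear fuzzy membership in [0, 1]:
    0 at or below `normal`, 1 at or above `critical`."""
    if critical <= normal:
        raise ValueError("critical must be greater than normal")
    return min(1.0, max(0.0, (score - normal) / (critical - normal)))

def combine(values):
    """Combine the outageness values of several detectors with a fuzzy OR."""
    return max(values)

def traffic_light(value, yellow=0.3, red=0.7):
    """Partition the outageness value into three relevance classes."""
    if value >= red:
        return "red"
    if value >= yellow:
        return "yellow"
    return "green"

# Two hypothetical detectors report scores on different scales.
status = traffic_light(combine([
    outageness(2.0, normal=0.0, critical=10.0),
    outageness(85.0, normal=50.0, critical=100.0),
]))
```

Storing the individual values rather than only the final colour keeps the option open to classify outages later, as described above.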
24. Image Understanding
Applications
• Industrial image analysis
Quality assurance
Labview, Halcon, Cognex Vision Pro
• Medical image analysis
Mostly researchers with medical background
Visualisation support, detection of carcinomas etc.
ITK
• Image analysis and AI
Facebook, Google, Baidu, Microsoft, but still rooted in universities
Face detection, facial expression detection, scene description
OpenCV, dlib, Theano, Keras
26. HOG: Algorithm Details
• Dalal, Triggs (2005)
• Focus on intensity gradients/edge directions
• Local contrast normalization (invariant to light conditions)
• Gradient orientation computed per pixel, overlapping blocks, histogram of
oriented gradients
• SVM classifier
• Open source availability (dlib)
• Relatively few training pictures necessary
• Not a lot of parameter tuning
• Few wrong detections
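The core of the HOG descriptor, a magnitude-weighted histogram of gradient orientations per cell followed by contrast normalization, can be sketched in plain Python. This is a simplified illustration and not the dlib implementation: it handles a single cell, skips the border pixels, assigns each pixel to exactly one of nine bins without interpolation, and normalizes one histogram instead of an overlapping block. The names `cell_histogram` and `l2_normalize` are assumptions.

```python
import math

def cell_histogram(cell, bins=9):
    """Histogram of unsigned gradient orientations (0 to 180 degrees),
    each interior pixel voting with its gradient magnitude."""
    hist = [0.0] * bins
    height, width = len(cell), len(cell[0])
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = cell[y][x + 1] - cell[y][x - 1]  # centred differences
            gy = cell[y + 1][x] - cell[y - 1][x]
            magnitude = math.hypot(gx, gy)
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            hist[int(angle / (180.0 / bins)) % bins] += magnitude
    return hist

def l2_normalize(hist, eps=1e-6):
    """Local contrast normalization: scale the histogram to unit length."""
    norm = math.sqrt(sum(v * v for v in hist) + eps ** 2)
    return [v / norm for v in hist]

# A vertical edge: dark on the left, bright on the right.
edge = [[0, 0, 1, 1] for _ in range(4)]
hist = cell_histogram(edge)
```

Because of the normalization, multiplying all grey values by a constant leaves the descriptor practically unchanged, which is the invariance to lighting conditions mentioned above.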
28. Deep Inspection
• Automatic optical inspection of sensors
• Sensor generations look similar, but not exactly alike
• Deep Convolutional Networks for better
generalization and no extra parameter tuning
• HOG
• Software used: Keras (Python)
29. Deep Inspection
• Input: pixel grey values
• Solution proposed by Girshick et al. (2014):
227×227
• Alternating convolution and max pooling,
spatial overlapping
• Sparse connections (non-linear filter)
• Shared weights to gain translation invariance
and an improved generalization ability
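One convolution stage followed by overlapping max pooling can be written out in plain Python to show what "shared weights" and "spatial overlapping" mean. This is a sketch for illustration, not the Keras model used in the project; the function names and the tiny example sizes are assumptions. As in most deep learning frameworks, the "convolution" is computed without flipping the kernel.

```python
def conv2d_valid(image, kernel):
    """Slide one small kernel over the whole image. The same weights are
    used at every position (shared weights), and every output value only
    depends on a small neighbourhood (sparse connections)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[y + i][x + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for x in range(out_w)
        ]
        for y in range(out_h)
    ]

def relu(feature_map):
    """Element-wise non-linearity applied after the convolution."""
    return [[max(0, v) for v in row] for row in feature_map]

def max_pool(feature_map, size, stride):
    """Maximum over size x size windows; windows overlap if stride < size."""
    out_h = (len(feature_map) - size) // stride + 1
    out_w = (len(feature_map[0]) - size) // stride + 1
    return [
        [
            max(feature_map[y * stride + i][x * stride + j]
                for i in range(size) for j in range(size))
            for x in range(out_w)
        ]
        for y in range(out_h)
    ]

image = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
features = relu(conv2d_valid(image, [[1, 0], [0, 1]]))
pooled = max_pool(features, size=2, stride=1)
```

A deep network alternates several such stages, each with many kernels whose weights are learned instead of being fixed by hand.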
30. Deep Inspection
• Input: pixel grey values
• Solution proposed by Girshick et al. (2014): 227×227
• Alternating convolution and max pooling, spatial
overlapping
• Hierarchical abstractions
• Sparse connections (non-linear filter)
• Shared weights to gain translation invariance and an
improved generalization ability
• MLP classifier
• Dataset augmentation: sliding window, flipping,
distortion,...
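Two of the augmentation steps listed above, sliding window and flipping, can be sketched as follows. The helper names and the tiny image are assumptions for illustration, and whether a mirrored image is still a valid training example depends on the sensor layout.

```python
def sliding_windows(image, size, step):
    """Yield every size x size crop whose top-left corner lies on a grid
    with spacing `step`."""
    for y in range(0, len(image) - size + 1, step):
        for x in range(0, len(image[0]) - size + 1, step):
            yield [row[x:x + size] for row in image[y:y + size]]

def flip_horizontal(image):
    """Mirror the image left to right."""
    return [row[::-1] for row in image]

def augment(image, size, step):
    """Return every crop together with its mirrored copy."""
    samples = []
    for crop in sliding_windows(image, size, step):
        samples.append(crop)
        samples.append(flip_horizontal(crop))
    return samples

image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
samples = augment(image, size=2, step=1)
```

One labelled image thus yields several training samples, which helps when only few labelled sensor images are available.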
33. Automatic error detection
for humidity sensors
Multiple Challenges
High quality requirements
Changing specifications
Different kinds of errors
High data volume
34. Image Acquisition
8″ silicon wafers
90,000 sensors per wafer
0.7 µm/pixel resolution
Target scan speed of 30 minutes per wafer
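A quick back-of-the-envelope calculation shows what these numbers imply for the processing pipeline. The sensor throughput follows directly from the figures above; the pixel estimate additionally assumes that the full circular area of a wafer with a nominal diameter of 200 mm is scanned, which is an assumption and not stated on the slide.

```python
import math

SENSORS_PER_WAFER = 90_000
SCAN_MINUTES = 30
RESOLUTION_UM_PER_PIXEL = 0.7
WAFER_DIAMETER_UM = 200_000  # assumption: nominal 200 mm for an 8-inch wafer

scan_seconds = SCAN_MINUTES * 60
sensors_per_second = SENSORS_PER_WAFER / scan_seconds  # 50 sensors per second
ms_per_sensor = 1000 / sensors_per_second              # 20 ms per sensor

# Rough pixel count if the whole circular wafer area is imaged.
wafer_area_um2 = math.pi * (WAFER_DIAMETER_UM / 2) ** 2
pixels_per_wafer = wafer_area_um2 / RESOLUTION_UM_PER_PIXEL ** 2
pixels_per_second = pixels_per_wafer / scan_seconds

print(f"{sensors_per_second:.0f} sensors/s, {ms_per_sensor:.0f} ms per sensor")
print(f"about {pixels_per_wafer:.1e} pixels per wafer")
print(f"about {pixels_per_second:.1e} pixels per second")
```

Under these assumptions, acquisition and classification together have about 20 ms per sensor, and the raw data volume is in the order of tens of gigapixels per wafer.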