Gaining insight from data is rarely as straightforward as we wish it were: the quality and quantity of the data at hand are as diverse as the questions we ask of it. Any attempt to turn data into knowledge therefore depends strongly on whether we are dealing with big or not-so-big data, high- or low-dimensional data, exact or fuzzy data, and exact or fuzzy questions, and on whether the goal is accurate prediction or understanding. This presentation emphasizes the need for a multi-paradigm data science to tackle the challenges we face today and may face in the future. Luckily, solutions are starting to emerge...
1. Multi-Paradigm Data Science
On the many dimensions of Knowledge Discovery
Data Natives, Berlin, November 17th, 2017
Dr. Kai Gansel
ADDITIVE GmbH
kai.gansel@additive-net.de
2. Dimensions of Knowledge Discovery I and II: Data
[Quadrant diagram: data volume ("big data" vs. "not-so-big data") plotted against dimensionality ("high-dimensional data" vs. "low-dimensional data"); the four regions are labeled Statistics & Modeling, Data Mining, ClusterPC, and ML & NN]
3. Dimensions of Knowledge Discovery III and IV: Approach
[Quadrant diagram: question type ("exact question" vs. "fuzzy question") plotted against data type ("exact data" vs. "fuzzy data"); the regions are labeled Statistics, Data Mining, and ML & NN]
6. Correlation of SNPs with schizophrenic phenotypes
(Lencz et al., 2013)
7. Special Topic: Higher order correlations
Definition
(Schneider & Grün, 2003)
An observed correlation between items or events is called genuine if it cannot be explained by correlations of lower order, i.e. by a random superposition of any of its constituent parts.
Meaning
Genuine higher-order patterns arise from non-random, interacting processes and reflect the correlational structure of those processes. For example, three events that co-occur more often than their pairwise correlations predict form a genuine third-order pattern. The appearance of such patterns may therefore provide insight into their hidden causes.
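As a toy illustration of the lowest-order case (not from the talk; the baskets and item names are made up), the observed co-occurrence rate of two items can be compared with the rate predicted from their individual frequencies. A genuine second-order correlation is present exactly when the two differ beyond sampling noise:
(* 1000 random baskets over three items; each item enters a basket independently with probability 1/2 *)
baskets = Table[Pick[{"a", "b", "c"}, RandomChoice[{True, False}, 3]], {1000}];
p[i_] := N@Mean[Boole[MemberQ[#, i] & /@ baskets]] (* marginal frequency of item i *)
observedAB = N@Mean[Boole[SubsetQ[#, {"a", "b"}] & /@ baskets]]; (* joint frequency of a and b *)
{observedAB, p["a"] p["b"]} (* independent by construction, so the two values agree up to sampling noise; a persistent discrepancy in real data would signal a genuine pairwise pattern *)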
8. General task
W = region defining one data point
τ = class / feature / quality
Application areas: visited websites, market basket analysis...
...you name it!
The problem
Combinatorial explosion of the number of candidate patterns and tests with increasing number of dimensions: for n items, every subset of two or more items is a candidate, giving 2^n − n − 1 patterns (all 2^n subsets minus the n singletons and the empty set):
n = 20; 2^n - n - 1
1 048 555
9. Reducing the complexity of data: DimensionReduce
Advantages of dimensionality reduction:
◼ It reduces the time and storage space required.
◼ Removing multicollinearity can improve the performance of many machine learning models.
◼ It becomes easier to visualize the data when reduced to very low dimensions such as 2D or 3D.
Here are some multi-dimensional example data:
data = Import[NotebookDirectory[] <> "Example.dat"];
Rearrange example data to represent individual measurements.
Structure of the data:
ListPlot[Tally[First /@ data], PlotRange → All, Filling → Axis, AxesLabel → {"Sort ID", "Number of measurements"}]
[Plot: number of measurements per Sort ID; Sort IDs run from 1 to about 100, with up to roughly 50 measurements each]
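A minimal sketch of the reduction step itself, assuming the rearranged measurements are stored in measurements (a hypothetical name); slide 12 below then works with the resulting three-dimensional data3D:
(* project each measurement vector down to three dimensions for plotting and clustering *)
data3D = DimensionReduce[measurements, 3];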
12. Clustering and classifying data: ClusterClassify
ClusterClassify automatically determines the number of clusters and classifies the data accordingly:
Manipulate[With[{CC = ClusterClassify[data3D, Method → method][data3D]}, ListPointPlot3D[Map[Last, GatherBy[Transpose[{CC, data3D}], First], {2}],
ImageSize → 500, PlotLegends → SwatchLegend[Union[CC], LegendLabel → "Cluster ID", LegendFunction → Panel, LegendMarkers → "SphereBubble"]]],
{method, {"GaussianMixture", "DBSCAN", "MeanShift", "Agglomerate", "NeighborhoodContraction"}}, SaveDefinitions → True]
[Interactive output: 3D scatter plot of data3D colored by cluster ID (1–3), with buttons to switch between the five clustering methods]
13. Classifying data: Classify
Rock-paper-scissors
Click Reset.
Hold up a fist in front of the camera and click Rock. Change your hand to paper as you click the Paper button, and do the same for scissors. Capture 10–12 images of each, then click Stop when you are done. Click Train and wait. Finally, click Watch and hold up some rock-paper-scissors gestures: the classifier should recognize what you are doing.
[Interactive demo: live camera view with a counter of captured images and the buttons Capture: Rock, Paper, Scissors, Watch, Stop]
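A minimal sketch of the training step behind this demo, assuming the captured frames have been collected into the lists rockImgs, paperImgs and scissorsImgs (hypothetical names):
(* train an image classifier from the captured example frames *)
gesture = Classify[<|"rock" -> rockImgs, "paper" -> paperImgs, "scissors" -> scissorsImgs|>];
(* classify the current camera frame *)
gesture[CurrentImage[]]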
14. Find the optimal parameters of a classifier
Load a dataset and split it into a training set and a test set.
data = RandomSample[ExampleData[{"MachineLearning", "Titanic"}, "Data"]];
training = data[[ ;; 1000]];
test = data[[1001 ;;]];
Define a function computing the performance of a classifier as a function of its (hyper)parameters.
loss[{c_, gamma_, b_, d_}] :=
-ClassifierMeasurements[Classify[training, Method → {"SupportVectorMachine", "KernelType" → "Polynomial", "SoftMarginParameter" → Exp[c],
"GammaScalingParameter" → Exp[gamma], "BiasParameter" → Exp[b], "PolynomialDegree" → d}], test, "LogLikelihoodRate"];
Define the possible values of the parameters.
region = ImplicitRegion[And[-3. ≤ c ≤ 3., -3. ≤ gamma ≤ 3., -1. ≤ b ≤ 2., 1 ≤ d ≤ 3, d ∈ Integers], {c, gamma, b, d}]
Search for a good set of parameters; the "MinimumConfiguration" property of the result gives the best parameter values found.
bmo = BayesianMinimization[loss, region]
bmo["MinimumConfiguration"]
Train a classifier with these parameters.
Classify[training, Method → {"SupportVectorMachine", "KernelType" → "Polynomial", "SoftMarginParameter" → Exp[2.979837222482109`],
"GammaScalingParameter" → Exp[-2.1506497693543025`], "BiasParameter" → Exp[-0.9038364134482837`], "PolynomialDegree" → 2}]
ClassifierMeasurements[%, test, "Accuracy"]
15. Neural Networks: Digit classification
Use the MNIST database of handwritten digits to train a convolutional network to predict the digit given an image.
First obtain the training and validation data.
resource = ResourceObject["MNIST"];
trainingData = ResourceData[resource, "TrainingData"];
testData = ResourceData[resource, "TestData"];
RandomSample[trainingData, 5]
Define a convolutional neural network that takes in 28×28 grayscale images as input.
lenet = NetChain[{ConvolutionLayer[20, 5], Ramp, PoolingLayer[2, 2], ConvolutionLayer[50, 5], Ramp, PoolingLayer[2, 2], FlattenLayer[], 500, Ramp, 10, SoftmaxLayer[]},
"Output" → NetDecoder[{"Class", Range[0, 9]}], "Input" → NetEncoder[{"Image", {28, 28}, "Grayscale"}]]
[NetChain summary (uninitialized): ConvolutionLayer → Ramp → PoolingLayer → ConvolutionLayer → Ramp → PoolingLayer → FlattenLayer → LinearLayer (500) → Ramp → LinearLayer (10) → SoftmaxLayer]
Train the network for one training round.
lenet = NetTrain[lenet, trainingData, ValidationSet → testData, MaxTrainingRounds → 1];
Evaluate the trained network directly on images randomly sampled from the validation set.
imgs = Keys@RandomSample[testData, 5];
Thread[imgs → lenet[imgs]]
[Output: five sampled test images classified as 4, 0, 6, 7, and 2]
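To quantify overall performance on the validation set (a natural follow-up, not shown on the slide), the predicted digits can be compared with the labels directly:
(* fraction of test images whose predicted digit matches its label *)
predictions = lenet[Keys[testData]];
N@Mean[Boole[Thread[predictions == Values[testData]]]]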
17. Neural Networks: Unsupervised learning with autoencoders
Train an autoencoder network to reconstruct images of handwritten digits after projecting them to a lower-dimensional “code” vector space. Use these code vectors to perform clustering and visualization.
First obtain the training data, then select images corresponding to digits 0 through 4.
resource = ResourceObject["MNIST"];
trainingData = ResourceData[resource, "TrainingData"];
trainingSubset = Select[trainingData, Last[#] ≤ 4 &];
testData = ResourceData[resource, "TestData"];
testSubset = Select[testData, Last[#] ≤ 4 &];
RandomSample[trainingSubset, 8]
[Output: eight sampled training images with the labels 1, 3, 0, 0, 4, 2, 1, 4]
Obtain the “mean image” to subtract from the training data.
trainingImages = Keys[trainingSubset];
meanImage = Image[Mean@Map[ImageData, trainingImages]]
Create a network to train that produces both the reconstruction and the reconstruction error.
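The network definition itself is cut off at the page break. Here is a minimal sketch consistent with the encoder extracted on slide 19, using a mean-squared reconstruction loss; the layer sizes and the 50-dimensional code are assumptions:
(* layers 1-4 form the encoder (image -> code), layers 5-8 the decoder (code -> image) *)
net = NetGraph[{FlattenLayer[], 50, Ramp, 50, 50, Ramp, 784, ReshapeLayer[{1, 28, 28}], MeanSquaredLossLayer[]},
  {NetPort["Input"] -> 1, 1 -> 2 -> 3 -> 4 -> 5 -> 6 -> 7 -> 8, 8 -> NetPort[9, "Input"], NetPort["Input"] -> NetPort[9, "Target"]},
  "Input" -> NetEncoder[{"Image", {28, 28}, "Grayscale"}]];
(* subtract the mean image computed above, then train; trained4 is the name used on slide 19 *)
trained4 = NetTrain[net, <|"Input" -> (ImageSubtract[#, meanImage] & /@ trainingImages)|>];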
19. encoder = Take[trained4, {NetPort["Input"], 4}]
[NetGraph summary: four layers (FlattenLayer → LinearLayer → Ramp → LinearLayer) mapping the 1×28×28 input image to a 50-dimensional code vector]
Compute codes for all of the test images.
testImages = Keys[testSubset];
features = encoder[testImages];
Project the code vectors to three dimensions and visualize them along with the original classes (not seen by the network). The digit classes tend to cluster together.
coords = DimensionReduce[features, 3];
classes = Values[testSubset];
Table[Extract[coords, Position[classes, i]], {i, 0, 4}]
ListPointPlot3D[Table[Extract[coords, Position[classes, i]], {i, 0, 4}], PlotLegends → PointLegend[96, Range[0, 4]],
BoxRatios → 1, Axes → None, Boxed → True, PlotStyle → Map[ColorData[96], Range[1, 5]], AspectRatio → 1]
[3D scatter plot of the reduced code vectors, colored by digit class 0–4]
20. Visualize a hierarchical clustering of random representatives from each class.
representatives = Catenate@GroupBy[testSubset, Last → First, RandomSample[#, 6] &];
ClusteringTree[encoder[representatives] → Map[ImageCrop, representatives]]
21. Neural Networks: Avoid overfitting using a hold-out set
Use the ValidationSet option of NetTrain to ensure that the trained net does not overfit the training data. The validation data is commonly referred to as a test or hold-out dataset.
Create synthetic training data based on a Gaussian curve.
data = Table[x → Exp[-x^2] + RandomVariate[NormalDistribution[0, .15]], {x, -3, 3, .2}];
plot = ListPlot[List @@@ data, PlotStyle → Red]
[Plot: the noisy samples scattered around a Gaussian bump; x runs from -3 to 3, y from about -0.2 to 1.0]
Train a net with a large number of parameters relative to the amount of training data.
net = NetChain[{150, Tanh, 150, Tanh, 1}, "Input" → "Scalar", "Output" → "Scalar"];
net1 = NetTrain[net, data, Method → "ADAM"]
[NetChain summary: LinearLayer (150) → Tanh → LinearLayer (150) → Tanh → LinearLayer (1)]
The resulting net overfits the data, learning the noise in addition to the underlying function.
22. Show[Plot[net1[x], {x, -3, 3}], plot]
[Plot: net1's wiggly fit overlaid on the data points; the curve follows the noise]
Subdivide the data into a training set and a hold-out validation set.
data = RandomSample[data];
{train, test} = TakeDrop[data, 24];
Use the ValidationSet option to have NetTrain select the net that achieved the lowest validation loss during training.
net2 = NetTrain[net, train, ValidationSet → test]
[NetChain summary: LinearLayer (150) → Tanh → LinearLayer (150) → Tanh → LinearLayer (1)]
The result returned by NetTrain was the net that generalized best to points in the validation set, as measured by validation loss. This penalizes overfitting, as the noise present in the training data is uncorrelated with the noise present in the validation set.
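For comparison, the smoother fit of net2 can be overlaid on the data exactly as before (a natural follow-up, not shown on this slide):
Show[Plot[net2[x], {x, -3, 3}], plot]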
28. Conclusion
◼ Don’t restrict yourself to any particular approach or method without need!
◼ Don’t imply the answer when defining a question!
◼ Stay curious!
29. Thanks for listening!
For questions and suggestions, contact kai.gansel@additive-net.de.
http://www.additive-mathematica.de
ADDITIVE Soft- und Hardware für Technik und Wissenschaft GmbH
Max-Planck-Straße 22b, 61381 Friedrichsdorf
Sales: 06172 - 5905 - 30 // mathematica@additive-net.de
Academy: 06172 - 5905 - 90 // academy@additive-net.de
Support: 06172 - 5905 - 20 // support@additive-net.de