Multi-Paradigm Data Science
On the many dimensions of Knowledge Discovery
Data Natives, Berlin, November 17th, 2017
Dr. Kai Gansel
ADDITIVE GmbH
kai.gansel@additive-net.de
Dimensions of Knowledge Discovery I and II: Data
[Diagram: methods placed by data dimensionality (low-dimensional vs. high-dimensional data) and data volume (not-so-big data vs. big data). Labels: Statistics & Modeling, Data Mining, Cluster PC, ML & NN]
DN2017_Kai_ADDITIVE.nb
Dimensions of Knowledge Discovery III and IV: Approach
[Diagram: methods placed by the type of question (fuzzy vs. exact question) and the type of data (exact vs. fuzzy data). Labels: Statistics, Data Mining, ML & NN]
Dimensions of Knowledge Discovery V: Goal
[Diagram: two goals, Understanding (Science) and Prediction (Engineering). Methods shown: Data Mining, Statistics, Modeling, Machine Learning, Neural Networks, Modeling]
Example: Statistics
Role of genetic variants in health and disease
(Kuehn, 2016)
Correlation of SNPs with schizophrenic phenotypes
(Lencz et al., 2013)
Special Topic: Higher order correlations
Definition
(Schneider & Grün, 2003)
An observed correlation between items or events is called genuine if it cannot be explained by correlations of lower order, i.e. by a random superposition of any of its constituent parts.
Meaning
Genuine higher order patterns are based on non-random, interacting processes and reflect the correlational structure of these processes. The appearance of such patterns may provide insights into their hidden causes.
General task
W = region defining one data point
τ = class / feature / quality
Application areas: visited websites, market basket analysis...
...you name it!
The problem
Combinatorial explosion of the number of candidate patterns and tests with increasing number of dimensions:
n = 20; 2^n - n - 1
1 048 555
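The count above is the number of subsets with at least two elements that can be formed from n items: all 2^n subsets minus the n singletons and the empty set. A minimal sketch that checks this for a small case; the helper names count and countByOrder are illustrative and not part of the original notebook:

```wolfram
(* Number of candidate patterns of order 2 or higher among n items *)
count[n_] := 2^n - n - 1

(* The same number, counted order by order: pairs, triples, ..., n-tuples *)
countByOrder[n_] := Sum[Binomial[n, k], {k, 2, n}]

(* Brute-force check for n = 5: enumerate all subsets with 2 to 5 elements *)
Length[Subsets[Range[5], {2, 5}]] == count[5] == countByOrder[5]   (* True *)

(* Growth of the number of candidates with the number of dimensions *)
Table[{n, count[n]}, {n, 5, 30, 5}]
```

For n = 20, count[20] reproduces the 1 048 555 candidate patterns shown above.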
Reducing the complexity of data: DimensionReduce
Advantages of dimensionality reduction:
◼ It reduces the time and storage space required.
◼ Removing multicollinearity can improve the performance of machine learning models.
◼ It becomes easier to visualize the data when reduced to very low dimensions such as 2D or 3D.
Here are some multi-dimensional example data:
data = Import[NotebookDirectory[] <> "Example.dat"];
Rearrange example data to represent individual measurements.
Structure of the data:
ListPlot[Tally[First /@ data], PlotRange → All, Filling → Axis, AxesLabel → {"Sort ID", "Number of measurements"}]
[Plot: Number of measurements per Sort ID]
ListLinePlot[data[[346, 2]], PlotRange → All, AxesLabel → {"Mass ID", "Value"}, Epilog → Text["Measurement 346\nSort ID: " <> ToString[data[[346, 1]]], {14000, 6000}]]
[Plot: Value vs. Mass ID for measurement 346 (Sort ID: 55)]
Dimensions[Transpose[data][[2]]]
{346, 25780}
Project the data onto a 3-dimensional subspace:
data3D = DimensionReduce[Transpose[data][[2]], 3];
data3D = Get[NotebookDirectory[] <> "Data3D.txt"];
ListPlot[data3D[[346]], PlotRange → All, AxesLabel → {"Component", "Value"},
Filling → Axis, Epilog → Text["Measurement 346\nSort ID: " <> ToString[data[[346, 1]]], {2, 20}]]
[Plot: Value of each of the 3 components for measurement 346 (Sort ID: 55)]
Dimensions[data3D]
{346, 3}
ListPointPlot3D[data3D]
Clustering and classifying data: ClusterClassify
ClusterClassify automatically determines the number of clusters and classifies the data accordingly:
Manipulate[With[{CC = ClusterClassify[data3D, Method → method][data3D]}, ListPointPlot3D[Map[Last, GatherBy[Transpose[{CC, data3D}], First], {2}],
ImageSize → 500, PlotLegends → SwatchLegend[Union[CC], LegendLabel → "Cluster ID", LegendFunction → Panel, LegendMarkers → "SphereBubble"]]],
{method, {"GaussianMixture", "DBSCAN", "MeanShift", "Agglomerate", "NeighborhoodContraction"}}, SaveDefinitions → True]
[Interactive 3D point plot with a method selector (GaussianMixture, DBSCAN, MeanShift, Agglomerate, NeighborhoodContraction) and a legend: Cluster ID 1, 2, 3]
Classifying data: Classify
Rock-paper-scissors
Click Reset.
Hold up a fist in front of the camera and click Rock. Change your hand to paper as you click the Paper button, and do the same for scissors. Capture 10-12 images of each gesture. Click Stop when you are done. Click Train and wait. Then click Watch and hold up rock, paper, or scissors gestures; the classifier should recognize them.
[Interactive panel: Data = 0; controls: Reset, Train, Capture (Rock, Paper, Scissors), Watch, Stop]
Find the optimal parameters of a classifier
Load a dataset and split it into a training set and a test set.
data = RandomSample[ExampleData[{"MachineLearning", "Titanic"}, "Data"]];
training = data[[ ;; 1000]];
test = data[[1001 ;;]];
Define a function computing the performance of a classifier as a function of its (hyper)parameters.
loss[{c_, gamma_, b_, d_}] :=
-ClassifierMeasurements[Classify[training, Method → {"SupportVectorMachine", "KernelType" → "Polynomial", "SoftMarginParameter" → Exp[c],
"GammaScalingParameter" → Exp[gamma], "BiasParameter" → Exp[b], "PolynomialDegree" → d}], test, "LogLikelihoodRate"];
Define the possible values of the parameters.
region = ImplicitRegion[And[-3. ≤ c ≤ 3., -3. ≤ gamma ≤ 3., -1. ≤ b ≤ 2., 1 ≤ d ≤ 3, d ∈ Integers], {c, gamma, b, d}]
Search for a good set of parameters.
bmo = BayesianMinimization[loss, region]
bmo["MinimumConfiguration"]
Train a classifier with these parameters.
Classify[training, Method → {"SupportVectorMachine", "KernelType" → "Polynomial", "SoftMarginParameter" → Exp[2.979837222482109`],
"GammaScalingParameter" → Exp[-2.1506497693543025`], "BiasParameter" → Exp[-0.9038364134482837`], "PolynomialDegree" → 2}]
ClassifierMeasurements[%, test, "Accuracy"]
Neural Networks: Digit classification
Use the MNIST database of handwritten digits to train a convolutional network to predict the digit given an image.
First obtain the training and validation data.
resource = ResourceObject["MNIST"];
trainingData = ResourceData[resource, "TrainingData"];
testData = ResourceData[resource, "TestData"];
RandomSample[trainingData, 5]
Define a convolutional neural network that takes in 28×28 grayscale images as input.
lenet = NetChain[{ConvolutionLayer[20, 5], Ramp, PoolingLayer[2, 2], ConvolutionLayer[50, 5], Ramp, PoolingLayer[2, 2], FlattenLayer[], 500, Ramp, 10, SoftmaxLayer[]},
"Output" → NetDecoder[{"Class", Range[0, 9]}], "Input" → NetEncoder[{"Image", {28, 28}, "Grayscale"}]]
[NetChain summary: layer-by-layer listing of the uninitialized network defined above, from the image input to the class output]
Train the network for one training round.
lenet = NetTrain[lenet, trainingData, ValidationSet → testData, MaxTrainingRounds → 1];
Evaluate the trained network directly on images randomly sampled from the validation set.
imgs = Keys@RandomSample[testData, 5];
Thread[imgs → lenet[imgs]]
[Output: five sample digit images mapped to the predicted classes 4, 0, 6, 7, 2]
Create a ClassifierMeasurements object from the trained network and the validation set.
cm = ClassifierMeasurements[lenet, testData]
[ClassifierMeasurementsObject summary]
Obtain the accuracy of the network on the validation set.
cm["Accuracy"]
0.9801
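Accuracy is the fraction of correctly classified test examples, that is, the sum of the diagonal of the confusion matrix divided by the total number of examples. A short sketch using the counts from the confusion matrix plot of this notebook (rows: actual class, columns: predicted class); the variable name m is illustrative:

```wolfram
(* Confusion matrix counts: rows are actual digits 0-9, columns are predicted digits 0-9 *)
m = {
   {963, 0, 1, 0, 1, 0, 3, 3, 4, 5},
   {0, 1132, 1, 1, 0, 0, 0, 1, 0, 0},
   {1, 5, 1001, 2, 2, 0, 2, 14, 5, 0},
   {0, 0, 2, 992, 0, 4, 0, 7, 5, 0},
   {0, 0, 2, 0, 964, 0, 1, 1, 2, 12},
   {1, 0, 1, 9, 0, 874, 2, 1, 4, 0},
   {5, 5, 0, 0, 3, 3, 936, 1, 5, 0},
   {1, 4, 7, 2, 0, 0, 0, 1008, 3, 3},
   {2, 0, 2, 3, 2, 0, 2, 5, 952, 6},
   {2, 4, 0, 4, 7, 1, 1, 8, 3, 979}};

(* Overall accuracy: correctly classified examples divided by all examples *)
N[Tr[m]/Total[m, 2]]   (* 0.9801 *)

(* Per-class recall: diagonal entry divided by the number of examples of that class *)
N[Diagonal[m]/(Total /@ m)]
```

The first result agrees with the value returned by cm["Accuracy"].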
Obtain a plot of the confusion matrix of the network predictions on the validation set.
cm["ConfusionMatrixPlot"]
Confusion matrix (rows: actual class, columns: predicted class)
actual     0     1     2     3     4     5     6     7     8     9  total
0        963     0     1     0     1     0     3     3     4     5    980
1          0  1132     1     1     0     0     0     1     0     0   1135
2          1     5  1001     2     2     0     2    14     5     0   1032
3          0     0     2   992     0     4     0     7     5     0   1010
4          0     0     2     0   964     0     1     1     2    12    982
5          1     0     1     9     0   874     2     1     4     0    892
6          5     5     0     0     3     3   936     1     5     0    958
7          1     4     7     2     0     0     0  1008     3     3   1028
8          2     0     2     3     2     0     2     5   952     6    974
9          2     4     0     4     7     1     1     8     3   979   1009
total    975  1150  1017  1013   979   882   947  1049   983  1005  10000
Neural Networks: Unsupervised learning with autoencoders
Train an autoencoder network to reconstruct images of handwritten digits after projecting them to a lower-dimensional “code” vector space. Use these code vectors to perform clustering and visualization.
First obtain the training data, then select images corresponding to digits 0 through 4.
resource = ResourceObject["MNIST"];
trainingData = ResourceData[resource, "TrainingData"];
trainingSubset = Select[trainingData, Last[#] ≤ 4 &];
testData = ResourceData[resource, "TestData"];
testSubset = Select[testData, Last[#] ≤ 4 &];
RandomSample[trainingSubset, 8]
[Output: eight sample digit images with the labels 1, 3, 0, 0, 4, 2, 1, 4]
Obtain the “mean image” to subtract from the training data.
trainingImages = Keys[trainingSubset];
meanImage = Image[Mean@Map[ImageData, trainingImages]]
Create a network to train that produces both the reconstruction and the reconstruction error.
net = NetGraph[{FlattenLayer[], 50, Ramp, 784, Tanh, ReshapeLayer[{1, 28, 28}], MeanSquaredLossLayer[]},
{1 → 2 → 3 → 4 → 5 → 6 → NetPort["Output"], 6 → NetPort[7, "Input"], NetPort["Input"] → NetPort[7, "Target"]},
"Input" → NetEncoder[{"Image", {28, 28}, "Grayscale", "MeanImage" → meanImage}], "Output" → NetDecoder[{"Image", "Grayscale"}]]
[NetGraph diagram: Input → FlattenLayer (1) → LinearLayer (2) → Ramp (3) → LinearLayer (4) → Tanh (5) → ReshapeLayer (6) → Output; layer 6 and Input also feed MeanSquaredLossLayer (7) → Loss]
Train the network to minimize the reconstruction error.
trained4 = NetTrain[net, <|"Input" → trainingImages|>, "Loss"];
Obtain a subnetwork that performs only reconstruction.
reconstructor = Take[trained4, {NetPort["Input"], NetPort["Output"]}]
[NetGraph diagram: Input → FlattenLayer (1) → LinearLayer (2) → Ramp (3) → LinearLayer (4) → Tanh (5) → ReshapeLayer (6) → Output]
Reconstruct some sample images.
ImageAdd[reconstructor[#], meanImage] & /@ {[five sample digit images]}
[Output: the five reconstructed images]
Obtain a subnetwork that produces the code vector.
encoder = Take[trained4, {NetPort["Input"], 4}]
[NetGraph diagram: Input → FlattenLayer (1) → LinearLayer (2) → Ramp (3) → LinearLayer (4) → Output]
Compute codes for all of the test images.
testImages = Keys[testSubset];
features = encoder[testImages];
Project the code vectors to three dimensions and visualize them along with the original classes (not seen by the network). The digit classes tend to cluster together.
coords = DimensionReduce[features, 3];
classes = Values[testSubset];
Table[Extract[coords, Position[classes, i]], {i, 0, 4}]
ListPointPlot3D[Table[Extract[coords, Position[classes, i]], {i, 0, 4}], PlotLegends → PointLegend[96, Range[0, 4]],
BoxRatios → 1, Axes → None, Boxed → True, PlotStyle → Map[ColorData[96], Range[1, 5]], AspectRatio → 1]
[3D point plot of the projected code vectors, colored by digit class 0 to 4]
Visualize a hierarchical clustering of random representatives from each class.
representatives = Catenate@GroupBy[testSubset, Last → First, RandomSample[#, 6] &];
ClusteringTree[encoder[representatives] → Map[ImageCrop, representatives]]
Neural Networks: Avoid overfitting using a hold-out set
Use the ValidationSet option of NetTrain to guard against overfitting the training data. The data supplied through this option is commonly referred to as a test or hold-out dataset.
Create synthetic training data based on a Gaussian curve.
data = Table[x → Exp[-x^2] + RandomVariate[NormalDistribution[0, .15]], {x, -3, 3, .2}];
plot = ListPlot[List @@@ data, PlotStyle → Red]
[Plot: noisy samples of the Gaussian curve for x from -3 to 3]
Train a net with a large number of parameters relative to the amount of training data.
net = NetChain[{150, Tanh, 150, Tanh, 1}, "Input" → "Scalar", "Output" → "Scalar"];
net1 = NetTrain[net, data, Method → "ADAM"]
[NetChain summary: layer-by-layer listing of the trained network, from the scalar input to the scalar output]
The resulting net overfits the data, learning the noise in addition to the underlying function.
Show[Plot[net1[x], {x, -3, 3}], plot]
[Plot: output of net1 overlaid on the noisy data points]
Subdivide the data into a training set and a hold-out validation set.
data = RandomSample[data];
{train, test} = TakeDrop[data, 24];
Use the ValidationSet option to have NetTrain select the net that achieved the lowest validation loss during training.
net2 = NetTrain[net, train, ValidationSet → test]
[NetChain summary: layer-by-layer listing of the trained network, from the scalar input to the scalar output]
The result returned by NetTrain was the net that generalized best to points in the validation set, as measured by validation loss. This penalizes overfitting, as the noise present in the training data is
uncorrelated with the noise present in the validation set.
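The effect of a hold-out set can also be illustrated without a neural network. The following sketch swaps in polynomial least-squares fits of increasing degree (an assumption made for brevity, not the method used in the notebook) and compares the mean squared error on the training points with the error on the held-out points. The names pts, trainPts, holdout, poly and mse are illustrative, and the symbol t is assumed to have no value:

```wolfram
SeedRandom[1];  (* fixed seed so the sketch is repeatable *)

(* Noisy samples of a Gaussian curve, as in the example above *)
pts = Table[{x, Exp[-x^2] + RandomVariate[NormalDistribution[0, 0.15]]}, {x, -3, 3, 0.2}];

(* Random split into 24 training points and the remaining hold-out points *)
{trainPts, holdout} = TakeDrop[RandomSample[pts], 24];

(* Least-squares polynomial of degree d in the symbol t, fitted to the training points only *)
poly[d_] := Fit[trainPts, t^Range[0, d], t];

(* Mean squared error of an expression in t on a set of {x, y} points *)
mse[expr_, set_] := Mean[((expr /. t -> #[[1]]) - #[[2]])^2 & /@ set];

(* Columns: degree, training error, hold-out error *)
Table[{d, mse[poly[d], trainPts], mse[poly[d], holdout]}, {d, {2, 6, 12}}]
```

The training error does not increase as the degree grows, while the hold-out error typically starts to rise once the polynomial begins to follow the noise. Selecting the model with the lowest hold-out error is the same idea that ValidationSet applies during training.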
Show[Plot[net2[x], {x, -3, 3}], plot]
[Plot: output of net2 overlaid on the noisy data points]
Model-based Prediction
Train a Gaussian process predictor on a simple dataset.
data = {-1.2 → 1.2, 1.0 → 1.4, 2.2 → 1.6, 3.1 → 1.8, 4.5 → 1.6};
p = Predict[data, Method → "GaussianProcess"]
Visualize the predicted values along with a confidence band of one standard deviation around the prediction.
Show[Plot[{p[x], p[x] + StandardDeviation[p[x, "Distribution"]], p[x] - StandardDeviation[p[x, "Distribution"]]},
{x, -2, 6}, PlotStyle → {Blue, Gray, Gray}, Filling → {2 → {3}}, Exclusions → False, PerformanceGoal → "Speed",
PlotLegends → {"Prediction", "Confidence Interval"}], ListPlot[List @@@ data, PlotStyle → Red, PlotLegends → {"Data"}]]
[Plot: Gaussian process prediction for x from -2 to 6 with the confidence band and the data points; legend: Prediction, Confidence Interval, Data]
Dealing with complexity: Graph analysis
Import a DIMACS file:
gr = Import["http://mat.gsia.cmu.edu/COLOR/instances/homer.col", "DIMACS"]
Get the metadata:
Import["http://mat.gsia.cmu.edu/COLOR/instances/homer.col", "Elements"]
{AdjacencyMatrix, EdgeRules, Graph, Graphics, VertexCount}
Edge rules:
rules = Import["http://mat.gsia.cmu.edu/COLOR/instances/homer.col", "EdgeRules"]
Most frequently occurring character:
Commonest[Flatten[List @@@ rules]]
{452}
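The most frequent endpoint in the list of edges is the vertex with the highest degree. A small sketch on a toy graph (the edges below are made up for illustration and are unrelated to the homer.col data):

```wolfram
(* Toy graph in which vertex 1 takes part in the most edges *)
edges = {UndirectedEdge[1, 2], UndirectedEdge[1, 3], UndirectedEdge[1, 4], UndirectedEdge[2, 3]};
g = Graph[edges];

(* Count how often each vertex appears as an edge endpoint *)
Commonest[Flatten[List @@@ edges]]   (* {1} *)

(* The same vertex, found through the vertex degrees *)
Pick[VertexList[g], VertexDegree[g], Max[VertexDegree[g]]]   (* {1} *)
```

Both expressions return a list, so ties between equally connected vertices would show up as several entries.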
Achilles' neighborhood:
NeighborhoodGraph[gr, 452]
Highlight the subgraph:
HighlightGraph[gr, NeighborhoodGraph[gr, 452], ImageSize → Large]
Conclusion
◼ Don’t restrict yourself to any particular approach or method without need!
◼ Don’t imply the answer when defining a question!
◼ Stay curious!
Thanks for listening!
For questions and suggestions, contact kai.gansel@additive-net.de.
http://www.additive-mathematica.de
ADDITIVE Soft- und Hardware für Technik und Wissenschaft GmbH
Max-Planck-Straße 22b, 61381 Friedrichsdorf
Sales: 06172 - 5905 - 30 // mathematica@additive-net.de
Academy: 06172 - 5905 - 90 // academy@additive-net.de
Support: 06172 - 5905 - 20 // support@additive-net.de
DN2017_Kai_ADDITIVE.nb ���29

Contenu connexe

Tendances

Neo4j Graph Data Science Training - June 9 & 10 - Slides #5 - Graph Catalog O...
Neo4j Graph Data Science Training - June 9 & 10 - Slides #5 - Graph Catalog O...Neo4j Graph Data Science Training - June 9 & 10 - Slides #5 - Graph Catalog O...
Neo4j Graph Data Science Training - June 9 & 10 - Slides #5 - Graph Catalog O...Neo4j
 
Scaling up data science applications
Scaling up data science applicationsScaling up data science applications
Scaling up data science applicationsKexin Xie
 
Kaggle talk series top 0.2% kaggler on amazon employee access challenge
Kaggle talk series  top 0.2% kaggler on amazon employee access challengeKaggle talk series  top 0.2% kaggler on amazon employee access challenge
Kaggle talk series top 0.2% kaggler on amazon employee access challengeVivian S. Zhang
 
R Workshop for Beginners
R Workshop for BeginnersR Workshop for Beginners
R Workshop for BeginnersMetamarkets
 
3 R Tutorial Data Structure
3 R Tutorial Data Structure3 R Tutorial Data Structure
3 R Tutorial Data StructureSakthi Dasans
 
Introduction of Xgboost
Introduction of XgboostIntroduction of Xgboost
Introduction of Xgboostmichiaki ito
 
Get up to Speed (Quick Guide to data.table in R and Pentaho PDI)
Get up to Speed (Quick Guide to data.table in R and Pentaho PDI)Get up to Speed (Quick Guide to data.table in R and Pentaho PDI)
Get up to Speed (Quick Guide to data.table in R and Pentaho PDI)Serban Tanasa
 
Kaggle Winning Solution Xgboost algorithm -- Let us learn from its author
Kaggle Winning Solution Xgboost algorithm -- Let us learn from its authorKaggle Winning Solution Xgboost algorithm -- Let us learn from its author
Kaggle Winning Solution Xgboost algorithm -- Let us learn from its authorVivian S. Zhang
 
Data handling in r
Data handling in rData handling in r
Data handling in rAbhik Seal
 
Approaching (almost) Any Machine Learning Problem (kaggledays dubai)
Approaching (almost) Any Machine Learning Problem (kaggledays dubai)Approaching (almost) Any Machine Learning Problem (kaggledays dubai)
Approaching (almost) Any Machine Learning Problem (kaggledays dubai)Abhishek Thakur
 
Data manipulation on r
Data manipulation on rData manipulation on r
Data manipulation on rAbhik Seal
 
3 pandasadvanced
3 pandasadvanced3 pandasadvanced
3 pandasadvancedpramod naik
 
Nyc open-data-2015-andvanced-sklearn-expanded
Nyc open-data-2015-andvanced-sklearn-expandedNyc open-data-2015-andvanced-sklearn-expanded
Nyc open-data-2015-andvanced-sklearn-expandedVivian S. Zhang
 
Morel, a Functional Query Language
Morel, a Functional Query LanguageMorel, a Functional Query Language
Morel, a Functional Query LanguageJulian Hyde
 
Data Science Academy Student Demo day--Peggy sobolewski,analyzing transporati...
Data Science Academy Student Demo day--Peggy sobolewski,analyzing transporati...Data Science Academy Student Demo day--Peggy sobolewski,analyzing transporati...
Data Science Academy Student Demo day--Peggy sobolewski,analyzing transporati...Vivian S. Zhang
 
[M3A3] Data Analysis and Interpretation Specialization
[M3A3] Data Analysis and Interpretation Specialization [M3A3] Data Analysis and Interpretation Specialization
[M3A3] Data Analysis and Interpretation Specialization Andrea Rubio
 

Tendances (20)

Neo4j Graph Data Science Training - June 9 & 10 - Slides #5 - Graph Catalog O...
Neo4j Graph Data Science Training - June 9 & 10 - Slides #5 - Graph Catalog O...Neo4j Graph Data Science Training - June 9 & 10 - Slides #5 - Graph Catalog O...
Neo4j Graph Data Science Training - June 9 & 10 - Slides #5 - Graph Catalog O...
 
Scaling up data science applications
Scaling up data science applicationsScaling up data science applications
Scaling up data science applications
 
Kaggle talk series top 0.2% kaggler on amazon employee access challenge
Kaggle talk series  top 0.2% kaggler on amazon employee access challengeKaggle talk series  top 0.2% kaggler on amazon employee access challenge
Kaggle talk series top 0.2% kaggler on amazon employee access challenge
 
R Workshop for Beginners
R Workshop for BeginnersR Workshop for Beginners
R Workshop for Beginners
 
3 R Tutorial Data Structure
3 R Tutorial Data Structure3 R Tutorial Data Structure
3 R Tutorial Data Structure
 
R learning by examples
R learning by examplesR learning by examples
R learning by examples
 
Introduction of Xgboost
Introduction of XgboostIntroduction of Xgboost
Introduction of Xgboost
 
Get up to Speed (Quick Guide to data.table in R and Pentaho PDI)
Get up to Speed (Quick Guide to data.table in R and Pentaho PDI)Get up to Speed (Quick Guide to data.table in R and Pentaho PDI)
Get up to Speed (Quick Guide to data.table in R and Pentaho PDI)
 
Kaggle Winning Solution Xgboost algorithm -- Let us learn from its author
Kaggle Winning Solution Xgboost algorithm -- Let us learn from its authorKaggle Winning Solution Xgboost algorithm -- Let us learn from its author
Kaggle Winning Solution Xgboost algorithm -- Let us learn from its author
 
Data handling in r
Data handling in rData handling in r
Data handling in r
 
array
arrayarray
array
 
Approaching (almost) Any Machine Learning Problem (kaggledays dubai)
Approaching (almost) Any Machine Learning Problem (kaggledays dubai)Approaching (almost) Any Machine Learning Problem (kaggledays dubai)
Approaching (almost) Any Machine Learning Problem (kaggledays dubai)
 
Data manipulation on r
Data manipulation on rData manipulation on r
Data manipulation on r
 
Xgboost
XgboostXgboost
Xgboost
 
3 pandasadvanced
3 pandasadvanced3 pandasadvanced
3 pandasadvanced
 
Rclass
RclassRclass
Rclass
 
Nyc open-data-2015-andvanced-sklearn-expanded
Nyc open-data-2015-andvanced-sklearn-expandedNyc open-data-2015-andvanced-sklearn-expanded
Nyc open-data-2015-andvanced-sklearn-expanded
 
Morel, a Functional Query Language
Morel, a Functional Query LanguageMorel, a Functional Query Language
Morel, a Functional Query Language
 
Data Science Academy Student Demo day--Peggy sobolewski,analyzing transporati...
Data Science Academy Student Demo day--Peggy sobolewski,analyzing transporati...Data Science Academy Student Demo day--Peggy sobolewski,analyzing transporati...
Data Science Academy Student Demo day--Peggy sobolewski,analyzing transporati...
 
[M3A3] Data Analysis and Interpretation Specialization
[M3A3] Data Analysis and Interpretation Specialization [M3A3] Data Analysis and Interpretation Specialization
[M3A3] Data Analysis and Interpretation Specialization
 

Similaire à DN 2017 | Multi-Paradigm Data Science - On the many dimensions of Knowledge Discovery | Kai Gansel | Additive

Learning Predictive Modeling with TSA and Kaggle
Learning Predictive Modeling with TSA and KaggleLearning Predictive Modeling with TSA and Kaggle
Learning Predictive Modeling with TSA and KaggleYvonne K. Matos
 
Session 4 start coding Tensorflow 2.0
Session 4 start coding Tensorflow 2.0Session 4 start coding Tensorflow 2.0
Session 4 start coding Tensorflow 2.0Rajagopal A
 
Introduction to deep learning using python
Introduction to deep learning using pythonIntroduction to deep learning using python
Introduction to deep learning using pythonLino Coria
 
DataStax | Data Science with DataStax Enterprise (Brian Hess) | Cassandra Sum...
DataStax | Data Science with DataStax Enterprise (Brian Hess) | Cassandra Sum...DataStax | Data Science with DataStax Enterprise (Brian Hess) | Cassandra Sum...
DataStax | Data Science with DataStax Enterprise (Brian Hess) | Cassandra Sum...DataStax
 
Big Data LDN 2017: From Zero to AI in 30 Minutes
Big Data LDN 2017: From Zero to AI in 30 MinutesBig Data LDN 2017: From Zero to AI in 30 Minutes
Big Data LDN 2017: From Zero to AI in 30 MinutesMatt Stubbs
 
Jupyter Notebooks for machine learning on Kubernetes & OpenShift | DevNation ...
Jupyter Notebooks for machine learning on Kubernetes & OpenShift | DevNation ...Jupyter Notebooks for machine learning on Kubernetes & OpenShift | DevNation ...
Jupyter Notebooks for machine learning on Kubernetes & OpenShift | DevNation ...Red Hat Developers
 
Yufeng Guo | Coding the 7 steps of machine learning | Codemotion Madrid 2018
Yufeng Guo |  Coding the 7 steps of machine learning | Codemotion Madrid 2018 Yufeng Guo |  Coding the 7 steps of machine learning | Codemotion Madrid 2018
Yufeng Guo | Coding the 7 steps of machine learning | Codemotion Madrid 2018 Codemotion
 
Lecture 1 Pandas Basics.pptx machine learning
Lecture 1 Pandas Basics.pptx machine learningLecture 1 Pandas Basics.pptx machine learning
Lecture 1 Pandas Basics.pptx machine learningmy6305874
 
Lesson 2 data preprocessing
Lesson 2   data preprocessingLesson 2   data preprocessing
Lesson 2 data preprocessingAbdurRazzaqe1
 
Session 06 machine learning.pptx
Session 06 machine learning.pptxSession 06 machine learning.pptx
Session 06 machine learning.pptxbodaceacat
 
Session 06 machine learning.pptx
Session 06 machine learning.pptxSession 06 machine learning.pptx
Session 06 machine learning.pptxSara-Jayne Terp
 
PPT on Data Science Using Python
PPT on Data Science Using PythonPPT on Data Science Using Python
PPT on Data Science Using PythonNishantKumar1179
 
Tactical data engineering
Tactical data engineeringTactical data engineering
Tactical data engineeringJulian Hyde
 
Unit 4_Working with Graphs _python (2).pptx
Unit 4_Working with Graphs _python (2).pptxUnit 4_Working with Graphs _python (2).pptx
Unit 4_Working with Graphs _python (2).pptxprakashvs7
 
Log Analytics in Datacenter with Apache Spark and Machine Learning
Log Analytics in Datacenter with Apache Spark and Machine LearningLog Analytics in Datacenter with Apache Spark and Machine Learning
Log Analytics in Datacenter with Apache Spark and Machine LearningPiotr Tylenda
 
Log Analytics in Datacenter with Apache Spark and Machine Learning
Log Analytics in Datacenter with Apache Spark and Machine LearningLog Analytics in Datacenter with Apache Spark and Machine Learning
Log Analytics in Datacenter with Apache Spark and Machine LearningAgnieszka Potulska
 
Training course lect2
Training course lect2Training course lect2
Training course lect2Noor Dhiya
 
Mini-lab 1: Stochastic Gradient Descent classifier, Optimizing Logistic Regre...
Mini-lab 1: Stochastic Gradient Descent classifier, Optimizing Logistic Regre...Mini-lab 1: Stochastic Gradient Descent classifier, Optimizing Logistic Regre...
Mini-lab 1: Stochastic Gradient Descent classifier, Optimizing Logistic Regre...Yao Yao
 
Data-centric AI and the convergence of data and model engineering: opportunit...
Data-centric AI and the convergence of data and model engineering:opportunit...Data-centric AI and the convergence of data and model engineering:opportunit...
Data-centric AI and the convergence of data and model engineering: opportunit...Paolo Missier
 

Similaire à DN 2017 | Multi-Paradigm Data Science - On the many dimensions of Knowledge Discovery | Kai Gansel | Additive (20)

Learning Predictive Modeling with TSA and Kaggle
Learning Predictive Modeling with TSA and KaggleLearning Predictive Modeling with TSA and Kaggle
Learning Predictive Modeling with TSA and Kaggle
 
Session 4 start coding Tensorflow 2.0
Session 4 start coding Tensorflow 2.0Session 4 start coding Tensorflow 2.0
Session 4 start coding Tensorflow 2.0
 
Introduction to deep learning using python
Introduction to deep learning using pythonIntroduction to deep learning using python
Introduction to deep learning using python
 
DataStax | Data Science with DataStax Enterprise (Brian Hess) | Cassandra Sum...
DataStax | Data Science with DataStax Enterprise (Brian Hess) | Cassandra Sum...DataStax | Data Science with DataStax Enterprise (Brian Hess) | Cassandra Sum...
DataStax | Data Science with DataStax Enterprise (Brian Hess) | Cassandra Sum...
 
Big Data LDN 2017: From Zero to AI in 30 Minutes
Big Data LDN 2017: From Zero to AI in 30 MinutesBig Data LDN 2017: From Zero to AI in 30 Minutes
Big Data LDN 2017: From Zero to AI in 30 Minutes
 
Jupyter Notebooks for machine learning on Kubernetes & OpenShift | DevNation ...
Jupyter Notebooks for machine learning on Kubernetes & OpenShift | DevNation ...Jupyter Notebooks for machine learning on Kubernetes & OpenShift | DevNation ...
Jupyter Notebooks for machine learning on Kubernetes & OpenShift | DevNation ...
 
Yufeng Guo | Coding the 7 steps of machine learning | Codemotion Madrid 2018
Yufeng Guo |  Coding the 7 steps of machine learning | Codemotion Madrid 2018 Yufeng Guo |  Coding the 7 steps of machine learning | Codemotion Madrid 2018
Yufeng Guo | Coding the 7 steps of machine learning | Codemotion Madrid 2018
 
Lecture 1 Pandas Basics.pptx machine learning
Lecture 1 Pandas Basics.pptx machine learningLecture 1 Pandas Basics.pptx machine learning
Lecture 1 Pandas Basics.pptx machine learning
 
Lesson 2 data preprocessing
Lesson 2   data preprocessingLesson 2   data preprocessing
Lesson 2 data preprocessing
 
More on Pandas.pptx
More on Pandas.pptxMore on Pandas.pptx
More on Pandas.pptx
 
Session 06 machine learning.pptx
Session 06 machine learning.pptxSession 06 machine learning.pptx
Session 06 machine learning.pptx
 
Session 06 machine learning.pptx
Session 06 machine learning.pptxSession 06 machine learning.pptx
Session 06 machine learning.pptx
 
PPT on Data Science Using Python
PPT on Data Science Using PythonPPT on Data Science Using Python
PPT on Data Science Using Python
 
Tactical data engineering
Tactical data engineeringTactical data engineering
Tactical data engineering
 
Unit 4_Working with Graphs _python (2).pptx
Unit 4_Working with Graphs _python (2).pptxUnit 4_Working with Graphs _python (2).pptx
Unit 4_Working with Graphs _python (2).pptx
 
Log Analytics in Datacenter with Apache Spark and Machine Learning
Log Analytics in Datacenter with Apache Spark and Machine LearningLog Analytics in Datacenter with Apache Spark and Machine Learning
Log Analytics in Datacenter with Apache Spark and Machine Learning
 
Log Analytics in Datacenter with Apache Spark and Machine Learning
Log Analytics in Datacenter with Apache Spark and Machine LearningLog Analytics in Datacenter with Apache Spark and Machine Learning
Log Analytics in Datacenter with Apache Spark and Machine Learning
 
Training course lect2
Training course lect2Training course lect2
Training course lect2
 
Mini-lab 1: Stochastic Gradient Descent classifier, Optimizing Logistic Regre...
Mini-lab 1: Stochastic Gradient Descent classifier, Optimizing Logistic Regre...Mini-lab 1: Stochastic Gradient Descent classifier, Optimizing Logistic Regre...
Mini-lab 1: Stochastic Gradient Descent classifier, Optimizing Logistic Regre...
 
Data-centric AI and the convergence of data and model engineering: opportunit...
Data-centric AI and the convergence of data and model engineering:opportunit...Data-centric AI and the convergence of data and model engineering:opportunit...
Data-centric AI and the convergence of data and model engineering: opportunit...
 

DN 2017 | Multi-Paradigm Data Science - On the many dimensions of Knowledge Discovery | Kai Gansel | Additive

  • 1. Multi-Paradigm Data Science: On the many dimensions of Knowledge Discovery. Data Natives, Berlin, November 17th, 2017. Dr. Kai Gansel, ADDITIVE GmbH, kai.gansel@additive-net.de
  • 2. Dimensions of Knowledge Discovery I and II: Data. [Figure labels: high-dimensional data, low-dimensional data, big data, not-so-big data; Statistics & Modeling, Data Mining, ClusterPC, ML & NN]
  • 3. Dimensions of Knowledge Discovery III and IV: Approach. [Figure labels: fuzzy question, exact question, exact data, fuzzy data; Statistics, Data Mining, ML & NN]
  • 4. Dimensions of Knowledge Discovery V: Goal. [Figure labels: Understanding (Science), Prediction (Engineering); Data Mining, Statistics, Modeling, Machine Learning, Neural Networks, Modeling]
  • 5. Example: Statistics. Role of genetic variants in health and disease (Kuehn, 2016)
  • 6. Correlation of SNPs with schizophrenic phenotypes (Lencz et al., 2013)
  • 7. Special Topic: Higher order correlations
    Definition (Schneider & Grün, 2003): An observed correlation between items or events is called genuine if it cannot be explained by correlations of lower order, i.e. by a random superposition of any of its constituent parts.
    Meaning: Genuine higher order patterns are based on non-random, interacting processes and reflect the correlational structure of these processes. The appearance of such patterns may provide insights into their hidden causes.
  • 8. General task
    W = region defining one data point
    τ = class / feature / quality
    Application areas: visited websites, market basket analysis... you name it!
    The problem: Combinatorial explosion of the number of candidate patterns and tests with increasing number of dimensions:
        n = 20; 2^n - n - 1
        1048555
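    The count on this slide can be reproduced with a short derivation. It assumes that a candidate pattern is any subset of at least two of the n dimensions; that reading is an interpretation, not something stated on the slide.

    ```latex
    \[
      \sum_{k=2}^{n} \binom{n}{k}
        = 2^{n} - \binom{n}{0} - \binom{n}{1}
        = 2^{n} - n - 1,
      \qquad
      n = 20:\quad 1\,048\,576 - 20 - 1 = 1\,048\,555 .
    \]
    ```

    Under this reading, each additional dimension roughly doubles the number of candidate patterns, which is why exhaustive testing quickly becomes infeasible.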
  • 9. Reducing the complexity of data: DimensionReduce
    Advantages of dimensionality reduction:
    ◼ It reduces the time and storage space required.
    ◼ Removing multicollinearity can improve the performance of machine learning models.
    ◼ Data become easier to visualize when reduced to very low dimensions such as 2D or 3D.
    Here are some multi-dimensional example data:
        data = Import[NotebookDirectory[] <> "Example.dat"];
    Rearrange example data to represent individual measurements. Structure of the data:
        ListPlot[Tally[First /@ data], PlotRange → All, Filling → Axis, AxesLabel → {"Sort ID", "Number of measurements"}]
    [Plot: number of measurements per Sort ID]
  • 10.
        ListLinePlot[data[[346, 2]], PlotRange → All, AxesLabel → {"Mass ID", "Value"}, Epilog → Text["Measurement 346\nSort ID: " <> ToString[data[[346, 1]]], {14000, 6000}]]
    [Plot: Value versus Mass ID for measurement 346, Sort ID 55]
        Dimensions[Transpose[data][[2]]]
        {346, 25780}
    Project the data onto a 3-dimensional subspace:
        data3D = DimensionReduce[Transpose[data][[2]], 3];
        data3D = Get[NotebookDirectory[] <> "Data3D.txt"];
        ListPlot[data3D[[346]], PlotRange → All, AxesLabel → {"Component", "Value"}, Filling → Axis, Epilog → Text["Measurement 346\nSort ID: " <> ToString[data[[346, 1]]], {2, 20}]]
    [Plot: Value versus Component for measurement 346, Sort ID 55]
        Dimensions[data3D]
        {346, 3}
  • 12. Clustering and classifying data: ClusterClassify
    ClusterClassify automatically determines the number of clusters and classifies the data accordingly:
        Manipulate[
          With[{CC = ClusterClassify[data3D, Method → method][data3D]},
            ListPointPlot3D[Map[Last, GatherBy[Transpose[{CC, data3D}], First], {2}], ImageSize → 500,
              PlotLegends → SwatchLegend[Union[CC], LegendLabel → "Cluster ID", LegendFunction → Panel, LegendMarkers → "SphereBubble"]]],
          {method, {"GaussianMixture", "DBSCAN", "MeanShift", "Agglomerate", "NeighborhoodContraction"}},
          SaveDefinitions → True]
    [Interactive 3D point plot with method selector; legend: Cluster ID 1, 2, 3]
  • 13. Classifying data: Classify
    Rock-paper-scissors
    Click Reset. Hold up a fist in front of the camera and click Rock. Change your hand to paper as you click the Paper button, and do the same for scissors. Capture 10-12 images of each gesture. Click Stop when you are done. Click Train and wait. Then click Watch and hold up some rock, paper, or scissors gestures; the classifier should recognize what you are doing.
    [Interactive panel: Data = 0; Capture: Rock, Paper, Scissors; Watch; Stop]
  • 14. Find the optimal parameters of a classifier
    Load a dataset and split it into a training set and a test set.
        data = RandomSample[ExampleData[{"MachineLearning", "Titanic"}, "Data"]];
        training = data[[ ;; 1000]];
        test = data[[1001 ;;]];
    Define a function computing the performance of a classifier as a function of its (hyper)parameters.
        loss[{c_, gamma_, b_, d_}] := -ClassifierMeasurements[
          Classify[training, Method → {"SupportVectorMachine", "KernelType" → "Polynomial", "SoftMarginParameter" → Exp[c], "GammaScalingParameter" → Exp[gamma], "BiasParameter" → Exp[b], "PolynomialDegree" → d}],
          test, "LogLikelihoodRate"];
    Define the possible values of the parameters.
        region = ImplicitRegion[And[-3. ≤ c ≤ 3., -3. ≤ gamma ≤ 3., -1. ≤ b ≤ 2., 1 ≤ d ≤ 3, d ∈ Integers], {c, gamma, b, d}]
    Search for a good set of parameters.
        bmo = BayesianMinimization[loss, region]
        bmo["MinimumConfiguration"]
    Train a classifier with these parameters.
        Classify[training, Method → {"SupportVectorMachine", "KernelType" → "Polynomial", "SoftMarginParameter" → Exp[2.979837222482109`], "GammaScalingParameter" → Exp[-2.1506497693543025`], "BiasParameter" → Exp[-0.9038364134482837`], "PolynomialDegree" → 2}]
        ClassifierMeasurements[%, test, "Accuracy"]
  • 15. Neural Networks: Digit classification
    Use the MNIST database of handwritten digits to train a convolutional network to predict the digit given an image.
    First obtain the training and validation data.
        resource = ResourceObject["MNIST"];
        trainingData = ResourceData[resource, "TrainingData"];
        testData = ResourceData[resource, "TestData"];
        RandomSample[trainingData, 5]
    Define a convolutional neural network that takes in 28×28 grayscale images as input.
        lenet = NetChain[{ConvolutionLayer[20, 5], Ramp, PoolingLayer[2, 2], ConvolutionLayer[50, 5], Ramp, PoolingLayer[2, 2], FlattenLayer[], 500, Ramp, 10, SoftmaxLayer[]},
          "Output" → NetDecoder[{"Class", Range[0, 9]}],
          "Input" → NetEncoder[{"Image", {28, 28}, "Grayscale"}]]
    [NetChain summary: layer-by-layer listing of the network]
    Train the network for one training round.
        lenet = NetTrain[lenet, trainingData, ValidationSet → testData, MaxTrainingRounds → 1];
    Evaluate the trained network directly on images randomly sampled from the validation set.
        imgs = Keys@RandomSample[testData, 5];
        Thread[imgs → lenet[imgs]]
    [Five sample images, classified as 4, 0, 6, 7, 2]
  • 16. Create a ClassifierMeasurements object from the trained network and the validation set.
        cm = ClassifierMeasurements[lenet, testData]
    Obtain the accuracy of the network on the validation set.
        cm["Accuracy"]
        0.9801
    Obtain a plot of the confusion matrix of the network predictions on the validation set.
        cm["ConfusionMatrixPlot"]
    Confusion matrix (rows: actual class, columns: predicted class):
        actual \ predicted     0     1     2     3     4     5     6     7     8     9   total
        0                    963     0     1     0     1     0     3     3     4     5     980
        1                      0  1132     1     1     0     0     0     1     0     0    1135
        2                      1     5  1001     2     2     0     2    14     5     0    1032
        3                      0     0     2   992     0     4     0     7     5     0    1010
        4                      0     0     2     0   964     0     1     1     2    12     982
        5                      1     0     1     9     0   874     2     1     4     0     892
        6                      5     5     0     0     3     3   936     1     5     0     958
        7                      1     4     7     2     0     0     0  1008     3     3    1028
        8                      2     0     2     3     2     0     2     5   952     6     974
        9                      2     4     0     4     7     1     1     8     3   979    1009
        total                975  1150  1017  1013   979   882   947  1049   983  1005
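    As a consistency check, the reported accuracy can be recomputed from the confusion-matrix counts on this slide: accuracy is the sum of the diagonal entries (correct predictions) divided by the total number of test images, here 10 000.

    ```latex
    \[
      \text{accuracy}
        = \frac{\sum_{i=0}^{9} M_{ii}}{\sum_{i,j} M_{ij}}
        = \frac{963 + 1132 + 1001 + 992 + 964 + 874 + 936 + 1008 + 952 + 979}{10\,000}
        = \frac{9801}{10\,000}
        = 0.9801 .
    \]
    ```

    Here $M_{ij}$ denotes the number of test images of actual class $i$ predicted as class $j$; the symbol $M$ is introduced only for this check and does not appear in the notebook.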
  • 17. Neural Networks: Unsupervised learning with autoencoders
    Train an autoencoder network to reconstruct images of handwritten digits after projecting them to a lower-dimensional “code” vector space. Use these code vectors to perform clustering and visualization.
    First obtain the training data, then select images corresponding to digits 0 through 4.
        resource = ResourceObject["MNIST"];
        trainingData = ResourceData[resource, "TrainingData"];
        trainingSubset = Select[trainingData, Last[#] ≤ 4 &];
        testData = ResourceData[resource, "TestData"];
        testSubset = Select[testData, Last[#] ≤ 4 &];
        RandomSample[trainingSubset, 8]
    [Eight sample images, labeled 1, 3, 0, 0, 4, 2, 1, 4]
    Obtain the “mean image” to subtract from the training data.
        trainingImages = Keys[trainingSubset];
        meanImage = Image[Mean@Map[ImageData, trainingImages]]
    Create a network to train that produces both the reconstruction and the reconstruction error.
  • 18.
        net = NetGraph[{FlattenLayer[], 50, Ramp, 784, Tanh, ReshapeLayer[{1, 28, 28}], MeanSquaredLossLayer[]},
          {1 → 2 → 3 → 4 → 5 → 6 → NetPort["Output"], 6 → NetPort[7, "Input"], NetPort["Input"] → NetPort[7, "Target"]},
          "Input" → NetEncoder[{"Image", {28, 28}, "Grayscale", "MeanImage" → meanImage}],
          "Output" → NetDecoder[{"Image", "Grayscale"}]]
    [NetGraph diagram: Input → FlattenLayer → LinearLayer (50) → Ramp → LinearLayer (784) → Tanh → ReshapeLayer (1 ⨯ 28 ⨯ 28) → Output, with MeanSquaredLossLayer producing Loss]
    Train the network to minimize the reconstruction error.
        trained4 = NetTrain[net, <|"Input" → trainingImages|>, "Loss"];
    Obtain a subnetwork that performs only reconstruction.
        reconstructor = Take[trained4, {NetPort["Input"], NetPort["Output"]}]
    [NetGraph diagram of the reconstruction subnetwork]
    Reconstruct some sample images.
        ImageAdd[reconstructor[#], meanImage] & /@ {[five sample images]}
    [Five reconstructed images]
    Obtain a subnetwork that produces the code vector.
  • 19.

        (* Note: layer 4 is the 784-unit linear layer; the 50-unit bottleneck is the output of layers 2 and 3 *)
        encoder = Take[trained4, {NetPort["Input"], 4}]

    (Output: NetGraph with the layers 1 FlattenLayer, 2 LinearLayer, 3 Ramp, 4 LinearLayer; input 1 × 28 × 28, output 784.)

    Compute codes for all of the test images.

        testImages = Keys[testSubset];
        features = encoder[testImages];

    Project the code vectors to three dimensions and visualize them along with the original classes (not seen by the network). The digit classes tend to cluster together.

        coords = DimensionReduce[features, 3];
        classes = Values[testSubset];

        (* Group the projected points by digit class *)
        Table[Extract[coords, Position[classes, i]], {i, 0, 4}]

        ListPointPlot3D[Table[Extract[coords, Position[classes, i]], {i, 0, 4}],
          PlotLegends -> PointLegend[96, Range[0, 4]],
          BoxRatios -> 1, Axes -> None, Boxed -> True,
          PlotStyle -> Map[ColorData[96], Range[1, 5]],
          AspectRatio -> 1]

    (Output: 3D scatter plot of the projected code vectors, colored by digit class 0 to 4.)
  • 20.

    Visualize a hierarchical clustering of random representatives from each class.

        (* Six random images per digit class *)
        representatives = Catenate@GroupBy[testSubset, Last -> First, RandomSample[#, 6] &];
        ClusteringTree[encoder[representatives] -> Map[ImageCrop, representatives]]
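
    In the network above, layer 2 is the 50-unit linear layer and layer 4 the 784-unit one, so a subnetwork that ends at layer 4 returns 784 values per image. If the 50-dimensional bottleneck activations are wanted as code vectors, a minimal sketch is to cut the net one layer earlier, after the Ramp. The names `encoder50` and `features50` are hypothetical and not part of the original notebook; the sketch assumes `trained4` and `testImages` as defined on the slides.

```wolfram
(* Hypothetical variant: stop after layer 3 (Ramp), i.e. the activated output of the 50-unit linear layer *)
encoder50 = Take[trained4, {NetPort["Input"], 3}];

(* One code vector per test image *)
features50 = encoder50[testImages];

(* Check the shape: number of images by length of the code vector *)
Dimensions[features50]
```

    These vectors can be passed to DimensionReduce and ClusteringTree in the same way as `features`; whether the clusters separate better or worse than with the 784-dimensional output is something to check on the data rather than assume.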
  • 21. Neural Networks: Avoid overfitting using a hold-out set

    Use the ValidationSet option of NetTrain to guard against the trained net overfitting the input data. The validation set is commonly referred to as a test or hold-out dataset.

    Create synthetic training data based on a Gaussian curve.

        data = Table[x -> Exp[-x^2] + RandomVariate[NormalDistribution[0, .15]], {x, -3, 3, .2}];
        plot = ListPlot[List @@@ data, PlotStyle -> Red]

    (Output: plot of the noisy samples for x from -3 to 3.)

    Train a net with a large number of parameters relative to the amount of training data.

        net = NetChain[{150, Tanh, 150, Tanh, 1}, "Input" -> "Scalar", "Output" -> "Scalar"];
        net1 = NetTrain[net, data, Method -> "ADAM"]

    (Output: the trained NetChain.)

    The resulting net overfits the data, learning the noise in addition to the underlying function.
  • 22.

        Show[Plot[net1[x], {x, -3, 3}], plot]

    (Output: the fit of net1 plotted over the data.)

    Subdivide the data into a training set and a hold-out validation set.

        (* Shuffle, then use the first 24 points for training and the rest for validation *)
        data = RandomSample[data];
        {train, test} = TakeDrop[data, 24];

    Use the ValidationSet option to have NetTrain select the net that achieved the lowest validation loss during training.

        net2 = NetTrain[net, train, ValidationSet -> test]

    (Output: the trained NetChain.)

    The result returned by NetTrain is the net that generalized best to the points in the validation set, as measured by validation loss. This penalizes overfitting, because the noise present in the training data is uncorrelated with the noise present in the validation set.
  • 23.

        Show[Plot[net2[x], {x, -3, 3}], plot]

    (Output: the fit of net2 plotted over the data.)
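
    Because the data are synthetic, the noise-free curve Exp[-x^2] is known, so the two fits can also be compared numerically rather than only by eye. The following is a minimal sketch under that assumption; `xs` and `trueError` are hypothetical names that do not appear in the original notebook, while `net1` and `net2` are the nets trained on the slides.

```wolfram
(* Dense grid of inputs covering the training range *)
xs = Range[-3., 3., 0.05];

(* Mean squared deviation of a net from the noise-free curve Exp[-x^2] *)
trueError[model_] := Mean[(model[xs] - Exp[-xs^2])^2];

(* Net trained on all data vs. net selected by validation loss *)
{trueError[net1], trueError[net2]}
```

    The numbers depend on the random noise, the random split and the random initialization, so a single run is only indicative; repeating the experiment a few times gives a fairer picture.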
  • 24. Model-based Prediction

    Train a Gaussian process predictor on a simple dataset.

        data = {-1.2 -> 1.2, 1.0 -> 1.4, 2.2 -> 1.6, 3.1 -> 1.8, 4.5 -> 1.6};
        p = Predict[data, Method -> "GaussianProcess"]

    Visualize the predicted values along with a confidence interval, drawn here as a band of one standard deviation on either side of the prediction.

        Show[
          Plot[{p[x],
                p[x] + StandardDeviation[p[x, "Distribution"]],
                p[x] - StandardDeviation[p[x, "Distribution"]]},
            {x, -2, 6},
            PlotStyle -> {Blue, Gray, Gray}, Filling -> {2 -> {3}},
            Exclusions -> False, PerformanceGoal -> "Speed",
            PlotLegends -> {"Prediction", "Confidence Interval"}],
          ListPlot[List @@@ data, PlotStyle -> Red, PlotLegends -> {"Data"}]]

    (Output: plot of the prediction, the shaded band and the data points for x from -2 to 6.)
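
    The band in the plot spans one standard deviation around the prediction. As a small sketch of how the same predictive distribution can be queried at a single point, the following assumes the predictor `p` from the slide; `x0` and `dist` are hypothetical names chosen for illustration.

```wolfram
(* Predictive distribution at a single input; x0 is an arbitrary query point *)
x0 = 2.0;
dist = p[x0, "Distribution"];

(* Point prediction and the one-standard-deviation band, as drawn in the plot *)
{p[x0], Mean[dist] - StandardDeviation[dist], Mean[dist] + StandardDeviation[dist]}

(* Central 95% interval of the same distribution *)
Quantile[dist, {0.025, 0.975}]
```

    The 95% interval is wider than the plotted band, which is worth keeping in mind when reading the legend entry "Confidence Interval".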
  • 25. Dealing with complexity: Graph analysis

    Import a DIMACS file:

        gr = Import["http://mat.gsia.cmu.edu/COLOR/instances/homer.col", "DIMACS"]

    List the elements available for import:

        Import["http://mat.gsia.cmu.edu/COLOR/instances/homer.col", "Elements"]

        {AdjacencyMatrix, EdgeRules, Graph, Graphics, VertexCount}

    Edge rules:

        rules = Import["http://mat.gsia.cmu.edu/COLOR/instances/homer.col", "EdgeRules"]

    Most frequently occurring character (the vertex that appears most often in the edge rules):

        Commonest[Flatten[List @@@ rules]]

        {452}

    Achilles' neighborhood:
  • 26.

        NeighborhoodGraph[gr, 452]

    Highlight the subgraph:
  • 27.

        HighlightGraph[gr, NeighborhoodGraph[gr, 452], ImageSize -> Large]
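
    Counting how often a vertex appears in the edge rules, as the Commonest step does, amounts to looking for the vertex with the highest degree. A minimal sketch of the same lookup on the Graph object itself, assuming `gr` as imported on the slides (`degrees` and `hubs` are hypothetical names):

```wolfram
(* Degree of every vertex, in the order of VertexList[gr] *)
degrees = VertexDegree[gr];

(* All vertices that attain the maximum degree *)
hubs = Pick[VertexList[gr], degrees, Max[degrees]]

(* The five highest degrees in the graph, for context *)
TakeLargest[degrees, 5]
```

    This should normally single out the same vertex as the Commonest call; a mismatch would hint at duplicate or two-way entries in the edge list.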
  • 28. Conclusion

    ◼ Don’t restrict yourself to any particular approach or method without need!
    ◼ Don’t imply the answer when defining a question!
    ◼ Stay curious!
  • 29. Thanks for listening!

    For questions and suggestions, contact kai.gansel@additive-net.de.

    http://www.additive-mathematica.de

    ADDITIVE Soft- und Hardware für Technik und Wissenschaft GmbH
    Max-Planck-Straße 22b, 61381 Friedrichsdorf

    Sales: 06172 - 5905 - 30 // mathematica@additive-net.de
    Academy: 06172 - 5905 - 90 // academy@additive-net.de
    Support: 06172 - 5905 - 20 // support@additive-net.de