Learning the structure of Gaussian graphical models with unobserved variables
Marina Vinyes, Ph.D.
Paris WiMLDS Organizer, Machine Learning Engineer at Criteo
4th June 2019
Why graphical models?
Graphs are a natural way to represent data
[Figure: three example graphs: a family tree, a social network, and a gene regulatory network]
Image credits. Left: photo of the Marie Curie Museum (Muzeum Marii Sklodowskiej-Curie), courtesy of TripAdvisor. Middle: https://en.wikipedia.org/wiki/Social_graph. Right: Emmert Streib et al. [2014]
What are graphical models?
Nodes correspond to random variables
Edges correspond to statistical dependencies between variables
Different kinds of graphical models
directed/undirected graph
discrete/continuous/both variables
Conditional independence
Example 1 (graph on nodes A, B, C)
B: Train strike
A: Marina is late
C: Caroline is late
A and C independent? No
A and C conditionally independent given B? Yes
Example 2 (graph on nodes A, B, C)
B: Traffic jam
A: Rain
C: Football match
A and C independent? Yes
A and C conditionally independent given B? No
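The two examples can be checked by exact enumeration. The sketch below assumes the edges B → A and B → C for the train-strike example, and A → B and C → B for the traffic-jam example. All probabilities are made up for illustration; only the two graph shapes come from the slide.

```python
from itertools import product

def bern(p, value):
    """Probability that a Bernoulli(p) variable equals `value` (0 or 1)."""
    return p if value == 1 else 1.0 - p

def common_cause():
    """Joint P(a, b, c) for B -> A and B -> C (a strike makes both people late)."""
    joint = {}
    for a, b, c in product((0, 1), repeat=3):
        p_late = 0.9 if b == 1 else 0.1   # hypothetical P(late | strike or no strike)
        joint[(a, b, c)] = bern(0.2, b) * bern(p_late, a) * bern(p_late, c)
    return joint

def common_effect():
    """Joint P(a, b, c) for A -> B <- C (rain and a match both cause traffic)."""
    joint = {}
    for a, b, c in product((0, 1), repeat=3):
        p_jam = 0.1 + 0.4 * a + 0.4 * c   # hypothetical P(jam | rain, match)
        joint[(a, b, c)] = bern(0.3, a) * bern(0.2, c) * bern(p_jam, b)
    return joint

def prob(joint, **fixed):
    """Probability of the event described by `fixed` (keys among a, b, c)."""
    idx = {"a": 0, "b": 1, "c": 2}
    return sum(p for key, p in joint.items()
               if all(key[idx[name]] == v for name, v in fixed.items()))

def independent(joint, tol=1e-12):
    """True if P(a, c) = P(a) P(c) for all values of a and c."""
    return all(
        abs(prob(joint, a=a, c=c) - prob(joint, a=a) * prob(joint, c=c)) < tol
        for a, c in product((0, 1), repeat=2)
    )

def cond_independent_given_b(joint, tol=1e-12):
    """True if P(a, c | b) = P(a | b) P(c | b) for all values of a, b, c."""
    for a, b, c in product((0, 1), repeat=3):
        pb = prob(joint, b=b)
        lhs = prob(joint, a=a, b=b, c=c) / pb
        rhs = (prob(joint, a=a, b=b) / pb) * (prob(joint, b=b, c=c) / pb)
        if abs(lhs - rhs) > tol:
            return False
    return True
```

Calling `independent` and `cond_independent_given_b` on each joint distribution reproduces the No/Yes and Yes/No answers of the two examples.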
Learning the structure of a graphical model
Goal: knowledge discovery, a first step towards identifying causal effects, ...
[Figure: two graphs over the nodes X1, ..., X6]
Learning the structure of a graphical model
Easier for undirected Gaussian graphical models...
$(\Sigma^{-1})_{i,j} = 0$ if and only if there is no edge between $X_i$ and $X_j$
(where $\Sigma^{-1}$ is the inverse covariance matrix)
[Figure: graph over the nodes X1, ..., X6 next to the sparsity pattern of $\hat\Sigma^{-1}$]
Clarification: all following slides consider only undirected Gaussian graphical models
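As a small numerical check of this property, the sketch below builds a hypothetical precision matrix for a chain X1 - X2 - X3 - X4 and reads the edges off its nonzero off-diagonal entries. The chain and its values are illustrative and not taken from the slides. The covariance matrix itself is dense, which is why the inverse is the object to inspect.

```python
import numpy as np

# Hypothetical precision matrix K = Sigma^{-1} of a chain X1 - X2 - X3 - X4.
K = np.array([
    [ 2.0, -1.0,  0.0,  0.0],
    [-1.0,  2.0, -1.0,  0.0],
    [ 0.0, -1.0,  2.0, -1.0],
    [ 0.0,  0.0, -1.0,  2.0],
])

Sigma = np.linalg.inv(K)  # covariance matrix of the Gaussian model

def edges_from_precision(P, tol=1e-8):
    """Return the pairs (i, j), i < j, 1-indexed, with a nonzero precision entry."""
    p = P.shape[0]
    return [(i + 1, j + 1)
            for i in range(p) for j in range(i + 1, p)
            if abs(P[i, j]) > tol]

# Recover the graph from the covariance by inverting it back.
edges = edges_from_precision(np.linalg.inv(Sigma))

# Every entry of the covariance is nonzero, although the graph is sparse.
covariance_is_dense = bool(np.all(np.abs(Sigma) > 1e-8))
```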
Graphical lasso: sparsity assumption
Approximation:
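$\hat\Sigma$: the empirical covariance matrix
$\hat\Sigma^{-1} \approx$ sparse
Formulation:
$\min_{S} \; f_{\mathrm{nll}}(S) + \lambda \|S\|_1 \quad \text{s.t.} \quad S \succ 0$
Negative log-likelihood: $f_{\mathrm{nll}}(M) := -\log\det(M) + \operatorname{tr}(M\hat\Sigma)$
Semidefinite program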
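To make the objective concrete, here is a minimal sketch that only evaluates it; it is not a solver. The function names and the toy data are hypothetical. The ℓ1 norm is taken over all entries of S, which is one common convention (some formulations penalize only the off-diagonal entries).

```python
import numpy as np

def f_nll(M, Sigma_hat):
    """Negative log-likelihood -log det(M) + tr(M Sigma_hat) for symmetric M.

    Returns +inf when M is not positive definite.
    """
    if np.linalg.eigvalsh(M).min() <= 0:
        return np.inf
    _, logdet = np.linalg.slogdet(M)
    return -logdet + np.trace(M @ Sigma_hat)

def graphical_lasso_objective(S, Sigma_hat, lam):
    """f_nll(S) + lam * ||S||_1, with the entrywise l1 norm."""
    return f_nll(S, Sigma_hat) + lam * np.abs(S).sum()

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 3))       # toy data: 200 samples, 3 variables
Sigma_hat = np.cov(X, rowvar=False)     # empirical covariance matrix
S_mle = np.linalg.inv(Sigma_hat)        # minimizer of f_nll alone (lam = 0)
```

Without the penalty the minimizer is simply the inverse of the empirical covariance, which is generally dense; the ℓ1 term is what pushes entries of S to zero.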
What if some variables are unobserved?
Consider a graphical model with 2 latent variables
Full graph (latent variables included): 12 edges, sparse structure
Marginalized graph (observed variables only): 22 edges, not so sparse structure
Link with the structure of the precision matrix K
$K = \Sigma^{-1}$, where $\Sigma$ is the covariance of the full graph
[Figure: full graph over the nodes X1, ..., X11]
Inversion formula: $\Sigma_{OO}^{-1} = K_{OO} - U K_{HH}^{-1} U^\top$
(O: observed variables, H: latent variables)
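The inversion formula is the Schur complement identity for block matrices. The sketch below checks it numerically on a hypothetical model with 4 observed variables and 1 latent variable. It assumes that U denotes the observed-latent block K_OH of the precision matrix, which the slide does not spell out.

```python
import numpy as np

# Hypothetical full precision matrix K: 4 observed variables, then 1 latent one.
K_OO = np.diag([2.0, 2.0, 2.0, 2.0])            # no edges among observed variables
U = np.array([[0.5], [0.5], [0.5], [0.5]])      # assumed K_OH: latent linked to X1..X4
K_HH = np.array([[1.0]])
K = np.block([[K_OO, U], [U.T, K_HH]])

Sigma = np.linalg.inv(K)                        # covariance of the full model
Sigma_OO = Sigma[:4, :4]                        # covariance of the observed variables

lhs = np.linalg.inv(Sigma_OO)
rhs = K_OO - U @ np.linalg.inv(K_HH) @ U.T      # Schur complement of K_HH in K

def count_edges(P, tol=1e-8):
    """Number of nonzero entries strictly above the diagonal."""
    return int(np.sum(np.abs(np.triu(P, k=1)) > tol))

n_edges_full = count_edges(K)          # edges of the full graph
n_edges_marginal = count_edges(lhs)    # edges after marginalizing the latent variable
```

In this toy model the full graph has 4 edges while the marginalized graph has 6: removing the latent variable connects all of its neighbors to each other.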
Previous work
Chandrasekaran et al. [2010]
Since $\Sigma_{OO}^{-1} = K_{OO} - U K_{HH}^{-1} U^\top$,
Approximation:
$\hat\Sigma_{OO}$: the empirical covariance matrix
$\hat\Sigma_{OO}^{-1} \approx$ sparse + low rank
Formulation:
$\min_{S,L} \; f_{\mathrm{nll}}(S - L) + \lambda\,(\eta \|S\|_1 + \operatorname{tr}(L)) \quad \text{s.t.} \quad S - L \succ 0,\; L \succeq 0$
Negative log-likelihood: $f_{\mathrm{nll}}(M) := -\log\det(M) + \operatorname{tr}(M\hat\Sigma_{OO})$
Semidefinite program
Limitation:
The low-rank component does not recover the connectivity between latent and observed variables
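One way to see the limitation: a low-rank positive semidefinite L can be factored as UU^⊤ in many ways, and different factors can have different supports. The sketch below uses hypothetical numbers and takes K_HH to be the identity for simplicity. It rotates a sparse U into a dense one without changing L, so neither L nor the objective, which depends on U only through L, reveals which observed variables each latent variable touches.

```python
import numpy as np

# Sparse factor: latent 1 touches X1, X2; latent 2 touches X3, X4 (hypothetical).
U_sparse = np.array([
    [1.0, 0.0],
    [1.0, 0.0],
    [0.0, 1.0],
    [0.0, 1.0],
])

theta = np.pi / 4
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])   # 2 x 2 rotation, R @ R.T = I

U_dense = U_sparse @ R                            # every entry is now nonzero

L_from_sparse = U_sparse @ U_sparse.T
L_from_dense = U_dense @ U_dense.T                # the same matrix L

def sparse_low_rank_objective(S, L, Sigma_hat_OO, lam, eta):
    """f_nll(S - L) + lam * (eta * ||S||_1 + tr(L)).

    Assumes S - L is symmetric positive definite.
    """
    M = S - L
    _, logdet = np.linalg.slogdet(M)
    penalty = eta * np.abs(S).sum() + np.trace(L)
    return -logdet + np.trace(M @ Sigma_hat_OO) + lam * penalty
```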
Our formulation: more structure on L
Assuming:
latent variables are independent ($K_{HH}$ is diagonal)
every latent variable is connected to $k$ observed variables
$\hat\Sigma_{OO}^{-1} \approx$ sparse + $L$, where we impose structure on $L$ using an atomic norm on $L \approx UU^\top$
$\min_{S,L} \; f_{\mathrm{nll}}(S - L) + \lambda\,(\eta \|S\|_1 + \gamma_{\mathcal{A}}(L)) \quad \text{s.t.} \quad S - L \succ 0,\; L \succeq 0$
Our formulation: more structure on L
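$\Sigma_{OO}^{-1} \approx S + s_1 u_1 u_1^\top + s_2 u_2 u_2^\top + s_3 u_3 u_3^\top$, with the four terms labeled $S$, $L_1$, $L_2$, $L_3$
Atomic norm $\gamma_{\mathcal{A}}$:
Atomic norm for matrices [Richard et al., 2014]
$\mathcal{A} := \{uu^\top \mid u \in \mathbb{R}^p : \|u\|_0 \le k,\; \|u\|_2 = 1\}$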
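To illustrate the atomic set, the sketch below builds three atoms uu^⊤ from k-sparse unit vectors and combines them into L = Σ s_i u_i u_i^⊤. The dimensions, supports, and weights are hypothetical. Because the atomic norm is an infimum over all such decompositions, the sum of the weights used here is only an upper bound on γ_A(L).

```python
import numpy as np

def make_atom(support, p):
    """Return (u, u u^T) for a unit vector u with equal weights on `support`."""
    u = np.zeros(p)
    u[list(support)] = 1.0
    u /= np.linalg.norm(u)
    return u, np.outer(u, u)

p, k = 9, 3                                    # hypothetical: 9 observed variables
supports = [(0, 1, 2), (3, 4, 5), (6, 7, 8)]   # one group per latent variable
weights = [1.0, 0.5, 2.0]                      # hypothetical s_1, s_2, s_3 >= 0

atoms = [make_atom(s, p) for s in supports]
L = sum(s_i * A_i for s_i, (_, A_i) in zip(weights, atoms))

# The support of each atom lists the observed variables tied to one latent variable.
recovered = [tuple(int(i) for i in np.flatnonzero(u)) for u, _ in atoms]

# Upper bound on the atomic norm of L given by this particular decomposition.
gamma_upper_bound = sum(weights)
```

Unlike a generic low-rank factor, each term here carries an explicit support, which is the information the sparse + low rank formulation does not expose.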
Results: Plots of matrix K for the full graph
[Figure: 3 × 3 grid of plots of the matrix K, both axes indexed from 5 to 45. Columns: ground truth, sparse + low rank, ours. Rows: disjoint, overlap, different sizes.]
Conclusion and perspectives
convex approach with matrix regularization
real dataset
directed graphs
full paper with algorithm and identifiability results
https://arxiv.org/abs/1807.07754
Thank you, questions?
References
V. Chandrasekaran, P. A. Parrilo, and A. S. Willsky. Latent variable graphical model selection via convex optimization. In Communication, Control, and Computing (Allerton), 2010 48th Annual Allerton Conference on, pages 1610–1613. IEEE, 2010.
V. Chandrasekaran, B. Recht, P. A. Parrilo, and A. S. Willsky. The convex geometry of linear inverse problems. Foundations of Computational Mathematics, 12(6):805–849, 2012.
F. Emmert Streib, R. De Matos Simoes, P. Mullan, B. Haibe-Kains, and M. Dehmer. The gene regulatory network for breast cancer: integrated regulatory landscape of cancer hallmarks. Frontiers in Genetics, 5:15, 2014.
E. Richard, G. R. Obozinski, and J.-P. Vert. Tight convex relaxations for sparse matrix factorization. In Advances in Neural Information Processing Systems, pages 3284–3292, 2014.
R. Rockafellar. Convex Analysis. Princeton Univ. Press, 1970.
Atomic norms for leveraging structure
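Rockafellar [1970], Chandrasekaran et al. [2012]
Let $\mathcal{A}$ be a collection of atoms
$x = \sum_{a \in \mathcal{A}} c_a a$
Atomic norm on $\mathcal{A}$:
$\gamma_{\mathcal{A}}(x) := \inf_{c} \Big\{ \sum_{a \in \mathcal{A}} c_a \;\Big|\; c_a \ge 0,\; \sum_{a \in \mathcal{A}} c_a a = x \Big\}$
Example of trace norm
Matrix $M \in \mathbb{R}^{n \times p}$ of rank $k$.
SVD: $M = \sum_{i=1}^{k} c_i u_i v_i^\top$
$\|M\|_{\mathrm{tr}} := \sum_{i=1}^{k} |c_i| = \gamma_{\mathcal{A}}(M)$
$\mathcal{A}$ := set of rank-one matrices $uv^\top$ with $\|u\|_2^2 \le 1$, $\|v\|_2^2 \le 1$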
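The trace-norm example can be checked numerically: the sum of the singular values from the SVD matches NumPy's nuclear norm. The matrix below is random and purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, k = 6, 5, 2
M = rng.standard_normal((n, k)) @ rng.standard_normal((k, p))   # rank-k matrix

U, c, Vt = np.linalg.svd(M, full_matrices=False)
c = c[:k]                                   # the k nonzero singular values

# Decomposition into rank-one atoms u_i v_i^T with unit-norm u_i and v_i.
M_rebuilt = sum(c[i] * np.outer(U[:, i], Vt[i]) for i in range(k))

trace_norm = c.sum()                        # sum of the coefficients c_i
nuclear_norm = np.linalg.norm(M, "nuc")     # NumPy's built-in nuclear norm
```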

Learning the structure of Gaussian Graphical models with unobserved variables by Marina Vinyes, Software Engineer in Machine Learning @Criteo

  • 1. Learning the structure of Gaussian Graphical models with unobserved variables. Marina Vinyes, Ph.D., Paris WiMLDS Organizer, Machine Learning Engineer at Criteo. 4th June 2019.
  • 2. Why graphical models? Graphs are a natural way to represent data: family tree, social network, gene regulatory network. Image credits: left, photo of Marie Curie Museum (Muzeum Marii Sklodowskiej-Curie), courtesy of TripAdvisor; middle, https://en.wikipedia.org/wiki/Social_graph; right, Emmert Streib et al. [2014].
  • 3. What are graphical models? Nodes correspond to random variables. Edges correspond to statistical dependencies between variables. Different kinds of graphical models: directed or undirected graph; discrete, continuous, or mixed variables.
  • 4. Conditional independence. Example 1 (nodes A, B, C): B: Train strike; A: Marina is late; C: Caroline is late. A and C independent? No. A and C conditionally independent given B? Yes. Example 2 (nodes A, B, C): B: Traffic jam; A: Rain; C: Football match. A and C independent? Yes. A and C conditionally independent given B? No.
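The two examples can be mimicked numerically. The sketch below is only an illustration under assumptions that are not on the slide: the events are replaced by Gaussian stand-ins, the first example is read as a common cause (B drives A and C), and the second as a common effect (A and C drive B). For Gaussian variables, a zero partial correlation corresponds to conditional independence, so the marginal and partial correlations of A and C play the role of the two questions asked on the slide.

```python
import numpy as np

def partial_corr(samples, i, j):
    """Partial correlation of columns i and j given all remaining columns."""
    precision = np.linalg.inv(np.cov(samples, rowvar=False))
    return -precision[i, j] / np.sqrt(precision[i, i] * precision[j, j])

rng = np.random.default_rng(0)
n = 200_000

# Example 1, assumed common cause: B influences both A and C.
b = rng.normal(size=n)
a = b + rng.normal(size=n)
c = b + rng.normal(size=n)
fork_marginal = np.corrcoef(a, c)[0, 1]                          # clearly non-zero
fork_partial = partial_corr(np.column_stack([a, b, c]), 0, 2)    # close to zero

# Example 2, assumed common effect: A and C both influence B.
a2 = rng.normal(size=n)
c2 = rng.normal(size=n)
b2 = a2 + c2 + rng.normal(size=n)
collider_marginal = np.corrcoef(a2, c2)[0, 1]                           # close to zero
collider_partial = partial_corr(np.column_stack([a2, b2, c2]), 0, 2)    # clearly non-zero

print(fork_marginal, fork_partial, collider_marginal, collider_partial)
```

In this toy setup, conditioning on B removes the dependence in the first case and creates one in the second, matching the Yes/No pattern of the slide.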
  • 5. Learning the structure of a graphical model. Goal: knowledge discovery, a first step towards causal effects, ... [Figure: two graphs on nodes X1 to X6]
  • 6. Learning the structure of a graphical model. Easier for undirected Gaussian graphical models: $(\Sigma^{-1})_{i,j} = 0$ if and only if there is no edge between $X_i$ and $X_j$ (where $\Sigma^{-1}$ is the inverse covariance matrix). [Figure: graph on nodes X1 to X6, next to a matrix plot of $\hat\Sigma^{-1}$] Clarification: all following slides consider only undirected Gaussian graphical models.
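The correspondence between zeros of the inverse covariance and missing edges can be checked on a small example. The sketch below assumes a hypothetical chain graph X1 - X2 - X3 - X4 (not the graph shown on the slide) and an arbitrary choice of numerical values; it shows that the covariance itself is dense while its inverse exposes the edges.

```python
import numpy as np

p = 4
# Hypothetical chain X1 - X2 - X3 - X4 encoded as a tridiagonal precision matrix.
K = 2.0 * np.eye(p) - np.eye(p, k=1) - np.eye(p, k=-1)
Sigma = np.linalg.inv(K)

def edges(precision, tol=1e-8):
    """Set of (i, j) pairs, 1-indexed, whose precision entry is non-zero."""
    q = precision.shape[0]
    return {(i + 1, j + 1)
            for i in range(q) for j in range(i + 1, q)
            if abs(precision[i, j]) > tol}

# Every pair of variables is correlated, so Sigma alone does not show the graph.
covariance_is_dense = bool(np.all(np.abs(Sigma) > 1e-8))
# Inverting the covariance brings the chain structure back.
recovered = edges(np.linalg.inv(Sigma))
print(covariance_is_dense, sorted(recovered))
```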
  • 7. Graphical lasso: sparsity assumption. Approximation: with $\hat\Sigma$ the empirical covariance matrix, $\hat\Sigma^{-1} \approx$ sparse. Formulation: $\min_{S} f_{\mathrm{nll}}(S) + \lambda \|S\|_1$ s.t. $S \succ 0$. Negative log likelihood: $f_{\mathrm{nll}}(M) := -\log\det(M) + \operatorname{tr}(M\Sigma)$. Semidefinite program.
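To make the objective concrete, here is a minimal sketch that only evaluates it; it is not a solver. It assumes that the covariance inside the trace term is the empirical covariance of the data and that the l1 norm sums the absolute values of all entries (some formulations penalize off-diagonal entries only). The toy data, the value of `lam`, and the function names are illustrative choices.

```python
import numpy as np

def f_nll(M, Sigma):
    """-log det(M) + tr(M Sigma); +inf when M is not positive definite."""
    if np.linalg.eigvalsh(M).min() <= 0:
        return np.inf
    return -np.linalg.slogdet(M)[1] + np.trace(M @ Sigma)

def glasso_objective(S, Sigma, lam):
    """f_nll(S) + lam * ||S||_1, with an entrywise l1 norm (an assumption)."""
    return f_nll(S, Sigma) + lam * np.abs(S).sum()

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 5))                  # toy data: 5 independent variables
Sigma_hat = np.cov(X, rowvar=False)

S_dense = np.linalg.inv(Sigma_hat)             # minimizer when lam = 0
S_diag = np.diag(1.0 / np.diag(Sigma_hat))     # sparse candidate with no edges
gradient = Sigma_hat - np.linalg.inv(S_dense)  # gradient of f_nll at S_dense

lam = 0.5
print(f_nll(S_dense, Sigma_hat), f_nll(S_diag, Sigma_hat))
print(glasso_objective(S_dense, Sigma_hat, lam), glasso_objective(S_diag, Sigma_hat, lam))
```

Without the penalty the dense inverse fits best (its gradient vanishes); with the penalty the edge-free candidate scores lower on this toy data, which is the trade-off that the parameter λ controls.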
  • 8. What if some variables are unobserved? Consider a graphical model with 2 latent variables. Complete graph: 12 edges, sparse structure. Marginalized graph: 22 edges, not so sparse structure.
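The loss of sparsity under marginalization can be reproduced on a smaller, hypothetical example (one latent hub connected to five observed variables, not the 12-edge graph of the slide). The numerical values are arbitrary; only the edge counts matter.

```python
import numpy as np

n_obs = 5
# Hypothetical full model: X1..X5 are linked only to one latent hub (last index).
K = np.eye(n_obs + 1)
K[:n_obs, n_obs] = K[n_obs, :n_obs] = 0.3
assert np.all(np.linalg.eigvalsh(K) > 0)   # K must be a valid precision matrix

def count_edges(precision, tol=1e-8):
    """Number of non-zero off-diagonal pairs, i.e. edges of the graph."""
    off_diagonal = precision - np.diag(np.diag(precision))
    return int((np.abs(off_diagonal) > tol).sum() // 2)

Sigma = np.linalg.inv(K)
# Marginalizing the hub keeps the observed block of the covariance only.
K_marginal = np.linalg.inv(Sigma[:n_obs, :n_obs])

full_edges = count_edges(K)
marginal_edges = count_edges(K_marginal)
print(full_edges, marginal_edges)
```

Here the full graph is a star with 5 edges, while the marginal graph over the observed variables becomes a clique with 10 edges.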
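  • 9. Link with the structure of the precision matrix K. $K = \Sigma^{-1}$, where $\Sigma$ is the covariance of the full graph. [Figure: graph on nodes X1 to X11] Inversion formula: $(\Sigma_{OO})^{-1} = K_{OO} - U K_{HH}^{-1} U^\top$, where $O$ and $H$ index the observed and hidden variables and $U = K_{OH}$.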
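The inversion formula is the Schur complement identity for block matrices, and it can be verified numerically. The sketch below uses a hypothetical precision matrix with 6 observed and 2 hidden variables, with U denoting the observed-hidden block of K; all values are illustrative.

```python
import numpy as np

n_obs, n_lat = 6, 2

# Hypothetical precision matrix of the full graph, ordered as (observed, hidden).
K_OO = 2.0 * np.eye(n_obs)
K_HH = np.diag([1.5, 2.5])
U = np.zeros((n_obs, n_lat))   # U = K_OH: links between observed and hidden nodes
U[:3, 0] = 0.4                 # first hidden variable touches X1..X3
U[3:, 1] = 0.5                 # second hidden variable touches X4..X6
K = np.block([[K_OO, U], [U.T, K_HH]])

Sigma = np.linalg.inv(K)
Sigma_OO = Sigma[:n_obs, :n_obs]

lhs = np.linalg.inv(Sigma_OO)               # precision of the observed marginal
low_rank = U @ np.linalg.inv(K_HH) @ U.T    # correction caused by the hidden nodes
rhs = K_OO - low_rank

print(np.allclose(lhs, rhs), np.linalg.matrix_rank(low_rank))
```

The correction term has rank at most the number of hidden variables, which is what motivates a sparse plus low rank model for the observed precision matrix.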
  • 10. Previous work: Chandrasekaran et al. [2010]. Since $(\Sigma_{OO})^{-1} = K_{OO} - U K_{HH}^{-1} U^\top$, approximation: with $\hat\Sigma_{OO}$ the empirical covariance matrix, $\hat\Sigma_{OO}^{-1} \approx$ sparse + low rank. Formulation: $\min_{S,L} f_{\mathrm{nll}}(S - L) + \lambda(\eta \|S\|_1 + \operatorname{tr}(L))$ s.t. $S - L \succ 0$, $L \succeq 0$. Negative log likelihood: $f_{\mathrm{nll}}(M) := -\log\det(M) + \operatorname{tr}(M\Sigma_{OO})$. Semidefinite program. Limitation: the low rank component does not recover the connectivity between latent and observed variables.
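One possible way to read the stated limitation: a low rank matrix L only determines its factors up to a rotation, so a sparse connectivity pattern between latent and observed variables cannot be read off L alone. The sketch below illustrates this on hypothetical values; the matrices `U`, `K_HH`, and the rotation angle are assumptions made for the example, not quantities from the talk.

```python
import numpy as np

# Hypothetical links: each hidden variable touches 3 of the 6 observed ones.
U = np.zeros((6, 2))
U[:3, 0] = 0.4
U[3:, 1] = 0.5
K_HH = np.diag([1.5, 2.5])

L = U @ np.linalg.inv(K_HH) @ U.T           # low rank component

# A sparse factor V with V V^T = L ...
V = U @ np.diag(1.0 / np.sqrt(np.diag(K_HH)))
# ... and a rotated factor that reproduces the same L.
theta = np.pi / 6
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
V_rot = V @ R

same_L = np.allclose(V @ V.T, L) and np.allclose(V_rot @ V_rot.T, L)
support_V = int((np.abs(V) > 1e-12).sum())        # non-zeros in the sparse factor
support_rot = int((np.abs(V_rot) > 1e-12).sum())  # non-zeros after rotation
print(same_L, support_V, support_rot)
```

Both factors give the same L, but only the first one shows which observed variables each hidden variable is connected to.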
  • 11. Our formulation: more structure on L. Assuming: latent variables are independent ($K_{HH}$ is diagonal); every latent variable is connected to $k$ observed variables. $\hat\Sigma_{OO}^{-1} \approx$ sparse + $L$, where we impose structure on $L \approx UU^\top$ using an atomic norm: $\min_{S,L} f_{\mathrm{nll}}(S - L) + \lambda(\eta \|S\|_1 + \gamma_A(L))$ s.t. $S - L \succ 0$, $L \succeq 0$.
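  • 12. Our formulation: more structure on L. $(\Sigma_{OO})^{-1} \approx S + s_1 u_1u_1^\top + s_2 u_2u_2^\top + s_3 u_3u_3^\top$, with the components labelled $S$, $L_1$, $L_2$, $L_3$ (a schematic picture; in the optimization problem the structured component enters as $S - L$). Atomic norm $\gamma_A$: atomic norm for matrices [Richard et al., 2014], with $A := \{uu^\top \mid u \in \mathbb{R}^p : \|u\|_0 \le k, \|u\|_2 = 1\}$.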
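As an illustration of the atomic set, the sketch below builds a matrix L from three atoms of the form $uu^\top$ with $k$-sparse, unit-norm vectors $u$. The dimension, supports, and weights are hypothetical. Because every atom has unit trace, the weights of any nonnegative decomposition of L over these atoms must sum to tr(L), which the last lines check for this particular decomposition.

```python
import numpy as np

p, k = 8, 3

def atom(support, values):
    """Return (u, u u^T) for a unit-norm vector u supported on `support`."""
    u = np.zeros(p)
    u[list(support)] = values
    u /= np.linalg.norm(u)
    return u, np.outer(u, u)

# Three hypothetical latent variables, each touching k = 3 observed variables.
u1, A1 = atom([0, 1, 2], [1.0, 2.0, 1.0])
u2, A2 = atom([2, 3, 4], [1.0, -1.0, 1.0])
u3, A3 = atom([5, 6, 7], [2.0, 1.0, 1.0])
s = np.array([0.7, 0.4, 0.2])            # hypothetical nonnegative weights

L = s[0] * A1 + s[1] * A2 + s[2] * A3

# Each u satisfies the constraints ||u||_0 <= k and ||u||_2 = 1.
in_atomic_set = all(
    np.count_nonzero(u) <= k and np.isclose(np.linalg.norm(u), 1.0)
    for u in (u1, u2, u3)
)
print(in_atomic_set, np.trace(L), s.sum())
```

Note that the supports of the atoms may overlap (here `u1` and `u2` share one observed variable), which corresponds to an observed variable linked to more than one latent variable.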
  • 13. Results: plots of the matrix K for the full graph. [Figure: grid of matrix plots; columns: ground truth, sparse + low rank, ours; rows: disjoint, overlap, different sizes; both axes run from 5 to 45]
  • 14. Conclusion and perspectives: convex approach with matrix regularization; real dataset; directed graphs; full paper with algorithm and identifiability results: https://arxiv.org/abs/1807.07754
  • 16. References I
    V. Chandrasekaran, P. A. Parrilo, and A. S. Willsky. Latent variable graphical model selection via convex optimization. In 2010 48th Annual Allerton Conference on Communication, Control, and Computing (Allerton), pages 1610–1613. IEEE, 2010.
    V. Chandrasekaran, B. Recht, P. A. Parrilo, and A. S. Willsky. The convex geometry of linear inverse problems. Foundations of Computational Mathematics, 12(6):805–849, 2012.
    F. Emmert Streib, R. De Matos Simoes, P. Mullan, B. Haibe-Kains, and M. Dehmer. The gene regulatory network for breast cancer: integrated regulatory landscape of cancer hallmarks. Frontiers in Genetics, 5:15, 2014.
    E. Richard, G. R. Obozinski, and J.-P. Vert. Tight convex relaxations for sparse matrix factorization. In Advances in Neural Information Processing Systems, pages 3284–3292, 2014.
    R. Rockafellar. Convex Analysis. Princeton Univ. Press, 1970.
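  • 17. Atomic norms for leveraging structure (Rockafellar [1970], Chandrasekaran et al. [2012]). Let $A$ be a collection of atoms, with $x = \sum_{a \in A} c_a a$. Atomic norm on $A$: $\gamma_A(x) := \inf_c \{ \sum_{a \in A} c_a \mid c_a \ge 0, \ \sum_{a \in A} c_a a = x \}$. Example of the trace norm: for a matrix $M \in \mathbb{R}^{n \times p}$ of rank $k$ with SVD $M = \sum_{i=1}^{k} c_i u_i v_i^\top$, $\|M\|_{\mathrm{tr}} := \sum_{i=1}^{k} |c_i| = \gamma_A(M)$, where $A$ := set of rank one matrices $uv^\top$ with $\|u\|_2^2 \le 1$, $\|v\|_2^2 \le 1$.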
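The trace norm example can be checked with NumPy. The sketch below builds a random rank-k matrix (sizes and seed are arbitrary), rebuilds it from its rank one SVD terms, and compares the sum of singular values with NumPy's nuclear norm. It verifies the SVD side of the identity only; it does not compute the infimum in the definition of the atomic norm.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, k = 6, 5, 2
M = rng.normal(size=(n, k)) @ rng.normal(size=(k, p))   # rank-k matrix

U, c, Vt = np.linalg.svd(M, full_matrices=False)
c_top, U_top, Vt_top = c[:k], U[:, :k], Vt[:k]

# M as a nonnegative combination of k rank one atoms u_i v_i^T.
reconstruction = sum(c_top[i] * np.outer(U_top[:, i], Vt_top[i]) for i in range(k))

# Sum of the weights in that decomposition, i.e. the trace (nuclear) norm.
trace_norm = c_top.sum()
print(np.allclose(reconstruction, M), trace_norm, np.linalg.norm(M, "nuc"))
```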