Learning from Noisy Label Distributions
Yuya Yoshikawa
STAIR Lab,
Chiba Institute of Technology, Japan
Standard supervised learning setting
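• Given labeled data $\{(\boldsymbol{x}_i, y_i)\}_{i=1}^{N}$
• Feature vector $\boldsymbol{x}_i \in \mathbb{R}^{D}$
• Label $y_i \in \{1, 2, \dots, M\}$
• Goal: to learn a classifier $f(\boldsymbol{x}; \boldsymbol{W})$, i.e., to estimate $\boldsymbol{W}$
• We consider a linear classifier, i.e., $f(\boldsymbol{x}; \boldsymbol{W}) = \boldsymbol{x}^{\top} \boldsymbol{W}$,
where the weight matrix $\boldsymbol{W} \in \mathbb{R}^{D \times M}$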
• Estimating 𝑾 needs a lot of labeled data
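As a minimal sketch of the linear classifier on this slide, the following Python computes the score vector x^T W and picks the highest-scoring label. The argmax decision rule, the function names, and the example numbers are assumptions made for illustration; the slide itself defines only the score function.

```python
def linear_scores(x, W):
    """Return the M scores x^T W for a D-dimensional x and a D x M matrix W."""
    num_labels = len(W[0])
    return [sum(x[d] * W[d][m] for d in range(len(x))) for m in range(num_labels)]


def predict_label(x, W):
    """Pick the label in {1, ..., M} with the largest score (assumed decision rule)."""
    scores = linear_scores(x, W)
    best = max(range(len(scores)), key=lambda m: scores[m])
    return best + 1  # labels are 1-based, as on the slide


# Hypothetical example with D = 2 features and M = 3 labels.
W = [[1.0, 0.0, -1.0],
     [0.0, 2.0, 1.0]]
x = [0.5, 1.0]
print(linear_scores(x, W))  # [0.5, 2.0, 0.5]
print(predict_label(x, W))  # 2
```

Estimating W means choosing the D x M entries so that the predictions match the labels, which is why a large labeled set is normally needed.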
If we have no labeled data …
• Give up learning? → No.
• Annotate the unlabeled data by hand
• However, annotation is often difficult and expensive
A case where annotation is difficult
• Consider annotating SNS users with their age group (e.g., 20s, 30s, 40s)
• This is very easy if the age is explicitly written in the user's profile
• If not, annotators need to infer the user's age from:
• Profile photos
• Texts (tweets, etc.)
• Followers and followees
[Figure: an annotator unsure of a user's age: "20s? 30s?" Difficult…]
Problem setting in this study
• Goal: to learn a classifier 𝑓(𝒙, 𝑾)
• Assumptions:
• There is no labeled data
• Each instance $\boldsymbol{x}_i$ belongs to more than one group
• Each group has a noisy label distribution which can be observed
• Our solution
• Infer the true label distributions of the groups from the noisy ones
• Infer the true label of each instance from the true label distributions
• Learn a classifier 𝑓(𝒙, 𝑾) using the inferred true labels
Illustration of our setting
• Feature vectors $\{\boldsymbol{x}_u \in \mathbb{R}^{D}\}_{u=1}^{U}$ for $U$ instances
• Each instance $u$ has a single label $y_u \in \{1, \dots, M\}$
(The shape of each instance indicates its label)
• However, the label cannot be observed
• Each instance belongs to more than one group
• For each group, there is a true label distribution (unobserved)
• The true label distributions are distorted by unknown noise
• As a result, we can observe only the noisy label distributions
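To make the setting concrete, here is a small sketch with hypothetical data. It assumes that a group's true label distribution is the empirical label frequency of its members, which the slides do not state explicitly. The instance names, group names, and labels are invented for illustration.

```python
from collections import Counter


def group_label_distribution(member_labels, num_labels):
    """Empirical distribution over labels 1..num_labels for one group."""
    counts = Counter(member_labels)
    total = len(member_labels)
    return [counts.get(m, 0) / total for m in range(1, num_labels + 1)]


# Hypothetical true labels (unobserved in the actual problem), with M = 2.
true_label = {"u1": 1, "u2": 1, "u3": 2, "u4": 1, "u5": 2}

# Hypothetical memberships: every instance belongs to more than one group.
groups = {
    "g1": ["u1", "u2", "u3", "u4", "u5"],
    "g2": ["u1", "u3", "u5"],
    "g3": ["u2", "u4"],
}

true_dists = {
    g: group_label_distribution([true_label[u] for u in members], num_labels=2)
    for g, members in groups.items()
}
for g, dist in true_dists.items():
    print(g, [round(p, 3) for p in dist])
```

In the problem studied here neither `true_label` nor `true_dists` is available; only a noise-distorted version of each group's distribution is observed.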
A typical example: Twitter
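[Figure: the Twitter account @BBCWorld (Twitter world) links via hyperlink to the BBC News website (website world).
Twitter world: gender distribution of @BBCWorld (true label dist.): male 60%, female 40%.
Website world: gender distribution of the website visitors (noisy label dist.): male 50%, female 50%.
The true label distribution is distorted by noise into the noisy one.]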
• Goal: to learn a classifier that predicts the gender of Twitter users
• Some users follow official accounts such as @BBCWorld (BBC News)
• Each user is an instance
• @BBCWorld is a group
• Users who follow @BBCWorld are the members of the group
• The gender distribution of @BBCWorld cannot be observed
• @BBCWorld has a hyperlink to the BBC News website
• The gender distribution of the website visitors (noisy label dist.) can be obtained from audience measurement services such as Quantcast
• Why does noise arise?
• The Twitter and website worlds have different populations
• Noise is used to reconcile the populations of the two worlds
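One common way to model such a distortion is a row-stochastic confusion matrix, where entry (m, k) is the probability that true label m is reported as label k. The sketch below assumes that form. The matrix values are hypothetical and were chosen only so that the 60%/40% distribution from the example maps to 50%/50%; they are not estimates from any real data.

```python
def distort(true_dist, confusion):
    """Apply a row-stochastic confusion matrix to a label distribution."""
    num_labels = len(true_dist)
    return [
        sum(true_dist[m] * confusion[m][k] for m in range(num_labels))
        for k in range(num_labels)
    ]


# True gender distribution of the group (male, female), from the example.
true_dist = [0.6, 0.4]

# Hypothetical confusion matrix; each row sums to 1.
confusion = [
    [0.75, 0.25],    # true male   -> observed male / female
    [0.125, 0.875],  # true female -> observed male / female
]

noisy_dist = distort(true_dist, confusion)
print([round(p, 3) for p in noisy_dist])  # [0.5, 0.5]
```

Recovering the true distribution means inverting this mapping without knowing the matrix in advance, which is why the noise has to be estimated jointly with the classifier.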
Related work
• Our study is inspired by [Culotta et al., AAAI 2015]
• Our setting is almost the same as theirs
• Their solution is too simple
• It cannot capture the difference between the true and noisy label distributions
[Figure: the approach of Culotta et al.]
Training: learn a linear regression model 𝑓(𝒙, 𝑾) that predicts label ratios from a feature vector 𝒙
Prediction: for a new feature vector $\boldsymbol{x}_{\text{new}}$, return the label with the highest ratio predicted by $f(\boldsymbol{x}_{\text{new}}, \boldsymbol{W})$
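The training and prediction steps of this baseline can be sketched as follows. This is an illustration under stated assumptions, not the implementation of Culotta et al.: the training pairs (one feature vector and one observed label-ratio vector per group), the plain gradient-descent least-squares fit, and all names and numbers are hypothetical.

```python
def fit_label_ratio_regressor(features, ratios, lr=0.5, steps=200):
    """Least-squares fit of W (D x M) so that x^T W approximates label ratios."""
    n, d, m = len(features), len(features[0]), len(ratios[0])
    W = [[0.0] * m for _ in range(d)]
    for _ in range(steps):
        grad = [[0.0] * m for _ in range(d)]
        for x, r in zip(features, ratios):
            pred = [sum(x[j] * W[j][k] for j in range(d)) for k in range(m)]
            for j in range(d):
                for k in range(m):
                    # Gradient of the mean squared error with respect to W[j][k].
                    grad[j][k] += 2.0 * x[j] * (pred[k] - r[k]) / n
        for j in range(d):
            for k in range(m):
                W[j][k] -= lr * grad[j][k]
    return W


def predict_label(x, W):
    """Return the 0-based index of the label with the highest predicted ratio."""
    ratios = [sum(x[j] * W[j][k] for j in range(len(x))) for k in range(len(W[0]))]
    return max(range(len(ratios)), key=lambda k: ratios[k])


# Hypothetical training data: two feature vectors with observed (noisy) ratios.
features = [[1.0, 0.0], [0.0, 1.0]]
ratios = [[0.8, 0.2], [0.3, 0.7]]
W = fit_label_ratio_regressor(features, ratios)

print(predict_label([0.9, 0.1], W))  # 0
print(predict_label([0.0, 1.0], W))  # 1
```

Because the regression targets are the observed ratios themselves, any systematic distortion in them is learned as if it were signal, which is the limitation the slides point out.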
• Our contributions
• Formalized the problem studied by Culotta et al. as a machine learning problem
• Proposed a probabilistic generative model specialized for the problem
Proposed approach
• Developed a probabilistic generative model that represents the generative process of the noisy label distributions
Graphical model
[Figure: graphical model with the following variables]
• Weight matrix for the classifier
• True label of each instance
• Confusion matrix for the noise
• Noisy label distributions of the groups (observed)
• Group-dependent label for each instance and group
• Feature vector of each instance (observed)
Generative process
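[Figures: the generative process shown step by step; only the annotations below are recoverable]
• $\boldsymbol{\beta} \in \mathbb{R}^{M \times M}$ is determined by the hyperparameters $\alpha_{C1}$ and $\alpha_{C2}$ [equation image not recovered]
• When $\alpha_{C2} > \alpha_{C1}$: assume strong noise
• When $\alpha_{C1} > \alpha_{C2}$: assume weak noise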
Inference: variational Bayes method
Objective function: log of the marginal posterior w.r.t. the weight matrix 𝐖 and the confusion matrix 𝐂
Goal: find 𝐖 and 𝐂 that maximize the objective function
• A mean-field approximation is applied to the objective for efficient computation
• 𝐖 and 𝐂 are then estimated using a quasi-Newton method
Experimental setting
• We experimented on a synthetic dataset
• The dataset is generated according to the proposed model
• The purpose is to confirm that the proposed model outperforms the existing methods when the label distributions are distorted by noise
• We created three datasets by varying the hyperparameter $\alpha_{C1} \in \{1, 10, 100\}$
• This hyperparameter controls the strength of the noise distortion
• When $\alpha_{C1} = 1$, noise is small, i.e., the difference between the true and noisy label distributions is small
• When $\alpha_{C1} = 100$, noise is large, i.e., the difference between the true and noisy label distributions is large
Result
• Regardless of noise strength, the proposed model consistently outperforms the methods proposed by [Culotta et al., AAAI 2015]
Table: Accuracy of true label estimation (# classes 𝑀 = 4)
[Table not recovered. Surviving annotations: a range from strong noise to weak noise, and a marker for the methods proposed by Culotta et al., AAAI 2015]
Conclusion and future work
• We addressed the problem of learning a classifier from noisy label distributions
• There is no labeled data
• Instead, each instance belongs to more than one group, and each group has a noisy label distribution
• To solve this problem, we proposed a probabilistic generative model
• Future work
• Experiments on real-world datasets

Learning from Noisy Label Distributions (ICANN2017)

  • 1. Learning from Noisy Label Distributions Yuya Yoshikawa STAIR Lab, Chiba Institute of Technology, Japan
  • 2. Standard supervised learning setting • Given labeled data 𝒙", 𝑦" "%& ' • Feature vector 𝒙" ∈ ℝ* • Label 𝑦" ∈ {1,2, … , 𝑀} • Goal: to learn a classifier 𝑓 𝒙; 𝑾 , i.e., to estimate 𝑾 • We consider a linear classifier, i.e., 𝑓 𝒙; 𝑾 = 𝒙5 𝑾 where, weight matrix 𝑾 ∈ ℝ*×7 • Estimating 𝑾 needs a lot of labeled data 2
  • 3. If we have no labeled data … • Give up learning? → No. • Annotate the unlabeled data by hand • However, annotation is often difficult and expensive
  • 4. A case where annotation is difficult • Consider annotating SNS users with their age (e.g., 20s, 30s, 40s) • It is very easy if the age is explicitly written in the user's profile • If not, annotators need to infer the user's age from: • Profile photos • Texts (tweets etc.) • Followers and followees • [Figure text: "20s? 30s? difficult…"]
  • 5. Problem setting in this study • Goal: to learn a classifier $f(\boldsymbol{x}, \boldsymbol{W})$ • Assumptions: • There is no labeled data • Each instance $\boldsymbol{x}_n$ belongs to more than one group • Each group has a noisy label distribution, which can be observed • Our solution: • Infer the true label distributions of the groups from the noisy ones • Infer the true label of each instance from the true label distributions • Learn a classifier $f(\boldsymbol{x}, \boldsymbol{W})$ using the true labels
  • 7. Illustration of our setting • Feature vectors $\{\boldsymbol{x}_u \in \mathbb{R}^{D}\}_{u=1}^{U}$ for $U$ instances • Each instance $u$ has a single label $y_u \in \{1, \dots, M\}$ (the shape of each instance indicates the label) • But the label cannot be observed
  • 8. Illustration of our setting • Each instance belongs to more than one group • For each group, there is a true label distribution (unobserved)
  • 9. Illustration of our setting • The true label distributions are distorted by unknown noise • As a result, we can observe only the noisy label distributions
  • 10. A typical example: Twitter • [Figure: in the Twitter world, @BBCWorld has a gender distribution (true label dist.) of male / female: 60% / 40%. @BBCWorld is connected by a hyperlink to the BBC News website in the website world, where the gender distribution of the website visitors (noisy label dist.) is male / female: 50% / 50%. The true distribution is distorted by noise.]
  • 11. A typical example: Twitter • [Figure: Twitter world, @BBCWorld, gender distribution (true label dist.) of male / female: 60% / 40%] • Goal: to learn a classifier that predicts the gender of Twitter users • Some users follow official accounts such as @BBCWorld (BBC News) • Each user is an instance • @BBCWorld is a group • Users who follow @BBCWorld are the members of the group • The gender distribution of @BBCWorld cannot be observed
  • 12. A typical example: Twitter • [Figure: the true gender distribution of @BBCWorld in the Twitter world (male / female: 60% / 40%) is distorted by noise into the gender distribution of the BBC News website visitors in the website world (male / female: 50% / 50%); the two are connected by a hyperlink] • @BBCWorld has a hyperlink to the BBC News website • The gender distribution of the website visitors (noisy label dist.) can be obtained from audience measurement services such as Quantcast • Why is noise generated? • The Twitter and website worlds have different populations • Noise is used to reconcile the populations of the two worlds
  • 13. Problem setting in this study • Goal: to learn a classifier $f(\boldsymbol{x}, \boldsymbol{W})$ • Assumptions: • There is no labeled data • Each instance $\boldsymbol{x}_n$ belongs to more than one group • Each group has a noisy label distribution, which can be observed • Our solution: • Infer the true label distributions of the groups from the noisy ones • Infer the true label of each instance from the true label distributions • Learn a classifier $f(\boldsymbol{x}, \boldsymbol{W})$ using the inferred true labels
  • 14. Related work • Our study is inspired by [Culotta et al., AAAI 2015] • Our setting is almost the same as theirs • Their solution is too simple • The solution cannot capture the difference between true and noisy label distributions • Training: learn a linear regression model $f(\boldsymbol{x}, \boldsymbol{W})$ that predicts label ratios from a feature vector $\boldsymbol{x}$ • Prediction: for a new instance $\boldsymbol{x}_{\mathrm{new}}$, return the label that has the highest ratio predicted by $f(\boldsymbol{x}_{\mathrm{new}}, \boldsymbol{W})$
  • 15. Related work • Our contributions • Formalized the problem of Culotta et al. as a machine learning problem • Proposed a probabilistic generative model specialized for the problem • Our study is inspired by [Culotta et al., AAAI 2015] • Our setting is almost the same as theirs • Their solution is too simple • The solution cannot capture the difference between true and noisy label distributions
  • 16. Proposed approach • Developed a probabilistic generative model that represents the generative process of the noisy label distributions
  • 17. Graphical model • Weight matrix for classifier • True label of each instance • Confusion matrix for noise • Noisy label distributions of groups (observed) • Group-dependent label for each instance and group • Feature vector for each instance (observed)
  • 19. Generative process • $\boldsymbol{\beta} \in \mathbb{R}^{M \times M}$ is determined by • When 𝛼CD > 𝛼C&: assume strong noise • When 𝛼C& > 𝛼CD: assume weak noise
  • 23. Inference: variational Bayes method • Objective function: log of the marginal posterior w.r.t. the weight matrix 𝐖 and the confusion matrix 𝐂 • Goal: find 𝐖 and 𝐂 such that the objective function is maximized • A mean-field approximation is applied to the objective for efficient computation • Then, 𝐖 and 𝐂 are estimated using a quasi-Newton method
  • 24. Experimental setting • We experimented on a synthetic dataset • The dataset is generated from the proposed model • The purpose is to confirm that the proposed model is superior to the existing methods when the label distributions are distorted by noise • We created three datasets by varying the hyperparameter 𝛼C& ∈ {1,10,100} • The hyperparameter controls the strength of the noise distortion • When 𝛼C&=1, noise is small, i.e., the difference between true and noisy label distributions is small • When 𝛼C&=100, noise is large, i.e., the difference between true and noisy label distributions is large
  • 25. Result • Regardless of noise strength, the proposed model is consistently superior to the methods proposed by [Culotta et al., AAAI 2015] • Table: Accuracy of true label estimation (number of classes $M = 4$), comparing the proposed model with the methods proposed by [Culotta et al., AAAI 2015] under weak and strong noise
  • 26. Conclusion and future work • We addressed the problem of learning a classifier from noisy label distributions • There is no labeled data • Instead, each instance belongs to more than one group, and each group has a noisy label distribution • To solve this problem, we proposed a probabilistic generative model • Future work • Experiments on real-world datasets
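
The Twitter example in the slides can be made concrete with a small sketch. It assumes, purely for illustration, that the noise acts as a row-stochastic confusion matrix applied to a group's true label distribution; the slides do not spell out the exact form. The follower labels, the matrix values, and the helper names `label_distribution` and `distort` are hypothetical. The matrix is chosen so that a true 60% / 40% split is observed as roughly 50% / 50%, as in the @BBCWorld illustration.

```python
from collections import Counter


def label_distribution(labels, num_classes):
    """Empirical distribution of labels taking values in {0, ..., num_classes - 1}."""
    counts = Counter(labels)
    total = len(labels)
    return [counts.get(m, 0) / total for m in range(num_classes)]


def distort(true_dist, confusion):
    """Apply a row-stochastic confusion matrix (an assumed noise model).

    noisy[j] = sum_i true_dist[i] * confusion[i][j]
    """
    num_classes = len(true_dist)
    return [
        sum(true_dist[i] * confusion[i][j] for i in range(num_classes))
        for j in range(num_classes)
    ]


# Hypothetical group: 10 followers of one account, 0 = male, 1 = female.
# In the problem setting these labels are hidden.
follower_labels = [0] * 6 + [1] * 4

# True label distribution of the group (unobserved in the problem setting).
true_dist = label_distribution(follower_labels, num_classes=2)

# Hypothetical confusion matrix; each row sums to 1.
confusion = [
    [0.75, 0.25],
    [0.125, 0.875],
]

# Noisy label distribution of the group (the only distribution that is observed).
noisy_dist = distort(true_dist, confusion)

print("true :", true_dist)
print("noisy:", noisy_dist)
```

In the setting described in the slides, only `noisy_dist` and the feature vectors would be available; the instance labels, `true_dist`, and `confusion` correspond to quantities that have to be inferred.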