TRADER: Trace Divergence Analysis and Embedding Regulation for Debugging Recurrent Neural Networks
Guanhong Tao, Shiqing Ma, Yingqi Liu, Qiuling Xu, Xiangyu Zhang
Introduction
• Many software artifacts are in the form of text or sequences
• Recurrent Neural Networks (RNNs) are designed to deal with textual inputs and inputs in sequence
  - SentiStrength [1] predicts positive or negative sentiment for informal English text
  - The predicted sentiment is further used by [2] to extract problematic API features
  - A recent study [3] showed that SentiStrength achieved recall and precision lower than 40% on negative sentences
[1] Thelwall et al. 2010. Sentiment strength detection in short informal text. Journal of the American Society for Information Science and Technology.
[2] Zhang et al. 2013. Extracting problematic API features from forum discussions. In 21st International Conference on Program Comprehension (ICPC).
[3] Lin et al. 2018. Sentiment Analysis for Software Engineering: How Far Can We Go? In Proceedings of the 40th International Conference on Software Engineering (ICSE).
Introduction
• Text sequence → word embeddings → RNN model → prediction
• Word embeddings are the dominating factor in model accuracy in RNN applications [4, 5, 6]
• The same ML model using different word embeddings can have divergent accuracy, ranging from 62.95% to 88.90% [5]
• Focus: a type of bug in which problematic word embeddings lead to suboptimal model accuracy
[4] Baroni et al. 2014. Don’t count, predict! A systematic comparison of context-counting vs. context-predicting semantic vectors. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (ACL).
[5] Schnabel et al. 2015. Evaluation methods for unsupervised word embeddings. In Conference on Empirical Methods in Natural Language Processing (EMNLP).
[6] Yu et al. 2017. Refining word embeddings for sentiment analysis. In Conference on Empirical Methods in Natural Language Processing (EMNLP).
Introduction
• Text sequence → word embeddings → RNN model → prediction
• The quality of word embeddings can be measured using neighboring words
[Figure: nearest words of a target word, measured by cosine similarity. Original: original embeddings. Regulated: regulated embeddings (by our tool).]
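The neighbor-based check above can be sketched in a few lines. This is a minimal illustration, not the tool's code: the vocabulary, the three-dimensional vectors, and the function names `cosine_similarity` and `nearest_words` are all made up for the example.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention for zero vectors in this sketch
    return dot / (norm_u * norm_v)


def nearest_words(target, embeddings, k=3):
    """Return the k words whose vectors are most similar to the target's."""
    target_vec = embeddings[target]
    scored = [
        (cosine_similarity(target_vec, vec), word)
        for word, vec in embeddings.items()
        if word != target
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [word for _, word in scored[:k]]


# Hypothetical toy embeddings (not taken from the talk).
toy_embeddings = {
    "simpler": [0.9, 0.1, 0.0],
    "easier": [0.8, 0.2, 0.1],
    "harder": [-0.7, 0.3, 0.0],
    "banana": [0.0, 0.0, 1.0],
}
neighbors = nearest_words("simpler", toy_embeddings, k=2)
print(neighbors)
```

Running the same query against the original and the regulated embedding tables and comparing the two neighbor lists gives the kind of side-by-side view the figure describes.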
Debugging An RNN Model
Task: predict the label of an input sentence
Each pair of [h_t, x_t] and o_t is a state, and all the states (looping over the input text sequence) constitute a trace
The prediction is the last output
[Figure: an RNN cell C with a loop (input x_t, hidden state h_t, output o_t) and its unrolled structure over steps 0, 1, ..., t]
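A toy version of collecting a trace is sketched here, assuming a plain Elman-style cell rather than the LSTM used in the talk. The weights are random, the dimensions are arbitrary, and treating h_t as the hidden state entering step t is an assumption about the slide's indexing.

```python
import math
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def softmax(scores):
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def run_rnn(inputs, w_in, w_rec, w_out):
    """Run the toy cell over a non-empty sequence; return (trace, prediction)."""
    hidden = [0.0] * len(w_rec)
    trace = []
    for x_t in inputs:
        h_t = hidden  # hidden state entering this step (assumed convention)
        new_hidden = [
            math.tanh(a + b)
            for a, b in zip(matvec(w_in, x_t), matvec(w_rec, h_t))
        ]
        o_t = softmax(matvec(w_out, new_hidden))
        # One state: the concatenated [h_t, x_t] together with the output o_t.
        trace.append({"state": h_t + x_t, "output": o_t})
        hidden = new_hidden
    last_output = trace[-1]["output"]
    prediction = max(range(len(last_output)), key=lambda i: last_output[i])
    return trace, prediction


rng = random.Random(0)
embed_dim, hidden_dim, num_classes = 4, 3, 3  # arbitrary toy sizes
w_in = random_matrix(hidden_dim, embed_dim, rng)
w_rec = random_matrix(hidden_dim, hidden_dim, rng)
w_out = random_matrix(num_classes, hidden_dim, rng)
sentence = [[rng.uniform(-1, 1) for _ in range(embed_dim)] for _ in range(8)]
trace, prediction = run_rnn(sentence, w_in, w_rec, w_out)
print(len(trace), len(trace[0]["state"]), prediction)
```

The trace has one entry per word, and the predicted label is read from the output of the final entry only.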
Trace Divergence Analysis
• A misclassification is caused by buggy states within a trace
  - “Also, JodaTime¹ makes calculations with time much simpler”²
• Trace divergence analysis
  - At each time step, use x_t and h_t as input, and o_t as output
  - Train classifiers on the validation set to identify diverged steps
  - Please see the paper for the detailed analysis
¹ A date and time library for Java. https://www.joda.org/joda-time/
² A sample text from the Stack Overflow dataset, predicted by an LSTM model
[Figure: the RNN unrolled over the sample sentence "Also JodaTime makes calculations with time much simpler" (inputs x_0 to x_7), with the state vector, trace, time step, output, and trace divergence labeled]
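One possible reading of "train classifiers to identify diverged steps" is sketched below. It is not the paper's algorithm: a nearest-centroid classifier stands in for whatever classifier the authors train, all traces are assumed to have the same number of steps, and a step is flagged as diverged when the per-step classifier disagrees with the true label. The data and the helper names are hypothetical.

```python
def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def squared_distance(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))


def fit_step_classifiers(traces, labels):
    """Fit one nearest-centroid classifier per time step.

    traces: list of traces, each a list of state vectors (same length T).
    labels: the true label of each trace.
    """
    num_steps = len(traces[0])
    classifiers = []
    for t in range(num_steps):
        by_label = {}
        for trace, label in zip(traces, labels):
            by_label.setdefault(label, []).append(trace[t])
        classifiers.append(
            {label: centroid(vectors) for label, vectors in by_label.items()}
        )
    return classifiers


def predict_step(classifier, state):
    """Label of the centroid closest to the given state vector."""
    return min(
        classifier,
        key=lambda label: squared_distance(classifier[label], state),
    )


def diverged_steps(classifiers, trace, true_label):
    """Time steps whose state looks like a different label than the true one."""
    return [
        t
        for t, state in enumerate(trace)
        if predict_step(classifiers[t], state) != true_label
    ]


# Hypothetical validation traces: 2 time steps, 2-dimensional states.
validation_traces = [
    [[1.0, 0.0], [1.0, 1.0]],
    [[0.9, 0.1], [0.8, 1.2]],
    [[-1.0, 0.0], [-1.0, -1.0]],
    [[-0.8, 0.2], [-1.2, -0.9]],
]
validation_labels = ["positive", "positive", "negative", "negative"]
classifiers = fit_step_classifiers(validation_traces, validation_labels)

# A misclassified "positive" input whose second state drifts to the other side.
buggy_trace = [[0.9, 0.0], [-0.9, -1.0]]
steps = diverged_steps(classifiers, buggy_trace, "positive")
print(steps)
```

In this toy data the first state still sits near the positive centroid, so only the second step is reported.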
Defective Dimension Identification
• The root cause of buggy states comes from problematic state dimensions
  - Obtain the state vector of a diverged step
  - Multiply the state vector element-wise with a pre-trained state importance vector
  - Locate defective dimensions with large values
  - Aggregate defective dimensions from all the diverged steps using Algorithm 1 (details in paper)
[Figure: for the input "Also, JodaTime makes … much simpler.", the weighted state vector is computed as
  state vector [x_t, h_t]:   [ ··· 0.4 ··· -0.1 ··· -0.2 ··· ]
  ⊙ pre-trained importance:  [ ··· 0.1 ··· -4.0 ··· 0.3 ··· ]
  = weighted state vector:   [ ··· 0.04 ··· 0.4 ··· -0.06 ··· ]
The large entry (0.4) marks the defective dimension.]
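The weighting step can be reproduced with the three visible entries from the figure. The sketch below makes several assumptions: the threshold value, the use of the raw (signed) weighted value as the notion of "large", the second diverged step, and the vote-counting aggregation are all stand-ins, and the aggregation in particular is not Algorithm 1 from the paper.

```python
from collections import Counter


def weighted_state(state, importance):
    """Element-wise product of a state vector and the importance vector."""
    return [s * w for s, w in zip(state, importance)]


def defective_dimensions(state, importance, threshold):
    """Indices whose weighted value reaches the (assumed) threshold."""
    return [
        index
        for index, value in enumerate(weighted_state(state, importance))
        if value >= threshold
    ]


def aggregate(per_step_dims, min_votes):
    """Keep dimensions flagged in at least min_votes diverged steps.

    A simple voting stand-in, not the paper's Algorithm 1.
    """
    votes = Counter(dim for dims in per_step_dims for dim in dims)
    return sorted(dim for dim, count in votes.items() if count >= min_votes)


# The three visible entries from the slide's figure.
state = [0.4, -0.1, -0.2]
importance = [0.1, -4.0, 0.3]
print(weighted_state(state, importance))  # approximately [0.04, 0.4, -0.06]

# A second, hypothetical diverged step.
other_state = [0.3, -0.2, 0.1]
flagged = [
    defective_dimensions(state, importance, threshold=0.3),
    defective_dimensions(other_state, importance, threshold=0.3),
]
defective = aggregate(flagged, min_votes=2)
print(defective)
```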
Embedding Regulation
• The model is sensitive to defective dimensions
• Regulate word embeddings to reduce the impact of buggy dimensions
  - Apply perturbations to buggy (input and internal) dimensions
  - Freeze model parameters and update input embeddings by minimizing the output difference
  - Retrain the model with the regulated word embeddings
  - Please see details in Algorithm 2
[Figure: for the input "Also, JodaTime makes … much simpler.", an error vector [ ··· 0 ··· 𝜀 ··· 0 ··· ], nonzero at the dimension that is large in the weighted state vector [ ··· 0.04 ··· 0.4 ··· -0.06 ··· ], perturbs the state [x_t, h_t] = [ ··· 0.4 ··· -0.1 ··· -0.2 ··· ]; the original and the perturbed state each pass through cell C, and the difference (diff) between the two outputs o_t is measured]
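The "freeze the model, update the embedding, minimize the output difference" step can be illustrated on a tiny frozen model. This is a sketch under strong assumptions and not Algorithm 2: a single tanh layer stands in for the RNN cell, the state is just the embedding (no hidden part), the weights and 𝜀 are invented, and the gradient is estimated by central finite differences instead of backpropagation.

```python
import math


def model_output(weights, x):
    """Frozen toy model: one tanh unit per weight row (stand-in for cell C)."""
    return [math.tanh(sum(w * v for w, v in zip(row, x))) for row in weights]


def output_difference(weights, x, defective_dim, epsilon):
    """Squared distance between the outputs for x and for x perturbed
    by epsilon at the defective dimension."""
    perturbed = list(x)
    perturbed[defective_dim] += epsilon
    clean_out = model_output(weights, x)
    noisy_out = model_output(weights, perturbed)
    return sum((a - b) ** 2 for a, b in zip(clean_out, noisy_out))


def regulate_embedding(weights, x, defective_dim, epsilon,
                       lr=0.02, steps=200, h=1e-5):
    """Gradient descent on the embedding only; weights are never modified."""
    x = list(x)
    for _ in range(steps):
        grad = []
        for i in range(len(x)):
            up = list(x)
            up[i] += h
            down = list(x)
            down[i] -= h
            # Central finite-difference estimate of d(loss)/d(x[i]).
            grad.append(
                (output_difference(weights, up, defective_dim, epsilon)
                 - output_difference(weights, down, defective_dim, epsilon))
                / (2 * h)
            )
        x = [value - lr * g for value, g in zip(x, grad)]
    return x


# Hypothetical frozen weights; the embedding reuses the figure's three entries.
frozen_weights = [[0.5, 2.0, -0.3], [-0.4, -1.5, 0.2]]
embedding = [0.4, -0.1, -0.2]
before = output_difference(frozen_weights, embedding, defective_dim=1, epsilon=0.5)
regulated = regulate_embedding(frozen_weights, embedding, defective_dim=1, epsilon=0.5)
after = output_difference(frozen_weights, regulated, defective_dim=1, epsilon=0.5)
print(before, after)
```

After regulation the same perturbation of the defective dimension changes the output less than before, which is the effect the slide describes; retraining the model on the regulated embeddings is a separate step not shown here.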
Experimental Results
• 5.37% improvement on 135 models (baseline 0.6%)
  - 5 datasets, 3 word embeddings, 3 RNN model structures (each with 3 different settings)
  - Case study
• Artifacts: https://github.com/trader-rnn/TRADER
[Figure: case study showing input sentences with their labels (Negative, Neutral, Positive) and the predictions of the original model and the fixed model]
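The count of 135 models follows from the configuration grid: 5 × 3 × 3 × 3 = 135. The enumeration below uses placeholder names, since the slides do not list the actual datasets, embeddings, structures, or settings.

```python
from itertools import product

# Placeholder names only; the real choices are given in the paper.
datasets = [f"dataset_{i}" for i in range(1, 6)]
embeddings = [f"embedding_{i}" for i in range(1, 4)]
structures = [f"structure_{i}" for i in range(1, 4)]
settings = [f"setting_{i}" for i in range(1, 4)]

configurations = list(product(datasets, embeddings, structures, settings))
print(len(configurations))
```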
Related Work
• Existing works focus on debugging specific machine learning models or feed-forward Neural Networks and are not applicable to RNNs [7, 8, 9]
• Work [10] aims at debugging NLP models by generating adversarial examples as training data
• Researchers [11, 12] propose methods to debug models by cleaning up the wrongly labeled training data
• These approaches debug RNN models by providing better training data and do not analyze model internals
[7] Cadamuro et al. 2016. Debugging machine learning models. In ICML Workshop on Reliable Machine Learning in the Wild.
[8] Chakarov et al. 2016. Debugging machine learning tasks. arXiv preprint arXiv:1603.07292 (2016).
[9] Ma et al. 2018. MODE: automated neural network model debugging via state differential analysis and input selection. In Proceedings of the 2018 26th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE).
[10] Ribeiro et al. 2018. Semantically equivalent adversarial rules for debugging NLP models. In Association for Computational Linguistics (ACL).
[11] Jiang et al. 2004. Editing training data for kNN classifiers with neural network ensemble. In International Symposium on Neural Networks.
[12] Zhang et al. 2018. Training set debugging using trusted items. In Thirty-Second AAAI Conference on Artificial Intelligence.
Thank you!
Q&A

ICSE20_Tao_slides.pptx
