Mia Mohammad Imran
Virginia Commonwealth University
Emotion Classification In Software
Engineering Texts: A Comparative Analysis of
Pre-trained Transformers Language Models
What are Software Engineering Texts
● Chats
● PR comments
● Issue comments
● Commit messages
● GitHub discussions
● Stack Overflow
● Mailing list
“Programmers Have Feelings Too!”
● Appreciation 🙏
○ “@[USER] Thank you, Stephen. I hope in the future Angular will become even better and easier to understand. However, first of all, I am grateful to Angular for making me grow as a developer.”
● Anger 🤬
○ “Soooooooooooo you’re setting Angular on fire and saying bold sh*t in bold like the Angular team don’t care about you cause you found relative pathing has an issue is an odd area”
How can Understanding Emotions Help?
Benefits of Emotional Intelligence
● 01 Awareness: Self-reflect and seek feedback
● 02 Empathy: Understand and respect diverse perspectives
● 03 Regulation: Manage emotions to maintain focus
● 04 Social Skills: Enhance communication and teamwork
● 05 Motivation: Drive innovation and consistent contribution
Study Design and Goals
● Purpose: To investigate how pre-trained language models (PTMs) perform on the
emotion classification task in software engineering text
● Establish a benchmark against the state-of-the-art tool
● Identify strengths, limitations, and error patterns of PTMs in
this domain
● Propose techniques to improve classifications
Research Questions
● RQ1: How accurately can PTMs classify emotions compared to
the state-of-the-art model?
● RQ2: Can integrating polarity features during training improve
PTMs' emotion classification ability?
Shaver’s Emotion Model
Emotion Models
● Theoretical frameworks to represent emotions
● Shaver’s tree-structured model is most commonly used in
Software Engineering Research
○ 6 primary categories, 25 secondary categories and over 100
tertiary categories
Emotion Models: Shaver’s Taxonomy
● 6 primary categories:
○ Anger 😡
○ Love ❤️
○ Fear 😨
○ Joy 😊
○ Sadness 😥
○ Surprise 😲
Shaver’s Taxonomy: Mapping Example
● Excitement → Joy
○ “Every time you comment I realize something new about JS or TS. This is very exciting. 😊”
● Worry → Fear
○ “Feel free to file a bug for that - that code has a history of breaking :”
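The mapping on this slide can be sketched as a small lookup. This is only an illustration: the names `FINE_TO_PRIMARY` and `to_primary` are hypothetical, and the table holds just the two pairs shown above, not the full taxonomy.

```python
# The six primary categories of Shaver's taxonomy, as listed on the slide.
PRIMARY_EMOTIONS = {"Anger", "Love", "Fear", "Joy", "Sadness", "Surprise"}

# Only the two finer-grained labels from the mapping example; a full table
# would cover every secondary and tertiary category in the taxonomy.
FINE_TO_PRIMARY = {
    "Excitement": "Joy",
    "Worry": "Fear",
}

def to_primary(label):
    """Map a fine-grained label to its primary category (None if unknown)."""
    if label in PRIMARY_EMOTIONS:
        return label
    return FINE_TO_PRIMARY.get(label)

print(to_primary("Excitement"))  # Joy
print(to_primary("Worry"))       # Fear
```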
RQ1: How accurately can PTMs
classify emotions compared to
the state-of-the-art model?
State-of-the-Art Models
● SEntiMoji [3]: transfer-learning neural network
[3] Chen et al. “Emoji-powered sentiment and emotion detection from software developers' communication data.” TOSEM, 2021
● Studies show that general purpose tools perform poorly in
SE text
● All tools perform one-vs-all predictions for all 6 basic
emotions (Anger, Love, Fear, Joy, Sadness, and Surprise)
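One way to read "one-vs-all" here is that each emotion becomes its own binary task. The sketch below assumes annotations arrive as a set of emotions per utterance; that data layout, the helper name `one_vs_all_labels`, and the sample annotations are all hypothetical.

```python
EMOTIONS = ["Anger", "Love", "Fear", "Joy", "Sadness", "Surprise"]

def one_vs_all_labels(annotations):
    """Turn per-utterance emotion sets into one binary label list per emotion."""
    return {
        emotion: [1 if emotion in labels else 0 for labels in annotations]
        for emotion in EMOTIONS
    }

# Hypothetical annotations for three utterances (not taken from either dataset).
annotations = [{"Joy"}, {"Anger", "Sadness"}, set()]
binary_tasks = one_vs_all_labels(annotations)

print(binary_tasks["Joy"])      # [1, 0, 0]
print(binary_tasks["Sadness"])  # [0, 1, 0]
```

Each of the six lists can then be used to train and score a separate binary classifier.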
Compared Pre-trained Language Models
● BERT: First major transformer model applied to NLP
● RoBERTa: An optimized version of BERT
● ALBERT: Lighter, faster BERT with shared layers
● DeBERTa: Enhanced BERT with disentangled attention
● CodeBERT: BERT model specialized for code
● GraphCodeBERT: CodeBERT enhanced with graph data
Evaluating the Models
● Goal: Assess effectiveness of PTMs against SotA model
● On 2 datasets
○ Stack Overflow Dataset [1]
○ GitHub Dataset [2]
● 80% train set, 20% test set with stratified sampling
[1] Novielli et al., “A gold standard for emotion annotation in stack overflow.” MSR 2018
[2] Imran et al., “Data augmentation for improving emotion recognition in software engineering communication.” ASE 2022
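A stratified 80/20 split keeps the proportion of each label roughly the same in both sets. The following is a minimal standard-library sketch, not the study's actual code; the function name `stratified_split`, the seed, and the toy labels are assumptions for illustration.

```python
import random
from collections import defaultdict

def stratified_split(labels, test_fraction=0.2, seed=0):
    """Return (train_indices, test_indices) preserving label proportions."""
    by_label = defaultdict(list)
    for index, label in enumerate(labels):
        by_label[label].append(index)

    rng = random.Random(seed)
    train, test = [], []
    for label in sorted(by_label):
        indices = by_label[label]
        rng.shuffle(indices)
        # Take the same fraction from every label group.
        n_test = round(len(indices) * test_fraction)
        test.extend(indices[:n_test])
        train.extend(indices[n_test:])
    return sorted(train), sorted(test)

# Toy binary task: 20 positive and 80 negative utterances.
labels = [1] * 20 + [0] * 80
train_idx, test_idx = stratified_split(labels)

print(len(train_idx), len(test_idx))      # 80 20
print(sum(labels[i] for i in test_idx))   # 4 positives in the test set
```

Without stratification, a rare emotion could end up with very few or no test examples by chance.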
Compared Metric
● F1-score: Harmonic mean of precision and recall
○ For overall performance: micro-averaged and macro-averaged
F1-score
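The difference between the two averages can be shown with a small sketch. It assumes predictions are given as one 0/1 row per utterance with one column per class; the function names and toy data are hypothetical and are not taken from the study.

```python
def f1_from_counts(tp, fp, fn):
    """F1 = 2TP / (2TP + FP + FN), the harmonic mean of precision and recall."""
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0

def micro_macro_f1(y_true, y_pred, n_classes):
    # counts[c] = [true positives, false positives, false negatives] for class c
    counts = [[0, 0, 0] for _ in range(n_classes)]
    for true_row, pred_row in zip(y_true, y_pred):
        for c in range(n_classes):
            if pred_row[c] and true_row[c]:
                counts[c][0] += 1
            elif pred_row[c]:
                counts[c][1] += 1
            elif true_row[c]:
                counts[c][2] += 1

    # Macro: average the per-class F1 scores, so every class weighs the same.
    macro = sum(f1_from_counts(*c) for c in counts) / n_classes
    # Micro: pool the counts over all classes, then compute a single F1.
    totals = [sum(column) for column in zip(*counts)]
    micro = f1_from_counts(*totals)
    return micro, macro

# Toy data: class 0 is frequent and always right, class 1 is rare and missed.
y_true = [[1, 0], [1, 0], [1, 0], [0, 1]]
y_pred = [[1, 0], [1, 0], [1, 0], [0, 0]]
micro, macro = micro_macro_f1(y_true, y_pred, n_classes=2)

print(round(micro, 3), round(macro, 3))  # 0.857 0.5
```

Micro-averaging rewards getting the frequent class right, while macro-averaging exposes the missed rare class, which is why both are worth reporting on imbalanced emotion data.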
Results (Average F1-score)

GitHub
Model            Micro Avg.   Macro Avg.
SEntiMoji        0.530        0.521
BERT             0.585        0.591
RoBERTa          0.575        0.590
ALBERT           0.538        0.539
DeBERTa          0.610        0.608
CodeBERT         0.545        0.555
GraphCodeBERT    0.549        0.549

Stack Overflow
Model            Micro Avg.   Macro Avg.
SEntiMoji        0.714        0.530
BERT             0.754        0.588
RoBERTa          0.758        0.599
ALBERT           0.747        0.584
DeBERTa          0.756        0.607
CodeBERT         0.728        0.567
GraphCodeBERT    0.722        0.552
Error Analysis
● Error Categorization by Novielli et al. [1]
○ General Error
○ Implicit Sentiment Polarity
○ Pragmatics
○ Figurative Language
○ Politeness
○ Polar Facts
○ Subjectivity in Annotation
[1] Novielli, Nicole et al. "A benchmark study on sentiment analysis for software engineering research." 2018 MSR.
Error Analysis on GitHub Dataset
● General Error: the inability to recognize lexical cues that occur in the text
○ “Nice, this is more slick 👍”
● Implicit Sentiment Polarity: humans use common knowledge to recognize
emotions that the models miss
○ “Patiently waiting for any updates. […]”
● Surprising finding: errors occur even when emojis are present
○ “And yes, there should be tests 😱😱😱”
RQ2: Can integrating polarity
features during training improve
PTMs' emotion classification
ability?
RQ2 Methodology
● Integrate polarity features through token-level attention
adjustment
● Assign greater significance to tokens linked with polarity words
during fine-tuning
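The slides do not spell out how the attention adjustment is implemented. As a rough illustration of the general idea only, the sketch below adds a fixed boost to the raw score of tokens found in a polarity lexicon before normalizing. The lexicon, the `boost` value, and the function names are all hypothetical and should not be read as the authors' method.

```python
import math

# Hypothetical polarity lexicon; a real setup would use a sentiment lexicon or tool.
POLARITY_WORDS = {"nice", "slick", "grateful", "odd", "breaking"}

def softmax(scores):
    """Normalize raw scores into weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(score - peak) for score in scores]
    total = sum(exps)
    return [value / total for value in exps]

def polarity_adjusted_attention(tokens, scores, boost=1.0):
    """Add `boost` to the raw score of each polarity token, then normalize."""
    adjusted = [
        score + boost if token.lower() in POLARITY_WORDS else score
        for token, score in zip(tokens, scores)
    ]
    return softmax(adjusted)

tokens = ["Nice", ",", "this", "is", "more", "slick"]
uniform_scores = [0.0] * len(tokens)  # stand-in for a model's raw attention scores
weights = polarity_adjusted_attention(tokens, uniform_scores)

for token, weight in zip(tokens, weights):
    print(f"{token:>6} {weight:.3f}")
```

With uniform starting scores, "Nice" and "slick" end up with larger weights than the other tokens, which is the effect the second bullet describes.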
Results (Avg F1-score) - GitHub Dataset
Model                     Micro Avg.        Macro Avg.
BERT                      0.585             0.591
BERT-Polarity             0.619 (+5.99%)    0.621 (+5.04%)
RoBERTa                   0.575             0.590
RoBERTa-Polarity          0.603 (+4.94%)    0.606 (+2.75%)
ALBERT                    0.538             0.539
ALBERT-Polarity           0.580 (+7.86%)    0.581 (+7.65%)
DeBERTa                   0.610             0.608
DeBERTa-Polarity          0.620 (+1.75%)    0.614 (+1.04%)
CodeBERT                  0.545             0.555
CodeBERT-Polarity         0.595 (+9.16%)    0.601 (+8.37%)
GraphCodeBERT             0.549             0.549
GraphCodeBERT-Polarity    0.563 (+2.52%)    0.568 (+3.38%)
Results (Avg F1-score) - Stack Overflow Dataset
Model                     Micro Avg.        Macro Avg.
BERT                      0.754             0.588
BERT-Polarity             0.762 (+1.0%)     0.607 (+3.17%)
RoBERTa                   0.758             0.599
RoBERTa-Polarity          0.767 (+1.20%)    0.646 (+7.78%)
ALBERT                    0.747             0.584
ALBERT-Polarity           0.757 (+1.36%)    0.616 (+10.23%)
DeBERTa                   0.756             0.607
DeBERTa-Polarity          0.766 (+1.37%)    0.624 (+2.89%)
CodeBERT                  0.728             0.567
CodeBERT-Polarity         0.742 (+1.91%)    0.586 (+3.32%)
GraphCodeBERT             0.722             0.552
GraphCodeBERT-Polarity    0.732 (+1.29%)    0.569 (+3.11%)
Error Analysis on GitHub Dataset
● In RQ1, all models made mistakes in 67 cases
○ After polarity enhancement, at least one model made a correct prediction in
27 of those 67 cases
● Most improved categories:
○ General error (13/29 cases resolved)
○ Implicit polarity (9/18 cases resolved)
○ Politeness (2/3 cases resolved)
● Least improved categories:
○ Pragmatics (6/7 cases remained unresolved)
○ Figurative language (6/9 cases remained unresolved)
● A considerable number of misclassified utterances still contain emojis
Key Takeaways
● General-purpose PTMs generally outperform SE-specific models on emotion
classification in SE texts
● Polarity features enhance performance consistently
○ Challenges persist especially with negative emotions
● No single model excels across all emotions and metrics
● Common error categories are usually context-dependent:
implicit polarity, figurative language, pragmatics
● Challenges in handling emojis
Future Directions
● Establish more benchmark datasets
● Investigate hierarchical emotion classification (2 step)
○ Enhance performance by identifying broad emotional valence before
specific categories
● Investigate aspect-based sentiment analysis (ABSA)-enhanced PTMs
● Fusion of text and emoji cues during pre-training/fine-tuning
● Explore generative language models for emotion detection
○ Utilize zero-shot and few-shot learning for data augmentation and
prompting techniques
● Focus on detecting emotions that may harm productivity (e.g.,
Frustration)
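The two-step idea can be sketched as follows: decide the broad valence first, then run only the emotion classifiers that fit that valence. Everything in this example is an assumption for illustration, including the valence groups (Surprise is placed in both), the keyword stubs standing in for trained models, and the function names.

```python
# Hypothetical grouping of the six primary emotions by valence.
VALENCE_GROUPS = {
    "positive": ["Love", "Joy", "Surprise"],
    "negative": ["Anger", "Fear", "Sadness", "Surprise"],
}

def classify_two_step(text, valence_model, emotion_models):
    """Step 1: predict valence. Step 2: check only emotions in that group."""
    valence = valence_model(text)
    if valence not in VALENCE_GROUPS:
        return []  # e.g. neutral text: skip the emotion classifiers entirely
    return [e for e in VALENCE_GROUPS[valence] if emotion_models[e](text)]

# Keyword stubs standing in for trained classifiers (illustration only).
KEYWORDS = {
    "Anger": "ridiculous", "Love": "thank", "Fear": "worried",
    "Joy": "exciting", "Sadness": "sorry", "Surprise": "wow",
}
demo_emotions = {
    emotion: (lambda text, keyword=keyword: keyword in text.lower())
    for emotion, keyword in KEYWORDS.items()
}

def demo_valence(text):
    lowered = text.lower()
    if any(word in lowered for word in ("thank", "exciting")):
        return "positive"
    if any(word in lowered for word in ("ridiculous", "worried", "sorry")):
        return "negative"
    return "neutral"

print(classify_two_step("Thank you, this is very exciting", demo_valence, demo_emotions))
# ['Love', 'Joy']
print(classify_two_step("Updated the docs", demo_valence, demo_emotions))
# []
```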
Questions/Thoughts/Collaboration Ideas to: Mia Mohammad Imran, imranm3@vcu.edu
Thank You!
Questions?
