Android Apps and User Feedback:
A Dataset for Software Evolution and Quality Improvement
Workshop on App Market Analytics - WAMA 2017
G. Grano, A. Di Sorbo, F. Mercaldo, C. Visaggio, G. Canfora, S. Panichella
✉ grano@ifi.uzh.ch giograno90
OUTLINE
→ Context
→ Motivation and relevance
→ Description of the dataset
→ Enabled Research
Giovanni Grano @ s.e.a.l. 2
Google Play Store
3 million apps
65 billion downloads
~ $13 billion in revenue
App Stores → new paradigm
rich source of information:
app descriptions, changelogs
user reviews
Findings from mobile stores:
Direct and actionable impacts
for app developer teams [1]
[1] Martin, Sarro, Jia, Zhang, Harman, A Survey of App Store Analysis for Software Engineering, TSE 16
Initial research focused
on classification [2]
and summarization [3]
of user reviews
[2] Panichella, Di Sorbo, Guzman, Visaggio, Canfora, Gall, How can I improve my app? Classifying user reviews for software maintenance and evolution, ICSME 15
[3] Di Sorbo, Panichella, Alexandru, Shimagaki, Visaggio, Canfora, Gall, What would users change in my app? Summarizing app reviews for recommending software changes, FSE 16
Evolution is guided by
requests in user reviews [4,5]
yet stores lack supporting functionality
[4] Palomba, Salza, Ciurumelea, Panichella, Gall, Ferrucci, De Lucia, Recommending and localizing change requests for mobile apps based on user reviews, ICSE 17
[5] Palomba, Linares-Vásquez, Bavota, Oliveto, Di Penta, Poshyvanyk, De Lucia, User reviews matter! Tracking crowdsourced reviews to support evolution of successful apps, ICSME 15
Our Dataset:
~ 280k user reviews
395 applications
22 code quality metrics
8 code smells
Dataset Construction
We built the dataset in two phases:
→ Data Collection
F-Droid + Google Play Store
→ Analysis Phase
Classification + APK analysis
Data Collection
→ F-Droid
Crawler for metadata: ~ 1,929 apps
→ Play Store Matching
Removed apps that were not matched or were older than 2014
Data Collection
→ Review Crawler
Mined reviews for 965 apps
→ Version Matching
Based on release date and review post date
→ Filtering
Removed versions with fewer than 10 reviews.
Result: 288k reviews for 629 versions of 395 apps!
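The version-matching and filtering steps can be sketched as follows. This is a minimal illustration, not the authors' crawler: it assumes a review belongs to the latest version released on or before the review's post date, and the function names `match_version` and `filter_versions` as well as the sample dates are hypothetical.

```python
from bisect import bisect_right
from collections import Counter
from datetime import date


def match_version(releases, post_date):
    """Return the id of the latest version released on or before post_date.

    `releases` is a list of (release_date, version_id) tuples sorted by date.
    Returns None when the review predates every known release.
    """
    dates = [released for released, _ in releases]
    idx = bisect_right(dates, post_date)
    return releases[idx - 1][1] if idx else None


def filter_versions(matched_ids, min_reviews=10):
    """Keep only the versions that received at least min_reviews reviews."""
    counts = Counter(v for v in matched_ids if v is not None)
    return {v for v, n in counts.items() if n >= min_reviews}


# Invented sample data: two releases, 12 reviews for the first, 3 for the second.
releases = [(date(2014, 1, 16), "0.7.5"), (date(2015, 3, 1), "0.8.0")]
review_dates = [date(2014, 2, 1)] * 12 + [date(2015, 8, 24)] * 3

matched = [match_version(releases, d) for d in review_dates]
kept = filter_versions(matched)
print(kept)
```

With this sample, only the version that collected at least 10 reviews survives the filter.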
Analysis
→ User Reviews Classification
» Two-level taxonomy
→ Code Analysis
» Code Quality Indicators
» Code Smells
User Reviews Classification
URM Taxonomy Model
Two-level taxonomy
» Intention
ARdoc [6]: review classifier based on NLP + SA + TA
» Topic
SURF [3]: topic classifier based on topic-related keywords and n-grams
[3] Di Sorbo, Panichella, Alexandru, Shimagaki, Visaggio, Canfora, Gall, What would users change in my app? Summarizing app reviews for recommending software changes, FSE 16
[6] Panichella, Di Sorbo, Guzman, Visaggio, Canfora, Gall, ARdoc: app reviews development oriented classifier, FSE 16
Intention Categories
Category             Definition
Information Giving   Informs users or developers about app aspects
Information Seeking  Attempts to obtain information or help
Feature Request      Expresses ideas or suggestions for enhancing the app
Problem Discovery    Reports unexpected behaviour or issues
Other                Anything not in the previous categories
Examples
Problem Discovery, Update/Version
I can’t access my SD card with the new
update which makes this app and the very
money I donated worthless.
Feature Request, Feature/Functionality
I would give 5 stars if there was a way
to move emails from the delete folder
back into the inbox folder.
Some numbers...
Topic           Sentences      FR      PD      IS      IG    Other
App               117,409   4,879  11,089   1,600  11,943   87,898
GUI                37,620   3,381   5,034     705   3,560   24,940
Contents           16,819   1,315   1,973     434   1,620   11,477
Download            7,853     333   1,346     363     830    4,981
Company             1,672     118     190      57     152    1,155
Feature           173,847  15,480  27,810   4,342  14,972  111,243
Improvement         8,281   1,005     304      54     755    6,163
Pricing             4,016     142     216      62     559    3,037
Resources           3,071     155     375      50     263    2,228
Update/Version     21,669   1,358   3,886     548   2,423   13,454
Model              22,044   1,308   3,397     459   2,055   14,825
Security            2,392     212     313      65     218    1,584
Other             189,784     630   2,019   1,402   2,842  182,891
TOTAL             606,477  30,316  57,952  10,141  42,192  465,876
FR = Feature Request, PD = Problem Discovery, IS = Information Seeking, IG = Information Giving
Code Analysis
APKs → apktool → smali bytecode
smali bytecode → Python scripts → metrics
available metrics @ GitHub wiki
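The APK-to-metrics pipeline can be sketched roughly as below. The authors' Python scripts are not shown in the slides, so this is only an assumed illustration: `decode_apk` shells out to the `apktool d` command, while `smali_metrics` and `app_metrics` are hypothetical helpers that count `.class` and `.method` directives as two simple dimensional metrics. The metric names and the sample smali text are invented for the example.

```python
import subprocess
from pathlib import Path


def decode_apk(apk_path, out_dir):
    """Disassemble an APK to smali with apktool (must be installed and on PATH)."""
    subprocess.run(
        ["apktool", "d", str(apk_path), "-o", str(out_dir), "-f"],
        check=True,
    )


def smali_metrics(smali_text):
    """Count class and method declarations in the text of one smali file."""
    lines = [line.strip() for line in smali_text.splitlines()]
    return {
        "classes": sum(line.startswith(".class ") for line in lines),
        "methods": sum(line.startswith(".method ") for line in lines),
    }


def app_metrics(decoded_dir):
    """Aggregate the counts over every .smali file under decoded_dir."""
    totals = {"classes": 0, "methods": 0}
    for path in Path(decoded_dir).rglob("*.smali"):
        counts = smali_metrics(path.read_text(errors="ignore"))
        for name, value in counts.items():
            totals[name] += value
    return totals


# Invented smali snippet, used here instead of a real decoded APK.
sample = """.class public Lorg/example/Note;
.super Ljava/lang/Object;
.method public constructor <init>()V
.end method
.method public getTitle()Ljava/lang/String;
.end method
"""
print(smali_metrics(sample))
```

On a real app, one would call `decode_apk` first and then pass the output directory to `app_metrics`.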
Code Metrics
→ Dimensional Metrics
→ Complexity Metrics
→ Object-Oriented Metrics
→ Android-Oriented Metrics
Code Analysis
smali bytecode → Paprika → smells
» Blob Class (BLOB)
» Swiss Army Knife (SAK)
» Long Method (LM)
» Complex Class (CC)
» Internal Getter/Setter (IGS)
» Member Ignoring Method (MIM)
» No Low Memory Resolver (NLMR)
» Leaking Inner Class (LIC)
code smells @ GitHub wiki
Data Sharing
→ CSV Files → Relational Database
CSV Files
→ Versions
id, package name, category, version, release date
1125,org.tomdroid,Productivity,0.7.5,January 16 2014
→ Reviews
id, package name, text, category, version, release date, stars, version id
7bd1c70a-afc9-11e6-93ea-c4b301cdf627
org.tomdroid
Don't sync it online. The whole app crashed. I had to reinstall it.
Lost my notes. As long as you keep it in ur sd card it works good
August 24 2015
3
1125
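Because each review carries a version id, the Reviews file can be joined to the Versions file. The sketch below shows one way to do that with the standard `csv` module. It is an assumption-laden example: the header names (`package_name`, `version_id`, and so on) are guesses at how the columns are spelled, the reviews use a reduced set of columns, and the review rows `r1` and `r2` are invented.

```python
import csv
import io
from collections import defaultdict

# In-memory stand-ins for the CSV files; column names are assumed.
versions_csv = """id,package_name,category,version,release_date
1125,org.tomdroid,Productivity,0.7.5,January 16 2014
"""
reviews_csv = """id,package_name,text,stars,version_id
r1,org.tomdroid,The whole app crashed.,3,1125
r2,org.tomdroid,Works well for my notes.,5,1125
"""


def load(text):
    """Parse CSV text into a list of dictionaries keyed by header name."""
    return list(csv.DictReader(io.StringIO(text)))


def mean_stars_per_version(versions, reviews):
    """Map (package, version) to the mean star rating of its reviews."""
    stars = defaultdict(list)
    for review in reviews:
        stars[review["version_id"]].append(int(review["stars"]))
    return {
        (v["package_name"], v["version"]): sum(stars[v["id"]]) / len(stars[v["id"]])
        for v in versions
        if stars.get(v["id"])
    }


result = mean_stars_per_version(load(versions_csv), load(reviews_csv))
print(result)
```

Versions without any matched review are simply left out of the result.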
→ Sentences
id, text, intention, topic
7bd1c70a-afc9-11e6-93ea-c4b301cdf627
Don't sync it online.
INFORMATION GIVING, Other
7bd1c70a-afc9-11e6-93ea-c4b301cdf627
The whole app crashed.
PROBLEM DISCOVERY, App
7bd1c70a-afc9-11e6-93ea-c4b301cdf627
I had to reinstall it.
OTHER, App-Update/Version
7bd1c70a-afc9-11e6-93ea-c4b301cdf627
Lost my notes.
OTHER, Contents-Feature/Functionality
7bd1c70a-afc9-11e6-93ea-c4b301cdf627
As long as you keep it in ur sd card it works good
OTHER, Feature/Functionality
→ User metrics
id, package name, no. reviews, no. sentences, rating,
FR, %FR, PD, %PD
→ Code Metrics
id, package name, <all metric names>
→ Code Smells
id, package name, <all smell names>
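The FR/PD columns of the user metrics can be derived from the sentence-level intention labels. The slides do not state whether the percentages are taken over sentences or over reviews; the sketch below assumes sentences. The function name `user_metrics` and the label spelling `FEATURE REQUEST` are assumptions; the sample labels mirror the five example sentences shown for the Sentences file.

```python
from collections import Counter


def user_metrics(intentions):
    """Summarize sentence intentions for one app version.

    Returns the sentence count plus the absolute number and percentage of
    Feature Request (FR) and Problem Discovery (PD) sentences.
    """
    counts = Counter(intentions)
    total = len(intentions)

    def pct(n):
        # Guard against division by zero when there are no sentences.
        return round(100.0 * n / total, 2) if total else 0.0

    fr = counts["FEATURE REQUEST"]
    pd = counts["PROBLEM DISCOVERY"]
    return {"sentences": total, "FR": fr, "%FR": pct(fr), "PD": pd, "%PD": pct(pd)}


labels = ["INFORMATION GIVING", "PROBLEM DISCOVERY", "OTHER", "OTHER", "OTHER"]
metrics = user_metrics(labels)
print(metrics)
```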
Relational DB
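The actual database schema is not reproduced in this transcript. As a rough guess at how the CSV files could map onto tables, the following sketch builds a hypothetical three-table SQLite schema (table and column names are assumptions derived from the CSV headers) and counts Problem Discovery sentences per version.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical schema; the real one may differ.
conn.executescript("""
CREATE TABLE versions (
    id INTEGER PRIMARY KEY,
    package_name TEXT, category TEXT, version TEXT, release_date TEXT
);
CREATE TABLE reviews (
    id TEXT PRIMARY KEY,
    package_name TEXT, text TEXT, stars INTEGER,
    version_id INTEGER REFERENCES versions(id)
);
CREATE TABLE sentences (
    review_id TEXT REFERENCES reviews(id),
    text TEXT, intention TEXT, topic TEXT
);
""")

review_id = "7bd1c70a-afc9-11e6-93ea-c4b301cdf627"
conn.execute(
    "INSERT INTO versions VALUES (?, ?, ?, ?, ?)",
    (1125, "org.tomdroid", "Productivity", "0.7.5", "January 16 2014"),
)
conn.execute(
    "INSERT INTO reviews VALUES (?, ?, ?, ?, ?)",
    (review_id, "org.tomdroid", "The whole app crashed.", 3, 1125),
)
conn.execute(
    "INSERT INTO sentences VALUES (?, ?, ?, ?)",
    (review_id, "The whole app crashed.", "PROBLEM DISCOVERY", "App"),
)

# Count Problem Discovery sentences for each app version.
rows = conn.execute("""
    SELECT v.package_name, v.version, COUNT(*)
    FROM sentences s
    JOIN reviews r ON r.id = s.review_id
    JOIN versions v ON v.id = r.version_id
    WHERE s.intention = 'PROBLEM DISCOVERY'
    GROUP BY v.id
""").fetchall()
print(rows)
```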
Research Opportunities
understanding how
code quality affects
reviews and rating
for different categories
observe consequences
on code quality
while integrating user feedback
into the app code base
study co-evolution trends
of quality metrics,
code smells and user feedback
for sequential releases
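One simple starting point for such a co-evolution study is to look at how a smell count and a feedback metric change from one release to the next. The sketch below is purely illustrative: `release_deltas` is a hypothetical helper, and the version numbers, Long Method (LM) counts and Problem Discovery percentages are invented.

```python
def release_deltas(releases, keys):
    """Yield the change in each metric between consecutive releases.

    `releases` must already be ordered by release date.
    """
    for prev, curr in zip(releases, releases[1:]):
        yield {
            "from": prev["version"],
            "to": curr["version"],
            **{key: curr[key] - prev[key] for key in keys},
        }


# Invented history of one app: Long Method smells and % Problem Discovery.
history = [
    {"version": "1.0", "LM": 12, "pct_PD": 10.0},
    {"version": "1.1", "LM": 15, "pct_PD": 14.0},
    {"version": "1.2", "LM": 9, "pct_PD": 8.0},
]

deltas = list(release_deltas(history, ["LM", "pct_PD"]))
for delta in deltas:
    print(delta)
```

Deltas that move in the same direction across many apps would be a hint worth testing statistically, not a conclusion in themselves.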
thanks for your attention
dataset @ GitHub
✉ grano@ifi.uzh.ch giograno90

More Related Content

Similar to Android Apps and User Feedback: A Dataset or Software Evolution and Quality Improvement

Eurecom уличили приложения для Android в тайной от пользователя активности
Eurecom уличили приложения для Android в тайной от пользователя активностиEurecom уличили приложения для Android в тайной от пользователя активности
Eurecom уличили приложения для Android в тайной от пользователя активностиSergey Ulankin
 
Fighting Malware with Graph Analytics: An End-to-End Case Study
Fighting Malware with Graph Analytics: An End-to-End Case StudyFighting Malware with Graph Analytics: An End-to-End Case Study
Fighting Malware with Graph Analytics: An End-to-End Case StudyPriyanka Aash
 
Exploring the Integration of User Feedback in Automated Testing of Android Ap...
Exploring the Integration of User Feedback in Automated Testing of Android Ap...Exploring the Integration of User Feedback in Automated Testing of Android Ap...
Exploring the Integration of User Feedback in Automated Testing of Android Ap...Sebastiano Panichella
 
Is Text Search an Effective Approach for Fault Localization: A Practitioners ...
Is Text Search an Effective Approach for Fault Localization: A Practitioners ...Is Text Search an Effective Approach for Fault Localization: A Practitioners ...
Is Text Search an Effective Approach for Fault Localization: A Practitioners ...Debdoot Mukherjee
 
Is Text Search an Effective Approach for Fault Localization: A Practitioners ...
Is Text Search an Effective Approach for Fault Localization: A Practitioners ...Is Text Search an Effective Approach for Fault Localization: A Practitioners ...
Is Text Search an Effective Approach for Fault Localization: A Practitioners ...Debdoot Mukherjee
 
App Stores - Analytics and Trends
App Stores - Analytics and TrendsApp Stores - Analytics and Trends
App Stores - Analytics and TrendsPriori Data
 
App Store Analytics & Trends @ Nordics Mobile Developer Summit - Priori Prese...
App Store Analytics & Trends @ Nordics Mobile Developer Summit - Priori Prese...App Store Analytics & Trends @ Nordics Mobile Developer Summit - Priori Prese...
App Store Analytics & Trends @ Nordics Mobile Developer Summit - Priori Prese...PRIORI DATA
 
OWF13 - Catalyzing the discovery, analysis and adoption of OSS community-ba...
OWF13 - Catalyzing the discovery, analysis and adoption of   OSS community-ba...OWF13 - Catalyzing the discovery, analysis and adoption of   OSS community-ba...
OWF13 - Catalyzing the discovery, analysis and adoption of OSS community-ba...Paris Open Source Summit
 
On the Link Between Mobile App Quality and User Reviews
On the Link Between Mobile App Quality and User ReviewsOn the Link Between Mobile App Quality and User Reviews
On the Link Between Mobile App Quality and User ReviewsSAIL_QU
 
Heuristics, mnemonics and other Greek words in the exploratory testing of mob...
Heuristics, mnemonics and other Greek words in the exploratory testing of mob...Heuristics, mnemonics and other Greek words in the exploratory testing of mob...
Heuristics, mnemonics and other Greek words in the exploratory testing of mob...COMAQA.BY
 
Open Source Insight: GitHub Finds 4M Flaws, IAST Magic Quadrant, 2018 Open So...
Open Source Insight:GitHub Finds 4M Flaws, IAST Magic Quadrant, 2018 Open So...Open Source Insight:GitHub Finds 4M Flaws, IAST Magic Quadrant, 2018 Open So...
Open Source Insight: GitHub Finds 4M Flaws, IAST Magic Quadrant, 2018 Open So...Black Duck by Synopsys
 
[NMDS] Anders Lykke | Priori Data
[NMDS] Anders Lykke | Priori Data[NMDS] Anders Lykke | Priori Data
[NMDS] Anders Lykke | Priori DataMobilbusiness
 
OutSystems - Go Fast or Go Home
OutSystems - Go Fast or Go Home OutSystems - Go Fast or Go Home
OutSystems - Go Fast or Go Home OutSystems
 
AI, Search, and the Disruption of Knowledge Management
AI, Search, and the Disruption of Knowledge ManagementAI, Search, and the Disruption of Knowledge Management
AI, Search, and the Disruption of Knowledge ManagementTrey Grainger
 
Would Static Analysis Tools Help Developers with Code Reviews?
Would Static Analysis Tools Help Developers with Code Reviews?Would Static Analysis Tools Help Developers with Code Reviews?
Would Static Analysis Tools Help Developers with Code Reviews?Sebastiano Panichella
 
SBA Live Academy: Software Security – Towards a Mature Lifecycle and DevSecOp...
SBA Live Academy: Software Security – Towards a Mature Lifecycle and DevSecOp...SBA Live Academy: Software Security – Towards a Mature Lifecycle and DevSecOp...
SBA Live Academy: Software Security – Towards a Mature Lifecycle and DevSecOp...SBA Research
 
Sistemas de Recomendação sem Enrolação
Sistemas de Recomendação sem Enrolação Sistemas de Recomendação sem Enrolação
Sistemas de Recomendação sem Enrolação Gabriel Moreira
 

Similar to Android Apps and User Feedback: A Dataset or Software Evolution and Quality Improvement (20)

SIG-NOC survey 2019
SIG-NOC survey 2019SIG-NOC survey 2019
SIG-NOC survey 2019
 
Eurecom уличили приложения для Android в тайной от пользователя активности
Eurecom уличили приложения для Android в тайной от пользователя активностиEurecom уличили приложения для Android в тайной от пользователя активности
Eurecom уличили приложения для Android в тайной от пользователя активности
 
Fighting Malware with Graph Analytics: An End-to-End Case Study
Fighting Malware with Graph Analytics: An End-to-End Case StudyFighting Malware with Graph Analytics: An End-to-End Case Study
Fighting Malware with Graph Analytics: An End-to-End Case Study
 
Is AI generation the next platform shift?
Is AI generation the next platform shift?Is AI generation the next platform shift?
Is AI generation the next platform shift?
 
Exploring the Integration of User Feedback in Automated Testing of Android Ap...
Exploring the Integration of User Feedback in Automated Testing of Android Ap...Exploring the Integration of User Feedback in Automated Testing of Android Ap...
Exploring the Integration of User Feedback in Automated Testing of Android Ap...
 
Open Source
Open Source Open Source
Open Source
 
Is Text Search an Effective Approach for Fault Localization: A Practitioners ...
Is Text Search an Effective Approach for Fault Localization: A Practitioners ...Is Text Search an Effective Approach for Fault Localization: A Practitioners ...
Is Text Search an Effective Approach for Fault Localization: A Practitioners ...
 
Is Text Search an Effective Approach for Fault Localization: A Practitioners ...
Is Text Search an Effective Approach for Fault Localization: A Practitioners ...Is Text Search an Effective Approach for Fault Localization: A Practitioners ...
Is Text Search an Effective Approach for Fault Localization: A Practitioners ...
 
App Stores - Analytics and Trends
App Stores - Analytics and TrendsApp Stores - Analytics and Trends
App Stores - Analytics and Trends
 
App Store Analytics & Trends @ Nordics Mobile Developer Summit - Priori Prese...
App Store Analytics & Trends @ Nordics Mobile Developer Summit - Priori Prese...App Store Analytics & Trends @ Nordics Mobile Developer Summit - Priori Prese...
App Store Analytics & Trends @ Nordics Mobile Developer Summit - Priori Prese...
 
OWF13 - Catalyzing the discovery, analysis and adoption of OSS community-ba...
OWF13 - Catalyzing the discovery, analysis and adoption of   OSS community-ba...OWF13 - Catalyzing the discovery, analysis and adoption of   OSS community-ba...
OWF13 - Catalyzing the discovery, analysis and adoption of OSS community-ba...
 
On the Link Between Mobile App Quality and User Reviews
On the Link Between Mobile App Quality and User ReviewsOn the Link Between Mobile App Quality and User Reviews
On the Link Between Mobile App Quality and User Reviews
 
Heuristics, mnemonics and other Greek words in the exploratory testing of mob...
Heuristics, mnemonics and other Greek words in the exploratory testing of mob...Heuristics, mnemonics and other Greek words in the exploratory testing of mob...
Heuristics, mnemonics and other Greek words in the exploratory testing of mob...
 
Open Source Insight: GitHub Finds 4M Flaws, IAST Magic Quadrant, 2018 Open So...
Open Source Insight:GitHub Finds 4M Flaws, IAST Magic Quadrant, 2018 Open So...Open Source Insight:GitHub Finds 4M Flaws, IAST Magic Quadrant, 2018 Open So...
Open Source Insight: GitHub Finds 4M Flaws, IAST Magic Quadrant, 2018 Open So...
 
[NMDS] Anders Lykke | Priori Data
[NMDS] Anders Lykke | Priori Data[NMDS] Anders Lykke | Priori Data
[NMDS] Anders Lykke | Priori Data
 
OutSystems - Go Fast or Go Home
OutSystems - Go Fast or Go Home OutSystems - Go Fast or Go Home
OutSystems - Go Fast or Go Home
 
AI, Search, and the Disruption of Knowledge Management
AI, Search, and the Disruption of Knowledge ManagementAI, Search, and the Disruption of Knowledge Management
AI, Search, and the Disruption of Knowledge Management
 
Would Static Analysis Tools Help Developers with Code Reviews?
Would Static Analysis Tools Help Developers with Code Reviews?Would Static Analysis Tools Help Developers with Code Reviews?
Would Static Analysis Tools Help Developers with Code Reviews?
 
SBA Live Academy: Software Security – Towards a Mature Lifecycle and DevSecOp...
SBA Live Academy: Software Security – Towards a Mature Lifecycle and DevSecOp...SBA Live Academy: Software Security – Towards a Mature Lifecycle and DevSecOp...
SBA Live Academy: Software Security – Towards a Mature Lifecycle and DevSecOp...
 
Sistemas de Recomendação sem Enrolação
Sistemas de Recomendação sem Enrolação Sistemas de Recomendação sem Enrolação
Sistemas de Recomendação sem Enrolação
 

More from Sebastiano Panichella

Automated Identification and Qualitative Characterization of Safety Concerns ...
Automated Identification and Qualitative Characterization of Safety Concerns ...Automated Identification and Qualitative Characterization of Safety Concerns ...
Automated Identification and Qualitative Characterization of Safety Concerns ...Sebastiano Panichella
 
The 2nd Intl. Workshop on NL-based Software Engineering
The 2nd Intl. Workshop on NL-based Software EngineeringThe 2nd Intl. Workshop on NL-based Software Engineering
The 2nd Intl. Workshop on NL-based Software EngineeringSebastiano Panichella
 
The 16th Intl. Workshop on Search-Based and Fuzz Testing
The 16th Intl. Workshop on Search-Based and Fuzz TestingThe 16th Intl. Workshop on Search-Based and Fuzz Testing
The 16th Intl. Workshop on Search-Based and Fuzz TestingSebastiano Panichella
 
Simulation-based Test Case Generation for Unmanned Aerial Vehicles in the Nei...
Simulation-based Test Case Generation for Unmanned Aerial Vehicles in the Nei...Simulation-based Test Case Generation for Unmanned Aerial Vehicles in the Nei...
Simulation-based Test Case Generation for Unmanned Aerial Vehicles in the Nei...Sebastiano Panichella
 
Testing and Development Challenges for Complex Cyber-Physical Systems: Insi...
Testing and Development Challenges for  Complex Cyber-Physical Systems:  Insi...Testing and Development Challenges for  Complex Cyber-Physical Systems:  Insi...
Testing and Development Challenges for Complex Cyber-Physical Systems: Insi...Sebastiano Panichella
 
An Empirical Characterization of Software Bugs in Open-Source Cyber-Physical ...
An Empirical Characterization of Software Bugs in Open-Source Cyber-Physical ...An Empirical Characterization of Software Bugs in Open-Source Cyber-Physical ...
An Empirical Characterization of Software Bugs in Open-Source Cyber-Physical ...Sebastiano Panichella
 
COSMOS: DevOps for complex cyber-physical systems (H2020 Project) - WASOS wor...
COSMOS: DevOps for complex cyber-physical systems (H2020 Project) - WASOS wor...COSMOS: DevOps for complex cyber-physical systems (H2020 Project) - WASOS wor...
COSMOS: DevOps for complex cyber-physical systems (H2020 Project) - WASOS wor...Sebastiano Panichella
 
Testing with Fewer Resources: Toward Adaptive Approaches for Cost-effective T...
Testing with Fewer Resources: Toward Adaptive Approaches for Cost-effective T...Testing with Fewer Resources: Toward Adaptive Approaches for Cost-effective T...
Testing with Fewer Resources: Toward Adaptive Approaches for Cost-effective T...Sebastiano Panichella
 
Search-based Software Testing (SBST) '22
Search-based Software Testing (SBST) '22Search-based Software Testing (SBST) '22
Search-based Software Testing (SBST) '22Sebastiano Panichella
 
"An NLP-based Tool for Software Artifacts Analysis" at @ICSME2021.
 "An NLP-based Tool for Software Artifacts Analysis" at @ICSME2021.  "An NLP-based Tool for Software Artifacts Analysis" at @ICSME2021.
"An NLP-based Tool for Software Artifacts Analysis" at @ICSME2021. Sebastiano Panichella
 
An Empirical Investigation of Relevant Changes and Automation Needs in Modern...
An Empirical Investigation of Relevant Changes and Automation Needs in Modern...An Empirical Investigation of Relevant Changes and Automation Needs in Modern...
An Empirical Investigation of Relevant Changes and Automation Needs in Modern...Sebastiano Panichella
 
Search-Based Software Testing Tool Competition 2021 by Sebastiano Panichella,...
Search-Based Software Testing Tool Competition 2021 by Sebastiano Panichella,...Search-Based Software Testing Tool Competition 2021 by Sebastiano Panichella,...
Search-Based Software Testing Tool Competition 2021 by Sebastiano Panichella,...Sebastiano Panichella
 
A Framework for Multi-source Studies based on Unstructured Data.
A Framework for Multi-source Studies based on Unstructured Data.A Framework for Multi-source Studies based on Unstructured Data.
A Framework for Multi-source Studies based on Unstructured Data.Sebastiano Panichella
 
Revisiting Test Smells in Automatically Generated Tests: Limitations, Pitfall...
Revisiting Test Smells in Automatically Generated Tests: Limitations, Pitfall...Revisiting Test Smells in Automatically Generated Tests: Limitations, Pitfall...
Revisiting Test Smells in Automatically Generated Tests: Limitations, Pitfall...Sebastiano Panichella
 
Requirements-Collector: Automating Requirements Specification from Elicitatio...
Requirements-Collector: Automating Requirements Specification from Elicitatio...Requirements-Collector: Automating Requirements Specification from Elicitatio...
Requirements-Collector: Automating Requirements Specification from Elicitatio...Sebastiano Panichella
 
Unit Testing Tool Competition-Eighth Round
Unit Testing Tool Competition-Eighth RoundUnit Testing Tool Competition-Eighth Round
Unit Testing Tool Competition-Eighth RoundSebastiano Panichella
 
Testing with Fewer Resources: An Adaptive Approach to Performance-Aware Test ...
Testing with Fewer Resources: An Adaptive Approach to Performance-Aware Test ...Testing with Fewer Resources: An Adaptive Approach to Performance-Aware Test ...
Testing with Fewer Resources: An Adaptive Approach to Performance-Aware Test ...Sebastiano Panichella
 
A Mixed Graph-Relational Dataset of Socio-technical interactions in Open Sour...
A Mixed Graph-Relational Dataset of Socio-technical interactions in Open Sour...A Mixed Graph-Relational Dataset of Socio-technical interactions in Open Sour...
A Mixed Graph-Relational Dataset of Socio-technical interactions in Open Sour...Sebastiano Panichella
 

More from Sebastiano Panichella (20)

Automated Identification and Qualitative Characterization of Safety Concerns ...
Automated Identification and Qualitative Characterization of Safety Concerns ...Automated Identification and Qualitative Characterization of Safety Concerns ...
Automated Identification and Qualitative Characterization of Safety Concerns ...
 
The 2nd Intl. Workshop on NL-based Software Engineering
The 2nd Intl. Workshop on NL-based Software EngineeringThe 2nd Intl. Workshop on NL-based Software Engineering
The 2nd Intl. Workshop on NL-based Software Engineering
 
The 16th Intl. Workshop on Search-Based and Fuzz Testing
The 16th Intl. Workshop on Search-Based and Fuzz TestingThe 16th Intl. Workshop on Search-Based and Fuzz Testing
The 16th Intl. Workshop on Search-Based and Fuzz Testing
 
Simulation-based Test Case Generation for Unmanned Aerial Vehicles in the Nei...
Simulation-based Test Case Generation for Unmanned Aerial Vehicles in the Nei...Simulation-based Test Case Generation for Unmanned Aerial Vehicles in the Nei...
Simulation-based Test Case Generation for Unmanned Aerial Vehicles in the Nei...
 
Testing and Development Challenges for Complex Cyber-Physical Systems: Insi...
Testing and Development Challenges for  Complex Cyber-Physical Systems:  Insi...Testing and Development Challenges for  Complex Cyber-Physical Systems:  Insi...
Testing and Development Challenges for Complex Cyber-Physical Systems: Insi...
 
An Empirical Characterization of Software Bugs in Open-Source Cyber-Physical ...
An Empirical Characterization of Software Bugs in Open-Source Cyber-Physical ...An Empirical Characterization of Software Bugs in Open-Source Cyber-Physical ...
An Empirical Characterization of Software Bugs in Open-Source Cyber-Physical ...
 
COSMOS: DevOps for complex cyber-physical systems (H2020 Project) - WASOS wor...
COSMOS: DevOps for complex cyber-physical systems (H2020 Project) - WASOS wor...COSMOS: DevOps for complex cyber-physical systems (H2020 Project) - WASOS wor...
COSMOS: DevOps for complex cyber-physical systems (H2020 Project) - WASOS wor...
 
Testing with Fewer Resources: Toward Adaptive Approaches for Cost-effective T...
Testing with Fewer Resources: Toward Adaptive Approaches for Cost-effective T...Testing with Fewer Resources: Toward Adaptive Approaches for Cost-effective T...
Testing with Fewer Resources: Toward Adaptive Approaches for Cost-effective T...
 
Search-based Software Testing (SBST) '22
Search-based Software Testing (SBST) '22Search-based Software Testing (SBST) '22
Search-based Software Testing (SBST) '22
 
NLBSE’22: Tool Competition
NLBSE’22: Tool CompetitionNLBSE’22: Tool Competition
NLBSE’22: Tool Competition
 
"An NLP-based Tool for Software Artifacts Analysis" at @ICSME2021.
 "An NLP-based Tool for Software Artifacts Analysis" at @ICSME2021.  "An NLP-based Tool for Software Artifacts Analysis" at @ICSME2021.
"An NLP-based Tool for Software Artifacts Analysis" at @ICSME2021.
 
An Empirical Investigation of Relevant Changes and Automation Needs in Modern...
An Empirical Investigation of Relevant Changes and Automation Needs in Modern...An Empirical Investigation of Relevant Changes and Automation Needs in Modern...
An Empirical Investigation of Relevant Changes and Automation Needs in Modern...
 
Search-Based Software Testing Tool Competition 2021 by Sebastiano Panichella,...
Search-Based Software Testing Tool Competition 2021 by Sebastiano Panichella,...Search-Based Software Testing Tool Competition 2021 by Sebastiano Panichella,...
Search-Based Software Testing Tool Competition 2021 by Sebastiano Panichella,...
 
A Framework for Multi-source Studies based on Unstructured Data.
A Framework for Multi-source Studies based on Unstructured Data.A Framework for Multi-source Studies based on Unstructured Data.
A Framework for Multi-source Studies based on Unstructured Data.
 
Revisiting Test Smells in Automatically Generated Tests: Limitations, Pitfall...
Revisiting Test Smells in Automatically Generated Tests: Limitations, Pitfall...Revisiting Test Smells in Automatically Generated Tests: Limitations, Pitfall...
Revisiting Test Smells in Automatically Generated Tests: Limitations, Pitfall...
 
Requirements-Collector: Automating Requirements Specification from Elicitatio...
Requirements-Collector: Automating Requirements Specification from Elicitatio...Requirements-Collector: Automating Requirements Specification from Elicitatio...
Requirements-Collector: Automating Requirements Specification from Elicitatio...
 
Unit Testing Tool Competition-Eighth Round
Unit Testing Tool Competition-Eighth RoundUnit Testing Tool Competition-Eighth Round
Unit Testing Tool Competition-Eighth Round
 
Cultural Exchange - ICSE 2020
Cultural Exchange - ICSE 2020Cultural Exchange - ICSE 2020
Cultural Exchange - ICSE 2020
 
Testing with Fewer Resources: An Adaptive Approach to Performance-Aware Test ...
Testing with Fewer Resources: An Adaptive Approach to Performance-Aware Test ...Testing with Fewer Resources: An Adaptive Approach to Performance-Aware Test ...
Testing with Fewer Resources: An Adaptive Approach to Performance-Aware Test ...
 
A Mixed Graph-Relational Dataset of Socio-technical interactions in Open Sour...
A Mixed Graph-Relational Dataset of Socio-technical interactions in Open Sour...A Mixed Graph-Relational Dataset of Socio-technical interactions in Open Sour...
A Mixed Graph-Relational Dataset of Socio-technical interactions in Open Sour...
 

Recently uploaded

Juan Pablo Sugiura - eCommerce Day Bolivia 2024
Juan Pablo Sugiura - eCommerce Day Bolivia 2024Juan Pablo Sugiura - eCommerce Day Bolivia 2024
Juan Pablo Sugiura - eCommerce Day Bolivia 2024eCommerce Institute
 
Dynamics of Professional Presentationpdf
Dynamics of Professional PresentationpdfDynamics of Professional Presentationpdf
Dynamics of Professional Presentationpdfravleel42
 
The Real Story Of Project Manager/Scrum Master From Where It Came?!
The Real Story Of Project Manager/Scrum Master From Where It Came?!The Real Story Of Project Manager/Scrum Master From Where It Came?!
The Real Story Of Project Manager/Scrum Master From Where It Came?!Loay Mohamed Ibrahim Aly
 
Burning Issue presentation of Zhazgul N. , Cycle 54
Burning Issue presentation of Zhazgul N. , Cycle 54Burning Issue presentation of Zhazgul N. , Cycle 54
Burning Issue presentation of Zhazgul N. , Cycle 54ZhazgulNurdinova
 
Machine learning workshop, CZU Prague 2024
Machine learning workshop, CZU Prague 2024Machine learning workshop, CZU Prague 2024
Machine learning workshop, CZU Prague 2024Gokulks007
 
Communication Accommodation Theory Kaylyn Benton.pptx
Communication Accommodation Theory Kaylyn Benton.pptxCommunication Accommodation Theory Kaylyn Benton.pptx
Communication Accommodation Theory Kaylyn Benton.pptxkb31670
 
ISO 25964-1Working Group ISO/TC 46/SC 9/WG 8
ISO 25964-1Working Group ISO/TC 46/SC 9/WG 8ISO 25964-1Working Group ISO/TC 46/SC 9/WG 8
ISO 25964-1Working Group ISO/TC 46/SC 9/WG 8Access Innovations, Inc.
 
Communication Accommodation Theory Kaylyn Benton.pptx
Communication Accommodation Theory Kaylyn Benton.pptxCommunication Accommodation Theory Kaylyn Benton.pptx
Communication Accommodation Theory Kaylyn Benton.pptxkb31670
 

Recently uploaded (8)

Juan Pablo Sugiura - eCommerce Day Bolivia 2024
Juan Pablo Sugiura - eCommerce Day Bolivia 2024Juan Pablo Sugiura - eCommerce Day Bolivia 2024
Juan Pablo Sugiura - eCommerce Day Bolivia 2024
 
Dynamics of Professional Presentationpdf
Dynamics of Professional PresentationpdfDynamics of Professional Presentationpdf
Dynamics of Professional Presentationpdf
 
The Real Story Of Project Manager/Scrum Master From Where It Came?!
The Real Story Of Project Manager/Scrum Master From Where It Came?!The Real Story Of Project Manager/Scrum Master From Where It Came?!
The Real Story Of Project Manager/Scrum Master From Where It Came?!
 
Burning Issue presentation of Zhazgul N. , Cycle 54
Burning Issue presentation of Zhazgul N. , Cycle 54Burning Issue presentation of Zhazgul N. , Cycle 54
Burning Issue presentation of Zhazgul N. , Cycle 54
 
Machine learning workshop, CZU Prague 2024
Machine learning workshop, CZU Prague 2024Machine learning workshop, CZU Prague 2024
Machine learning workshop, CZU Prague 2024
 
Communication Accommodation Theory Kaylyn Benton.pptx
Communication Accommodation Theory Kaylyn Benton.pptxCommunication Accommodation Theory Kaylyn Benton.pptx
Communication Accommodation Theory Kaylyn Benton.pptx
 
ISO 25964-1Working Group ISO/TC 46/SC 9/WG 8
ISO 25964-1Working Group ISO/TC 46/SC 9/WG 8ISO 25964-1Working Group ISO/TC 46/SC 9/WG 8
ISO 25964-1Working Group ISO/TC 46/SC 9/WG 8
 
Communication Accommodation Theory Kaylyn Benton.pptx
Communication Accommodation Theory Kaylyn Benton.pptxCommunication Accommodation Theory Kaylyn Benton.pptx
Communication Accommodation Theory Kaylyn Benton.pptx
 

Android Apps and User Feedback: A Dataset or Software Evolution and Quality Improvement

  • 1. AndroidAppsand User Feedback: ADatasetfor Software Evolutionand Quality Improvement WorkshoponAppMarketAnalytics -WAMA2017 G.Grano,A. Di Sorbo, F. Mercaldo, C.Visaggio G. Canfora, S. Panichella ✉ grano@ifi.uzh.ch giograno90
  • 2. OUTLINE → Context → Motivationand relevance → Description ofthe dataset → Enabled Research Giovanni Grano @ s.e.a.l. 2
  • 3. Google Play Store 3 millions ofapps 65 billions ofdownloads ~ 13$ billions revenues Giovanni Grano @ s.e.a.l. 3
  • 4. AppStores → newparadigm rich source ofinformation: appdescriptions, changelogs user reviews Giovanni Grano @ s.e.a.l. 4
  • 5. Findings from mobile store: DirectandActionable impacts forappdeveloperteams1 1 Martin, Sarro, Jia, Zhang, Harman, A Survey of App Store Analysis for Software Engineering, TSE 16 Giovanni Grano @ s.e.a.l. 5
  • 6. Initialresearch focused on classification2 and summarization3 ofuser reviews 3 Di Sorbo, Panichella, Alexandru, Shimagaki, Visaggio, Canfora,Gall, What would users change in my app? Summarizing app reviews for recommending software changes, FSE 16 2 Panichella, Di Sorbo, Guzman, Visaggio, Canfora, Gall, How can i improve my app? Classifying user reviews for software maintenance and evolution, ICSME 15 Giovanni Grano @ s.e.a.l. 6
  • 7. Evolution is guided by requests in user reviews [4, 5]; stores lack functionalities. [4] Palomba, Salza, Ciurumelea, Panichella, Gall, Ferrucci, De Lucia, Recommending and localizing change requests for mobile apps based on user reviews, ICSE 17. [5] Palomba, Linares-Vásquez, Bavota, Oliveto, Di Penta, Poshyvanyk, De Lucia, User reviews matter! Tracking crowdsourced reviews to support evolution of successful apps, ICSME 15
  • 8. Our Dataset: ~280k user reviews, 395 applications, 22 code quality metrics, 8 code smells
  • 9. Dataset Construction. We built the dataset in two phases: → Data Collection: F-Droid + Google Play Store → Analysis Phase: classification + APK analysis
  • 10. Data Collection → F-Droid: crawler for metadata, ~1,929 apps → Play Store Matching: removed apps that were not matched or were older than 2014
  • 11. Data Collection → Review Crawler: mining reviews for 965 apps → Version Matching: based on release and post date → Filtering: versions with fewer than 10 reviews. 288k reviews for 629 versions of 395 apps!
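
The version-matching and filtering steps on this slide can be illustrated with a minimal sketch. The slide only says matching is "based on release and post date"; the rule used here (a review belongs to the latest version released on or before its post date) is an assumption, and the function names are hypothetical rather than taken from the authors' crawler.

```python
from collections import defaultdict
from datetime import date

MIN_REVIEWS = 10  # threshold from the slide: versions with fewer reviews are dropped


def match_reviews_to_versions(releases, review_dates):
    """Group review post dates by app version.

    releases: list of (version, release_date) tuples, in any order.
    review_dates: list of review post dates.
    Assumed rule: a review is assigned to the latest version whose
    release date is on or before the review's post date.
    """
    ordered = sorted(releases, key=lambda r: r[1])
    matched = defaultdict(list)
    for posted in review_dates:
        current = None
        for version, released in ordered:
            if released <= posted:
                current = version
            else:
                break
        if current is not None:  # reviews older than the first release stay unmatched
            matched[current].append(posted)
    return dict(matched)


def filter_versions(matched, min_reviews=MIN_REVIEWS):
    """Keep only versions that have at least min_reviews matched reviews."""
    return {v: r for v, r in matched.items() if len(r) >= min_reviews}
```

A version with twelve matched reviews would survive `filter_versions`, while one with three would be dropped.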
  • 12. Analysis → User Reviews Classification » Two-level taxonomy → Code Analysis » Code Quality Indicators » Code Smells
  • 13. User Reviews Classification: URM Taxonomy Model. Two-level taxonomy » Intention: ARDOC [6], a review classifier based on NLP + SA + TA » Topic: SURF [3], a topic classifier based on topic-related keywords and n-grams. [3] Di Sorbo, Panichella, Alexandru, Shimagaki, Visaggio, Canfora, Gall, What would users change in my app? Summarizing app reviews for recommending software changes, FSE 16. [6] Panichella, Di Sorbo, Guzman, Visaggio, Canfora, Gall, ARdoc: app reviews development oriented classifier, FSE 16
  • 14. Intention Categories
      Category             Definition
      Information Giving   Informs users or developers about app aspects
      Information Seeking  Attempts to obtain information or help
      Feature Request      Expresses ideas or suggestions for enhancing the app
      Problem Discovery    Reports unexpected behaviour or issues
      Other                Anything not in the previous categories
  • 15. Examples
      Problem Discovery, Update/Version: "I can’t access my SD card with the new update which makes this app and the ery money I donated worthless."
      Feature Request, Feature/Functionality: "I would give 5 stars if there was a way to move emails from the delete folder back into the inbox folder."
  • 16. Some numbers... Sentences per topic and intention (FR = Feature Request, PD = Problem Discovery, IS = Information Seeking, IG = Information Giving)
      Topic           Sentences        FR        PD        IS        IG     Other
      App               117,409     4,879    11,089     1,600    11,943    87,898
      GUI                37,620     3,381     5,034       705     3,560    24,940
      Contents           16,819     1,315     1,973       434     1,620    11,477
      Download            7,853       333     1,346       363       830     4,981
      Company             1,672       118       190        57       152     1,155
      Feature           173,847    15,480    27,810     4,342    14,972   111,243
      Improvement         8,281     1,005       304        54       755     6,163
      Pricing             4,016       142       216        62       559     3,037
      Resources           3,071       155       375        50       263     2,228
      Update/Version     21,669     1,358     3,886       548     2,423    13,454
      Model              22,044     1,308     3,397       459     2,055    14,825
      Security            2,392       212       313        65       218     1,584
      Other             189,784       630     2,019     1,402     2,842   182,891
      TOTAL             606,477    30,316    57,952    10,141    42,192   465,876
  • 17. Code Analysis: APKs → apktool → smali bytecode; smali bytecode → Python scripts → metrics. Available metrics @ GitHub wiki
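
As a rough illustration of the "smali bytecode → Python scripts → metrics" step, the sketch below counts a few dimensional metrics in one decoded smali file. It is not the authors' script: the function name, the choice of metrics, and the sample class `Lorg/example/Note;` are all hypothetical. It assumes the APK has already been decoded into `.smali` files with apktool, and relies only on the standard `.class`, `.field`, `.method` and `.end method` directives.

```python
# Hypothetical smali snippet, as apktool might produce for a small class.
SAMPLE = """\
.class public Lorg/example/Note;
.super Ljava/lang/Object;

.field private title:Ljava/lang/String;

.method public getTitle()Ljava/lang/String;
    .locals 1
    iget-object v0, p0, Lorg/example/Note;->title:Ljava/lang/String;
    return-object v0
.end method
"""


def dimensional_metrics(smali_text):
    """Count fields, methods and instructions in the text of one smali file."""
    metrics = {"class": None, "methods": 0, "fields": 0, "instructions": 0}
    in_method = False
    for raw in smali_text.splitlines():
        line = raw.strip()
        if line.startswith(".class "):
            # The class descriptor is the last token of the .class directive.
            metrics["class"] = line.split()[-1]
        elif line.startswith(".field "):
            metrics["fields"] += 1
        elif line.startswith(".method "):
            metrics["methods"] += 1
            in_method = True
        elif line.startswith(".end method"):
            in_method = False
        elif in_method and line and not line.startswith((".", "#", ":")):
            # Inside a method, anything that is not a directive,
            # a comment or a label is counted as an instruction.
            metrics["instructions"] += 1
    return metrics
```

Calling `dimensional_metrics(SAMPLE)` yields per-class counts that could then be aggregated per app version.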
  • 18. Code Metrics → Dimensional Metrics → Complexity Metrics → Object-Oriented Metrics → Android-Oriented Metrics
  • 19. Code Analysis: smali bytecode → Paprika → smells » Blob Class (BLOB) » Swiss Army Knife (SAK) » Long Method (LM) » Complex Class (CC) » Internal Getter/Setter (IGS) » Member Ignoring Method (MIM) » No Low Memory Resolver (NLMR) » Leaking Inner Class (LIC). Code smells @ GitHub wiki
  • 20. Data Sharing → CSV Files → Relational Database
  • 21. CSV Files
      → Versions: id, package name, category, version, release date
      1125,org.tomdroid,Productivity,0.7.5,January 16 2014
      → Reviews: id, package name, text, category, version, release date, stars, version id
      7bd1c70a-afc9-11e6-93ea-c4b301cdf627 org.tomdroid Don't sync it online. The whole app crashed. I had to reinstall it. Lost my notes. As long as you keep it in ur sd card it works good August 24 2015 3 1125
  • 22. → Sentences: id, text, intention, topic
      7bd1c70a-afc9-11e6-93ea-c4b301cdf627 Don't sync it online. INFORMATION GIVING, Other
      7bd1c70a-afc9-11e6-93ea-c4b301cdf627 The whole app crashed. PROBLEM DISCOVERY, App
      7bd1c70a-afc9-11e6-93ea-c4b301cdf627 I had to reinstall it. OTHER, App-Update/Version
      7bd1c70a-afc9-11e6-93ea-c4b301cdf627 Lost my notes. OTHER, Contents-Feature/Functionality
      7bd1c70a-afc9-11e6-93ea-c4b301cdf627 As long as you keep it in ur sd card it works good OTHER, Feature/Functionality
  • 23. → User Metrics: id, package name, no. reviews, no. sentences, rating, FR, %FR, PD, %PD → Code Metrics: id, package name, <all metric names> → Code Smells: id, package name, <all smell names>
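
To show how the user-metrics columns relate to the sentence records, here is a minimal sketch that derives the counts from (review id, intention) pairs. Two things are assumptions: that %FR and %PD are shares of sentences rather than of reviews, and that feature requests carry the label `FEATURE REQUEST` (the slides only show the `INFORMATION GIVING`, `PROBLEM DISCOVERY` and `OTHER` labels). The function and key names are hypothetical.

```python
from collections import Counter


def user_metrics(sentences):
    """Summarise classified sentences for one app version.

    sentences: iterable of (review_id, intention) pairs.
    Percentages are computed over sentences (an assumption).
    """
    sentences = list(sentences)
    n_sentences = len(sentences)
    n_reviews = len({review_id for review_id, _ in sentences})
    counts = Counter(intention for _, intention in sentences)
    fr = counts["FEATURE REQUEST"]   # assumed label string
    pd = counts["PROBLEM DISCOVERY"]

    def pct(n):
        return 100.0 * n / n_sentences if n_sentences else 0.0

    return {
        "no_reviews": n_reviews,
        "no_sentences": n_sentences,
        "FR": fr,
        "pct_FR": pct(fr),
        "PD": pd,
        "pct_PD": pct(pd),
    }
```

Applied to the five example sentences from the org.tomdroid review above, this counts one review, five sentences and one problem-discovery sentence.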
  • 26. Understanding how code quality affects reviews and rating for different categories
  • 27. Observe consequences on code quality while integrating user feedback into the app code base
  • 28. Study co-evolution trends of quality metrics, code smells and user feedback for sequential releases
  • 29. Thanks for your attention. Dataset @ GitHub ✉ grano@ifi.uzh.ch giograno90