Software	Estimation	– towards	prescriptive	analytics
Przemyslaw Pospieszny,	PhD
IWSM	Mensura	Conference,	October	24-26,	2017
Bio
Research	Scientist,	Data	Science	Manager,	Project	Manager
• MSc	in	Business	Computing	(2006)	– Poznan	University	of	Economics
• PgD in	Knowledge	Management	(2007)	– Dublin	Institute	of	Technology
• Visiting	Scholar	(2014)	– Florida	Atlantic	University
• PhD	in	Economics	(2016)	– Warsaw	School	of	Economics
• PMP	Certified	(2011)
10 years of professional experience in business analysis, project management and business intelligence – AIB, Aviva, PKO, DNB, Raiffeisen
Currently	establishing	Data	Lab	for	Jones	Lang	LaSalle	(JLL)	- EMEA
Research	interests:	Applicability	of	machine	learning	algorithms,	in	particular	for	
software	estimation	and	smart	cities
Current state of software development
Challenges for software estimation
Agile/	hybrid	methodologies
Rapid,	continuous	delivery
Vague,	changing	requirements
High	uncertainty:
• Product	features
• Budget
• Timeframe
• Quality
Need for techniques and tools that enable scenario planning and dynamic adaptation to a changing environment!
Leaner,	faster	and	more	dynamic!
Common scenario
Limitations of existing estimation techniques
Techniques:
• Expert estimation (PERT, Delphi, Planning Poker)
• Estimation by analogy
• Parametric models (COCOMO II, SLIM, SEER-SEM)
• Size-based estimation models (FPA, Use Case)
• Decomposition and bottom-up (WBS-based, user stories)
Limitations:
• Expert knowledge
• Subjective choice of comparison criterion
• Difficult to perform in a changing environment with limited information
• Code reuse, libraries, codeless programming and agile development
• Requires training; may not be applicable for baseline estimation
Data Science?
• Growing	applicability	of	advanced	analytics	in	various	industries
• Availability	of	data	
• State-of-the-art machine learning algorithms
• Automation and feedback mechanism
• Exceptional at tackling uncertainty
• Open	source:	R,	Python,	KNIME
Data Science for software estimation
• Researched for the last two decades, primarily for:
- Effort and duration estimation
- Monitoring
- Quality
- Maintenance
• Various techniques and algorithms applied – classification, regression, ensembling, data preparation, machine learning algorithms
• Datasets – ISBSG, COCOMO, NASA, SourceForge, PROMISE Software Engineering Repository
• Emphasis on prediction accuracy of effort and duration models
• Exceptional results, although few or no implementations within organisations
Wen, J., Li, S., Lin, Z., Hu, Y. and Huang, C. (2012). Systematic literature review of machine learning based software development effort estimation models. Information and Software Technology. 54.
Effort and duration estimation model - example
[Figure: estimation pipeline]
• Stages: ISBSG dataset; data understanding; data preparation; feature selection; input dataset; 3-fold cross-validation; SVM, MLP and GLM models; ensemble aggregation; evaluation
• Techniques named in the figure: data selection & cleaning; transformation, normalization; Pearson correlation; LASSO, stepwise regression
• Evaluation metrics: MMRE, PRED, MMER, MBRE
Pospieszny,	P.,	Czarnacka-Chrobot,	B.	and	Kobyliński,	A.	(2017).		An	effective	approach	for	software	project	effort	and	duration	estimation	with	machine	learning	algorithms	– working	paper
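The evaluation metrics named on this slide can be sketched as follows. This is a minimal illustration using the commonly cited definitions (MRE relative to the actual value, MER relative to the predicted value, BRE relative to the smaller of the two). The function names, the 0.25 threshold for PRED and the sample values are assumptions made for this sketch, not taken from the slides.

```python
from statistics import mean


def mre(actual, predicted):
    """Magnitude of relative error, relative to the actual value."""
    return abs(actual - predicted) / actual


def mmre(actuals, predictions):
    """Mean magnitude of relative error."""
    return mean(mre(a, p) for a, p in zip(actuals, predictions))


def pred(actuals, predictions, level=0.25):
    """Share of projects whose MRE is at most `level` (assumed default 0.25)."""
    hits = sum(1 for a, p in zip(actuals, predictions) if mre(a, p) <= level)
    return hits / len(actuals)


def mmer(actuals, predictions):
    """Mean magnitude of error relative to the estimate."""
    return mean(abs(a - p) / p for a, p in zip(actuals, predictions))


def mbre(actuals, predictions):
    """Mean balanced relative error (error relative to the smaller value)."""
    return mean(abs(a - p) / min(a, p) for a, p in zip(actuals, predictions))


# Illustrative effort values in person-hours (made up for this sketch)
actual_effort = [100.0, 200.0, 400.0, 800.0]
predicted_effort = [80.0, 250.0, 400.0, 1000.0]

print("MMRE:", mmre(actual_effort, predicted_effort))
print("PRED(0.25):", pred(actual_effort, predicted_effort))
print("MMER:", mmer(actual_effort, predicted_effort))
print("MBRE:", mbre(actual_effort, predicted_effort))
```

Lower MMRE, MMER and MBRE indicate better accuracy, while a higher PRED is better.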
Go beyond!
• Descriptive analytics – What happened? Reports, dashboards
• Diagnostic analytics – Why did it happen? Root cause analysis, correlations, associations
• Predictive analytics – What will happen? Data mining, forecasting
• Prescriptive analytics – How can we make it happen? Machine learning, simulation, optimization
Hindsight → Insight → Foresight
Laney, D., Bitterer, A., Sallam, R.L. and Kart, L. (2012). Predicts 2013: Information Innovation. Gartner Research.
Prescriptive Analytics
• Final	frontier	of	business	analytics	– IBM,	2010
• Scenario	planning	and	smart	foresight
• Performed against a given set of goals, limitations and constraints; these define the scenarios, which in turn provide foresight into the best outputs (a set of alternative actions or decisions)
• Applies a combination of techniques and approaches, with emphasis on machine learning algorithms
• Software Process Simulation – similar principle, but a different approach, techniques and algorithms (system dynamics, discrete event simulation, Petri nets and Monte Carlo)
• Enhanced planning, optimization of resources and increased project success rate
Lustig, I., Dietrich, B., Johnson, C. and Dziekan, C. (2010). The analytics journey. Analytics Magazine. 3, 11–13.
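Monte Carlo is listed above among the traditional software process simulation techniques. A minimal sketch of the idea: draw task durations from triangular distributions and estimate the probability of finishing within a target. The task names, the three-point estimates, the sequential-execution assumption and the targets are all hypothetical and chosen only for illustration.

```python
import random

# Hypothetical three-point estimates per task, in working days:
# (optimistic, most likely, pessimistic)
TASKS = {
    "analysis": (5, 8, 15),
    "development": (20, 30, 55),
    "testing": (8, 12, 25),
}


def simulate_total_duration(tasks, rng):
    """Draw one possible project duration, assuming tasks run sequentially."""
    return sum(rng.triangular(low, high, mode) for low, mode, high in tasks.values())


def probability_within(tasks, target_days, runs=10_000, seed=42):
    """Estimate the probability that the total duration stays within target_days."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    hits = sum(simulate_total_duration(tasks, rng) <= target_days for _ in range(runs))
    return hits / runs


p_60 = probability_within(TASKS, target_days=60)
p_100 = probability_within(TASKS, target_days=100)
print(f"P(duration <= 60 days) ~ {p_60:.2f}")
print(f"P(duration <= 100 days) ~ {p_100:.2f}")
```

The same loop could wrap a trained prediction model instead of fixed distributions, which is one way to read the "similar principle, different approach" remark on this slide.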
Five pillars
• Integrated predictions & prescriptions: both must work synergistically for prescriptive analytics to deliver accurate results
• Prescriptions & side effects: prescriptions based on advanced analytic approaches that include a process of generating the best course of action for defined goals, constraints and decision variables
• Adaptive algorithms: flexible algorithms that enable re-predictions and re-prescriptions and can handle noisy input data – preferably machine learning algorithms
• Feedback mechanism: any actions based on prescriptions should be recorded for further use to deliver better actions – automated learning process
• Hybrid data: structured & unstructured data, including feeds from external sources (environmental, economic data etc.)
Basu, A. (2013). Five pillars of prescriptive analytics success. Executive Edge. 8–11.
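The feedback mechanism pillar can be illustrated with a very small sketch: each prescribed action is recorded together with its predicted and actual effort, and the log is then used to correct later predictions. The class name, the ratio-based correction and the sample records are assumptions for illustration; an adaptive machine learning model would more likely be retrained on the recorded outcomes.

```python
from dataclasses import dataclass, field
from statistics import mean


@dataclass
class FeedbackLog:
    """Stores (action, predicted effort, actual effort) records for later reuse."""
    records: list = field(default_factory=list)

    def record(self, action, predicted_effort, actual_effort):
        self.records.append((action, predicted_effort, actual_effort))

    def calibration_factor(self):
        """Mean ratio of actual to predicted effort; 1.0 while the log is empty."""
        if not self.records:
            return 1.0
        return mean(actual / predicted for _, predicted, actual in self.records)

    def adjust(self, raw_prediction):
        """Correct a new prediction using the outcomes recorded so far."""
        return raw_prediction * self.calibration_factor()


log = FeedbackLog()
# Hypothetical prescriptions and their outcomes, in person-hours
log.record("add one senior developer", predicted_effort=400.0, actual_effort=500.0)
log.record("drop reporting module", predicted_effort=200.0, actual_effort=220.0)

adjusted = log.adjust(300.0)
print(f"Raw prediction 300.0 h, adjusted: {adjusted:.1f} h")
```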
Machine learning
• Exceptional at handling varied and noisy data in uncertain environments
• Unsupervised vs. Supervised learning (+ Reinforcement)
• Types of problems – association rules, clustering, regression, classification
• Algorithms – Neural Networks, Deep Learning, Support Vector Machines, Decision Trees, Generalized Linear Models
• Optimization and learning mechanism
Areas of software estimation – applicability
Area: Baseline estimation
Description: Scenario planning based on fixed effort and/or duration. The aim is to define project and product characteristics that will ensure completion of the project, phase or sprint within the determined effort and timeframe.
Sample questions:
• Which resources, based on their skillset, should be allocated?
• What size and quality will the product have?
• Which development methodology should be used?
Area: Monitoring
Description: Identify any deviations from baselines that may impact successful completion of the project, phase, sprint or even task, and propose corrective actions by adjusting project or product characteristics (fixed effort and/or duration).
Sample questions:
• Which additional resources should be allocated?
• Which product features need to be sacrificed?
• How will activities that reduce effort or duration overrun impact product quality?
Area: Quality
Description: Define project and product features that will ensure delivery of a product with the determined baseline quality.
Sample questions:
• Which resources will ensure delivery of a high-quality product?
• What architecture, development platform or programming language should be applied?
• Which software development and testing methodologies should be used?
Area: Maintenance
Description: Define project and product characteristics by determining the maintenance effort of the product to be developed.
Sample questions:
• What quality of product should be delivered?
• Which skilled resources should be allocated?
• What development and testing methodology should be applied?
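The Monitoring area above starts from detecting deviations from a baseline. A minimal sketch of that first step: compare cumulative actual effort with the cumulative baseline per sprint and flag periods outside a tolerance. The function name, the 10% tolerance and the hours are assumptions for illustration; proposing corrective actions would then be the job of the prescriptive models.

```python
def effort_deviations(baseline, actual, tolerance=0.10):
    """Return (period, relative deviation) for every period whose cumulative
    effort deviates from the cumulative baseline by more than `tolerance`."""
    flagged = []
    cumulative_baseline = 0.0
    cumulative_actual = 0.0
    for period, (planned, spent) in enumerate(zip(baseline, actual), start=1):
        cumulative_baseline += planned
        cumulative_actual += spent
        deviation = (cumulative_actual - cumulative_baseline) / cumulative_baseline
        if abs(deviation) > tolerance:
            flagged.append((period, round(deviation, 3)))
    return flagged


# Hypothetical person-hours per sprint
baseline_hours = [100, 100, 100, 100]
actual_hours = [105, 120, 135, 100]

alerts = effort_deviations(baseline_hours, actual_hours)
for sprint, deviation in alerts:
    print(f"Sprint {sprint}: cumulative effort is {deviation:+.1%} against baseline")
```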
Use Case#1 - Baseline estimation
Objective:	Complete	project	within	12	months
Questions:
• What	effort	is	involved?	
• Which	resources	should	be	applied?
Key	metrics:
• Decision	variables:	effort,	resource	types	&	volume
• Constraints:	duration,	product	characteristics,	development	methodology
Approach:
Run multiple ML predictive models with different scenarios in order to obtain the optimal solution:
1. Vary resource types & volume, and also effort, in order to achieve a predicted duration of ~12 months, OR
2. Define multiple resource scenarios and predict effort (remaining variables as constraints)
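The first approach above can be sketched as a small scenario search: enumerate combinations of effort and team size, predict the duration for each, and keep the scenarios that meet the 12-month objective. The function `predict_duration_months` is a hypothetical stand-in for a trained ML model; its formula, the 140 hours per person-month and the candidate values are assumptions made only for this illustration.

```python
from itertools import product

TARGET_MONTHS = 12.0


def predict_duration_months(effort_hours, team_size):
    """Stand-in for a trained duration model (hypothetical formula):
    larger teams deliver more hours per month but lose some efficiency."""
    productive_hours = team_size * 140 / (1 + 0.05 * (team_size - 1))
    return effort_hours / productive_hours


def feasible_scenarios(efforts, team_sizes, target=TARGET_MONTHS):
    """Evaluate every (effort, team size) scenario and keep those within target."""
    scenarios = []
    for effort, team in product(efforts, team_sizes):
        duration = predict_duration_months(effort, team)
        if duration <= target:
            scenarios.append({"effort": effort, "team": team, "duration": duration})
    # Prefer the smallest team, then the duration closest to the target
    return sorted(scenarios, key=lambda s: (s["team"], target - s["duration"]))


options = feasible_scenarios(efforts=[8000, 10000], team_sizes=[4, 6, 8])
best = options[0]
print(f"Best scenario: {best['team']} people, {best['effort']} h, "
      f"{best['duration']:.1f} months")
```

The ranking rule (smallest team first) is one possible choice; cost or skill mix could be used instead.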
UC#2 - Task Estimation
Objective:	Complete	tasks	within	a	Scrum	Sprint	(2-4	weeks)	or	Release	(1-3	months)
Questions:
• How	many	story	points/size?	
• Which	resources	should	be	applied?
• Which	development	methodology	should	be	used?
Key	metrics:
• Decision	variables:	Story	points/	size,	resource	types	&	volume
• Constraints:	duration,	functionalities	to	be	delivered,	task	characteristics
Approach:
Define multiple scenarios for resources and development methodologies (for releases), and predict story points/size for each task or sprint
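A minimal sketch of this approach for a single sprint: define several team scenarios, predict how many story points each can absorb, and keep those that cover the planned backlog. The per-role velocities, the scenario names and `predict_story_points` are hypothetical; in practice the prediction would come from a trained model rather than a lookup table.

```python
# Hypothetical velocity per role, in story points per sprint
POINTS_PER_SPRINT = {"senior": 10, "mid": 7, "junior": 4}


def predict_story_points(team):
    """Stand-in for a trained model: story points one sprint can absorb."""
    return sum(POINTS_PER_SPRINT[role] * count for role, count in team.items())


def scenarios_covering(backlog_points, team_scenarios):
    """Return the names of team scenarios whose predicted capacity covers the backlog."""
    return [
        name
        for name, team in team_scenarios.items()
        if predict_story_points(team) >= backlog_points
    ]


TEAM_SCENARIOS = {
    "lean": {"senior": 1, "mid": 1, "junior": 1},
    "balanced": {"senior": 1, "mid": 2, "junior": 1},
    "senior-heavy": {"senior": 2, "mid": 1, "junior": 0},
}

viable = scenarios_covering(backlog_points=26, team_scenarios=TEAM_SCENARIOS)
print("Scenarios that cover a 26-point sprint backlog:", viable)
```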
UC#3 – Change request (release, project)
Objective:	Absorb	changes	in	release/	project	scope	and	deliver	functionalities/	product	
within	baseline	duration	
Questions:
• How	many	sprints/	iterations	and	at	what	duration?
• How	many	story	points/size?	
• Which	additional	resources	should	be	applied?
• Which	functionalities/	user	stories	should	be	dropped?	
Key	metrics:
• Decision	variables:	Resource	types	&	volume,	story	points/	size,	user	stories/	functionalities
• Constraints: Duration, quality metrics, development methodology and framework, architecture, programming languages etc.
Approach:
Define multiple scenarios for sprints/iterations and resources, and predict story points/size for the release or project
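The question of which user stories to drop can be illustrated with a small optimization sketch. Note that this swaps in a classic 0/1 knapsack formulation, which the slides do not propose: given a story-point capacity (which could come from the predictive models), keep the set of stories with the highest total business value. The story names, point sizes, value scores and capacity are all hypothetical.

```python
def select_stories(stories, capacity):
    """0/1 knapsack: choose user stories that maximise business value without
    exceeding the story-point capacity. Returns (kept, dropped) name lists."""
    # best[c] = (total value, chosen story names) achievable within c points
    best = [(0, [])] * (capacity + 1)
    for name, points, value in stories:
        # Iterate capacities downwards so each story is used at most once
        for c in range(capacity, points - 1, -1):
            candidate_value = best[c - points][0] + value
            if candidate_value > best[c][0]:
                best[c] = (candidate_value, best[c - points][1] + [name])
    kept = best[capacity][1]
    dropped = [name for name, _, _ in stories if name not in kept]
    return kept, dropped


# (name, story points, business value); the last entry is the change request
STORIES = [
    ("login", 5, 8),
    ("reports", 8, 6),
    ("export", 3, 5),
    ("audit log", 5, 4),
    ("change request", 8, 9),
]

kept, dropped = select_stories(STORIES, capacity=16)
print("Keep:", kept)
print("Drop:", dropped)
```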
Open questions
• Data
- Availability – more granular data required than for predictive analytics
- Volume, quality and completeness
- Data preparation approach!
- External data – working days, planned vacations, flu index etc.
• Cost vs benefits of implementation
• Implementation guidelines – simplicity!
• Traditional simulation techniques vs. machine learning
• Need for hybrid approach?
Further research
• Simulation	using	machine	learning	algorithms	– approach	&	algorithms
• Integration with traditional simulation techniques – probabilistic & rule-based
• Proof	of	concept within	chosen	organisations
• Integration with existing project management, issue and bug tracking software or development of standalone tool
Towards	dynamic	planning,	optimal	utilization	of	resources	and	ultimately	
increasing	project	success	rate!
Thank you!
linkedin.com/in/ppospieszny
p.pospieszny@gmail.com
