Software Analytics
1. MAC6912 - Ambientes de Desenvolvimento de Software
Professor Marco Aurélio Gerosa
Ana Paula Oliveira Bertholdo
2. •4 papers:
–Analytics for Software Development
(Buse & Zimmermann, 2010)
–Software Analytics as a Learning Case in Practice: Approaches and Experiences
(Zhang et al., 2011)
–Analyze This! 145 Questions for Data Scientists in Software Engineering
(Begel & Zimmermann, 2014)
–What’s Next in Software Analytics
(Hassan et al., 2013)
3. •Software engineering is a data rich activity.
•Artifacts of a project’s development can be measured with a high degree of automation, efficiency, and granularity.
•Projects can be measured throughout their life-cycle.
4. •SW development continues to be risky and unpredictable.
•It is not unusual for major development efforts to experience large delays or failures.
5. •Substantial disconnect between
–(A) the information needed by project managers to make good decisions and
–(B) the information currently delivered by existing tools.
–At its root:
•Problem: real-world information needs of project managers are not well understood by the research community.
•Research has ignored the needs of managers and has instead focused on the information needs of developers.
6. •Data needs go unmet when tools
–are unavailable,
–are too difficult to use,
–are too difficult to interpret, or
–simply do not present useful or actionable information.
•Managers must then rely primarily on past experience and intuition for critical decision making.
7. •The data-centric style of decision making is known as analytics.
•The idea is to leverage large amounts of data into real and actionable insights.
9. •Transition isn’t easy!
•Insight necessarily requires
–knowledge of the domain coupled with the
–ability to identify patterns involving multiple indicators.
10. •Managers may be too busy or may simply lack the quantitative skills or analytic expertise to fully leverage advanced analytical applications.
•One possibility is that tools should be created with this in mind.
•Another possibility is the addition of an analytic professional to the software development team.
12. •Conclusion
–All resources, especially talent, are always constrained.
–This underscores the importance of careful and deliberate decision making by the managers of software projects.
–The observation that software projects continue to be risky and unpredictable despite being highly measurable implies that more analytic information should be leveraged toward decision making.
–In this paper, the researchers
•described how software analytics can help managers move from low-level measurements to high-level insights about complex projects.
•advocated more research into the information needs and decision process of managers.
•discussed how the complexity of software development suggests that dedicated analytic professionals with both quantitative skills and domain knowledge might provide great benefit to future projects.
13. •Researchers (Microsoft Research Asia) advocate that when applying analytic technologies in practice one should:
–(1) incorporate a broad spectrum of domain knowledge and expertise,
•e.g., management, machine learning, large-scale data processing and computing, and information visualization; and
–(2) investigate how practitioners take actions on the produced information, and provide effective support for such information-based action taking.
14. –Various analytic technologies
•(data mining, machine learning, and information visualization).
–Software analytics aims to enable software practitioners to perform data exploration and analysis in order to obtain insightful and actionable information.
–Insightful information
•meaningful and useful understanding or knowledge towards performing the target task.
–Actionable information
•upon which software practitioners can come up with concrete solutions towards completing the target task.
15. •Developing a software analytic project typically goes through iterations of the life cycle of four phases:
1) task definition,
2) data preparation,
3) analytic-technology development, and
4) deployment and feedback gathering.
16. •Task definition is to define the target task to be assisted by software analytics
–pull model: StackMine -> performance analysis
–push model: XIAO -> refactoring and defect detection
17. •Data preparation is to collect data to be analyzed.
–Two types of infrastructure support: existing ones in industry and in-house ones.
–StackMine -> existing Microsoft infrastructure support.
–XIAO -> in-house code analysis.
18. •Analytic-technology development is to develop problem formulation, algorithms, and systems to explore, understand, and get insights from the data.
–The SA team needs to acquire deep knowledge about the data (including its format and semantics) and target tasks.
–The time this acquisition process takes may be non-trivial.
19. •Deployment and feedback gathering involves two typical scenarios.
–1: the researchers have obtained some insightful information from the data and they ask domain experts to review and verify.
–2: the researchers ask domain experts to use the analytic tools to obtain insights by themselves.
•“The more the customers use the tools, the ‘smarter’ the tools become.”
20. •Domain knowledge and expertise are strongly needed in successfully developing a software analytic project for technology transfer.
•Types of domain knowledge:
–Specific application domain knowledge (customers).
–Common application domain knowledge (family of SW applications).
–Data domain knowledge (data preparation).
21. •Types of expertise:
–Task expertise
•work with the customers to learn the workflow.
–Management expertise
•good management and communication skills to interact with the customers and manage the team.
–Machine learning expertise
•to develop machine learning algorithms and tools (not just in a black-box way).
–Large-scale data processing/computing expertise
•to design and implement scalable data processing tools and learning tools.
–Information visualization expertise
•to design and implement good user interfaces and visualization for presenting analysis results.
22. •Conclusion:
–What do developers think about your result?
–Is it applicable in their context?
–How much would it help them in their daily work?
24. •Businesses of all types commonly use analytics to better reach and understand their customers.
•Many software engineering researchers have argued for more use of data for decision-making.
•The demand for data scientists in software projects will grow rapidly.
•Harvard Business Review named data scientist the most desired job of the 21st century.
•By 2018, the U.S. may face a shortage of as many as 190,000 people with analytical expertise and of 1.5 million managers and analysts with the skills to make data-driven decisions, according to a report by the McKinsey Global Institute.
27. •The research:
–provides a catalog of 145 questions that software engineers would like to ask data scientists about software.
–ranks the questions by importance (and opposition) to help researchers, practitioners, and educators focus their efforts on topics of importance to industry.
–calls on other companies and the academic community to replicate its methods and grow the body of knowledge from this start (technical report).
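A minimal sketch of how survey ratings could be turned into the two rankings mentioned above (importance and opposition). The rating labels ("essential", "worthwhile", "unimportant", "unwise"), the function name `rank_questions`, and the placeholder question IDs are assumptions for illustration only; they are not taken from the slides, and the votes are invented, not survey results.

```python
from collections import Counter


def rank_questions(ratings):
    """Rank questions by share of 'essential' votes and by share of 'unwise' votes.

    ratings: dict mapping a question ID to a list of rating strings.
    Returns (by_importance, by_opposition), each a list of row dicts.
    """
    rows = []
    for question, votes in ratings.items():
        counts = Counter(votes)
        total = len(votes)
        rows.append({
            "question": question,
            "essential": counts["essential"] / total,  # importance
            "unwise": counts["unwise"] / total,        # opposition
        })
    by_importance = sorted(rows, key=lambda r: r["essential"], reverse=True)
    by_opposition = sorted(rows, key=lambda r: r["unwise"], reverse=True)
    return by_importance, by_opposition


# Hypothetical votes for two placeholder questions (not real survey data).
example = {
    "Q1": ["essential", "essential", "worthwhile", "unwise"],
    "Q2": ["unwise", "unwise", "unimportant", "essential"],
}
importance, opposition = rank_questions(example)
print([row["question"] for row in importance])
print([row["question"] for row in opposition])
```

Keeping the two rankings separate matters: a question can draw strong support and strong opposition at the same time, and a single combined score would hide that.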
28. •Initial survey:
–2 pilot surveys sent to 25 and 75 Microsoft engineers.
–The pilot demonstrated the need to seed the survey with data analytics questions.
•e.g., What impact does code quality have on our ability to monetize a software service?
–1,500 SW engineers in September 2012.
–36.5% developers, 38.9% testers, 22.7% program managers.
36. •Of the questions with the most opposition, the top five are about the fear that respondents had of being ranked and rated.
38. The catalog of 145 questions is relevant for:
•Research:
–the descriptive questions outline opportunities to collaborate with industry and
–influence their software development processes, practices, and tools.
•Practice:
–the list of questions identifies particular data to collect and analyze to find answers,
–as well as the need to build collection and analysis tools at industrial scale.
•Education:
–the questions provide guidance on what analytical techniques to teach in courses for future data scientists,
–as well as providing instruction on topics of importance to industry (which students always appreciate).
39. •Conclusion
–Researchers hope that this paper will inspire similar research projects.
–In order to facilitate replication of this work for additional engineering disciplines and companies, they provide the full text of both surveys as well as the 145 questions in a technical report.
–With the growing demand for data scientists, more research is needed to better understand how people make decisions in software projects and what data and tools they need.
–There is also a need to increase the data literacy of future software engineers.
–Lastly, we need to think more about the consumer of analyses and not just the producers of them (data scientists, empirical researchers).
42. •SW analytics should go beyond developers
–SA focuses on helping individual developers with coding and bug-fixing decisions
•by mining developer-oriented repositories such as version control systems and bug trackers.
–SA needs to serve a project’s various stakeholders
•marketing, sales, and support teams, not just developers.
44. •Proving relevance to practitioners
–Future -> Layers of context are taken into consideration:
•Domain of SW development
–nonfunctional requirements, environments, tools, idioms, and so on.
•Domain of the software itself
–databases, applications, and so on.
•Context of the overall software project
–Requirements, glossary, architecture, community, and so on.
45. •Proving relevance to practitioners
–Software analytics has to prove its relevance by showing its cost effectiveness versus the alternative, which is doing nothing.
•Doing nothing can be amazingly efficient.
•We need to evaluate these techniques with practitioners in mind.
•More meaningful and less superficial software analytics.
46. •Mere numbers aren’t enough
–Numbers and equations are important to capture relations in the data.
–For practical use, they must be accompanied by interpretation and visualization.
–It’s a transfer from the quantitative domain to the qualitative domain.
–More research is needed on:
•how to convey the message of software analytics to those who make decisions based on it.
47. •3 questions for analytics:
•1) How much better is my model performing than a simple strategy, such as guessing?
•2) How practically significant are the results?
–effect sizes
•3) How sensitive are the results to small changes in one or more of the inputs?
–uncertain data
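The three questions can be sketched as three small checks. This is only an illustration under stated assumptions: the baseline is "always predict the most common label", the effect size is Cohen's d, and sensitivity is measured as the share of predictions that flip under small random perturbations of the inputs. All function names and the toy data are hypothetical and not from the papers.

```python
import random
import statistics


def majority_baseline_accuracy(labels):
    """Question 1: accuracy of always guessing the most common label."""
    most_common = max(set(labels), key=labels.count)
    return labels.count(most_common) / len(labels)


def accuracy(predictions, labels):
    """Share of predictions that match the true labels."""
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


def cohens_d(group_a, group_b):
    """Question 2: difference in means divided by the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    var_a, var_b = statistics.variance(group_a), statistics.variance(group_b)
    pooled = (((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)) ** 0.5
    return (statistics.mean(group_a) - statistics.mean(group_b)) / pooled


def sensitivity(predict, rows, noise=0.05, trials=100, seed=0):
    """Question 3: fraction of predictions that flip when each input
    value is scaled by a random factor in [1 - noise, 1 + noise]."""
    rng = random.Random(seed)
    flips = 0
    total = 0
    for _ in range(trials):
        for row in rows:
            perturbed = [x * (1 + rng.uniform(-noise, noise)) for x in row]
            flips += predict(perturbed) != predict(row)
            total += 1
    return flips / total


# Toy example: flag a module as defect-prone when its churn exceeds 100 lines.
def predict(row):
    return row[0] > 100


rows = [[20], [95], [105], [400]]
labels = [False, False, True, True]
print(accuracy([predict(r) for r in rows], labels), majority_baseline_accuracy(labels))
print(sensitivity(predict, rows))
```

A model that barely beats the baseline, shows a negligible effect size, or flips many predictions under small input noise is hard to justify to practitioners, whatever its headline accuracy.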
48. •Opportunities for natural SW analytics
–using models from statistical natural language processing for a new kind of analytics.
–What most people write and say, most of the time, is highly repeatable and predictable.
–Devices like Google Translate and Siri.
–Code is no different.
•most everyday code is simple and highly predictable.
–Researchers were able to adapt standard n-gram models from statistical NLP to code and train them on hundreds of millions of LOC.
–Code is actually between 8 and 16 times more predictable than English.
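To make "predictable" concrete, the sketch below trains a bigram model on a token stream and reports cross-entropy in bits per token; lower values mean the next token is easier to guess. It is a deliberately simplified stand-in, assuming whitespace tokenization and add-one smoothing, and is not the model used in the cited work. The function names and the toy token streams are hypothetical.

```python
import math
from collections import Counter


def train_bigram(tokens):
    """Count contexts and bigrams over a token sequence."""
    contexts = Counter(tokens[:-1])             # tokens that have a successor
    bigrams = Counter(zip(tokens, tokens[1:]))  # (previous, current) pairs
    vocab_size = len(set(tokens)) + 1           # +1 slot for unseen tokens
    return contexts, bigrams, vocab_size


def cross_entropy(tokens, model):
    """Average bits per token under an add-one-smoothed bigram model."""
    contexts, bigrams, vocab_size = model
    pairs = list(zip(tokens, tokens[1:]))
    total_bits = 0.0
    for prev, cur in pairs:
        prob = (bigrams[(prev, cur)] + 1) / (contexts[prev] + vocab_size)
        total_bits -= math.log2(prob)
    return total_bits / len(pairs)


# Toy corpus: one loop header and body repeated many times.
train = "for i in range ( n ) : total += i".split() * 50
model = train_bigram(train)

familiar = "for i in range ( n ) : total += i".split()
scrambled = "i for += ) range total ( in : n".split()
print(cross_entropy(familiar, model))   # idiomatic order: few bits per token
print(cross_entropy(scrambled, model))  # same tokens, unusual order: many more bits
```

The same measurement applied to a large code corpus and to an English corpus is what allows a comparison of how repetitive each one is.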
49. •Wanted: Assistance from Information Analysts
–Mission Impossible and TV Series 24
•Field agents -> heroes -> developers
•We shouldn’t neglect the information analysts (Chloe on 24)
•Information Analysts -> provide critical information
–such as the backgrounds, strengths, and weaknesses of the people, places, and eventualities faced by the field agents.
–Without the information analysts, it’s hard to imagine a successful mission.
–Information analysts = real heroes.
50. •Wanted: Assistance from Information Analysts
–Developers have to figure out all the necessary information about
•what and where and how to change the software by themselves.
–We need to provide the services of information analysts to developers
•and assist them in making the right decisions.
–SW analytics can continually provide contextual information based on developers’ current tasks.
–Decent information visualization and computer-human interaction technologies
•can help present this information efficiently.
51. •Papers discuss:
–Context!
–Relevance for practitioners.
–New ways of conducting SW analytics.
–Importance of new studies.
–Addition of an analytic professional to the software development team.
53. [1] Buse & Zimmermann: Analytics for Software Development (FoSER 2010).
[2] Zhang et al.: Software Analytics as a Learning Case in Practice: Approaches and Experiences (MALETS 2011).
[3] Begel & Zimmermann: Analyze This! 145 Questions for Data Scientists in Software Engineering (ICSE 2014).
[4] Hassan, Hindle, Runeson, Shepperd, Devanbu, & Kim: What’s Next in Software Analytics (IEEE Software 2013).