Agility in an AI/DS/ML project
TATHAGAT VARMA
STRATEGY & OPERATIONS, WALMART
DOCTORAL SCHOLAR, INDIAN SCHOOL OF BUSINESS
Disclaimer!
THESE ARE MY PERSONAL VIEWS.
AI is (getting) everywhere…!
https://info.algorithmia.com/2021
…and fast accelerating!!!
https://info.algorithmia.com/2021
However…it is taking too long to develop and deploy…!
u 2-6 months depending on
scope and size. Data
Collection (20%), Data
Cleaning (50%), Data
Exploration (15%), Data
Modeling (10%), Data
Interpretation (5%)
u The time required to deploy a model is increasing year-on-year.
u Only 11% of organizations
can put a model into
production within a week,
and 64% take a month or
longer
https://info.algorithmia.com/2021
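To make the percentage split above concrete, here is a minimal Python sketch that converts it into calendar weeks. It assumes the percentages are shares of total project time; the 16-week duration (roughly four months, inside the quoted 2-6 month range) is a hypothetical value chosen only for illustration.

```python
# Phase shares as quoted on the slide (assumed to be percent of project time).
PHASE_SHARE_PCT = {
    "Data Collection": 20,
    "Data Cleaning": 50,
    "Data Exploration": 15,
    "Data Modeling": 10,
    "Data Interpretation": 5,
}


def weeks_per_phase(total_weeks: float) -> dict[str, float]:
    """Split a project duration across phases according to the shares."""
    return {
        phase: total_weeks * pct / 100
        for phase, pct in PHASE_SHARE_PCT.items()
    }


# Hypothetical 16-week project.
plan = weeks_per_phase(16)
for phase, weeks in plan.items():
    print(f"{phase}: {weeks:.1f} weeks")
```

Under these assumptions, half of the schedule goes to data cleaning before any modeling starts.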
Data scientists’ time is being spent on deploying models…and more models means more time spent on deployment…!
https://info.algorithmia.com/2021
…with alarmingly high failure rates!
u It was estimated that 85% of AI projects would fail and deliver erroneous outcomes through 2022.
u 70% of companies report minimal or no impact
from AI.
u 87% of data science projects never make it into
production.
https://research.aimultiple.com/ai-fail/
Low ROI and long payback periods!
u The ROI for AI projects varies greatly, based on how much experience an organization has. Leaders showed an average ROI of 4.3% for their projects, compared to only 0.2% for beginning companies.
u Payback periods also varied, with leaders
reporting a typical payback period of 1.2 years
and beginners at 1.6 years.
https://www2.deloitte.com/us/en/insights/industry/technology/artificial-intelligence-roi.html
How is AI different?
Aspect | Traditional Software | AI Software
--- | --- | ---
Reasoning | Deductive | Inductive
Inputs | Data + Program | Data + Output
Logic | Manually pre-programmed to perform a specific task on a given dataset | Programmed to automatically keep learning rules from a given dataset
Output | Output | Models, Rules
Learning | Learns one-time from the programmer | Learns constantly from the data
Resource | Code | Data
Solutions | Deterministic | Probabilistic
Output | Consistently remains the same | Can improve with usage (or degrade over time)
Business model | One-time development efforts, followed by multiple sales, and small maintenance effort (optional) | Each project is one-off, and needs full lifecycle management mandatorily
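The "Inputs" and "Output" rows of the comparison above can be illustrated with a toy sketch. In the traditional case the programmer writes the rule; in the AI case a rule is induced from data plus known outputs. The order amounts, labels, and the 100.0 cut-off are hypothetical, and the brute-force threshold search is only a stand-in for a real learning algorithm.

```python
# Traditional software: data + program -> output.
def is_large_order_rule(amount: float) -> bool:
    # Threshold chosen by the programmer (hypothetical value).
    return amount >= 100.0


# AI software: data + output -> rule (model).
def learn_threshold(amounts, labels):
    """Return the cut-off that misclassifies the fewest training examples."""
    candidates = sorted(set(amounts))
    best, best_errors = candidates[0], len(amounts) + 1
    for t in candidates:
        errors = sum((a >= t) != y for a, y in zip(amounts, labels))
        if errors < best_errors:
            best, best_errors = t, errors
    return best


# Hypothetical labeled examples: amount -> "was this a large order?"
amounts = [20.0, 35.0, 60.0, 80.0, 120.0, 150.0]
labels = [False, False, False, True, True, True]

learned = learn_threshold(amounts, labels)
print("hand-written rule says 80.0 is large:", is_large_order_rule(80.0))
print("learned cut-off:", learned)
```

The learned cut-off depends entirely on the examples supplied, which is why the table lists data, rather than code, as the key resource for AI software.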
Elements of ML systems
https://www.ibm.com/cloud/blog/ai-model-lifecycle-management-overview
A typical lifecycle for an AI project
u Scoping and Data
Acquisition
u Experimentation
and Model Building
u Production, Deployment, Scaling and Operationalization
Data, data, data…!
u Industry reports indicate that up to 80% of effort goes into data wrangling!
u Up to a quarter of that goes into cleaning alone, and another quarter into labeling.
u Just 10% of the time is spent on model training!
https://medium.com/whattolabel/data-labeling-ais-human-bottleneck-24bd10136e52
Data trumps algorithms!
In the article “Datasets Over Algorithms”, Alexander Wissner-Gross showed that
the mean time between a new machine learning algorithm being published
and its use in an AI breakthrough was 18 years; however, the mean time
between the required datasets becoming available and those AI
breakthroughs was 3 years. Machine learning without the necessary data and
use cases is merely a pile of nuts and bolts waiting to be built into something
useful. Nonetheless, machine learning is about learning from data, not about
writing code, and that represents a fundamental difference from previous
software engineering practices.
- Agile AI, Carlo Appugliese, Paco Nathan, and William S. Roberts, O’Reilly
Data lifecycle
u While the CRISP-DM (Cross Industry Standard Process for Data Mining) lifecycle seems a bit dated (published 1999) and inactive, it is still a good reference point for the key phases of the data lifecycle.
u Flows are not strictly sequential; they move back and forth between phases.
https://www.ibm.com/docs/en/spss-modeler/SaaS?topic=dm-crisp-help-overview
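The back-and-forth flow can be sketched as a small transition map. The six phase names come from the CRISP-DM reference model rather than from the slide itself, and the specific set of back-edges below is a simplification assumed here for illustration.

```python
# Allowed moves between CRISP-DM phases (simplified, assumed for illustration).
CRISP_DM_TRANSITIONS = {
    "Business Understanding": {"Data Understanding"},
    "Data Understanding": {"Business Understanding", "Data Preparation"},
    "Data Preparation": {"Modeling"},
    "Modeling": {"Data Preparation", "Evaluation"},
    "Evaluation": {"Business Understanding", "Deployment"},
    "Deployment": set(),
}


def is_valid_path(path):
    """Check that every consecutive pair of phases is an allowed move."""
    return all(
        nxt in CRISP_DM_TRANSITIONS[cur] for cur, nxt in zip(path, path[1:])
    )


# A path that loops back from Modeling to Data Preparation is allowed.
looping = [
    "Business Understanding", "Data Understanding", "Data Preparation",
    "Modeling", "Data Preparation", "Modeling", "Evaluation", "Deployment",
]
# Jumping straight from Business Understanding to Modeling is not.
skipping = ["Business Understanding", "Modeling"]

print(is_valid_path(looping), is_valid_path(skipping))
```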
Generic tasks and outputs in a CRISP-DM Reference Model
https://www.ibm.com/docs/en/spss-modeler/SaaS?topic=dm-crisp-help-overview
CRISP-DM favored over agile methodologies?
https://www.datascience-pm.com/crisp-dm-still-most-popular/
Challenges with Scrum in Data Science projects
u One key challenge of using a sprint-based framework within a data science context is that task estimation is unreliable. In other words, if the team cannot accurately estimate task duration, the concept of a sprint, and of what can get done within a sprint, is problematic.
u Another key challenge is that Scrum’s fixed-length sprints can be problematic. Even if a team could estimate how long a specific analysis might take, a fixed-length sprint might force the team to define an iteration that includes unrelated work items, and might delay the feedback from an exploratory analysis that could help prioritize new work. In short, a sprint does not allow smaller (or longer) logical chunks of work to be completed and analyzed in a coherent fashion.
https://www.datascience-pm.com/data-driven-agile/
Challenges with traditional Kanban in Data Science projects
u In general, these challenges include the lack of organizational support and
culture, lack of training and the misunderstanding of key concepts.
u Specifically, Kanban defines neither project roles nor any process specifics.
u The freedom Kanban provides (such as letting teams define their own
process for prioritizing tasks) can be part of the challenge in implementing
Kanban. While this lack of process structure can be a strength (since the
lack of a specified process definition allows teams to implement Kanban
within existing organizational practices), it can also mean that every team
could implement Kanban differently. In other words, a team that wants to
use Kanban needs to figure out its own processes and artifacts.
https://www.datascience-pm.com/data-driven-agile/
Data-Driven Scrum (DDS)
u The Data Science Process
Alliance created an alternative
framework called Data Driven
Scrum which is designed with data
science in mind.
u Data Driven Scrum™ (DDS) is
an agile framework specifically
designed for data science teams. DDS
provides a continuous flow framework
for agile data science by integrating
the structure of Scrum with the
continuous flow of Kanban.
https://www.datascience-pm.com/data-driven-scrum/
Leveraging Scrum and Kanban…
u DDS can be viewed as a specific instantiation of Scrum with two notable
exceptions:
u The most important exception is that the Scrum Guide requires all iterations (sprints) to be
of equal length in time. However, iterations in DDS vary in duration to allow a logical
increment of work to be done in one iteration (rather than defining the amount of work
that can be done in a specific unit of time).
u The other notable exception is that retrospectives and item reviews are not done at the
end of every iteration, but rather, on a frequency the team deems appropriate.
u DDS also adheres to the Kanban principles (e.g., there is a Kanban board, teams need to limit WIP, and work items flow across the board). However, the framework provides more structure than Kanban defines, such as defined iterations as well as a more defined framework (e.g., roles and meetings). Having a more clearly defined process that leverages agile best practices will enable teams to implement the process in a more consistent and repeatable manner.
https://www.datascience-pm.com/data-driven-scrum/
Key Tenets of DDS
u Agile is Iterative Experimentation
Agile is intended to be a sequence of iterative experimentation and adaptation cycles.
u Iterations are Capacity-Based
Teams work iteratively on a given set of items until they are done (no inflexible deadlines).
u Focus on Create, Observe, Analyze
Each iteration always follows three core steps: Create something, observe its performance,
and analyze the results.
u Easily Integrate with Scrum
DDS’s interfaces can be seamlessly integrated within a traditional Scrum-based
organization.
https://www.datascience-pm.com/data-driven-scrum/
DDS vs Traditional Scrum: Similarities
u Similar Roles
Just like traditional Scrum, each DDS team is a group of up to about ten people,
one of whom is the product owner, and one of whom is the process expert.
u Similar Events
Just as in traditional Scrum, there is a daily stand-up, as well as Iteration and
Retrospective Reviews.
u Similar Process to create and prioritize Items
Just like traditional Scrum, items are created, prioritized and viewed on a task
board.
https://www.datascience-pm.com/data-driven-scrum/
DDS vs Traditional Scrum: Differences
u Functional Iterations
DDS iterations have unknown and varying lengths (as compared to traditional Scrum sprints, which have fixed-time durations). This enables iterations that might make sense to be shorter or longer than average (e.g., an iteration might be shorter than normal because the team can learn from a quick / short experiment).
u Uncertain Task Duration
Unlike traditional Scrum (which requires accurate task estimations to know what can fit into a sprint), DDS
naturally accommodates tasks that are difficult to estimate (and task estimation is often difficult within a
data science context).
u Collective Analysis
The entire team focuses on creating, observing and then analyzing a hypothesis, analysis or feature (often in traditional Scrum, this analysis is done by the product owner outside of the codified process).
u Iteration-Independent Meetings
Retrospectives and item reviews are not done at the end of every iteration (as is done in traditional Scrum), but rather, on a calendar-based frequency the team deems appropriate.
https://www.datascience-pm.com/data-driven-scrum/
Principles of DDS
u Allow capability-based iterations – sometimes it makes sense to have an iteration that lasts one day, and other times, for an iteration to last three weeks (e.g., due to how long it takes to acquire / clean data or how long it takes for an exploratory analysis). The goal should be to allow logical chunks of work to be released in a coherent fashion.
u Decouple meetings from an iteration – since an iteration could be very short (e.g., one day for a specific exploratory analysis), meetings (such as a retrospective to improve the team’s process) should be based on a logical time-based window, not linked to each iteration.
u Only require high-level item estimation – in many situations, defining an explicit timeline for an exploratory analysis is difficult, so one should not need to generate accurate detailed task estimations in order to use the framework. However, high-level “T-Shirt” level-of-effort estimates can be helpful for prioritizing the potential tasks to be done.
https://www.datascience-pm.com/data-driven-agile/
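As a rough illustration of how high-level "T-Shirt" estimates can still support prioritization, the sketch below ranks backlog items by value per unit of coarse effort. The size weights, the 1-10 value scores, and the item names are all hypothetical; DDS does not prescribe any of them.

```python
from dataclasses import dataclass

# Hypothetical relative weights for T-shirt sizes (not defined by DDS).
SIZE_WEIGHT = {"S": 1, "M": 2, "L": 4, "XL": 8}


@dataclass
class Item:
    name: str
    value: int  # hypothetical 1-10 business-value score
    size: str   # high-level T-shirt estimate


def prioritize(items):
    """Order items by value per unit of rough effort, highest first."""
    return sorted(
        items,
        key=lambda item: item.value / SIZE_WEIGHT[item.size],
        reverse=True,
    )


backlog = [
    Item("Explore churn features", value=6, size="S"),
    Item("Acquire and clean CRM data", value=8, size="L"),
    Item("Tune baseline model", value=5, size="M"),
]
ranked = [item.name for item in prioritize(backlog)]
print(ranked)
```

No item here carries a day-level estimate, yet the team still gets a defensible ordering.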
DDS Framework
u Data Driven Scrum supports lean iterative exploratory data science analysis,
and acknowledges that iterations will vary in length due to the phase of the
project (collecting data vs creating a machine learning analysis).
u DDS defines an agile lean process framework that leverages some of the key concepts of Scrum as well as the key concepts of Kanban, but differently than Scrumban (which is more of Kanban within a Scrum framework; hence, Scrumban implements Scrum sprints, which, as previously noted, introduce several challenges for the project team).
u In short, DDS teams use a Kanban-like visual board and focus on working on a
specific item or collection of items during an iteration, which is task-based, not
time-boxed. Thus, an iteration more closely aligns with the lean concept of
pulling tasks, in a prioritized manner, when the team has capacity. Each
iteration can be viewed as validating or rejecting a specific lean hypothesis.
https://www.datascience-pm.com/data-driven-agile/
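The idea of pulling prioritized tasks only when the team has capacity can be sketched with a toy board. The `Board` class, its method names, the WIP limit of 2, and the task names are hypothetical illustrations, not part of any DDS tooling.

```python
from collections import deque


class Board:
    """Kanban-like board: items are pulled only when WIP capacity is free."""

    def __init__(self, backlog, wip_limit):
        self.backlog = deque(backlog)  # assumed already ordered by priority
        self.in_progress = []
        self.done = []
        self.wip_limit = wip_limit

    def pull(self):
        """Pull the highest-priority items until the WIP limit is reached."""
        while self.backlog and len(self.in_progress) < self.wip_limit:
            self.in_progress.append(self.backlog.popleft())

    def finish(self, item):
        """Mark an item as done, freeing capacity for the next pull."""
        self.in_progress.remove(item)
        self.done.append(item)


board = Board(["collect data", "clean data", "baseline model"], wip_limit=2)
board.pull()                   # two items start; the third waits
board.finish("collect data")   # capacity frees up whenever the work is done
board.pull()                   # the waiting item is pulled
print(board.in_progress, board.done)
```

Nothing in the sketch refers to a calendar: the next item starts when capacity frees up, not when a time box ends.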
Steps in a DDS Iteration
Create: A thing or set of things that will be created and put into use, with a hypothesis about what will happen.
Observe: A set of observable outcomes of that use that will be measured (and any work that is needed to facilitate that measurement).
Analyze: Analyze those observables and create a plan for the next iteration.
https://www.datascience-pm.com/data-driven-agile/
https://datadrivenscrum.com/how-DDS-works/
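The three steps above can be sketched as a small loop body. This is only an illustration under stated assumptions: the `Iteration` class, the toy "model", the sample data, and the 0.75 success threshold are all hypothetical and are not taken from the DDS material.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Iteration:
    hypothesis: str
    create: Callable[[], object]        # builds the thing to put into use
    observe: Callable[[object], float]  # measures an observable outcome
    threshold: float                    # hypothetical success criterion

    def run(self) -> dict:
        artifact = self.create()               # Create
        measurement = self.observe(artifact)   # Observe
        # Analyze: compare the observation with the hypothesis, plan next.
        supported = measurement >= self.threshold
        next_step = (
            "build on this result" if supported else "revise the hypothesis"
        )
        return {
            "hypothesis": self.hypothesis,
            "measurement": measurement,
            "supported": supported,
            "next_step": next_step,
        }


# Toy stand-ins: the "model" is a fixed rule, the metric is its accuracy.
def create_model():
    return lambda x: x >= 3


def observe_accuracy(model):
    samples = [(1, False), (2, False), (3, True), (4, True), (5, False)]
    return sum(model(x) == y for x, y in samples) / len(samples)


result = Iteration(
    "A simple threshold rule reaches 75% accuracy",
    create_model,
    observe_accuracy,
    threshold=0.75,
).run()
print(result)
```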
Scaling DDS
The DDS framework is a single team
framework that is designed to be
compatible with the Scrum@Scale
scaling framework.
Each DDS team exposes the
necessary interfaces to collaborate
with other teams (each of which
might be doing Scrum or DDS) via
its roles and artifacts, while
encapsulating its internal workflow.
Team touchpoint | DDS | Scrum
--- | --- | ---
Metascrum representation | Product Owner | Product Owner
Scrum of Scrums representation | Process Master | Scrum Master
Product / release feedback | Iteration Review | Sprint Review
Metrics and transparency | Item Backlog / Taskboard | Product Backlog / Sprint Backlog
Recap
u AI / DS / ML is an evolving field, with long development /
deployment cycles, high failure rates and low ROI.
u It is still software, yet not quite like traditional software in many ways!
u While agile principles are rather generic problem-solving
methods, some ideas don’t quite apply well.
u Data-Driven Scrum offers an interesting perspective for
delivering DS projects with agility.
u For deployment, AIOps / MLOps orchestration platforms
are fast emerging to provide necessary tool support.
References
u https://future.a16z.com/new-business-ai-different-traditional-software/
u https://medium.com/machine-learning-in-practice/how-machine-learning-differs-from-traditional-software-80d0a235ff3b
u https://blog.dataiku.com/ai-projects-lifecycle-key-steps-and-considerations
u https://www.ibm.com/cloud/blog/ai-model-lifecycle-management-overview
u https://labelyourdata.com/articles/lifecycle-of-an-ai-project-stages-breakdown
u https://www.datascience-pm.com/effective-data-science-process/
u https://www.datascience-pm.com/data-driven-agile/
u https://datadrivenscrum.com/

Agility in an AI / DS / ML Project

  • 1. Agility in an AI/DS/ML project TATHAGAT VARMA STRATEGY & OPERATIONS, WALMART DOCTORAL SCHOLAR, INDIAN SCHOOL OF BUSINESS
  • 2. Disclaimer! THESE ARE MY PERSONAL VIEWS.
  • 5. However…it is taking too long to develop and deploy…! u 2-6 months depending on scope and size. Data Collection (20%), Data Cleaning (50%), Data Exploration (15%), Data Modeling (10%), Data Interpretation (5%) u The time required to deploy a model is increasing year- on-year. u Only 11% of organizations can put a model into production within a week, and 64% take a month or longer https://info.algorithmia.com/2021
  • 6. Data scientists' time is being spent on deploying models… and more models means more time spent on deployment! https://info.algorithmia.com/2021
  • 7. …with alarmingly high failure rates!
      • It was estimated that 85% of AI projects will fail and deliver erroneous outcomes through 2022.
      • 70% of companies report minimal or no impact from AI.
      • 87% of data science projects never make it into production.
      https://research.aimultiple.com/ai-fail/
  • 8. Low ROI and long payback periods!
      • The ROI for AI projects varies greatly, based on how much experience an organization has. Leaders showed an average ROI of 4.3% for their projects, compared to only 0.2% for beginning companies.
      • Payback periods also varied, with leaders reporting a typical payback period of 1.2 years and beginners 1.6 years.
      https://www2.deloitte.com/us/en/insights/industry/technology/artificial-intelligence-roi.html
  • 9. How is AI different? (Traditional software vs. AI software)
      • Reasoning. Traditional: deductive. AI: inductive.
      • Inputs. Traditional: data + program. AI: data + output.
      • Logic. Traditional: manually pre-programmed to perform a specific task on a given dataset. AI: programmed to automatically keep learning rules from a given dataset.
      • Output. Traditional: output. AI: models, rules.
      • Learning. Traditional: learns one-time from the programmer. AI: learns constantly from the data.
      • Resource. Traditional: code. AI: data.
      • Solutions. Traditional: deterministic. AI: probabilistic.
      • Output. Traditional: consistently remains the same. AI: can improve with usage (or degrade over time).
      • Business model. Traditional: one-time development effort, followed by multiple sales and a small (optional) maintenance effort. AI: each project is one-off and mandatorily needs full lifecycle management.
  • 10. Elements of ML systems https://www.ibm.com/cloud/blog/ai-model-lifecycle-management-overview
  • 11. A typical lifecycle for an AI project
      • Scoping and Data Acquisition
      • Experimentation and Model Building
      • Production, Deployment, Scaling and Operationalization
  • 12. Data, data, data…!
      • Industry reports indicate that up to 80% of effort goes into data wrangling!
      • Up to a quarter of that goes into cleaning alone, and another quarter into labeling.
      • Just 10% of the time is spent on model training!
      https://medium.com/whattolabel/data-labeling-ais-human-bottleneck-24bd10136e52
  • 13. Data trumps algorithms! In the article “Datasets Over Algorithms”, Alexander Wissner-Gross showed that the mean time between a new machine learning algorithm being published and its use in an AI breakthrough was 18 years; however, the mean time between the required datasets becoming available and those AI breakthroughs was 3 years. Machine learning without the necessary data and use cases is merely a pile of nuts and bolts waiting to be built into something useful. Nonetheless, machine learning is about learning from data, not about writing code, and that represents a fundamental difference from previous software engineering practices. - Agile AI, Carlo Appugliese, Paco Nathan, and William S. Roberts, O’Reilly
  • 14. Data lifecycle
      • While the CRISP-DM (Cross Industry Standard Process for Data Mining) lifecycle seems a bit dated (published 1999) and inactive, it is still a good reference point for the key phases of the data lifecycle.
      • Flows are not sequential; they move back and forth.
      https://www.ibm.com/docs/en/spss-modeler/SaaS?topic=dm-crisp-help-overview
  • 15. Generic tasks and outputs in a CRISP-DM Reference Model https://www.ibm.com/docs/en/spss-modeler/SaaS?topic=dm-crisp-help-overview
  • 16. CRISP-DM favored over agile methodologies? https://www.datascience-pm.com/crisp-dm-still-most-popular/
  • 17. Challenges with Scrum in Data Science projects
      • One key challenge of using a sprint-based framework within a data science context is that task estimation is unreliable. In other words, if the team cannot accurately estimate task duration, the concept of a sprint, and of what can get done within a sprint, is problematic.
      • Another key challenge is that Scrum's fixed-length sprints can be problematic. Even if a team could estimate how long a specific analysis might take, a fixed-length sprint might force the team to define an iteration that includes unrelated work items, and might delay the feedback from an exploratory analysis that could help prioritize new work. In short, a sprint does not allow smaller (or longer) logical chunks of work to be completed and analyzed in a coherent fashion.
      https://www.datascience-pm.com/data-driven-agile/
  • 18. Challenges with traditional Kanban in Data Science projects
      • In general, these challenges include the lack of organizational support and culture, lack of training, and the misunderstanding of key concepts.
      • Specifically, Kanban does not define project roles or any process specifics.
      • The freedom Kanban provides (such as letting teams define their own process for prioritizing tasks) can be part of the challenge in implementing Kanban. While this lack of process structure can be a strength (since the lack of a specified process definition allows teams to implement Kanban within existing organizational practices), it can also mean that every team could implement Kanban differently. In other words, a team that wants to use Kanban needs to figure out its own processes and artifacts.
      https://www.datascience-pm.com/data-driven-agile/
  • 19. Data-Driven Scrum (DDS)
      • The Data Science Process Alliance created an alternative framework called Data Driven Scrum, which is designed with data science in mind.
      • Data Driven Scrum™ (DDS) is an agile framework specifically designed for data science teams. DDS provides a continuous flow framework for agile data science by integrating the structure of Scrum with the continuous flow of Kanban.
      https://www.datascience-pm.com/data-driven-scrum/
  • 20. Leveraging Scrum and Kanban…
      • DDS can be viewed as a specific instantiation of Scrum with two notable exceptions:
          • The most important exception is that the Scrum Guide requires all iterations (sprints) to be of equal length in time. However, iterations in DDS vary in duration to allow a logical increment of work to be done in one iteration (rather than defining the amount of work that can be done in a specific unit of time).
          • The other notable exception is that retrospectives and item reviews are not done at the end of every iteration, but rather at a frequency the team deems appropriate.
      • DDS also adheres to the Kanban principles (e.g., there is a Kanban board, teams need to limit WIP, and work items flow across the board). However, the framework provides more structure than Kanban defines, such as defined iterations and a more defined framework (e.g., roles and meetings). Having a more clearly defined process that leverages agile best practices will enable teams to implement the process in a more consistent and repeatable manner.
      https://www.datascience-pm.com/data-driven-scrum/
  • 21. Key Tenets of DDS
      • Agile is Iterative Experimentation: Agile is intended to be a sequence of iterative experimentation and adaptation cycles.
      • Iterations are Capacity-Based: Teams work iteratively on a given set of items until they are done (no inflexible deadlines).
      • Focus on Create, Observe, Analyze: Each iteration always follows three core steps: create something, observe its performance, and analyze the results.
      • Easily Integrate with Scrum: DDS's interfaces can be seamlessly integrated within a traditional Scrum-based organization.
      https://www.datascience-pm.com/data-driven-scrum/
  • 22. DDS vs Traditional Scrum: Similarities
      • Similar Roles: Just like traditional Scrum, each DDS team is a group of up to about ten people, one of whom is the product owner and one of whom is the process expert.
      • Similar Events: Just as in traditional Scrum, there is a daily stand-up, as well as Iteration and Retrospective Reviews.
      • Similar Process to Create and Prioritize Items: Just like traditional Scrum, items are created, prioritized and viewed on a task board.
      https://www.datascience-pm.com/data-driven-scrum/
  • 23. DDS vs Traditional Scrum: Differences
      • Functional Iterations: DDS iterations have unknown and varying lengths (as compared to traditional Scrum sprints, which have fixed-time durations). This enables iterations to be shorter or longer than average when that makes sense (e.g., an iteration might be shorter than normal because the team can learn from a quick experiment).
      • Uncertain Task Duration: Unlike traditional Scrum (which requires accurate task estimations to know what can fit into a sprint), DDS naturally accommodates tasks that are difficult to estimate (and task estimation is often difficult within a data science context).
      • Collective Analysis: The entire team focuses on creating, observing and then analyzing a hypothesis, analysis or feature (in traditional Scrum, this analysis is often done by the product owner outside of the codified process).
      • Iteration-Independent Meetings: Retrospectives and item reviews are not done at the end of every iteration (as is done in traditional Scrum), but rather at a calendar-based frequency the team deems appropriate.
      https://www.datascience-pm.com/data-driven-scrum/
  • 24. Principles of DDS
      • Allow capability-based iterations: sometimes it makes sense to have an iteration that lasts one day, and other times for an iteration to last three weeks (e.g., due to how long it takes to acquire or clean data, or how long an exploratory analysis takes). The goal should be to allow logical chunks of work to be released in a coherent fashion.
      • Decouple meetings from an iteration: since an iteration could be very short (e.g., one day for a specific exploratory analysis), meetings (such as a retrospective to improve the team's process) should be based on a logical time-based window, not linked to each iteration.
      • Only require high-level item estimation: in many situations, defining an explicit timeline for an exploratory analysis is difficult, so one should not need to generate accurate detailed task estimations in order to use the framework. However, high-level "T-Shirt" level-of-effort estimates can be helpful for prioritizing the potential tasks to be done.
      https://www.datascience-pm.com/data-driven-agile/
  • 25. DDS Framework
      • Data Driven Scrum supports lean, iterative, exploratory data science analysis, and acknowledges that iterations will vary in length depending on the phase of the project (collecting data vs. creating a machine learning analysis).
      • DDS defines an agile lean process framework that leverages some of the key concepts of Scrum as well as the key concepts of Kanban, but differently than Scrumban (which is more of Kanban within a Scrum framework; hence Scrumban implements Scrum sprints, which, as previously noted, introduce several challenges for the project team).
      • In short, DDS teams use a Kanban-like visual board and focus on working on a specific item or collection of items during an iteration, which is task-based, not time-boxed. Thus, an iteration more closely aligns with the lean concept of pulling tasks, in a prioritized manner, when the team has capacity. Each iteration can be viewed as validating or rejecting a specific lean hypothesis.
      https://www.datascience-pm.com/data-driven-agile/
  • 26. Steps in a DDS Iteration
      • Create: A thing or set of things that will be created and put into use, with a hypothesis about what will happen.
      • Observe: A set of observable outcomes of that use that will be measured (and any work that is needed to facilitate that measurement).
      • Analyze: Analyze those observables and create a plan for the next iteration.
      https://www.datascience-pm.com/data-driven-agile/
  • 28. Scaling DDS. The DDS framework is a single-team framework that is designed to be compatible with the Scrum@Scale scaling framework. Each DDS team exposes the necessary interfaces to collaborate with other teams (each of which might be doing Scrum or DDS) via its roles and artifacts, while encapsulating its internal workflow. Team touchpoints (DDS vs. Scrum):
      • Metascrum representation. DDS: Product Owner. Scrum: Product Owner.
      • Scrum of Scrums representation. DDS: Process Master. Scrum: Scrum Master.
      • Product / release feedback. DDS: Iteration Review. Scrum: Sprint Review.
      • Metrics and transparency. DDS: Item Backlog / Taskboard. Scrum: Product Backlog / Sprint Backlog.
  • 29. Recap
      • AI / DS / ML is an evolving field, with long development / deployment cycles, high failure rates and low ROI.
      • It is still software, yet not quite like traditional software in many ways!
      • While agile principles are rather generic problem-solving methods, some ideas don't quite apply well.
      • Data-Driven Scrum offers an interesting perspective for delivering DS projects with agility.
      • For deployment, AIOps / MLOps orchestration platforms are fast emerging to provide the necessary tool support.
  • 30. References
      • https://future.a16z.com/new-business-ai-different-traditional-software/
      • https://medium.com/machine-learning-in-practice/how-machine-learning-differs-from-traditional-software-80d0a235ff3b
      • https://blog.dataiku.com/ai-projects-lifecycle-key-steps-and-considerations
      • https://www.ibm.com/cloud/blog/ai-model-lifecycle-management-overview
      • https://labelyourdata.com/articles/lifecycle-of-an-ai-project-stages-breakdown
      • https://www.datascience-pm.com/effective-data-science-process/
      • https://www.datascience-pm.com/data-driven-agile/
      • https://datadrivenscrum.com/
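
As a rough illustration of the DDS ideas in slides 24 to 26, the sketch below pulls work by priority up to a WIP limit and then runs one Create, Observe, Analyze pass over the pulled items. It is only a minimal sketch under stated assumptions, not part of the DDS framework itself: the names `Item`, `pull_items` and `run_iteration`, the T-shirt effort weights, the value scores and the value-over-effort priority rule are all hypothetical choices made for this example.

```python
from dataclasses import dataclass
from typing import Any, Callable, List

# Assumed relative effort per T-shirt size (illustrative weights only).
TSHIRT_EFFORT = {"S": 1, "M": 2, "L": 4, "XL": 8}


@dataclass
class Item:
    name: str
    value: int  # hypothetical business-value score, higher is better
    size: str   # high-level T-shirt estimate: "S", "M", "L" or "XL"

    @property
    def priority(self) -> float:
        # Assumed rule: value per unit of rough effort.
        return self.value / TSHIRT_EFFORT[self.size]


def pull_items(backlog: List[Item], wip_limit: int) -> List[Item]:
    """Pull the highest-priority items, never exceeding the WIP limit."""
    ranked = sorted(backlog, key=lambda item: item.priority, reverse=True)
    return ranked[:wip_limit]


def run_iteration(
    items: List[Item],
    create: Callable[[Item], Any],
    observe: Callable[[Any], Any],
    analyze: Callable[[List[Any]], Any],
) -> Any:
    """One iteration: create things, observe outcomes, analyze into a plan."""
    created = [create(item) for item in items]
    observed = [observe(thing) for thing in created]
    return analyze(observed)


# Hypothetical backlog, for illustration only.
backlog = [
    Item("churn baseline model", value=8, size="L"),
    Item("label audit", value=3, size="S"),
    Item("feature store", value=8, size="XL"),
]

pulled = pull_items(backlog, wip_limit=2)

# The callables are placeholders; string length stands in for a real metric.
plan = run_iteration(
    pulled,
    create=lambda item: f"prototype of {item.name}",
    observe=lambda thing: {"thing": thing, "metric": len(thing)},
    analyze=lambda results: max(results, key=lambda r: r["metric"])["thing"],
)

print([item.name for item in pulled])
print(plan)
```

Note that the iteration has no time-box: it ends when the pulled items have been created, observed and analyzed, which is the task-based behaviour slide 25 describes. A real team would replace the placeholder callables with actual model building, measurement and review steps.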