Dave Litwiller
Jan 11, 2022
IMPROVING AI
DEVELOPMENT
TOOLS, TIPS & TECHNIQUES TO ENHANCE
OUTCOMES
• Path Dependency of Models
• Making Deep Learners More Transparent and Predictable
• Data Quality and Improvement
INTRODUCTION
BACKGROUND
• A conversational tour through some things I’ve learned in
helping scale-up stage client companies improve their AI
development practices, especially where deep neural nets
(DNNs) are in use
• YMMV
PATH DEPENDENCIES OF MODELS
INITIAL TRAINING
DATA
• Where initial training data sets the broad parameters of a
model (a.k.a. pre-training), and further training data adds
greater detail or precision:
• Upside: compresses the amount of pre-training data required
• Downside: errors or hidden variables in the initial data can
hamper performance of the ultimate model
• Initial training data needs to be selected with care for
representativeness of the larger data set
• If generative changes in the process being modelled will
take place over time, the model and pre-training will need
to evolve
REALITY GAPS AND
OUTLIER EVENTS
OUTLIER EVENTS
• A way is needed to respond gracefully to events that were not
well captured in the training data
• A simple, common approach is a voting system of multiple
models, to determine if the outlier event is likely correct or an
error/exception
REALITY GAPS
• Watch out for the potential for system failure in real-world use
if the training environment or data are simplified representations
of reality
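A minimal sketch of the voting idea for outlier events, under stated assumptions: the function `vote`, the `min_agreement` threshold of 0.75, and the three toy threshold "models" are all hypothetical stand-ins, not anything prescribed by the slides. The point is only that disagreement among models can be used as a signal to route an input to exception handling.

```python
from collections import Counter
from typing import Callable, Hashable, Sequence, Tuple


def vote(
    models: Sequence[Callable[[float], Hashable]],
    x: float,
    min_agreement: float = 0.75,  # assumed threshold, tune per application
) -> Tuple[Hashable, bool]:
    """Return (majority_label, agreed).

    agreed is False when the share of models backing the majority label
    falls below min_agreement, which marks the input for exception handling.
    """
    predictions = [model(x) for model in models]
    label, count = Counter(predictions).most_common(1)[0]
    return label, count / len(predictions) >= min_agreement


# Three toy stand-in "models": simple thresholds on a single number.
toy_models = [lambda v: v > 10, lambda v: v > 12, lambda v: v > 50]

clear_case = vote(toy_models, 100)  # all three models agree
edge_case = vote(toy_models, 11)    # only two of three agree
print(clear_case, edge_case)
```

In practice the constituent models would ideally differ in architecture or training data, so that their errors are less likely to be correlated.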
GENERATIVE
MECHANISM CHANGE
• Periodically revisit if generative mechanisms or standards
of practice have changed, and the model needs to be
rebuilt
• The conditions under which training data come to be are
often different than in wider, forward-looking usage
• Classic dilemma:
• On any given day, it is usually easier to keep coaxing
along the current model, rather than doing a larger
redesign and rebuild
• It can help objectivity in the heat of battle to have
established, a priori, threshold criteria which will signal that
sufficient change has likely occurred to trigger a model
rebuild
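One way such pre-agreed rebuild criteria might be written down so they are hard to argue with later. This is a sketch only: `RebuildCriteria`, its two fields, and the numbers used are hypothetical, and a real trigger would likely combine several signals.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class RebuildCriteria:
    """Thresholds agreed in advance (illustrative fields, not from the slides)."""
    max_error_rate: float     # tolerated error rate per review period
    breaches_to_trigger: int  # consecutive breaching periods that force a rebuild


def should_rebuild(error_rates: Iterable[float], criteria: RebuildCriteria) -> bool:
    """True once the error rate breaches the limit enough periods in a row."""
    streak = 0
    for rate in error_rates:
        streak = streak + 1 if rate > criteria.max_error_rate else 0
        if streak >= criteria.breaches_to_trigger:
            return True
    return False


criteria = RebuildCriteria(max_error_rate=0.10, breaches_to_trigger=3)
print(should_rebuild([0.05, 0.12, 0.11, 0.13], criteria))  # three breaches in a row
print(should_rebuild([0.12, 0.05, 0.12, 0.12], criteria))  # streak is interrupted
```

Freezing the dataclass is a small design choice to discourage quietly loosening the thresholds once they have been set.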
MAKING DEEP LEARNERS MORE
TRANSPARENT & PREDICTABLE
ROBUSTNESS
• Often, slightly lower performing models perform better over
time as conditions change
• Akin to satisficing vs. optimizing; accuracy vs. precision
• Design dividend: Simpler representations lend themselves
more easily to heuristics to foster design intuition for
developers and maintainers of models
    • With somewhat simpler models, the impact of change
    can be better anticipated, whether in data, in a single model,
    or in a system of multiple models
• Further benefit of simpler models:
    • Better explainability, and understanding of the major
    influences in the model, as well as the combinations of
    rare or decisive events which can make the model tip
    from one output state to another
• To boot, overly optimized models are often at greater risk of
failure or slowing the larger systems in which they operate if
usage conditions change unexpectedly
THIRD PARTY
PERSPECTIVE
• 3rd Party View Diagnostic Tools
• It is often good practice to build a visualization of what the
system “sees” looking out at its constituent parts, its
interfaces (both internal and external), and external
environment
• A 3rd party view tends to reveal a lot about the
perspectives which shaped how the system was built, what
influences it, what it ignores or downplays, and which
combinations of circumstances cause sharp changes in
performance
THIRD PARTY
PERSPECTIVE
• 3rd Party View Diagnostic Tools (cont’d)
• Especially in systems AI (multiple constituent models), the
real issue often comes down to understanding how
bottom-up performance of component models affects other
parts of the system and overall system performance (ex:
common issue in self-driving vehicle development)
• Stated differently, transparency and predictability are about
understanding the alternatives that the overall system
sees, and what it is capable of seeing
OVERFITTING
• Main Risk:
• Some AI models are no more compressed than the training
data, especially with DNNs
• A warning flag should usually go up if deep learners are
being used with training data points that number only in the
thousands
• Pragmatic criteria are needed to see if a more general
model is being built, to build assurance of achieving robust
predictive value in forward use of the model
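As an illustration of what "pragmatic criteria" could look like, the sketch below raises two flags: one when the model has more parameters than there are values in the training set (so it is not compressing the data), and one when the score on held-out data trails the training score by a wide margin. The function names and the `max_gap` default of 0.05 are assumptions for illustration, not established thresholds.

```python
from typing import List


def compression_ratio(n_samples: int, n_features: int, n_parameters: int) -> float:
    """Training values per model parameter; below 1.0 means no compression."""
    return (n_samples * n_features) / n_parameters


def overfitting_flags(
    n_samples: int,
    n_features: int,
    n_parameters: int,
    train_score: float,
    holdout_score: float,
    max_gap: float = 0.05,  # assumed tolerance, choose per application
) -> List[str]:
    """Return human-readable warnings; an empty list means no flag was raised."""
    flags = []
    if compression_ratio(n_samples, n_features, n_parameters) < 1.0:
        flags.append("model has more parameters than training values")
    if train_score - holdout_score > max_gap:
        flags.append("large gap between training and held-out scores")
    return flags


# A deep learner trained on only a few thousand points.
print(overfitting_flags(5_000, 20, 1_000_000, train_score=0.99, holdout_score=0.80))
# A far smaller model trained on far more data.
print(overfitting_flags(1_000_000, 20, 10_000, train_score=0.91, holdout_score=0.90))
```

Neither flag proves overfitting on its own; they are meant as prompts to look more closely.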
THEORY OF MINIMUM
INFORMATION
• Theory of Minimum Information Thought Experiment
• Keep an eye on what subset of information would yield
most of the benefit of the larger model
• Similarly, always look at what the model could or should be
if one or a few of the data inputs were to become
erroneous or spurious in performance
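The thought experiment can be approximated with a simple ablation: neutralize one input at a time (here by replacing it with its column mean) and see how much the error grows. Inputs whose removal barely matters are candidates to drop; inputs whose removal is catastrophic are single points of failure. The toy linear `model`, the data, and the function names below are hypothetical.

```python
from statistics import mean
from typing import Callable, Dict, List, Sequence, Tuple

Row = Tuple[float, ...]


def mean_abs_error(model: Callable[[Row], float], rows: Sequence[Row],
                   targets: Sequence[float]) -> float:
    return mean(abs(model(r) - t) for r, t in zip(rows, targets))


def ablation_impact(model: Callable[[Row], float], rows: Sequence[Row],
                    targets: Sequence[float]) -> Dict[int, float]:
    """Increase in error when each input column is replaced by its mean."""
    baseline = mean_abs_error(model, rows, targets)
    impacts = {}
    for j in range(len(rows[0])):
        col_mean = mean(r[j] for r in rows)
        ablated: List[Row] = [r[:j] + (col_mean,) + r[j + 1:] for r in rows]
        impacts[j] = mean_abs_error(model, ablated, targets) - baseline
    return impacts


# Toy model in which input 0 carries nearly all of the signal.
model = lambda r: 3.0 * r[0] + 0.1 * r[1]
rows = [(1.0, 5.0), (2.0, 1.0), (3.0, 4.0), (4.0, 2.0)]
targets = [model(r) for r in rows]

impacts = ablation_impact(model, rows, targets)
print(impacts)  # input 0 has a far larger impact than input 1
```

Mean replacement is only one way to neutralize an input; injecting noise or stale values would probe the "erroneous or spurious" case more directly.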
BREADTH OF DATA AND
RELATIONSHIPS REFLECTED
IN THE DATA
• The training data and model need to reflect temporal,
spatial and causal relationships
• Otherwise, the model will only find correlations, and be at
greater risk of generating nonsensical errors
BREADTH OF DATA AND
RELATIONSHIPS REFLECTED
IN THE DATA
• The power zone of AI is when the data is rich enough to
detect valid patterns that exploit volumes of information
far beyond what humans can process
• The Gold Standard: When rare but decisive patterns can
be detected and acted upon
• The delicate balance: AI tends toward generalizations,
when some systems need to allow for idiosyncratic
behavior
• In some cases, especially when a lot of training data is
available, it can be useful to not filter data too early
based on preconceived notions of expected model
function; the rawest data may contain useful information
which has previously escaped notice
SOCIAL AND CULTURAL
ISSUES
• Models for social processes are subject to individual and
group norms which vary greatly across the world
• Beware the classic trap of assuming that others have the
same values or conventions as we do
• Social and cultural processes tend to be underspecified,
i.e. there is a larger unstated context than what appears in
the data
• Often then, there needs to be a top-down inference
engine to estimate the larger context to get to a high
level of overall performance; purely bottom-up efforts to
construct larger predictions will often not perform as well
ABRIDGED REPERTOIRE OF
MODELLING AND MODEL
EXPOSITION TECHNIQUES
• Having AI developers think out loud (talking or writing)
• Transforming words to images
• Makes explicit things that were often implicit or hidden
• Aids further inferences to enhance model value
• Metaphors, analogies, benchmarks, adjacencies, and thought experiments
• Increasing the availability of candidate representations can help a lot, expanding variety in
reference frames
• Comparative study helps raise the level of model abstraction
• Analysing the well-documented experiences of AI development in self-driving vehicles,
facial recognition, gaming, and medical diagnostics can provide a lot of instructive value
for development practices in other fields
• Induction from observing the AI closely, including brute force parameter variation
• Differential or difference equations, and systems thereof
• Statistical & clustering analyses and probabilistic inference
• Simplified simulation
• Longitudinal analysis
• Articulating not just core assumptions and dependencies, but boundary conditions as
well
REPRESENTATIONS,
COMBINATIONS AND MODEL
EVOLUTION STEPS
• Isomorphic representations can provide clarity and useful
simplicity, revealing unwarranted complexity
• Because representations are used in design and
development as much for discovery as verification, creating
multiple representations also tends to help model creators go
beyond disambiguation to extend knowledge
• Combinations of techniques are often powerful to reveal
explanation and greater predictive value from models
• With the insight benefit from additional representations,
sometimes it becomes necessary to go back several
evolutionary steps of development before going forward
again on a new evolutionary branch to attain ultimately
higher performance
TRANSPARENCY,
PREDICTABILITY AND
DIVERSITY
• Taking advantage of these techniques for improving
transparency and predictability of AIs requires a respect
for and openness to the potential contribution of
intellectual diversity in model formation, testing and
optimization
• Useful advantages come from learning or knowing relevant
things that others don’t know or appreciate
• The cumulative effect of such gains can become
formidable over time
• Often though, some additional leadership effort is required
to gain full benefit, since interdisciplinary perspectives as a
source of intellectual diversity can become more
challenging to incorporate over time as intra-disciplinary
norms and dialects increasingly specialize
THE POWER OF ASKING
WHY
• There is frequently a temptation in the interest of time to
rely on correlations, without causal understanding of why
the model performs the way it does
• Correlation without causality has initial benefits for allowing
hypothesis-free forms of exploration
• Useful findings can be obtained from such systems-level
views of what is possible and actionable, rather than an
initial preoccupation with reductionism
• Over time, however, relying on models which perform well
without anyone understanding why is risky, or even dangerous
• Doing so largely sets aside the scientific method, all that it
has accomplished and will continue to achieve
CAUSAL UNDERSTANDING
AND BREAKTHROUGH
INSIGHT POTENTIAL
• Correlations and a functioning, if opaque, AI model should be the
beginning of the causal discovery process, not the end
• The evidence of unexpected but statistically significant correlations
from erstwhile hidden signals and relationships should be used to
drive causal inquiry
• Causal investigation is the only way to reliably separate out
confounding factors and better specify conditions of robust
prospective model use
• The added benefit of the drive toward causal understanding:
• Such insight usually paves the way to gain much further
differentiating, defensible IP and competitively important technology
of lasting significance
• The field of radiomics in diagnostic medical imaging provides a
useful case study for some forms of causal discovery practices
CAUTION ABOUT MAKING DEEP
LEARNERS MORE TRANSPARENT
AND PREDICTABLE
• Be cautious about accepting proposals to try to fix the
explainability difficulties of one opaque model through
the use of another
• Domain experts can be tempted to overreach in trying to
solve the problems of their technological field through
further application of the same technology
• While powerful AI technology can be built this way (such
as GANs), explainability and causal understanding often
require a different approach and greater technological
breadth
BACKSTOP TECHNIQUE
• If you’re really stuck on how to work toward greater
transparency and predictability, then as in just about every
other field of technology, thorough testing at component,
unit, integration and system levels can reveal a lot about
a model
• The known edges and boundary regions of performance,
as well as vulnerabilities, sensitivities, and potential for
dysfunction are always illuminating to build toward
functional understanding
• Knowing not just the region of safe operation, but the locus
of performance beyond which a technology fails, and how
it behaves in the zone of operation approaching the failure
realm is usually instructive to build toward transparency
and predictability
DATA QUALITY AND IMPROVEMENT
AI LEARNS WHAT IS
IN THE DATA
• Make sure the data conveys what the AI is to learn
• This is different from the software-centric view of tuning the
algorithm until desired performance is achieved
• Data quality becomes extremely important with small
training sets
• Noise in small training sets can greatly hamper model
performance
• Raw training data based on human judgement may reflect
satisficing or historical values which will not be
acceptable in forward-looking or machine-delegated
decision making
OBJECTIVE DATA
QUALITY MEASURES
• Develop objective measures of data quality, and have a
process for continuous improvement of data quality
• Useful forcing exercise:
• Hold the algorithm constant for a while, and see how much
improvement can be achieved purely through data quality
enhancement
DATA LABELING AND
VERSION/CONFIGURATION
CONTROL
• Labeling is a common source of data quality issues,
compromising the representation of ground truth
• Spot checking, statistical sampling, and increased
checking where errors have recently been detected can
help improve labeling
• Improvement efforts should look toward error proofing the
process for generating labels in the first place, not fixing
errors retroactively on a sustained basis
• Be especially wary if data, labels and models were built
when people were harried or in a rush
• Data benefits from similar version and configuration
control concepts as code
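The spot-checking bullet above could be sketched as a sampler that audits every batch at a low base rate and steps up the rate for batches where errors were recently found. The record layout (`id`, `batch`), the function name `audit_sample`, and the default rates are assumptions made for illustration.

```python
import random
from typing import Dict, List, Set


def audit_sample(
    items: List[Dict],
    recent_error_batches: Set[str],
    base_rate: float = 0.05,     # assumed routine audit rate
    boosted_rate: float = 0.25,  # assumed rate where errors were recently found
    seed: int = 0,
) -> List[Dict]:
    """Pick labeled items for manual review, oversampling suspect batches."""
    rng = random.Random(seed)  # seeded so an audit can be reproduced
    chosen = []
    for item in items:
        suspect = item["batch"] in recent_error_batches
        rate = boosted_rate if suspect else base_rate
        if rng.random() < rate:
            chosen.append(item)
    return chosen


items = [{"id": i, "batch": "b1" if i < 5 else "b2"} for i in range(10)]

# Extreme rates make the behavior easy to see: only the suspect batch is audited.
picked = audit_sample(items, {"b2"}, base_rate=0.0, boosted_rate=1.0)
print([p["id"] for p in picked])
```

Recording the seed alongside the dataset version ties each audit to the exact data and labels it examined.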
THICK VS. THIN TAILED
DISTRIBUTIONS
• The most constructive approaches to data cleaning and
error correction are greatly influenced by whether the data
distribution has thick or thin tails
• Thick tails are often a sign of multiple generative
mechanisms aggregated in the data
• Thin tails are typically associated with a single generative
mechanism producing the data
• The most efficacious data quality improvements vary
considerably depending on which of the tail phenomena is
the case
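A rough first check for tail thickness is excess kurtosis, which is about zero for a normal distribution and grows as more of the variance comes from extreme values. The sketch below uses only the standard library; the two toy samples are invented, and kurtosis alone is a crude indicator that is unstable on small samples, so it should be read alongside plots of the data.

```python
from statistics import fmean
from typing import Sequence


def excess_kurtosis(values: Sequence[float]) -> float:
    """Population excess kurtosis: fourth central moment / variance^2, minus 3."""
    m = fmean(values)
    m2 = fmean([(v - m) ** 2 for v in values])
    m4 = fmean([(v - m) ** 4 for v in values])
    return m4 / (m2 ** 2) - 3.0


# Evenly spread values: thinner tails than a normal distribution.
thin = [-2, -1, 0, 1, 2]
# Mostly quiet data with a few extreme values, as if a second mechanism intrudes.
thick = [0] * 18 + [10, -10]

print(excess_kurtosis(thin))   # negative
print(excess_kurtosis(thick))  # strongly positive
```

A strongly positive value would be a prompt to ask whether the extreme points come from a separate generative mechanism before treating them as errors to be cleaned away.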
HIGH IMPACT CASES
AND DATA DEPTH
• Identify important cases and make sure the data includes
them with sufficient statistical depth
• Don’t forget about reserved test cases, not just training
data
• Map out happy paths, acceptable paths, exception cases
and error conditions
• Fault tree and failure propagation analysis can help
incorporate systems thinking in the data and the resulting
model
• More generally, understand the depth of the data over the
entire range of inputs, not just sub-ranges
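Depth over the whole input range can be checked by binning an input and listing the bins that hold too few examples. This is a sketch for a single numeric input; the bin edges, the `min_count` value, and the function names are hypothetical choices.

```python
from bisect import bisect_right
from typing import List, Sequence, Tuple


def bin_counts(values: Sequence[float], edges: Sequence[float]) -> List[int]:
    """Count values per bin; edges must be sorted, and the last bin is closed."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        if edges[0] <= v <= edges[-1]:
            i = min(bisect_right(edges, v) - 1, len(counts) - 1)
            counts[i] += 1
    return counts


def thin_ranges(values: Sequence[float], edges: Sequence[float],
                min_count: int) -> List[Tuple[float, float, int]]:
    """Return (low, high, count) for every bin with fewer than min_count examples."""
    counts = bin_counts(values, edges)
    return [(edges[i], edges[i + 1], c) for i, c in enumerate(counts) if c < min_count]


values = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 55, 95]
edges = [0, 25, 50, 75, 100]
print(thin_ranges(values, edges, min_count=2))
```

The same check applies to reserved test data, since a test set that is thin in a sub-range cannot validate the model there.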
FURTHER DATA QUALITY
IMPROVEMENT CONSIDERATIONS
• Arguably, data practices tend to be much more vertically
(application) specific, whereas algorithms tend to be more
horizontal in nature
• Ask what attributes of the data need to be monitored in
production to see if model/concept drift is taking place
• Instrumenting the distribution of production data along with
cohort and trend analysis can often be used to help signal
if drift may be occurring
• Do sensitivity analyses to know what kinds of data the
model performs poorly on, and build mechanisms to go
acquire or generate more of that data
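One common way to instrument the distribution of production data is the Population Stability Index (PSI), which compares the share of values per bin between a reference sample and a production sample. The slides do not name a specific metric, so PSI is a swapped-in example; the bin edges and the `eps` floor are assumptions, and the often-quoted reading that values above roughly 0.25 indicate a significant shift is a rule of thumb rather than a standard.

```python
import math
from typing import List, Sequence


def bin_shares(values: Sequence[float], edges: Sequence[float], eps: float) -> List[float]:
    """Share of values per bin; edges are interior cut points, sorted ascending."""
    counts = [0] * (len(edges) + 1)
    for v in values:
        counts[sum(v >= e for e in edges)] += 1
    # Floor empty bins at eps so the logarithm below stays finite.
    return [max(c / len(values), eps) for c in counts]


def psi(reference: Sequence[float], production: Sequence[float],
        edges: Sequence[float], eps: float = 1e-6) -> float:
    """Population Stability Index between two samples over shared bins."""
    ref = bin_shares(reference, edges, eps)
    prod = bin_shares(production, edges, eps)
    return sum((p - r) * math.log(p / r) for r, p in zip(ref, prod))


reference = [1, 2, 3, 4, 5, 6, 7, 8]
shifted = [5, 6, 7, 8, 9, 10, 11, 12]
edges = [4.5, 8.5]

print(psi(reference, reference, edges))  # identical samples give 0.0
print(psi(reference, shifted, edges))    # large value: the distribution has moved
```

Computing the index per cohort and tracking it over time lines up with the cohort and trend analysis mentioned above.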
FURTHER
DISCUSSION
For additional discussion about improving AI development:
dave.litwiller@communitech.ca
APPENDIX: REGULATORY AND
COMPLIANCE ISSUES
CHANGE TRACKING
AND USER SEGMENTS
• If the AI will be used in industries where there are
regulatory and legal compliance issues for the model:
• Change tracking and performance change monitoring are
typically required
• Performance against baseline and historical performance
• Assessment of occurrences and rates of false positives
and false negatives
• Performance of sub-types of users or patients
• Averages can conceal a lot of things, especially with thick
or long tails to the distributions where multiple generative
mechanisms may be at work
• In general, the better people can provably explain why a
model performs and changes the way it does, the better
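To illustrate how averages can hide sub-group behavior, the sketch below computes false positive and false negative rates per user segment for a binary classifier. The record layout of `(segment, actual, predicted)` and the function name are assumptions made for the example.

```python
from collections import defaultdict
from typing import Dict, Iterable, Optional, Tuple

Record = Tuple[str, bool, bool]  # (segment, actual, predicted)


def error_rates_by_segment(records: Iterable[Record]) -> Dict[str, Dict[str, Optional[float]]]:
    """False positive rate and false negative rate for each segment."""
    counts = defaultdict(lambda: {"pos": 0, "neg": 0, "fp": 0, "fn": 0})
    for segment, actual, predicted in records:
        c = counts[segment]
        if actual:
            c["pos"] += 1
            if not predicted:
                c["fn"] += 1
        else:
            c["neg"] += 1
            if predicted:
                c["fp"] += 1
    return {
        seg: {
            # None marks a rate that is undefined because the segment has no such cases.
            "fpr": c["fp"] / c["neg"] if c["neg"] else None,
            "fnr": c["fn"] / c["pos"] if c["pos"] else None,
        }
        for seg, c in counts.items()
    }


records = [
    ("A", True, True), ("A", False, False), ("A", False, False), ("A", True, True),
    ("B", True, False), ("B", False, True),
]
rates = error_rates_by_segment(records)
print(rates)  # segment A is error-free, segment B is wrong on every case
```

Pooled over both segments, the error rate here is one in three, which conceals that every error falls on segment B.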

Managerial Decentralization in Growth Stage Technology Companies - Dave Litwi...Dave Litwiller
 
Business Review Meetings in Growth Stage Technology Companies - Dave Litwille...
Business Review Meetings in Growth Stage Technology Companies - Dave Litwille...Business Review Meetings in Growth Stage Technology Companies - Dave Litwille...
Business Review Meetings in Growth Stage Technology Companies - Dave Litwille...Dave Litwiller
 
Systems Engineering - Dave Litwiller - March 2019
Systems Engineering -  Dave Litwiller - March 2019Systems Engineering -  Dave Litwiller - March 2019
Systems Engineering - Dave Litwiller - March 2019Dave Litwiller
 
Patent Portfolios and Invention Sessions in Growth Stage Tech Companies - Dav...
Patent Portfolios and Invention Sessions in Growth Stage Tech Companies - Dav...Patent Portfolios and Invention Sessions in Growth Stage Tech Companies - Dav...
Patent Portfolios and Invention Sessions in Growth Stage Tech Companies - Dav...Dave Litwiller
 


Improving AI Development - Dave Litwiller - Jan 11 2022 - Public

  • 1. Dave Litwiller Jan 11, 2022 IMPROVING AI DEVELOPMENT TOOLS, TIPS & TECHNIQUES TO ENHANCE OUTCOMES
  • 2. Path Dependency of Models Making Deep Learners More Transparent and Predictable Data Quality and Improvement 2 INTRODUCTION
  • 3. BACKGROUND • A conversational tour through some things I’ve learned in helping scale-up stage client companies improve their AI development practices, especially where deep neural nets (DNNs) are in use • YMMV
  • 5. INITIAL TRAINING DATA • Where initial training data creates broad parameters for a model (a.k.a. pre-training), and further training data will create greater detail or precision: • +ve: compresses amount of pre-training data • -ve: errors or hidden variables can hamper performance of the ultimate model • Initial training data needs to be selected with care for representativeness of the larger data set • If generative changes in the process being modelled will take place over time, the model and pre-training will need to evolve
  • 6. REALITY GAPS AND OUTLIER EVENTS • OUTLIER EVENTS • A way is needed to respond gracefully to events that were not well captured in the training data • A simple, common approach is a voting system of multiple models, to determine whether the outlier event is likely correct or an error/exception • REALITY GAPS • Watch out for the potential for system failure in real-world use if the training environment or data are simplified representations of reality
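The voting approach mentioned on this slide might be sketched roughly as follows. This is a minimal illustration, not the author's implementation: the function name `vote`, the `min_agreement` parameter and its 0.6 default are all assumptions made for the example.

```python
from collections import Counter

def vote(predictions, min_agreement=0.6):
    """Combine per-model predictions into (label, trusted).

    `trusted` is False when the winning label's share of the votes is
    below `min_agreement` (a hypothetical threshold), which this sketch
    treats as a hint that the input may be an outlier or an error.
    """
    if not predictions:
        raise ValueError("need at least one prediction")
    label, count = Counter(predictions).most_common(1)[0]
    share = count / len(predictions)
    return label, share >= min_agreement

# Two of three hypothetical models agree, so the result is accepted.
agreed = vote(["cat", "cat", "dog"])
# The models split three ways, so the event is flagged for exception handling.
split = vote(["cat", "dog", "bird"])
print(agreed[1], split[1])
```

In practice the untrusted branch would route the event to a fallback: a human reviewer, a conservative default, or a logged exception.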
  • 7. GENERATIVE MECHANISM CHANGE • Periodically revisit whether generative mechanisms or standards of practice have changed, and whether the model needs to be rebuilt • The conditions under which training data come to be are often different from those in wider, forward-looking usage • Classic dilemma: • On any given day, it is usually easier to keep coaxing along the current model, rather than doing a larger redesign and rebuild • It can help objectivity in the heat of battle to have established, in advance, threshold criteria which will signal that sufficient change has likely occurred to trigger a model rebuild
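One way to make such pre-agreed threshold criteria concrete is to write them down as data before they are needed. The sketch below is only an assumption about what the criteria could look like: the class `RebuildCriteria`, its fields, and the rule "breach for N consecutive periods" are hypothetical choices, not something the slides prescribe.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class RebuildCriteria:
    # All three fields are illustrative assumptions.
    max_error_rate: float         # absolute ceiling on the error rate
    max_relative_increase: float  # allowed growth over the baseline error
    min_consecutive_periods: int  # how long a breach must persist

def should_rebuild(baseline_error, recent_errors, criteria):
    """Return True when the last N periods all exceed the tighter limit."""
    n = criteria.min_consecutive_periods
    if len(recent_errors) < n:
        return False
    limit = min(
        criteria.max_error_rate,
        baseline_error * (1 + criteria.max_relative_increase),
    )
    return all(err > limit for err in recent_errors[-n:])

criteria = RebuildCriteria(
    max_error_rate=0.10,
    max_relative_increase=0.5,
    min_consecutive_periods=3,
)
# With a baseline error of 0.04 the effective limit is about 0.06.
print(should_rebuild(0.04, [0.05, 0.07, 0.08, 0.09], criteria))
```

Freezing the dataclass reflects the point of the slide: the criteria are agreed before the pressure arrives and are not adjusted in the moment.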
  • 8. MAKING DEEP LEARNERS MORE TRANSPARENT & PREDICTABLE
  • 9. ROBUSTNESS • Often, slightly lower performing models perform better over time as conditions change • Akin to satisficing vs. optimizing; accuracy vs. precision • Design dividend: Simpler representations lend themselves more easily to heuristics to foster design intuition for developers and maintainers of models • With somewhat simpler models, the impact of change can be better anticipated, whether in data, in a single model, or in a system of multiple models • Further benefit of simpler models: • Better explain-ability, and understanding of the major influences in the model, as well as the combinations of rare or decisive events which can make the model tip from one output state to another • To boot, overly optimized models are often at greater risk of failure or slowing the larger systems in which they operate if usage conditions change unexpectedly
  • 10. THIRD PARTY PERSPECTIVE • 3rd Party View Diagnostic Tools • It is often good practice to build a visualization of what the system “sees” looking out at its constituent parts, its interfaces (both internal and external), and external environment • A 3rd party view tends to reveal a lot about the perspectives which shaped how the system was built, what influences it, what it ignores or downplays, and which combinations of circumstances cause sharp changes in performance
  • 11. THIRD PARTY PERSPECTIVE • 3rd Party View Diagnostic Tools (cont’d) • Especially in systems AI (multiple constituent models), the real issue often comes down to understanding how bottom-up performance of component models affects other parts of the system and overall system performance (e.g., a common issue in self-driving vehicle development) • Stated differently, transparency and predictability are about understanding the alternatives that the overall system sees, and what it is capable of seeing
  • 12. OVERFITTING • Main Risk: • Some AI models are no more compressed than the training data, especially with DNNs • A warning flag should usually go up if deep learners are being used with training data points that number only in the thousands • Pragmatic criteria are needed to see if a more general model is being built, to build assurance of achieving robust predictive value in forward use of the model
  • 13. THEORY OF MINIMUM INFORMATION • Theory of Minimum Information Thought Experiment • Keep an eye on what subset of information would yield most of the benefit of the larger model • Similarly, always look at what the model could or should be if one or a few of the data inputs were to become erroneous or spurious in performance
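The thought experiment above can be turned into a rough experiment with greedy forward selection. This is a sketch under stated assumptions: `score` is a hypothetical callback that returns the model's performance when it is given only a particular subset of inputs, and `target_fraction` is an arbitrary cut-off for "most of the benefit". Greedy selection is a heuristic and will not always find the smallest possible subset.

```python
def smallest_sufficient_subset(features, score, target_fraction=0.85):
    """Greedily add inputs until the subset reaches `target_fraction`
    of the score obtained with all inputs.

    `score` is an assumed callback: frozenset of feature names -> number.
    """
    full = score(frozenset(features))
    chosen = set()
    remaining = set(features)
    while remaining and score(frozenset(chosen)) < target_fraction * full:
        # Pick the input that helps most; sorting keeps ties deterministic.
        best = max(
            sorted(remaining),
            key=lambda f: score(frozenset(chosen | {f})),
        )
        chosen.add(best)
        remaining.remove(best)
    return chosen

# Toy stand-in for a real evaluation: each input adds a fixed benefit.
benefit = {"a": 60, "b": 30, "c": 5, "d": 5}

def toy_score(subset):
    return sum(benefit[f] for f in subset)

print(sorted(smallest_sufficient_subset(benefit, toy_score)))
```

The same `score` callback can be reused for the second bullet: evaluating the model with one input removed or corrupted shows how much depends on that input staying healthy.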
  • 14. BREADTH OF DATA AND RELATIONSHIPS REFLECTED IN THE DATA • The training data and model need to reflect temporal, spatial and causal relationships • Otherwise, the model will only find correlations, and be at greater risk of generating nonsensical errors
  • 15. BREADTH OF DATA AND RELATIONSHIPS REFLECTED IN THE DATA • The power zone of AI is when the data is rich enough to detect valid patterns that exploit volumes of information far beyond what humans can process • The Gold Standard: When rare but decisive patterns can be detected and acted upon • The delicate balance: AI tends toward generalizations, when some systems need to allow for idiosyncratic behavior • In some cases, especially when a lot of training data is available, it can be useful to not filter data too early based on preconceived notions of expected model function; the rawest data may contain useful information which has previously escaped notice
  • 16. SOCIAL AND CULTURAL ISSUES • Models for social processes are subject to individual and group norms which vary greatly across the world • Beware the classic trap of assuming that others have the same values or conventions as we do • Social and cultural processes tend to be underspecified, i.e. there is a larger unstated context than what appears in the data • Often, then, there needs to be a top-down inference engine to estimate the larger context to get to a high level of overall performance; purely bottom-up efforts to construct larger predictions will often not perform as well
  • 17. ABRIDGED REPERTOIRE OF MODELLING AND MODEL EXPOSITION TECHNIQUES • Having AI developers think out loud (talking or writing) • Transforming words to images • Makes explicit things that were often implicit or hidden • Aids further inferences to enhance model value • Metaphors, analogies, benchmarks, adjacencies, and thought experiments • Increasing the availability of candidate representations can help a lot, expanding variety in reference frames • Comparative study helps raise the level of model abstraction • Analysing the well-documented experiences of AI development in self-driving vehicles, facial recognition, gaming, and medical diagnostics can provide a lot of instructive value for development practices in other fields • Induction from observing the AI closely, including brute force parameter variation • Differential or difference equations, and systems thereof • Statistical & clustering analyses and probabilistic inference • Simplified simulation • Longitudinal analysis • Articulating not just core assumptions and dependencies, but boundary conditions as well
  • 18. REPRESENTATIONS, COMBINATIONS AND MODEL EVOLUTION STEPS • Isomorphic representations can provide clarity and useful simplicity, revealing unwarranted complexity • Because representations are used in design and development as much for discovery as verification, creating multiple representations also tends to help model creators go beyond disambiguation to extend knowledge • Combinations of techniques are often powerful to reveal explanation and greater predictive value from models • With the insight benefit from additional representations, sometimes it becomes necessary to go back several evolutionary steps of development before going forward again on a new evolutionary branch to attain ultimately higher performance
  • 19. TRANSPARENCY, PREDICTABILITY AND DIVERSITY • Taking advantage of these techniques for improving transparency and predictability of AIs requires a respect for and openness to the potential contribution of intellectual diversity in model formation, testing and optimization • Useful advantages come from learning or knowing relevant things that others don’t know or appreciate • The cumulative effect of such gains can become formidable over time • Often though, some additional leadership effort is required to gain full benefit, since interdisciplinary perspectives as a source of intellectual diversity can become more challenging to incorporate over time as intra-disciplinary norms and dialects increasingly specialize
  • 20. THE POWER OF ASKING WHY • There is frequently a temptation in the interest of time to rely on correlations, without causal understanding of why the model performs the way it does • Correlation without causality has initial benefits for allowing hypothesis-free forms of exploration • Useful findings can be obtained from such systems-level views of what is possible and actionable, rather than an initial preoccupation with reductionism • Over time, however, reliance on models which perform without understanding why is risky, or even dangerous • Doing so largely sets aside the scientific method, and all that it has accomplished and will continue to achieve
  • 21. CAUSAL UNDERSTANDING AND BREAKTHROUGH INSIGHT POTENTIAL • Correlations and a functioning, if opaque, AI model should be the beginning of the causal discovery process, not the end • The evidence of unexpected but statistically significant correlations from erstwhile hidden signals and relationships should be used to drive causal inquiry • Causal investigation is the only way to reliably separate out confounding factors and better specify conditions of robust prospective model use • The added benefit of the drive toward causal understanding: • Such insight usually paves the way to gain much further differentiating, defensible IP and competitively important technology of lasting significance • The field of radiomics in diagnostic medical imaging provides a useful case study for some forms of causal discovery practices
  • 22. CAUTION ABOUT MAKING DEEP LEARNERS MORE TRANSPARENT AND PREDICTABLE • Be cautious about accepting proposals to try to fix the explain-ability difficulties of one opaque model through the use of another • Domain experts can be tempted to overreach in trying to solve the problems of their technological field through further application of the same technology • While powerful AI technology can be built this way (such as GANs), explain-ability and causal understanding often require a different approach and greater technological breadth
  • 23. BACKSTOP TECHNIQUE • If you’re really stuck on defining a path toward greater transparency and predictability, thorough testing at component, unit, integration and system levels can reveal a lot about a model, as in just about every other field of technology • The known edges and boundary regions of performance, as well as vulnerabilities, sensitivities, and potential for dysfunction, are always illuminating in building toward functional understanding • Knowing not just the region of safe operation, but also the locus of performance beyond which a technology fails, and how it behaves in the zone of operation approaching the failure realm, is usually instructive in building toward transparency and predictability
  • 24. DATA QUALITY AND IMPROVEMENT
  • 25. AI LEARNS WHAT IS IN THE DATA • Make sure the data conveys what the AI is to learn • This is different from the software centric view of tuning the algorithm until desired performance is achieved • Data quality becomes extremely important with small training sets • Noise in small training sets can greatly hamper model performance • Raw training data based on human judgement may reflect satisficing or historical values which will not be acceptable in forward looking or machine-delegated decision making
  • 26. OBJECTIVE DATA QUALITY MEASURES • Develop objective measures of data quality, and have a process for continuous improvement of data quality • Useful forcing exercise: • Hold the algorithm constant for a while, and see how much improvement can be achieved purely through data quality enhancement
  • 27. DATA LABELING AND VERSION/CONFIGURATION CONTROL • Labeling is a common source of data quality issues, compromising the representation of ground truth • Spot checking, statistical sampling, and increased checking where errors have recently been detected can help improve labeling • Improvement efforts should look toward error proofing the process for generating labels in the first place, not fixing errors retroactively on a sustained basis • Be especially wary if data, labels and models were built when people were harried or in a rush • Data benefits from similar version and configuration control concepts as code
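The spot checking and statistical sampling mentioned above could be sketched as a random audit followed by an interval estimate of the labeling error rate. The function names, the fixed seed and the choice of a 95% Wilson score interval are assumptions for illustration; the slides do not prescribe a particular estimator.

```python
import math
import random

def pick_audit_sample(n_items, k, seed=0):
    """Choose k distinct item indices to re-check by hand (seeded for repeatability)."""
    return random.Random(seed).sample(range(n_items), k)

def wilson_interval(errors, n, z=1.96):
    """Wilson score interval for an error rate; z=1.96 gives roughly 95%."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = errors / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half

# Audit 100 of 5000 hypothetical labels; suppose 10 turn out to be wrong.
indices = pick_audit_sample(5000, 100)
low, high = wilson_interval(errors=10, n=100)
print(len(indices), round(low, 3), round(high, 3))
```

A wide interval is itself useful information: it signals that the audit sample is too small to support a claim about label quality.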
  • 28. THICK VS. THIN TAILED DISTRIBUTIONS • The most constructive approaches to data cleaning and error correction are greatly influenced by whether the data distribution has thick or thin tails • Thick tails are often a sign of multiple generative mechanisms aggregated in the data • Thin tails are typically associated with a single generative mechanism producing the data • The most efficacious data quality improvements vary considerably depending on which of the tail phenomena is the case
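A first, rough check on tail weight is sample excess kurtosis, sketched below with the standard library only. Treat it as a screening aid under assumptions: the function name is hypothetical, the estimator is the simple population form, and sample kurtosis is itself unstable on genuinely heavy-tailed data, so a histogram or quantile plot is a sensible companion.

```python
from statistics import fmean

def excess_kurtosis(values):
    """Population excess kurtosis: about 0 for a normal distribution,
    positive for heavier tails, negative for lighter ones."""
    mean = fmean(values)
    m2 = fmean((x - mean) ** 2 for x in values)
    if m2 == 0:
        raise ValueError("values have zero variance")
    m4 = fmean((x - mean) ** 4 for x in values)
    return m4 / (m2 * m2) - 3.0

# Evenly spread values have lighter tails than a normal distribution.
flat = [1, 2, 3, 4, 5]
# Mostly zeros with two extreme values: a crude stand-in for thick tails.
spiky = [0] * 98 + [10, -10]
print(excess_kurtosis(flat) < 0, excess_kurtosis(spiky) > 0)
```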
  • 29. HIGH IMPACT CASES AND DATA DEPTH • Identify important cases and make sure the data includes them with sufficient statistical depth • Don’t forget about reserved test cases, not just training data • Map out happy paths, acceptable paths, exception cases and error conditions • Fault tree and failure propagation analysis can help incorporate systems thinking in the data and the resulting model • More generally, understand the depth of the data over the entire range of inputs, not just sub-ranges
  • 30. FURTHER DATA QUALITY IMPROVEMENT CONSIDERATIONS • Arguably, data practices tend to be much more vertically (application) specific, whereas algorithms tend to be more horizontal in nature • Ask what attributes of the data need to be monitored in production to see if model/concept drift is taking place • Instrumenting the distribution of production data along with cohort and trend analysis can often be used to help signal if drift may be occurring • Do sensitivity analyses to know what kinds of data the model performs poorly on, and build mechanisms to go acquire or generate more of that data
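Instrumenting the distribution of production data can start with a single number per feature. The sketch below uses the population stability index (PSI) as one possible drift signal; the choice of PSI, the bin edges and the `eps` floor for empty bins are assumptions made for the example, not a recommendation from the slides.

```python
import math
from bisect import bisect_right

def psi(reference, production, edges, eps=1e-6):
    """Population stability index between two samples of one feature.

    `edges` are sorted bin boundaries; values below the first edge and
    at or above the last edge fall into the two outer bins.
    """
    def shares(values):
        counts = [0] * (len(edges) + 1)
        for v in values:
            counts[bisect_right(edges, v)] += 1
        # Floor empty bins at eps so the logarithm stays defined.
        return [max(c / len(values), eps) for c in counts]

    p = shares(reference)
    q = shares(production)
    return sum((qi - pi) * math.log(qi / pi) for pi, qi in zip(p, q))

training = [1, 2, 2, 3, 3, 3, 4, 4, 5]
shifted = [3, 4, 4, 5, 5, 5, 6, 6, 7]
edges = [2, 4, 6]
print(psi(training, training, edges), psi(training, shifted, edges) > 0)
```

Computing the same index per cohort and over time gives the cohort and trend view the slide refers to.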
  • 31. FURTHER DISCUSSION For additional discussion about improving AI development: dave.litwiller@communitech.ca
  • 33. CHANGE TRACKING AND USER SEGMENTS • If the AI will be used in industries where there are regulatory and legal compliance issues for the model: • Change tracking and performance change monitoring are typically required • Performance against baseline and historical performance • Assessment of occurrences and rates of false positives and false negatives • Performance of sub-types of users or patients • Averages can conceal a lot of things, especially with thick or long tails to the distributions where multiple generative mechanisms may be at work • In general, the better people can provably explain why a model performs and changes the way it does, the better
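The point that averages can conceal sub-group behaviour can be illustrated by computing false positive and false negative rates per segment. This is a minimal sketch: the record layout `(segment, actual, predicted)` and the function name are assumptions for the example.

```python
from collections import defaultdict

def error_rates_by_segment(records):
    """Map each segment to (false_positive_rate, false_negative_rate).

    `records` is an iterable of (segment, actual, predicted) with boolean
    outcomes. A rate is None when its denominator is zero.
    """
    counts = defaultdict(lambda: {"tp": 0, "fp": 0, "tn": 0, "fn": 0})
    for segment, actual, predicted in records:
        key = ("t" if actual == predicted else "f") + ("p" if predicted else "n")
        counts[segment][key] += 1

    rates = {}
    for segment, c in counts.items():
        negatives = c["fp"] + c["tn"]
        positives = c["fn"] + c["tp"]
        fpr = c["fp"] / negatives if negatives else None
        fnr = c["fn"] / positives if positives else None
        rates[segment] = (fpr, fnr)
    return rates

# Hypothetical data: segment A over-alerts, segment B misses true cases.
records = [
    ("A", True, True), ("A", False, False), ("A", False, False), ("A", False, True),
    ("B", True, False), ("B", True, False), ("B", True, True), ("B", False, False),
]
for segment, (fpr, fnr) in sorted(error_rates_by_segment(records).items()):
    print(segment, fpr, fnr)
```

Storing these per-segment rates alongside each model version gives a baseline against which later changes can be tracked.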