Department of Statistics
The Maharaja Sayajirao University of Baroda
Can we make computers learn?
Machine learning is a field of artificial intelligence that
uses statistical techniques to give computer systems the
ability to "learn" from data, without being explicitly
programmed.
- Arthur Samuel
Here, by learning we mean progressively improving
performance, through experience, on a specific task.
Machine Learning
Need of Algorithm
To solve a problem on a computer, we need an algorithm, for example for
• Searching
• Sorting
• Numerical computations
However, there are problems for which no explicit algorithm is known, such as
• Recognizing characters
• Distinguishing spam emails from legitimate ones
For such problems, the lack of knowledge is made up for by data.
Programming and Machine Learning
The difference between “Programming” and “Machine
Learning” is similar to that between “Mathematics” and
“Statistics”.
It is believed that there is an unknown underlying
process that explains the data we observe. By analyzing
the available data, we intend to understand this process
as much as possible. Though identifying the complete
process may not be possible, we can still detect certain
patterns or regularities. This detection is the function of
machine learning.
Machine Learning is essentially learning from data.
Learning Program
Definition: A computer program is said to learn from
experience E with respect to some class of tasks T and
performance measure P, if its performance at tasks in T,
as measured by P, improves with experience E.
To formulate a well-defined learning problem, we must
identify
• The class of tasks T
• The measure of performance P
• The source of experience E
-Tom M. Mitchell
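As a toy illustration of the definition, the sketch below fixes a task T (predict y at a new x), a performance measure P (mean absolute error on held-out inputs), and an experience E (a growing set of labelled examples). The data-generating function `true_process`, the nearest-neighbour learner, and all names are assumptions made for this example only; they are not part of the slides.

```python
def true_process(x):
    # Hypothetical "unknown" process that generates the data (assumed for illustration).
    return 2.0 * x + 1.0


def one_nn_predict(train, x):
    # Task T: predict y at x by copying the label of the closest training example.
    nearest = min(train, key=lambda pair: abs(pair[0] - x))
    return nearest[1]


def performance(train, test_xs):
    # Performance measure P: mean absolute error on held-out inputs (lower is better).
    errors = [abs(true_process(x) - one_nn_predict(train, x)) for x in test_xs]
    return sum(errors) / len(errors)


test_xs = [i / 100 for i in range(101)]
scores = []
for n in (3, 5, 9, 17):
    # Experience E: n labelled examples on an evenly spaced grid over [0, 1].
    train = [(i / (n - 1), true_process(i / (n - 1))) for i in range(n)]
    scores.append(performance(train, test_xs))

print(scores)  # the error shrinks as the experience grows
```

In Mitchell's terms, this program "learns" because P improves on T as E increases.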
ML Applications
• Alexa
Alexa is a virtual assistant developed by Amazon. It is
used in Echo smart speakers and many other devices.
• Watson
IBM Watson was created as a question-answering
computer system. It won the quiz show Jeopardy! in
2011 against legendary champions.
• Driverless cars
Machine learning algorithms are used extensively in
driverless cars.
Machine Learning Approaches
Supervised learning
In supervised learning, the goal is to learn to predict the
value of an outcome measure based on a number of
input measures, using examples for which the correct
output values are provided by a supervisor.
Supervised learning is like using data to build predictive
models.
ML Approaches
Unsupervised learning
In unsupervised learning, there is no such supervisor,
and no outcome measure to be predicted. We only have
input data, and the goal is to find and describe
regularities in the input. We want to see ‘what generally
happens, and what does not’.
Unsupervised learning is like identifying probability
distributions from data.
ML Approaches
Reinforcement learning
Reinforcement learning is learning what to do so as to
maximize a numerical reward. The learner is not told
which actions to take; instead, the learner must discover
which actions are most rewarding by trying them.
The actions may affect not only the immediate reward
but also the next situation and, through that, all
subsequent rewards.
Reinforcement learning is learning from mistakes and
successes.
ML and Data Mining
The manual extraction of patterns from data has been
practiced for centuries. The techniques that have been
used for this purpose fall within the purview of Statistics.
The term Data Mining, in this context, refers to the
automatic (or semi-automatic) analysis of large
quantities of data to extract previously unknown,
interesting patterns.
Data Mining as a tool for ML
• Machine learning uses a variety of techniques that are
part of data mining. In that sense, data mining
provides tools for machine learning.
• While data mining involves the discovery of knowledge
from data, machine learning takes the journey forward
by using the discovered knowledge to develop
intelligent systems.
• Learning algorithms are what distinguish ‘machine
learning’ from ‘data mining’. Learning algorithms
use the knowledge discovered by applying data
mining.
ML as a tool for Data Mining
Recall that machine learning makes the machines learn
to perform specified tasks.
If the task to be performed is ‘Mining’ of patterns,
machine learning can be used as a tool to make
machines learn to mine patterns.
It is this reciprocal relationship between the two fields
that has led to a lot of confusion between ‘Data
Mining’ and ‘Machine Learning’ in the community.
Artificial Intelligence and ML
Artificial Intelligence
Artificial intelligence (AI) is the intelligence
demonstrated by machines.
It refers to the ability of a computer (or a
computer-controlled machine) to perform tasks that are commonly
associated with intelligent beings.
This term is frequently used to describe the projects of
developing systems with abilities that are characteristic
of humans, such as the ability to reason, discover
meaning, generalize, or learn from past experience.
The discipline of AI was born during a summer
workshop organized by John McCarthy at Dartmouth
College in 1956, named “The Dartmouth Summer
Research Project on Artificial Intelligence”.
The basis of the workshop was the conjecture that
every aspect of learning or any other feature of
intelligence can, in principle, be so precisely described
that a machine can be made to simulate it.
Machine learning is an area of AI that deals with
designing systems that possess the trait of learning.
Supervised Learning
The goal of supervised learning is to predict the value of
an output using the values of features.
In statistical terminology, features are called predictors.
Similarly, the outputs are called responses or dependent
variables.
A prediction task is called regression when the output is a
quantitative measure, and for qualitative output, the
prediction task is called classification.
Prediction Problem as a Decision Problem
It is useful to think of a prediction problem as a decision
problem. We know that the optimal solutions to
decision problems depend on the loss function under
consideration.
The most common choices of loss function are
• For numerical output: squared error loss
• For qualitative output: probability of wrong prediction (0-1 loss)
However, there can be several other loss functions that
are more suitable in a given context.
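The two common loss functions can be written down in a few lines. This is a minimal sketch; the function names are hypothetical and chosen only for this example. Averaging the 0-1 loss over a set of predictions gives the observed proportion of wrong predictions.

```python
def squared_error_loss(y, y_hat):
    # Loss for a numerical output: squared difference between truth and prediction.
    return (y - y_hat) ** 2


def zero_one_loss(y, y_hat):
    # Loss for a qualitative output: 1 for a wrong prediction, 0 for a correct one.
    return 0 if y == y_hat else 1


def average_loss(loss, ys, y_hats):
    # Empirical risk: the loss averaged over a set of (truth, prediction) pairs.
    return sum(loss(y, p) for y, p in zip(ys, y_hats)) / len(ys)


mse = average_loss(squared_error_loss, [1.0, 2.0], [1.0, 4.0])
error_rate = average_loss(zero_one_loss, ["a", "b", "a", "b"], ["a", "a", "a", "b"])
print(mse, error_rate)
```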
Best prediction
Suppose we are faced with the problem of predicting the
value of a random variable Y. Further, suppose we know
the probability distribution P of Y.
If Y is numeric and the loss function is the ‘squared error’
loss, the expected loss is minimized by the prediction
$\hat{Y} = E_P(Y)$.
If Y is qualitative and the loss function is the ‘probability of
wrong prediction’, the expected loss is minimized by the
prediction $\hat{Y} = \mathrm{Mode}_P(Y)$.
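Both statements can be checked numerically on a small, known distribution. The distribution `dist` and the grid of candidate guesses below are made up for illustration; the sketch only confirms that, for this example, the mean minimizes the expected squared error and the mode minimizes the probability of a wrong prediction.

```python
# Assumed known distribution P of Y: value -> probability.
dist = {0: 0.2, 1: 0.5, 4: 0.3}


def expected_squared_loss(dist, guess):
    # E[(Y - guess)^2] under the known distribution.
    return sum(p * (y - guess) ** 2 for y, p in dist.items())


def prob_wrong(dist, guess):
    # P(Y != guess) under the known distribution.
    return 1.0 - dist.get(guess, 0.0)


mean = sum(p * y for y, p in dist.items())
mode = max(dist, key=dist.get)

# Search a grid of numeric guesses, and the set of possible labels.
candidates = [i / 10 for i in range(41)]
best_numeric = min(candidates, key=lambda g: expected_squared_loss(dist, g))
best_label = min(dist, key=lambda g: prob_wrong(dist, g))

print(mean, best_numeric)
print(mode, best_label)
```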
Best prediction in supervised learning
Recall that ‘supervised learning’ is a problem of
predicting the value of output Y for the given values of
features X1, X2, …, Xp. Consequently,
In a regression (classification) problem, we need to
estimate the conditional probability distribution of Y
given X1, X2, …, Xp and use the estimate of the expected
value (mode) of this conditional distribution as the
prediction.
Different strategies for estimating this conditional
distribution lead to different machine learning
solutions.
Supervised Learning Process
Let Y be a quantitative response, and X1, X2, …, Xp be the p
different predictors. We assume that there is some
relationship between Y and X = (X1, X2, …, Xp), which can be
written as
$Y = f(X) + \varepsilon$
Here $f(x) = E(Y \mid X = x)$ is some fixed but unknown function of
x, and $\varepsilon$ is an error having some probability distribution with
mean 0 (zero).
Supervised learning refers to a set of approaches for learning
(estimating) f using a training dataset. Once f is learnt, Y can
be predicted for a given input X = x as
$\hat{Y} = \hat{f}(x)$
Three components of Risk
The expected loss (risk) in using $\hat{Y}$ as a prediction for Y
for a given value x of X is
$E[(Y - \hat{f}(X))^2 \mid X = x]$
which can be decomposed into two terms as
$E[(Y - \hat{f}(X))^2 \mid X = x] = E[(Y - f(X))^2 \mid X = x] + (f(x) - \hat{f}(x))^2$
The first, of the two terms, is called the irreducible error
and the second one is called the reducible error.
The reducible error can be further decomposed into
variance and bias.
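The further decomposition can be sketched as follows, assuming the estimate $\hat{f}$ is computed from a random training set $D$. The symbol $D$ and the expectation $E_D$ over training sets are notation introduced here for illustration; they do not appear in the slides.

```latex
\begin{aligned}
E_D\big[(f(x) - \hat{f}(x))^2\big]
  &= \big(f(x) - E_D[\hat{f}(x)]\big)^2
   + E_D\big[(\hat{f}(x) - E_D[\hat{f}(x)])^2\big] \\
  &= \mathrm{Bias}\big(\hat{f}(x)\big)^2 + \mathrm{Var}\big(\hat{f}(x)\big)
\end{aligned}
```

The cross term vanishes because $E_D\big[\hat{f}(x) - E_D[\hat{f}(x)]\big] = 0$. Together with the irreducible error, this gives the three components of risk: irreducible error, squared bias, and variance.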
Estimating f
• There are several possible functions to choose from.
• We need to select a function $\hat{f}$ that can explain the
examples in the training data set and, at the same time, is
capable of making good predictions for examples
not in the training data set.
• This requirement implies that every new training
example reduces the search space.
• In principle, enough training examples should,
therefore, leave only one function $\hat{f}$ at the end. In
that case, $\hat{f}$ is the solution to the problem.
The Inductive bias
In practice, however, the data by itself is not sufficient to
find a unique solution function $\hat{f}$. Such a learning
problem is said to be an ill-posed problem.
Because learning problems are generally ill-posed, we
need to make some extra assumptions to have a unique
solution with the data we have.
The set of assumptions we make to make learning
possible is called the inductive bias of the
learning algorithm.
In general, learning is not possible without inductive bias.
The triple trade-off
• When choosing a model with minimum inductive
bias, it is important to understand that the goal of
learning is to achieve the best generalization rather than
the best explanation of the training data.
• Avoid both underfitting and overfitting.
• Typically, we need to balance a triple trade-off between
  • Complexity of the class $\mathcal{F}$
  • Amount of training data
  • Generalization error
Two approaches of learning
Parametric approach
In this approach, we assume (inductive bias) that the
conditional probability distribution F of the outcome Y given the
predictors X1, X2, …, Xp belongs to a parametric family $\mathcal{F}$. The
problem of learning F then reduces to that of learning the
model parameters, using approaches such as maximum likelihood (ML) estimation.
E.g. linear regression
Non-parametric approach
In this approach, F is constructed directly from the training
data.
E.g. KNN regression
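The contrast between the two approaches can be sketched on a tiny data set with a single predictor. The code below is an illustration under stated assumptions: the data are invented, `fit_linear` uses ordinary least squares (which coincides with maximum likelihood estimation when the errors are assumed Gaussian), and `knn_regress` averages the responses of the k nearest training points. The function names are hypothetical.

```python
def fit_linear(xs, ys):
    # Parametric: assume E(Y | X = x) = intercept + slope * x and learn two parameters.
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def knn_regress(xs, ys, x0, k=2):
    # Non-parametric: average the responses of the k training points closest to x0.
    nearest = sorted(zip(xs, ys), key=lambda pair: abs(pair[0] - x0))[:k]
    return sum(y for _, y in nearest) / k


# Invented training data lying exactly on y = 1 + 2x.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

intercept, slope = fit_linear(xs, ys)
linear_pred = intercept + slope * 2.5
knn_pred = knn_regress(xs, ys, 2.5, k=2)
print(intercept, slope, linear_pred, knn_pred)
```

The parametric fit summarizes the data in two numbers, while the KNN prediction needs the whole training set at prediction time.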
Classifiers
KNN Classifiers
F(y|X=x) is estimated by the frequency distribution of Y
from the training examples in the neighborhood of x.
Bayesian classifier
The posterior distribution F(y|X=x) is estimated using
Bayes’ theorem, where both the prior distribution F(y) and
the conditional distribution F(x|Y=y) are estimated as
frequency distributions from the training data.
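A minimal sketch of the KNN classifier described above, for a single numeric feature: the conditional distribution of the class at x is estimated by the class frequencies among the k nearest training examples, and the prediction is its mode. The training data, the choice k = 3, and the function names are assumptions made for this example.

```python
from collections import Counter


def knn_class_distribution(train, x0, k=3):
    # Estimate F(y | X = x0) by class frequencies among the k nearest examples.
    nearest = sorted(train, key=lambda ex: abs(ex[0] - x0))[:k]
    counts = Counter(label for _, label in nearest)
    return {label: count / k for label, count in counts.items()}


def knn_classify(train, x0, k=3):
    # Predict the mode of the estimated conditional distribution.
    dist = knn_class_distribution(train, x0, k)
    return max(dist, key=dist.get)


# Invented training examples: (feature value, class label).
train = [(1.0, "a"), (1.5, "a"), (2.0, "b"), (5.0, "b"), (5.5, "b"), (6.0, "a")]

print(knn_class_distribution(train, 1.8))
print(knn_classify(train, 1.8), knn_classify(train, 5.2))
```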
Classifiers
Logistic regression
F(y|X=x) is assumed to be Bernoulli or multinomial, with
parameters that are parametric functions of x. These
parameters are learnt using approaches such as maximum likelihood (ML)
estimation.
Classification Trees
The training space is iteratively partitioned using one
feature at a time until each cell of the partition is reasonably pure
with respect to Y (i.e., most examples in the cell
belong to one class).
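One partitioning step of a classification tree can be sketched as a search for the threshold on a single feature that gives the purest split. The slides do not name a purity measure; the Gini impurity used here is an assumption, as are the data and the function names.

```python
from collections import Counter


def gini(labels):
    # Gini impurity: 0 for a pure cell, larger when classes are mixed.
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(xs, labels):
    # Try thresholds midway between consecutive distinct feature values and
    # keep the one with the lowest size-weighted impurity of the two cells.
    values = sorted(set(xs))
    best = None
    for lo, hi in zip(values, values[1:]):
        threshold = (lo + hi) / 2
        left = [lab for x, lab in zip(xs, labels) if x <= threshold]
        right = [lab for x, lab in zip(xs, labels) if x > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(xs)
        if best is None or score < best[1]:
            best = (threshold, score)
    return best


# Invented data in which the two classes are separated on the feature.
xs = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
labels = ["a", "a", "a", "b", "b", "b"]

threshold, impurity = best_split(xs, labels)
print(threshold, impurity)
```

A full tree would apply the same step recursively to each resulting cell, over all features, until the cells are reasonably pure.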
Ensemble Learning
When multiple estimators are available for a quantity or
function, an appropriate combination of these estimators
can be better than any individual estimator.
This principle is exploited by ensemble learning.
Bagging: This technique uses bootstrapping.
Boosting: Adaptive subsampling
Random forest: Ensemble learning for decision trees
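Bagging can be sketched in a few lines: draw bootstrap samples from the training data, fit one estimator per sample, and average the predictions. The base estimator here is a one-nearest-neighbour regressor, chosen only because it is short to write; the data, the number of models, and the seed are likewise assumptions made for this example.

```python
import random


def one_nn_predict(train, x0):
    # Base estimator: return the response of the closest training example.
    return min(train, key=lambda ex: abs(ex[0] - x0))[1]


def bagged_predict(train, x0, n_models=25, seed=0):
    # Bagging: average the base estimator over bootstrap resamples of the data.
    rng = random.Random(seed)
    predictions = []
    for _ in range(n_models):
        # Bootstrap sample: same size as the training set, drawn with replacement.
        sample = [rng.choice(train) for _ in train]
        predictions.append(one_nn_predict(sample, x0))
    return sum(predictions) / len(predictions)


# Invented noisy training examples: (feature value, response).
train = [(0.0, 1.1), (1.0, 2.9), (2.0, 5.2), (3.0, 6.8), (4.0, 9.1)]

prediction = bagged_predict(train, 2.4)
print(prediction)
```

A random forest applies the same idea to decision trees, typically with extra randomization of the features considered at each split.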
Feedback loops
The learning process should ensure that performance
improves as more experience is accumulated.
This requires incorporating the feedback loop shown
above. The feedback loop updates (improves) the
learned model, which results in better performance.
Updating regression estimators
When a new example x becomes available, we can update the
learned values of the regression coefficients as
$\hat{\beta}_{new} = (X'X)^{-1}_{new} X'_{new} Y_{new}$
where, by the rank-one update formula,
$(X'X)^{-1}_{new} = (X'X + x x')^{-1} = (X'X)^{-1} - \frac{(X'X)^{-1} x \, \big((X'X)^{-1} x\big)'}{1 + x'(X'X)^{-1} x}$
Instead of updating the model with every new example, one
can also adopt the strategy of updating the model after every batch
of examples.
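The rank-one update can be sketched in plain Python. The code assumes that the current inverse of X'X and the current vector X'Y are stored, and that the new example consists of a feature vector x together with its response y (the response is implied by the update of Y but not named in the slides). The function names and the small data set are hypothetical.

```python
def mat_vec(matrix, vector):
    # Product of a square matrix (list of rows) with a vector.
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def rank_one_update(xtx_inv, x):
    # Returns (X'X + x x')^{-1} from (X'X)^{-1} without a fresh matrix inversion.
    u = mat_vec(xtx_inv, x)                               # (X'X)^{-1} x
    denom = 1.0 + sum(xi * ui for xi, ui in zip(x, u))    # 1 + x'(X'X)^{-1} x
    n = len(x)
    return [[xtx_inv[i][j] - u[i] * u[j] / denom for j in range(n)]
            for i in range(n)]


def update_coefficients(xtx_inv, xty, x, y):
    # Fold one new example (x, y) into the stored quantities and re-estimate beta.
    new_inv = rank_one_update(xtx_inv, x)
    new_xty = [s + xi * y for s, xi in zip(xty, x)]       # X'Y + x y
    beta = mat_vec(new_inv, new_xty)
    return new_inv, new_xty, beta


# Two initial examples with rows (1, x): (1, 0) -> 1 and (1, 1) -> 3, so y = 1 + 2x.
xtx_inv = [[1.0, -1.0], [-1.0, 2.0]]   # inverse of X'X = [[2, 1], [1, 1]]
xty = [4.0, 3.0]                       # X'Y

# New example: row (1, 2) with response 5, also on the line y = 1 + 2x.
new_inv, new_xty, beta = update_coefficients(xtx_inv, xty, [1.0, 2.0], 5.0)
print(new_inv, beta)
```

For a batch strategy, the same function can be applied once per example in the batch before the coefficients are used.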
CRISP-DM
Thanks!

Statistical foundations of ml

  • 1. Department of Statistics The Maharaja Sayajirao University of Baroda
  • 2. Can we make computers learn? Machine learning is a field of artificial intelligence that uses statistical techniques to give computer systems the ability to "learn" from data, without being explicitly programmed. - Arthur Samuel Here, by learning we mean progressively improving performance, through experience, on a specific task. Machine Learning
  • 3. Need of Algorithm To solve a problem on a computer, we need an algorithm.  Searching  Sorting  Numerical computations However, there are problems for which algorithms do not exist  Recognizing characters  Distinguish spam emails from legitimate ones. This lack of knowledge is made up for by data.
  • 4. Programming and Machine Learning The difference between “Programing” and “Machine Learning” is similar to that between “Mathematics” and “Statistics”. It is believed that there is an unknown underlying process that explains the data we observe. By analyzing the available data, we intend to understand this process as much as possible. Though identifying the complete process may not be possible, we can still detect certain patterns or regularities. This detection is the function of machine learning. Machine Learning is essentially learning from data.
  • 5. Learning Program Definition: A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E. To formulate a well-defined learning problem, we must identify  The class of tasks T  The measure of performance P  The source of experience E -Tom M. Mitchell
  • 6. ML Applications  Alexa Alexa is a virtual assistant developed by Amazon. It is used in Echo Smart Speakers and many other devices.  Watson IBM Watson was created as question-answering computer system. It won the quiz show Jeopardy in 2011 against the legendry champions.  Driverless cars Machine learning algorithms are extensively used in driverless cars.
  • 7. Machine Learning Approaches Supervised learning In supervised learning, the goal is to learn to predict the value of an outcome measure based on a number of input measures for which correct output values are provided by supervisor. Supervised learning is like using data to build predictive models.
  • 8. ML Approaches Unsupervised learning In unsupervised learning, there is no such supervisor, and no outcome measure to be predicted. We only have input data, and the goal is to find and describe regularities in the input. We want to see ‘what generally happens, and what does not’. Unsupervised learning is like identifying probability distributions from data.
  • 9.  Reinforcement learning Reinforcement learning is learning what to do so as to maximize a numerical reward. The learner is not told which actions to take, instead learner must discover which actions are most rewarding by trying them. The actions may affect not only the immediate reward but also the next situation and, through that, all subsequent rewards. Reinforcement learning is learning from mistakes/ success ML Approaches
  • 10. ML and Data Mining The manual extraction of patterns from data has been practiced for centuries. The techniques that have been used for this purpose come in the purview of Statistics. The term Data Mining, in this context, refers to the automatic (or semi-automatic) analysis of large quantities of data to extract previously unknown, interesting patterns.
  • 11. Data Mining as a tool for ML  Machine learning uses variety of techniques that are part of Data mining. In that sense Data mining provides tools for machine learning.  While data mining involves discovery of knowledge from data, machine learning takes the journey forward by utilizing the discovered knowledge for developing intelligent systems.  Learning algorithms is what distinguishes ‘Machine learning’ from ‘Data mining’. Learning algorithms utilizes the knowledge discovered by applying Data Mining.
  • 12. ML as a tool for Data Mining Recall that machine learning makes the machines learn to perform specified tasks. If the task to be performed is ‘Mining’ of patterns, machine learning can be used as a tool to make machines learn to mine patterns. It is this vice-versa relationship between the two fields, which has led to a lot of confusion between ‘Data Mining’ and ‘Machine Learning’ in the community.
  • 13. Artificial Intelligence and ML Artificial Intelligence Artificial intelligence (AI), is the intelligence demonstrated by machines. It refers to the ability of a computer (or a computer- controlled machine) to perform tasks that are commonly associated with intelligent beings. This term is frequently used to describe the projects of developing systems with abilities that are characteristic of humans, such as the ability to reason, discover meaning, generalize, or learn from past experience.
  • 14. The discipline of AI was born during a summer workshop organized by John McCarthy at Dartmouth College in 1956, named “The Dartmouth Summer Research Project on Artificial Intelligence”. The basis of the workshop was the conjecture that every aspect of learning or any other feature of intelligence can, in principle, be so precisely described that a machine can be made to simulate it. Machine learning is an area of AI that deals with designing systems that possess the trait of learning.
  • 15. Supervised Learning The goal of supervised learning is to predict the values of output using the values of features. In statistical terminology, features are called predictors. Similarly, the outputs are called responses or dependent variables. A prediction task is called regression when the output is a quantitative measure, and for qualitative output, the prediction task is called classification.
  • 16. Prediction Problem as a Decision Problem It is useful to think of a prediction problem as a decision problem. We know that the optimal solutions to decision problems depend on the loss function under consideration. The most common choices of loss function are: for numerical output, the squared error loss; for qualitative output, the probability of wrong prediction (the expected 0-1 loss). However, there can be several other loss functions that are more suitable in a given context.
  • 17. Best prediction Suppose we are faced with the problem of predicting the value of a random variable Y. Further, suppose we know the probability distribution P of Y. If Y is numeric and the loss function is the squared error loss, the expected loss is minimized when $\hat{Y} = E_P(Y)$. If Y is qualitative and the loss function is the probability of wrong prediction, the expected loss is minimized when $\hat{Y} = \mathrm{Mode}_P(Y)$.
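A small numerical check of the two statements above, assuming the distribution P is represented by a short list of equally likely values. The data and the helper names `squared_error_risk` and `misclassification_risk` are hypothetical, chosen only for illustration.

```python
from collections import Counter
from statistics import mean


def squared_error_risk(values, guess):
    """Expected squared error of predicting `guess` for every value."""
    return mean((y - guess) ** 2 for y in values)


def misclassification_risk(labels, guess):
    """Probability that `guess` differs from the true label."""
    return mean(1 if y != guess else 0 for y in labels)


# Numeric Y: search a grid of candidate predictions for the lowest risk.
y_numeric = [1.0, 2.0, 2.0, 3.0, 7.0]
best_numeric = mean(y_numeric)                    # E_P(Y) = 3.0
candidates = [m / 10 for m in range(0, 101)]      # 0.0, 0.1, ..., 10.0
grid_best = min(candidates, key=lambda c: squared_error_risk(y_numeric, c))

# Qualitative Y: the most frequent label (the mode) has the lowest risk.
y_labels = ["spam", "ham", "ham", "ham", "spam"]
best_label = Counter(y_labels).most_common(1)[0][0]

print(grid_best, best_numeric)   # the grid minimizer coincides with the mean
print(best_label, misclassification_risk(y_labels, best_label))
```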
  • 18. Best prediction in supervised learning Recall that ‘supervised learning’ is a problem of predicting the value of output Y for the given values of features X1, X2, …, Xp. Consequently, in a regression/classification problem, we need to estimate the conditional probability distribution of Y given X1, X2, …, Xp and use the estimate of the expected value/mode of this conditional distribution as the prediction. Different strategies for estimating this conditional distribution lead to different machine learning solutions.
  • 19. Supervised Learning Process Let Y be a quantitative response, and X1, X2, …, Xp be the p different predictors. We assume that there is some relationship between Y and X = (X1, X2, …, Xp), which can be written as $Y = f(X) + \varepsilon$. Here $f(x) = E(Y \mid X = x)$ is some fixed but unknown function of x, and $\varepsilon$ is an error having some probability distribution with mean 0 (zero). Supervised learning refers to a set of approaches for learning (estimating) f using a training dataset. Once f is learnt as $\hat{f}$, Y can be predicted for a given input X = x as $\hat{Y} = \hat{f}(x)$.
  • 20. Three components of Risk The expected loss (risk) in using $\hat{Y} = \hat{f}(x)$ as a prediction for Y for a given value x of X is $E[(Y - \hat{f}(X))^2 \mid X = x]$, which can be decomposed into two terms as $E[(Y - \hat{f}(X))^2 \mid X = x] = E[(Y - f(X))^2 \mid X = x] + (f(x) - \hat{f}(x))^2$. The first of the two terms is called the irreducible error and the second one is called the reducible error. The reducible error can be further decomposed into variance and bias.
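The decomposition can be worked out in a few lines. This is a sketch under the assumptions that $Y = f(X) + \varepsilon$ with $E[\varepsilon \mid X = x] = 0$ and that the learned function $\hat{f}$ is treated as fixed, that is, independent of the new error $\varepsilon$. The symbol $D$ for the training set is introduced here for illustration and does not appear in the slides.

```latex
\begin{align*}
E\big[(Y - \hat{f}(X))^2 \mid X = x\big]
  &= E\big[(f(x) + \varepsilon - \hat{f}(x))^2 \mid X = x\big] \\
  &= E\big[\varepsilon^2 \mid X = x\big]
     + 2\,\big(f(x) - \hat{f}(x)\big)\,E\big[\varepsilon \mid X = x\big]
     + \big(f(x) - \hat{f}(x)\big)^2 \\
  &= \underbrace{E\big[(Y - f(X))^2 \mid X = x\big]}_{\text{irreducible error}}
     + \underbrace{\big(f(x) - \hat{f}(x)\big)^2}_{\text{reducible error}}
\end{align*}
% The cross term vanishes because E[eps | X = x] = 0.
% Averaging the reducible error over training sets D gives squared bias plus variance:
\begin{align*}
E_D\big[(f(x) - \hat{f}(x))^2\big]
  = \big(f(x) - E_D[\hat{f}(x)]\big)^2 + \mathrm{Var}_D\big(\hat{f}(x)\big)
\end{align*}
```

Together with the irreducible error, the squared bias and the variance give the three components named in the slide title.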
  • 21. Estimating f There are several possible functions to choose from. We need to select a function $\hat{f}$ that can explain the examples in the training data set and, at the same time, is capable of making good predictions for examples not in the training data set. This requirement implies that every new training example reduces the search space. In principle, enough training examples should therefore leave only one function $\hat{f}$ at the end. In that case, $\hat{f}$ is the solution to the problem.
  • 22. The Inductive bias In practice, however, the data by itself is not sufficient to find a unique solution function $\hat{f}$. Such a learning problem is said to be an ill-posed problem. Because learning problems are generally ill-posed, we need to make some extra assumptions to have a unique solution with the data we have. The set of assumptions we make to make learning possible is said to introduce the inductive bias in the learning algorithm. In general, learning is not possible without inductive bias.
  • 23. The triple trade-off While choosing a model with minimum inductive bias, it is important to understand that the goal of learning is to achieve the best generalization rather than the best explanation of the training data. Avoid underfitting and overfitting. Typically, we need to achieve the triple trade-off among (1) the complexity of the class F of candidate functions, (2) the amount of training data, and (3) the generalization error.
  • 24. Two approaches of learning Parametric approach: In this approach, we assume (inductive bias) that the conditional probability distribution F of the outcome Y given the features X1, X2, …, Xp belongs to a parametric family $\mathcal{F}$. The problem of learning F then reduces to that of learning the model parameters using approaches such as maximum likelihood (ML) estimation, e.g. linear regression. Non-parametric approach: In this approach, F is constructed directly from the training data, e.g. KNN regression.
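The contrast between the two approaches can be sketched with one feature. The assumptions here are illustrative: the parametric model is a straight line fitted by least squares (which coincides with maximum likelihood estimation when the errors are assumed normal), the non-parametric model is a k-nearest-neighbour average, and the toy data and function names are hypothetical.

```python
def fit_line(xs, ys):
    """Parametric: learn two parameters (intercept, slope) by least squares."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def knn_predict(xs, ys, x0, k=3):
    """Non-parametric: average the responses of the k nearest training points."""
    nearest = sorted(zip(xs, ys), key=lambda pair: abs(pair[0] - x0))[:k]
    return sum(y for _, y in nearest) / k


# Toy training data lying exactly on y = 1 + 2x.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.0, 5.0, 7.0, 9.0, 11.0]

intercept, slope = fit_line(xs, ys)
print(intercept, slope)                # 1.0 2.0
print(intercept + slope * 10.0)        # 21.0: the line extrapolates
print(knn_predict(xs, ys, 10.0))       # 9.0: KNN stays within the training range
```

After fitting, the parametric model is summarized by two numbers, whereas the KNN predictor needs the whole training set at prediction time.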
  • 25. Classifiers KNN Classifiers F(y|X=x) is estimated by the frequency distribution of Y from the training examples in the neighborhood of x. Bayesian classifier The posterior distribution F(y|X=x) is estimated using Bayes’ theorem, where both prior distribution F(y) and the conditional distribution F(x|Y) are estimated as the frequency distributions from training data
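Both estimates of F(y|X=x) can be sketched with frequency counts. This assumes a single numeric feature for the KNN classifier and a single discrete feature for the Bayesian classifier; the toy data and function names are hypothetical.

```python
from collections import Counter


def knn_class_distribution(train, x0, k):
    """Relative frequencies of the labels among the k nearest neighbours of x0."""
    nearest = sorted(train, key=lambda pair: abs(pair[0] - x0))[:k]
    counts = Counter(label for _, label in nearest)
    return {label: c / k for label, c in counts.items()}


def bayes_posterior(train, x0):
    """Posterior F(y | X = x0) from the frequency estimates of F(y) and F(x | y).

    Assumes x0 occurs at least once in the training data.
    """
    n = len(train)
    class_counts = Counter(label for _, label in train)
    joint = {}
    for label, n_c in class_counts.items():
        prior = n_c / n
        likelihood = sum(1 for x, l in train if l == label and x == x0) / n_c
        joint[label] = prior * likelihood      # numerator of Bayes' theorem
    total = sum(joint.values())                # estimate of F(x0)
    return {label: value / total for label, value in joint.items()}


numeric_train = [(1.0, "a"), (1.5, "a"), (2.0, "b"), (6.0, "b"), (6.5, "b")]
knn_dist = knn_class_distribution(numeric_train, 1.8, k=3)

word_train = [("free", "spam"), ("free", "spam"), ("hello", "spam"),
              ("hello", "ham"), ("hello", "ham"), ("free", "ham"),
              ("hello", "ham")]
posterior = bayes_posterior(word_train, "free")

print(knn_dist)     # "a" has two of the three nearest neighbours
print(posterior)    # "spam" is about twice as likely as "ham" given "free"
```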
  • 26. Classifiers Logistic regression F(Y|X=x) is assumed to be Bernoulli/Multinomial with parameters that are parametric functions of X. These parameters are learnt using approaches such as maximum likelihood (ML) estimation. Classification Trees The training space is iteratively partitioned using one feature at a time until each part of the partition is reasonably pure with respect to Y (i.e. most examples in a part belong to one class).
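A single partitioning step of a classification tree can be sketched as follows. The slides do not specify how purity is measured; the Gini impurity used here is one common choice and is an assumption, as are the toy data and the function names.

```python
def gini(labels):
    """Gini impurity: 0.0 for a pure group, larger for mixed groups."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))


def best_split(xs, labels):
    """Find the threshold on one feature giving the purest two-part partition."""
    best = None
    # The largest value is skipped: splitting there leaves the right part empty.
    for threshold in sorted(set(xs))[:-1]:
        left = [l for x, l in zip(xs, labels) if x <= threshold]
        right = [l for x, l in zip(xs, labels) if x > threshold]
        # Weighted average impurity of the two parts.
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(xs)
        if best is None or score < best[1]:
            best = (threshold, score)
    return best


xs = [1, 2, 3, 10, 11, 12]
labels = ["no", "no", "no", "yes", "yes", "yes"]
threshold, impurity = best_split(xs, labels)
print(threshold, impurity)   # 3 0.0: both parts are pure
```

A full tree would repeat this step on each part, over all features, until the parts are reasonably pure.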
  • 27. Ensemble Learning When multiple estimators are available for a quantity/function, an appropriate combination of these estimators can be better than any individual estimator. This principle is exploited by Ensemble Learning. Bagging: this technique uses bootstrapping. Boosting: adaptive subsampling. Random forest: ensemble learning for decision trees.
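Bagging can be sketched in a few lines. The base learner (a 1-nearest-neighbour regressor), the number of models, and the toy data are assumptions made for illustration; any estimator could take its place.

```python
import random
from statistics import mean


def one_nn_predict(sample, x0):
    """Base learner: response of the training point closest to x0."""
    return min(sample, key=lambda pair: abs(pair[0] - x0))[1]


def bagged_predict(train, x0, n_models=50, seed=0):
    """Average the predictions of base learners fitted on bootstrap samples."""
    rng = random.Random(seed)
    predictions = []
    for _ in range(n_models):
        # Bootstrap: draw len(train) examples with replacement.
        bootstrap = [rng.choice(train) for _ in train]
        predictions.append(one_nn_predict(bootstrap, x0))
    return mean(predictions), predictions


train = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2), (4.0, 8.1), (5.0, 9.8)]
bagged, predictions = bagged_predict(train, 3.4)
print(bagged)
print(sorted(set(predictions)))   # the individual models disagree
```

Each bootstrap sample omits some training points, so the individual predictions vary; averaging them gives a smoother combined estimate.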
  • 28. Feedback loops The learning process should ensure that performance improves as more experience is accumulated. This requires incorporating the feedback loop shown in the accompanying figure. The feedback loop updates (improves) the learned model, which results in better performance.
  • 29. Updating regression estimators When a new example x becomes available, we can update the learned values of the regression coefficients as $\hat{\beta}_{new} = (X'X)^{-1}_{new} X'_{new} Y_{new}$, where $(X'X)^{-1}_{new} = (X'X + x x')^{-1} = (X'X)^{-1} - \frac{(X'X)^{-1} x \, ((X'X)^{-1} x)'}{1 + x'(X'X)^{-1} x}$ (rank-one update formula). Instead of updating the model with every new example, one can also adopt the strategy of updating the model after every batch of examples.
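The rank-one update can be sketched with plain Python lists. It is assumed here that the new example consists of a feature vector x (including the intercept term) and a response y, so that X'Y is updated by adding x·y; the helper names and the toy data are hypothetical.

```python
def mat_vec(A, v):
    """Matrix-vector product for a list-of-rows matrix."""
    return [sum(a * b for a, b in zip(row, v)) for row in A]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def rank_one_update(A_inv, x):
    """Return (A + x x')^{-1} from A_inv = A^{-1}, for symmetric A."""
    u = mat_vec(A_inv, x)                 # (X'X)^{-1} x
    denom = 1.0 + dot(x, u)               # 1 + x'(X'X)^{-1} x
    p = len(x)
    return [[A_inv[i][j] - u[i] * u[j] / denom for j in range(p)]
            for i in range(p)]


def update_coefficients(A_inv, Xty, x, y):
    """Fold one new example (x, y) into the fit without re-inverting X'X."""
    A_inv_new = rank_one_update(A_inv, x)
    Xty_new = [s + xi * y for s, xi in zip(Xty, x)]   # X'Y + x y
    beta_new = mat_vec(A_inv_new, Xty_new)
    return A_inv_new, Xty_new, beta_new


# Initial fit on rows (1, 0), (1, 1), (1, 2) with responses 1, 3, 5:
# X'X = [[3, 3], [3, 5]], its inverse is written out by hand, X'Y = [9, 13].
A_inv = [[5 / 6, -1 / 2], [-1 / 2, 1 / 2]]
Xty = [9.0, 13.0]
beta_old = mat_vec(A_inv, Xty)            # approximately [1.0, 2.0]

# New example: x = (1, 3) with response 8.
A_inv, Xty, beta_new = update_coefficients(A_inv, Xty, [1.0, 3.0], 8.0)
print(beta_old, beta_new)                 # beta_new is approximately [0.8, 2.3]
```

The updated coefficients agree with a least squares fit on all four examples, but only matrix-vector products are needed; no new matrix inversion is performed.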