Christof Monz
Informatics Institute
University of Amsterdam
Data Mining
Week 1: Linear Regression
Outline
Plotting real-valued predictions
Linear regression
Error function
Linear Regression
Predict real values (as opposed to discrete classes)
Simple machine learning prediction task
Assumes a linear relationship between the input data and the target values
Scatter Plots
[Figure: scatter plot of y against x; x-axis ticks from 10 to 45, y-axis ticks from 10 to 40]
Linear Regression
Find the line that approximates the data as
closely as possible
$\hat{y} = a + b \cdot x$
where b is the slope, and a is the y-intercept
a and b should be chosen such that they
minimize the difference between the predicted
values and the values in the training data
Error Functions
There are a number of ways to define an error
function
Sum of absolute errors $= \sum_{i \in D} |y_i - (a + b x_i)|$
Sum of squared errors $= \sum_{i \in D} (y_i - (a + b x_i))^2$
where $y_i$ is the true value
Squared error is most commonly used
Task: Find the parameters a and b that
minimize the squared error over the training
data
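As a minimal sketch of the two error functions above, the following Python computes both for a candidate line. The function names, the toy data, and the candidate parameters a = 1, b = 2 are assumptions made for illustration only; they are not part of the slides.

```python
def sum_absolute_errors(xs, ys, a, b):
    """Sum over the training data of |y_i - (a + b * x_i)|."""
    return sum(abs(y - (a + b * x)) for x, y in zip(xs, ys))


def sum_squared_errors(xs, ys, a, b):
    """Sum over the training data of (y_i - (a + b * x_i))^2."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))


# Hypothetical toy training set (not from the slides).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 11.0]

# Candidate line y = 1 + 2x: it fits the first three points exactly
# and misses the last one by 2.
sae = sum_absolute_errors(xs, ys, a=1.0, b=2.0)
sse = sum_squared_errors(xs, ys, a=1.0, b=2.0)
print(sae, sse)  # prints 2.0 4.0
```

The single residual of 2 contributes 2 to the absolute error but 4 to the squared error, which shows how squaring penalizes large deviations more heavily.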
Error Functions
Normalized error functions:
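Mean squared error $= \frac{\sum_{i \in D} (y_i - (a + b x_i))^2}{|D|}$
Relative squared error $= \frac{\sum_{i \in D} (y_i - (a + b x_i))^2}{\sum_{i \in D} (y_i - \bar{y})^2}$
where $\bar{y} = \frac{1}{|D|} \sum_{i \in D} y_i$
Root relative squared error $= \sqrt{\frac{\sum_{i \in D} (y_i - (a + b x_i))^2}{\sum_{i \in D} (y_i - \bar{y})^2}}$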
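The normalized error functions can be sketched in the same way. The helper names and the toy data below are hypothetical, and the code assumes the target values are not all identical (otherwise the denominator of the relative squared error is zero).

```python
import math


def sse(xs, ys, a, b):
    """Sum of squared errors of the line y = a + b * x."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))


def mean_squared_error(xs, ys, a, b):
    """SSE divided by the number of training instances |D|."""
    return sse(xs, ys, a, b) / len(xs)


def relative_squared_error(xs, ys, a, b):
    """SSE divided by the squared error of always predicting the mean of y."""
    y_bar = sum(ys) / len(ys)
    total = sum((y - y_bar) ** 2 for y in ys)
    return sse(xs, ys, a, b) / total


def root_relative_squared_error(xs, ys, a, b):
    """Square root of the relative squared error."""
    return math.sqrt(relative_squared_error(xs, ys, a, b))


# Hypothetical toy training set and candidate line y = 1 + 2x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 11.0]

mse = mean_squared_error(xs, ys, 1.0, 2.0)
rse = relative_squared_error(xs, ys, 1.0, 2.0)
rrse = root_relative_squared_error(xs, ys, 1.0, 2.0)
print(mse, rse, rrse)
```

A relative squared error below 1 means the line predicts better than the constant predictor $\bar{y}$; a value of 0 means a perfect fit.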
Minimizing Error Functions
There are roughly two ways:
• Try different parameter instantiations and see which
ones lead to the lowest error (search)
• Solve mathematically (closed form)
Most parameter estimation problems in machine
learning can only be solved by searching
For linear regression, we can solve it
mathematically
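To illustrate the first option (search), here is a minimal sketch of an exhaustive grid search over candidate values for a and b. The grid range, the step size of 0.5, and the toy data are arbitrary assumptions; real search procedures (for example gradient-based ones) are usually far more efficient than trying every combination.

```python
def sse(xs, ys, a, b):
    """Sum of squared errors of the line y = a + b * x."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))


def grid_search(xs, ys, candidates):
    """Try every (a, b) pair from `candidates` and keep the one with the lowest SSE."""
    best = None
    for a in candidates:
        for b in candidates:
            err = sse(xs, ys, a, b)
            if best is None or err < best[0]:
                best = (err, a, b)
    return best


# Hypothetical toy data lying exactly on the line y = 1 + 2x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

# Candidate parameter values -5.0, -4.5, ..., 5.0 (an arbitrary choice).
candidates = [i * 0.5 for i in range(-10, 11)]

best_err, best_a, best_b = grid_search(xs, ys, candidates)
print(best_err, best_a, best_b)
```

The search only finds the exact optimum here because the true parameters happen to lie on the grid; in general it returns the best grid point, which is one reason a closed-form solution is preferable when it exists.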
Minimizing SSE
SSE $= \sum_{i \in D} (y_i - (a + b x_i))^2$
Take the partial derivatives with respect to a
and b
Set each partial derivative equal to zero and
solve for a and b respectively
The resulting values for a and b minimize the
SSE and can be used to predict unseen
data instances
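The steps above can be written out as follows. This is a sketch of the standard derivation rather than a reproduction of the lecturer's own working; it assumes $\bar{x} = \frac{1}{|D|} \sum_{i \in D} x_i$, by analogy with $\bar{y}$, and that the $x_i$ are not all equal.

```latex
\begin{align*}
\frac{\partial\,\mathrm{SSE}}{\partial a}
  &= -2 \sum_{i \in D} \bigl(y_i - (a + b x_i)\bigr) = 0
  &&\Rightarrow\quad a = \bar{y} - b\,\bar{x} \\
\frac{\partial\,\mathrm{SSE}}{\partial b}
  &= -2 \sum_{i \in D} x_i \bigl(y_i - (a + b x_i)\bigr) = 0
  &&\Rightarrow\quad \sum_{i \in D} x_i y_i
     = a \sum_{i \in D} x_i + b \sum_{i \in D} x_i^2
\end{align*}

% Substitute a = (\sum y_i - b \sum x_i) / |D| into the second equation
% and multiply through by |D|:
\begin{align*}
|D| \sum_{i \in D} x_i y_i - \sum_{i \in D} x_i \sum_{i \in D} y_i
  &= b \Bigl( |D| \sum_{i \in D} x_i^2 - \bigl(\textstyle\sum_{i \in D} x_i\bigr)^2 \Bigr) \\
b &= \frac{|D| \sum_{i \in D} x_i y_i - \sum_{i \in D} x_i \sum_{i \in D} y_i}
          {|D| \sum_{i \in D} x_i^2 - \bigl(\sum_{i \in D} x_i\bigr)^2}
\end{align*}
```

Because the SSE is a convex quadratic in a and b, the point where both partial derivatives vanish is a minimum.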
Applying Linear Regression
For a given training set we first compute b:
$b = \frac{|D| \sum_{i \in D} x_i y_i - \sum_{i \in D} x_i \sum_{i \in D} y_i}{|D| \sum_{i \in D} x_i^2 - \left(\sum_{i \in D} x_i\right)^2}$
and then a, using the value computed for b:
$a = \bar{y} - b \bar{x}$
For any new instance x (i.e. an instance that
was not in the training set), the predicted value
is: a + bx
Extendible to functions of several input variables
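The closed-form procedure on this slide can be sketched in a few lines of Python. The names `fit_line` and `predict` and the toy data are hypothetical; the explicit check for a zero denominator is an added assumption that covers the case where all x values are identical, which the slide does not discuss.

```python
def fit_line(xs, ys):
    """Return (a, b) minimizing the sum of squared errors of y = a + b * x."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)

    denominator = n * sum_xx - sum_x ** 2
    if denominator == 0:
        raise ValueError("all x values are identical; the slope is undefined")

    # First the slope b, then the intercept a = mean(y) - b * mean(x).
    b = (n * sum_xy - sum_x * sum_y) / denominator
    a = sum_y / n - b * (sum_x / n)
    return a, b


def predict(a, b, x):
    """Predicted value for a new instance x."""
    return a + b * x


# Hypothetical toy training set (not from the slides).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 11.0]

a, b = fit_line(xs, ys)
print(a, b, predict(a, b, 5.0))
```

For this data the sums are $\sum x_i = 10$, $\sum y_i = 26$, $\sum x_i y_i = 78$ and $\sum x_i^2 = 30$, so b = (4·78 − 10·26) / (4·30 − 10²) = 52 / 20 = 2.6 and a = 6.5 − 2.6·2.5 = 0, up to floating-point rounding.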
Linear Regression
Used to predict real-number values, given
numerical input variables
Parameters can be estimated analytically (i.e.
by applying some mathematics), which won’t be
the case for most parameter estimation
algorithms we’ll see later on
Extendible to non-linear functions, e.g.
log-linear regression
Correlation
So far we have used linear regression to predict
target values (prediction)
Linear regression can also be used to determine
how closely two variables are correlated
(description)
The smaller the error of the fitted line, the stronger the
linear correlation between the variables
Correlation does mean that there is some
(interesting) relation between the variables (not
necessarily causal)
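The link between error and correlation can be made concrete. For the least-squares line in simple linear regression, the squared Pearson correlation coefficient equals one minus the relative squared error. The sketch below checks this numerically; the Pearson coefficient is not defined on the slides and is brought in here as an assumption about which correlation measure is meant, and the function names and toy data are hypothetical. The slope is computed in its centered form, which is algebraically equivalent to the formula for b given earlier.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    cov = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    var_x = sum((x - x_bar) ** 2 for x in xs)
    var_y = sum((y - y_bar) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def rse_of_least_squares_line(xs, ys):
    """Relative squared error of the best-fitting line y = a + b * x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Centered form of the slope formula.
    b = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
         / sum((x - x_bar) ** 2 for x in xs))
    a = y_bar - b * x_bar
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    total = sum((y - y_bar) ** 2 for y in ys)
    return sse / total


# Hypothetical toy data (not from the slides).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 11.0]

r = pearson_r(xs, ys)
rse = rse_of_least_squares_line(xs, ys)
print(r, r ** 2, 1.0 - rse)
```

A relative squared error near 0 therefore corresponds to a correlation near +1 or −1, while a value near 1 corresponds to almost no linear correlation.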
Recap
Linear regression
Error functions
Analytical parameter estimation
