15.071 THE ANALYTICS EDGE
SPRING 2015
Class Time
Section A
Lecture: Mondays and Wednesdays, 1:00pm – 2:30pm, Room E51-315
Recitation: Fridays, 2:00pm – 3:00pm, Room E51-335
Section B
Lecture: Mondays and Wednesdays, 2:30pm – 4:00pm, Room E51-315
Recitation: Fridays, 3:00pm – 4:00pm, Room E51-335
Instructors
Dimitris Bertsimas, E40-147, dbertsim@mit.edu, (617) 253-4223
Allison O’Hair, E40-111, akohair@mit.edu, (617) 452-2116
Teaching Assistants
TBA
Course Description
In the last decade, the amount of data available to organizations has reached
unprecedented levels. Companies and individuals who can use this data together with
analytics give themselves an edge over the competition. In this class, we examine
real-world examples of how analytics have been used to transform a business or industry.
These examples include Moneyball, eHarmony, the Framingham Heart Study, Twitter,
IBM Watson, and Netflix. Through these examples and many more, we cover the
following analytics methods and how to implement them: linear regression, logistic
regression, trees, text analytics, clustering, visualization, and optimization.
Readings
The readings are chapters from the following book:
Dimitris Bertsimas, Allison O’Hair and Bill Pulleyblank, The Analytics Edge,
Dynamic Ideas, March 2015.
We refer to the book below as the AE book. Electronic copies of some of the book
chapters are available on the Stellar course webpage (please do not distribute without
permission from the authors). We will also provide a copy of the “Analytics Edge R
Manual” on Stellar.
Contents
1. February 4, 2015 Lecture 1 – Introduction to the Analytics Edge
In the first lecture, we will discuss the logistics and goals of the class, the recent impact
of analytics, and the examples that will be covered during the semester. We will then
discuss analytics software, and start working in R. In preparation for this class, you will
need to install R on your personal computer (instructions on Stellar) and download the
datasets provided on Stellar. The reading for Lecture 1 is the first section of the Analytics
Edge R Manual titled “Introduction to R”.
2. February 9, 2015 Lecture 2 – Predicting Wine Quality
We’ll review linear regression, discuss how linear regression can be used to predict the
quality of wine, and cover linear regression in R. Download the dataset provided on
Stellar so you can follow along in class. The readings for Lecture 2 are the first section of
Chapter 1 of the AE book, titled “Predicting the Quality and Prices of Wine,” the first
section of Chapter 21 of the AE book, titled “Multiple Linear Regression,” and the
second section of the Analytics Edge R Manual titled “Linear Regression in R”.
3. February 11, 2015 Lecture 3 – Moneyball
We will discuss how the Oakland A’s used analytics to become a competitive baseball
team, and how these techniques can be applied to other sports. The reading for Lecture 3
is Chapter 4 of the AE book, titled “How to Evaluate Championship Players.”
4. February 17, 2015 Lecture 4 – The Framingham Heart Study
(NOTE: This class is on Tuesday due to Presidents' Day)
We will discuss the Framingham Heart Study, which led to one of the top 10 cardiology
advances of the 1900s, and paved the way for clinical decision rules. Through this
example, we’ll start discussing the method of logistic regression, and we’ll use the
original Framingham Heart Study data to build logistic regression models in R. The
readings for Lecture 4 are Chapter 7 of the AE book titled “The Framingham Heart
Study”, the second section of Chapter 21 of the AE book, titled “Logistic Regression”,
and the “Logistic Regression in R” section of the Analytics Edge R Manual.
5. February 18, 2015 Lecture 5 – Quality of Healthcare
We will discuss how analytics can be used to model the expertise of a physician and
predict the quality of healthcare. Through this example, we will continue to discuss the
method of logistic regression. The reading for Lecture 5 is the second section of Chapter
1 of the AE book, titled “Assessing Quality in Healthcare”.
6. February 23, 2015 Lecture 6 – The Supreme Court
We discuss how a group of academics predicted the outcomes of United States
Supreme Court cases. Through this example, we will discuss the analytical methods of CART
and Random Forests, and then use data for Supreme Court cases to build models in R.
The readings for Lecture 6 are the third section of Chapter 1 of the AE book, titled
“Forecasting Supreme Court Decisions,” the third section of Chapter 21 of the AE book,
titled “CART and Random Forests” and the “Trees in R” section of the Analytics Edge R
Manual.
7. February 25, 2015 Lecture 7 – D2Hawkeye
We will present the story of D2Hawkeye, a medical data mining company Dimitris
Bertsimas was involved in from 2001-2009, and present how analytics methods,
specifically CART, were used to predict medical knowledge for individual patients. The
reading for Lecture 7 is Chapter 8 of the AE book, titled “Predicting Healthcare Costs.”
8. March 2, 2015 Lecture 8 – Twitter Sentiment Detection
We present how tweets on the social networking site Twitter can be used to understand
public perception and analyze sentiment. Through this example, we’ll introduce the
method of text analytics, and use tweets in R to build models.
9. March 4, 2015 Lecture 9 – The eDiscovery Problem
In Lecture 9, we discuss how text analytics is being used to find files relevant to a
lawsuit. Specifically, we’ll discuss the story of Enron, and how analytics can be used to
detect relevant emails and provide evidence for a legal case.
10. March 9, 2015 Lecture 10 – Netflix and Clustering
We will discuss the Netflix Prize and recommendation systems in general. As an example
of a type of recommendation system, we introduce the method of clustering. The readings
for Lecture 10 are Chapter 13 of the AE book, titled “Recommendations Worth a
Million,” the fourth section of Chapter 21 of the AE book, titled “Clustering” and the
“Clustering in R” section of the Analytics Edge R Manual.
11. March 11, 2015 Lecture 11 – Patterns of Heart Attacks
We present how analytics have been used to understand the patterns of heart attacks. The
reading for Lecture 11 is Chapter 9 of the AE book, titled “Medical Monitoring and
Predictive Diagnosing.”
NO CLASS from March 16 – March 27 due to SIP week and Spring Break.
12. March 30, 2015 Lecture 12 – Fraud Detection
This week, we will discuss examples that have successfully combined many different
analytics methods to create an edge. We will first discuss how predictive methods and
clustering have been used to construct sophisticated algorithms for fraud detection. The
reading for Lecture 12 is Chapter 14 of the AE book, titled “Fraud Detection”.
13. April 1, 2015 Lecture 13 – IBM Watson
We will discuss how IBM built a computer that could beat the best human players at
Jeopardy, a game known for testing human knowledge and reasoning. The reading for
Lecture 13 is Chapter 3 of the AE book, titled “What is Watson?”.
14. April 6, 2015 Lecture 14 – The Power of Visualization
We will discuss the power of visualizations, specifically for WHO, the World Health
Organization. Through this example, we’ll learn how to create visualizations in R.
15. April 8, 2015 Lecture 15 – Data-Driven Policing
We will discuss the use of analytics and visualization in policing. Specifically, we’ll
create heat maps, or “hot spot” maps. These maps are currently being used by police
departments all over the country to allocate resources. The reading for Lecture 15 is
Chapter 15 of the AE book, titled “Predictive Policing”.
16. April 13, 2015 Lecture 16 – Sports Scheduling
We will discuss how professional sports use integer optimization to design sports
schedules, and how analytics methods can significantly outperform human scheduling.
Through this example, we’ll learn how to solve optimization models in a powerful
modeling language.
17. April 15, 2015 Lecture 17 – Revenue Management
We will discuss how optimization is used for revenue management, and how airlines and
casinos have relied on the power of analytics to create a competitive edge. The reading
for Lecture 17 is Chapter 17 of the AE book.
18. April 22, 2015 Lecture 18 – eHarmony
We will discuss how the online dating site eHarmony uses logistic regression and
optimization to predict the probability of love and find perfect matches. Through this
example, we’ll see how the results of a predictive model can be used in an optimization
model to make optimal decisions.
19. April 27, 2015 Lecture 19 – The MIT Blackjack Team
We will discuss how a group of MIT students made millions playing blackjack, and how
strategies were developed using data and simulation. The reading for Lecture 19 is
Chapter 6 of the AE book, titled “The MIT Blackjack Team.”
20. April 29, 2015 Lecture 20 – Emergency Room Operations
We will discuss how simulations and analytics can be used to understand the operations
in an emergency room, and to analyze the effects of different decisions on patient care
and hospital efficiency. The reading for Lecture 20 is Chapter 18 of the AE book.
21. May 4, 2015 Lecture 21 – Social Networks
We will discuss social networks, specifically how the social networks of gangs can be
used to better understand gang dynamics and combat crime. We will also discuss the use
of social networks in other applications. The reading for Lecture 21 is Chapter 16 of the
AE book.
22. May 6, 2015 Lecture 22 – Analytics in Finance
We will discuss the use of analytics in finance, including asset management and options
pricing. The readings for Lecture 22 are Chapters 19 and 20 of the AE book.
23. May 11, 2015 Student Project Presentations
During this lecture, selected students will make 15-minute presentations of their projects.
24. May 13, 2015 Student Project Presentations
During this lecture, selected students will make 15-minute presentations of their projects.
Recitations:
Recitations will be held on Fridays in Room E51-335 (2pm – 3pm for Section A, and
3pm – 4pm for Section B).
The recitations will be interactive sessions, covering additional examples on the analytics
methods learned in class, and how to create models in R. Attendance is strongly
encouraged.
Assignments:
There will be seven homework assignments, and a final project in teams of two.
The following are tentative due dates and topics for the homework assignments:
• February 17: Data analysis and linear regression in R.
• February 23: Logistic Regression.
• March 2: CART and Random Forests.
• March 9: Text analytics.
• March 30: Clustering.
• April 13: Visualization.
• April 27: Optimization.
All homework assignments are due by the beginning of class on the date assigned.
For the final project, by March 11, each team will submit a one-page proposal that
outlines a plan to apply some of the concepts and tools discussed in the course to a
problem the team identifies. It should include a description of: (1) the
problem, (2) the data that you have or plan to collect to solve the problem, (3) which
analytic techniques you plan to use, and (4) the impact or overall goal of the project (if
you could build a perfect model, what would it be able to do?). The teaching staff will be
available to answer questions over email, and will provide all students with electronic
feedback by March 20.
The week of April 13, each project team will set up a meeting with a member of the
teaching team to show its progress in applying the analytical methods learned in class to
its project topic. This meeting is intended to help the team make progress on its project.
The final project submission consists of a written report of at most 4 pages (not including
appendices) that describes your analysis, as well as a 15-minute presentation (in
PowerPoint or PDF format) of your project. Unfortunately, due to time constraints, we will
not be able to have all student teams present in class. However, ALL TEAMS should be
prepared to give a 15-minute presentation on May 11 or May 13, and all teams are
required to submit their presentation for a grade.
To determine who will present on May 11 and May 13, by midnight on Thursday, May 7,
each team will electronically submit (a) a one-page abstract summarizing its project
(including the scope and idea of the project, the analytical methods/models used,
and the results), and (b) the presentation. The abstracts will be uploaded to the class
website. Students will vote by the end of the day on Sunday, May 10, on which
projects they would like to see presented in class. The teaching team will vote as well
(taking the abstracts and presentations into account), and the presenters will be notified in
real time during class on May 11 and May 13.
Office Hours:
Allison: Mondays and Wednesdays, 9:30am – 10:30am in E40-111.
Teaching Assistants: TBD
We are also always available by appointment and email.
Policy on Individual Work:
Each homework assignment must represent your own individual work. You may discuss
homework problems with other students, but what you submit must be your own.
Copying from another individual or from
any outside source (including past homework solutions) constitutes a violation of the
Policy on Individual Work. Any student who copies or knowingly allows his/her work to
be copied will receive an F grade for the assignment. If there is a second offense, the
student will receive an F grade in the course.
You may find it useful to discuss broad conceptual issues and general solution procedures with
others. If this is the case, then we enthusiastically recommend that you do so. The objective
here is to learn. In our opinion (and personal experience), the material of this class is best
learned through individual practice and exposure to a variety of application contexts.
Class Participation and Conduct
Your class participation will be evaluated subjectively, but will rely upon measures of
punctuality, attendance, familiarity with the readings, relevance and insight reflected in
classroom questions, and commentary. Relative differences in technical background will
not be a criterion. Although several lectures will be didactic, we will rely heavily upon
interactive discussion within the class. Students will be expected to be familiar with the
readings, even though they might not understand all of the material in advance. In
general, questions and comments are encouraged.
We will require you to bring and use a personal laptop in some class sessions. However,
if we are not using laptops together as a class, we expect your laptops to be closed or only
used for class materials.
Grading:
Grades for the course will be based upon participation (10%), homework assignments
(50%), and the final project (40%).
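As a rough illustration of how these weights combine, the following Python sketch computes a weighted course score. It assumes each component is scored on a 0-100 scale and that the overall score is a simple weighted average; the syllabus does not specify the scale, how individual homework scores are aggregated, or how a numeric score maps to a letter grade, so the function name, dictionary keys, and example scores are all hypothetical.

```python
# Weights taken from the syllabus: participation 10%, homework 50%, project 40%.
WEIGHTS = {"participation": 0.10, "homework": 0.50, "project": 0.40}


def course_score(scores):
    """Return the weighted average of component scores.

    Assumption (not stated in the syllabus): each component score is
    already expressed on a common 0-100 scale.
    """
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing component scores: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical student: 90 participation, 80 homework average, 85 project.
example = {"participation": 90.0, "homework": 80.0, "project": 85.0}
print(round(course_score(example), 2))
```

With these hypothetical scores, homework contributes the most (0.50 × 80 = 40 points), followed by the project (0.40 × 85 = 34) and participation (0.10 × 90 = 9).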
Prerequisites:
It is highly recommended that students have taken 15.060 (Data, Models and Decisions),
or basic statistics and optimization courses.

15071

  • 1. 15.071 THE ANALYTICS EDGE SPRING 2015 Class Time Section A Lecture: Mondays and Wednesdays, 1:00pm – 2:30pm, Room E51-315 Recitation: Fridays, 2:00pm – 3:00pm, Room E51-335 Section B Lecture: Mondays and Wednesdays, 2:30pm – 4:00pm, Room E51-315 Recitation: Fridays, 3:00pm – 4:00pm, Room E51-335 Instructors Dimitris Bertsimas, E40-147, dbertsim@mit.edu, (617) 253-4223 Allison O’Hair, E40-111, akohair@mit.edu, (617) 452-2116 Teaching Assistants TBA Course Description In the last decade, the amount of data available to organizations has reached unprecedented levels. Companies and individuals who can use this data together with analytics give themselves an edge over the competition. In this class, we examine real world examples of how analytics have been used to transform a business or industry. These examples include Moneyball, eHarmony, the Framingham Heart Study, Twitter, IBM Watson, and Netflix. Through these examples and many more, we cover the following analytics methods and how to implement them: linear regression, logistic regression, trees, text analytics, clustering, visualization, and optimization. Readings The readings are chapters from the following book: Dimitris Bertsimas, Allison O’Hair and Bill Pulleyblank, The Analytics Edge, Dynamic Ideas, March 2015.
  • 2. We refer to the book below as the AE book. Electronic copies of some of the book chapters are available on the Stellar course webpage (please do not distribute without permission from the authors). We will also provide a copy of the “Analytics Edge R Manual” on Stellar. Contents 1. February 4, 2015 Lecture 1 – Introduction to the Analytics Edge In the first lecture, we will discuss the logistics and goals of the class, the recent impact of analytics, and the examples that will be covered during the semester. We will then discuss analytics software, and start working in R. In preparation for this class, you will need to install R on your personal computer (instructions on Stellar) and download the datasets provided on Stellar. The reading for Lecture 1 is the first section of the Analytics Edge R Manual titled “Introduction to R”. 2. February 9, 2015 Lecture 2 – Predicting Wine Quality We’ll review linear regression, discuss how linear regression can be used to predict the quality of wine, and cover linear regression in R. Download the dataset provided on Stellar so you can follow along in class. The readings for Lecture 2 are the first section of Chapter 1 of the AE book, titled “Predicting the Quality and Prices of Wine,” the first section of Chapter 21 of the AE book, titled “Multiple Linear Regression,” and the second section of the Analytics Edge R Manual titled “Linear Regression in R”. 3. February 11, 2015 Lecture 3 – Moneyball We will discuss how the Oakland A’s used analytics to become a competitive baseball team, and how these techniques can be applied to other sports. The reading for Lecture 3 is Chapter 4 of the AE book, titled “How to Evaluate Championship Players.” 4. February 17, 2015 Lecture 4 – The Framingham Heart Study (NOTE: This class is on Tuesday due to President’s Day) We will discuss the Framingham Heart Study, which led to one of the top 10 cardiology advances of the 1900s, and paved the way for clinical decision rules. 
Through this example, we’ll start discussing the method of logistic regression, and we’ll use the original Framingham Heart Study data to build logistic regression models in R. The readings for Lecture 4 are Chapter 7 of the AE book titled “The Framingham Heart Study”, the second section of Chapter 21 of the AE book, titled “Logistic Regression”, and the “Logistic Regression in R” section of the Analytics Edge R Manual. 5. February 18, 2015 Lecture 5 – Quality of Healthcare
  • 3. We will discuss how analytics can be used to model the expertise of a physician and predict the quality of healthcare. Through this example, we will continue to discuss the method of logistic regression. The reading for Lecture 5 is the second section of Chapter 1 of the AE book, titled “Assessing Quality in Healthcare”. 6. February 23, 2015 Lecture 6 – The Supreme Court We discuss how a group of academics predicted the outcomes of the United States Supreme Court. Through this example, we will discuss the analytical methods of CART and Random Forests, and then use data for Supreme Court cases to build models in R. The readings for Lecture 6 are the third section of Chapter 1 of the AE book, titled “Forecasting Supreme Court Decisions,” the third section of Chapter 21 of the AE book, titled “CART and Random Forests” and the “Trees in R” section of the Analytics Edge R Manual. 7. February 25, 2015 Lecture 7 – D2Hawkeye We will present the story of D2Hawkeye, a medical data mining company Dimitris Bertsimas was involved in from 2001-2009, and present how analytics methods, specifically CART, were used to predict medical knowledge for individual patients. The reading for Lecture 7 is Chapter 8 of the AE book, titled “Predicting Healthcare Costs.” 8. March 2, 2015 Lecture 8 – Twitter Sentiment Detection We present how tweets on the social networking site Twitter can be used to understand public perception and analyze sentiment. Through this example, we’ll introduce the method of text analytics, and use tweets in R to build models. 9. March 4, 2015 Lecture 9 – The eDiscovery Problem In Lecture 9, we discuss how text analytics is being used to find files relevant to a lawsuit. Specifically, we’ll discuss the story of Enron, and how analytics can be used to detect relevant emails and provide evidence for a legal case. 10. March 9, 2015 Lecture 10 – Netflix and Clustering We will discuss the Netflix Prize and recommendation systems in general. 
As an example of a type of recommendation system, we introduce the method of clustering. The readings for Lecture 10 are Chapter 13 of the AE book, titled “Recommendations Worth a Million,” the fourth section of Chapter 21 of the AE book, titled “Clustering” and the “Clustering in R” section of the Analytics Edge R Manual. 11. March 11, 2015 Lecture 11 – Patterns of Heart Attacks
  • 4. We present how analytics have been used to understand the patterns of heart attacks. The reading for Lecture 11 is Chapter 9 of the AE book, titled “Medical Monitoring and Predictive Diagnosing.” NO CLASS from March 16 – March 27 due to SIP week and Spring Break. 12. March 30, 2015 Lecture 12 – Fraud Detection This week, we will discuss examples that have successfully combined many different analytics methods to create an edge. We will first discuss how predictive methods and clustering have been used to construct sophisticated algorithms for fraud detection. The reading for Lecture 21 is Chapter 14 of the AE book, titled “Fraud Detection”. 13. April 1, 2015 Lecture 13 – IBM Watson We will discuss how IBM build a computer that could beat the best human players at Jeopardy, a game known for testing human knowledge and reasoning. The reading for Lecture 13 is Chapter 3 of the AE book, titled “What is Watson?”. 14. April 6, 2015 Lecture 14 – The Power of Visualization We will discuss the power of visualizations, specifically for WHO, the World Health Organization. Through this example, we’ll learn how to create visualizations in R. 15. April 8, 2015 Lecture 15 – Data-Driven Policing We will discuss the use of analytics and visualization in policing, specifically, we’ll create heat maps, or “hot spot” maps. These maps are currently being used by police departments all over the country to allocate resources. The reading for Lecture 15 is Chapter 15 of the AE book, titled “Predictive Policing”. 16. April 13, 2015 Lecture 16 – Sports Scheduling We will discuss how professional sports use integer optimization to design sports schedules, and how analytics methods can significantly outperform human scheduling. Through this example, we’ll learn how to solve optimization models in a powerful modeling language. 17. 
April 15, 2015 Lecture 17 – Revenue Management We will discuss how optimization is used for revenue management, and how airlines and casinos have relied on the power of analytics to create a competitive edge. The reading for Lecture 17 is Chapter 17 of the AE book.
  • 5. 18. April 22, 2015 Lecture 18 – eHarmony We will discuss how the online dating site eHarmony uses logistic regression and optimization to predict the probability of love and find perfect matches. Through this example, we’ll see how the results of a predictive model can be used in an optimization model to make optimal decisions. 19. April 27, 2015 Lecture 19 – The MIT Blackjack Team We will discuss how a group of MIT students made millions playing blackjack, and how strategies were developed using data and simulation. The reading for Lecture 19 is Chapter 6 of the AE book, titled “The MIT Blackjack Team.” 20. April 29, 2015 Lecture 20 – Emergency Room Operations We will discuss how simulations and analytics can be used to understand the operations in an emergency room, and to analyze the effects of different decisions on patient care and hospital efficiency. The reading for Lecture 20 is Chapter 18 of the AE book. 21. May 4, 2015 Lecture 21 – Social Networks We will discuss social networks, specifically how the social networks of gangs can be used to better understand gang dynamics and combat crime. We will also discuss the use of social networks in other applications. The reading for Lecture 21 is Chapter 16 of the AE book. 22. May 6, 2015 Lecture 22 – Analytics in Finance We will discuss the use of analytics in finance, including asset management and options pricing. The readings for Lecture 22 are Chapters 19 and 20 of the AE book. 23. May 11, 2015 Student Project Presentations During this lecture, selected students will make 15 minute presentations of their projects. 24. May 13, 2015 Student Project Presentations During this lecture, selected students will make 15 minute presentations of their projects. Recitations: Recitations will be held on Fridays in Room E51-335 (2pm – 3pm for Section A, and 3pm – 4pm for Section B).
The recitations will be interactive sessions, covering additional examples of the analytics methods learned in class and how to create models in R. Attendance is strongly encouraged.
Assignments:
There will be seven homework assignments and a final project in teams of two. The following are tentative due dates and topics for the homework assignments:
• February 17: Data analysis and linear regression in R.
• February 23: Logistic regression.
• March 2: CART and random forests.
• March 9: Text analytics.
• March 30: Clustering.
• April 13: Visualization.
• April 27: Optimization.
All homework assignments are due by the beginning of class on the date assigned.
For the final project, by March 11, each team will submit a one-page proposal that outlines a plan to apply analytical methods to a problem you identify, using some of the concepts and tools discussed in the course. It should include a description of: (1) the problem, (2) the data that you have or plan to collect to solve the problem, (3) which analytic techniques you plan to use, and (4) the impact or overall goal of the project (if you could build a perfect model, what would it be able to do?). The teaching staff will be available to answer questions over email, and will provide all students with electronic feedback by March 20.
The week of April 13, each project team will set up a meeting with a member of the teaching team to show your progress applying the analytical methods you have learned to your project topic. This meeting is intended to help you progress on your project.
The final project submission consists of a written report of at most 4 pages (not including appendices) that describes your analysis, as well as a 15-minute presentation (in PowerPoint or PDF format) of your project. Unfortunately, due to time constraints, we will not be able to have all student teams present in class.
However, ALL TEAMS should be prepared to give a 15-minute presentation on May 11 or May 13, and all teams are required to submit their presentation for a grade. To determine who will present on May 11 and May 13, by midnight on Thursday, May 7, each team will electronically submit (a) a one-page abstract summarizing the project (including its scope and idea, the analytical methods and models used, and the results), and (b) the presentation. The abstracts will be uploaded to the class website. Students will vote by the end of the day on Sunday, May 10 on which projects they would like to see presented in class. The teaching team will vote as well (taking the abstracts and presentations into account), and the presenters will be notified in real time during class on May 11 and May 13.
Office Hours:
Allison: Mondays and Wednesdays, 9:30am – 10:30am in E40-111.
Teaching Assistants: TBD
We are also always available by appointment and email.
Policy on Individual Work:
In the case of homework assignments, your assignment must represent your own individual work. Although you may discuss homework problems with other students, the work you submit must be your own. Copying from another individual or from any outside source (including past homework solutions) constitutes a violation of the Policy on Individual Work. Any student who copies or knowingly allows his/her work to be copied will receive an F grade for the assignment. If there is a second offense, the student will receive an F grade in the course.
You may find it useful to discuss broad conceptual issues and general solution procedures with others. If this is the case, then we enthusiastically recommend that you do so. The objective here is to learn. In our opinion (and personal experience), the material of this class is best learned through individual practice and exposure to a variety of application contexts.
Class Participation and Conduct
Your class participation will be evaluated subjectively, but will rely upon measures of punctuality, attendance, familiarity with the readings, and the relevance and insight reflected in classroom questions and commentary. Relative differences in technical background will not be a criterion. Although several lectures will be didactic, we will rely heavily upon interactive discussion within the class. Students will be expected to be familiar with the readings, even though they might not understand all of the material in advance. In general, questions and comments are encouraged.
We will require you to bring and use a personal laptop in some class sessions. However, if we are not using laptops together as a class, we expect your laptops to be closed or only used for class materials.
Grading:
Grades for the course will be based upon participation (10%), homework assignments (50%), and the final project (40%).
Prerequisites:
It is highly recommended that students have taken 15.060 (Data, Models and Decisions), or basic statistics and optimization courses.