STATISTICAL MEASUREMENTS,
ANALYSIS & RESEARCH
Final Presentation
• Weixi Tan
• Net ID: wt2084
• NYU SPS Integrated Marketing
• Professor: Luyao Zhang
Outline
• Part I: Introduction
• Part II: Summary of course takeaway
• Part III: Market research report: Regression analysis
• Part VI: Appendix
Part I: Introduction
• Weixi (Vicky) Tan comes from Chongqing, China. She received her bachelor's degree in statistics.
She interned as a reporter for CQNEWS.Net, Chongqing's first and largest news portal, where she
independently completed several interviewing and news reporting tasks at the 2019 Smart China
exposition for her client, China Telecom Co., Ltd. In the process, she became involved in news media
marketing. She was also one of the people in charge of Match, a photography studio run as a university
students' innovative undertaking program. Its main business is professional photography and
commercial shoots for large-scale events and each year's graduation season. Under her management,
Match grew from a small start-up studio into one that dominates the university market and is gaining
popularity beyond campus. Through this work she came to appreciate the importance of marketing
planning and execution for businesses, and she began to pursue a marketing career after systematic
graduate study of marketing at NYU. She is passionate about photography, volunteering, traveling,
and exploring new things.
• LinkedIn URL: https://www.linkedin.com/in/weixi-tan-5384911a4/
• GitHub Repo URL: https://github.com/WeixiTan/NYU_Integrated_Marketing
• Kaggle Notebook URL: https://www.kaggle.com/weixitan/customer-segementation-wt2084
Part II: Summary of course takeaway
I drew several simple mind maps:
Part II: Summary of course takeaway
Key learning and my takeaway for personal and professional growth:
• Use tools such as Google Data Studio, GitHub, and Kaggle, and apply Python code.
• The T-test is a basic and important tool in hypothesis testing for continuous variables.
• To show whether variables are related, we should use correlation analysis. Regression analysis
should be used to quantify how much one variable affects another.
• Cluster analysis is a class of techniques used to classify and segment our target customers.
• Different methods suit different types of data sets, and each model has its own conditions of
applicability. As a market analyst, you may need to try many methods and models, and fail many times,
before finding the most appropriate way to help your company make the best marketing decisions.
• Statistical analysis and research give me quantitative and qualitative techniques for developing consumer
insights, determining market potential, maximizing market share, and building customer relationships in an
integrated-marketing environment.
• If I want to work in analytical positions in the future, I still need more statistical knowledge and
skill to improve my statistical thinking. This course really helped me and made me realize the importance of
statistical analysis in my future work.
• In today's big data age, mastering the data analytics techniques learned in this course can make us more
competitive when applying for a great job.
Part III: Market research report: Regression analysis
Executive Summary
• The URL to the data source:
• https://www.kaggle.com/rafailmahammadli/regression
• This Kaggle advertising data set contains the sales of a product in 200 different markets and the
advertising budgets for the product in TV, radio, and newspaper. Sales are measured in thousands of
units, and the advertising budgets in thousands of dollars.
• We use linear regression in this report to test the relationships. The results show that TV and sales are
correlated, and radio and sales are correlated, which implies we should pay more attention to TV and
radio advertising.
• GitHub Repo Link: https://github.com/WeixiTan/NYU_Integrated_Marketing
Research Design and The Data
Google Data Studio Link:
https://datastudio.google.com/embed/reporting/02f262af-782f-4785-8904-e45a433b21ee/page/1tUrB
Abstract: The data cover the sales of a product in 200
different markets and the advertising budgets for the
product in TV, radio, and newspaper. By visualizing
secondary data from Kaggle, we find that the TV advertising
budget is much larger than the radio and newspaper budgets.
We want to study the relationship between sales and the
three advertising variables, and which media channel has
the greater impact on product sales.
We use linear regression in this report to test the
relationships. The results show that TV and sales are
correlated, and radio and sales are correlated.
Through such research, product managers can make
better marketing decisions, such as which channel should
receive more budget.
Scatter plots
We draw three scatter plots: the first for TV and sales, the second for radio and sales, and the third for newspaper and sales.
Scatter plot 1 shows that TV advertising and sales have a linear relationship.
Scatter plot 2 shows that radio advertising and sales have a linear relationship.
Scatter plot 3 shows that newspaper advertising and sales have a linear relationship, but it is less clear.
Regression result
From the regression result, the p-value of TV is reported
as 0, which is smaller than 0.05, so we can reject
the null hypothesis that TV and sales are not
correlated.
The p-value of radio is reported as 0, which is smaller
than 0.05, so we can reject the null hypothesis that
radio and sales are not correlated.
The p-value of newspaper is 0.860, which is greater
than 0.05, so we cannot reject the null hypothesis
that newspaper and sales are not correlated.
Insights
• From the regression result, we can conclude that TV advertising and radio advertising
have a significant effect on product sales at the 5% significance level, but we find no
significant relationship between newspaper advertising and sales.
• This conclusion implies that traditional media still play an important role in marketing
campaigns, and that the company should pay more attention to TV and radio advertising
than to newspaper advertising.
Assumptions Check
• We then check the 6 assumptions of the linear model.
• The results show that assumptions 2 and 3 are likely to be satisfied, but assumptions 1, 4, and 6
are not.
For assumption 1, the error term is not normally
distributed: for each fixed value of X, the distribution
of Y is not normal.
For assumption 3, the mean of the error term is 0.
Assumptions Check
• For assumption 2, the scatter plots above suggest that the means of the distributions of Y, given X,
lie on a straight line. So TV and sales have a linear relationship, radio and sales have a linear
relationship, and newspaper and sales have a linear relationship.
• For assumption 4, the variance of the error term is not constant: it depends on the values
assumed by X.
For assumption 5, the data set is not time
series data, so we omit this check here.
Assumptions Check
• For assumption 6, TV and radio are not correlated, and TV and newspaper are not correlated,
but radio and newspaper are correlated, so there may be a multicollinearity issue.
Further research:
• From the regression result, the p-value of newspaper is
0.860, which is greater than 0.05, so we cannot reject the
null hypothesis that newspaper and sales are not correlated.
The assumption check also shows that some assumptions
are not likely to be satisfied.
• We should therefore treat this linear regression model
with caution. One option is to remove the variable
(newspaper) that does not have a significant impact on
product sales.
• Besides, based on the scatter plot of residuals against
predictions, we can consider non-linear regression.
• This research focuses on traditional media, but we could
also collect data on new media, such as social media, to
analyze.
Capstone Project Milestone 2: Research Design and The Data
Part VI: Appendix
Capstone Project Milestone 3-Hypothesis Testing
• Data source:
• https://data.world/data-society/bank-marketing-data
The data relate to the direct marketing campaigns (phone calls) of a Portuguese banking institution.
• https://stats.oecd.org
The data are quarterly growth rates of GDP in volume for G20 countries.
• Tests: Paired T-test; Two-sample T-test; Pearson test of correlation.
All of the tests returned statistically significant results.
• GitHub:
https://github.com/WeixiTan/NYU_Integrated_Marketing
Paired T-test
• Because the data are before-and-after observations on the same sample (measured twice,
resulting in pairs of observations), we pick the paired T-test.
• Conclusion: The p-value is reported as 0.0, which is less than 0.05, so we can reject the null
hypothesis that there is no significant difference between the mean GDP level in 2018 and in 2020.
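The paired test works on the per-unit differences. A minimal sketch follows, using six invented growth rates rather than the OECD figures. The function name `paired_t` is hypothetical, and the critical value 2.571 is the two-sided 5% point of Student's t with 5 degrees of freedom, taken from a standard t table.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(before, after):
    """Return (t statistic, degrees of freedom) for H0: mean difference = 0."""
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1

# Hypothetical growth rates for six economies (not the OECD data).
growth_2018 = [0.6, 0.4, 0.7, 0.5, 0.3, 0.8]
growth_2020 = [-1.9, -2.4, -1.5, -2.2, -2.8, -1.6]

t, df = paired_t(growth_2018, growth_2020)
T_CRIT_5DF = 2.571  # two-sided 5% critical value for 5 degrees of freedom
verdict = "reject H0" if abs(t) > T_CRIT_5DF else "cannot reject H0"
print(f"t = {t:.2f}, df = {df} -> {verdict}")
```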
• Because the data are metric but not paired, and fall into two groups, we pick the two-sample T-test.
• Conclusion: The p-value is less than 0.05, so at the 0.05 significance level we can reject the null
hypothesis that the mean balance is the same for those who have a loan and those who do not.
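A two-sample comparison of means can be sketched as below. The slides do not say which variant of the two-sample test was used, so this sketch assumes Welch's unequal-variance form and, as a further assumption, a normal approximation for the p-value, which is only reasonable for large groups. The balances are generated by a fixed formula for illustration and are not the bank marketing data.

```python
from math import sqrt
from statistics import NormalDist, mean, variance

def welch_t(a, b):
    """Return (t statistic, approximate two-sided p-value) for H0: equal means."""
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    t = (mean(a) - mean(b)) / se
    # ASSUMPTION: normal approximation, reasonable only for large groups.
    p = 2 * (1 - NormalDist().cdf(abs(t)))
    return t, p

# Hypothetical account balances for 60 clients in each group.
with_loan = [800 + 37 * (i % 13) for i in range(60)]
without_loan = [1300 + 41 * (i % 17) for i in range(60)]

t, p = welch_t(with_loan, without_loan)
verdict = "reject H0" if p < 0.05 else "cannot reject H0"
print(f"t = {t:.2f}, p = {p:.4f} -> {verdict}")
```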
• Because the two variables are normally distributed with no outliers, we pick the Pearson test.
• Conclusion: The p-value is reported as 0.0, which is less than 0.05, so we can reject the null
hypothesis that GDP for the same country in 2018 and in 2020 is not correlated.
• Conclusion: For an effect size of 0.12, a power of 0.8, and a type Ⅰ error rate of 0.05, we need a
sample size of 25.
For the Two Sample T-test based on bank marketing data:
• Limitations: The paired T-test cannot be applied to the bank marketing data because the data are not paired.
• Future research plan: We can decide the sample size through power analysis and collect data on
bank clients before the marketing campaign, so that we can measure the difference before and after the
campaign and thereby measure the effect of the campaign (phone calls or similar).
Capstone Project Milestone 4: Regression
Executive Summary
• Data source:
• https://www.kaggle.com/c/customer-churn-prediction-2020/overview
• This is the data set of a 2020 Kaggle competition. Its original purpose is to predict whether a
customer will change telecommunications provider, something known as "churning". "total_day_calls",
"total_eve_calls" and "total_night_calls" mean "total number of day calls", "total number of evening calls" and
"total number of night calls".
• We use linear regression in this report to test the relationships. The results show that total day calls and total night
calls are not correlated, and total evening calls and total night calls are not correlated, which implies we should
design customized packages for day calls and night calls.
• GitHub:
https://github.com/WeixiTan/NYU_Integrated_Marketing
Scatter plots
• We draw two scatter plots: one for total day calls and
total night calls, the other for total evening calls and
total night calls.
• Scatter plot 1 shows that total day calls and total night
calls have a linear relationship.
• Scatter plot 2 shows that total evening calls and total
night calls have a linear relationship.
Regression result
• From the regression result, the p-values are
0.756 and 0.438, both greater than 0.05.
We cannot reject the null hypotheses that
total day calls and total night calls are
not correlated, and that total evening calls
and total night calls are not correlated.
Insights
• From the regression result, we find no significant effect of total day calls or total
evening calls on total night calls at the 5% significance level.
• This conclusion implies that we should design customized packages for day calls,
evening calls, and night calls.
Assumptions Check
We then check the 6 assumptions of the linear model.
The results show that all the assumptions are likely to be satisfied.
• For assumptions 1 and 3, the error term is normally
distributed: for each fixed value of X, the distribution of Y is
normal. The mean of the error term is 0.
• For assumption 2, from the scatter plots above, the means of
all these normal distributions of Y, given X, lie on a straight
line. So total day calls and total night calls have a linear
relationship, and total evening calls and total night calls
also have a linear relationship.
• For assumption 4, the variance of the error term
is constant: it does not depend on the
values assumed by X.
• For assumption 5, the data set is not time series
data, so we omit this check here.
• For assumption 6, the independent variables in X
are not correlated, so there is no multicollinearity
issue.
• Further research:
• Although we can design customized packages for different periods, we need further,
more detailed research to see how to design them. For example, we can run another
linear regression to see whether total day minutes or total day calls has the more
significant effect on total day charge, and so whether we should raise the price
per minute or per call.
Capstone Project Milestone 5: Customer Segmentation
Executive Summary
• Data source:
https://www.kaggle.com/hellbuoy/online-retail-customer-clustering
This is an online retail transnational data set which contains all the transactions occurring between
01/12/2010 and 09/12/2011 for a UK-based and registered non-store online retailer. The company mainly sells
unique all-occasion gifts. Many customers of the company are wholesalers. The business goal is to build an
RFM clustering and choose the best set of customers for the company to target.
We choose K-means clustering and hierarchical clustering. The result is that K-means clustering returns
57 customers and hierarchical clustering returns 2 customers, a much smaller group than the
one that K-means clustering returns.
• Kaggle Notebook:
https://www.kaggle.com/weixitan/customer-segementation-wt2084
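The RFM features behind the clustering (recency, frequency, monetary amount) can be built from raw transactions roughly as follows. This is a sketch on invented rows, not the notebook's code: the tuple layout, the function name `rfm_table`, and the choice to count every row as one purchase are assumptions. The snapshot date assumes that 09/12/2011 is written day/month/year.

```python
from collections import defaultdict
from datetime import date

# Hypothetical transactions: (customer_id, invoice_date, amount).
transactions = [
    ("C1", date(2011, 12, 1), 120.0),
    ("C1", date(2011, 11, 15), 80.0),
    ("C2", date(2011, 6, 3), 40.0),
    ("C3", date(2011, 12, 8), 500.0),
    ("C3", date(2011, 10, 20), 300.0),
    ("C3", date(2011, 9, 5), 250.0),
]
# ASSUMPTION: last day covered by the data, reading 09/12/2011 as 9 December.
snapshot = date(2011, 12, 9)

def rfm_table(rows, as_of):
    """Return {customer: (recency_days, frequency, monetary)}."""
    last_seen = {}
    frequency = defaultdict(int)
    monetary = defaultdict(float)
    for customer, day, amount in rows:
        last_seen[customer] = max(day, last_seen.get(customer, day))
        frequency[customer] += 1   # one row counted as one purchase
        monetary[customer] += amount
    return {
        c: ((as_of - last_seen[c]).days, frequency[c], monetary[c])
        for c in last_seen
    }

rfm = rfm_table(transactions, snapshot)
for customer, (recency, freq, amount) in rfm.items():
    print(f"{customer}: recency = {recency} days, frequency = {freq}, amount = {amount:.0f}")
```

In practice the three columns would usually be rescaled before clustering, since amount and recency are on very different scales.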
K-Means Clustering: Finding the best k - The Elbow Method
• Because the first k does not return a result, we choose the second k as
the best k, so with metric="silhouette" we get
best k = 3.
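Choosing k by silhouette score can be sketched in plain Python. This is not the notebook's code: it is a small stand-in with a hand-written k-means (Lloyd's algorithm with a deterministic farthest-point start, an assumption made here for reproducibility) and twelve invented 2-D points in three groups. Python 3.8 or later is assumed for `math.dist`.

```python
from math import dist

def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm with deterministic farthest-point initialization."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(points, key=lambda p: min(dist(p, c) for c in centroids)))
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda j: dist(p, centroids[j])) for p in points]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster empties
                centroids[j] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return labels

def silhouette(points, labels):
    """Mean silhouette coefficient; singleton clusters score 0 by convention."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)
            continue
        a = sum(dist(p, q) for q in own) / (len(own) - 1)  # cohesion
        b = min(sum(dist(p, q) for q in other) / len(other)  # nearest other cluster
                for other_lab, other in clusters.items() if other_lab != lab)
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)

# Hypothetical, well-separated points standing in for scaled RFM features.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 0), (10, 1), (11, 0), (11, 1),
          (5, 9), (5, 10), (6, 9), (6, 10)]

scores = {k: silhouette(points, kmeans(points, k)) for k in range(2, 5)}
best_k = max(scores, key=scores.get)
print(scores, "best k =", best_k)
```

The silhouette score rewards clusters that are tight and far from their neighbors, so it peaks at the k that matches the natural grouping of the points.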
K-Means Clustering: Interpreting the Clustering
By the RFM criteria, we should choose the customer clusters with
a lower recency, a higher frequency and amount. From the K-
means clustering results, we can see that customers with
Cluster_Id 2 best fit the criteria.
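The selection rule above can be sketched by averaging each RFM feature per cluster and ranking the clusters. The rows are invented, and ranking by recency first, then frequency, then amount is an assumption made for this illustration. The slide does not state how ties between the criteria are resolved.

```python
from statistics import mean

# Hypothetical (cluster_id, recency_days, frequency, amount) rows.
customers = [
    (0, 200, 2, 150.0), (0, 180, 1, 90.0),
    (1, 40, 10, 900.0), (1, 60, 8, 700.0),
    (2, 5, 40, 6000.0), (2, 9, 35, 5200.0),
]

def cluster_profile(rows):
    """Return {cluster_id: (mean recency, mean frequency, mean amount)}."""
    grouped = {}
    for cid, recency, frequency, amount in rows:
        grouped.setdefault(cid, []).append((recency, frequency, amount))
    return {cid: tuple(mean(col) for col in zip(*vals)) for cid, vals in grouped.items()}

profile = cluster_profile(customers)

# Lower recency is better; higher frequency and amount are better.
# ASSUMPTION: recency is compared first, then frequency, then amount.
best = min(profile, key=lambda c: (profile[c][0], -profile[c][1], -profile[c][2]))
print(profile, "best cluster:", best)
```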
K-Means Clustering: Interpreting the Clustering
• We can see that K-means clustering
returns 57 customers.
Hierarchical Clustering
• Visualize the dendrogram (tree) by different linkage methods
Hierarchical Clustering: Visualize and Interpret Result
• By the RFM criteria, we should choose the customer cluster with a lower recency, a higher
frequency and amount.
• From the hierarchical clustering results, we can see that customers with Cluster_Id 1 best fit the
high Frequency criterion, but customers with Cluster_Id 2 best fit the high Amount criterion.
Hierarchical Clustering: Interpreting the Clustering
• We can see that hierarchical clustering returns 2 customers, which is a much smaller group than the
one that K-means clustering returns.
• If the manager values frequency more, we choose Cluster_Id 2, and the company can offer
daily discounts to these customers in future marketing campaigns. If the manager cares more about amount,
we choose Cluster_Id 1, and the company can offer a discount on purchases over a certain amount.
Thanks !
Presented by Weixi Tan

wt2084 final presentation slides

  • 1. STATISTICAL MEASUREMENTS, ANALYSIS & RESEARCH Final Presentation  Weixi Tan  Net ID: wt2084  NYU SPS Integrated Marketing  Professor: Luyao Zhang
  • 2. Outline  Part I: Introduction  Part II: Summary of course takeaway  Part III: Market research report: Regression analysis  Part VI: Appendix
  • 3. Part I: Introduction  Weixi (Vicky) Tan comes from Chongqing, China. She received her Bachelor's degree in statistics. She has interned as a reporter for CQNEWS.Net, which is Chongqing's 1st and largest portal news website, independently completing several interviewing and news reporting tasks at the 2019 Smart China exposition for her clients, China Telecom Co., Ltd. During the process, she got involved in news media marketing. She was one of the persons in charge of a photography studio called Match, which is a university students’ innovative undertaking program. Its main business are taking professional photographs or commercial shoot for large-scale events and each year’s graduation season. Her operation at Match has driven the studio to develop from a small studio in its primary start-up phase into one dominating university market and also gain increasing popularity beyond campus. Additionally, she felt the significance of marketing planning and execution for businesses and began to pursuit a real marketing career after a systematic study of marketing at the graduate level at NYU. She is passionate about photography, volunteering, traveling, and exploring new things.  LinkedIn URL:https://www.linkedin.com/in/weixi-tan-5384911a4/  Github Repo URL:https://github.com/WeixiTan/NYU_Integrated_Marketing  Kaggle Notebook URL:https://www.kaggle.com/weixitan/customer-segementation-wt2084
  • 4. Part II: Summary of course takeaway I draw several simple mind maps:
  • 5.  Use tools such as Google data studio, Github, Kaggle and apply python codes.  T-test is very basic and important in Hypothesis Testing for testing continuous variables.  If we want to demonstrate the relativity of variables, we should use relevant analysis. Regression analysis should be used if we want to reflect how much one variable affects another.  Cluster analysis is a class of techniques used to classify and segment our target customers.  Different methods are used for different types of data sets and each model has its own applicable conditions. As a market analyst, you may try many methods and models and fail many times to find the most appropriate way to help your company make the best marketing decisions.  Statistical analysis and research provide me quantitative and qualitative techniques for developing consumer insights, determining market potential, maximizing market share and building customer relationships in an integrated-marketing environment.  If I want to engage in analytical positions in the future, I still need more statistical analysis knowledge and ability to improve my statistical thinking. This course really help me and make me realized the importance of statical analysis in further work.  In today‘s big data age, mastering some data analytics technologies learned in this course can enhance our competitiveness when applying for a great job. Part II: Summary of course takeaway Key learning and my takeaway for personal and professional growth:
  • 6. Executive Summary  The URL to the data source:  https://www.kaggle.com/rafailmahammadli/regression  This is a Kaggle data sets for advertising data contain information about Sales of a product in 200 different markets and the advertising budgets for the product in TV, Radio and Newspaper. Sales are measured in thousands of units, the advertising budgets in thousands of dollars.  We use linear regression in this report to test the correlations. The results shows that TV and Sales are correlated, and Radio and Sales are correlated, which implies we should pay more attention on TV and Radio advertising.  Github Repo Link:https://github.com/WeixiTan/NYU_Integrated_Marketing Part III: Market research report: Regression analysis
  • 7. Research Design and The Data Google Data studio Link : https://datastudio.google.com/embed/reporting/02f262af-782f-4785-8904- e45a433b21ee/page/1tUrB Abstract: This data is about Sales of a product in 200 different markets and the advertising budgets for the product in TV, Radio and Newspaper. By visualizing secondary data from Kaggle, we find TV advertising budget is much bigger than radio and newspaper. We want to research on the relationship between sales and other three variables and which media channel has a further impact on product sales. We use linear regression in this report to test the correlations. The results shows that TV and Sales are correlated, and Radio and Sales are correlated. Through such research, product managers could make better marketing decisions like on which channel to put more budget.
  • 8. Scatter plots. We draw three scatter plots: the first for TV and sales, the second for radio and sales, and the third for newspaper and sales. Scatter plot 1 shows that TV advertising and sales have a linear relationship. Scatter plot 2 shows that radio advertising and sales have a linear relationship. Scatter plot 3 shows that newspaper advertising and sales have a linear relationship, but a less clear one.
  • 9. Regression result. From the regression result, the p-value for TV is approximately 0, which is smaller than 0.05, so we can reject the null hypothesis that TV and sales are not correlated. The p-value for radio is approximately 0, which is smaller than 0.05, so we can reject the null hypothesis that radio and sales are not correlated. The p-value for newspaper is 0.860, which is greater than 0.05, so we cannot reject the null hypothesis that newspaper and sales are not correlated.
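The regression step can be illustrated with a minimal sketch. It assumes synthetic data (not the Kaggle file), made-up coefficients, and a normal approximation to the t distribution for the p-values; the function name `ols` and all numbers are hypothetical, and this is not necessarily how the original notebook computed its table.

```python
import math
import numpy as np

# Synthetic stand-in for the advertising data (NOT the Kaggle file):
# 200 markets, budgets in thousands of dollars, sales in thousands of units.
rng = np.random.default_rng(0)
n = 200
tv = rng.uniform(0, 300, n)
radio = rng.uniform(0, 50, n)
newspaper = rng.uniform(0, 100, n)
# Assumed data-generating process: newspaper has no effect on sales.
sales = 3.0 + 0.045 * tv + 0.19 * radio + rng.normal(0, 1.7, n)

def ols(X, y, names):
    """Least-squares fit with an intercept; returns {name: (coef, t, approx_p)}."""
    X = np.column_stack([np.ones(len(y)), X])        # prepend intercept column
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    dof = len(y) - X.shape[1]
    sigma2 = resid @ resid / dof                     # residual variance
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))
    t = beta / se
    # Two-sided p-value from a normal approximation (reasonable for n = 200).
    p = [math.erfc(abs(ti) / math.sqrt(2)) for ti in t]
    return {nm: (b, ti, pi) for nm, b, ti, pi in zip(["const"] + names, beta, t, p)}

result = ols(np.column_stack([tv, radio, newspaper]), sales,
             ["TV", "Radio", "Newspaper"])
for name, (coef, t, p) in result.items():
    print(f"{name:10s} coef={coef:8.4f}  t={t:8.2f}  p~{p:.3f}")
```

A coefficient whose p-value falls below 0.05 is treated as significant, which mirrors the decision rule used on the slide.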
  • 10. Insights. From the regression result, we can conclude that TV advertising and radio advertising have a significant effect on product sales at the 5% significance level (95% confidence), while newspaper advertising shows no significant effect on sales. This conclusion implies that traditional media still play an important role in marketing campaigns, and that the company should pay more attention to TV and radio advertising than to newspaper advertising.
  • 11. Assumptions Check. We then check the 6 assumptions of the linear model. Results show that assumptions 2 and 3 are likely to be satisfied, but assumptions 1, 4 and 6 are not. For assumption 1, the error term is not normally distributed: for each fixed value of X, the distribution of Y is not normal. For assumption 3, the mean of the error term is 0.
  • 12. Assumptions Check. For assumption 2, from the scatter plots above, the means of the distributions of Y, given X, lie on a straight line. So TV and sales have a linear relationship, radio and sales have a linear relationship, and newspaper and sales have a linear relationship. For assumption 4, the variance of the error term is not constant; it depends on the values of X. For assumption 5, the data set is not time series data, so we omit it here.
  • 13. Assumptions Check. For assumption 6, TV and radio are not correlated, and TV and newspaper are not correlated, but radio and newspaper are correlated, so there may be a multicollinearity issue.
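One way to examine assumption 6 is a correlation matrix of the predictors together with variance inflation factors (VIF). The VIF is not mentioned on the slides and is added here as an illustration; the data is synthetic, with newspaper deliberately tied to radio, and the helper name `vif` is hypothetical.

```python
import numpy as np

def vif(X):
    """Variance inflation factor for each column of X (n_samples x n_features)."""
    X = np.asarray(X, dtype=float)
    factors = []
    for j in range(X.shape[1]):
        y = X[:, j]
        others = np.delete(X, j, axis=1)
        A = np.column_stack([np.ones(len(y)), others])
        beta, *_ = np.linalg.lstsq(A, y, rcond=None)   # regress column j on the rest
        resid = y - A @ beta
        centered = y - y.mean()
        r2 = 1.0 - (resid @ resid) / (centered @ centered)
        factors.append(1.0 / (1.0 - r2))
    return factors

# Synthetic predictors (assumption): newspaper is built from radio plus noise.
rng = np.random.default_rng(1)
n = 200
tv = rng.uniform(0, 300, n)
radio = rng.uniform(0, 50, n)
newspaper = 1.5 * radio + rng.normal(0, 15, n)
X = np.column_stack([tv, radio, newspaper])

corr = np.corrcoef(X, rowvar=False)   # 3 x 3 correlation matrix of the predictors
vifs = vif(X)
print(np.round(corr, 2))
print([round(v, 2) for v in vifs])    # order: TV, radio, newspaper
```

A VIF close to 1 suggests a predictor is nearly independent of the others; larger values point to overlap, as between radio and newspaper in this synthetic example.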
  • 14. Further research. From the regression result, the p-value for newspaper is 0.860, which is greater than 0.05, so we cannot reject the null hypothesis that newspaper and sales are not correlated. The assumption check also shows that some assumptions are not likely to be satisfied. We should therefore consider that this linear regression model may not be fully valid; one option is to remove the variable (newspaper) that does not have a significant impact on product sales. In addition, based on the scatter plot of residuals against predictions, we can consider non-linear regression. This research focuses on traditional media, but we could also collect data on new media, such as social media, to analyze.
  • 15. Capstone Project Milestone 2: Research Design and The Data Part VI: Appendix
  • 16. Capstone Project Milestone 3: Hypothesis Testing. Data sources: https://data.world/data-society/bank-marketing-data The data relates to the direct marketing campaigns (phone calls) of a Portuguese banking institution. https://stats.oecd.org The data is the quarterly growth rate of GDP in volume for G20 countries. Tests: paired t-test; two-sample t-test; Pearson correlation test. All of the tests gave significant results. GitHub: https://github.com/WeixiTan/NYU_Integrated_Marketing
  • 17. Paired T-test. Because the data consists of before-and-after observations on the same sample (measured twice, resulting in pairs of observations), we pick the paired t-test. Conclusion: The p-value is approximately 0, which is less than 0.05, so we can reject the null hypothesis that there is no significant difference between the mean GDP level in 2018 and in 2020.
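The paired t-test works on the per-pair differences. The sketch below computes the t statistic by hand on hypothetical growth rates (not the OECD figures) and compares it with the two-sided 5% critical value for 4 degrees of freedom; the function name `paired_t` is an assumption for illustration.

```python
import math
import statistics

def paired_t(before, after):
    """Return (t statistic, degrees of freedom) for a paired t-test."""
    diffs = [a - b for b, a in zip(before, after)]
    n = len(diffs)
    mean_d = statistics.mean(diffs)
    sd_d = statistics.stdev(diffs)            # sample standard deviation (n - 1)
    return mean_d / (sd_d / math.sqrt(n)), n - 1

# Hypothetical GDP growth rates (percent) for five countries.
gdp_2018 = [0.6, 0.4, 0.5, 0.7, 0.3]
gdp_2020 = [-1.2, -0.9, -1.5, -0.8, -1.1]

t_stat, dof = paired_t(gdp_2018, gdp_2020)
T_CRIT_DF4 = 2.776    # two-sided 5% critical value of t with 4 degrees of freedom
reject = abs(t_stat) > T_CRIT_DF4
print(f"t = {t_stat:.2f}, df = {dof}, reject H0: {reject}")
```

Rejecting when |t| exceeds the critical value is equivalent to the p-value falling below 0.05.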
  • 18. Two-sample T-test. Because the data is metric but not paired, and has two groups, we pick the two-sample t-test. Conclusion: The p-value is less than 0.05, so at the 0.05 significance level we can reject the null hypothesis that the mean balance is equal for clients who have a loan and clients who do not.
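For two independent groups, the t statistic compares the group means against their combined standard error. This sketch uses Welch's version (unequal variances), which is an assumption since the slide does not say which variant was used, and hypothetical balances rather than the bank marketing data.

```python
import math
import statistics

def welch_t(x, y):
    """Welch's two-sample t statistic and Welch-Satterthwaite degrees of freedom."""
    vx = statistics.variance(x) / len(x)      # squared standard error of mean(x)
    vy = statistics.variance(y) / len(y)      # squared standard error of mean(y)
    t = (statistics.mean(x) - statistics.mean(y)) / math.sqrt(vx + vy)
    dof = (vx + vy) ** 2 / (vx ** 2 / (len(x) - 1) + vy ** 2 / (len(y) - 1))
    return t, dof

# Hypothetical account balances for clients with and without a loan.
balance_loan = [120, 340, 80, 410, 150, 60, 275, 190]
balance_no_loan = [900, 1500, 640, 2100, 1250, 780, 1700, 1130]

t_stat, dof = welch_t(balance_loan, balance_no_loan)
T_CRIT_DF7 = 2.365    # two-sided 5% critical value for 7 df (conservative here)
reject = abs(t_stat) > T_CRIT_DF7
print(f"t = {t_stat:.2f}, df = {dof:.1f}, reject H0: {reject}")
```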
  • 19. Pearson correlation test. Because the two variables are normally distributed with no outliers, we pick the Pearson test. Conclusion: The p-value is approximately 0, which is less than 0.05, so we can reject the null hypothesis that the GDP figures for the same country in 2018 and 2020 are not correlated.
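The Pearson coefficient and its test statistic can be computed directly from the definitions. The values below are hypothetical and only illustrate the calculation; `pearson_r` and `r_to_t` are names chosen for this sketch.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def r_to_t(r, n):
    """t statistic for H0: no correlation, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1 - r * r))

# Hypothetical paired values for five countries.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

r = pearson_r(x, y)
t_stat = r_to_t(r, len(x))
T_CRIT_DF3 = 3.182    # two-sided 5% critical value of t with 3 degrees of freedom
print(f"r = {r:.4f}, t = {t_stat:.1f}, reject H0: {abs(t_stat) > T_CRIT_DF3}")
```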
  • 20. Power analysis. Conclusion: For an effect size of 0.12, a power of 0.8, and a type I error rate of 0.05, we need a sample size of 25. For the two-sample t-test based on the bank marketing data: Limitations: the bank marketing data cannot be used for a paired t-test because the data is not paired. Future research plan: we can determine the sample size through power analysis and collect additional data on bank clients before the marketing campaign, so that we can measure the difference before and after and thereby measure the effect of the campaign (phone calls or similar).
  • 21. Capstone Project Milestone 4: Regression. Executive Summary. Data source: https://www.kaggle.com/c/customer-churn-prediction-2020/overview This is the data set of a 2020 Kaggle competition. Its original purpose is to predict whether a customer will change telecommunications provider, which is known as "churning". "total_day_calls", "total_eve_calls" and "total_night_calls" mean "total number of day calls", "total number of evening calls" and "total number of night calls". We use linear regression in this report to test the relationships. The results show that total day calls and total night calls are not correlated, and total evening calls and total night calls are not correlated, which implies we should design customized packages for day calls and night calls. GitHub: https://github.com/WeixiTan/NYU_Integrated_Marketing
  • 22. Scatter plots. We draw two scatter plots: one for total day calls and total night calls, the other for total evening calls and total night calls. Scatter plot 1 shows that total day calls and total night calls have a linear relationship. Scatter plot 2 shows that total evening calls and total night calls have a linear relationship.
  • 23. Regression result. From the regression result, the p-values are 0.756 and 0.438, both greater than 0.05. We cannot reject the null hypothesis that total day calls and total night calls are not correlated, nor the null hypothesis that total evening calls and total night calls are not correlated.
  • 24. Insights. From the regression result, we can conclude that total day calls and total evening calls do not have a significant effect on total night calls at the 5% significance level (95% confidence). This conclusion implies that we should design customized packages for day calls, evening calls and night calls.
  • 25. Assumptions Check. We then check the 6 assumptions of the linear model. Results show that all the assumptions are likely to be satisfied. For assumptions 1 and 3, the error term is normally distributed: for each fixed value of X, the distribution of Y is normal, and the mean of the error term is 0. For assumption 2, from the scatter plots above, the means of these normal distributions of Y, given X, lie on a straight line. So total day calls and total night calls have a linear relationship, and total evening calls and total night calls also have a linear relationship.
  • 26. Assumptions Check. For assumption 4, the variance of the error term is constant; it does not depend on the values of X. For assumption 5, the data set is not time series data, so we omit it here. For assumption 6, the independent variables in X are not correlated, so there is no multicollinearity issue.
  • 27. Further research. Although we can design customized packages for different periods, we need further and more detailed research to decide how to design them. For example, we can run another linear regression to see whether total day minutes or total day calls has the more significant effect on total day charge, and so decide whether to raise the price per minute or per call.
  • 28. Capstone Project Milestone 5: Customer Segmentation. Executive Summary. Data source: https://www.kaggle.com/hellbuoy/online-retail-customer-clustering This is an online retail transnational data set which contains all the transactions occurring between 01/12/2010 and 09/12/2011 for a UK-based and registered non-store online retailer. The company mainly sells unique all-occasion gifts. Many customers of the company are wholesalers. The business goal is to build an RFM clustering and choose the best set of customers for the company to target. We use K-means clustering and hierarchical clustering. K-means clustering returns 57 customers and hierarchical clustering returns 2 customers, a much smaller group than the one K-means returns. Kaggle notebook: https://www.kaggle.com/weixitan/customer-segementation-wt2084
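RFM clustering starts from a per-customer table of recency, frequency and monetary value. The sketch below builds such a table from a handful of hypothetical transactions (not the Kaggle data); it assumes recency is measured in days from a snapshot date and frequency is the number of transaction rows, which may differ from the definitions used in the notebook.

```python
from collections import defaultdict
from datetime import date

# Hypothetical transactions: (customer_id, invoice_date, amount).
transactions = [
    ("C1", date(2011, 12, 1), 120.0),
    ("C1", date(2011, 11, 20), 80.0),
    ("C2", date(2011, 3, 5), 15.0),
    ("C3", date(2011, 12, 8), 300.0),
    ("C3", date(2011, 10, 2), 250.0),
    ("C3", date(2011, 6, 14), 410.0),
]

def rfm_table(rows, snapshot):
    """Return {customer: (recency_days, frequency, monetary)}."""
    last_purchase = {}
    frequency = defaultdict(int)
    monetary = defaultdict(float)
    for customer, day, amount in rows:
        last_purchase[customer] = max(last_purchase.get(customer, day), day)
        frequency[customer] += 1
        monetary[customer] += amount
    return {
        c: ((snapshot - last_purchase[c]).days, frequency[c], monetary[c])
        for c in last_purchase
    }

rfm = rfm_table(transactions, snapshot=date(2011, 12, 9))
for customer, (recency, freq, amount) in sorted(rfm.items()):
    print(customer, recency, freq, amount)
```

By the RFM criteria, the most attractive customers have low recency together with high frequency and amount, as customer C3 does here.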
  • 29. K-Means Clustering: Finding the best k with the Elbow Method. Since the elbow method could not return a best k, we choose the second k as the best k; with metric="silhouette", we get best k=3.
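Choosing k by silhouette score can be sketched from scratch with NumPy. This is an illustration on three synthetic, well-separated groups, not the notebook's code or data; the farthest-point initialization and all function names are assumptions made so the example is deterministic.

```python
import numpy as np

def farthest_point_init(X, k):
    """Pick k starting centers: the first row, then repeatedly the farthest row."""
    centers = [X[0]]
    for _ in range(k - 1):
        d = np.min([np.linalg.norm(X - c, axis=1) for c in centers], axis=0)
        centers.append(X[d.argmax()])
    return np.array(centers)

def assign(X, centers):
    """Index of the nearest center for every row of X."""
    d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    return d.argmin(axis=1)

def kmeans(X, k, n_iter=100):
    """Lloyd's algorithm; returns one cluster label per row."""
    centers = farthest_point_init(X, k)
    for _ in range(n_iter):
        labels = assign(X, centers)
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return assign(X, centers)

def silhouette(X, labels):
    """Mean silhouette coefficient over all rows."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    ids = set(labels.tolist())
    scores = []
    for i in range(len(X)):
        same = labels == labels[i]
        if same.sum() == 1:                   # singleton cluster scores 0
            scores.append(0.0)
            continue
        a = d[i, same].sum() / (same.sum() - 1)              # mean intra-cluster distance
        b = min(d[i, labels == c].mean() for c in ids if c != labels[i])
        scores.append((b - a) / max(a, b))
    return float(np.mean(scores))

# Synthetic scaled features (assumption): three well-separated customer groups.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(loc=c, scale=0.5, size=(30, 2))
               for c in ([0, 0], [10, 0], [0, 10])])

scores = {k: silhouette(X, kmeans(X, k)) for k in range(2, 6)}
best_k = max(scores, key=scores.get)
print({k: round(s, 3) for k, s in scores.items()}, "best k =", best_k)
```

The k with the highest mean silhouette is taken as the best k; on real RFM data the scores are usually closer together than in this synthetic case.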
  • 30. K-Means Clustering: Interpreting the Clustering. By the RFM criteria, we should choose the customer clusters with a lower recency and a higher frequency and amount. From the K-means clustering results, we can see that customers with Cluster_Id 2 best fit the criteria.
  • 31. K-Means Clustering: Interpreting the Clustering. We can see that K-means clustering returns 57 customers.
  • 32. Hierarchical Clustering: Visualize the dendrogram (tree) using different linkage methods.
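The effect of the linkage method can be seen in a naive agglomerative clustering written with NumPy. This is a sketch on six hypothetical points, not the notebook's implementation, and it returns cluster labels instead of drawing a dendrogram; the function name `agglomerative` is an assumption.

```python
import numpy as np

def agglomerative(X, n_clusters, linkage="average"):
    """Naive agglomerative clustering; returns a cluster id for each row of X."""
    X = np.asarray(X, dtype=float)
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    # Linkage = how the distance between two clusters is summarized.
    reduce = {"single": np.min, "complete": np.max, "average": np.mean}[linkage]
    clusters = [[i] for i in range(len(X))]   # start with one cluster per row
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = reduce(dist[np.ix_(clusters[a], clusters[b])])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        merged = clusters.pop(b)              # b > a, so index a stays valid
        clusters[a].extend(merged)
    labels = np.empty(len(X), dtype=int)
    for cluster_id, members in enumerate(clusters):
        labels[members] = cluster_id
    return labels

# Hypothetical scaled features for six customers forming two obvious groups.
points = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
results = {m: agglomerative(points, 2, linkage=m).tolist()
           for m in ("single", "complete", "average")}
print(results)
```

On clearly separated data all three linkage methods agree; on real customer data they can produce quite different trees, which is why comparing them is useful.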
  • 33. Hierarchical Clustering: Visualize and Interpret the Result. By the RFM criteria, we should choose the customer cluster with a lower recency and a higher frequency and amount. From the hierarchical clustering results, we can see that customers with Cluster_Id 1 best fit the high-frequency criterion, but customers with Cluster_Id 2 best fit the high-amount criterion.
  • 34. Hierarchical Clustering: Interpreting the Clustering. We can see that hierarchical clustering returns 2 customers, a much smaller group than the one K-means clustering returns. If the manager values frequency more, we choose Cluster_Id 2, and the company can offer daily discounts to these customers in future marketing campaigns. If the manager values amount more, we choose Cluster_Id 1, and the company can offer a discount on purchases over a certain amount.