Customer Personality Analysis — Part 1
Detailed Exploratory Data Analysis
Data science has changed the world through technical transformation,
and machine learning applications are now part of our day-to-day
lives. What interests me most is how machine learning can classify
people based on their personality traits.
https://www.lsretail.com/hubfs/BLOG_Retail-queue-covid-times.jpg
Follow for more : https://manaliraut2.medium.com/
In this article, I will analyze customer personality data to extract
meaningful insights from a large volume of marketing campaign
data. It is an attempt to understand how a person’s characteristics
relate to their personality traits and habits.
1. Introduction
2. Understanding the Data
3. Exploratory Data Analysis (Matplotlib, Seaborn, Pandas)
4. Exploratory Data Analysis (Dataprep.eda)
5. Conclusion
1. Introduction
Customer Personality Analysis is a detailed analysis of all the types
of customers a company has. It helps a business understand customer
behavior, increase usage and customer satisfaction, and modify
products according to customer needs. Here I focus on the specific
customers who contribute most to the growth of marketing
campaigns. This kind of personality-based analysis is highly effective
in increasing the popularity and attractiveness of products and
services.
2. Understanding The Data
Customer personality analysis helps a business tailor its product to
target customers drawn from different customer segments. For
example, instead of spending money to market a new product to
every customer in the company’s database, a company can analyze
which customer segment is most likely to buy the product and then
market it only to that segment.
2.1 Content
2.1.1. People
ID: Customer’s unique identifier
Year_Birth: Customer’s birth year
Education: Customer’s education level
Marital_Status: Customer’s marital status
Income: Customer’s yearly household income
Kidhome: Number of children in customer’s household
Teenhome: Number of teenagers in customer’s household
Dt_Customer: Date of customer’s enrollment with the
company
Recency: Number of days since customer’s last purchase
Complain: 1 if the customer complained in the last 2 years, 0
otherwise
2.1.2. Products
MntWines: Amount spent on wine in last 2 years
MntFruits: Amount spent on fruits in last 2 years
MntMeatProducts: Amount spent on meat in last 2 years
MntFishProducts: Amount spent on fish in last 2 years
MntSweetProducts: Amount spent on sweets in last 2 years
MntGoldProds: Amount spent on gold in last 2 years
2.1.3. Promotion
NumDealsPurchases: Number of purchases made with a
discount
AcceptedCmp1: 1 if customer accepted the offer in the 1st
campaign, 0 otherwise
AcceptedCmp2: 1 if customer accepted the offer in the 2nd
campaign, 0 otherwise
AcceptedCmp3: 1 if customer accepted the offer in the 3rd
campaign, 0 otherwise
AcceptedCmp4: 1 if customer accepted the offer in the 4th
campaign, 0 otherwise
AcceptedCmp5: 1 if customer accepted the offer in the 5th
campaign, 0 otherwise
Response: 1 if customer accepted the offer in the last campaign,
0 otherwise
2.1.4. Place
NumWebPurchases: Number of purchases made through the
company’s website
NumCatalogPurchases: Number of purchases made using a
catalogue
NumStorePurchases: Number of purchases made directly in
stores
NumWebVisitsMonth: Number of visits to company’s website
in the last month
3. Exploratory Data Analysis (Matplotlib, Seaborn, Pandas)
Let’s look at our data.
The data looks good so far. The first thing I did was check for
missing values.
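A minimal sketch of such a missing-value check, assuming the data is already loaded into a pandas DataFrame. The tiny DataFrame below is a made-up stand-in, not the real campaign data.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the campaign data; the values are made up for illustration.
df = pd.DataFrame({
    "ID": [1, 2, 3, 4],
    "Income": [58000.0, np.nan, 71000.0, np.nan],
    "Year_Birth": [1957, 1954, 1965, 1984],
})

# Count missing values per column and keep only the columns that have any.
missing = df.isnull().sum()
missing = missing[missing > 0]
print(missing)
```

Filtering to the non-zero counts keeps the output short when the dataset has many columns.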
The Income column had 24 missing values, so I filled them with the
column median.
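Median imputation can be sketched as follows. The income values here are made up; only the column name Income comes from the data dictionary above.

```python
import numpy as np
import pandas as pd

# Made-up incomes with two gaps, standing in for the real column.
df = pd.DataFrame({"Income": [30000.0, np.nan, 50000.0, 90000.0, np.nan]})

# median() skips NaN values by default, so it is computed on the known incomes.
income_median = df["Income"].median()
df["Income"] = df["Income"].fillna(income_median)
print(df)
```

The median is a common choice for income because it is less sensitive to a few very large values than the mean.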
Since there is a birth year column, I converted birth year to age,
using 2022 as the reference year for each customer’s current age.
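The age conversion is a single subtraction. In this sketch the birth years are made up, and the new column name Age is an assumption based on how the column is referred to later in the article.

```python
import pandas as pd

REFERENCE_YEAR = 2022  # the year the article uses as "now"

# Made-up birth years for three customers.
df = pd.DataFrame({"Year_Birth": [1957, 1965, 1984]})

# Age is the reference year minus the birth year.
df["Age"] = REFERENCE_YEAR - df["Year_Birth"]
print(df)
```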
For each customer, I summed up the total expenses, the total number
of purchases, the total number of accepted campaigns and the total
number of kids at home.
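One way these per-customer totals could be built is sketched below. The values are made up, and both the new column names (Total_Expenses, Total_Purchases, Total_Accepted_Cmp, Total_Kids) and the choice of which source columns feed each total are assumptions, since the article does not spell them out.

```python
import pandas as pd

# Made-up values for two customers; column names follow the data dictionary above.
df = pd.DataFrame({
    "MntWines": [100, 10], "MntFruits": [20, 0], "MntMeatProducts": [50, 5],
    "MntFishProducts": [10, 0], "MntSweetProducts": [5, 5], "MntGoldProds": [15, 0],
    "NumWebPurchases": [3, 1], "NumCatalogPurchases": [2, 0], "NumStorePurchases": [5, 2],
    "AcceptedCmp1": [1, 0], "AcceptedCmp2": [0, 0], "AcceptedCmp3": [1, 0],
    "AcceptedCmp4": [0, 0], "AcceptedCmp5": [0, 0],
    "Kidhome": [0, 1], "Teenhome": [1, 1],
})

# Assumed groupings: which columns feed each total is not stated in the article.
expense_cols = [c for c in df.columns if c.startswith("Mnt")]
purchase_cols = ["NumWebPurchases", "NumCatalogPurchases", "NumStorePurchases"]
campaign_cols = [f"AcceptedCmp{i}" for i in range(1, 6)]

# Row-wise sums (axis=1) give one total per customer.
df["Total_Expenses"] = df[expense_cols].sum(axis=1)
df["Total_Purchases"] = df[purchase_cols].sum(axis=1)
df["Total_Accepted_Cmp"] = df[campaign_cols].sum(axis=1)
df["Total_Kids"] = df["Kidhome"] + df["Teenhome"]
print(df[["Total_Expenses", "Total_Purchases", "Total_Accepted_Cmp", "Total_Kids"]])
```

Whether NumDealsPurchases or Response should also be counted is a judgment call that depends on how the totals are used later.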
Then, I changed the values of the Marital Status column.
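Recoding a categorical column usually comes down to a dictionary lookup. The mapping below is purely hypothetical, since the article does not list the actual old and new values; it only illustrates the mechanics with made-up categories.

```python
import pandas as pd

# Made-up category values for illustration.
df = pd.DataFrame(
    {"Marital_Status": ["Married", "Together", "Single", "Divorced", "Widow"]}
)

# Hypothetical grouping; the article does not state the mapping it used.
status_map = {
    "Married": "Relationship",
    "Together": "Relationship",
    "Divorced": "Single",
    "Widow": "Single",
}

# replace() leaves any value that is not in the mapping unchanged.
df["Marital_Status"] = df["Marital_Status"].replace(status_map)
print(df["Marital_Status"].value_counts())
```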
Using the customer ID column, I checked for duplicate records.
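A duplicate check on the ID column might look like this. The rows are made up, and dropping the repeated rows is an optional extra step shown as an assumption; the article only mentions checking.

```python
import pandas as pd

# Made-up rows in which customer 102 appears twice.
df = pd.DataFrame({
    "ID": [101, 102, 102, 103],
    "Income": [40000, 52000, 52000, 61000],
})

# duplicated() flags every repeat of an ID after its first appearance.
n_duplicate_ids = int(df["ID"].duplicated().sum())
print(f"Duplicate IDs: {n_duplicate_ids}")

# Optional: keep only the first row for each ID.
df = df.drop_duplicates(subset="ID", keep="first").reset_index(drop=True)
```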
Now, let’s check the data information again.
3.1 Data Visualization
As the income graph shows, most customers have an income in the
range of 30,000–80,000.
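A reading taken from a histogram can be backed up with a quick count. This sketch uses made-up incomes and computes the share of customers inside the 30,000 to 80,000 range.

```python
import pandas as pd

# Made-up incomes for six customers.
df = pd.DataFrame({"Income": [25000, 35000, 45000, 60000, 75000, 120000]})

# between() is inclusive on both ends by default.
in_range = df["Income"].between(30000, 80000)
share_in_range = in_range.mean()
print(f"{in_range.sum()} of {len(df)} customers ({share_in_range:.0%}) are in range")
```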
According to the Age column, most customers are between 44 and 57
years old.
We can see that wine accounts for the largest share of total
expenses, making it the best-selling product.
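The same comparison can be made numerically by summing each product column. The amounts below are made up; note that this ranks products by amount spent, not by units sold.

```python
import pandas as pd

product_cols = [
    "MntWines", "MntFruits", "MntMeatProducts",
    "MntFishProducts", "MntSweetProducts", "MntGoldProds",
]

# Made-up spending for two customers.
df = pd.DataFrame(
    [[100, 20, 50, 10, 5, 15], [10, 0, 5, 0, 5, 0]],
    columns=product_cols,
)

# Total spend per product, and each product's share of all spending.
spend_by_product = df[product_cols].sum()
share_by_product = spend_by_product / spend_by_product.sum()
top_product = spend_by_product.idxmax()
print(share_by_product.sort_values(ascending=False))
```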
We can see that Income correlates most strongly with Total Expenses,
followed by Total Purchases. Total Expenses and Total Purchases are
also correlated with each other.
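A correlation matrix like the one behind this observation can be computed with pandas. The numbers are made up, and Total_Expenses and Total_Purchases are assumed names for the derived columns.

```python
import pandas as pd

# Made-up values in which all three columns rise together.
df = pd.DataFrame({
    "Income": [20000, 40000, 60000, 80000, 100000],
    "Total_Expenses": [100, 400, 800, 1200, 2000],
    "Total_Purchases": [4, 9, 15, 18, 25],
})

# DataFrame.corr() returns pairwise Pearson coefficients by default.
corr = df.corr()
print(corr.round(2))
```

Passing the result to seaborn’s heatmap function is a common way to turn this matrix into a figure.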
4. Exploratory Data Analysis (Dataprep.eda)
Exploratory Data Analysis (EDA) is the process of exploring a
dataset and gaining insight into its main characteristics. The
dataprep.eda package simplifies this process by letting the user
explore important characteristics with simple APIs. Each API lets
the user analyze the dataset from a high level down to a low level,
and from different perspectives.
I have used only one API, create_report, which generates a report
from a pandas DataFrame. The report provides an overview,
per-variable details, quantiles and descriptive statistics,
correlations, missing values, and more.
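A minimal sketch of calling create_report, assuming the dataprep package is installed. The helper name build_report, the report title and the file name in the usage comment are hypothetical; the import is done lazily so the helper can be defined even where dataprep is missing.

```python
import pandas as pd


def build_report(df: pd.DataFrame, title: str = "Customer Personality EDA"):
    """Return a dataprep report for df (requires `pip install dataprep`)."""
    # Imported here so that defining the helper does not require dataprep.
    from dataprep.eda import create_report

    return create_report(df, title=title)


# Usage, commented out because it needs dataprep and a real data file:
# df = pd.read_csv("customers.csv")    # hypothetical file name
# report = build_report(df)
# report.show_browser()                # open the report in a browser tab
# report.save("customer_report")       # or write it to an HTML file
```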
It gives a clear overview of the whole dataset, showing the 24
missing values and that almost all variables are skewed.
Here, I have shown all the insights for the Education column.
create_report gave me these details for every column in the
dataset, which made the data far easier to understand.
This scatterplot shows the relation between income and wine
spending. Customers in a specific income range are regular buyers
of wine.
create_report provides three kinds of correlation coefficient here
(Pearson, Spearman and Kendall’s tau). The one shown in the image
above is Spearman.
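To see why the choice of coefficient matters, here is a small pandas sketch comparing Pearson and Spearman on a made-up relation that always increases but is not a straight line. It uses pandas rather than dataprep, and the cubic link between Income and MntWines is invented purely for illustration.

```python
import pandas as pd

# Made-up data: wine spending grows with income, but not linearly.
df = pd.DataFrame({"Income": [10, 20, 30, 40, 50]})
df["MntWines"] = df["Income"] ** 3

# Pearson measures linear association; Spearman works on ranks instead.
pearson = df.corr(method="pearson").loc["Income", "MntWines"]
spearman = df.corr(method="spearman").loc["Income", "MntWines"]
print(f"Pearson:  {pearson:.3f}")
print(f"Spearman: {spearman:.3f}")
```

Because Spearman only looks at rank order, it reaches 1 for any strictly increasing relation, while Pearson stays below 1 unless the relation is linear. That makes Spearman a reasonable choice for skewed variables such as spending.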
The bar chart of all variables makes the missing values in the
Income column easy to spot.
5. Conclusion
This was an attempt to understand how a person’s characteristics
relate to their personality traits and habits. To summarize my
findings: I found and replaced 24 missing values in the Income
column; Income is correlated with Total Expenses; there is no
correlation between year of birth and the amount spent on wine; and
wine attracts the most customers, who are mostly graduates with an
average income. As for dataprep, I am amazed!! Now we will be able
to make predictions with the help of algorithms based on these
demographics.
Phew!!! That was a lot of work ;) I am going to grab some tea and
put on a song by BTS. See you in part 2.
You can find the Python code on GitHub.
You can reach me on LinkedIn.
Stay tuned!
