DSC4213 Final Presentation (Trump vs Hillary Campaign Strategy Analysis)
1. USA Election 2016: Studying the use of Financial Data to supplement Election Strategy
A Project by Group 5:
Cai Hongliang
Ken Lok Jing Wen
Ong Zhi Kan
Toh Yi Da
3.
1. Advertisements: banners, cards and posters cost a lot of money
2. Campaign Personnel Wages: funds are required to pay wages for people who help run the campaign
3. Grassroots Activities: activities need to be organised to rally voters and generate hype
INTRODUCTION METHODOLOGY FINDINGS LIMITATIONS FURTHER STUDIES CONCLUSION DATA VISUALISATION
5. Goal
1. To create a model to classify which candidate a transaction is going to
2. To analyse the factors that affect donation amount
Purpose
To analyse the 2015-2016 election campaign finance dataset
Dataset
Raw dataset from the Federal Election Commission (FEC) includes:
• Donor’s name
• City, state
• Occupation
• Amount of donation
• Transaction date
6.
1. Gender: a large proportion of political donors are male
2. Possible Explanation: donations made by a family/couple bear only the male’s name
3. Changing Trend: the proportion of female donors increased from 30% in 2000 to 36% in 2004
In our dataset, the proportion of females among large donors was 61%
7.
8. [Figure: example first names (John, Abraham, Derrick, Betty, Patricia) mapped to Male or Female]
Analysis
1. Variation in data; hard to eyeball trends
2. Donor names were converted to something more general: gender
Gender was inferred from first names, using a register of newborns with name and gender from the US Social Security Administration (1880 to 2015)
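A minimal sketch of the name-to-gender step described above, assuming the register is available as (name, sex, count) rows in the style of the SSA baby-name files. The function names, the majority rule and the sample counts are illustrative assumptions, not the group's actual script.

```python
from collections import defaultdict


def build_name_table(rows):
    """Sum recorded births per name and sex from (name, sex, count) rows."""
    counts = defaultdict(lambda: {"F": 0, "M": 0})
    for name, sex, count in rows:
        counts[name.upper()][sex] += count
    return counts


def infer_gender(first_name, table, min_share=0.5):
    """Return 'Male', 'Female' or 'Unknown' by majority of recorded births."""
    entry = table.get(first_name.strip().upper())
    if not entry:
        return "Unknown"
    total = entry["F"] + entry["M"]
    if total == 0:
        return "Unknown"
    if entry["F"] / total > min_share:
        return "Female"
    if entry["M"] / total > min_share:
        return "Male"
    return "Unknown"


# Made-up counts for illustration only; these are NOT real SSA figures.
sample_rows = [
    ("John", "M", 9000), ("John", "F", 20),
    ("Betty", "F", 5000),
    ("Patricia", "F", 4000), ("Patricia", "M", 10),
]
table = build_name_table(sample_rows)
print(infer_gender("John", table), infer_gender("Betty", table))
```

Names missing from the register fall back to "Unknown" rather than being guessed, which keeps ambiguous donors out of either gender group.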
9.
Analysis
1. Variation in data; hard to eyeball trends
2. Some obvious categories such as “Retired/Unemployed”
Breakdown
1. Top 7-10 words with highest frequency selected
• Classified into 28 different buckets
• Each bucket has a list of job titles, e.g. accountant, teacher
2. Run script to sort occupations directly into buckets
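The bucketing script could look roughly like the sketch below. It assumes simple substring matching against a keyword list per bucket; the keyword lists shown are a small hypothetical subset (for example, "LAWYER" is an assumed keyword) and the first-match-wins rule is an assumption, not the group's confirmed logic.

```python
# Hypothetical keyword lists; the real project used 28 buckets.
BUCKET_KEYWORDS = {
    "RETIRED": ["RETIRED"],
    "LEGAL OCCUPATIONS": ["ATTORNEY", "ARBITRATOR", "LAWYER"],
    "EDUCATION, TRAINING, AND LIBRARY OCCUPATIONS": ["TEACHER", "PROFESSOR"],
    "BUSINESS AND FINANCIAL OCCUPATIONS": ["ACCOUNTANT", "AUDITOR", "BUDGET ANALYST"],
}


def bucket_for(occupation, keywords=BUCKET_KEYWORDS, default="OTHERS"):
    """Return the first bucket whose keyword appears in the occupation text."""
    text = (occupation or "").upper()
    for bucket, words in keywords.items():
        if any(word in text for word in words):
            return bucket
    return default


for raw in ["Retired teacher", "attorney", "Budget Analyst", "Astronaut", None]:
    print(raw, "->", bucket_for(raw))
```

Because the dictionary is checked in insertion order, putting "RETIRED" first sends "Retired teacher" to the retired bucket rather than to education.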
10. List of Keywords
Management Occupation: Academic Dean, Administrator
Education, Training, Library: Academic Teacher, Accounting Professor
Legal Occupations: Arbitrators, Attorneys
Business/Financial: Budget Analyst, Auditors
11.
Method
1. Removed records that did not belong to the 50 US states
2. Reduced the dataset to narrow down the demographics of donors
Removed codes: AS, GU, MH, FM, MP, PW, PR, VI
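One way to express this filter is an allow-list of state codes, sketched below. Keeping DC alongside the 50 states is an assumption based on the "50 States + DC" wording used elsewhere in the deck, and the "state" field name is hypothetical.

```python
# The 50 US state postal codes plus DC (DC kept by assumption).
US_STATES_AND_DC = set("""
AL AK AZ AR CA CO CT DE FL GA HI ID IL IN IA KS KY LA ME MD
MA MI MN MS MO MT NE NV NH NJ NM NY NC ND OH OK OR PA RI SC
SD TN TX UT VT VA WA WV WI WY DC
""".split())


def in_scope(record):
    """True when the record's state code is one of the 50 states or DC."""
    return str(record.get("state", "")).strip().upper() in US_STATES_AND_DC


# Hypothetical records for illustration.
records = [{"state": "CA"}, {"state": "PR"}, {"state": "dc"}, {"state": "GU"}]
kept = [r for r in records if in_scope(r)]
print(kept)
```

An allow-list also drops any code outside the eight listed territories, which an exclusion list would let through.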
12.
Original: 13,058,232
                            Clinton          Trump
Donors                      1,115,330        70,819
Outside of 50 States + DC   4,495            50
Negative Donations          638              419
Donations to self           44 ($877,947)    101 ($6,496,946)
14. Buckets Count %
RETIRED 185880 24.0%
MANAGEMENT OCCUPATIONS 92543 12.0%
OTHERS 70572 9.1%
EDUCATION, TRAINING, AND LIBRARY OCCUPATIONS 60300 7.8%
LEGAL OCCUPATIONS 54083 7.0%
BUSINESS AND FINANCIAL OCCUPATIONS 54070 7.0%
HEALTHCARE OCCUPATIONS 42573 5.5%
SALES OCCUPATIONS 23589 3.1%
MEDIA AND COMMUNICATION OCCUPATIONS 21955 2.8%
ARCHITECTURE AND ENGINEERING OCCUPATIONS 19380 2.5%
COMPUTER AND INFORMATION TECHNOLOGY OCCUPATIONS 15452 2.0%
PRODUCTION OCCUPATIONS 15036 1.9%
COMMUNITY AND SOCIAL SERVICE OCCUPATIONS 14306 1.9%
ARTS AND DESIGN OCCUPATIONS 13765 1.8%
FARMING, FISHING, AND FORESTRY OCCUPATIONS 11350 1.5%
ENTERTAINMENT AND SPORTS OCCUPATIONS 10704 1.4%
NOT EMPLOYED 10055 1.3%
PERSONAL CARE AND SERVICE OCCUPATIONS 9895 1.3%
MATH OCCUPATIONS 8454 1.1%
LIFE, PHYSICAL, AND SOCIAL SCIENCE OCCUPATIONS 8306 1.1%
OFFICE AND ADMINISTRATIVE SUPPORT OCCUPATIONS 6493 0.8%
STUDENT 6286 0.8%
TRANSPORTATION AND MATERIAL MOVING OCCUPATIONS 5913 0.8%
INSTALLATION, MAINTENANCE, AND REPAIR OCCUPATIONS 4315 0.6%
FOOD PREPARATION AND SERVING OCCUPATIONS 3890 0.5%
BUILDING AND GROUNDS CLEANING OCCUPATIONS 1583 0.2%
CONSTRUCTION AND EXTRACTION OCCUPATIONS 1165 0.2%
PROTECTIVE SERVICE OCCUPATIONS 821 0.1%
MILITARY 344 0.0%
[Figure: Occupations chart with segments Retired, Management, Others, Education, Legal, Business, Health, Sales, Media]
15. US states %
[Figure: chart of US states by percentage, labelling CA (California), FL (Florida), MA (Massachusetts), TX (Texas) and NY (New York)]
19. Data Features Picked
Donation Target
• Used Clinton as Baseline
Gender
• Reference category is Female
• Most common amongst voters
Occupation
• 28 Categories
• Reference category is “Management”
• Most common after “Retired”
States
• Excludes non-US
• Reference category is California
• Most common
Transaction Amount
• Numeric Variable
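The reference categories above imply dummy coding in which the reference level is represented by all zeros. A minimal sketch, assuming one 0/1 indicator per non-reference level; the function name and the `is_` column prefix are hypothetical.

```python
def dummy_encode(value, categories, reference):
    """One 0/1 indicator per non-reference category; the reference is all zeros."""
    return {f"is_{c}": int(value == c) for c in categories if c != reference}


# Gender with Female as the reference category.
print(dummy_encode("Male", ["Female", "Male"], reference="Female"))

# A shortened, hypothetical state list with California as the reference.
print(dummy_encode("CA", ["CA", "NY", "TX"], reference="CA"))
```

With this coding, each regression coefficient is read as a difference from the reference level (Female, Management or California).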
20. Model Creation: Classification of Pro-Trump or Pro-Clinton
Trump Transactions vs Clinton
• Created on the basis of Gender, Occupation, State, and Amount willing to donate
• The number of Trump transactions is 6% of the Clinton total
o Workaround: random sampling of 6% of the Clinton data
o This deals with the class imbalance
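The random-sampling workaround can be sketched as undersampling the larger class down to the size of the smaller one. This is an illustration under assumed record shapes and a fixed seed, not the group's actual procedure.

```python
import random


def undersample(majority, minority, seed=0):
    """Randomly keep as many majority records as there are minority records."""
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    k = min(len(minority), len(majority))
    return rng.sample(majority, k) + list(minority)


# Hypothetical records: Trump transactions are 6% of the Clinton count.
clinton = [{"candidate": "Clinton", "id": i} for i in range(100)]
trump = [{"candidate": "Trump", "id": i} for i in range(6)]
balanced = undersample(clinton, trump)
print(len(balanced))
```

Undersampling discards most of the majority-class data, so repeating it with different seeds is one way to check that the fitted model is stable.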
Backwards Stepwise Variable Selection
• Remove the variable with the highest p-value and re-run until all p-values < 0.05
• No occupation dummies were removed; 7 states were grouped under California, the reference category
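The elimination loop can be sketched independently of the statistics package. Here `fit_pvalues` is a hypothetical callback that refits the model on the given variables and returns their p-values; the stub below uses fixed, made-up p-values purely to exercise the loop.

```python
def backward_eliminate(features, fit_pvalues, alpha=0.05):
    """Drop the highest-p-value variable and refit until all p-values < alpha."""
    selected = list(features)
    while selected:
        pvalues = fit_pvalues(selected)
        worst = max(selected, key=lambda f: pvalues[f])
        if pvalues[worst] < alpha:
            break  # every remaining variable is significant
        selected.remove(worst)
    return selected


# Made-up p-values; a real fit would recompute them after each removal.
FAKE_P = {"gender_male": 0.001, "state_NV": 0.40, "state_OR": 0.20, "amount": 0.03}


def stub_fit(selected):
    return {f: FAKE_P[f] for f in selected}


kept = backward_eliminate(list(FAKE_P), stub_fit)
print(kept)
```

A state dummy that is dropped this way is no longer distinguished from the reference category, which is how states end up grouped with California.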
22.
Analyse
• Driving factors for donation amount per candidate
• Factors include: Gender, Occupation, State
• Significance to transaction amount
Build
• Separate data model for each candidate: Hillary Clinton Model, Donald Trump Model
Results
• Adjusted R2 for each model: Hillary Clinton 0.0412, Donald Trump 0.0076
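For reference, adjusted R2 penalises plain R2 for the number of predictors in the model. The sketch below uses the standard formula; the inputs are hypothetical and chosen only to show the arithmetic, since the deck does not give sample sizes or predictor counts.

```python
def adjusted_r2(r2, n_obs, n_predictors):
    """Standard adjusted R-squared: 1 - (1 - R2) * (n - 1) / (n - p - 1)."""
    if n_obs - n_predictors - 1 <= 0:
        raise ValueError("need more observations than predictors + 1")
    return 1 - (1 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)


# Hypothetical inputs, not taken from the project.
example = adjusted_r2(0.5, n_obs=11, n_predictors=2)
print(example)
```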
23.
Analyse
• Usage of StatTools to identify outliers
• Outliers: observations with standardised residuals > 3
Build
• Outlier dummy variable
• Backwards stepwise regression
Results
• Adjusted R2 increases for each of the presidential candidates: Hillary Clinton 0.8685, Donald Trump 0.4185
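The outlier rule can be sketched as follows. This is an approximation, not StatTools' own calculation: it assumes residuals are standardised by dividing by their sample standard deviation, and it flags on the absolute value, which is also an assumption.

```python
import statistics


def outlier_dummies(actual, fitted, threshold=3.0):
    """Return 1 where |residual| / stdev(residuals) exceeds the threshold, else 0."""
    residuals = [a - f for a, f in zip(actual, fitted)]
    spread = statistics.stdev(residuals)
    if spread == 0:
        return [0] * len(residuals)
    return [int(abs(r) / spread > threshold) for r in residuals]


# Hypothetical amounts: nineteen well-fitted points and one extreme donation.
actual = [10.0] * 19 + [110.0]
fitted = [10.0] * 20
flags = outlier_dummies(actual, fitted)
print(flags)
```

The resulting 0/1 column can then enter the regression as the outlier dummy variable, letting the model absorb the extreme observations separately.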
24. 1st Strategy
Predict: Pro-Clinton or Pro-Trump
• Reaffirm support of candidates
• Try to win over opposition
25. 2nd Strategy
Usage of Regression Variables
• Determine class of people and donation size
• Characteristics of high-amount donors
Targeted Approach for limited resources
• Focus on certain states or occupations known to donate more
• Focus on states or occupations with weak support to boost it
29.
30.
• Only donations above $200 are declared
• Many $1 (earmarked) donations
• Same names with multiple transactions
• Donations are due to political inclination
31.
1. Extend model to classify Republican or Democrat supporters
2. Analyse data further to extract more outliers
3. Use dates as point of analysis
4. Use Cities instead of States
32. [Diagram: Data, Develop, Deliver]
• Knowledge to excel and strategise
• Targeted Campaign Approach
• Value and Usability
• Direct limited resources
• Boost Campaign Outreach
• Candidate Benefits, Wins America