Data Refinement
1. Data Refinement:
The missing link between data collection and decisions
Stephen H. Yu
Data Strategy & Analytics Consultant
2. What we will cover
• Database Marketing Landscape
• Analytics and Models
• “Model-Ready” Environment
• Data Summarization & Categorization
• Delivery: Scoring & QC
• Closing the Loop
3. Big Data, Small Data, Neat Data, Messy Data
How is “Big Data” working out for you?
2.5 quintillion bytes collected per “day”
1 quintillion bytes = 1 exabyte = 1 billion gigabytes
• Did all this data improve your decision-making process?
• Do you have results to show for it?
• Information Overload? You bet!
Harness insights, drop the noise
6. Raw Data
• Demographic / Firmographic
• RFM
• Products & Services Used
• Promotion / Response History
• Lifestyle / Survey Responses
• Delinquency history
• Call / Communication Log
• Movement Data
• Sentiments
Marketing Answers
• Likely to buy a luxury car
• Likely to take a foreign vacation
• Likely to respond to a free shipping offer
• Likely to be a high value customer
• Likely to be qualified for credit
• Likely to upgrade
• Likely to leave
• Likely to come back
Refined Answers
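
As an illustration of how raw data becomes model-ready variables, the RFM item above could be sketched as follows. This is a minimal sketch, not the presenter's method: the transaction records, the `summarize_rfm` helper, and the as-of date are all hypothetical and exist only for this example.

```python
from collections import defaultdict
from datetime import date

# Hypothetical transaction-level raw data: (customer_id, order_date, amount)
transactions = [
    ("C001", date(2024, 1, 5), 120.0),
    ("C001", date(2024, 3, 20), 80.0),
    ("C002", date(2023, 11, 2), 40.0),
    ("C003", date(2024, 4, 1), 300.0),
    ("C003", date(2024, 4, 15), 150.0),
    ("C003", date(2024, 5, 1), 50.0),
]


def summarize_rfm(rows, as_of):
    """Collapse transaction rows into one RFM record per customer."""
    grouped = defaultdict(list)
    for customer_id, order_date, amount in rows:
        grouped[customer_id].append((order_date, amount))

    summary = {}
    for customer_id, orders in grouped.items():
        last_order = max(order_date for order_date, _ in orders)
        summary[customer_id] = {
            "recency_days": (as_of - last_order).days,      # Recency
            "frequency": len(orders),                       # Frequency
            "monetary": sum(amount for _, amount in orders),  # Monetary
        }
    return summary


rfm = summarize_rfm(transactions, as_of=date(2024, 6, 1))
for customer_id in sorted(rfm):
    print(customer_id, rfm[customer_id])
```

Each customer ends up as a single row of descriptors, which is the shape a model needs before it can produce answers such as "likely to leave" or "likely to upgrade".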
15. • Most modern databases are optimized for massive storage and
rapid retrieval, not necessarily for predictive analytics
o Relational databases
o NoSQL databases
• Need an “Analytical Sandbox” (or database / data mart)
o Structured & de-normalized
o Variables as descriptors of model targets
o Common analytical language (SAS, R, SPSS)
o Must support “in-database” scoring
Unstructured to Structured
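
The sandbox requirements above (structured, de-normalized, variables describing the model target, in-database scoring) could look roughly like the sketch below. It uses Python's built-in `sqlite3` module purely as a stand-in for whatever database the sandbox runs on. The table names, columns, and model coefficients are hypothetical, and the linear score is an assumption chosen for brevity rather than a model taken from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Normalized source tables, as an operational database might store them
# (hypothetical schema and data).
conn.executescript("""
    CREATE TABLE customers (customer_id TEXT PRIMARY KEY, age INTEGER);
    CREATE TABLE orders (customer_id TEXT, amount REAL);
    INSERT INTO customers VALUES ('C001', 34), ('C002', 58), ('C003', 45);
    INSERT INTO orders VALUES
        ('C001', 120.0), ('C001', 80.0),
        ('C003', 300.0), ('C003', 150.0), ('C003', 50.0);
""")

# De-normalize: one row per model target (the customer), with summary
# variables as columns. Customers without orders get zeros.
conn.execute("""
    CREATE TABLE model_ready AS
    SELECT c.customer_id,
           c.age,
           COUNT(o.amount) AS order_count,
           COALESCE(SUM(o.amount), 0.0) AS total_spend
    FROM customers AS c
    LEFT JOIN orders AS o ON o.customer_id = c.customer_id
    GROUP BY c.customer_id, c.age
""")

# Hypothetical coefficients from a model fitted elsewhere.
INTERCEPT, W_ORDERS, W_SPEND = 0.5, 1.0, 0.01

# "In-database" scoring: the model formula runs as SQL inside the
# database, so the data never has to be exported to be scored.
scores = conn.execute(
    """
    SELECT customer_id,
           ? + ? * order_count + ? * total_spend AS score
    FROM model_ready
    ORDER BY score DESC
    """,
    (INTERCEPT, W_ORDERS, W_SPEND),
).fetchall()

for customer_id, score in scores:
    print(customer_id, round(score, 2))
```

Keeping the formula in SQL means the same statement can be rerun whenever the `model_ready` table is refreshed, at the cost of translating the fitted model into an expression by hand.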
47. It is not just about modeling, but all surrounding services as well
10 essential items to consider when outsourcing analytical projects
1. Consulting capability: Translate marketing goals into mathematics
2. Data processing: Conversion, edit, summarization, data-append, etc.
3. Pricing structure: Model development is only one part; hidden fees?
4. Track record in the industry: Not in rocket science, but in marketing
5. Types of models supported: Watch out for one-trick ponies
6. Speed of execution: Turnaround time measured in days, not weeks
7. Documentation: Full disclosure of algorithms, charts and reports
8. Scoring validation: Job not done until fully scored and validated
9. Back-end analysis: For true “Closed-loop” marketing
10. Ongoing Support: Periodic review and update